1 Cache and cache line
This section describes a typical data cache and some instruction caches; a TLB may be more complex, and an instruction cache may be simpler. The following diagram shows two memories (main memory and the cache).
Each location (each row in the diagram) in each memory contains data (a cache line), which in different designs may range in size from 8 to 512 bytes. The cache line is usually larger than the typical access requested by a CPU instruction, which ranges from 1 to 16 bytes (the largest addresses and data handled by current 32-bit and 64-bit architectures are 128 bits, i.e. 16 bytes, long).
Each location in each memory also has an index, a unique number used to refer to that location. The index of a location in main memory is called an address. Each location in the cache has a tag that records the main-memory index (address) of the data that has been cached. In a CPU's data cache these entries are called cache lines or cache blocks.
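To make the relationship between an address and a cache line concrete, the sketch below splits a byte address into the number of the line that holds it and the byte offset inside that line. The 64-byte line size and the name split_address are assumptions chosen for illustration; real caches differ in line size and in how much of the address the tag stores.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value, for illustration only)


def split_address(address, line_size=LINE_SIZE):
    """Return (line_number, offset) for a byte address.

    line_number identifies which line-sized block of main memory the
    address falls in; offset is the byte position inside that block.
    """
    return address // line_size, address % line_size


line_number, offset = split_address(0x1234)
print(line_number, offset)  # 72 52

# Two nearby addresses share the same line, so one fetch serves both.
same_line = split_address(0x1234)[0] == split_address(0x1235)[0]
```

All addresses with the same line number are brought into the cache together, which is why an access to one byte makes its neighbours cheap to reach.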
2) Cache hit and cache miss
When the processor needs to read or write a location in main memory, it first checks whether that memory location is in the cache. This is accomplished by comparing the address of the memory location to all tags in the cache that might contain that address. If the processor finds that the memory location is in the cache, we say that a cache hit has occurred; otherwise, we speak of a cache miss. In the case of a cache hit, the processor immediately reads or writes the data in the cache line. The proportion of accesses that result in a cache hit is known as the hit rate, and is a measure of the effectiveness of the cache for a given program or algorithm.
In the case of a miss, the cache allocates a new entry, which comprises the tag just missed and a copy of the data. The reference can then be applied to the new entry just as in the case of a hit.
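The hit and miss behaviour described above can be sketched as a small software model. This is a simplified illustration, not a description of any real hardware: the class name TinyCache, the 4-byte line size, the unlimited capacity (no eviction) and the choice to update only the cached copy on a write are all assumptions.

```python
class TinyCache:
    """Toy cache: a dict mapping a tag (line number) to a copy of the line."""

    def __init__(self, memory, line_size=4):
        self.memory = memory        # backing "main memory", a list of values
        self.line_size = line_size  # assumed line size
        self.lines = {}             # tag -> list holding a copy of the line
        self.hits = 0
        self.misses = 0

    def _lookup(self, address):
        tag = address // self.line_size
        if tag in self.lines:
            self.hits += 1          # cache hit: the line is already present
        else:
            self.misses += 1        # cache miss: allocate a new entry
            start = tag * self.line_size
            self.lines[tag] = list(self.memory[start:start + self.line_size])
        return self.lines[tag], address % self.line_size

    def read(self, address):
        line, offset = self._lookup(address)
        return line[offset]

    def write(self, address, value):
        line, offset = self._lookup(address)
        line[offset] = value        # only the cached copy changes in this model

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


memory = list(range(16))
cache = TinyCache(memory)
values = [cache.read(a) for a in [0, 1, 2, 3, 4, 0]]
print(cache.hits, cache.misses)  # 4 2
```

In this trace the first access to each line misses and every later access to the same line hits, giving a hit rate of 4/6.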
3) Read miss and write miss
Read misses delay execution because they require data to be transferred from a much slower memory than the cache itself.
Write misses may occur without such a penalty, since the processor can continue execution while the data is copied to main memory in the background.
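The difference between the two kinds of miss can be illustrated with a rough stall-counting model. Everything specific here is an assumption made for the sketch: the 100-cycle miss penalty, the function name stall_cycles, the unbounded write buffer that never fills, and the rule that a written line counts as cached afterwards.

```python
from collections import deque

MISS_PENALTY = 100  # cycles to reach main memory (hypothetical figure)


def stall_cycles(trace, cached_lines):
    """Count stall cycles for a trace of ("r" or "w", line_number) accesses.

    Returns (stalls, pending_writes). Read misses stall the processor;
    write misses are queued in a write buffer and cost no stall here.
    """
    cached = set(cached_lines)
    write_buffer = deque()
    stalls = 0
    for op, line in trace:
        if line in cached:
            continue                   # hit: no stall
        if op == "r":
            stalls += MISS_PENALTY     # processor waits for the data
        else:
            write_buffer.append(line)  # copied out in the background
        cached.add(line)
    return stalls, list(write_buffer)


stalls, pending = stall_cycles(
    [("r", 1), ("w", 2), ("r", 2), ("r", 3)], cached_lines=[]
)
print(stalls, pending)  # 200 [2]
```

Only the two read misses contribute to the stall count; the write miss is absorbed by the buffer, and the later read of the same line hits.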
Instruction caches are similar to data caches, but the CPU performs only read accesses (instruction fetches) to the instruction cache. Instruction and data caches can be separated for higher performance, as in Harvard-architecture CPUs, or combined to reduce hardware overhead.
2 Temporal locality and spatial locality
(Try to understand addressing inside a cache line later; it is important for understanding temporal and spatial locality.)
Temporal locality: if at one point in time a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. There is temporal proximity between adjacent references to the same memory location. In this case it is common to store a copy of the referenced data in special memory storage that can be accessed faster. Temporal locality is a special case of spatial locality, namely the case where the prospective location is identical to the present location.
Spatial locality: if a particular memory location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. In this case it is common to attempt to guess the size and shape of the area around the current reference for which it is worthwhile to prepare faster access.
More generally,
- Temporal locality: The concept that a resource that is referenced at one point in time will be referenced again sometime in the near future.
- Spatial locality: The concept that the likelihood of referencing a resource is higher if a resource near it was just referenced.
- Good temporal locality ⇒ cache miss traffic decreases fast when cache size increases
- Good spatial locality ⇒ cache miss traffic does not increase much when line size increases
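The last two points can be checked with a small simulation. The sketch below assumes a fully associative cache with least-recently-used replacement and measures miss traffic as misses multiplied by line size; the function name miss_traffic, the access patterns and all sizes are hypothetical choices made for illustration.

```python
from collections import OrderedDict


def miss_traffic(addresses, capacity_bytes, line_size):
    """Bytes fetched from main memory by a fully associative LRU cache."""
    max_lines = capacity_bytes // line_size
    lines = OrderedDict()  # line number -> None, least recently used first
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in lines:
            lines.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= max_lines:
                lines.popitem(last=False)  # evict the least recently used line
            lines[line] = None
    return misses * line_size


# Good spatial locality: scan 4096 consecutive bytes.
sequential = list(range(4096))
# Poor spatial locality: touch one byte in every 256.
strided = [i * 256 for i in range(64)]

seq_16 = miss_traffic(sequential, 1024, 16)  # 4096
seq_64 = miss_traffic(sequential, 1024, 64)  # 4096, unchanged
str_16 = miss_traffic(strided, 1024, 16)     # 1024
str_64 = miss_traffic(strided, 1024, 64)     # 4096, four times larger

# Good temporal locality: 10 passes over the same 8 lines of 64 bytes.
loop = [i * 64 for i in range(8)] * 10
loop_small = miss_traffic(loop, 256, 64)     # 5120: every access misses
loop_large = miss_traffic(loop, 512, 64)     # 512: only the first pass misses
```

With the sequential scan, quadrupling the line size leaves the traffic unchanged because every fetched byte is used. With the strided pattern the traffic quadruples because most of each line is wasted. For the repeated loop, doubling the cache so that the working set fits cuts the traffic by a factor of ten.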