
L2 Cache

Temporary storage between the CPU and main memory
The L2 cache is temporary storage between the CPU and main memory. Its capacity is smaller than that of memory, but its exchange speed is faster, and the size of the L2 cache has a large effect on CPU performance.
Chinese name: L2 cache
Foreign name: L2 caches
Purpose: temporary storage between the CPU and main memory
Main features: smaller capacity than memory, but a faster exchange speed

Product Introduction

The L2 cache is temporary storage between the CPU and main memory; its capacity is smaller than that of memory, but its exchange speed is faster. The data held in the cache is only a small part of what is in memory, but it is the part the CPU is about to access in the near future. When the CPU calls for that data, it can bypass memory and read it directly from the cache, which speeds up access. Adding a cache to the CPU is therefore an efficient solution: the combined storage system (cache + memory) has both the high speed of the cache and the large capacity of memory. The cache has a great impact on CPU performance, mainly because of the order in which the CPU exchanges data and the bandwidth between the CPU and the cache.

L2 Principle

The working principle of the cache is that when the CPU wants to read a piece of data, it first looks in the cache. If the data is found, it is read immediately and sent to the CPU for processing; if it is not found, it is read from memory at a comparatively slow speed and sent to the CPU, and at the same time the data block containing it is loaded into the cache, so that subsequent reads of that whole block can be served from the cache without accessing memory again.
It is this reading mechanism that gives CPU cache reads a very high hit rate (most CPUs reach about 90%). In other words, about 90% of the data the CPU needs next is already in the cache, and only about 10% has to be read from memory. This greatly reduces the time the CPU spends reading memory directly and largely eliminates the need for the CPU to wait for data. In general, the CPU reads data from the cache first and from memory second. [1]
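The read path described above can be sketched as a small simulation in C. This is only a minimal illustration of the cache-first, memory-second lookup for a hypothetical direct-mapped cache; the names (cache_read, BLOCK_SIZE, NUM_LINES) and the sizes are assumptions chosen for the example, not taken from any real CPU.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK_SIZE 64                      /* bytes per cache line (data block) */
    #define NUM_LINES  256                     /* lines in this toy cache */
    #define MEM_SIZE   (1 << 20)               /* 1 MB of simulated RAM */

    typedef struct {
        int      valid;                        /* does the line hold a block? */
        uint32_t tag;                          /* which memory block it holds */
        uint8_t  data[BLOCK_SIZE];             /* copy of that block */
    } cache_line;

    static cache_line cache[NUM_LINES];
    static uint8_t    memory[MEM_SIZE];        /* stands in for slow main memory */

    /* Read one byte: look in the cache first; on a miss, pull the whole
     * block containing addr into the cache so later reads of it hit. */
    static uint8_t cache_read(uint32_t addr)
    {
        uint32_t block = addr / BLOCK_SIZE;
        uint32_t index = block % NUM_LINES;    /* direct-mapped placement */
        cache_line *line = &cache[index];

        if (line->valid && line->tag == block)
            return line->data[addr % BLOCK_SIZE];   /* hit: no memory access */

        /* miss: fetch the whole data block from memory into the cache */
        memcpy(line->data, &memory[block * BLOCK_SIZE], BLOCK_SIZE);
        line->valid = 1;
        line->tag   = block;
        return line->data[addr % BLOCK_SIZE];
    }

    int main(void)
    {
        memory[1000] = 42;
        memory[1001] = 7;
        printf("%d\n", cache_read(1000));      /* miss: block loaded from RAM */
        printf("%d\n", cache_read(1001));      /* hit: same 64-byte block, served by cache */
        return 0;
    }

The first call misses and loads the whole 64-byte block from the simulated memory; the second call hits because address 1001 lies in the same block, which is exactly the "whole block can be read from the cache" behaviour described above.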

L2 History

The earliest CPU caches were a single undivided unit with very small capacity; Intel began to split the cache into levels in the Pentium era. The cache integrated in the CPU core could no longer meet the CPU's needs, and the manufacturing process could not greatly increase its capacity, so an additional cache was placed on the same circuit board as the CPU or on the motherboard. The cache integrated in the CPU core came to be called the L1 cache, and the external one the L2 cache. The L1 cache is further divided into a data cache (D-Cache) and an instruction cache (I-Cache); one stores data and the other stores the instructions that operate on that data, and both can be accessed by the CPU at the same time, which reduces conflicts caused by contention for the cache and improves processor performance. When Intel introduced the Pentium 4 processor, it also added a new level 1 trace cache with a capacity of 12KB.
Development Process
With the development of CPU manufacturing technology, the L2 cache can now also be integrated easily into the CPU core, and its capacity has risen year by year, so defining the L1 and L2 caches by whether or not they are integrated in the CPU is no longer accurate. Moreover, once the L2 cache was integrated into the CPU core, it no longer had to run at a divided clock; it now runs at the same frequency as the CPU core, and can therefore provide the CPU with a higher transfer speed.
The L2 cache is one of the keys to CPU performance. With the CPU core unchanged, increasing the L2 cache capacity can greatly improve performance, and the difference between high-end and low-end CPUs built on the same core often lies in the L2 cache. This shows how important the L2 cache is to the CPU.
When the CPU finds the data it needs in the cache, this is called a hit; when the cache does not contain the required data (a miss), the CPU accesses memory instead. Theoretically, in a CPU with an L2 cache, the hit rate of the L1 cache is about 80%; that is, about 80% of the data the CPU needs is found in the L1 cache, and the remaining 20% is read from the L2 cache. Since the data to be executed cannot be predicted exactly, the hit rate when reading the L2 cache is also about 80%, so roughly 16% of all data is read from the L2 cache. The remainder must then be fetched from memory, but this is a relatively small proportion. Higher-end CPUs also have an L3 cache, which is designed to hold data missed by the L2 cache; in CPUs with an L3 cache, only about 5% of the data has to be fetched from memory, which further improves CPU efficiency.
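As a quick check of the arithmetic above, the following sketch in C multiplies out the hit rates quoted in the text (about 80% for the L1 cache, and about 80% of the L1 misses for the L2 cache); the figures are the article's illustrative estimates, not measurements.

    #include <stdio.h>

    int main(void)
    {
        double l1_hit = 0.80;                        /* share of reads served by L1 */
        double l2_hit = 0.80 * (1.0 - l1_hit);       /* 80% of the 20% that miss L1 = 16% */
        double memory_share = 1.0 - l1_hit - l2_hit; /* whatever misses both levels */

        printf("L1: %.0f%%  L2: %.0f%%  memory: %.0f%%\n",
               l1_hit * 100, l2_hit * 100, memory_share * 100);
        /* prints: L1: 80%  L2: 16%  memory: 4% */
        return 0;
    }

The output (80%, 16%, 4%) is where the 16% figure in the paragraph above comes from; an L3 cache would then absorb part of the remaining few percent.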
To keep the hit rate high when the CPU accesses the cache, the cache contents must be replaced according to some algorithm. A commonly used one is the least recently used (LRU) algorithm, which evicts the lines that have been accessed least in the recent period. This requires a counter for each line: the LRU algorithm clears the counter of the line that was hit and adds 1 to the counters of all other lines, and when a replacement is needed, the line with the largest counter value is evicted. This is an efficient and sound algorithm: its counter-clearing process keeps frequently used data in the cache while removing data that is no longer needed, improving cache utilization.
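Below is a minimal sketch in C of the counter-based LRU replacement just described, for a single hypothetical fully associative set of lines: on a hit the hit line's counter is cleared and the others are incremented, and on a miss the line with the largest counter (or an empty line) is replaced. The structure and function names, and the handling of newly filled lines, are assumptions made for the example.

    #include <stdio.h>

    #define NUM_LINES 2                      /* lines in one (toy) cache set */

    typedef struct {
        int      valid;
        unsigned tag;                        /* which block the line holds */
        unsigned counter;                    /* age: higher = less recently used */
    } lru_line;

    static lru_line set[NUM_LINES];

    /* Access a block: returns 1 on a hit, 0 on a miss (after filling a line). */
    static int lru_access(unsigned tag)
    {
        int victim = 0;

        /* Look for a hit first. */
        for (int i = 0; i < NUM_LINES; i++) {
            if (set[i].valid && set[i].tag == tag) {
                /* Hit: clear this line's counter, add 1 to all the others. */
                for (int j = 0; j < NUM_LINES; j++)
                    if (j != i)
                        set[j].counter++;
                set[i].counter = 0;
                return 1;
            }
        }

        /* Miss: evict the line with the largest counter (least recently used),
         * preferring an empty line if one exists. */
        for (int i = 0; i < NUM_LINES; i++) {
            if (!set[i].valid) { victim = i; break; }
            if (set[i].counter > set[victim].counter)
                victim = i;
        }
        for (int j = 0; j < NUM_LINES; j++)
            if (j != victim)
                set[j].counter++;
        set[victim].valid   = 1;
        set[victim].tag     = tag;
        set[victim].counter = 0;
        return 0;
    }

    int main(void)
    {
        unsigned trace[] = {1, 2, 1, 3, 1, 2};
        for (int i = 0; i < 6; i++)
            printf("block %u: %s\n", trace[i],
                   lru_access(trace[i]) ? "hit" : "miss");
        return 0;
    }

With only two lines, block 2 is the least recently used line when block 3 arrives, so it is evicted and the final access to block 2 misses again; the trace prints miss, miss, hit, miss, hit, miss.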
In CPU products, the capacity of the L1 cache is basically between 4KB and 18KB, while L2 cache capacities come in sizes such as 128KB, 256KB, 512KB and 1MB. L1 cache capacity differs little between products, so the L2 cache capacity is the key to improving CPU performance. Increases in L2 cache capacity are driven by the CPU manufacturing process: a larger capacity inevitably means more transistors in the CPU, and integrating a larger cache within the limited die area places higher demands on the manufacturing process. [2]

Usage Method

The L2 cache of a dual-core CPU is a special case. Compared with earlier single-core CPUs, the most important requirement is that the data held in the caches of the two cores remain consistent; otherwise errors will occur. Different CPUs solve this problem in different ways:
Intel's dual-core CPUs mainly include the Pentium D, Pentium EE and Core Duo. The Pentium D and Pentium EE use exactly the same L2 cache arrangement: each of the two cores has its own independent L2 cache. The Smithfield-core CPUs of the 8xx series have 1MB per core, while the Presler-core CPUs of the 9xx series have 2MB per core. Synchronizing the cache data between the two cores of these CPUs relies on the arbitration unit in the north bridge chip on the motherboard and is carried out over the front-side bus, so the data-latency problem is serious and the performance is not satisfactory.
The Core Duo uses the Yonah core, whose L2 cache is a single 2MB second-level cache shared by the two cores. The shared second-level cache, together with Intel's "Smart Cache" shared-cache technology, achieves true cache data synchronization, significantly reduces data latency, lowers front-side bus occupancy and performs well; it is the most advanced L2 cache architecture on a dual-core processor. Intel's subsequent dual-core processors also use this "Smart Cache" technology, in which the two cores share the L2 cache.
The Athlon 64 X2 CPU mainly uses the Manchester and Toledo cores. In both, each of the two cores has its own independent second-level cache: 512KB per core for Manchester and 1MB per core for Toledo. Cache data synchronization between the two cores is controlled by the CPU's built-in System Request Interface (SRI) and takes place entirely inside the CPU. This uses very few CPU resources, does not occupy memory bus resources, and its data latency is significantly lower than that of Intel's Smithfield and Presler cores, so its coordination efficiency is clearly better than those two. However, because the caches of the two cores are still independent of each other, this architecture remains clearly inferior to Intel's shared-cache Smart Cache technology represented by the Yonah core.