Shared memory is a parallel architecture in which two or more processors share a main memory. Each processor can store information into and retrieve information from that main memory, and processors communicate with one another by accessing the shared memory.[1]
Shared memory, as its name implies, is a segment of physical storage space; a shared memory segment has a size and a physical storage address. A process that wants to access the segment can attach the storage region to any suitable place in its own address space, and other processes can do the same. In this way, multiple processes access the same physical storage, each mapping it into its own logical address space.
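The attach-by-name mechanism described above can be sketched with Python's `multiprocessing.shared_memory` module (the segment size and contents here are illustrative):

```python
# Two handles attached to one named shared segment, analogous to two
# processes mapping the same physical storage into their address spaces.
from multiprocessing import shared_memory

# "Process A": create a named segment and write into it.
seg = shared_memory.SharedMemory(create=True, size=16)
seg.buf[:5] = b"hello"

# "Process B": attach the same physical segment by its name.
other = shared_memory.SharedMemory(name=seg.name)
received = bytes(other.buf[:5])
print(received)  # b'hello' -- both handles see the same bytes

other.close()
seg.close()
seg.unlink()  # release the segment once no process needs it
```

In a real multi-process program, only the segment's name needs to be communicated; each process then maps the same physical storage independently.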
A shared-memory computer system supports the traditional single-address-space programming model, which reduces the burden on programmers; it is highly general, and existing application software can be ported to it easily. Early shared-memory computers used centralized shared memory: multiple processors were connected to the shared memory through a bus, a crossbar switch, or a multistage interconnection network, and every processor saw the same latency when accessing memory. However, with multiple processors sharing it, the memory becomes the bottleneck of the system. Therefore, in large-scale shared-memory computer systems the shared memory is divided into many modules distributed among the nodes (a node may contain one or more processors); such a system is called a distributed shared memory (DSM) computer system. Each node contains a part of the shared memory, and the nodes are connected through a scalable interconnection network. Distributing the memory and scaling the interconnect increase memory access bandwidth, but make memory access latency non-uniform.
To alleviate the contention caused by sharing memory and the access latency caused by distributing it, the processors in a shared-memory computer system generally have cache memories. The use of caches, however, introduces the cache coherence problem. In addition, distributing the memory gives a processor different latencies when accessing different memory locations. To guarantee the correct execution of parallel programs, a cache coherence protocol and a memory consistency model are needed.
A cache coherence protocol is the mechanism that propagates a value newly written by one processor to the other processors in the system. Cache coherence protocols are designed to implement a memory consistency model.
A memory consistency model is a convention between system designers and programmers: it provides the criteria for judging whether a shared-memory program and its executions are correct. The sequential consistency model is generally used as the standard for correct execution of shared-memory programs, and it is also the basis for defining the weaker consistency models. Under sequential consistency, the result of a parallel execution by multiple processors equals the result of interleaving the instruction streams executed by the individual processors in some order and running that interleaving on a single computer. If the result of a parallel execution in a multiprocessor environment matches a result the same program could produce in a single-processor multiprocess environment, the parallel execution is correct.
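The interleaving definition can be checked mechanically for a tiny program. The sketch below (illustrative, not from the source) enumerates every legal interleaving of the classic two-processor "store buffering" test; under sequential consistency, the outcome in which both loads return 0 can never appear:

```python
# P0 runs: x = 1; r0 = y      P1 runs: y = 1; r1 = x
# Sequential consistency = some single interleaving of the two streams,
# so one store must execute first and (r0, r1) == (0, 0) is impossible.
from itertools import combinations

P0 = [("store", "x"), ("load", "y", "r0")]
P1 = [("store", "y"), ("load", "x", "r1")]

def run(schedule):
    mem = {"x": 0, "y": 0}
    regs = {}
    for op in schedule:
        if op[0] == "store":
            mem[op[1]] = 1
        else:
            regs[op[2]] = mem[op[1]]
    return (regs["r0"], regs["r1"])

outcomes = set()
for slots in combinations(range(4), 2):      # positions taken by P0's ops
    it0, it1 = iter(P0), iter(P1)
    schedule = [next(it0) if i in slots else next(it1) for i in range(4)]
    outcomes.add(run(schedule))

print(sorted(outcomes))  # [(0, 1), (1, 0), (1, 1)] -- never (0, 0)
```

Real hardware with store buffers can produce the forbidden (0, 0) outcome, which is exactly why the weaker consistency models discussed next exist.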
In a shared-memory system, implementing the sequential consistency model requires strict restrictions on the order of memory access events. To relax these restrictions, a series of weak memory consistency models has been proposed. Their basic idea is this: although the sequential consistency model restricts the order of memory access events strictly enough to guarantee correct execution, those restrictions are redundant in the majority of cases, which involve no memory access conflicts. Programmers can therefore take on part of the responsibility for correctness by marking, in the program, the memory accesses for which consistency must be maintained; the system guarantees data consistency only at the points the user has marked, and may ignore data dependences between processors everywhere else.[1]
System structure
According to the characteristics of memory distribution, consistency maintenance and implementation mode, common shared storage system architectures are as follows:
(1) Centralized shared memory architecture without caches
In this architecture the processors have no caches, and multiple processors are connected to the memory through a crossbar switch or a multistage interconnection network. Since any storage unit has only one copy in the system, no cache coherence problem exists. The scalability of such systems is limited by the bandwidth of the crossbar switch or interconnection network. Typical examples are parallel vector machines and mainframes, such as the Cray X-MP and Y-MP C90 from Cray Research in the United States.
(2) Centralized shared memory architecture with caches
In systems of this type, each processor has a cache, and the processors are generally connected to the memory through a bus. Data consistency among the processors' caches is maintained by snooping the bus. Because the bus is an exclusive resource, the scalability of such systems is limited. This architecture is common in servers and workstations built as symmetric multiprocessor (SMP) systems, such as the multiprocessor workstation products of DEC, Sun, Sequent, SGI, and other companies.
(3) Distributed shared memory architecture with cache coherence
This architecture is called cache-coherent non-uniform memory access (CC-NUMA). The shared memory of such a system is distributed among the nodes, which are connected through a scalable interconnection network, and every processor may cache shared locations. Maintaining cache coherence is the key problem in these systems and determines their scalability; directory-based methods are commonly used to keep the processors' caches coherent. Examples of such systems include DASH and FLASH from Stanford University, Alewife from MIT, and the Origin 2000 from SGI.
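The directory idea can be illustrated with a toy write-invalidate sketch (the class and method names are invented for illustration): the directory records which nodes hold a cached copy of each block, and a write invalidates every other sharer's copy before it proceeds.

```python
# Toy sketch of directory-based write-invalidate coherence.
class Directory:
    def __init__(self):
        self.sharers = {}   # block -> set of node ids holding a cached copy
        self.memory = {}    # block -> current value

    def read(self, node, block):
        # A read adds the node to the block's sharer set.
        self.sharers.setdefault(block, set()).add(node)
        return self.memory.get(block, 0)

    def write(self, node, block, value):
        # Invalidate every other cached copy, then perform the write.
        invalidated = self.sharers.get(block, set()) - {node}
        self.memory[block] = value
        self.sharers[block] = {node}   # writer is now the sole holder
        return invalidated

d = Directory()
d.read(0, "A")            # node 0 caches block A
d.read(1, "A")            # node 1 caches block A
inv = d.write(2, "A", 7)  # node 2 writes: nodes 0 and 1 must be invalidated
print(sorted(inv))        # [0, 1]
```

A real protocol also distinguishes clean/dirty states and forwards requests to a dirty owner, but the sharer-tracking shown here is the core of the directory approach.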
(4) Cache-only memory architecture (COMA)
In this architecture, the memory of each node behaves as a large-capacity cache, and data consistency is maintained at that level. The addresses in such a system's shared memory are not fixed: a storage unit is decoupled from any particular physical address, and data can migrate and be replicated dynamically among the nodes' memories according to the memory access pattern. The advantage is that when a processor's access misses in its cache, the hit rate in the local shared memory is high. The disadvantage is that when an access also misses at the local node, a mechanism is needed to locate the current position of the accessed unit, because addresses are not fixed, so the latency is large. Systems using this architecture include the KSR1 from Kendall Square Research in the United States and the DDM from the Swedish Institute of Computer Science. This architecture is also often used in shared virtual memory systems.
(5) Distributed shared memory architecture without cache coherence
This architecture is called non-cache-coherent non-uniform memory access (NCC-NUMA). Its characteristic is that although every processor has a cache, the hardware is not responsible for maintaining cache coherence; that task is left to the compiler or the programmer. Its typical representatives are Cray's T3D and T3E product lines. In the T3D and T3E, the system provides users with library functions so that they can conveniently maintain data consistency themselves, for example by setting up critical sections. The benefit is strong scalability: high-end T3D and T3E configurations can reach thousands of processors.
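The Cray library calls themselves are not shown in the source; the sketch below uses Python's `threading.Lock` as a stand-in to show the idea that the programmer, not the hardware, marks the critical sections in which consistency must hold:

```python
# Programmer-maintained consistency: the system does not keep the shared
# counter consistent automatically; the programmer brackets the accesses
# that matter inside an explicit, user-marked critical section.
import threading

counter = 0
lock = threading.Lock()   # stand-in for a T3E-style library primitive

def add_many(n):
    global counter
    for _ in range(n):
        with lock:        # user-marked critical section
            counter += 1  # read-modify-write is safe only inside it

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000: every access was marked, so no update is lost
```

Without the lock, concurrent read-modify-write sequences could lose updates; marking the critical section is exactly the consistency responsibility that NCC-NUMA systems shift onto the user.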
(6) Shared virtual memory architecture
This architecture is also called software distributed shared memory. Its basic idea is to use software, in a massively parallel or cluster system based on message passing, to organize the independently addressed memories distributed among the nodes into a single, uniformly addressed shared memory space. It combines the programmability of shared-memory systems with the hardware simplicity of message-passing systems; its main problem is high communication overhead. In some high-performance computer systems, shared memory within each symmetric multiprocessor node is provided in hardware, while sharing across nodes is realized in software. Common shared virtual memory systems include Ivy, Midway, Munin, TreadMarks, and JIAJIA.[1]