
Storage management

A term in the software design discipline
The object of memory management is main memory, also simply called memory. Its main functions include the allocation and reclamation of main memory space, improvement of main memory utilization, expansion of main memory, and effective protection of the information in main memory. [1]
Chinese name: Storage management
Foreign name: storage management
Partitioned storage: static partition, variable partition, relocatable partition

Storage management scheme

The main purpose of a storage management scheme is to solve the problem of multiple users sharing main memory. The main schemes are partitioned storage management, paging storage management, segmented storage management, segment-page storage management, and virtual storage management. [1]

Partitioned storage

There are three variants of partitioned storage management: static partitioning, variable partitioning, and relocatable partitioning.
Static partition
Static partition storage management divides the allocatable main memory space into several consecutive areas in advance. The areas may be of equal or different sizes. To describe the allocation and use of each partition, storage management maintains a "main memory allocation table" that records each partition's starting address and length. An occupied flag in the table indicates whether the partition is in use: a flag of "0" means the partition is free. When allocating main memory, the allocator always selects among the partitions marked "0"; when a partition is assigned to a job, the job's name is written into the occupied flag column for that partition. Under static partition storage management, the utilization of main memory space is not high. [2]
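The allocation table described above can be sketched in Python. This is an illustrative model, not operating system code; the partition layout and job names are hypothetical.

```python
# Toy model of a static-partition main memory allocation table.
# Each row records a partition's start address, length, and occupant:
# "0" means free, a job name means allocated.

class StaticPartitionTable:
    def __init__(self, partitions):
        # partitions: list of (start, length) pairs, fixed in advance
        self.table = [{"start": s, "length": n, "occupied": "0"}
                      for s, n in partitions]

    def allocate(self, job, size):
        """Give the job the first free partition large enough for it."""
        for row in self.table:
            if row["occupied"] == "0" and row["length"] >= size:
                row["occupied"] = job
                return row["start"]
        return None  # no suitable partition: the job must wait

    def release(self, job):
        """Mark the job's partition free again when it withdraws."""
        for row in self.table:
            if row["occupied"] == job:
                row["occupied"] = "0"

# three hypothetical partitions of 32, 64, and 128 units
table = StaticPartitionTable([(0x0000, 32), (0x8000, 64), (0x18000, 128)])
addr = table.allocate("JOB1", 48)  # skips the too-small first partition
```

Note that JOB1 needs only 48 units but occupies a 64-unit partition, illustrating why utilization is not high under static partitioning.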
Variable partition
The variable partitioning method partitions by job size. When a job is to be loaded, the system checks whether main memory has enough space for the amount the job requires. If it does, a partition of the required size is carved out and allocated to the job; if not, the job waits for main memory space. Because partition sizes are determined by the actual needs of jobs, and the number of partitions is likewise not fixed, this method avoids the main memory waste of the fixed partition scheme.
As jobs are loaded and withdrawn, main memory becomes divided into many partitions, some occupied by jobs and others free. When a new job is to be loaded, a sufficiently large free area must be found and the job loaded into it. If the free area found is larger than the job needs, it is split in two: one part is occupied by the job, and the other becomes a smaller free area. When a job finishes and withdraws, if the area it returns is adjacent to other free areas, they can be combined into a larger free area, which makes it easier to load large jobs.
Variable partition scheduling algorithm
(1) First fit algorithm. On each allocation, the table of free areas is searched in order until the first free area that satisfies the length requirement is found. The found area is split: one part is allocated to the job, and the other remains free. This algorithm can split large free areas into small pieces, producing more main memory "fragments".
(2) Best fit algorithm. From the free areas that can satisfy the job's requirement, the smallest is selected. This avoids splitting larger areas, so that the requirements of large jobs can more easily be met later. With this algorithm, the free areas can be kept sorted in ascending order of size, and the search always starts from the smallest area and stops at the first one that satisfies the request.
(3) Worst fit algorithm. The largest free area is selected for the job, so that the remaining free area is not too small; this benefits small and medium jobs. With this algorithm, the free areas can be kept sorted in descending order of size, and the search always starts from the largest area. The table must then be re-sorted when a partition is reclaimed. [2]
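The three placement policies can be contrasted with a small Python sketch over a hypothetical list of free areas, each a (start, size) pair; each function returns the index of the chosen area, or None if no area fits.

```python
def first_fit(free, need):
    # take the first area long enough for the request
    for i, (_, size) in enumerate(free):
        if size >= need:
            return i
    return None

def best_fit(free, need):
    # take the smallest area that still satisfies the request
    fits = [(size, i) for i, (_, size) in enumerate(free) if size >= need]
    return min(fits)[1] if fits else None

def worst_fit(free, need):
    # take the largest area, so the remainder stays usable
    fits = [(size, i) for i, (_, size) in enumerate(free) if size >= need]
    return max(fits)[1] if fits else None

# hypothetical free list: areas of 100, 30, and 60 units
free = [(0, 100), (200, 30), (400, 60)]
```

For a request of 50 units, first fit picks the 100-unit area (index 0), best fit the 60-unit area (index 2), and worst fit the 100-unit area (index 0).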

Paging storage

Paging storage management divides the logical address space of a process into pieces of equal size, called pages, and numbers them starting from 0: page 0, page 1, and so on. Correspondingly, memory space is divided into storage blocks of the same size as a page, called (physical) blocks or page frames, which are numbered in the same way: block 0, block 1, and so on. When memory is allocated to a process, its pages are loaded into physical blocks that need not be adjacent. Because the last page of a process often does not fill a whole block, unusable space is left over, called internal (in-page) fragmentation.
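Splitting a logical address into a page number and an in-page offset can be shown in a couple of lines. The 4 KB page size here is an assumption for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages, so the low 12 bits are the offset

def split_address(vaddr):
    """Split a logical address into (page number, in-page offset)."""
    return vaddr // PAGE_SIZE, vaddr % PAGE_SIZE

page, offset = split_address(0x56A4)  # address 0x5000 + 0x6A4
```

Here 0x56A4 falls on page 5 at offset 0x6A4.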

Segmented storage

In segmented storage management, the job address space is divided into several segments, each of which holds one set of logical information: for example, a main program segment MAIN, a subprogram segment X, a data segment D, and a stack segment S. Each segment has its own name; for simplicity, a segment number usually stands in for the segment name. Each segment is addressed from 0 and uses a contiguous address space within the segment. A segment's length is determined by the length of the corresponding group of logical information, so segments are of unequal length. Because it is divided into multiple segments, the address space of the whole job is two-dimensional: a logical address consists of a segment number (segment name) and an offset within the segment.
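The two-dimensional (segment, offset) address can be sketched with a hypothetical segment table mapping segment numbers to a base address and length; the base addresses and lengths below are invented for illustration.

```python
# Hypothetical segment table: segment number -> (base address, length).
segment_table = {0: (0x1000, 0x400),   # MAIN
                 1: (0x6000, 0x200)}   # subprogram X

def seg_translate(seg, offset):
    """Translate a (segment, offset) pair to a physical address."""
    base, length = segment_table[seg]
    if offset >= length:
        raise IndexError("offset beyond segment length")  # protection check
    return base + offset
```

Because each segment's length differs, the length check is what makes segment-level protection possible.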

Segment page storage

The segment-page system combines the principles of basic segmented storage management and basic paging storage management: the user program is divided into several segments, each segment is divided into several pages, and each segment is given a segment name.

Virtual Storage

When a program's storage requirement is larger than the actual memory space, the program is difficult to run. Virtual storage technology combines the actual memory space with the relatively large external storage space to form a virtual storage space far larger than the actual memory, and the program runs in this virtual storage space. The basis for virtual storage is the principle of locality of program execution: while running, a program tends to operate within a limited local range. In time, it tends to execute the same instructions and access the same data repeatedly (temporal locality); in space, the instructions and data it uses tend to be concentrated in a local region of storage (spatial locality).

function

Virtual memory technology not only allows the system to use more memory than is physically present, but also provides the following functions:

Addressing space

The operating system makes the system appear to have far more memory than it physically has: virtual memory can be many times the actual physical memory in the system. Each process runs in its own independent virtual address space; these spaces are completely isolated from one another, so processes cannot affect each other. The hardware virtual memory mechanism can also mark areas of memory as non-writable, protecting code and data from malicious programs.

Memory mapping

Memory mapping technology can map the image file and data file directly to the address space of the process. In memory mapping, the contents of the file are directly connected to the virtual address space of the process.

Physical memory allocation

The memory management subsystem allows each running process in the system to share the physical memory of the system fairly.

Shared Virtual Memory

Although virtual memory gives each process its own virtual address space, it is sometimes necessary to share memory between processes. For example, several processes in the system may be running the BASH command shell at the same time. Rather than keeping a copy of BASH in each process's virtual memory space, a better solution is to keep only one copy in the system's physical memory and share it among the processes. Dynamic libraries are another common way of sharing executable code between processes. Shared memory can also be used for interprocess communication (IPC), with multiple processes exchanging information through a common region of memory. Linux supports the System V shared memory IPC mechanism.

Abstract model

Abstract model of virtual address to physical address mapping
Before discussing how Linux implements virtual memory support, it is worth looking at a simpler abstract model.
When the processor executes a program, it reads each instruction from memory and decodes it. Decoding the instruction may require fetching a value from, or storing a value to, a location in memory. The processor then executes the instruction and moves on to the next one in the program. Throughout this process the processor must access memory frequently, either to fetch instructions or to fetch and store data.
In a virtual memory system, all addresses are virtual addresses rather than physical addresses. The processor translates each virtual address into a physical address using a series of tables maintained by the operating system.
To make this translation manageable, both virtual memory and physical memory are organized into pages. Different systems may use different page sizes (a mixture of page sizes would be inconvenient to manage): Linux running on the Alpha AXP processor uses 8 KB pages, while Intel x86 systems use 4 KB pages. Each page is identified by a number called the page frame number (PFN).
A virtual address in paged mode consists of two parts: the virtual page frame number and the offset within the page. If the page size is 4 KB, bits 11:0 of the virtual address hold the offset and bits 12 and above hold the virtual page frame number. The processor separates the two parts, converts the virtual page frame number into a physical page frame number with the help of the page table, and then accesses the physical page at the given offset.
Figure 3.1 shows the virtual address spaces of two processes X and Y, which have their own page tables. These page tables map virtual pages of each process to physical pages in memory. In the figure, the virtual page frame number 0 of process X is mapped to the physical page frame number 1. Theoretically, each page table entry should include the following contents:
1. A valid flag, indicating whether the page table entry is valid.
2. The physical page frame number this entry describes.
3. Access control information, describing the operations that may be performed on the page: is it writable? Does it contain executable code?
The virtual page frame number is used as the offset into the page table: virtual page frame number 5 corresponds to the sixth entry in the table (0 being the first).
To convert a virtual address to a physical address, the processor must first split it into a virtual page frame number and a page offset. The page size is generally a power of 2, which makes the split easy. With the page size in Figure 3.1 set to 0x2000 bytes (8192 decimal), an address of 0x2194 in the virtual address space of process Y splits into virtual page frame number 1 and in-page offset 0x194.
The processor uses the virtual page frame number as an index into the process's page table to retrieve a page table entry. If the entry at that position is valid, the processor takes the physical page frame number from it. If the entry is invalid, the process has accessed a non-existent region of its virtual memory. In this case the processor cannot perform the address translation and must pass control to the operating system to resolve the situation.
How the processor passes control to the operating system when a process tries to access a virtual address it cannot validly translate is processor specific. Commonly, the processor raises a page fault and traps into the operating system kernel, which is then given the faulting virtual address and the reason for the page fault.
Taking Figure 3.1 as an example, virtual page frame number 1 of process Y is mapped to system physical page frame number 4, so its starting position in physical memory is 0x8000 (4 * 0x2000). Adding the 0x194 byte offset gives the final physical address 0x8194.
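The whole translation for this worked example can be written out as a short sketch. The page table contents mirror Figure 3.1; everything else is illustrative.

```python
PAGE_SIZE = 0x2000  # 8 KB pages, as in Figure 3.1

# page table of process Y: virtual page frame number -> physical page frame number
page_table_y = {1: 4}

def translate(vaddr, page_table):
    """Translate a virtual address via the page table, or fault."""
    vpfn, offset = vaddr // PAGE_SIZE, vaddr % PAGE_SIZE
    if vpfn not in page_table:
        raise LookupError("page fault")  # control passes to the OS
    return page_table[vpfn] * PAGE_SIZE + offset

physical = translate(0x2194, page_table_y)  # 4 * 0x2000 + 0x194 = 0x8194
```

An address on an unmapped page, such as 0x4000 (virtual page frame 2), raises the fault instead of returning an address.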
By mapping virtual addresses to physical addresses in this way, virtual memory can map onto the system's physical pages in any order. In Figure 3.1, for example, virtual page frame number 0 of process X is mapped to physical page frame number 1, while virtual page frame number 7 is mapped to physical page frame number 0, even though the latter virtual page frame number is higher. This leads to an interesting property of virtual memory: its pages need not be kept in any particular order in physical memory.

Demand paging

In a system where physical memory is much smaller than virtual memory, the operating system must use physical memory efficiently. One way to save physical memory is to load only those virtual pages currently being used by the executing program. For example, a database program may need to run a query; not all of the database needs to be loaded into memory, only the parts being examined. If the query only searches the database and adds no records, loading the code for adding records is pointless. This technique of loading only the virtual pages that are actually accessed is called demand paging.
A page fault occurs when a process attempts to access a virtual address for which the processor cannot find an entry in the page table. In Figure 3.1, the page table of process X has no entry for virtual page frame number 2, so when process X attempts to access its contents, the processor cannot convert the address into a physical one and notifies the operating system that a page fault has occurred. If the faulting virtual address is invalid, the process was trying to access a virtual address that does not exist. This may be an application error, for example a random write into memory; in that case the operating system terminates the application to protect the other processes in the system from it.
If the faulting virtual address is valid but the page it refers to is not currently in memory, the operating system must read the page from the disk image into memory. Because disk access takes a long time, the process must wait until the page has been fetched; if other processes are runnable, the operating system selects one of them to run while the page is read. The fetched page is placed into a free physical page frame, and an entry for the virtual page frame number is added to the process's page table. The process is then restarted at the instruction where the page fault occurred; this time the virtual memory access succeeds, the processor can complete the virtual-to-physical translation, and the process continues running.
Linux uses demand paging to load an executable image into a process's virtual memory. When a command is executed, its executable file is opened and its contents are mapped into the process's virtual memory. This is done by modifying the data structures describing the process's memory map and is known as memory mapping. However, only the beginning of the image is actually brought into physical memory; the rest remains on disk. As the image executes, it generates page faults, which Linux uses to decide which parts of the image to bring into memory so that execution can continue.

Swapping

If a process needs to bring a virtual page into physical memory when no free physical pages are available, the operating system must discard some pages from physical memory to make room.
If the pages discarded from physical memory came from an executable image or a data file and have not been modified, they need not be saved. When the process needs such a page again, it can simply be read back from the executable or data file.
However, if a page has been modified, the operating system must preserve its contents so that it can be accessed again. Such pages are called dirty pages; when they are removed from memory they must be saved in a special file called the swap file. Relative to the speed of the processor and physical memory, access to the swap file is very slow, and the operating system must balance writing dirty pages to disk against keeping them in memory.
The algorithm used to choose which pages to discard or swap out matters: if it is inefficient, "thrashing" occurs, in which pages are continuously written to disk and read back, leaving the operating system too busy to do any real work. Taking Figure 3.1 as an example, if physical page frame number 1 is being accessed frequently, it is a poor candidate for swapping to disk. The set of pages a process is currently using is called its working set; an efficient swap strategy keeps the working sets of all processes in physical memory.
Linux uses a least recently used (LRU) page-aging algorithm to choose fairly which pages to discard from the system. Each page in the system has an age that changes as the page is accessed: the more a page is accessed, the younger it becomes; the less it is accessed, the older. Old pages are the best candidates for swapping out.
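The aging idea can be modelled with a toy sketch. The per-tick aging step and the "youth bonus" for an access (3 here) are invented constants for illustration, not the values Linux uses.

```python
# Toy model of LRU page aging: accesses make a page younger,
# clock ticks make every page older, and the oldest page is evicted.

class PageAging:
    def __init__(self, pages):
        self.age = {p: 0 for p in pages}

    def touch(self, page):
        # an access makes the page younger (assumed bonus of 3)
        self.age[page] = max(0, self.age[page] - 3)

    def tick(self):
        # time passing makes every page older
        for p in self.age:
            self.age[p] += 1

    def victim(self):
        # the oldest page is the best candidate to discard
        return max(self.age, key=self.age.get)

ager = PageAging([0, 1, 2])
ager.tick(); ager.tick()       # two clock ticks age every page
ager.touch(0); ager.touch(1)   # pages 0 and 1 are in the working set
```

After this sequence, page 2 is the oldest and would be chosen for eviction.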

virtual memory

Virtual memory makes it easy for processes to share memory, since all memory accesses are made through each process's own page table. For two processes sharing the same physical page, each process's page table must contain a page table entry pointing to that physical page frame number.
The virtual address at which the shared physical page appears may differ from process to process.

Storage knowledge structure

1. System management: UNIX/Linux/Windows operating system administration.
2. Development technology: C/C++, network programming, multi-process/multi-thread programming, interprocess communication.
3. Storage foundations: installation, configuration, and debugging of disks, RAID arrays, file systems, and other storage-related hardware and software.
4. Storage systems: RAID, DAS, SAN, NAS, CAS, etc.
5. Storage protocols: TCP/IP, SCSI, iSCSI, NFS/CIFS, etc.
6. File systems: disk file systems such as VFS and EXTx/NTFS/FAT32, network file systems such as NFS/CIFS, and distributed file systems such as Lustre/GFS/AFS.
7. Storage technologies: deduplication, SSD, HSM, virtualization, snapshots, replication, CDP, VTL, thin provisioning, etc.
8. Storage architecture: mastering the storage requirements of different industries, proposing storage solutions for actual needs, and carrying out storage system architecture, design, and implementation. [3]

Other related

Physical and virtual addressing modes
It makes little sense for the operating system itself to run in virtual memory; forcing the operating system to maintain page tables for itself would be an ugly solution. Most general-purpose processors therefore support both a physical and a virtual addressing mode. Physical addressing mode needs no page tables, and the processor performs no address translation in it. The Linux kernel runs directly in the physical address space.
The Alpha AXP processor has no special physical addressing mode. Instead, it divides the memory space into several areas and designates two of them as physically mapped addresses. The kernel address space, called the KSEG address space, occupies the region above address 0xfffffc0000000000. To execute kernel code located in KSEG or access data there, the code must run in kernel mode. The Linux kernel on Alpha starts from address 0xfffffc000031000.
Access control
Page table entries also contain access control information. Since the processor already uses the page table entry when translating a virtual address, the access control information can easily be checked at the same time, ensuring that the processor accesses memory only in the permitted ways.
There are many reasons to control access to areas of memory strictly. Some memory, such as the parts containing executable code, should clearly be read-only, and the operating system must not allow processes to write there. Conversely, pages containing data should be writable, but attempting to execute that data should fail. Most processors have at least two execution modes: kernel mode and user mode. Code running in user mode must not be allowed to execute kernel code or modify kernel data structures.
Figure 3.2 Alpha AXP Page Table Entry
The access control information in a page table entry is processor specific; Figure 3.2 shows the PTE (Page Table Entry) of the Alpha AXP processor. The bit fields have the following meanings:
V
Valid; if this bit is set, the PTE is valid.
FOE
"Fault on execute"; whenever an instruction in this page is executed, the processor reports a page fault and passes control to the operating system.
FOW
"Fault on write"; as above, except that the page fault occurs when this page is written.
FOR
"Fault on read"; as above, except that the page fault occurs when this page is read.
ASM
Address space match. Used by the operating system when it flushes selected entries from the translation buffer.
KRE
Code running in kernel mode can read this page.
URE
Code running in user mode can read this page.
GH
Granularity hint, used when mapping an entire block with a single translation buffer entry rather than many.
KWE
Code running in kernel mode can write this page.
UWE
Code running in user mode can write this page.
Page frame number
For a PTE with the V bit set, this field contains the physical page frame number for this PTE; for an invalid PTE, if this field is nonzero, it contains information about where the page is located in the swap file.
The following two bits are defined and used by Linux.
_PAGE_DIRTY
If set, the page needs to be written out to the swap file.
_PAGE_ACCESSED
Linux uses it to indicate that the page has been accessed.
Cache
A system implemented using only the theoretical model above would work, but not very efficiently. Both operating system and processor designers work hard to improve performance, and apart from making processors and memory faster, the best approach is to keep useful information and data in caches that speed up common operations. Linux uses a number of cache-related memory management policies.
Buffer Cache
The buffer cache contains data buffers used by the block device drivers.
These buffers are of fixed size (for example, 512 bytes) and each holds a block of information read from or written to a block device. A block device is one that can only be read and written in fixed-size blocks; all hard disks are block devices.
Using the device identifier and the required block number as indexes, you can quickly find data in the buffer cache. Block devices can only be accessed through buffer cache. If the data can be found in the buffer cache, it is not necessary to read from the physical block device (such as the hard disk), which can speed up access.
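The (device identifier, block number) lookup can be sketched as a small cache in front of a slow read routine. This is an illustrative model; `fake_read` stands in for a block device driver and is not a real API.

```python
# Minimal sketch of a buffer cache indexed by (device id, block number).

class BufferCache:
    def __init__(self, read_block):
        self.cache = {}
        self.read_block = read_block  # fallback to the block device driver
        self.hits = 0

    def get(self, dev, blkno):
        key = (dev, blkno)
        if key in self.cache:
            self.hits += 1            # found: no disk access needed
        else:
            self.cache[key] = self.read_block(dev, blkno)  # slow disk read
        return self.cache[key]

disk_reads = []
def fake_read(dev, blkno):
    # hypothetical stand-in for the device driver's read routine
    disk_reads.append((dev, blkno))
    return b"512 bytes of data"

cache = BufferCache(fake_read)
cache.get(0, 7)   # first access goes to the "disk"
cache.get(0, 7)   # second access is served from the cache
```

Two accesses to the same block cost only one disk read, which is exactly the speed-up the buffer cache provides.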
Page Cache
The page cache speeds up access to executable image files and data files on disk.
It buffers file contents one page at a time: pages are read from disk and cached in the page cache.
Swap Cache
Only modified pages are stored in the swap file.
As long as a page is not modified again after being written to the swap file, the next time it is swapped out of memory it need not be written again and can simply be discarded. In a system where swapping occurs frequently, the swap cache saves many unnecessary and time-consuming disk operations.
Hardware Caches
A common hardware cache is a cache of page table entries inside the processor. Rather than reading the page table directly each time, the processor caches page translations as it needs them. These caches are known as Translation Lookaside Buffers (TLBs) and hold cached copies of page table entries from one or more processes in the system.
When presented with a virtual address, the processor tries to find a matching TLB entry. If it finds one, it translates the virtual address directly into a physical address and operates on the data. If not, it asks the operating system for help: it signals a TLB miss using a processor-specific mechanism, and the operating system generates a new TLB entry for the address. When the exception is cleared, the processor retries the virtual address translation, which now succeeds because a matching entry exists in the TLB.
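The miss-and-retry cycle can be modelled in a few lines. The mappings and the software "refill" of the TLB are illustrative; on real hardware the refill may be done by the processor itself or by the operating system, depending on the architecture.

```python
# Sketch of a TLB in front of a page table. On a miss, the entry is
# filled in from the page table and the translation is retried.

page_table = {0: 7, 1: 4}   # hypothetical VPFN -> PFN mappings
tlb = {}

def lookup(vpfn):
    if vpfn in tlb:
        return tlb[vpfn], "hit"
    # TLB miss: walk the page table and cache the result
    pfn = page_table[vpfn]
    tlb[vpfn] = pfn
    return pfn, "miss"

first = lookup(1)    # miss: the entry is created
second = lookup(1)   # hit: answered directly from the TLB
```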
The drawback of caches is that Linux must spend more time and space maintaining them, and if the caches become corrupted, the system will crash.
Linux Page Table
Figure 3.3 Three level page table structure of Linux
Linux assumes that the processor has three levels of page tables, each accessed via the page frame number held in the entry of the level above. Figure 3.3 shows how a virtual address is split into a number of fields, each providing an offset into one level of page table. To convert a virtual address into a physical one, the processor takes each field in turn, uses it as an offset into the current page table, and reads the entry there; this repeats three times until the physical page frame number corresponding to the virtual address is found. The final field of the virtual address is then used to locate the data within the page.
In order to realize cross platform operation, Linux provides a series of conversion macros so that the core can access the page table of a specific process. In this way, the core does not need to know the structure of page table entries and their arrangement.
This strategy has been quite successful: Linux uses the same page table manipulation code whether running on the Alpha AXP, with its three-level page table structure, or on the Intel x86, with its two-level structure.
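The three-level walk can be sketched in Python. The 4 KB page size and 9-bit index per level are assumptions for illustration; the real field widths are processor specific, and the tables here are modelled as nested dictionaries rather than page frames.

```python
# Sketch of a three-level page table walk.

PAGE_SHIFT = 12       # assumed 4 KB pages
BITS_PER_LEVEL = 9    # assumed 512 entries per table
LEVELS = 3

def walk(vaddr, top_table):
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    # extract one index field per level, lowest level first
    fields = []
    v = vaddr >> PAGE_SHIFT
    for _ in range(LEVELS):
        fields.append(v & ((1 << BITS_PER_LEVEL) - 1))
        v >>= BITS_PER_LEVEL
    # walk from the top-level table down to the page frame number
    entry = top_table
    for idx in reversed(fields):
        entry = entry[idx]   # each entry names the next table (or the PFN)
    return entry * (1 << PAGE_SHIFT) + offset

# nested-dict tables: top[1] -> middle[2] -> bottom[3] = page frame 5
top = {1: {2: {3: 5}}}
vaddr = (1 << 30) | (2 << 21) | (3 << 12) | 0x10
```

The address is taken apart into three table indices (1, 2, 3) and an offset (0x10), and the walk ends at page frame 5.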
Page allocation and recycling
Requests for physical pages are very frequent. For example, when an executable image is loaded into memory, the operating system must allocate pages for it, and when execution finishes and the image is unloaded, those pages must be freed. Physical pages are also used to hold kernel data structures such as the page tables themselves. The data structures and mechanisms of the virtual memory subsystem responsible for page allocation and recycling are perhaps the most critical to its efficiency.
All physical pages in the system are described by the mem_map list of mem_map_t structures, which is initialized when the system starts. Each mem_map_t describes one physical page. The fields important to memory management include:
count
Record the number of users using this page. When this page is shared among multiple processes, its value is greater than 1.
age
This field describes the age of the page. It is used to select the appropriate page to discard or replace out of memory.
map_nr
Record the box number of the physical page described in mem_map_t.
The page allocation code uses the free_area array to find and free pages; the whole buffer management scheme is supported by this mechanism. The code is independent of the page size and of the processor's physical paging mechanism.
Each element of free_area contains information about blocks of pages: the first element describes single pages, the second blocks of 2 pages, the next blocks of 4 pages, and so on in increasing powers of 2. The list field is a queue head holding pointers to page data structures in the mem_map array; all free page blocks of that size are queued here. The map field is a pointer to a bitmap tracking allocated groups of pages of that size: bit N of the bitmap is set when the Nth block of pages is free.
The free_area figure shows this structure: element 0 has one free page (page frame number 0), and element 2 has two free blocks of four pages each, the first starting at page frame number 4 and the second at page frame number 56.
Page Assignment
Linux uses the Buddy algorithm to allocate and free blocks of pages effectively. The page allocation code allocates a block of one or more physical pages at a time, in blocks whose sizes are powers of 2: blocks of 1, 2, 4 pages, and so on. As long as enough free pages remain in the system to satisfy the request (nr_free_pages > min_free_page), the allocation code searches free_area for a free block of the requested size. Each element of free_area holds a bitmap reflecting the allocated and free blocks of its size; for example, the second element of the free_area array points to a memory map reflecting the allocation of blocks of four pages.
The allocation algorithm first searches for a block of the requested size, following the chain of free blocks queued on the list field of the free_area data structure. If no free block of the requested size exists, it looks for a block of twice that size, and continues doubling until either all of free_area has been searched or a suitable block is found. If the block found is larger than requested, it is split until its size matches; because block sizes are all powers of 2, splitting is simple. The leftover free halves are queued appropriately while the allocated block is returned to the caller.
Figure 3.4 Free_area Data Structure
In Figure 3.4, when a request of two page blocks is sent in the system, the first four page memory block (starting from page frame number 4) will be divided into two two page blocks. The first one, starting from page frame number 4, will be allocated and returned to the requester. The second one, starting from page frame number 6, will be added to element 1 of the free_area array, which represents the free block of two page sizes.
Page recycling
Breaking up large blocks of pages for allocation increases the number of small fragmented free blocks in the system. The page deallocation code should recombine these pages into larger blocks whenever possible; the fixed power-of-2 block sizes make recombining blocks straightforward.
Whenever a block of pages is freed, the code checks whether an adjacent, or buddy, block of the same size is also free. If it is, the two are combined into a new free block of twice the size. Each time two blocks are combined, the code checks whether the result can in turn be merged into something still larger. In the best case, the system's free blocks are as large as the maximum allocation allowed.
In Figure 3.4, if page frame number 1 is freed, it combines with the already free page frame number 0 to form a free block of two pages, which is queued on element 1 of free_area.
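The merging step can be sketched in Python as well. One common way to find a block's buddy is to XOR the start frame with the block size; the representation of free_area and the function name are assumptions carried over from the allocation sketch, not kernel code.

```python
# Minimal sketch of buddy freeing with coalescing.
# free_area[k] holds the start page-frame numbers of free blocks of 2**k pages.

def buddy_free(free_area, start, order):
    """Free a block of 2**order pages, merging with its buddy when possible."""
    while order < len(free_area) - 1:
        buddy = start ^ (1 << order)       # the buddy differs in bit `order`
        if buddy in free_area[order]:
            free_area[order].remove(buddy) # merge: remove buddy from its list
            start = min(start, buddy)      # merged block starts at the lower frame
            order += 1                     # try to merge again at the next size
        else:
            break
    free_area[order].append(start)

free_area = [[0], [], []]     # page frame 0 is a free 1-page block
buddy_free(free_area, 1, 0)   # freeing frame 1 merges it with frame 0
print(free_area)              # a single 2-page block starting at frame 0
```

This reproduces the Figure 3.4 example: freeing frame 1 finds its buddy (frame 0) free, merges them, and queues the two-page block on element 1.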
Memory mapping
When an image is executed, its contents must be brought into the process's virtual address space; the same is true of any shared libraries the executable image uses. The executable file is not actually loaded into physical memory, however; it is merely linked into the process's virtual memory, and its parts are brought from disk into memory as the running program references them. Linking an image into a process's virtual address space is known as memory mapping.
Figure 3.5 virtual memory region
The virtual memory of each process is represented by an mm_struct data structure. It contains information about the image currently being executed (for example, BASH) and pointers to a number of vm_area_struct data structures. Each vm_area_struct describes the start and end of an area of virtual memory, the process's access rights to that area, and a set of operations for it. These operations are the routines Linux uses to manipulate the area; one of them, called nopage, handles the case where a process accesses virtual memory that is not currently in physical memory (a page fault). The nopage operation is used when Linux demand-pages the pages of an executable image into memory.
When an executable image is mapped into a process's virtual address space, a set of corresponding vm_area_struct data structures is generated, each representing a part of the image: the executable code, initialized data (variables), uninitialized data, and so on. Linux supports a number of standard virtual memory operations, and as the vm_area_struct data structures are created, the correct set of operations is associated with each of them.
Demand paging
Once an executable image has been mapped into a process's virtual address space, it can start running. Because only a small part of the image is actually in physical memory, the process soon accesses an area of virtual memory that has not yet been brought in. When the process accesses a virtual address that has no valid page table entry, the processor reports a page fault.
The page fault describes the invalid virtual address and the type of memory access that failed. Linux must find the vm_area_struct that covers this address, and since the speed of this search determines how efficiently page faults are handled, all of a process's vm_area_struct structures are linked together in an AVL (Adelson-Velskii and Landis) tree. If no vm_area_struct covers the faulting virtual address, the process has accessed an illegal address: Linux sends it the SIGSEGV signal, and if the process has no handler for that signal, it is terminated.
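The lookup amounts to finding the region whose [start, end) range contains the faulting address. A Python sketch using a sorted list and binary search shows the idea; the kernel's actual structure is the AVL tree described above, and the address ranges here are invented for the example.

```python
import bisect

# Each (start, end) pair stands in for a vm_area_struct, kept sorted by start.
vmas = [(0x1000, 0x3000), (0x8000, 0x9000), (0x40000, 0x50000)]

def find_vma(vmas, addr):
    """Return the region containing addr, or None (an illegal access)."""
    # Binary search for the last region starting at or before addr.
    i = bisect.bisect_right(vmas, (addr, float('inf'))) - 1
    if i >= 0 and vmas[i][0] <= addr < vmas[i][1]:
        return vmas[i]
    return None   # no covering region: Linux would send SIGSEGV here

print(find_vma(vmas, 0x8100))   # falls inside the region starting at 0x8000
print(find_vma(vmas, 0x7000))   # gap between regions: None
```

A balanced tree and a sorted array both give O(log n) lookups; the kernel uses a tree because regions are also inserted and removed as the process maps and unmaps memory.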
If a covering region is found, Linux next checks the type of access that caused the fault. If the process accessed the memory in an illegal way, for example writing to a read-only area, the system raises a memory error signal.
If Linux decides that the page fault is legal, it must handle it.
First, Linux must distinguish pages held in the swap file from those that are part of an executable image on disk. In an Alpha AXP page table, a page table entry may have a non-zero PFN field but a cleared valid bit; in that case the PFN field gives the page's location in the swap file. How pages in the swap file are handled is discussed in the next chapter.
Not every vm_area_struct data structure has a full set of virtual memory operations, and some have no nopage operation at all. In that case Linux simply allocates a new physical page and creates a valid page table entry for it to resolve the access. If a nopage operation does exist for the memory area, Linux calls it.
The generic Linux nopage operation handles memory-mapped executable images, and it uses the page cache to bring the requested page into physical memory.
When the requested page has been brought into physical memory, the processor's page tables must be updated. Updating the entries may require hardware-specific actions, particularly if the processor uses a TLB. With the page fault handled, the process is restarted at the instruction that made the faulting virtual memory access.
Linux page cache
Figure 3.6 Linux Page Cache
Linux uses a page cache to speed up access to files on disk. Memory-mapped files are read one page at a time, and these pages are stored in the page cache. Figure 3.6 shows that the page cache consists of page_hash_table, an array of pointers to mem_map_t data structures.
Each file in Linux is identified by a VFS inode data structure (described in the File System chapter); each VFS inode is unique and describes exactly one file. The index into the page cache is derived from the file's VFS inode and the offset into the file.
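The lookup key can be sketched in Python. Everything here is illustrative: the class, the dictionary standing in for page_hash_table, and the page size constant are assumptions chosen to show how (inode, page-aligned offset) indexes the cache.

```python
# Sketch of page-cache lookup keyed by (inode number, page-aligned offset).
PAGE_SIZE = 4096

class PageCache:
    def __init__(self):
        self.pages = {}   # (inode, offset) -> page data; stands in for page_hash_table

    def read_page(self, inode, offset, read_from_disk):
        key = (inode, offset - offset % PAGE_SIZE)   # align to a page boundary
        if key not in self.pages:                    # miss: go to the file system
            self.pages[key] = read_from_disk(*key)
        return self.pages[key]

cache = PageCache()
reads = []
fake_disk = lambda ino, off: reads.append((ino, off)) or f"page@{off}"
cache.read_page(42, 5000, fake_disk)   # miss: reads the page at offset 4096
cache.read_page(42, 5500, fake_disk)   # hit: same page, no disk read
print(reads)                           # only one disk read occurred
```

Two offsets within the same page map to the same key, so the second access is served from the cache; this is exactly the benefit described in the next paragraph.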
When a page is read from a memory-mapped file, for example when it must be brought back into memory on demand, it is read through the page cache. If the page is in the cache, a pointer to its mem_map_t data structure is returned to the page fault handler; otherwise the page is read into memory from the file system that holds the image, and a physical page is allocated for it.
As the image is read and executed, the page cache keeps growing. When a page is no longer needed, that is, no longer used by any process, it is removed from the page cache.
Swapping out and discarding pages
When physical memory in the system runs low, the Linux memory management subsystem must free physical pages. This task falls to the kernel swap daemon (kswapd).
The kernel swap daemon is a special kind of kernel thread: a process with no virtual memory that runs in the physical address space in kernel mode. Its name is a little misleading, for it does rather more than merely swap pages out to the system's swap files; its goal is to keep enough free pages in the system for the memory management system to operate efficiently.
The daemon is started by the kernel init process at system boot and is woken periodically by the kernel swap timer.
When the timer expires, the swap daemon checks whether the number of free pages in the system is getting too low, using two variables, free_pages_high and free_pages_low, to decide whether to free some pages. So long as the number of free pages remains above free_pages_high, the daemon does nothing and sleeps until its timer next expires. In making this check, the daemon also counts the number of pages currently being written out to the swap file, keeping the count in nr_async_pages: it is incremented each time a page is queued for writing to the swap file, and decremented when the write completes. If the number of free pages has fallen below free_pages_high, or worse, free_pages_low, the kernel swap daemon tries three ways to reduce the number of physical pages in use:
reducing the size of the buffer and page caches,
swapping out System V shared memory pages,
swapping out and discarding pages.
If the number of free pages in the system has fallen below free_pages_low, the daemon tries to free 6 pages before it next runs; otherwise it tries to free only 3. The three methods above are tried in turn until enough pages have been freed. The daemon remembers which method it was using the last time it successfully freed pages, and on subsequent runs it tries that successful method first.
When enough pages have been freed, the swap daemon sleeps again until its timer next expires. If the reason it was freeing pages was that the number of free pages had fallen below free_pages_low, it sleeps for only half its usual time; once the number of free pages rises above free_pages_low, it goes back to sleeping for the longer period.
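The decision described above can be summarized in a few lines of Python. The threshold names follow the text; the numeric values and the function name are arbitrary assumptions for the sketch.

```python
# Sketch of the kswapd wake-up decision: how hard to work on each timer tick.
FREE_PAGES_HIGH = 100   # illustrative values; the kernel tunes these
FREE_PAGES_LOW = 50

def pages_to_free(nr_free_pages):
    """How many pages kswapd tries to release this run (0 = stay asleep)."""
    if nr_free_pages >= FREE_PAGES_HIGH:
        return 0          # plenty of memory: sleep until the next timer tick
    if nr_free_pages < FREE_PAGES_LOW:
        return 6          # memory critically short: free 6 pages
    return 3              # mildly short: free only 3 pages

print(pages_to_free(120))   # 0
print(pages_to_free(70))    # 3
print(pages_to_free(30))    # 6
```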
Reducing the size of the page cache and buffer cache
The pages held in the page cache and buffer cache are good candidates for release into the free_area array. The page cache, which holds pages of memory-mapped files, may contain pages that are no longer needed and are simply wasting the system's memory; likewise the buffer cache, which holds buffers of data read from and written to physical devices, may contain unneeded buffers. When the physical pages in the system start to run out, discarding pages from these caches is relatively easy (unlike swapping, it requires no writes to physical devices). Apart from slowing access to physical devices and memory-mapped files, discarding such pages has few side effects, and if the policy is fair, all processes suffer equally.
Every time the kernel swap daemon runs, it tries to shrink these caches.
It examines a block of pages in the mem_map page array to see whether any can be discarded from physical memory. The block of pages examined is larger when the daemon is swapping heavily, that is, when the number of free pages in the system has fallen to a dangerous level. The blocks are examined in a rotating manner: a different block of pages is checked each time the daemon tries to shrink the caches. This is the well-known clock algorithm, sweeping over the whole mem_map page array a few pages at a time.
The daemon checks each page it examines to see whether it is cached in the page cache or the buffer cache. Note that shared pages are not considered for discarding at this point, and a page cannot be in both caches at once. If the page is in neither cache, the next page in the mem_map array is examined.
Caching buffers in the buffer cache (or the buffers within a page) makes buffer allocation and recycling more efficient. The cache-shrinking code tries to free the buffers contained in the page being examined.
If all the buffers contained in the page are freed, the page itself is freed too. If the examined page is in the Linux page cache, it is removed from the page cache and freed.
If enough pages have been freed, the kernel swap daemon waits until it is next woken. None of the freed pages was part of any process's virtual memory, so no page tables need updating. If not enough cached pages could be discarded, the daemon next tries to swap out some shared pages.
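The rotating scan can be sketched in Python. The page list, the "in a cache" predicate, and the function name are invented for the example; the point is only that each run resumes where the previous one stopped, so different pages are examined each time.

```python
# Sketch of a clock-style scan over mem_map, freeing cached pages.
def shrink_caches(mem_map, clock, in_cache, to_free):
    """Scan from `clock`, freeing cached pages; return (freed, new clock)."""
    freed = []
    for _ in range(len(mem_map)):          # at most one full sweep
        page = mem_map[clock]
        clock = (clock + 1) % len(mem_map) # the "clock hand" wraps around
        if in_cache(page):
            freed.append(page)
            if len(freed) >= to_free:
                break
    return freed, clock

mem_map = list(range(8))
freed, clock = shrink_caches(mem_map, 0, lambda p: p % 2 == 0, 2)
print(freed, clock)   # pages 0 and 2 freed; the hand rests at page 3
```

Because the returned clock position is passed back in on the next run, successive invocations examine different parts of the array, which is what keeps the policy fair.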
Swapping out System V shared memory pages
System V shared memory is a mechanism that lets processes communicate by sharing areas of virtual memory. The vm_area_struct structures describing each shared virtual memory area are linked together through their vm_next_shared and vm_prev_shared pointers. Each shmid_ds data structure also holds a list of page table entries, each describing the mapping between a physical page and a shared virtual page.
The kernel swap daemon also uses a clock algorithm when swapping out System V shared memory pages.
Each time it runs, it must remember which page of which shared virtual memory area it last swapped out. Two indices help it do this: an index into the set of shmid_ds data structures, and an index into the list of page table entries for a System V shared memory area. This ensures that the System V shared memory areas are chosen fairly.
Since the physical page frame number of a given System V shared virtual page is held in the page tables of every process sharing that area of virtual memory, the kernel swap daemon must modify all of those page tables to show that the page is no longer in memory but in a swap file. For each shared page being swapped out, it finds the corresponding page table entry in each sharing process's page tables (via the vm_area_struct data structures). If a process's page table entry for this System V shared memory page is valid, the daemon makes it invalid, marks it as swapped out, and decrements the page's count of users by one. The format of a swapped-out System V shared page table entry contains an index into the set of shmid_ds data structures and an index into the shared memory area's list of page table entries.
If the page tables of all the sharing processes have been modified and the page's count has reached 0, the shared page can be written out to the swap file. The page table entry in the list pointed at by this System V shared memory area's shmid_ds data structure is then replaced by a swapped-out page table entry. A swapped-out page table entry is invalid, but it contains an index into the set of open swap files and the offset of the swapped-out page within its file. This information is used when the page is later brought back into physical memory.
Swapping out and discarding pages
The swap daemon looks at each process in the system in turn to decide whether it is a good candidate for swapping.
Good candidates are processes that can be swapped (some cannot) and that have one or more pages which can be swapped or discarded from memory. Only pages holding data that cannot be retrieved any other way are actually written from physical memory to the system's swap files.
Much of an executable image's contents can be read back from the image file at any time. For example, the executable instructions of an image can never be modified by the image itself, so they are never written to the swap file; such pages can simply be discarded. When the process references them again, they are just read back into memory from the executable image file.
Once the process to be swapped has been chosen, the swap daemon looks through its whole virtual memory area for regions that are neither shared nor locked.
Linux does not swap out all of the swappable pages of the chosen process; it removes only a small number of pages.
Pages cannot be swapped or discarded if they are locked in memory.
The Linux swap algorithm uses page aging. Each page has a counter (held in the mem_map_t data structure) that tells the kernel swap daemon whether the page is worth swapping out. Pages age when they are unused and rejuvenate when they are accessed; the swap daemon swaps out only old pages. The default actions are: when a page is first allocated it is given an age of 3, each reference increases its age by 3, up to a maximum of 20, and each time the kernel swap daemon runs it ages pages, decrementing their age by 1. These defaults can be changed, and for that reason they are stored in the swap_control data structure.
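The aging rule above is simple enough to state directly in Python. The constants come from the default values in the text; the function names are invented for the sketch.

```python
# Sketch of the page-aging rule: +3 per reference (capped at 20),
# -1 per kswapd pass; a page with age 0 is a swap candidate.
INITIAL_AGE, TOUCH_BONUS, MAX_AGE = 3, 3, 20

def touch(age):
    """The page was referenced: rejuvenate it."""
    return min(age + TOUCH_BONUS, MAX_AGE)

def kswapd_pass(age):
    """One run of the swap daemon: the page ages."""
    return max(age - 1, 0)

age = INITIAL_AGE            # a freshly allocated page starts at 3
age = touch(age)             # referenced once: age 6
for _ in range(6):
    age = kswapd_pass(age)   # six idle daemon runs later...
print(age, age == 0)         # age 0: the page can now be swapped out
```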
If a page is old (age = 0), the swap daemon processes it further. Dirty pages can be swapped out; Linux describes this property of a page with a hardware-dependent bit in its PTE (see Figure 3.2). Not all dirty pages, however, need be written to the swap file. Each virtual memory region of a process may have its own swap operation (pointed at by the vm_ops pointer in its vm_area_struct), and if so, that method is used. Otherwise, the swap daemon allocates a page in a swap file and writes the page out to that device.
The page's page table entry is then marked invalid, but it records where the page now lives: an offset giving the page's position within the swap file, and an indication of which swap file was used. Whichever swap method is used, the vacated physical page is marked as free and placed in free_area. Clean (not dirty) pages can simply be discarded and put back into free_area for reuse.
If enough of the swappable process's pages have been swapped out or discarded, the swap daemon sleeps again; the next time it wakes, it considers the next process in the system. In this way the swap daemon nibbles away at each process's swappable or discardable physical pages until the system is back in balance. This is much fairer than swapping out whole processes.
The Swap Cache
When swapping pages out to swap files, Linux avoids writing a page unless it has to. A page may be swapped out of memory and then, when a process accesses it again, brought back in. So long as the page in memory has not been written to since, the copy in the swap file remains valid.
Linux uses the swap cache to track such pages. The swap cache is a list of page table entries, one per physical page in the system. An entry for a swapped-out page describes which swap file holds the page and its location within that file. A non-zero swap cache entry means the page's copy in the swap file has not been invalidated by modification; if the page is subsequently modified (written to), its entry is removed from the swap cache.
When Linux needs to swap a physical page out to a swap file, it consults the swap cache. If there is a valid entry for the page, it need not write the page out: the page in memory has not been modified since it was last read in from the swap file.
The entries in the swap cache are page table entries for swapped-out pages. Although they are marked invalid, they tell Linux which swap file the page is in and where in that file it can be found.
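The write-avoidance idea can be sketched in Python. The dictionary standing in for the swap cache, the function names, and the callback parameters are all assumptions made for the example, not kernel interfaces.

```python
# Sketch of the swap-cache check: if an unmodified copy of a page already
# exists in a swap file, swapping the page out needs no disk write.
swap_cache = {}   # page frame -> (swap file, offset); present = disk copy valid

def swap_out(frame, alloc_swap_slot, write_page):
    if frame in swap_cache:
        return swap_cache[frame]       # disk copy still valid: skip the write
    slot = alloc_swap_slot()
    write_page(frame, slot)            # must actually write the page out
    swap_cache[frame] = slot
    return slot

def page_written(frame):
    swap_cache.pop(frame, None)        # page modified: disk copy is now stale

writes = []
swap_out(7, lambda: ("swap0", 0), lambda f, s: writes.append(f))
swap_out(7, None, None)                # second swap-out: cache hit, no write
print(writes)                          # the page was written only once
```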
Swapping pages in
The dirty pages saved in the swap files may be needed again, for example when an application writes to an area of virtual memory whose contents have been swapped out. Accessing a page of virtual memory that is not in physical memory causes a page fault: the processor cannot translate the virtual address into a physical one, because the page table entry for the swapped-out page is marked invalid. The processor therefore passes control to the operating system, telling it the faulting virtual address and the reason for the fault. The format of this information, and how the processor passes control to the operating system, are hardware dependent.
The processor-specific page fault handling code must locate the vm_area_struct describing the relevant virtual memory area. It does this by searching the process's vm_area_struct structures until it finds the one containing the faulting virtual address. This code is time critical, so a process's vm_area_struct structures are arranged to make the search as fast as possible.
Having carried out the processor-specific actions and found a valid virtual memory area containing the faulting virtual address, the rest of the page fault handling proceeds as described earlier.
The generic page fault handling code looks up the page table entry for the faulting virtual address. If the entry describes a swapped-out page, Linux must bring the page back into physical memory. The format of a swapped-out page table entry is processor dependent, but every processor marks such pages invalid and stores in the entry the information needed to locate the page in the swap file. Linux uses this information to bring the page back into physical memory.
At this point Linux knows the faulting virtual address and holds a page table entry giving the page's location in a swap file. The vm_area_struct data structure may include a pointer to a routine for swapping pages of this virtual memory area back into physical memory: its swapin operation. If a swapin operation exists for this area, Linux uses it; this is how swapped-out System V shared memory pages are handled, since they differ slightly from ordinary swapped-out pages. If there is no swapin operation, Linux assumes this is an ordinary page needing no special handling.
A free physical page is allocated and the swapped-out page is read into it; the information describing where the page lives in the swap file is taken from the invalid page table entry.
If the access that caused the page fault was not a write, the page is left in the swap cache and its page table entry is not marked writable. If the page is later written to, another page fault occurs; at that point the page is marked dirty and its entry is removed from the swap cache. If the page is never written and must be swapped out again, Linux can avoid writing it, because the page is already in the swap file.
If the access that caused the page to be read in from the swap file was a write, the page is removed from the swap cache and its page table entry is marked both dirty and writable.
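The read-versus-write distinction of the last two paragraphs can be summarized in a Python sketch. The dictionary for the swap cache, the dictionary standing in for a page table entry, and the function name are assumptions for the example.

```python
# Sketch of swapping a page back in: a read fault leaves the page in the
# swap cache and maps it read-only; a write fault marks it dirty and
# drops the swap-cache entry, since the disk copy no longer matches.
def swap_in(frame, is_write, swap_cache, slot):
    pte = {"frame": frame, "valid": True}
    if is_write:
        pte["writable"], pte["dirty"] = True, True
        swap_cache.pop(frame, None)    # disk copy is stale: evict it
    else:
        pte["writable"], pte["dirty"] = False, False
        swap_cache[frame] = slot       # disk copy still valid: remember it
    return pte

cache = {}
pte = swap_in(3, False, cache, ("swap0", 9))
print(pte["writable"], 3 in cache)   # read fault: read-only, still cached
pte = swap_in(3, True, cache, ("swap0", 9))
print(pte["dirty"], 3 in cache)      # write fault: dirty, evicted from cache
```

Mapping the page read-only after a read fault is what triggers the second fault on a later write, giving Linux the chance to invalidate the swap-cache entry at exactly the right moment.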
Mobile storage management system
Features
► Certified by the State Administration of Security, safe and reliable;
► Seamless integration with the encryption system, doubling the protection;
► The first product in China to turn an ordinary USB flash drive into an encrypted one, eliminating the risks that come with the convenience of USB flash drives;
► Two-factor authentication;
► Purpose-built encrypted removable storage integrated seamlessly with the system, for smoother management;
► A rich set of functions, meeting a wide range of confidentiality requirements;
► Complete auditing, keeping track of what USB flash drive holders do.
Functions
● Centralized registration and authorization: USB flash drives can be identified, and media tracked, through their registration information;
● Host identity authentication: every computer on which the client is installed must be assigned real-name information by an administrator before use;
● Encryption and locking: an encrypted, locked USB flash drive requires user authentication;
● Access control: flexible control over the registration policy and information of removable storage media, including which computers are permitted to use them;
● Out-of-office copying: data copied onto a USB flash drive can be exchanged with external computers, or copied only to designated targets;
● User auditing: the mobile storage management system provides detailed audit records and reports.