A memory-mapped file is a mapping from a file to a region of memory. Win32 provides a function, CreateFileMapping(), that lets an application map a file into a process. Memory-mapped files are similar to virtual memory: a memory-mapped file can be used to reserve a region of address space and to commit physical storage to that region. The difference is that the physical storage for a memory-mapped file comes from a file that already exists on disk, and the file must be mapped before it can be operated on. When a memory-mapped file is used to process a file stored on disk, the application no longer needs to perform explicit I/O operations on it, which makes memory-mapped files especially useful for processing files that hold large amounts of data.
File operations are among the most basic functions of an application. Both the Win32 API and MFC provide functions and classes that support file processing; the most commonly used are CreateFile(), WriteFile(), and ReadFile() in the Win32 API and the CFile class in MFC. Generally speaking, these are sufficient for most situations. For special applications that must store tens of GB, hundreds of GB, or even several TB of data, however, the usual file-processing methods are clearly impractical. Operations on such large files are generally handled with memory-mapped files.[1]
A memory-mapped file is a mapping from a file to the address space of a process. In Win32, each process has its own address space, and one process cannot easily access data in the address space of another, so data cannot be shared as freely as it could in 16-bit Windows. Win32 does allow multiple processes running on the same computer to share data through memory-mapped files. In fact, other techniques for sharing and transmitting data, such as SendMessage or PostMessage, use memory-mapped files internally.
Figure 1
The API that Windows provides for memory-mapped files is shown in Figure 1.
Data sharing
File data sharing
This kind of data sharing lets two or more processes map views of the same file mapping object, which means they share the same pages of physical storage. As a result, when one process writes data to a view of the memory-mapped file, the other processes see the change in their own views immediately. Note that the processes must use the same name for the file mapping object.
Access method
In this way, the data in the file can be accessed with ordinary memory read and write instructions rather than with I/O system functions such as ReadFile() and WriteFile(), which improves file access speed.
Scope of application
This capability is best suited to applications that read a file and parse it, such as syntax-coloring editors and compilers that parse their input files.
Once the file is mapped, it can be read and analyzed in place, so the application can manipulate the file with memory operations instead of reading, writing, and moving the file pointer back and forth.
Applications
Some operations, such as "ungetting" a character that has already been read, used to be quite complicated, and the programmer had to deal with buffer flushing. With a mapped file this becomes much simpler: all the application has to do is decrement the pointer by one.
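The pointer-decrement idea can be illustrated with a small sketch. ByteCursor is a hypothetical class introduced here for illustration only; it works on any contiguous byte range, and a mapped view of a file is assumed to be one such range.

```cpp
#include <cstddef>

// Hypothetical cursor over a contiguous byte range, for example a mapped view.
class ByteCursor {
public:
    ByteCursor(const char* begin, std::size_t size)
        : begin_(begin), pos_(begin), end_(begin + size) {}

    // Returns the next byte as a non-negative int, or -1 at the end of the range.
    int Get() {
        return pos_ < end_ ? static_cast<unsigned char>(*pos_++) : -1;
    }

    // "Un-reads" the most recent byte by stepping the pointer back by one.
    // No buffer has to be refilled or flushed.
    void Unget() {
        if (pos_ > begin_) --pos_;
    }

private:
    const char* begin_;
    const char* pos_;
    const char* end_;
};
```

A parser built this way can look ahead one character and push it back with Unget() without involving the file API at all.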
Another important use of mapped files is to support persistent, named shared memory. To share memory between two applications, one application creates a file and maps it; the other application can then use the file as shared memory by opening and mapping it.
Using memory-mapped files to process large files in VC++
Memory file
Memory-mapped files are similar to virtual memory. With a memory-mapped file you can reserve a region of address space and commit physical storage to it; the difference is that the physical storage comes from a file that already exists on disk rather than from the system page file. The file must be mapped before it can be operated on, after which it behaves as if the entire file had been loaded from disk into memory[2]. Consequently, when a memory-mapped file is used to process a file stored on disk, the application no longer needs to perform I/O operations on the file, and it no longer needs to request and allocate a buffer for it: all file caching is managed directly by the system. Because the steps of loading file data into memory, writing data back from memory to the file, and freeing the memory block are eliminated, memory-mapped files are particularly valuable for processing files that hold large amounts of data.
In addition, real-world systems often need to share data among several processes. When the amount of data is small, many approaches will do; when the amount of shared data is large, memory-mapped files are needed. In fact, memory-mapped files are the most effective way to share data among multiple processes on the same machine.
A memory-mapped file is not simple file I/O; it relies on one of the core techniques of Windows programming, memory management. A deeper understanding of memory-mapped files therefore requires a clear understanding of the memory management mechanism of the Windows operating system. The general procedure for using a memory-mapped file is as follows.
First, use CreateFile() to create or open a file kernel object; this object identifies the file on disk that will be used as the memory-mapped file. Calling CreateFile() tells the operating system where the backing file is located in physical storage, but it specifies only the path of the file, not the length of the mapping. To specify how much physical storage the file mapping object needs, next call CreateFileMapping() to create a file mapping kernel object, which tells the system the size of the file and how it will be accessed. Once the file mapping object has been created, a region of address space must be reserved for the file data, and the file data must be committed as the physical storage mapped to that region. MapViewOfFile() does this: under the system's management it maps all or part of the file mapping object into the process's address space. From this point on, working with the memory-mapped file is essentially the same as working with file data that has been loaded into memory in the usual way. When you have finished with the memory-mapped file, clean up and release the resources that were used. This part is relatively simple: call UnmapViewOfFile() to unmap the file data from the process's address space, and call CloseHandle() to close the file mapping object and the file object created earlier.
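The sequence of calls described above can be sketched roughly as follows. This is a minimal read-only illustration, not the article's own code: CountLines and CountLinesInFile are hypothetical names, error handling is reduced to a boolean result, and the whole file is mapped as a single view, which assumes it fits in the process's address space. The `#ifdef _WIN32` guard is there only so that the portable part compiles on other platforms.

```cpp
#include <cstddef>

// Portable part: operates on any contiguous byte range, mapped or not.
std::size_t CountLines(const char* data, std::size_t size) {
    std::size_t lines = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (data[i] == '\n') ++lines;
    }
    return lines;
}

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: maps a whole file read-only and counts its lines.
// Returns false if any step fails; an empty file also fails because a
// zero-length file cannot be mapped.
bool CountLinesInFile(const wchar_t* path, std::size_t* result) {
    // Step 1: file kernel object.
    HANDLE hFile = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (hFile == INVALID_HANDLE_VALUE) return false;

    bool ok = false;
    LARGE_INTEGER fileSize;
    if (GetFileSizeEx(hFile, &fileSize) && fileSize.QuadPart > 0) {
        // Step 2: file mapping kernel object. A maximum size of 0/0 means
        // "use the current size of the file".
        HANDLE hMapping = CreateFileMappingW(hFile, nullptr, PAGE_READONLY,
                                             0, 0, nullptr);
        if (hMapping != nullptr) {  // failure is NULL, not INVALID_HANDLE_VALUE
            // Step 3: map a view. Offset 0/0 with length 0 maps everything.
            const void* view = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
            if (view != nullptr) {
                // Step 4: use the data like ordinary memory.
                *result = CountLines(static_cast<const char*>(view),
                                     static_cast<std::size_t>(fileSize.QuadPart));
                ok = true;
                // Step 5: clean up in reverse order.
                UnmapViewOfFile(view);
            }
            CloseHandle(hMapping);
        }
    }
    CloseHandle(hFile);
    return ok;
}
#endif
```

Keeping the processing function separate from the mapping code reflects the point of the technique: once the view exists, the data is handled with plain pointer arithmetic.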
Related functions
The main API functions used with memory-mapped files, mentioned above, are described below.
CreateFile() is used to create and open files in ordinary file operations as well. When working with memory-mapped files, it creates or opens a file kernel object and returns its handle. When calling it, set the dwDesiredAccess and dwShareMode parameters according to whether the data will be read, written, or both, and how the file is to be shared. Incorrect parameter settings cause the corresponding later operations to fail.
CreateFileMapping() creates a file mapping kernel object. Its hFile parameter specifies the handle of the file to be mapped into the process's address space (the handle returned by CreateFile()). Because the physical storage of a memory-mapped file is a file on disk rather than memory allocated from the system page file, the system does not reserve a region of address space for it on its own initiative, nor does it automatically map the file's storage to such a region. The flProtect parameter tells the system which protection attributes to apply to the pages. The protection attributes PAGE_READONLY, PAGE_READWRITE, and PAGE_WRITECOPY mean, respectively, that once the file mapping object is mapped the file data can be read, can be read and written, or can be read and written copy-on-write. PAGE_READONLY requires that the file was opened by CreateFile() with GENERIC_READ; PAGE_READWRITE requires GENERIC_READ | GENERIC_WRITE; PAGE_WRITECOPY requires at least GENERIC_READ (GENERIC_READ | GENERIC_WRITE is also accepted). The DWORD parameters dwMaximumSizeHigh and dwMaximumSizeLow are also important: together they specify the maximum size of the file mapping in bytes. Because the two parameters form a 64-bit value, the maximum supported length is 16 EB, which is enough for almost any large-file processing task.
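As an illustration of how a 64-bit size is divided between the two DWORD parameters, the following sketch splits a byte count into its high and low 32-bit halves. SizeParts, SplitSize, and JoinSize are hypothetical names used only for this example.

```cpp
#include <cstdint>

// Hypothetical holder for the two 32-bit halves of a 64-bit size.
struct SizeParts {
    std::uint32_t high;  // value for dwMaximumSizeHigh
    std::uint32_t low;   // value for dwMaximumSizeLow
};

// Splits a 64-bit byte count into the high and low 32-bit halves.
SizeParts SplitSize(std::uint64_t bytes) {
    return { static_cast<std::uint32_t>(bytes >> 32),
             static_cast<std::uint32_t>(bytes & 0xFFFFFFFFull) };
}

// Recombines the halves, mainly useful for checking the split.
std::uint64_t JoinSize(SizeParts parts) {
    return (static_cast<std::uint64_t>(parts.high) << 32) | parts.low;
}
```

For a 30 GB mapping, SplitSize(30ull << 30) yields high = 7 and low = 0x80000000; these values would then be passed as the fourth and fifth arguments of CreateFileMapping().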
MapViewOfFile() maps file data into the address space of the process. Its hFileMappingObject parameter is the handle of the file mapping object returned by CreateFileMapping(). The dwDesiredAccess parameter specifies once more how the file data will be accessed, and it must be compatible with the protection attribute passed to CreateFileMapping(). Although setting the protection a second time may seem redundant, it gives the application finer control over the protection of the data. MapViewOfFile() can map all of a file or only part of it, so the call specifies both the offset at which the view starts and the length to map. The file offset is a 64-bit value formed from the DWORD parameters dwFileOffsetHigh and dwFileOffsetLow, and it must be an integer multiple of the operating system's allocation granularity. On Windows the allocation granularity is normally 64 KB. It can also be queried at run time by calling GetSystemInfo(), which reports it in the dwAllocationGranularity member of the SYSTEM_INFO structure.
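A short sketch of the granularity query and of the offset arithmetic it implies is given here. GetSystemInfo() and dwAllocationGranularity are the real API; QueryAllocationGranularity, ViewRange, and AlignForView are hypothetical names chosen for illustration. The Windows-only part is guarded so that the arithmetic can be compiled anywhere.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>

// Asks the running system for its allocation granularity.
std::uint32_t QueryAllocationGranularity() {
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwAllocationGranularity;
}
#endif

// Hypothetical description of a view that covers a wanted byte range.
struct ViewRange {
    std::uint64_t viewOffset;  // aligned offset for dwFileOffsetHigh/Low
    std::uint64_t bytesToMap;  // length for dwNumberOfBytesToMap
    std::uint64_t delta;       // add to the view base to reach wantedOffset
};

// Rounds the wanted offset down to a multiple of the granularity and
// lengthens the view so that it still covers the wanted bytes.
ViewRange AlignForView(std::uint64_t wantedOffset,
                       std::uint64_t wantedLength,
                       std::uint32_t granularity) {
    const std::uint64_t delta = wantedOffset % granularity;
    return { wantedOffset - delta, wantedLength + delta, delta };
}
```

With a granularity of 65536, a request for 10 bytes at offset 100000 becomes a view that starts at offset 65536 and is 34474 bytes long, and the wanted data begins 34464 bytes into that view.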
The dwNumberOfBytesToMap parameter specifies how many bytes of the file to map. Note that on Windows 9x, if MapViewOfFile() cannot find a region large enough to hold the entire file mapping object, it returns NULL. On Windows 2000, by contrast, MapViewOfFile() only needs to find a region large enough for the requested view, regardless of the size of the whole file mapping object.
After you have finished processing the file data mapped into the process's address space, release the view with UnmapViewOfFile(). Its prototype is:
BOOL UnmapViewOfFile(LPCVOID lpBaseAddress);
Its only parameter, lpBaseAddress, specifies the base address of the region to release, and it must be the value returned by MapViewOfFile(). Every call to MapViewOfFile() must be matched by a call to UnmapViewOfFile(); otherwise the reserved region is not released until the process terminates. In addition, the file kernel object and the file mapping kernel object created earlier by CreateFile() and CreateFileMapping() must be released with CloseHandle() before the process terminates, or resources will leak.
Besides these essential API functions, other auxiliary functions may be needed depending on the situation. For example, to improve speed the system caches the file's data pages and does not update the file on disk immediately while the mapped view is being modified. If that is a problem, consider calling FlushViewOfFile(), which forces the system to write some or all of the modified data back to the file on disk, so that updates are saved to disk promptly.
Application examples
The use of memory-mapped files is illustrated below with a concrete example. The program receives data from a port and stores it on disk in real time. Because the volume of data is large (tens of GB), memory-mapped files are used to handle it. The following is part of the main code of the worker thread MainProc, which starts when the program runs. When data arrives on the port, the event hEvent[0] is signaled; once WaitForMultipleObjects() returns for that event, the thread saves the received data to disk. If reception is terminated, the event hEvent[1] is signaled, and the handler for that event releases resources and closes the file. The thread function is implemented as follows:
// Create a file kernel object; its handle is saved in hFile
// Set the size, offset, and other parameters.
// Make the file as large as could be needed. If the data written exceeds the configured size,
// mapping the file again fails and GetLastError() returns 183 (ERROR_ALREADY_EXISTS).
// Unmap the file data from the process's address space
UnmapViewOfFile(pbFile);
// Close the file mapping object
CloseHandle(hFileMapping);
break;
}
}…
If the handler for the termination event does nothing more than call UnmapViewOfFile() and CloseHandle(), the file will not end up with its actual size. For example, if the memory-mapped file was opened at 30 GB and only 14 GB of data was received, the saved file is still 30 GB long after the code above has run. In other words, once processing is complete, the file must be reopened as a memory-mapped file and restored to its actual size. The following is the main code to achieve this requirement: