Memory-Mapped File

Mapping from a file to a block of memory
A memory-mapped file is a mapping from a file to a block of memory. Win32 provides a function, CreateFileMapping(), that lets an application map a file into the address space of a process. Memory-mapped files resemble virtual memory: they reserve a region of address space and commit physical storage to it. The difference is that the physical storage comes from a file that already exists on disk, and the file must be mapped before it can be operated on. Once a file on disk has been mapped, the application no longer needs to issue explicit file I/O calls to access it, which makes memory-mapped files especially useful for processing very large files.
Chinese name
Memory Map File
Foreign name
Memory-Mapped Files

Basic overview

File handling is one of the most basic functions of an application. Both the Win32 API and MFC provide functions and classes for it; the most commonly used are CreateFile(), WriteFile(), and ReadFile() in the Win32 API and the CFile class in MFC. These are sufficient for most purposes, but for special applications that must store tens of GB, hundreds of GB, or even several TB of data, the usual file-handling methods are impractical. Files of that size are generally processed as memory-mapped files. [1]
A memory-mapped file is a mapping from a file to the address space of a process. In Win32, each process has its own address space, and one process cannot easily access data in the address space of another, so processes cannot share data as freely as they could under 16-bit Windows. Win32 does, however, allow multiple processes running on the same computer to share data through memory-mapped files. In fact, other mechanisms for sharing and transmitting data, such as SendMessage or PostMessage, use memory-mapped files internally.
Figure 1 (not reproduced here) lists the API functions that Windows provides for memory-mapped files.

Data sharing


File data sharing

Data is shared by having two or more processes map views of the same file mapping object, which means that they share the same pages of physical storage. When one process writes data to its view of the memory-mapped file, the other processes immediately see the change in their own views. Note that all processes must use the same name for the file mapping object.
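The role of the shared name can be sketched as follows. This is a minimal, Windows-only illustration rather than a definitive implementation: the function name ShareThroughNamedMapping and the object name Local\ExampleSharedRegion are hypothetical, both "processes" are played by two handles inside one process, and the mapping is backed by the system page file (INVALID_HANDLE_VALUE) instead of a disk file purely to keep the example self-contained. The preprocessor guard only keeps the snippet inert on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <cstring>

// Hypothetical object name; both sides must use exactly the same string.
static const wchar_t kMappingName[] = L"Local\\ExampleSharedRegion";

// Writes a message through one view and reads it back through a second view
// obtained by name. Returns true if the second view shows the written bytes.
bool ShareThroughNamedMapping() {
    const DWORD kBytes = 4096;
    const char kMessage[] = "hello";

    // "Process A": create the named mapping. INVALID_HANDLE_VALUE backs it
    // with the system page file rather than a disk file (an assumption made
    // here only so that no file is needed).
    HANDLE hCreator = CreateFileMappingW(INVALID_HANDLE_VALUE, NULL,
                                         PAGE_READWRITE, 0, kBytes, kMappingName);
    if (hCreator == NULL) return false;

    // "Process B": open the same object by name.
    HANDLE hOpener = OpenFileMappingW(FILE_MAP_READ, FALSE, kMappingName);
    if (hOpener == NULL) { CloseHandle(hCreator); return false; }

    // Each side maps its own view of the shared pages.
    char* writer = static_cast<char*>(
        MapViewOfFile(hCreator, FILE_MAP_WRITE, 0, 0, kBytes));
    const char* reader = static_cast<const char*>(
        MapViewOfFile(hOpener, FILE_MAP_READ, 0, 0, kBytes));

    bool seen = false;
    if (writer != NULL && reader != NULL) {
        std::memcpy(writer, kMessage, sizeof(kMessage));               // write via view A
        seen = std::memcmp(reader, kMessage, sizeof(kMessage)) == 0;   // read via view B
    }

    if (reader != NULL) UnmapViewOfFile(reader);
    if (writer != NULL) UnmapViewOfFile(writer);
    CloseHandle(hOpener);
    CloseHandle(hCreator);
    return seen;
}
#endif
```

In a real program the two halves would run in separate processes, and some form of synchronization (for example an event or mutex) would normally be needed so that the reader knows when the data is complete.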

Access method

Data in a mapped file is accessed with ordinary memory read and write instructions rather than with I/O system functions such as ReadFile and WriteFile, which speeds up file access.

Scope of application


Scope of application

This feature is best suited to applications that read a file and parse it, such as syntax-highlighting editors and compilers that parse their input files.
Once the file is mapped, it can be read and analyzed in place: the application manipulates the file with memory operations and never has to read, write, or move a file pointer back and forth within the file.

Applications

Some operations, such as "un-reading" a character (pushing a character that has already been read back onto the input), used to be quite complex because the program had to deal with buffer flushing. With a mapped file this becomes much simpler: all the application has to do is decrement a pointer by one.
Another important use of mapped files is to support persistent, named shared memory. To share memory between two applications, one application creates a file and maps it; the other application then opens and maps the same file and uses it as shared memory.

Memory-mapped files

Memory-mapped files are similar to virtual memory. A memory-mapped file reserves a region of address space and commits physical storage to that region; the difference is that the physical storage comes from a file that already exists on disk rather than from the system page file. The file must be mapped before it can be operated on, after which it appears as if the entire file had been loaded from disk into memory. [2] When memory-mapped files are used to process files stored on disk, the application no longer has to perform explicit I/O operations on the file, and it no longer has to request and allocate a buffer for the file: all caching of file data is managed directly by the system. Because the steps of loading file data into memory, writing data back from memory to the file, and freeing the memory blocks are all eliminated, memory-mapped files play an important role in processing files with large amounts of data. In addition, real systems often need to share data among multiple processes. When the amount of data is small, many approaches are workable; when it is large, memory-mapped files are needed. Memory-mapped files are one of the most efficient ways to share data among multiple processes on the same machine.
Memory-mapped files are not simply a file I/O facility; they rest on one of the core mechanisms of Windows, memory management. A deeper understanding of memory-mapped files therefore requires a clear understanding of how the Windows operating system manages memory. The general procedure for using a memory-mapped file is as follows:
First, call CreateFile() to create or open a file kernel object, which identifies the file on disk that will back the memory-mapped file. CreateFile() tells the operating system only where the backing file is located, not how large the mapping will be. To specify how much physical storage the file mapping object needs, call CreateFileMapping() to create a file mapping kernel object, telling the system the size of the file and how the file will be accessed. Once the file mapping object exists, a region of address space must be reserved for the file data and the file data committed as the physical storage mapped to that region. MapViewOfFile() does this, mapping all or part of the file mapping object into the address space of the process. From this point on, working with the memory-mapped file is essentially the same as working with file data that has been loaded into memory in the usual way.
When the memory-mapped file is no longer needed, the resources it uses must be released. This part is relatively simple: call UnmapViewOfFile() to remove the view of the file data from the address space of the process, and CloseHandle() to close the file mapping object and the file object created earlier.
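The sequence described above can be sketched end to end. The following is a minimal, Windows-only illustration under stated assumptions, not a definitive implementation: the function name CountLineFeeds is hypothetical, the file is mapped read-only in a single view (which only works if the whole file fits in the free address space of the process), and error handling is reduced to returning false. The preprocessor guard only keeps the snippet inert on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>

// Count the '\n' bytes in a file by reading it through a read-only view.
// Returns false if any step fails.
bool CountLineFeeds(const wchar_t* path, unsigned long long* count) {
    *count = 0;

    // 1. Create/open the file kernel object.
    HANDLE hFile = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return false;

    LARGE_INTEGER size;
    if (!GetFileSizeEx(hFile, &size)) { CloseHandle(hFile); return false; }
    if (size.QuadPart == 0) { CloseHandle(hFile); return true; }  // empty file: nothing to map

    // 2. Create the file mapping kernel object. A maximum size of 0/0 means
    //    "use the current size of the file".
    HANDLE hMapping = CreateFileMappingW(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    CloseHandle(hFile);  // the mapping object keeps the file open
    if (hMapping == NULL) return false;

    // 3. Map a view; a length of 0 maps from the offset to the end of the mapping.
    const BYTE* data = static_cast<const BYTE*>(
        MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0));
    if (data == NULL) { CloseHandle(hMapping); return false; }

    // 4. Use the file contents with ordinary memory reads.
    for (LONGLONG i = 0; i < size.QuadPart; ++i) {
        if (data[i] == '\n') ++*count;
    }

    // 5. Release the view and the mapping object.
    UnmapViewOfFile(data);
    CloseHandle(hMapping);
    return true;
}
#endif
```

For files larger than the available address space, the same steps apply, but step 3 would map one bounded view at a time instead of the whole file.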

Related functions

The main API functions used with memory-mapped files, mentioned above, are described below:
HANDLE CreateFile(LPCTSTR lpFileName, DWORD dwDesiredAccess,DWORD dwShareMode,LPSECURITY_ATTRIBUTES lpSecurityAttributes,DWORD dwCreationDisposition,DWORD dwFlagsAndAttributes,HANDLE hTemplateFile);
CreateFile() is commonly used to create and open files in ordinary file operations as well. When working with memory-mapped files, it creates or opens a file kernel object and returns its handle. The parameters dwDesiredAccess and dwShareMode must be set according to whether data will be read or written and how the file is to be shared; incorrect settings cause the subsequent operations to fail.
HANDLE CreateFileMapping(HANDLE hFile, LPSECURITY_ATTRIBUTES lpFileMappingAttributes,DWORD flProtect,DWORD dwMaximumSizeHigh,DWORD dwMaximumSizeLow,LPCTSTR lpName);
CreateFileMapping() creates a file mapping kernel object. The parameter hFile specifies the handle of the file to be mapped into the process address space (the handle returned by CreateFile()). Because the physical storage of a memory-mapped file is a file on disk rather than memory allocated from the system page file, the system does not reserve a region of address space for it on its own initiative, nor does it automatically map the file's storage to such a region. The parameter flProtect tells the system which protection attributes to apply to the pages. The protection attributes PAGE_READONLY, PAGE_READWRITE, and PAGE_WRITECOPY mean that, once the file mapping object is mapped, the file data can be read, can be read and written, and can be read and written copy-on-write (writes go to a private copy and never reach the file), respectively. With PAGE_READONLY, CreateFile() must have been called with GENERIC_READ; with PAGE_READWRITE, CreateFile() must have been called with GENERIC_READ | GENERIC_WRITE; with PAGE_WRITECOPY, CreateFile() must have been called with at least GENERIC_READ. The DWORD parameters dwMaximumSizeHigh and dwMaximumSizeLow are also important: together they specify the maximum size of the file mapping object in bytes, and if both are zero the current size of the file is used. Since the two parameters form a 64-bit value, the maximum supported length is 16 EB, which is enough for practically any large-file processing task.
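Because the size is passed as two 32-bit halves, a 64-bit byte count has to be split before the call. The helper below is a minimal sketch of that arithmetic only; the names HighLow, SplitQword, and JoinQword are hypothetical, and standard fixed-width types stand in for DWORD so that the snippet does not depend on the Windows headers.

```cpp
#include <cstdint>

// Hypothetical helper type: the two 32-bit halves that CreateFileMapping()
// and MapViewOfFile() expect for a 64-bit size or offset.
struct HighLow {
    std::uint32_t high;  // would be passed as dwMaximumSizeHigh / dwFileOffsetHigh
    std::uint32_t low;   // would be passed as dwMaximumSizeLow / dwFileOffsetLow
};

// Split a 64-bit value into its high and low 32-bit halves.
inline HighLow SplitQword(std::uint64_t value) {
    return HighLow{static_cast<std::uint32_t>(value >> 32),
                   static_cast<std::uint32_t>(value & 0xFFFFFFFFu)};
}

// Recombine the halves; useful for checking that nothing was lost.
inline std::uint64_t JoinQword(HighLow parts) {
    return (static_cast<std::uint64_t>(parts.high) << 32) | parts.low;
}
```

For example, a 30 GB mapping (30 × 2^30 bytes) splits into a high half of 7 and a low half of 0x80000000. Passing only the low half, with the high half left at 0, would silently create a 2 GB mapping instead.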
LPVOID MapViewOfFile(HANDLE hFileMappingObject, DWORD dwDesiredAccess,DWORD dwFileOffsetHigh,DWORD dwFileOffsetLow,SIZE_T dwNumberOfBytesToMap);
MapViewOfFile() maps file data into the address space of the process. The parameter hFileMappingObject is the handle of the file mapping object returned by CreateFileMapping(). The parameter dwDesiredAccess specifies once more how the file data will be accessed, and it must be compatible with the protection attribute given to CreateFileMapping(). Setting the protection a second time may seem redundant, but it gives the application fine control over the protection of each view. MapViewOfFile() can map all or part of a file; the caller specifies the offset at which the view starts and the length to map. The offset is a 64-bit value formed from the DWORD parameters dwFileOffsetHigh and dwFileOffsetLow, and it must be an integer multiple of the system's allocation granularity. On Windows the allocation granularity is 64 KB in practice, but rather than hard-coding that value it is better to query it at run time with the following code:
SYSTEM_INFO sinf;
GetSystemInfo(&sinf);
DWORD dwAllocationGranularity = sinf.dwAllocationGranularity;
The parameter dwNumberOfBytesToMap specifies how many bytes of the file to map. Note that on Windows 9x, if MapViewOfFile() cannot find a region large enough to hold the entire file mapping object, it returns NULL; on Windows 2000, by contrast, MapViewOfFile() only needs to find a region large enough for the requested view, regardless of the size of the whole file mapping object.
When the application has finished with the file data mapped into its address space, it must release the view with UnmapViewOfFile(). The function prototype is as follows:
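Since a view can only start on a multiple of the allocation granularity, accessing an arbitrary byte range means rounding the offset down and remembering the difference. The following is a minimal sketch of that calculation; the names ViewWindow and AlignView are hypothetical, and the granularity is passed in as a plain parameter (on Windows it would come from sinf.dwAllocationGranularity).

```cpp
#include <cstdint>

// Hypothetical helper type describing the view needed for a byte range.
struct ViewWindow {
    std::uint64_t base;    // aligned file offset to pass to MapViewOfFile()
    std::uint64_t delta;   // distance from the start of the view to the wanted byte
    std::uint64_t length;  // number of bytes to map (delta + requested length)
};

// Compute the view that covers [offset, offset + length) when views may only
// start at multiples of `granularity`.
inline ViewWindow AlignView(std::uint64_t offset, std::uint64_t length,
                            std::uint32_t granularity) {
    const std::uint64_t base = offset - (offset % granularity);  // round down
    const std::uint64_t delta = offset - base;
    return ViewWindow{base, delta, delta + length};
}
```

With a granularity of 65536, a request for 4096 bytes at offset 100000 yields a view that starts at 65536 and is 38560 bytes long, and the wanted data begins 34464 bytes into the view. The caller would then read from the returned view pointer plus delta.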
BOOL UnmapViewOfFile(LPCVOID lpBaseAddress);
The only parameter, lpBaseAddress, specifies the base address of the region to release; it must be the value returned by MapViewOfFile(). Every call to MapViewOfFile() must be matched by a call to UnmapViewOfFile(); otherwise the reserved region is not released until the process terminates. In addition, the file kernel object and the file mapping kernel object created earlier by CreateFile() and CreateFileMapping() must be released with CloseHandle() before the process terminates, or resources will leak.
Besides these essential API functions, other auxiliary functions may be needed depending on the situation. For example, to improve speed the system caches the data pages of a mapped file and does not update the file on disk immediately while the view is being modified. When this matters, use FlushViewOfFile(), which forces the system to write some or all of the modified data back to the file on disk so that updates are saved promptly.

Application examples

The following example illustrates the use of memory-mapped files. The program receives data from a port and stores it on disk in real time. Because the amount of data is large (tens of gigabytes), a memory-mapped file is used. The code below is the main part of the worker thread MainProc, which starts when the program runs. When data arrives on the port, the event hEvent[0] is signaled; once WaitForMultipleObjects() returns for that event, the thread saves the received data to disk. When reception is terminated, the event hEvent[1] is signaled, and the handler for that event releases resources and closes the file. The thread function is implemented as follows:
// Query the allocation granularity; view offsets must be multiples of it
SYSTEM_INFO sinf;
GetSystemInfo(&sinf);
// Create a file kernel object whose handle is saved in hFile
HANDLE hFile = CreateFile(TEXT("Recv1.zip"), GENERIC_WRITE | GENERIC_READ, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
// Size of the mapping, and therefore of the file on disk. Make it large enough
// for all the data expected: a view cannot extend past the end of the mapping,
// and mapping a new view fails once the written data exceeds this value
// (GetLastError() was observed to return 183 in that case).
__int64 qwFileSize = 0x4000000;
// Create a file mapping kernel object, and store the handle in hFileMapping
HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READWRITE, (DWORD)(qwFileSize >> 32), (DWORD)(qwFileSize & 0xFFFFFFFF), NULL);
// Close the file handle; the mapping object keeps the file open
CloseHandle(hFile);
__int64 qwFileOffset = 0;  // total number of bytes written so far
__int64 qwViewBase = 0;    // file offset at which the current view starts
__int64 T = 600 * sinf.dwAllocationGranularity;              // remap threshold
DWORD dwBytesInBlock = 1000 * sinf.dwAllocationGranularity;  // view length
// Map file data to the address space of the process
PBYTE pbFile = (PBYTE)MapViewOfFile(hFileMapping, FILE_MAP_ALL_ACCESS, (DWORD)(qwViewBase >> 32), (DWORD)(qwViewBase & 0xFFFFFFFF), dwBytesInBlock);
while (bLoop)
{
    // Wait for event hEvent[0] or event hEvent[1]
    DWORD ret = WaitForMultipleObjects(2, hEvent, FALSE, INFINITE);
    ret -= WAIT_OBJECT_0;
    switch (ret)
    {
    // Data-received event
    case 0:
        // Receive data from the port and store it in the mapped view;
        // the write position is relative to the start of the current view
        nReadLen = syio_Read(port[1], pbFile + (qwFileOffset - qwViewBase), QueueLen);
        qwFileOffset += nReadLen;
        // When the view is 60% full, map a new view to prevent overflow
        if (qwFileOffset > T)
        {
            T = qwFileOffset + 600 * sinf.dwAllocationGranularity;
            UnmapViewOfFile(pbFile);
            // The new view must start on a multiple of the allocation granularity
            // and must not extend past the end of the mapping
            qwViewBase = qwFileOffset - (qwFileOffset % sinf.dwAllocationGranularity);
            __int64 qwRemaining = qwFileSize - qwViewBase;
            DWORD dwViewLen = (qwRemaining < dwBytesInBlock) ? (DWORD)qwRemaining : dwBytesInBlock;
            pbFile = (PBYTE)MapViewOfFile(hFileMapping, FILE_MAP_ALL_ACCESS, (DWORD)(qwViewBase >> 32), (DWORD)(qwViewBase & 0xFFFFFFFF), dwViewLen);
        }
        break;
    // Termination event
    case 1:
        bLoop = FALSE;
        // Remove the view of the file data from the process's address space
        UnmapViewOfFile(pbFile);
        // Close the file mapping object
        CloseHandle(hFileMapping);
        break;
    }
}…
If the termination branch simply calls UnmapViewOfFile() and CloseHandle(), the file on disk keeps the size of the mapping rather than the size of the data actually received. For example, if the memory-mapped file was created at 30 GB and only 14 GB of data was received, the saved file is still 30 GB long after the program above has run. After processing is complete, the data therefore has to be copied, again through memory-mapped files, into a file of the actual size. The following is the main code that does this:
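// Run this in place of the simple cleanup in the termination branch,
// while hFileMapping is still open.
// Remap the original file from its beginning so that all received data is visible.
// (A single view only works if the data fits in the free address space;
// otherwise copy in chunks through smaller views.)
UnmapViewOfFile(pbFile);
pbFile = (PBYTE)MapViewOfFile(hFileMapping, FILE_MAP_READ, 0, 0, (SIZE_T)qwFileOffset);
// Create another file kernel object
hFile2 = CreateFile(TEXT("Recv.zip"), GENERIC_WRITE | GENERIC_READ, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
// Create another file mapping kernel object with the actual data length
hFileMapping2 = CreateFileMapping(hFile2, NULL, PAGE_READWRITE, (DWORD)(qwFileOffset >> 32), (DWORD)(qwFileOffset & 0xFFFFFFFF), NULL);
// Close the file kernel object
CloseHandle(hFile2);
// Map file data to the address space of the process
pbFile2 = (PBYTE)MapViewOfFile(hFileMapping2, FILE_MAP_ALL_ACCESS, 0, 0, (SIZE_T)qwFileOffset);
// Copy data from the original memory-mapped file to this memory-mapped file
memcpy(pbFile2, pbFile, (size_t)qwFileOffset);
// Remove both views from the process's address space
UnmapViewOfFile(pbFile);
UnmapViewOfFile(pbFile2);
// Close the file mapping objects
CloseHandle(hFileMapping);
CloseHandle(hFileMapping2);
// Delete the temporary file
DeleteFile(TEXT("Recv1.zip"));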
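As an alternative to copying, and not the method used in the example above, the oversized file can be cut down in place by moving the file pointer to the actual data length and setting the end of file there. The sketch below is Windows-only and makes several assumptions: the function name TruncateFileTo is hypothetical, and it must be called only after every view has been unmapped and the file mapping object closed, because the end of file cannot be moved while the file is still mapped. The preprocessor guard only keeps the snippet inert on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>

// Shrink (or extend) the file at `path` to exactly `newSize` bytes.
// Precondition: no views or mapping objects for this file are still open.
// Returns false if any step fails.
bool TruncateFileTo(const wchar_t* path, long long newSize) {
    HANDLE hFile = CreateFileW(path, GENERIC_WRITE, 0, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return false;

    LARGE_INTEGER pos;
    pos.QuadPart = newSize;

    // Move the file pointer to the desired length, then make that position
    // the new end of file.
    bool ok = SetFilePointerEx(hFile, pos, NULL, FILE_BEGIN) &&
              SetEndOfFile(hFile);

    CloseHandle(hFile);
    return ok;
}
#endif
```

In terms of the example, this would be called with the receiving file's name and qwFileOffset after the cleanup in the termination branch, which avoids creating a second file and copying the data.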