How to have a checkpoint file using mmap which is only synced to disk manually
mmap can't be used for this purpose. There's no way to prevent the kernel from writing modified pages of the mapping to disk whenever it chooses. In practice, using mlock() to make the memory unswappable might have a side effect of preventing it from getting written to disk except when you ask for it to be written, but there's no guarantee. Certainly if another process opens the file, it's going to see the copy cached in memory (with your latest changes), not the copy on physical disk. In many ways, what you should do depends on whether you're trying to synchronize with other processes or just want safety in case of a crash or power failure.
If your data size is small, you might try a number of other methods for atomic syncing to disk. One way is to encode the entire dataset in a filename: create an empty file with that name, then delete the old file. If 2 files exist at startup (because of a crash at an extremely unlucky moment), delete the older one and resume from the newer one. write() may also be atomic if your data size is smaller than a filesystem block, page size, or disk block, but I don't know of any guarantee to that effect right off. You'd have to do some research.
Another very standard approach that works as long as your data isn't so big that 2 copies won't fit on disk: just create a second copy with a temporary name, then rename() it over top of the old one. rename() replaces the target atomically as long as both names are on the same filesystem, so a reader sees either the old file or the new one, never a mix. This is probably the best approach unless you have a reason not to do it that way.
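The temporary-file-plus-rename approach can be sketched as follows, in Python, whose os module wraps the same system calls. This is a minimal sketch, not a definitive implementation: the function name `atomic_write` and the `.ckpt-` prefix are made up for illustration, and the fsync() calls on the file and on the directory are an added assumption for crash durability on a POSIX system that the answer above does not mention.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data` via a temporary file and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # assumption: push the new contents to disk first
        os.replace(tmp_path, path)  # rename() over the old checkpoint
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: also sync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling `atomic_write("checkpoint.bin", payload)` at each checkpoint leaves either the previous checkpoint or the new one under that name.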
How to Disable Copy-on-write and zero filled on demand for mmap()
There is a MAP_POPULATE flag for mmap(2):
http://linux.die.net/man/2/mmap
MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file mapping, this causes read-ahead on the file. Later accesses to the mapping will not be blocked by page faults. MAP_POPULATE is only supported for private mappings since Linux 2.6.23.
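It should prefault all pages in the mapped region. It should work for question (1). For question (2) (shared mappings), the last quoted sentence is ambiguous: later versions of the man page word it as "supported for private mappings only since Linux 2.6.23", which reads as a statement about when private mappings gained support rather than a restriction on shared ones, so test it on your kernel.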
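For illustration, here is roughly how the flag is passed, sketched in Python. The helper name `map_prefaulted` is hypothetical. `mmap.MAP_POPULATE` is only exposed on Linux with Python 3.10 or later, so the sketch assumes a fallback of 0 (no prefaulting) elsewhere.

```python
import mmap
import os

# MAP_POPULATE is Linux-only and exposed by Python 3.10+; fall back to 0
# (no prefaulting) elsewhere so the sketch still runs.
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def map_prefaulted(path: str) -> mmap.mmap:
    """Map `path` privately and read-only, asking the kernel to prefault it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        return mmap.mmap(fd, size,
                         flags=mmap.MAP_PRIVATE | MAP_POPULATE,
                         prot=mmap.PROT_READ)
    finally:
        os.close(fd)  # the mapping stays valid after the descriptor is closed
```

Whether the pages really were prefaulted is not observable from this code; it only shows where the flag goes.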
mmap local v/s nfs file: what happens when the underlying file is replaced on disk?
If you have an open reference to a file, that reference will continue to refer to the same file for as long as the reference lives, even if the file itself is deleted or renamed and even if its name is reused by a brand new file after it is deleted. The reference can be a file descriptor or a memory mapping. This is part of POSIX and it's true (or should be!) no matter what type of filesystem is in use.
In other words: if you open a file on an NFS filesystem and map it into memory, you can continue to use that memory mapping for as long as you don't unmap it, even if some other process (or the same process) deletes the file and replaces it with a new one with the same name.
It's true that the NFS protocol is stateless, so the implementation has to take special steps to make sure this case is handled correctly. It's been a very long time since I looked at how it's done, but the last time I did (on Solaris), it was done by renaming files to special hidden names (.nfsXXXXX) instead of deleting them if their link count was decremented to zero while there were still open references to them. Anyway, whatever trick is used by the implementation, you as the user of the filesystem shouldn't have to worry about it.
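The behaviour of an open mapping when the name is reused can be checked with a small experiment. The sketch below is in Python and runs against a local temporary directory, not NFS; the function name `mapping_survives_replace` and the file names are invented for the example.

```python
import mmap
import os
import tempfile


def mapping_survives_replace() -> tuple[bytes, bytes]:
    """Return (bytes seen through the old mapping, bytes seen by a fresh open)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "data.bin")
        with open(path, "wb") as f:
            f.write(b"old contents")

        # Map the original file; the mapping outlives the file object.
        with open(path, "rb") as f:
            old_map = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)

        # Replace the file: write a new one and rename it over the old name.
        new_path = path + ".new"
        with open(new_path, "wb") as f:
            f.write(b"new contents")
        os.replace(new_path, path)

        seen_by_mapping = old_map[:]       # still the original file
        with open(path, "rb") as f:
            seen_by_new_open = f.read()    # the replacement file
        old_map.close()
        return seen_by_mapping, seen_by_new_open
```

On a local POSIX filesystem this returns `(b"old contents", b"new contents")`: the mapping keeps referring to the original file while the name now points at the new one.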
If mmap is faster than legacy file accessing, where we see the time saving?
Updates to the file are not immediately visible on disk, but are written out after a munmap or following an msync call. Hence, no system call is needed for the updates themselves, and the kernel is involved only when a page fault occurs. Since the file is read lazily, page by page, as needed, the OS may need to read in portions of the file as you cross page boundaries. The most obvious advantage of memory mapping is that it eliminates kernel-space to user-space data copies. There is also no need for system calls to seek to a specific position in a file.
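As a rough illustration of updating a file through a mapping, here is a sketch in Python. The helper `bump_counter` is hypothetical; `mmap.flush()` is Python's wrapper around msync.

```python
import mmap
import os


def bump_counter(path: str, offset: int = 0) -> int:
    """Increment the byte at `offset` in place through a shared mapping."""
    fd = os.open(path, os.O_RDWR)
    try:
        # On Unix the default is a shared, readable and writable mapping.
        with mmap.mmap(fd, 0) as m:
            # Plain memory access: no read(), write() or lseek() call here.
            m[offset] = (m[offset] + 1) % 256
            # msync(): ask for the modified page to be written back now.
            m.flush()
            return m[offset]
    finally:
        os.close(fd)
```

The update itself is an ordinary indexed store into the mapping; the only explicit system calls are the ones that set up, flush and tear down the mapping.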