How to Prevent Changes to the Underlying File After mmap()-ing a File from Being Visible to My Program

How to have a checkpoint file using mmap which is only synced to disk manually

mmap can't be used for this purpose. With a shared mapping there is no way to prevent modified data from being written to disk; the kernel may write dirty pages back whenever it likes. In practice, using mlock() to make the memory unswappable might have the side effect of keeping it from being written to disk until you ask for it, but there's no guarantee. And if another process opens the file, it will see the copy cached in memory (with your latest changes), not the copy on the physical disk. What you should do depends on whether you're trying to synchronize with other processes or just want safety in case of a crash or power failure.

If your data size is small, you might try a number of other methods for atomic syncing to disk. One way is to store the entire dataset in a filename: create an empty file by that name, then delete the old file. If two files exist at startup (because of a crash at an extremely unlucky moment), delete the older one and resume from the newer one. write() may also be atomic if your data is smaller than a filesystem block, a page, or a disk block, but I don't know of any guarantee to that effect offhand. You'd have to do some research.

Another very standard approach, which works as long as your data isn't so big that two copies won't fit on disk: write a second copy under a temporary name, then rename() it over the old one. On POSIX systems rename() replaces the target atomically when both names are on the same filesystem. This is probably the best approach unless you have a reason not to do it that way.
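
The temporary-file-plus-rename approach can be sketched as follows. This is a minimal sketch in Python, whose os module wraps the same POSIX calls; the function name atomic_checkpoint and the ".ckpt-" prefix are made up for illustration. The fsync() calls are an addition not mentioned above: rename() makes the switch atomic, but flushing the file (and, on POSIX systems, its directory) is what is assumed here to make the result survive a power failure.

```python
import os
import tempfile


def atomic_checkpoint(path: str, data: bytes) -> None:
    """Replace `path` with `data` so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file contents to stable storage
        os.replace(tmp_path, path)  # rename() underneath on POSIX
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX system, where a directory can be opened and fsync'ed
    # so that the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling atomic_checkpoint("state.bin", payload) at each checkpoint leaves exactly one file under that name. Note that mkstemp() creates the file with owner-only permissions, so adjust the mode if other users need to read it.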

How to Disable Copy-on-write and zero filled on demand for mmap()

There is the MAP_POPULATE flag of mmap(2):

http://linux.die.net/man/2/mmap

MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file mapping, this causes read-ahead on the file. Later accesses to the mapping will not be blocked by page faults. MAP_POPULATE is only supported for private mappings since Linux 2.6.23.

This should pre-fault all pages in the mapped region. It should work for question (1), but may not work for question (2) (shared mappings).
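
As a rough illustration, here is how the flag might be passed from Python, whose mmap module forwards flags and prot to mmap(2). The helper name map_prefaulted is hypothetical. mmap.MAP_POPULATE is only exposed on Linux with Python 3.10 or later, so the sketch falls back to 0 (no prefaulting) elsewhere. Prefaulting is a request to the kernel; this sketch does not verify that page faults are actually avoided.

```python
import mmap
import os

# Linux-only flag, exposed by Python 3.10+; fall back to 0 where it is missing.
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def map_prefaulted(path: str) -> mmap.mmap:
    """Map `path` privately and writable, asking the kernel to prefault its pages."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        return mmap.mmap(
            fd,
            size,
            flags=mmap.MAP_PRIVATE | MAP_POPULATE,
            prot=mmap.PROT_READ | mmap.PROT_WRITE,
        )
    finally:
        os.close(fd)  # the mapping keeps its own reference to the file
```

Because the mapping is private, stores into it modify only the process's own copy of the pages and never reach the file.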

mmap of a local vs. NFS file: what happens when the underlying file is replaced on disk?

If you have an open reference to a file, that reference will continue to refer to the same file for as long as the reference lives, even if the file itself is deleted or renamed and even if its name is reused by a brand new file after it is deleted. The reference can be a file descriptor or a memory mapping. This is part of POSIX and it's true (or should be!) no matter what type of filesystem is in use.

In other words: if you open a file on an NFS filesystem and map it into memory, you can continue to use that memory mapping for as long as you don't unmap it, even if some other process (or the same process) deletes the file and replaces it with a new one with the same name.
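
A small experiment along these lines, sketched in Python and assuming a local POSIX filesystem (the function name and the file name data.bin are made up for illustration): map a file, replace it under the same name with a new file, then compare what the old mapping sees with what a fresh open of the name sees.

```python
import mmap
import os
import tempfile


def mapping_survives_replace(directory: str) -> tuple[bytes, bytes]:
    """Return (bytes seen through the old mapping, bytes read via the file name)."""
    path = os.path.join(directory, "data.bin")
    with open(path, "wb") as f:
        f.write(b"old contents")

    with open(path, "rb") as f:
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # The descriptor is closed now; only the mapping still refers to the old file.

    # Put a brand-new file under the same name via rename().
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(b"new contents")
    os.replace(tmp_path, path)

    try:
        via_mapping = bytes(mapping[:])
    finally:
        mapping.close()
    with open(path, "rb") as f:
        via_name = f.read()
    return via_mapping, via_name
```

On a local filesystem the mapping keeps returning the old contents while the name resolves to the new file. On NFS the outcome depends on the client implementation handling this case as described.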

It's true that the NFS protocol is stateless, so the implementation has to take special steps to make sure this case is handled correctly. It's been a very long time since I looked at how it's done, but the last time I did (on Solaris), it was done by renaming files to special hidden names (.nfsXXXXX) instead of deleting them if their link count was decremented to zero while there were still open references to them. Anyway, whatever trick is used by the implementation, you as the user of the filesystem shouldn't have to worry about it.

If mmap is faster than legacy file access, where do we see the time savings?

Updates made through the mapping are not immediately visible on disk; they may not reach the file until an munmap() or a following msync() call. Hence the updates themselves involve no system call, and the kernel only gets involved when a page fault occurs. Because the file is read lazily, page by page, as needed, the OS may have to read in portions of the file as you cross page boundaries. The most obvious advantage of memory mapping is that it eliminates kernel-space to user-space data copies. There is also no need for system calls to seek to a specific position in the file.
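
To make the difference concrete, here is a minimal Python sketch (the helper name update_in_place is hypothetical) that modifies a file through a shared mapping. The store is an ordinary memory write; the only explicit system call after the mapping is set up is the flush(), which CPython implements with msync().

```python
import mmap


def update_in_place(path: str, offset: int, payload: bytes) -> None:
    """Overwrite bytes of `path` through a shared mapping, then msync them."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:  # shared and read/write by default
            # Plain memory store: no write() or lseek() call is issued here.
            m[offset:offset + len(payload)] = payload
            m.flush()  # msync(): ask the kernel to write the dirty pages back
```

With the legacy interface the same update would take an lseek() plus a write(), or a single pwrite(), each copying the payload from user space into the kernel.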


