How to Ensure All Data Has Been Physically Written to Disk

How to ensure all data has been physically written to disk?

Under Windows, look at FlushFileBuffers (Win32 API).

How do I ensure data is written to disk before closing fstream?

You cannot do this with standard tools and have to rely on OS facilities.
On POSIX, fsync should be what you need. Since there is no way to get a C file descriptor from a standard stream, you would have to either use C streams throughout your application or open the file a second time solely to flush it to disk. Alternatively, there is sync, but it flushes all buffers system-wide, which your users and other applications are going to hate.
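The flush-then-sync order can be sketched in Python, where a file object does expose its descriptor through fileno(). This is only an illustration, not the C++ answer itself: the helper name write_durably is hypothetical, and os.fsync is the standard-library wrapper that calls fsync on Unix (the Python documentation states it calls _commit on Windows).

```python
import os


def write_durably(path, data):
    """Write bytes to path and return only after the OS reports them synced.

    Hypothetical helper for illustration. It cannot detect a disk that
    acknowledges a flush without actually performing it.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain the runtime's user-space buffer into the OS
        os.fsync(f.fileno())  # ask the OS to push its cached data to the device
```

Calling flush() first matters: fsync only knows about data that has already reached the operating system, not about bytes still sitting in the runtime's own buffer.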

Does fwrite block until data has been written to disk?

There are several places where data is buffered to improve efficiency when using fwrite(): within the C/C++ runtime, within the operating system's file system interface, and within the actual disk hardware.

By default, each of these layers delays the actual physical write to disk until there is an explicit request to flush its buffers, unless the appropriate indicators are turned on to perform physical writes as the write requests are made.

If you want to change the buffering behavior of fwrite(), take a look at the setbuf() function: see the setbuf() Linux man page and the Microsoft documentation on setbuf().
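The runtime-level layer of buffering can be observed directly. The sketch below uses Python rather than C, on the assumption that the buffering argument of open() is a fair stand-in for what setbuf() controls; the function name sizes_seen_by_os is hypothetical. It only shows the first layer (runtime to OS) and says nothing about the OS cache or the disk's own cache.

```python
import os
import tempfile


def sizes_seen_by_os(buffering):
    """Return the file size the OS reports before and after an explicit flush.

    buffering=-1 selects the runtime's default buffer; buffering=0 disables
    the runtime buffer, so every write goes straight to the OS.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb", buffering=buffering) as f:
            f.write(b"hello")
            before = os.path.getsize(path)  # what the OS has received so far
            f.flush()
            after = os.path.getsize(path)   # what the OS has after the flush
        return before, after
    finally:
        os.remove(path)
```

With the default buffer, the five bytes stay in the runtime until flush() is called, so the OS reports a size of 0 first and 5 afterwards. With buffering=0 the OS already reports 5 before the flush.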

And if you look at the documentation for the underlying Windows CreateFile() function, you will see a number of flags, including ones that control whether data is buffered.

FILE_FLAG_NO_BUFFERING 0x20000000

The file or device is being opened with no system caching for data
reads and writes. This flag does not affect hard disk caching or
memory mapped files.

There are strict requirements for successfully working with files
opened with CreateFile using the FILE_FLAG_NO_BUFFERING flag, for
details see File Buffering.

And see the Microsoft documentation topic File Buffering.

In a simple example, the application would open a file for write
access with the FILE_FLAG_NO_BUFFERING flag and then perform a call to
the WriteFile function using a data buffer defined within the
application. This local buffer is, in these circumstances, effectively
the only file buffer that exists for this operation. Because of
physical disk layout, file system storage layout, and system-level
file pointer position tracking, this write operation will fail unless
the locally-defined data buffers meet certain alignment criteria,
discussed in the following section.

Take a look at this discussion about OS-level settings, apparently for Linux: https://superuser.com/questions/479379/how-long-can-file-system-writes-be-cached-with-ext4

C# StreamWriter - When is stream physically written to file?

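The StreamWriter has an internal buffer, and once that buffer is full, it is flushed to the underlying stream. You can force a flush at any time by calling Flush(). For a file, this hands the data to the operating system, which may still cache it; FileStream.Flush(true) additionally asks for the intermediate file buffers to be written through to disk.

You can specify the buffer size in the StreamWriter constructor if you wish.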

Write file: Data consistency in practice

The basic way to overwrite data in a crash-safe way is:

  1. Write the data to a new storage location first. (You're not actually overwriting anything yet.)
  2. Tell the OS to flush the above to stable storage, using something like the POSIX fsync function. This is meant to flush caches and everything, so that when the function returns, the data is actually physically on disk.
  3. Write a "journal" entry somewhere that indicates that all the new data for this update has been written and is ready to commit.
  4. Flush the journal entry to disk.
  5. Read the data that you wrote in step 1 and write it to the "real" storage location. (This is where you do the actual overwrite.)
  6. Write another journal entry that says the change has been committed.
  7. Delete the temporary file that you created in step 1.

The flushes serve as write barriers: they ensure that everything before the flush has been safely stored on disk before anything after the flush can be written. Between a pair of barriers, reordering of writes (e.g. due to disks with command queueing) isn't a problem, because the barriers ensure that the order is correct in the places where it matters. In step 1, you don't care if the disk physically writes the second half of the file before it writes the first half; you just care that the whole file has been written before the journal entry attesting that the new file is complete.
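The seven steps can be sketched as follows. This is a minimal illustration under several assumptions: the function names, the ".new" suffix for the temporary file, and the READY and COMMITTED journal lines are all hypothetical, and the journal tracks a single pending update. The sync after step 5 is not spelled out in the list above; it is added here as the barrier before the step 6 entry.

```python
import os


def _write_and_sync(path, data, mode="wb"):
    """Write data, then flush the runtime buffer and fsync (the write barrier)."""
    with open(path, mode) as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def crash_safe_overwrite(target, data, journal):
    """Overwrite target with data, following steps 1 to 7 (one update at a time)."""
    tmp = target + ".new"                                 # hypothetical naming scheme
    _write_and_sync(tmp, data)                            # steps 1 and 2
    _write_and_sync(journal, b"READY\n", mode="ab")       # steps 3 and 4
    with open(tmp, "rb") as src:                          # step 5: the real overwrite
        _write_and_sync(target, src.read())               # synced before step 6 (assumption)
    _write_and_sync(journal, b"COMMITTED\n", mode="ab")   # step 6
    os.remove(tmp)                                        # step 7
```

A real implementation would also record which file each journal entry refers to and would eventually truncate the journal; both are left out to keep the sketch short.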

After a crash, you go through the journal and process each entry:

  • If you find a file from step 1 that doesn't have a corresponding entry from step 3, treat the file as incomplete and discard it. This is a rollback of an incomplete change.
  • If the entry from step 3 is present but not the one from step 6, repeat step 5. It's possible that step 5 was partially completed before the crash, but that doesn't matter; it just means you might be overwriting some of the data with identical bytes, which is harmless.
  • If the entry from step 6 is present, repeat step 7 by deleting the file if it still exists.
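The three recovery cases can be sketched like this, assuming a journal file that holds the line READY for the step 3 entry and COMMITTED for the step 6 entry, and a temporary file named after the target with a ".new" suffix. All of these names, and the returned status strings, are hypothetical, and the sketch handles only a single pending update.

```python
import os


def _sync_write(path, data, mode):
    with open(path, mode) as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def recover(target, journal):
    """Bring target back to a consistent state after a crash."""
    tmp = target + ".new"
    entries = []
    if os.path.exists(journal):
        with open(journal, "rb") as j:
            entries = j.read().split()   # a torn, partial entry matches nothing

    if b"COMMITTED" in entries:
        # Step 6 entry present: only step 7 may be outstanding.
        if os.path.exists(tmp):
            os.remove(tmp)
        return "cleaned"

    if b"READY" in entries:
        # Step 3 entry present, step 6 missing: repeat step 5, then finish 6 and 7.
        with open(tmp, "rb") as src:
            _sync_write(target, src.read(), "wb")
        _sync_write(journal, b"COMMITTED\n", "ab")
        os.remove(tmp)
        return "replayed"

    # No step 3 entry: the new file may be incomplete, so roll back.
    if os.path.exists(tmp):
        os.remove(tmp)
    return "rolled_back"
```

Running recover twice in a row is harmless: the second call finds the COMMITTED entry and has nothing left to delete.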

You might find it informative to read PostgreSQL's documentation on reliability and write-ahead logging (which is PostgreSQL's term for the sort of journaling mechanism described above). It incorporates additional safety measures, such as checksumming of WAL (journal) entries to protect against corruption, and it defers and batches disk flushes for better performance during normal operation (at the expense of crash recovery possibly taking a little longer).

Speaking of databases, however, it'd probably be much easier and safer to actually use one — with its robust and well-tested consistency and durability mechanisms — than trying to roll your own. If a full database server like PostgreSQL is too heavyweight for your application, consider using something lighter like SQLite or Berkeley DB (which is a low-level key-value store, not an SQL relational database). Both support atomic commits.
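As an illustration of what an atomic commit buys you, here is a minimal sketch using Python's standard sqlite3 module. The accounts table, its columns, and the transfer function are hypothetical; the only library behavior relied on is that using a connection as a context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount between two rows; both updates are committed together or not at all."""
    with conn:  # commit on success, rollback if the block raises
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        row = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if row[0] < 0:
            raise ValueError("insufficient funds")  # undoes the first UPDATE
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
```

If the process dies between the two UPDATE statements, SQLite's own journal discards the half-finished transaction the next time the database is opened, which is the same rollback logic described above without having to write it yourself.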

Ensure fsync did its job

No, there is no way to verify that.

With fsync you tell your OS to write it to disk and as far as the OS is concerned, it has been written to disk.

If disks are faking this, then unfortunately it is not something you can really change.
With proper disk systems (e.g. RAID setups with a battery backup unit), you can simply enable or disable the write cache to mostly avoid this.

Do note that if you open the file with the O_DIRECT and O_SYNC flags, writes should go to disk: http://www.kernel.org/doc/man-pages/online/pages/man2/open.2.html
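A minimal sketch of opening a file with O_SYNC, using Python's os module on a Unix-like system (os.O_SYNC is not available on Windows). The function name write_synchronously is hypothetical. O_DIRECT is deliberately left out of the sketch: it is platform-specific and imposes alignment requirements on the buffer, offset, and length that would obscure the example.

```python
import os


def write_synchronously(path, data):
    """Write bytes through a descriptor opened with O_SYNC (Unix-like systems only)."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
    finally:
        os.close(fd)
```

With O_SYNC each write call is expected to return only after the OS considers the data written, so no separate fsync call appears here. As noted above, this still depends on the disk reporting honestly.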


