Should Lock_Ex on Both Read & Write Be Atomic

Should LOCK_EX on both read & write be atomic?

Since this answer is long, here's the summary: No, file_get_contents() is not atomic as it does not respect advisory locks.

About file locks in PHP:

In PHP, while on a *nix platform, filesystem locking is advisory only. Per the docs (Emphasis mine):

PHP supports a portable way of locking complete files in an advisory way (which means all accessing programs have to use the same way of locking or it will not work). By default, this function will block until the requested lock is acquired; this may be controlled (on non-Windows platforms) with the LOCK_NB option documented below.

So, as long as all of the processes that are accessing the file use this method of locking, you're fine.

However, if you're writing a static HTML file that is served by a sane webserver, the lock will be ignored. If a request comes in while the write is in progress, Apache will serve the partially written file. The lock has no effect on another process that reads the file without checking for it.

The only real exception is if you use the special mount option of -o mand on the filesystem to enable mandatory locking (but that's not really used much, and can have a performance penalty).

Have a read on File Locking for some more information. Namely the section under Unix:

This means that cooperating processes may use locks to coordinate access to a file among themselves, but programs are also free to ignore locks and access the file in any way they choose to.

So, in conclusion, using LOCK_EX will create an advisory lock on the file. Any attempt to read the file will block only if the reader respects and/or checks for the lock. If they do not, the lock will be ignored (since it can be).

Try it out. In one process:

file_put_contents('test.txt', 'Foo bar');
$f = fopen('test.txt', 'a+');
if (flock($f, LOCK_EX)) {
    sleep(10);                    // Hold the exclusive lock for 10 seconds.
    fseek($f, 0);
    var_dump(fgets($f, 4048));
    flock($f, LOCK_UN);
}
fclose($f);

And while it's sleeping, call this:

$f = fopen('test.txt', 'a+');
fwrite($f, 'foobar');
fclose($f);

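The output will be string(13) "Foo barfoobar": the second process appended to the file even though the first one was holding an exclusive lock.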

About file_get_contents specifically:

To your other specific question, first off, there is no LOCK_EX parameter to file_get_contents. So you can't pass that in.

Now, looking at the source code, we can see the function file_get_contents defined on line 521. It makes no calls to the internal function php_stream_lock, unlike file_put_contents('file', 'txt', LOCK_EX), which is defined on line 589 of the same file.

So, let's test it, shall we:

In file1.php:

file_put_contents('test.txt', 'Foo bar');
$f = fopen('test.txt', 'a+');
if (flock($f, LOCK_EX)) {
    sleep(10);                    // Hold the exclusive lock for 10 seconds.
    fseek($f, 0);
    var_dump(fgets($f, 4048));
    flock($f, LOCK_UN);
}
fclose($f);

In file2.php:

var_dump(file_get_contents('test.txt'));

When run, file2.php returns immediately. So no, it doesn't appear that file_get_contents respects file locks at all...
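
If you want a reader to honor the lock, it has to take one itself before reading. Here is a minimal sketch, assuming a hypothetical helper named read_with_shared_lock() (it is not a PHP built-in). It only helps if every writer also locks with LOCK_EX:

```php
<?php
// Hypothetical helper (not part of PHP): read a whole file while holding
// a shared advisory lock. Returns null on any failure.
function read_with_shared_lock(string $path): ?string
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return null;
    }
    try {
        // Blocks while another process holds LOCK_EX on the same file.
        if (!flock($handle, LOCK_SH)) {
            return null;
        }
        $contents = stream_get_contents($handle);
        flock($handle, LOCK_UN);
        return $contents === false ? null : $contents;
    } finally {
        fclose($handle);
    }
}

// Usage: var_dump(read_with_shared_lock('test.txt'));
```

Called while file1.php is sleeping, this should wait for the exclusive lock to be released instead of returning immediately the way file_get_contents() does.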

What happens when two scripts want to write at the same time on a file with LOCK_EX?

Both processes will call flock() to lock the file before they start writing. The first one will get the lock; the second will wait until the file is unlocked. There's no retry loop on your side: the OS handles the wait. flock() takes no timeout, so a blocking call waits indefinitely unless you pass LOCK_NB.

The first process will unlock the file as soon as it finishes writing, then the second process will run.
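
If waiting forever is not acceptable, one common workaround is to poll with LOCK_NB until a deadline passes. This is a rough sketch: lock_with_timeout() and the 50 ms polling interval are my own assumptions, not something PHP provides:

```php
<?php
// Hypothetical helper (not part of PHP): try to take an exclusive lock,
// giving up after $timeoutSeconds. $handle is a stream from fopen().
function lock_with_timeout($handle, float $timeoutSeconds): bool
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $wouldBlock = 0;
        // LOCK_NB makes flock() return immediately instead of blocking.
        if (flock($handle, LOCK_EX | LOCK_NB, $wouldBlock)) {
            return true;
        }
        if (!$wouldBlock) {
            return false;          // A real error, not lock contention.
        }
        usleep(50000);             // Wait 50 ms before trying again.
    } while (microtime(true) < $deadline);

    return false;                  // Still locked by someone else.
}

// Usage:
//   $f = fopen('test.txt', 'c');
//   if ($f !== false && lock_with_timeout($f, 2.0)) {
//       fwrite($f, "data\n");
//       flock($f, LOCK_UN);
//   }
```

Polling is cruder than a blocking flock(), since another process can grab the lock between two attempts, but it does put an upper bound on the wait.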

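You generally don't need LOCK_EX if you're using FILE_APPEND. When the file is opened in append mode, each write() is placed atomically at the current end-of-file, not at the EOF position from when the file was opened, so concurrent appends don't overwrite each other. The caveat is that a large payload may be split across several write() calls, in which case chunks from different processes can interleave; keep LOCK_EX if that matters to you.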

Atomically open and lock file

Make the writer create a file called "foo.hex.init" instead, and initialize that before renaming it to "foo.hex". This way, the reader can never see the uninitialized file contents.

flock() between PHP and C edge case

Your code is assuming that the file_put_contents() operation is atomic, and that using LOCK_EX and LOCK_SH is enough to ensure no race conditions between the two programs happen. This is not the case.

As you can see from the PHP doc, LOCK_EX is applied after opening the file. This is important, because it leaves a short window of time for the C++ program to successfully open the file and lock it with LOCK_SH. At that point the file has already been truncated by the fopen() done by PHP, and it's empty.

What's most likely happening is:

  1. PHP code opens the file for writing, truncating it and effectively wiping out its content.
  2. C++ code opens the file for reading.
  3. C++ code requests the shared lock on the file: the lock is granted.
  4. PHP code requests the exclusive lock on the file: the call blocks, waiting for the lock to be available.
  5. C++ code reads the file's contents: nothing, the file is empty.
  6. C++ code deletes the file.
  7. C++ code releases the shared lock.
  8. PHP code acquires the exclusive lock.
  9. PHP code writes to the file: the data never becomes visible, because the file has been unlinked. The write goes to an inode that no longer has a name and that is freed as soon as PHP closes the file.
  10. You are effectively left with no file and the data is lost.

The problem with your code is that the operations you are doing on the file from two different programs are not atomic, and the way you are acquiring the locks does not help in ensuring that those don't overlap.

The only sane way of guaranteeing the atomicity of such an operation on a POSIX compliant system, without even worrying about file locking, is to take advantage of the atomicity of rename(2):

If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.

If newpath exists but the operation fails for some reason, rename() guarantees to leave an instance of newpath in place.

The equivalent rename() PHP function is what you should use in this case. It's the simplest way to guarantee atomic updates to a file.

What I would suggest is the following:

  • PHP code:

    // Create the temporary file in the target directory: rename() is only
    // atomic when source and destination are on the same filesystem.
    $tmpfname = tempnam("sampleDir", "myprefix");
    file_put_contents($tmpfname, "contents");   // Write to the temporary file.
    rename($tmpfname, "sampleDir/invoice.xml"); // Atomically replace invoice.xml.

    // TODO: check for errors in all the above calls, most importantly tempnam().
  • C++ code:

    FILE* pInvoiceFile = fopen("sampleDir/invoice.xml", "r");

    if (pInvoiceFile != NULL)
    {
        struct stat fileStat;
        fstat(fileno(pInvoiceFile), &fileStat);

        string invoice;
        invoice.resize(fileStat.st_size);

        size_t n = fread(&invoice[0], 1, fileStat.st_size, pInvoiceFile);
        fclose(pInvoiceFile);

        if (n == 0)
            remove("sampleDir/invoice.xml");
    }

This way, the C++ program will always either see the old version of the file (if fopen() happens before PHP's rename()) or the new version of the file (if fopen() happens after), but it will never see an inconsistent version of the file.
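
The TODO in the PHP snippet can be filled in roughly like this. atomic_write() is a hypothetical helper name, and the sketch assumes the target directory exists and is writable:

```php
<?php
// Hypothetical helper (not part of PHP): replace $target in one atomic step.
function atomic_write(string $target, string $contents): bool
{
    // Create the temp file next to the target so rename() stays on one
    // filesystem. Note: tempnam() may fall back to the system temp
    // directory if the given directory is missing or not writable.
    $tmp = tempnam(dirname($target), 'tmp');
    if ($tmp === false) {
        return false;
    }

    // file_put_contents() returns the number of bytes written, or false.
    if (file_put_contents($tmp, $contents) !== strlen($contents)
        || !rename($tmp, $target)) {
        unlink($tmp);              // Don't leave a partial temp file behind.
        return false;
    }

    return true;
}

// Usage: atomic_write('sampleDir/invoice.xml', $xml);
```

Keep in mind that tempnam() creates the file with mode 0600, so you may need to chmod() it before the rename if the reader runs as a different user.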

Read and write file atomically

You want to use File#flock in exclusive mode. Here's a little demo. Run this in two different terminal windows.

filename = 'test.txt'

File.open(filename, File::RDWR) do |file|
  file.flock(File::LOCK_EX)

  puts "content: #{file.read}"
  puts 'doing some heavy-lifting now'
  sleep(10)
end

Is there a risk in running file_put_contents() on the same file from different PHP threads?

As it says on the manual page (that you gave a link for!):

// Write the contents to the file, 
// using the FILE_APPEND flag to append the content to the end of the file
// and the LOCK_EX flag to prevent anyone else writing to the file at the same time
file_put_contents($file, $person, FILE_APPEND | LOCK_EX);

Use the LOCK_EX flag to prevent double writes.


