How Does This Canonical Flock Example Work

How does this canonical flock example work?

The whole I/O context of the sub-shell, (...) 200>/var/lock/mylockfile, has to be evaluated, and the I/O redirection done, before any commands can be executed in the sub-shell, so the redirection always precedes the flock -s 200. Consider a sub-shell whose standard output is piped to another command: that pipe has to be created before the sub-shell is created. The same applies to the redirection of file descriptor 200.

The choice of file descriptor number really doesn't matter in the slightest, beyond the fact that it is advisable not to use file descriptors 0-2 (standard input, output, and error). The file name does matter: different processes may use different file descriptor numbers, but as long as the name is agreed upon, it should be fine.

What does 200>$somefile accomplish?

200>/var/lock/mylockfile

This creates a file /var/lock/mylockfile which can be written to via file descriptor 200 inside the sub-shell. The number 200 is an arbitrary one. Picking a high number reduces the chance of any of the commands inside the sub-shell "noticing" the extra file descriptor.

(Typically, file descriptors 0, 1, and 2 are used by stdin, stdout, and stderr, respectively. This number could have been as low as 3.)

flock -s 200

Then flock is used to lock the file via the previously created file descriptor. It needs write access to the file, which the > in 200> provided. Note that this happens after the redirection above.
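Putting the two pieces together, the canonical pattern looks roughly like this (a sketch; /tmp stands in for /var/lock here, since writing under /var/lock usually requires elevated privileges):

```shell
#!/bin/bash
# Canonical flock idiom: the redirection creates/opens the lock file on
# fd 200 before any command in the sub-shell runs.
(
    flock -s 200                 # shared lock on fd 200; blocks until granted
    echo "running under the shared lock"
    # ... commands that must not overlap with an exclusive-lock holder ...
) 200>/tmp/mylockfile
```

The lock is released automatically when the sub-shell exits and fd 200 is closed.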

How do exec and flock work together in bash script

This is not valid or useful bash. It will just result in two different error messages.

Instead, the intended code was this:

#!/bin/bash
...
exec {LOCK}> foo.out
flock -x ${LOCK}
...

It uses:

  1. {name}> to open for writing and assign fd number to name
  2. exec to apply the redirection to the current shell, keeping the fd open for the duration of the shell process
  3. flock to lock the assigned fd, which it will inherit from the current shell

So effectively, it creates a mutex based on the file foo.out, ensuring that only one instance is allowed to run things after the flock at a time. Any other instances will wait until the previous one is done.
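A complete sketch of that pattern (the file name foo.out is taken from the snippet above; the bash {varname}> fd-allocation syntax requires bash 4.1 or later):

```shell
#!/bin/bash
lockfile=/tmp/foo.out            # stand-in path for foo.out

# Open the lock file and let bash assign a free descriptor number to LOCK.
exec {LOCK}>"$lockfile"

# Block here until no other instance holds the exclusive lock.
flock -x "$LOCK"

echo "only one instance at a time runs past this point"
# The fd (and with it the lock) is released when the shell exits.
```

A second instance started while the first is still running will sit in the flock -x call until the first one exits.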

flock vs lockf on Linux

The practical difference between flock() and lockf() is in the semantics (behaviour with respect to closing and passing), applicability over NFS and other shared filesystems, and whether the advisory locks are visible to other processes using fcntl() locks or not.

The library you're using simply has logic to pick the desired semantics based on the current platform.

If the semantics (behaviour over descriptor passing, forking, etc.) is acceptable, you should prefer lockf()/fcntl() locks over flock() locks in Linux, simply because the former works on NFS etc. filesystems, whereas the latter does not. (On BSDs and Mac OS X, I believe you need to explicitly use fcntl(), instead.)


In Linux, lockf() is just a wrapper around fcntl(), while flock() locks are separate (and will only work on local filesystems, not on e.g. NFS mounts on kernels prior to 2.6.12). That is, one process can have an advisory exclusive flock() lock on a file, while another process has an advisory exclusive fcntl() lock on that same file. Both are advisory locks, but they do not interact.

On Mac OS X and FreeBSD, lockf()/flock()/fcntl() locks all interact, although developers are recommended to use only one of the interfaces in an application. However, only fcntl() locks work on NFS mounts (and, obviously, only if both NFS client and server have been configured to support record locks, which is surprisingly rare in e.g. web hosting environments; a huge cause of headaches for some web (framework) developers).

POSIX does not explicitly specify how lockf()/flock()/fcntl() locks should interact, and there have been differences in the past. Now, the situation has calmed down a bit, and one can approximately say that

  1. fcntl() locks are the most reliable

    Across architectures, they have the best chance of working right on e.g. shared filesystems -- NFS and CIFS mounts, for example.

  2. Most often, lockf() is implemented as "shorthand" for fcntl()

    The other alternative, as "shorthand" for flock(), is possible, but nowadays rare.

  3. fcntl() and flock() have different semantics wrt. inheritance and automatic releases

    fcntl() locks are preserved across an exec(), but not inherited across a fork(). The locks are released when the owning process closes any descriptor referring to the same file.

    In Linux, FreeBSD, and Mac OS X, flock() locks are coupled with the open file descriptor: passing the descriptor also passes the lock. (The man pages state that "the lock is on the file, not on the file descriptor". This is not a contradiction. It just means that the lock applies to the file. It is still coupled to the descriptor, in such a way that duplicating the descriptor passes the same lock, too.) Therefore, it is possible that multiple processes have the same exclusive advisory flock() lock on the same file at the same time, if they obtained the descriptor from the originator after the flock() call.

File locking is a surprisingly complicated issue. I have personally had the best results by simply sticking to fcntl() locking. The semantics of fcntl() locks are not the easiest to work with, and in certain cases can be frankly infuriating; it's just that I've found them to yield the best -- most reliable, most portable, least surprising -- results.
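The descriptor coupling of flock() locks described in point 3 can be observed from the shell (fd numbers 8 and 9 and the file path are arbitrary choices for this demo):

```shell
#!/bin/bash
f=/tmp/flock_coupling.demo

exec 8>"$f"          # open the file on fd 8
flock -x 8           # exclusive flock() lock held via fd 8

# A forked child inherits fd 8, i.e. the same open file description,
# so it shares the lock and a further flock on it succeeds:
( flock -n 8 && echo "inherited descriptor: lock shared" )

# A fresh open() of the same file is a separate open file description,
# so the non-blocking attempt fails while the lock is held:
( exec 9>"$f"; flock -n 9 || echo "new descriptor: lock not shared" )
```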

How to avoid race condition when using a lock-file to avoid two instances of a script running simultaneously?

Yes, there is indeed a race condition in the sample script. You can use bash's noclobber option in order to get a failure in case of a race, when a different script sneaks in between the -f test and the touch.
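The effect of noclobber can be seen in isolation: with the option set, > refuses to overwrite an existing file, so creating the lock file and checking for its existence become a single atomic step. A minimal demo:

```shell
#!/bin/bash
lock=$(mktemp -u)   # generate a path that does not exist yet

# First writer: the file doesn't exist, so > creates it and succeeds.
(set -o noclobber; echo "$$" > "$lock") 2>/dev/null && echo "lock acquired"

# Second writer: the file now exists, so > fails instead of truncating it.
(set -o noclobber; echo "$$" > "$lock") 2>/dev/null || echo "lock already held"

rm -f "$lock"
```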

The following is a sample code-snippet (inspired by this article) that illustrates the mechanism:

if (set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null;
then
    # This will cause the lock-file to be deleted in case of a
    # premature exit.
    trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT

    # Critical Section: Here you'd place the code/commands you want
    # to be protected (i.e., not run in multiple processes at once).

    rm -f "$lockfile"
    trap - INT TERM EXIT
else
    echo "Failed to acquire lock-file: $lockfile."
    echo "Held by process $(cat "$lockfile")."
fi

Does flock understand && command for multiple bash commands?

Shell interprets your command this way:

flock -n ~/.my.lock cd /to/my/dir && python3.6 runcommand.py > /dev/null 2>&1
  • Step 1: Run flock -n ~/.my.lock cd /to/my/dir part
  • Step 2: If the command in step 1 exits with non-zero, skip step 3
  • Step 3: Run python3.6 runcommand.py > /dev/null 2>&1 part

So, flock has no business with && or the right side of it.

You could do this instead:

touch ./.my.lock  # not needed if the file already exists (e.g. because another process created it)
(
    flock -e 10
    cd /to/my/dir && python3.6 runcommand.py > /dev/null 2>&1
) 10< ./.my.lock
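Alternatively, util-linux flock can execute the compound command itself via its -c option, which hands the string to a shell, so the && is evaluated while the lock is held. A sketch with stand-in paths (substitute ~/.my.lock, /to/my/dir, and the python3.6 invocation from the question):

```shell
#!/bin/bash
lockfile=/tmp/my.demo.lock   # stand-in for ~/.my.lock

# flock creates the lock file if needed; everything in the quoted
# string, including the part after &&, runs under the exclusive lock.
flock -n "$lockfile" -c 'cd /tmp && echo "both commands ran under the lock"'
```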

See this post on Unix & Linux site:

  • How to use flock and file descriptors to lock a file and write to the locked file?

What does LOCK_NB mean in flock?

LOCK_NB means non-blocking.

Usually, when you try to lock a file, the call to flock() blocks: your PHP script's execution stops and does not resume until a concurrent lock on the accessed file is removed.

Mostly your process is the only one trying to lock the file, so the blocking call to flock() actually returns instantly. It's only when two processes lock the very same file that one of them gets paused.

The LOCK_NB flag, however, makes flock() return immediately in any case. With it, you have to check the returned status to see whether you actually acquired the lock. For example:

while ( ! flock($f, LOCK_EX | LOCK_NB) ) {
    sleep(1);
}

This would more or less emulate the behaviour of the normal blocking call. The purpose, of course, is to do something else meaningful (not just wait) while the file is still locked by another process.
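The same blocking/non-blocking distinction exists in the shell flock(1) utility: the default call blocks, -n fails immediately (like LOCK_NB), and -w SECONDS bounds the wait. A minimal sketch (fd 9 and the lock file path are arbitrary):

```shell
#!/bin/bash
exec 9>/tmp/nb_demo.lock

if flock -n 9; then          # non-blocking, like LOCK_EX | LOCK_NB in PHP
    echo "lock acquired immediately"
else
    echo "lock busy, doing something else meanwhile"
fi
```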


