Epoll on Regular Files

Epoll on regular files

Not really. epoll only makes sense for file descriptors that would normally block on read/write, such as pipes and sockets. A descriptor for a regular file always returns either data or end-of-file more or less immediately, so epoll wouldn't do anything useful for it.

Why does select.select() work with disk files but not epoll()?

select allows file descriptors pointing to regular files to be monitored; however, it will always report such a file as readable/writable (i.e. it's somewhat useless, as it doesn't tell you whether a read/write would actually block).

epoll simply disallows monitoring of regular files, as it has no mechanism (on Linux at least) to tell whether reading/writing a regular file would block.
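A minimal sketch of that difference, using Python's select module on an empty temporary file. The helper name probe_regular_file is made up for illustration, and the epoll half only runs where select.epoll exists (Linux).

```python
import errno
import os
import select
import tempfile


def probe_regular_file():
    """Return (select_says_readable, epoll_errno) for an empty regular file."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()

        # select() accepts the descriptor and reports it readable at once,
        # even though the file is empty and a read would only yield EOF.
        readable, _, _ = select.select([fd], [], [], 0)
        select_says_readable = fd in readable

        # epoll refuses to register the descriptor at all.
        epoll_errno = None
        if hasattr(select, "epoll"):
            ep = select.epoll()
            try:
                ep.register(fd, select.EPOLLIN)
            except OSError as exc:
                epoll_errno = exc.errno
            finally:
                ep.close()
        return select_says_readable, epoll_errno


result = probe_regular_file()
print(result)
```

On Linux the second element should be errno.EPERM; on systems without epoll it stays None.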

How to read a [nonblocking] filedescriptor of a file that is appended to (aka, like tail -f)?

It is very common, for some reason, for people to think that making an fd nonblocking, or calling poll/select/..., behaves differently for files than for other types of file descriptions. In fact, nonblocking behaviour and I/O readiness behaviour are essentially the same for all types of file descriptions: the kernel will immediately return from read/write etc. if the outcome is known, and will signal I/O readiness when this is the case. When a socket has an EOF condition, select will signal that the socket is ready to read, and read will return 0 (for EOF). The same happens for files: if you are at the end of a file, the kernel will return immediately from read and return 0 to signal EOF.

The important difference is that files can change contents at random places, and can be extended. Pipes and sockets are not random access and cannot be appended to once closed. Thus, while the behaviour is consistent, it is often not what is wanted, namely waiting for a file to change in some way.
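The behaviour described in the last two paragraphs can be seen with a small sketch: a read at end-of-file returns immediately with zero bytes, and the same descriptor sees new data once another handle appends to the file. The function name and file contents here are illustrative only.

```python
import os
import tempfile


def read_at_eof_then_after_append():
    """Show that EOF is reported immediately and is not permanent for files."""
    with tempfile.NamedTemporaryFile() as writer:
        writer.write(b"first\n")
        writer.flush()

        # O_NONBLOCK is accepted but changes nothing for a regular file.
        fd = os.open(writer.name, os.O_RDONLY | os.O_NONBLOCK)
        try:
            first = os.read(fd, 4096)         # existing content
            at_eof = os.read(fd, 4096)        # b"": EOF, returned at once

            writer.write(b"second\n")         # another handle appends
            writer.flush()
            after_append = os.read(fd, 4096)  # same fd now sees the new bytes
        finally:
            os.close(fd)
        return first, at_eof, after_append


first, at_eof, after_append = read_at_eof_then_after_append()
```

Nothing wakes the reader up between the second and third read; it simply has to try again, which is why a tail -f style reader ends up polling.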

The conflict in many people's minds is simply that they want to be told "when there is new data", but if you think about it a bit, you will realise that simply waking you up would not be an adequate interface for this, as you have no way of knowing why you woke up, and what changed.

POSIX doesn't have an interface to do that, other than regularly polling the fd or file (and in case of random changes, regularly reading the whole file!). Some operating systems have an interface to do something similar (kqueue on BSDs, inotify on GNU/Linux), but they are usually not a perfect match either (for example, inotify cannot watch an fd for changes; it watches a path for changes).

The closest you can get with libev is to use an ev_stat watcher. It behaves as if you would stat() a path regularly, and invokes the watcher callback whenever the stat data changes. Portably, it does just that: it regularly calls stat. On some operating systems (currently only with inotify on GNU/Linux, as kqueue doesn't have correct semantics for this) it can use other mechanisms to speed this up in some cases, although it can fall back to regular stat polling everywhere, for example when the file is on a network file system, where inotify can't see remote changes.

To answer your question: If you have a path, you can use an ev_stat watcher to watch for stat data changes, such as size/mtime etc. changes. Doing this right can be a bit tricky (see the libev documentation, especially the part about stat time resolution: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_stat_code_did_the_file_attri), and you have to keep in mind that this watches a path, not a file descriptor, so you might want to compare the device/inode of your file descriptor and the watched path regularly to see if you still have the correct file open.

This still doesn't tell you what part of the file has changed.

Alternatively, since you apparently only want to read appended data, you could opt to just read() the file regularly (in an ev_timer callback) and do away with all the complexity and hassles of an ev_stat watcher setup. Don't forget to also compare the path stat data with your fd stat data to see if you still have the right file open, depending on whether the file you are reading might get renamed or replaced. Sometimes programs also truncate files, something you can also detect by seeing the size decrease between stat calls.

This is essentially what older tail -f implementations do, while newer ones might, for example, take hints (only) from inotify, just like ev_stat watchers do.
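A rough sketch of one such polling step, written in Python rather than as a libev callback. The names poll_once and demo, the status strings and the file name app.log are all invented for this example. One simplification to note: truncation is detected by comparing the file size with the current read offset, not by remembering the size from the previous stat call.

```python
import os
import tempfile


def poll_once(fd, path):
    """One polling step: return (status, newly_read_bytes)."""
    fd_stat = os.fstat(fd)
    try:
        path_stat = os.stat(path)
    except FileNotFoundError:
        path_stat = None

    # Renamed or replaced: the path no longer names the inode we hold open.
    same_file = (
        path_stat is not None
        and (path_stat.st_dev, path_stat.st_ino) == (fd_stat.st_dev, fd_stat.st_ino)
    )
    if not same_file:
        return "replaced", b""

    # Truncated: the file is now shorter than our read offset.
    offset = os.lseek(fd, 0, os.SEEK_CUR)
    if fd_stat.st_size < offset:
        os.lseek(fd, 0, os.SEEK_SET)
        return "truncated", b""

    # Otherwise read until EOF; an empty result just means "nothing new yet".
    chunks = []
    while True:
        chunk = os.read(fd, 65536)
        if not chunk:
            break
        chunks.append(chunk)
    return "ok", b"".join(chunks)


def demo():
    """Run poll_once through append, truncate and rename on a scratch file."""
    steps = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "wb") as log:
            log.write(b"one\n")
        fd = os.open(path, os.O_RDONLY)
        try:
            steps.append(poll_once(fd, path))   # initial content
            with open(path, "ab") as log:
                log.write(b"two\n")
            steps.append(poll_once(fd, path))   # appended data
            open(path, "wb").close()            # truncate in place
            steps.append(poll_once(fd, path))
            os.rename(path, path + ".1")        # rotate the file away
            steps.append(poll_once(fd, path))
        finally:
            os.close(fd)
    return steps


steps = demo()
```

In a real program poll_once would be called from a timer, and on "replaced" the caller would typically drain the old descriptor and then reopen the path.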

None of that is easy, and details depend on your knowledge of how exactly the file changes, but it's the best you can do.

Kqueue on regular files

Yes, kqueue can be used to watch files for readability. From the man page:

 EVFILT_READ      Takes a file descriptor as the identifier, and returns
                  whenever there is data available to read.  The behavior
                  of the filter is slightly different depending on the
                  descriptor type.

 [...]

                  Vnodes
                      Returns when the file pointer is not at the end of
                      file.  data contains the offset from current posi-
                      tion to end of file, and may be negative.

("vnodes", in this context, are regular files.)

As regular files are always writable, it makes no sense to apply EVFILT_WRITE to them.

Why poll() returns immediately on regular files and blocks on fifo?

poll() or select() never block on regular files. They always return a regular file as "ready". If you want to use poll() to do what tail -f does, you're on the wrong track.

Quoting from the SUSv4 standard:

The poll() function shall support regular files, terminal and pseudo-terminal devices, FIFOs, pipes, sockets and [OB XSR] STREAMS-based files. The behavior of poll() on elements of fds that refer to other types of file is unspecified.

Regular files shall always poll TRUE for reading and writing.
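As an illustration, the sketch below polls an empty regular file and an empty pipe (standing in for a FIFO) with a short timeout. The helper poll_mask is hypothetical, and it assumes a platform where Python's select.poll is available.

```python
import os
import select
import tempfile


def poll_mask(fd, events, timeout_ms=100):
    """Poll a single descriptor; return its event mask, or 0 on timeout."""
    poller = select.poll()
    poller.register(fd, events)
    ready = poller.poll(timeout_ms)
    return ready[0][1] if ready else 0


# An empty regular file is still reported ready for reading and writing.
with tempfile.TemporaryFile() as f:
    file_mask = poll_mask(f.fileno(), select.POLLIN | select.POLLOUT)

# The read end of an empty pipe is not ready until something is written.
r, w = os.pipe()
try:
    empty_pipe_mask = poll_mask(r, select.POLLIN)
    os.write(w, b"x")
    filled_pipe_mask = poll_mask(r, select.POLLIN)
finally:
    os.close(r)
    os.close(w)
```

The first pipe poll waits out the full timeout, while the poll on the regular file returns straight away.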

Since using poll() or select() on regular files is pretty much useless, newer interfaces have tried to remedy that. On BSD, you could use kqueue(2) with EVFILT_READ, and on Linux inotify(2) with IN_MODIFY. The newer epoll(7) interface on Linux will simply error out with EPERM if you try to watch a regular file.

Unfortunately, neither of those is standard.

Is it necessary to remove all file descriptors in the interest list before closing the epoll instance itself?

No, it's not necessary to manually remove them.

From the epoll_create(2) manpage:

When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and releases the associated resources for reuse.
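A small sketch of what that looks like in practice, assuming Linux and Python's select.epoll (the function name is made up): the epoll object is closed while a pipe is still registered, and the pipe remains usable afterwards.

```python
import os
import select


def close_epoll_with_registered_fd():
    """Close an epoll instance without unregistering its one watched fd."""
    if not hasattr(select, "epoll"):
        return None  # epoll is Linux-specific
    r, w = os.pipe()
    try:
        ep = select.epoll()
        ep.register(r, select.EPOLLIN)
        ep.close()                      # no unregister() beforehand

        # The watched descriptor itself is untouched by closing the instance.
        os.write(w, b"x")
        return ep.closed, os.read(r, 1)
    finally:
        os.close(r)
        os.close(w)


outcome = close_epoll_with_registered_fd()
```

This only shows that nothing breaks; the release of kernel resources described in the manpage is not something the sketch can observe.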

Is there a system call or some way to know the type of file descriptor in Linux (e.g. regular file fd, socket fd, signal fd, timer fd)?

As "that other guy" mentioned, the most obvious such call is fstat. The st_mode member contains bits to distinguish between regular files, devices, sockets, pipes, etc.
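For illustration, here is a sketch of such a check using os.fstat and the stat module's S_IS* helpers. The function name fd_kind and the returned labels are invented for this example, and descriptors that match none of the tests simply fall through to "other".

```python
import os
import socket
import stat
import tempfile


def fd_kind(fd):
    """Classify a descriptor by the file-type bits of its st_mode."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISFIFO(mode):
        return "fifo/pipe"
    if stat.S_ISSOCK(mode):
        return "socket"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    return "other"


with tempfile.TemporaryFile() as f:
    file_kind = fd_kind(f.fileno())

r, w = os.pipe()
pipe_kind = fd_kind(r)
os.close(r)
os.close(w)

with socket.socket() as s:
    socket_kind = fd_kind(s.fileno())
```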

But in practice, you will almost certainly need to keep track yourself of which fd is which. Knowing it's a regular file doesn't help too much when you have several different regular files open. So since you have to maintain this information somewhere in your code anyway, then referring back to that record would seem to be the most robust way to go.

(It's also going to be much faster to check some variables within your program than to make one or several additional system calls.)

Also, are all file descriptors, irrespective of the type, sequential? I mean, if you open a regular data file, then create a timer file descriptor, then a signal file descriptor, are they all guaranteed to be numbered sequentially?

Not really.

As far as I know, calls that create a new fd will always return the lowest-numbered available fd. There are old programs that rely on this behavior; before dup2 existed, I believe the accepted way to move standard input to a new file was close(0); open("myfile", ...);.

However, it's hard to really be sure what fds are available. For example, the user may have run your program as /usr/bin/prog 5>/some/file/somewhere and then it will appear that fd 5 gets skipped, because /some/file/somewhere is already open on fd 5. As such, if you open a bunch of files in succession, you cannot really be sure that you will get sequential fds, unless you have just closed all those fds yourself and are sure that all lower-numbered fds are already in use. And doing that seems much more of a hassle (and a source of potential problems) than just keeping track of the fds in the first place.
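The lowest-available rule can be seen in a short sketch (the function name is illustrative, and it assumes nothing else in the process opens or closes descriptors in between): closing a descriptor leaves a hole, and the next open fills it instead of continuing the sequence.

```python
import os


def reopen_reuses_lowest_fd():
    """Open two fds, close the lower one, and see which number comes next."""
    first = os.open(os.devnull, os.O_RDONLY)
    second = os.open(os.devnull, os.O_RDONLY)
    os.close(first)                            # leaves a hole below `second`
    third = os.open(os.devnull, os.O_RDONLY)   # fills the hole
    os.close(second)
    os.close(third)
    return first, second, third


first_fd, second_fd, third_fd = reopen_reuses_lowest_fd()
```

The actual numbers depend on what was already open when the program started, which is the reason not to rely on them.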
