What is the use case for EPOLLET?
The main use case for EPOLLET that I'm aware of is with micro-threads.
To recap: user space performs context switches between micro-threads (which I'm going to call "fibers" because it's shorter) based on the availability of something to work on. This is also called "cooperative multitasking".
The basic handling of file descriptors is by wrapping the relevant IO functions like so:
ssize_t read(int fd, void *buffer, size_t length) {
    // fd should already be in O_NONBLOCK mode
    while (true) {
        ssize_t result = ::read(fd, buffer, length); // The real read
        if (result != -1 || (errno != EAGAIN && errno != EWOULDBLOCK))
            return result;

        start_monitoring(fd, READ); // Make sure fd is watched for read availability
        wait_event();               // Switch out until the scheduler wakes this fiber
    }
}
start_monitoring is a function that makes sure that fd is monitored for read availability. wait_event performs a context switch out until the scheduler re-awakens this fiber because fd now has data ready for reading.
The usual way to implement this with epoll is to call epoll_ctl with EPOLL_CTL_MOD on fd within start_monitoring to add listening for EPOLLIN, and again after the epoll has reported the event to stop listening for EPOLLIN.
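To make the two EPOLL_CTL_MOD calls concrete, here is a minimal sketch on a pipe. The names set_interest and level_triggered_mod_demo are hypothetical helpers of mine, not part of the answer, and I assume the fd was registered up front with an empty event mask.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper: replace the event mask of an fd that is already
// registered with epfd. Costs one system call per invocation.
static int set_interest(int epfd, int fd, uint32_t events) {
    epoll_event ev{};
    ev.events = events;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

// Walks the EAGAIN path on a pipe. counts[0] is what epoll_wait reports
// while EPOLLIN is enabled, counts[1] what it reports after disabling it.
bool level_triggered_mod_demo(int counts[2]) {
    int epfd = epoll_create1(0);
    int p[2];
    if (epfd == -1 || pipe(p) == -1) return false;

    epoll_event ev{};
    ev.events = 0;              // registered, but not listening for EPOLLIN yet
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1) return false;

    epoll_event out{};
    // start_monitoring: first EPOLL_CTL_MOD
    if (set_interest(epfd, p[0], EPOLLIN) == -1) return false;
    if (write(p[1], "x", 1) != 1) return false;   // data arrives
    counts[0] = epoll_wait(epfd, &out, 1, 0);

    // After the event has been reported: second EPOLL_CTL_MOD
    if (set_interest(epfd, p[0], 0) == -1) return false;
    counts[1] = epoll_wait(epfd, &out, 1, 0);     // data is still unread

    close(p[0]);
    close(p[1]);
    close(epfd);
    return true;
}
```

Without the second call, a level-triggered epoll would keep reporting the fd for as long as unread data remains, which is why the stop-listening step cannot simply be skipped.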
This means that a read that has data available will finish within 1 system call, but a read that returns EAGAIN will take at least 4 system calls (the original read, two EPOLL_CTL_MOD calls, and the final read that succeeds).
Notice that the above does not count the epoll_wait that also has to take place. I do not count it because I'm making the generous assumption that other fibers are also about to be woken by that same system call, so it is unfair to attribute its cost entirely to our fiber. All in all, this mechanism needs 4+x system calls, where x is between 0 and 1.
One way to reduce the cost is to use EPOLLONESHOT. Doing so disables monitoring of fd automatically once an event has been reported, reducing our cost to 3+x. Better, but we can do better yet.
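A small sketch of the EPOLLONESHOT behaviour, again on a pipe. The function name oneshot_demo is my own, and the three counts are just the return values of successive epoll_wait calls with a zero timeout.

```cpp
#include <sys/epoll.h>
#include <unistd.h>

// counts[0]: first wait after data arrives
// counts[1]: second wait, data still unread, fd disabled by the one-shot
// counts[2]: wait after re-arming with a single EPOLL_CTL_MOD
bool oneshot_demo(int counts[3]) {
    int epfd = epoll_create1(0);
    int p[2];
    if (epfd == -1 || pipe(p) == -1) return false;

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1) return false;
    if (write(p[1], "x", 1) != 1) return false;

    epoll_event out{};
    counts[0] = epoll_wait(epfd, &out, 1, 0);  // event delivered, fd now disabled
    counts[1] = epoll_wait(epfd, &out, 1, 0);  // nothing, although data is unread

    // Re-arming is the one epoll_ctl call that remains on the EAGAIN path.
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) == -1) return false;
    counts[2] = epoll_wait(epfd, &out, 1, 0);

    close(p[0]);
    close(p[1]);
    close(epfd);
    return true;
}
```

The kernel does the "stop listening" step for us, so only the re-arming EPOLL_CTL_MOD is left, which is where the 3+x figure comes from.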
Enter EPOLLET. The previous fd state can be either armed or unarmed (i.e., whether the next event will trigger the epoll). Also, the fd may or may not currently (at the point of entry to read) have data ready. Four states. Let's spread them out.
Ready (whether armed or not): The first call to read returns the data. 1 system call. This path does not change the armed state, and the ready state depends on whether we read everything.
Not ready (whether armed or not): The first call to read returns EAGAIN, thus arming the fd. We go to sleep in wait_event without having to execute another system call. Once we wake up, we are in unarmed mode (as we just woke up). We thus do not need to call epoll_ctl to disable listening on the fd. We call read, which returns the data. We leave the function either ready or not, but unarmed.
Total cost: 2+x.
We will have to face one spurious wakeup per fd, as the fd starts out armed. Our code will have to handle the case where epoll reports an fd for which no fiber is listening. Handling, in this case, just means ignore and move on. The fd will not be spuriously reported again.
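The armed/unarmed behaviour can be observed with a short sketch. The names drain and edge_triggered_demo are mine, and I use a pipe that starts out empty, so the initial spurious report mentioned above does not show up here. Note that the only epoll_ctl call is the initial EPOLL_CTL_ADD.

```cpp
#include <sys/epoll.h>
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper: read until the kernel answers EAGAIN, which is
// what re-arms an edge-triggered fd.
static bool drain(int fd) {
    char buf[64];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) continue;
        return n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
    }
}

// counts[0]: wait after the first write (edge: empty -> readable)
// counts[1]: wait again without reading (no new edge)
// counts[2]: wait after draining to EAGAIN and writing again (new edge)
bool edge_triggered_demo(int counts[3]) {
    int epfd = epoll_create1(0);
    int p[2];
    if (epfd == -1 || pipe(p) == -1) return false;
    int flags = fcntl(p[0], F_GETFL);
    if (flags == -1 || fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == -1) return false;

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.fd = p[0];
    // The only epoll_ctl call in the whole lifetime of the fd.
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == -1) return false;

    epoll_event out{};
    if (write(p[1], "x", 1) != 1) return false;
    counts[0] = epoll_wait(epfd, &out, 1, 0);
    counts[1] = epoll_wait(epfd, &out, 1, 0);

    if (!drain(p[0])) return false;
    if (write(p[1], "y", 1) != 1) return false;
    counts[2] = epoll_wait(epfd, &out, 1, 0);

    close(p[0]);
    close(p[1]);
    close(epfd);
    return true;
}
```

In the fiber wrapper, the drain step is simply the read that returned EAGAIN, so the EAGAIN path costs the two reads plus a share of one epoll_wait.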
epoll - is EPOLLET prone to race conditions?
My understanding from the FAQ (Q9) in http://linux.die.net/man/4/epoll is that you will get another event in step 6 (assuming that you can guarantee that step 5 really happens after step 4 and the pipe is empty after step 4).
Having said that, you might get more events than guaranteed (but you have to be careful only to rely on documented behavior) - see http://cmeerw.org/blog/753.html#753 and http://cmeerw.org/blog/750.html#750
How should I use epoll to read and write from the same FD
So, as @Hasturkun suggested, using dup to filter EPOLLIN and EPOLLOUT events (in a thread-safe manner) did the trick. To me, however, this looks more like a hack or workaround. I find it awkward that there is no more elegant solution; Windows I/O completion ports seem far more elegant to me.
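For readers who have not seen the dup trick, here is a minimal sketch of the idea as I understand it: the same socket is registered twice, once under its original descriptor for EPOLLIN and once under a duplicate for EPOLLOUT. The struct and function names are hypothetical, and a socketpair stands in for a real connection.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

struct DupDemoResult {
    int first_count;    // events reported before the peer has written anything
    bool first_is_out;  // true if that single event is EPOLLOUT on the duplicate
    int second_count;   // events reported after the peer has written one byte
};

bool dup_filter_demo(DupDemoResult &r) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return false;
    int epfd = epoll_create1(0);
    int wfd = dup(sv[0]);           // second descriptor for the same socket
    if (epfd == -1 || wfd == -1) return false;

    epoll_event ev{};
    ev.events = EPOLLIN;            // the original fd only reports readability
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == -1) return false;
    ev.events = EPOLLOUT;           // the duplicate only reports writability
    ev.data.fd = wfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, wfd, &ev) == -1) return false;

    epoll_event out[2] = {};
    r.first_count = epoll_wait(epfd, out, 2, 0);
    r.first_is_out = r.first_count == 1 && out[0].data.fd == wfd &&
                     (out[0].events & EPOLLOUT) != 0;

    if (write(sv[1], "x", 1) != 1) return false;   // peer sends data
    r.second_count = epoll_wait(epfd, out, 2, 0);

    close(wfd);
    close(sv[0]);
    close(sv[1]);
    close(epfd);
    return true;
}
```

Because each descriptor carries its own event mask and its own data field, a reader and a writer can each modify or re-arm "their" registration without touching the other's.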
epoll: difference between level triggered and edge triggered when EPOLLONESHOT specified
I think the bottom-line answer is "there is no difference".
Looking at the code, it seems that the fd remembers the last set bits before being disabled by the one-shot. It remembers it was one shot, and it remembers whether it was ET or not.
Which is futile, because the fd is disabled until modified, and the next call to EPOLL_CTL_MOD will erase all of that and replace it with whatever the new MOD says.
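One way to poke at this claim is to run the same sequence with and without EPOLLET and compare what epoll_wait reports. This is only a sketch of a single scenario on a pipe, not a proof of equivalence, and oneshot_sequence is a hypothetical helper of mine.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <array>
#include <cstdint>

// Runs: add (one-shot), write, wait, wait, re-arm with MOD, wait, wait.
// Returns the four epoll_wait results, or -1 entries if a step failed.
std::array<int, 4> oneshot_sequence(uint32_t extra_flags) {
    std::array<int, 4> counts{-1, -1, -1, -1};
    int epfd = epoll_create1(0);
    int p[2];
    if (epfd == -1 || pipe(p) == -1) return counts;

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT | extra_flags;
    ev.data.fd = p[0];
    epoll_event out{};
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        counts[0] = epoll_wait(epfd, &out, 1, 0);  // first report
        counts[1] = epoll_wait(epfd, &out, 1, 0);  // disabled by the one-shot
        if (epoll_ctl(epfd, EPOLL_CTL_MOD, p[0], &ev) == 0) {
            counts[2] = epoll_wait(epfd, &out, 1, 0);  // re-armed, data unread
            counts[3] = epoll_wait(epfd, &out, 1, 0);  // disabled again
        }
    }

    close(p[0]);
    close(p[1]);
    close(epfd);
    return counts;
}
```

Calling oneshot_sequence(0) and oneshot_sequence(EPOLLET) and comparing the two arrays exercises the claim for this particular scenario.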
Having said that, I do not understand why anyone would want both EPOLLET and EPOLLONESHOT. To me, the whole point of EPOLLET is that, under certain programming models (namely, microthreads), it follows the semantics perfectly. This means that I can add the fd to the epoll at the very start, and then never have to perform another epoll-related system call.
EPOLLONESHOT, on the other hand, is used by people who want to keep very strict control over when the fd is watched and when it isn't. That, by definition, is the opposite of what EPOLLET is used for. I just don't think the two are conceptually compatible.
Multithreaded TCP listener with epoll and EPOLLET in C
- A solution without using SO_REUSEPORT would be to have a common epoll fd and a common listener which are shared among all the threads. EPOLLONESHOT is required so only one thread handles the events for a certain fd at a time.
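A rough sketch of that arrangement, under several assumptions of mine: the function name shared_listener_demo is hypothetical, the listener binds to a kernel-chosen loopback port, a single client connection is made before the workers start, each worker performs just one epoll_wait with a 300 ms timeout, and the mask is level-triggered for simplicity. A real server would loop in each worker.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <atomic>
#include <thread>
#include <vector>

// Returns how many worker threads were handed the single pending
// connection, or -1 if the setup failed.
int shared_listener_demo(int workers) {
    int lfd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    int epfd = epoll_create1(0);
    if (lfd == -1 || epfd == -1) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       // let the kernel pick a free port
    socklen_t len = sizeof addr;
    if (bind(lfd, reinterpret_cast<sockaddr *>(&addr), sizeof addr) == -1 ||
        listen(lfd, 16) == -1 ||
        getsockname(lfd, reinterpret_cast<sockaddr *>(&addr), &len) == -1)
        return -1;

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLONESHOT;      // shared listener, one-shot
    ev.data.fd = lfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev) == -1) return -1;

    // One client connection, made before the workers start waiting.
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd == -1 ||
        connect(cfd, reinterpret_cast<sockaddr *>(&addr), sizeof addr) == -1)
        return -1;

    std::atomic<int> handled{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&] {
            epoll_event out{};
            if (epoll_wait(epfd, &out, 1, 300) != 1) return;  // timed out
            int conn = accept(lfd, nullptr, nullptr);
            if (conn != -1) {
                ++handled;
                close(conn);
            }
            epoll_event rearm{};
            rearm.events = EPOLLIN | EPOLLONESHOT;
            rearm.data.fd = lfd;
            epoll_ctl(epfd, EPOLL_CTL_MOD, lfd, &rearm);      // re-arm listener
        });
    }
    for (auto &t : pool) t.join();

    close(cfd);
    close(lfd);
    close(epfd);
    return handled.load();
}
```

The one-shot flag disables the listener as soon as one thread has been given the event, so the other threads time out instead of racing to accept the same connection; the handling thread re-arms the listener once it is done.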