Interrupting epoll_wait with a non-IO event, no signals
You can use an eventfd, which is effectively the same thing as the self-pipe trick except with fewer file descriptors and less boilerplate (glibc has convenience eventfd_read/eventfd_write functions, for instance).
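A minimal sketch of the eventfd approach, assuming an epoll instance already exists. The helper names `wake_fd_create`, `wake_fd_signal` and `wake_fd_drain` are made up for illustration; only `eventfd`, `eventfd_read`, `eventfd_write` and the epoll calls are real APIs.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking eventfd and register it
   with the epoll instance epfd. Returns the eventfd, or -1 on error. */
int wake_fd_create(int epfd)
{
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (efd == -1)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;   /* readable whenever the counter is non-zero */
    ev.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == -1) {
        close(efd);
        return -1;
    }
    return efd;
}

/* Hypothetical helper: called from any thread to wake the epoll_wait
   loop. Adds 1 to the eventfd counter. Returns 0 on success. */
int wake_fd_signal(int efd)
{
    return eventfd_write(efd, 1);
}

/* Hypothetical helper: called by the epoll loop when efd is reported
   readable. Resets the counter and returns how many wakeups were
   pending, or 0 if there were none. */
uint64_t wake_fd_drain(int efd)
{
    eventfd_t count = 0;
    if (eventfd_read(efd, &count) == -1)
        return 0;
    return count;
}
```

Because the eventfd is a counter, several wakeups that arrive before the loop runs are collapsed into a single read, so there is no need to loop until the descriptor is empty as with a pipe.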
How to interrupt epoll_pwait with an appropriate signal?
Instead of signals, consider using a pipe. Create a pipe and add the file descriptor for the read end of the pipe to the epoll. When you want to wake the epoll_wait call, just write 1 character to the write end of the pipe.
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

int read_pipe;
int write_pipe;
void InitPipe()
{
    int pipefds[2] = {};
    struct epoll_event ev = {};
    pipe(pipefds);  // pipe() takes only the array; pipe2() is the variant that takes flags
    read_pipe = pipefds[0];
    write_pipe = pipefds[1];
    // make read-end non-blocking
    int flags = fcntl(read_pipe, F_GETFL, 0);
    fcntl(read_pipe, F_SETFL, flags | O_NONBLOCK);
    // add the read end to the epoll (epfd is the already-created epoll descriptor)
    ev.events = EPOLLIN;
    ev.data.fd = read_pipe;
    epoll_ctl(epfd, EPOLL_CTL_ADD, read_pipe, &ev);
}
void add_message_to_queue(struct message_t* msg)
{
    char ch = 'x';
    add_msg(msg);
    write(write_pipe, &ch, 1);
}
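void main_thread()
{
    struct message_t* msg;
    int timeout, nfds, i;
    // handle everything queued by add_message_to_queue()
    while ((msg = get_message_from_queue()) != NULL)
        process_message(msg);
    // poll without blocking if there is more work, otherwise sleep until an event arrives
    timeout = work_available ? 0 : -1;
    nfds = epoll_wait(epfd, events, MAX_EPOLL_EVENTS, timeout);
    for (i = 0; i < nfds; ++i)
    {
        if (events[i].data.fd == read_pipe)
        {
            // read all bytes from read end of pipe; the read end is
            // non-blocking, so read() returns -1 once the pipe is empty
            char ch;
            int result = 1;
            while (result > 0)
            {
                result = read(read_pipe, &ch, 1);
            }
        }
        else if ((events[i].events & EPOLLIN) == EPOLLIN)
        {
            /// do stuff
        }
    }
    run_state_machines();
}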
How can I interrupt an infinite sigtimedwait?
Use signal file descriptors (signalfd) instead of signal handlers.
Instead of running a signal handler, you receive the signal by reading from a file descriptor, which is epollable and can be handled as part of your epoll set.
Yes, that's the better way to catch signals, on Linux, in this day and age.
epoll + non-blocking socket slower than blocking + timeout?
epoll() is useful for (from the man page, epoll(7)): monitoring multiple file descriptors to see if I/O is possible on any of them.
You are using epoll() to monitor one file descriptor. This adds a bunch of overhead in terms of context switches: each child has to call epoll_create(), epoll_ctl(), and epoll_wait(). And then they all get woken up for every new connection, and most of them fail the accept().
In the blocking version, probably only one child even gets woken up.
Python: retrieve several URLs via select.epoll()
How can I use the requests library (or a different urllib) combined with linux epoll?
Unfortunately you can’t, unless such a library has been built with this integration in mind. epoll, as well as select/poll/kqueue and others, are I/O multiplexing system calls, and the overall program architecture needs to be built around them.
Simply put, a typical program structure boils down to the following:
- one needs to have a bunch of file descriptors (sockets in non-blocking mode in your case)
- a system call (man epoll_wait in case of epoll) blocks until a specified event occurs on one or multiple descriptors
- information of the descriptors available for I/O is returned
After that it is the outer code’s job to handle these descriptors, i.e. figure out how much data has become available, call some callbacks etc.
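The three steps above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not a complete client: `watch_fd`, `poll_once`, `count_bytes` and `total_bytes` are hypothetical names, and the callback only counts bytes where a real program would parse a response.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_EVENTS 16

typedef void (*on_data_fn)(int fd, const char *data, size_t len);

/* Hypothetical example callback: only counts the bytes it is given. */
size_t total_bytes = 0;
void count_bytes(int fd, const char *data, size_t len)
{
    (void)fd;
    (void)data;
    total_bytes += len;
}

/* Step 1: put the descriptor in non-blocking mode and register it. */
int watch_fd(int epfd, int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Steps 2 and 3: block in epoll_wait, then pass the data of every
   ready descriptor to the callback. Returns the number of ready
   descriptors, or -1 on error. */
int poll_once(int epfd, int timeout_ms, on_data_fn on_data)
{
    struct epoll_event events[MAX_EVENTS];
    int nfds = epoll_wait(epfd, events, MAX_EVENTS, timeout_ms);

    for (int i = 0; i < nfds; ++i) {
        int fd = events[i].data.fd;
        char buf[4096];
        ssize_t n;

        /* Read until the descriptor is drained (read returns -1 with
           EAGAIN) or the peer has closed it (read returns 0). */
        while ((n = read(fd, buf, sizeof buf)) > 0)
            on_data(fd, buf, (size_t)n);

        int retry_later = (n == -1) &&
            (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR);
        if (!retry_later)
            close(fd);  /* EOF or hard error; closing also drops it from epoll */
    }
    return nfds;
}
```

A library built on blocking sockets never returns control between those steps, which is why it cannot simply be plugged into such a loop.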
If the library uses regular blocking sockets, the only way to parallelize it is to use threads/processes.
Here’s a good article on the subject. The examples use C, and that’s good, as it’s easier to understand what’s actually happening under the hood.
Async frameworks & requests library
Let’s check out what’s suggested here:
If you are concerned about the use of blocking IO, there are lots of
projects out there that combine Requests with one of Python's
asynchronicity frameworks. Some excellent examples are
requests-threads, grequests, and requests-futures.
requests-threads - uses threads
grequests - integration with gevent (it’s a different story, see below)
requests-futures - in fact also threads/processes
None of them has anything to do with true asynchronicity.
Should I use select.epoll() or one of the many async frameworks which python has
Please note, epoll is a Linux-specific beast and it won’t work e.g. on OS X, which has a different mechanism called kqueue. As you appear to be writing a general-purpose job queue, it doesn’t seem to be a good solution.
Now back to Python. You’ve got the following options:
threads/processes/concurrent.futures - unlikely to be what you’re aiming at, as your app is a typical C10K server
epoll/kqueue - you’ll have to do everything yourself. In the case of fetching HTTP URLs you’ll need to deal not only with HTTP/SSL but also with asynchronous DNS resolution. Also consider using asyncore, which provides some basic infrastructure
twisted/tornado - callback-based frameworks that already do all the low-level stuff for you
gevent - this is something you might like if you’re going to reuse existing blocking libraries (urllib, requests etc.) and use both Python 2.x and Python 3.x. But this solution is a hack by design. For an app of your size it might be OK, but I wouldn’t use it for anything bigger that should be rock-solid and run in prod
asyncio
This module provides infrastructure for writing single-threaded
concurrent code using coroutines, multiplexing I/O access over sockets
and other resources, running network clients and servers, and other
related primitives
It has everything you might need.
There’s also a bunch of libraries for working with popular RDBMSs and HTTP:
https://github.com/aio-libs
But it lacks support for Python 2.x. There are ports of asyncio to Python 2.x, but I’m not sure how stable they are.
Finally
So if I could sacrifice Python 2.x, I’d personally go with asyncio & related libraries.
If you really, really need Python 2.x, use one of the approaches above depending on the stability required and the assumed peak load.