Simulate Effect of Select() and Poll() in Kernel Socket Programming

Simulate effect of select() and poll() in kernel socket programming

Your second idea sounds more likely to work.

The Ceph code appears to do something similar; see net/ceph/messenger.c.

Using select()/poll() in device driver

You may want to write your own custom sk_buff handler, which calls the kernel_select() that tries to lock the semaphore and does a blocking wait when the socket is open.

Not sure if you have already gone through this link: Simulate effect of select() and poll() in kernel socket programming

When you call select(2) how does the kernel figure out a socket is ready?

https://eklitzke.org/how-tcp-sockets-work answered my question:

When a new data packet comes in on the network interface (NIC), the kernel is notified either by being interrupted by the NIC, or by polling the NIC for data. Typically whether the kernel is interrupt driven or in polling mode depends on how much network traffic is happening; when the NIC is very busy it’s more efficient for the kernel to poll, but if the NIC is not busy CPU cycles and power can be saved by using interrupts. Linux calls this technique NAPI, literally “New API”.
When the kernel gets a packet from the NIC it decodes the packet and figures out what TCP connection the packet is associated with based on the source IP, source port, destination IP, and destination port. This information is used to look up the struct sock in memory associated with that connection. Assuming the packet is in sequence, the data payload is then copied into the socket’s receive buffer. At this point the kernel will wake up any processes doing a blocking read(2), or that are using an I/O multiplexing system call like select(2) or epoll_wait(2) to wait on the socket.
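
The wake-up described above can be observed from user space. The following is only a rough sketch: it uses a local socketpair() as a stand-in for a real TCP connection (an assumption made so the example runs without a network), and the variable names are illustrative. Before any data is sent, select() times out; once data lands in the peer's receive buffer, select() reports the socket as readable.

```python
import select
import socket

# A connected pair of local stream sockets stands in for a TCP connection.
a, b = socket.socketpair()

# Nothing has been sent yet, so b's receive buffer is empty and
# select() times out with an empty "readable" list.
readable_before, _, _ = select.select([b], [], [], 0.1)

# Sending on a puts data into b's receive buffer; the kernel then
# marks b readable and wakes up anything waiting on it.
a.sendall(b"ping")
readable_after, _, _ = select.select([b], [], [], 1.0)

data = b.recv(16)
print(readable_before == [], readable_after == [b], data)

a.close()
b.close()
```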

select/poll and one write buffer

This is a type of race condition and is not unique to your device. The application must be prepared for this eventuality. That is, just because select returns a file descriptor as being writable (or readable or whatever) does not guarantee that a subsequent system call on the file will not block.

The usual way of handling this is to open the file descriptor in non-blocking mode (O_NONBLOCK or O_NDELAY). Then when the situation you described happens, UserA will get a write error (with errno EWOULDBLOCK/EAGAIN), and should then return to select to await the device becoming writable again.
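
Here is a small sketch of that pattern. It uses a pipe as a stand-in for the device (an assumption, since the original driver is not shown), and the "competing writer" is simulated by filling the pipe from the same process. The point is only that the write after select() fails with EAGAIN rather than blocking, and the caller goes back to select().

```python
import os
import select

r, w = os.pipe()
os.set_blocking(w, False)  # same effect as opening with O_NONBLOCK

# Step 1: select() reports the descriptor as writable.
_, writable, _ = select.select([], [w], [], 0)
was_writable = (writable == [w])

# Step 2: before we get to write, a competing writer fills the buffer
# (simulated here by filling the pipe ourselves).
try:
    while True:
        os.write(w, b"x" * 4096)
except BlockingIOError:
    pass

# Step 3: our write fails with EAGAIN/EWOULDBLOCK instead of blocking.
try:
    os.write(w, b"late")
    got_eagain = False
except BlockingIOError:
    got_eagain = True

# Step 4: return to select(). The descriptor is not writable while the
# buffer is full, and becomes writable again once a reader drains it.
_, writable, _ = select.select([], [w], [], 0)
writable_while_full = bool(writable)
os.read(r, 65536)
_, writable, _ = select.select([], [w], [], 1.0)
written = os.write(w, b"late") if writable else 0

os.close(r)
os.close(w)
```

In a real application steps 3 and 4 would sit in a loop: on BlockingIOError, go back to select() and retry the remaining bytes when the descriptor is reported writable.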

How can you generate a POLLPRI event on a regular file?

It looks like you can achieve this by polling a sysctl exposed in procfs. If you look at the poll implementation in procfs for the sys subdirectory, you'll see that any sysctl that implements notifications for poll will return a mask that includes POLLERR|POLLPRI. So how do we figure out what sysctls implement this? We look for uses of proc_sys_poll_notify!

One such place is in proc_do_uts_string, which implements a number of sysctls under /proc/sys/kernel. Most of these are read-only, but hostname and domainname can be written (see also their table entries).

Of course, this is going to require root privileges to be able to write to e.g. /proc/sys/kernel/hostname.

This is probably the easiest way to do such a thing while staying within a synthetic filesystem implementation. Of course, the only real way to test your code is to poll(2) one of your pins, press a button, and see if you get your rising / falling signal interrupts.

Note: sysfs also does this for edge nodes in the tree:

>>> import select
>>> f = open('/sys/bus/clockevents/devices/clockevent0/uevent', 'r')
>>> p = select.poll()
>>> p.register(f, select.POLLPRI | select.POLLERR)
>>> result = p.poll(10)
>>> result
[(3, 10)]

10 is of course POLLPRI (0x2) | POLLERR (0x8). I got the same results using /sys/power/state as my input. Basically, if you poll any user-readable, non-directory file entry in sysfs, you'll get POLLPRI | POLLERR back.
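
If you would rather not decode the mask by hand, a small helper can do it. This is just a convenience sketch; decode_revents and POLL_FLAGS are names made up for this example, and the flag constants come from Python's select module.

```python
import select

# Flag names as exposed by Python's select module on Linux.
POLL_FLAGS = ["POLLIN", "POLLPRI", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]

def decode_revents(mask):
    """Return the names of the poll flags that are set in mask."""
    return [name for name in POLL_FLAGS if mask & getattr(select, name)]

# Decode the event mask from the session above.
print(decode_revents(select.POLLPRI | select.POLLERR))
```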
