Linux: Checking If a Socket/Pipe Is Broken Without Doing a Read()/Write()

struct pollfd pfd = {.fd = yourfd, .events = POLLERR};
if (poll(&pfd, 1, timeout_ms) < 0) abort();   /* timeout_ms: whatever timeout suits you */
if (pfd.revents & POLLERR) printf("pipe is broken\n");

This works for me. Note that sockets are not exactly pipes and thus show different behavior; for a socket, use POLLRDHUP (Linux-specific) to detect that the peer has closed or shut down the writing half of the connection, as sketched below.
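
A minimal sketch of that socket-side check, assuming sockfd is a connected stream socket (the helper name is hypothetical):

#define _GNU_SOURCE   /* for POLLRDHUP (Linux-specific) */
#include <poll.h>
#include <stdlib.h>

/* Hypothetical helper: returns nonzero if the socket peer hung up or
 * the socket is in an error state, without reading or writing. */
int socket_peer_gone(int sockfd)
{
    struct pollfd pfd = { .fd = sockfd, .events = POLLRDHUP };

    if (poll(&pfd, 1, 0) < 0)   /* 0 ms timeout: pure status check */
        abort();
    return pfd.revents & (POLLRDHUP | POLLERR | POLLHUP);
}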

What happens to a socket if the network has broken down

There are numerous other ways a TCP connection can go dead undetected:

  • someone yanks out a network cable in between.
  • the computer at the other end gets nuked.
  • a NAT gateway in between silently drops the connection.
  • the OS at the other end crashes hard.
  • the FIN packets get lost.
  • undetectable errors: a router between the endpoints may drop packets (including control packets).

In all of these cases you only find out when you try to write to the socket: the write raises SIGPIPE in your program, which terminates it by default.

A read() cannot tell you whether the other side is alive or not. That is why SO_KEEPALIVE is useful. Keepalive is non-invasive, and in most cases, if you're in doubt, you can turn it on without the risk of doing something wrong. But do remember that it generates extra network traffic, which can have an impact on routers and firewalls.

Note that the kernel-wide keepalive settings affect all sockets on your machine. And because SO_KEEPALIVE increases traffic and consumes CPU, it is best to also handle SIGPIPE if there is any chance the application will ever write to a broken connection.

Also, enable SO_KEEPALIVE only where it makes sense in the application. It is wasteful to keep it on for the whole connection duration; for example, turn it on only while the server is working for a long time on a client query.

The right probing interval depends on your application, or rather on your application-layer protocol.

With TCP keepalive enabled you will detect the failure eventually, though with the default settings that can take a couple of hours (on Linux, the first probe is sent only after two hours of idleness).
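
As a sketch, assuming sockfd is a connected TCP socket (the option values are illustrative, and TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT are Linux-specific):

#include <signal.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Enable keepalive on one socket and tune its probing schedule. */
void enable_keepalive(int sockfd)
{
    int on = 1, idle = 60, intvl = 10, cnt = 5;

    setsockopt(sockfd, SOL_SOCKET,  SO_KEEPALIVE,  &on,    sizeof on);
    setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,  sizeof idle);  /* idle seconds before first probe */
    setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof intvl); /* seconds between probes */
    setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT,   &cnt,   sizeof cnt);   /* failed probes before dropping */

    signal(SIGPIPE, SIG_IGN);  /* writes to a dead connection then return EPIPE
                                  instead of killing the process */
}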

Now suppose the network has broken down but, instead of trying to write, the socket has been registered with an epoll instance:

epoll_wait() fills in its second argument, the events array, with the events that fired:

 n = epoll_wait (efd, events, MAXEVENTS, -1);

Good practice is to check each entry's event flags for error conditions, as follows.

n = epoll_wait (efd, events, MAXEVENTS, -1);
for (i = 0; i < n; i++)
{
    if ((events[i].events & EPOLLERR) ||
        (events[i].events & EPOLLHUP) ||
        (!(events[i].events & EPOLLIN)))
    {
        /* An error has occurred on this fd, or the socket is not
           ready for reading (why were we notified then?) */
        fprintf (stderr, "epoll error\n");
        close (events[i].data.fd);
        continue;
    }
    else if (sfd == events[i].data.fd)
    {
        /* We have a notification on the listening socket, which
           means one or more incoming connections. */

        /* accept() the new connection(s) here */
    }
}

For detecting peer shutdown specifically, there is also EPOLLRDHUP, which the man page describes as:

Stream socket peer closed connection, or shut down writing half of connection. (This flag is especially useful for writing simple code to detect peer shutdown when using Edge Triggered monitoring.)

What causes the Broken Pipe Error?

It can take time for the network close to be observed - the total time is nominally about 2 minutes (yes, minutes!) after a close before the packets destined for the port are all assumed to be dead. The error condition is detected at some point. With a small write, you are inside the MTU of the system, so the message is queued for sending. With a big write, you are bigger than the MTU and the system spots the problem quicker. If you ignore the SIGPIPE signal, then the functions will return EPIPE error on a broken pipe - at some point when the broken-ness of the connection is detected.

Send behavior on Broken pipe

POSIX says that EPIPE should be returned and SIGPIPE sent:

  • For write()s or pwrite()s to pipes or FIFOs not open for reading by any process, or with only one end open.
  • For write()s to sockets that are no longer connected or shut down for writing.

You can have a look at the POSIX specification for write().
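
A minimal demonstration of the pipe case: with SIGPIPE ignored, a write() to a pipe whose read end has been closed returns -1 with errno set to EPIPE instead of killing the process.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fds[2];

    signal(SIGPIPE, SIG_IGN);
    if (pipe(fds) < 0)
        return 1;
    close(fds[0]);                               /* no reader left */
    if (write(fds[1], "x", 1) < 0)
        printf("write failed: %s\n", strerror(errno));  /* Broken pipe */
    return 0;
}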

Block until a reader has connected to named pipe

It turns out that contrary to my comment to the original question above, there is a straightforward solution.

This solution assumes that all readers and all writers to the same FIFO share the kernel buffers. This should be the most logical and straightforward way to implement FIFOs (considering their behaviour), so I do expect all systems providing FIFOs to behave this way. However, it is just my assumption, not any guarantee. I have not found anything in the relevant POSIX standards to support or contradict this. Please do pipe up if you find otherwise.

The procedure is trivial:

When a client vanishes unexpectedly, the writer opens the FIFO again, without closing the original descriptor first. This open() will block, until there is a new reader available, but since the original file descriptor is still open, the data already buffered in the FIFO will be available to the new reader. If the open() succeeds, the writer simply closes the original descriptor, and switches to using the new descriptor instead.

If the kernel structures are shared, the FIFO buffer state is shared between the writer descriptors, and the new reader will be able to read what the previous reader left unread.

(Note that the writer does not know the amount of data buffered between client changes, and is therefore unaware of the point in data stream where the switch happens.)
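
A minimal sketch of that reopen step, assuming fifo_path and oldfd are the FIFO's pathname and the writer's current descriptor:

#include <fcntl.h>
#include <unistd.h>

/* Blocks until a new reader opens the FIFO; keeping oldfd open
 * preserves the buffered data during the wait. */
int reopen_fifo_writer(const char *fifo_path, int oldfd)
{
    int newfd = open(fifo_path, O_WRONLY);  /* blocks until a reader appears */

    if (newfd < 0)
        return oldfd;   /* reopen failed; keep the old descriptor */
    close(oldfd);       /* safe now: the new reader sees the buffered data */
    return newfd;
}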

I have verified this trivial strategy works on Linux 3.8.0-35-generic kernels on x86_64 in Ubuntu, as well as 2.6.9-104.ELsmp on x86_64.


However, I still fully agree with either accepting the data loss, or changing the protocol, as suggested by Basile Starynkevitch in a comment to the original question.

Personally, I have found Unix domain sockets (bound to a pathname, say /var/run/yourservice/unix) to be a much better option, because it allows multiple simultaneous clients without data corruption (unlike FIFOs), and a much saner protocol.

I prefer Unix datagram sockets, with a sequence number and datagram length at the start of each datagram. (The length helps the client verify it read the entire datagram; I don't really expect any OS to truncate Unix datagrams.)
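
A hypothetical header for that framing might look like this (the struct name and field widths are assumptions, not a standard format):

#include <stdint.h>

/* Each datagram begins with a sequence number and the payload length. */
struct dgram_header {
    uint32_t seq;  /* sequence number, echoed back by the client as an ack */
    uint32_t len;  /* payload length, so the client can verify it received
                      the entire datagram */
};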

Typically, the writer sends a few datagrams to each client, and waits for acknowledgement before sending new ones. After processing a datagram, the client acknowledges the datagram by sending the sequence number to the writer. (Remember, these are sockets, so the communication is bidirectional.) This allows the writer to keep a few datagrams in flight per client, and the client to process the datagrams in either correct (sequence number) order, or out-of-order, using multiple threads.

The important point is that each datagram is acknowledged only after it has been processed, not immediately after it has been received.

(In addition to reader-to-writer "acks" (acknowledgements), I would also support "please resend" responses, in case the client used too small a buffer to receive a datagram, or dropped it on the floor for some other reason. And maybe even a "yuck" response for datagrams the client does not know what to do with.)

If a client vanishes, the writer knows that all un-acknowledged datagrams were not processed by the client yet, and can re-send them to another connected client, or a future client.

Questions?

Broken pipe for a C socket: how to keep only the server running?

Your program has two problems:

1) read() works differently than you think:

Normally read() will read up to a certain number of bytes from some file or stream (e.g. socket).

Because read() does not distinguish between different types of bytes (e.g. letters, the end-of-line marker or even the NUL byte) read() will not work like fgets() (reading line-wise).

read() is also allowed to "split" the data: If you do a write(..."Hello\n"...) on the client the server may receive "Hel" the first time you call read() and the next time it receives "lo\n".

And of course read() can concatenate data: Call write(..."Hello\n"...) and write(..."World\n"...) on the client and one single read() call may receive "Hello\nWorld\n".

And of course both effects may appear at the same time, so that you have to call read() three times, receiving "Hel", "lo\nWo" and "rld\n".

TTYs (= the console (keyboard) and serial ports) have a special feature (which may be switched off) that makes the read() call behave like fgets(). However only TTYs have such a feature!

In the case of sockets, read() will wait for at least one byte to be received and return the (positive) number of bytes received as long as the connection is alive. A return value of zero means the peer has closed the connection; a negative return value indicates an error (check errno; for example, EINTR does not mean the connection dropped).

You have to use a while loop that processes data until the connection has been dropped.

You'll have to check the data received by read() if it contains the NUL byte to detect the "end" of the data - if "your" data is terminated by a NUL byte.
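
A sketch of such a receive loop, assuming connfd is the descriptor returned by accept():

#include <errno.h>
#include <stdio.h>
#include <unistd.h>

void serve_connection(int connfd)
{
    char buf[4096];
    ssize_t n;

    for (;;) {
        n = read(connfd, buf, sizeof buf);
        if (n > 0) {
            /* process n bytes; a message may arrive split across several
               reads, or several messages may arrive in a single read */
            fwrite(buf, 1, (size_t)n, stdout);
        } else if (n == 0) {
            break;                 /* peer closed the connection */
        } else if (errno != EINTR) {
            perror("read");        /* real error */
            break;
        }
    }
}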

2) As soon as the client drops the connection the handle returned by accept() is useless.

You should close that handle to save memory and file descriptors (there is a limit on how many file descriptors you can have open at one time).

Then you have to call accept() again to wait for the client to establish a new connection.
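
Putting both points together, a sketch of the server's main loop (listenfd is an assumed name for the listening socket, and serve_connection() is the receive loop sketched above):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

void serve_connection(int connfd);  /* the receive loop sketched above */

void serve_forever(int listenfd)
{
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);

        if (connfd < 0) {
            perror("accept");
            continue;
        }
        serve_connection(connfd);
        close(connfd);   /* free the file descriptor */
    }
}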

uWSGI raises OSError: write error during large request

It may be the case that when you upload things, chunked encoding is used.
There is a uWSGI option, --chunked-input-timeout, which by default is
4 seconds (it defaults to the value of --socket-timeout, which is 4 seconds).

Though the problem may theoretically lie somewhere else, I suggest you try the
aforementioned option. Also, annoying exceptions are the reason why I have

ignore-sigpipe=true
ignore-write-errors=true
disable-write-exception=true

in my uWSGI config (note that these are three separate options, not two):

  • ignore-sigpipe makes uWSGI not show SIGPIPE errors;
  • ignore-write-errors makes it not show errors with
    e.g. uwsgi_response_writev_headers_and_body_do;
  • disable-write-exception prevents
    OSError generation on writes.

