Can Read() Function on a Connected Socket Return Zero Bytes

Can read() function on a connected socket return zero bytes?

When a TCP connection is closed on one side, read() on the other side returns 0 once any data still pending has been read.

If the POSIX socket read function returns 0, does that indicate an error occurred?

No, a return value of zero does not indicate an error. The documentation for read() says:

Upon successful completion, these functions shall return a non-negative integer indicating the number of bytes actually read. Otherwise, the functions shall return -1 and set errno to indicate the error.

That is, a return value of zero just means that zero bytes were actually read. This is not an error; an error is indicated by the return value -1. That errno is still zero is consistent with this, although errno is only meaningful after a call has reported failure.

As for the blocking / non-blocking part:

If fildes [i.e.: first parameter] refers to a socket, read() shall be equivalent to recv() with no flags set.

There it says:

If no messages are available at the socket and O_NONBLOCK is not set on the socket's file descriptor, recv() shall block until a message arrives. If no messages are available at the socket and O_NONBLOCK is set on the socket's file descriptor, recv() shall fail and set errno to [EAGAIN] or [EWOULDBLOCK].

Keeping in mind that the file is a socket, this translates to:

In blocking mode, the read()/recv() call will just wait until more data is available. A return value of zero should therefore indicate that the socket has been closed and no more data will be received. One can argue whether this is an error or not, but it indicates an orderly shutdown, so I would not see it as an error but rather as "No more data to read here, move on!".
In non-blocking mode, the read()/recv() call will return -1 if no data is available, and errno will be either EAGAIN or EWOULDBLOCK. Zero can still occur as a return value in non-blocking mode if the socket has been closed properly.

Summary:

Zero does not indicate an error. Zero can be returned if:

the socket has been closed in an orderly fashion and no more data can be received (in both blocking and non-blocking mode), or
a datagram of size zero has been received, or
exactly zero bytes have been requested via the third parameter to read().
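The three cases in the summary can be reproduced locally. The sketch below is illustrative only: it uses AF_UNIX socketpair() sockets rather than TCP, all function names are hypothetical, and the zero-length datagram case assumes the platform accepts an empty send() on such a socket (Linux appears to).

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a connected AF_UNIX pair of the given type. */
static void make_pair(int type, int sv[2])
{
    int rc = socketpair(AF_UNIX, type, 0, sv);
    assert(rc == 0);
    (void)rc;
}

/* Case 1: the peer closes its end; read() reports end-of-file as 0. */
ssize_t read_after_peer_close(void)
{
    int sv[2];
    char buf[16];
    make_pair(SOCK_STREAM, sv);
    close(sv[1]);                          /* orderly shutdown by the peer */
    ssize_t n = read(sv[0], buf, sizeof buf);
    close(sv[0]);
    return n;
}

/* Case 2: a datagram of size zero arrives; the peer is still open. */
ssize_t read_zero_length_datagram(void)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;
    make_pair(SOCK_DGRAM, sv);
    if (send(sv[1], "", 0, 0) == 0)        /* only read if the empty send worked */
        n = read(sv[0], buf, sizeof buf);
    close(sv[0]);
    close(sv[1]);
    return n;
}

/* Case 3: zero bytes requested; data is pending and the peer is still open. */
ssize_t read_zero_bytes_requested(void)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;
    make_pair(SOCK_STREAM, sv);
    if (write(sv[1], "x", 1) == 1)
        n = read(sv[0], buf, 0);           /* third parameter is zero */
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Each function returns the raw result of read(), so a caller can see that all three situations yield 0 even though only the first one means the connection is gone.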

Is reading zero bytes from a socket a valid way for monitoring a TCP/IP disconnect in POSIX C?

No, this is not a robust solution, for two reasons.

First, for a connected TCP socket, read and recv will return zero, not -1, after all incoming data has been read and the remote peer has closed its end of the connection (using close or shutdown). Your IsConnected will return TRUE in this case, which is wrong.

Second, the specification of read says (second paragraph of DESCRIPTION):

Before any action described below is taken, and if nbyte is zero, the read function may detect and return errors as described below. In the absence of errors, or if error detection is not performed, the read function shall return zero and have no other results.

nbyte is the third argument, which your IsConnected is supplying as zero. Therefore, depending on the operating system, IsConnected might always return TRUE, regardless of the state of the socket.

The specification of recv does not say anything about what happens if the length argument (equivalent to nbyte for read) is zero; I think this is probably an oversight and it (and recvfrom, recvmsg, etc.) is meant to have the same special behavior as read. So changing read to recv will not fix the problem by itself. However, I think a complete fix is possible by using recv with MSG_PEEK:

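#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

// Note: on a blocking socket this call waits until data, EOF, or an error
// is available.
bool is_connected(int sock)
{
    char dummy[1];
    ssize_t nread = recv(sock, dummy, sizeof dummy, MSG_PEEK);
    if (nread > 0)
        return true;    // at least one byte of data available
    else if (nread == 0)
        return false;   // EOF
    else
        return errno == EWOULDBLOCK || errno == EAGAIN;  // no data yet
}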

Using MSG_PEEK allows you to supply a nonzero length, because the data will not actually be consumed.

Depending on the details of the application and its network protocol, you may also want to consider enabling TCP keep-alive packets on the socket.
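As a minimal sketch of that last suggestion, keep-alive can be switched on with the standard SO_KEEPALIVE socket option. The function name enable_keepalive is hypothetical, and the probe timing is left at the system defaults because the options that tune it are platform-specific.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the stack to send keep-alive probes on sock.
 * Returns 0 on success, or -1 with errno set by setsockopt(). */
int enable_keepalive(int sock)
{
    int on = 1;
    assert(sock >= 0);
    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}
```

Keep-alive only helps detect a peer that vanished without closing the connection; an orderly close is still reported by read or recv returning 0.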

using setsockopt; read returns 0 not -1 when socket closed from other side

When a socket is closed from the other side, that is reported as end-of-file, and the read call will return 0. That is the correct behavior.

Refer:
https://linux.die.net/man/2/read

Why socket reads 0 bytes when more was available

Have you read the documentation?

0 bytes read means that the remote endpoint has disconnected.

Either use blocking sockets or use the asynchronous methods like BeginReceive(). There is no need for Poll in .NET.

Can a C socket recv 0 bytes without the client shutting the connection?

TL;DR: POSIX says "no".

I don't think the man page is so unclear, but POSIX's description is perhaps a bit more clear:

The recv() function shall receive a message from a connection-mode or connectionless-mode socket.

[...]

Upon successful completion, recv() shall return the length of the message in bytes. If no messages are available to be received and the peer has performed an orderly shutdown, recv() shall return 0. Otherwise, -1 shall be returned and errno set to indicate the error.

Thus, there are exactly three alternatives allowed by POSIX:

recv() successfully receives a message and returns its length. The length is nonzero by definition, for if no bytes were received then no message was received. recv() therefore returns a value greater than 0.
No message was received from the peer, and we are confident that none will be forthcoming because the peer has (by the time recv() returns) performed an orderly shutdown of the connection. recv() returns 0.
"Otherwise": something else happened. recv() returns -1.

In any event, recv() will block if no data are available and no error condition is available at the socket, unless the socket is in non-blocking mode. In non-blocking mode, if there is neither data nor error condition available when recv() is called, that falls into the "otherwise" case.
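The three alternatives, plus the non-blocking "no data yet" sub-case of "otherwise", can be told apart as in the sketch below. The enum, classify_recv, and set_nonblocking are hypothetical names introduced for illustration; other errno values such as EINTR are simply lumped into RECV_ERROR here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

enum recv_outcome {
    RECV_DATA,         /* recv() returned a value greater than 0 */
    RECV_PEER_CLOSED,  /* recv() returned 0: orderly shutdown by the peer */
    RECV_NO_DATA_YET,  /* recv() returned -1 with EAGAIN or EWOULDBLOCK */
    RECV_ERROR         /* recv() returned -1 for any other reason */
};

/* Put fd into non-blocking mode; returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Call recv() once and map its result onto the outcomes above.
 * The raw return value is stored in *nread. */
enum recv_outcome classify_recv(int sock, void *buf, size_t cap, ssize_t *nread)
{
    assert(buf != NULL && cap > 0 && nread != NULL);
    *nread = recv(sock, buf, cap, 0);
    if (*nread > 0)
        return RECV_DATA;
    if (*nread == 0)
        return RECV_PEER_CLOSED;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return RECV_NO_DATA_YET;
    return RECV_ERROR;
}
```

Requiring cap > 0 keeps the zero-length request out of the picture, so a result of RECV_PEER_CLOSED is not confused with "nothing was asked for".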

One cannot altogether rule out that a given system will fail to comply with POSIX, but you have to decide somehow how you're going to interpret function results. If you're calling a POSIX-defined function on a system that claims to conform to POSIX (at least with respect to that function) then it's hard to argue with relying on POSIX semantics.

C - does read() add a '\0'?

does read() add a '\0'?

No, it doesn't. It just reads.

From read()'s documentation:

The read() function shall attempt to read nbyte bytes from the file associated with the open file descriptor, fildes, into the buffer pointed to by buf.
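Since read() only fills the buffer, a caller that wants a C string has to add the terminator itself. A minimal sketch, with the hypothetical name read_cstring, assuming the data contains no embedded zero bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read at most cap - 1 bytes from fd into buf and terminate the result.
 * Returns what read() returned; buf is only terminated if read() succeeded. */
ssize_t read_cstring(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    ssize_t n = read(fd, buf, cap - 1);   /* leave room for the terminator */
    if (n >= 0)
        buf[n] = '\0';                    /* read() did not write this byte */
    return n;
}
```

The return value still has to be checked: 0 means end-of-file (buf is then an empty string) and -1 means an error, in which case buf is left untouched.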

Is there potential for error here, other than the cases where those functions return -1?

read() might return 0 indicating end-of-file.

When reading (including from a socket descriptor), read() does not necessarily read as many bytes as it was asked to. So in this context do not just test the result of read() against -1, but also compare it against the number of bytes the function was told to read.
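One common way to deal with short reads is to loop until the requested count has arrived. The following is a sketch with the hypothetical name read_full, assuming a blocking descriptor; on a non-blocking one, EAGAIN would be reported as an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling read() until want bytes have arrived, end-of-file is hit,
 * or an error occurs. Returns the number of bytes stored in buf (which is
 * less than want only if end-of-file came first), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t want)
{
    assert(buf != NULL || want == 0);
    char *p = buf;
    size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n > 0)
            got += (size_t)n;        /* partial read: keep going */
        else if (n == 0)
            break;                   /* end-of-file before want bytes */
        else if (errno == EINTR)
            continue;                /* interrupted by a signal: retry */
        else
            return -1;               /* real error, errno is set */
    }
    return (ssize_t)got;
}
```

A caller then compares the result against want: equal means the full amount was read, smaller means the peer closed early, and -1 means an error.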

A general note:

Functions do what is documented (at least for proper implementations of the C language). Neither of your assumptions (that read() sets a 0-terminator on its own, or that it detects one) is documented.
