Why Is Nonblocking Socket Writable Before Connect() or Accept()

Why is nonblocking socket writable before connect() or accept()?

select() tells you when a socket is ready for the next action. In this case, it is ready to call connect().

Of course, calling select() on a brand new socket is unnecessary. Most applications wouldn't do that.
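What select() reports for a fresh socket is easy to check for yourself. Below is a minimal sketch, assuming a POSIX system; the helper names is_writable_now() and probe_fresh_socket() are made up for illustration. The answer for an unconnected TCP socket can differ between operating systems, so the sketch only reports what it sees.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper:  returns 1 if select() reports (fd) as ready-for-write
// right now, 0 if it does not, or -1 if select() itself failed.
int is_writable_now(int fd)
{
   assert(fd >= 0 && fd < FD_SETSIZE);

   fd_set writeFDs;
   FD_ZERO(&writeFDs);
   FD_SET(fd, &writeFDs);

   struct timeval zeroTimeout = {0, 0};  // just poll, never block
   if (select(fd+1, NULL, &writeFDs, NULL, &zeroTimeout) < 0) return -1;
   return FD_ISSET(fd, &writeFDs) ? 1 : 0;
}

// Hypothetical helper:  creates a brand new, unconnected, non-blocking TCP socket
// and asks select() about it.  Returns whatever is_writable_now() returned,
// or -1 if the socket could not be set up.
int probe_fresh_socket(void)
{
   const int sock = socket(AF_INET, SOCK_STREAM, 0);
   if (sock < 0) return -1;

   const int flags = fcntl(sock, F_GETFL, 0);
   if ((flags < 0)||(fcntl(sock, F_SETFL, flags|O_NONBLOCK) < 0)) {close(sock); return -1;}

   const int result = is_writable_now(sock);
   close(sock);
   return result;
}
```

Calling probe_fresh_socket() and printing the result shows what your platform's select() says about a socket that has not yet been passed to connect().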

Is non-blocking socket really non-blocking when used with blocking select()?

This is a rather theoretical question. If socket I/O (read or write) is set to O_NONBLOCK, but the socket is then placed in an fd_set passed to select(), which blocks (waiting for the file descriptor to become readable or writable), is that socket effectively blocking anyway because of the select()?

The select is blocking. The socket is still non-blocking.
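To see the difference, here is a small sketch (assuming POSIX; set_nonblocking() and try_recv() are illustrative names, not part of any library). A recv() on a non-blocking socket with nothing to read returns immediately with EAGAIN/EWOULDBLOCK rather than waiting, regardless of whether select() is used elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

// Puts (fd) into non-blocking mode.  Returns 0 on success, -1 on failure.
int set_nonblocking(int fd)
{
   assert(fd >= 0);
   const int flags = fcntl(fd, F_GETFL, 0);
   if (flags < 0) return -1;
   return (fcntl(fd, F_SETFL, flags|O_NONBLOCK) < 0) ? -1 : 0;
}

// Calls recv() once on a non-blocking socket.
// Returns the number of bytes read, 0 on EOF, -1 on a real error,
// or -2 if there was simply nothing to read yet (the call did not wait).
ssize_t try_recv(int fd, void * buf, size_t len)
{
   const ssize_t n = recv(fd, buf, len, 0);
   if ((n < 0)&&((errno == EAGAIN)||(errno == EWOULDBLOCK))) return -2;
   return n;
}
```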

Why would I set the socket to be non-blocking when even the blocking (default) version, once select() reports it as readable (or writable), won't block? After all, select() said there is data to read (or room to write), so the socket should be able to perform its operation without blocking.

No, no, no! That is not a safe assumption. There is no guarantee that a subsequent read or write will not block. You must set the socket non-blocking if you need a future guarantee that a later operation will not block.

So why bother setting the socket non-blocking when select() blocks anyway?

Because you don't want operations on the socket to block. The select function does not guarantee that a future operation won't block and people have gotten burnt by making that assumption in the past.

For example, you do select on a UDP socket and it says that a receive won't block. But before you call recv, the administrator enables UDP checksums which were previously disabled. Guess what, now your recv will block if the checksum was incorrect on the only received datagram.

Unless you think you could have foreseen every way something like that could happen, and you definitely can't, you must set the socket non-blocking if you do not wish it to block.
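In practice that means treating select()'s answer as a hint and letting the non-blocking socket tell you the truth. A minimal sketch, assuming (fd) has already been set to O_NONBLOCK; the function name wait_and_recv() and its -2 "nothing yet" return code are my own conventions for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

// Waits up to (timeoutMillis) for (fd) to select() as ready-for-read, then tries one recv().
// Returns the number of bytes read (>0), 0 on EOF, -1 on a real error, or -2 if nothing
// could be read:  either select() timed out, or select() said "ready" but recv() had no
// data for us after all.  On a non-blocking socket the latter case costs us nothing.
ssize_t wait_and_recv(int fd, void * buf, size_t len, int timeoutMillis)
{
   assert(fd >= 0 && fd < FD_SETSIZE);

   fd_set readFDs;
   FD_ZERO(&readFDs);
   FD_SET(fd, &readFDs);

   struct timeval tv;
   tv.tv_sec  = timeoutMillis/1000;
   tv.tv_usec = (timeoutMillis%1000)*1000;

   const int r = select(fd+1, &readFDs, NULL, NULL, &tv);
   if (r < 0) return (errno == EINTR) ? -2 : -1;
   if (r == 0) return -2;  // timed out

   const ssize_t n = recv(fd, buf, len, 0);
   if ((n < 0)&&((errno == EAGAIN)||(errno == EWOULDBLOCK)||(errno == EINTR))) return -2;
   return n;
}
```

The caller just goes back to its event loop when it sees -2, instead of being stuck inside recv().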

How to program non-blocking socket on connect and select?

Sure, below is a little C program that uses a non-blocking TCP connect to connect to www.google.com's port 80, send it a nonsense string, and print out the response it gets back:

#include <stdio.h>
#include <netdb.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>  // for struct sockaddr_in and htons()

static void SendNonsenseCommand(int sock)
{
   const char sendString[] = "Hello Google!  How are you!\r\n\r\n";
   const size_t numBytes = sizeof(sendString)-1;  // don't send the trailing NUL byte
   if (send(sock, sendString, numBytes, 0) != (ssize_t)numBytes) perror("send()");
}

int main(int argc, char ** argv)
{
   // Create a TCP socket
   const int sock = socket(AF_INET, SOCK_STREAM, 0);
   if (sock < 0) {perror("socket"); return 10;}

   // Set the TCP socket to non-blocking mode
   const int flags = fcntl(sock, F_GETFL, 0);
   if (flags < 0) {perror("fcntl(F_GETFL)"); return 10;}
   if (fcntl(sock, F_SETFL, flags|O_NONBLOCK) < 0) {perror("fcntl(F_SETFL)"); return 10;}

   // Get the IP address of www.google.com
   struct hostent * he = gethostbyname("www.google.com");
   if (he == NULL) {printf("Couldn't get a hostent for www.google.com\n"); return 10;}

   // Start a non-blocking/asynchronous TCP connection to port 80
   struct sockaddr_in saAddr;
   memset(&saAddr, 0, sizeof(saAddr));
   saAddr.sin_family = AF_INET;
   saAddr.sin_addr   = *(struct in_addr*)he->h_addr;
   saAddr.sin_port   = htons(80);

   const int connectResult = connect(sock, (const struct sockaddr *) &saAddr, sizeof(saAddr));
   int isTCPConnectInProgress = ((connectResult == -1)&&(errno == EINPROGRESS));
   if ((connectResult == 0)||(isTCPConnectInProgress))
   {
      if (isTCPConnectInProgress == 0) SendNonsenseCommand(sock);

      // TCP connection is happening in the background; our event-loop calls select() to block until it is ready
      while(1)
      {
         fd_set socketsToWatchForReadReady, socketsToWatchForWriteReady;
         FD_ZERO(&socketsToWatchForReadReady);
         FD_ZERO(&socketsToWatchForWriteReady);

         // While connecting, we'll watch the socket for ready-for-write as that will tell us when the
         // TCP connection process has completed.  After it's connected, we'll watch it for ready-for-read
         // to see what Google's web server has to say to us.
         if (isTCPConnectInProgress) FD_SET(sock, &socketsToWatchForWriteReady);
                                else FD_SET(sock, &socketsToWatchForReadReady);

         int maxFD = sock;  // if we were watching multiple sockets, we'd compute this to be the max value of all of them

         const int selectResult = select(maxFD+1, &socketsToWatchForReadReady, &socketsToWatchForWriteReady, NULL, NULL);
         if (selectResult >= 0)
         {
            if ((FD_ISSET(sock, &socketsToWatchForWriteReady))&&(isTCPConnectInProgress))
            {
               printf("Socket is ready for write!  Let's find out if the connection succeeded or not...\n");

               struct sockaddr_in junk;
               socklen_t length = sizeof(junk);
               memset(&junk, 0, sizeof(junk));
               if (getpeername(sock, (struct sockaddr *)&junk, &length) == 0)
               {
                  printf("TCP Connection succeeded, socket is ready for use!\n");
                  isTCPConnectInProgress = 0;

                  SendNonsenseCommand(sock);
               }
               else
               {
                  printf("TCP Connection failed!\n");
                  break;
               }
            }

            if (FD_ISSET(sock, &socketsToWatchForReadReady))
            {
               char buf[512];
               const int numBytesReceived = recv(sock, buf, sizeof(buf)-1, 0);
               if (numBytesReceived > 0)
               {
                  buf[numBytesReceived] = '\0';  // ensure NUL-termination before we call printf()
                  printf("recv() returned %i:  [%s]\n", numBytesReceived, buf);
               }
               else if (numBytesReceived == 0)
               {
                  printf("TCP Connection severed!\n");
                  break;
               }
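               else if ((errno != EAGAIN)&&(errno != EWOULDBLOCK))
               {
                  perror("recv()");  // a real error; give up rather than spin on it
                  break;
               }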
            }
         }
         else {perror("select()"); return 10;}
      }
   }
   else perror("connect()");

   close(sock);  // just to be tidy
   return 0;
}

Non-blocking Socket connect always succeeds?

Every time select() returns, it invokes this callback, which always
succeeds, i.e., it goes to the "Ready to write/read" block instead of
reporting a failure via cerr. Why can this happen?

While the asynchronous TCP connect is in progress (as indicated by -1/EINPROGRESS from the connect() call), you should pass the socket to select() as part of its ready-for-write socket set, so that select() will return when the socket indicates it is ready-for-write.

When the TCP connection succeeds-or-fails, select() will return that the socket is ready-for-write(*). When that happens, you need to figure out which of the two possible outcomes (success or failure) has occurred.

Below is the function I call when an asynchronously-connecting socket select()'s as ready-for-write.

// Call this when select() has indicated that (fd) is ready-for-write because
// (fd)'s asynchronous-TCP connection has either succeeded or failed.
// Returns true if the connection succeeded, false if the connection failed.
// If this returns true, you can then continue using (fd) as a normal
// connected/non-blocking TCP socket.  If this returns false, you should
// close(fd) because the connection failed.
bool FinalizeAsyncConnect(int fd)
{
#if defined(__FreeBSD__) || defined(BSD)
   // Special case for FreeBSD7, for which send() doesn't do the trick
   struct sockaddr_in junk;
   socklen_t length = sizeof(junk);
   memset(&junk, 0, sizeof(junk));
   return (getpeername(fd, (struct sockaddr *)&junk, &length) == 0);
#else
   // For most platforms, the code below is all we need
   char junk;
   return (send(fd, &junk, 0, 0L) == 0);
#endif
}

(*) Side note: Things are slightly different under Windows, because Windows likes to do things its own way: Under Windows, a successful asynchronous connect() is indicated as described above, but if you want to be notified about a failed asynchronous connect() attempt under Windows, you need to register your socket under the "except" fd_set also, as it is the "except" fd_set that Windows will use to communicate a failed asynchronous connect().
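One way to cover both cases is to always register the connecting socket in the write set and the except set, and then ask the socket for its pending error. The following is a sketch only, written against the POSIX API (on Windows the getsockopt() argument types differ); start_async_connect() and wait_for_connect_result() are hypothetical names, and the code assumes IPv4.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Creates a non-blocking TCP socket and starts connecting it to (dest).
// Returns the socket (the connect may still be in progress), or -1 on immediate failure.
int start_async_connect(const struct sockaddr_in * dest)
{
   assert(dest != NULL);
   const int fd = socket(AF_INET, SOCK_STREAM, 0);
   if (fd < 0) return -1;

   const int flags = fcntl(fd, F_GETFL, 0);
   if ((flags < 0)||(fcntl(fd, F_SETFL, flags|O_NONBLOCK) < 0)) {close(fd); return -1;}

   if ((connect(fd, (const struct sockaddr *)dest, sizeof(*dest)) < 0)&&(errno != EINPROGRESS))
   {
      close(fd);
      return -1;
   }
   return fd;
}

// Waits up to (timeoutMillis) for the asynchronous connect on (fd) to finish.
// Returns 0 if the connection succeeded, a positive errno value if it failed,
// or -1 if select()/getsockopt() failed or the timeout expired first.
int wait_for_connect_result(int fd, int timeoutMillis)
{
   assert(fd >= 0 && fd < FD_SETSIZE);

   fd_set writeFDs, exceptFDs;
   FD_ZERO(&writeFDs);   FD_SET(fd, &writeFDs);
   FD_ZERO(&exceptFDs);  FD_SET(fd, &exceptFDs);  // where Windows reports a failed connect

   struct timeval tv;
   tv.tv_sec  = timeoutMillis/1000;
   tv.tv_usec = (timeoutMillis%1000)*1000;
   if (select(fd+1, NULL, &writeFDs, &exceptFDs, &tv) <= 0) return -1;

   int soError = 0;
   socklen_t len = sizeof(soError);
   if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soError, &len) < 0) return -1;
   return soError;
}
```

In a real event loop you would not call select() inside a helper like this; you would add the socket to the loop's own write and except sets and run the SO_ERROR check when either one fires.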

What's the ideal way to write a code for non-blocking connect()?

The code shown in the example is a bit misleading, in that it's not really implementing a non-blocking connect; rather it is implementing a blocking connect with a one-second timeout. (That is, if the code is working as intended, the nonblocking_connect() function might not return for up to one second when it is called).

That's fine, if that's what you want to do, but the real use-case for a non-blocking connect() is when your event-loop needs to make a TCP connection but also wants to be able to do other things while the TCP connection-setup is in progress.

For example, the program below will echo back any text you type in to stdin; however, if you type in a command of the form connect 172.217.9.4, it will start a non-blocking TCP connection to port 443 of the IP address you entered. The interesting thing to note is that while the TCP connection is in progress, you are still able to enter text into stdin and the program can still respond (it can even abort the TCP-connection-in-progress and start a new one if you tell it to). That can be useful, especially when the TCP connection is taking a long time to set up (e.g. because the server is slow, or because there is a firewall between you and the server that is blocking your client's TCP packets, in which case the TCP connection attempt might take several minutes before it times out and fails).

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char ** argv)
{
   printf("Type something and press return to have your text echoed back to you\n");
   printf("Or type e.g. connect 172.217.9.4 and press return to start a non-blocking TCP connection.\n");
   printf("Note that the text-echoing functionality still works, even when the TCP connection setup is still in progress!\n");

   int tcpSocket = -1;  // this will be set non-negative only when we have a TCP connection in progress
   while(1)
   {
      fd_set readFDs, writeFDs;
      FD_ZERO(&readFDs);
      FD_ZERO(&writeFDs);

      FD_SET(STDIN_FILENO, &readFDs);
      if (tcpSocket >= 0) FD_SET(tcpSocket, &writeFDs); 
   
      int maxFD = STDIN_FILENO;
      if (tcpSocket > maxFD) maxFD = tcpSocket;
       
      if (select(maxFD+1, &readFDs, &writeFDs, NULL, NULL) < 0) {perror("select"); exit(10);}
   
      if (FD_ISSET(STDIN_FILENO, &readFDs))
      {
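         char buf[256] = "\0";
         if (fgets(buf, sizeof(buf), stdin) == NULL) break;  // EOF or error on stdin; stop the loop
         buf[strcspn(buf, "\n")] = '\0';  // strip the trailing newline, if any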
      
         if (strncmp(buf, "connect ", 8) == 0)
         {  
            if (tcpSocket >= 0)
            {
               printf("Closing existing TCP socket %i before starting a new connection attempt\n", tcpSocket);
               close(tcpSocket);
               tcpSocket = -1;
            }
       
            tcpSocket = socket(AF_INET, SOCK_STREAM, 0);
            if (tcpSocket < 0) {perror("socket"); exit(10);}
            
            const char * connectDest = &buf[8];
            printf("Starting new TCP connection using tcpSocket=%i to: %s\n", tcpSocket, connectDest);
            
            int flags = fcntl(tcpSocket, F_GETFL, 0);
            if (flags == -1) {perror("fcntl"); exit(10);}
            if (fcntl(tcpSocket, F_SETFL, flags | O_NONBLOCK) == -1) {perror("fcntl"); exit(10);}
               
            struct sockaddr_in serv_addr; memset(&serv_addr, 0, sizeof(serv_addr));
            serv_addr.sin_family = AF_INET;
            serv_addr.sin_port = htons(443);  // https port
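            if (inet_aton(connectDest, &serv_addr.sin_addr) != 1)
            {
               printf("Unable to parse IP address %s\n", connectDest);
               close(tcpSocket);  // nothing sensible to connect to, so abandon this attempt
               tcpSocket = -1;
               continue;
            }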
            int ret = connect(tcpSocket, (struct sockaddr*)&serv_addr, sizeof(serv_addr));
            if (ret == 0)
            {
               printf("connect() succeeded immediately!  We can just use tcpSocket now\n");
               close(tcpSocket);  // but for the sake of this demo, I won't
               tcpSocket = -1;
            }
            else if (ret == -1)
            {
               if (errno == EINPROGRESS)
               {
                  printf("connect() returned -1/EINPROGRESS: the TCP connection attempt is now happening, but in the background.\n");
                  printf("while that's going on, you can still enter text here.\n");
               } 
               else
               {
                  perror("connect"); 
                  exit(10);
               }
            }
         }  
         else printf("You typed:  %s\n", buf);
      }        
                  
      if ((tcpSocket >= 0)&&(FD_ISSET(tcpSocket, &writeFDs)))
      {
         // Aha, the TCP setup has completed!  Now let's see if it succeeded or failed
         int setupResult;
         socklen_t resultLength = sizeof(setupResult);
         if (getsockopt(tcpSocket, SOL_SOCKET, SO_ERROR, &setupResult, &resultLength) < 0) {perror("getsockopt"); exit(10);}

         if (setupResult == 0)
         {
            printf("\nTCP connection setup complete!  The TCP socket can now be used to communicate with the server\n");
         }
         else
         {
            printf("\nTCP connection setup failed because [%s]\n", strerror(setupResult));
         }

         // Close the socket, since for the purposes of this demo we don't need it any longer
         // A real program would probably keep it around and select()/send()/recv() on it as appropriate
         close(tcpSocket);
         tcpSocket = -1;
      }
   }
}

As for why you would want to call getsockopt(fd, SOL_SOCKET, SO_ERROR, ...), it's simply to determine whether select() returned ready-for-write on the TCP socket because the TCP-connection-setup succeeded, or because it failed (and in the latter case, why it failed, if you care about that).

Why does accept() block, when listen() is the very first involved in TCP?

listen() and accept() are two completely different operations. Your understanding of how they work is incorrect.

listen() merely sets up the listening socket's backlog and opens the bound port, so clients can start connecting to the socket. That opening is a very quick operation, so there is no need to worry about it blocking.

A 3-way handshake is not performed by listen(). It is performed by the kernel when a client tries to connect to the opened port and gets placed into the listening socket's backlog. Each new client connection performs its own 3-way handshake.

Once a client connection has completed its handshake, it becomes available for accept() to extract from the backlog. accept() blocks until a new client connection is available for subsequent communication; on a non-blocking listening socket, accept() instead succeeds only when such a connection is already waiting, and otherwise fails immediately with EWOULDBLOCK/EAGAIN.

You call listen() only once, to open the listening port; that is all it does. Then you have to call accept() for each client that you want to communicate with. That is why accept() blocks and listen() does not.
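You can observe this division of labor with a small experiment: a client's connect() completes against a listening socket even though nobody has called accept() yet, because the kernel performs the handshake on its own. The sketch below assumes POSIX and IPv4 loopback; open_loopback_listener() and connect_without_accept() are names invented for this example.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

// Opens a TCP listening socket on the loopback interface, on a port chosen by the kernel.
// On success, returns the listening socket and writes the port (in network byte order)
// to (*retPort).  Returns -1 on failure.
int open_loopback_listener(in_port_t * retPort)
{
   assert(retPort != NULL);
   const int ls = socket(AF_INET, SOCK_STREAM, 0);
   if (ls < 0) return -1;

   struct sockaddr_in addr;
   memset(&addr, 0, sizeof(addr));
   addr.sin_family      = AF_INET;
   addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
   addr.sin_port        = 0;  // 0 means "kernel, please pick a free port"

   socklen_t len = sizeof(addr);
   if ((bind(ls, (struct sockaddr *)&addr, sizeof(addr)) < 0)
    || (listen(ls, 8) < 0)  // called exactly once; returns right away
    || (getsockname(ls, (struct sockaddr *)&addr, &len) < 0)) {close(ls); return -1;}

   *retPort = addr.sin_port;
   return ls;
}

// Makes a (blocking) client connection to the loopback listener on (port).
// Note that nothing here, or in the function above, ever calls accept().
// Returns the connected client socket, or -1 if connect() failed.
int connect_without_accept(in_port_t port)
{
   const int cs = socket(AF_INET, SOCK_STREAM, 0);
   if (cs < 0) return -1;

   struct sockaddr_in addr;
   memset(&addr, 0, sizeof(addr));
   addr.sin_family      = AF_INET;
   addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
   addr.sin_port        = port;

   if (connect(cs, (struct sockaddr *)&addr, sizeof(addr)) < 0) {close(cs); return -1;}
   return cs;
}
```

After connect_without_accept() has returned a valid socket, a later accept() on the listening socket simply picks the already-established connection out of the backlog.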

What is the benefit of using non-blocking sockets with the select function?

There might be cases when a socket is reported as ready but by the time you get to check it, it changes its state.

One good example is accepting connections. When a new connection arrives, a listening socket is reported as ready for read. By the time you get to call accept, the connection might already have been closed by the other side, before it ever sent anything. Of course, the handling of this case is OS-dependent, but it's possible that accept will simply block until a new connection is established, which will cause our application to wait for an indefinite amount of time, preventing the processing of other sockets. If your listening socket is in non-blocking mode, this won't happen: you'll get EWOULDBLOCK or some other error, but accept will not block.
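A sketch of an accept() wrapper along those lines is shown below. It assumes POSIX and an IPv4 listening socket; the name try_accept() and the -2 "nothing to accept" return value are conventions made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

// Tries to accept one connection from (listenFD) without ever blocking.
// Returns the new connection's socket, -2 if no connection is waiting right now
// (including one that went away before we got to it), or -1 on a real error.
// If (optPeer) is non-NULL, the client's address is written there on success.
int try_accept(int listenFD, struct sockaddr_in * optPeer)
{
   assert(listenFD >= 0);

   // Make sure the listening socket is non-blocking (harmless if it already is)
   const int flags = fcntl(listenFD, F_GETFL, 0);
   if ((flags < 0)||(fcntl(listenFD, F_SETFL, flags|O_NONBLOCK) < 0)) return -1;

   struct sockaddr_in peer;
   socklen_t peerLen = sizeof(peer);
   const int fd = accept(listenFD, (struct sockaddr *)&peer, &peerLen);
   if (fd >= 0)
   {
      if (optPeer) *optPeer = peer;
      return fd;
   }

   // These errno values just mean "nothing usable to accept at the moment"
   if ((errno == EAGAIN)||(errno == EWOULDBLOCK)||(errno == ECONNABORTED)||(errno == EINTR)) return -2;
   return -1;
}
```

When select() reports the listening socket as readable, the event loop calls try_accept() and carries on with its other sockets if the result is -2.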

Some kernels used to have (I hope it's fixed now) an interesting bug with UDP and select. When a datagram arrives, select wakes up with the socket that received the datagram marked as ready for read. The datagram checksum validation is postponed until user code calls recvfrom (or some other API capable of receiving UDP datagrams). When the code calls recvfrom and the validating code detects a checksum mismatch, the datagram is simply dropped and recvfrom ends up blocked until the next datagram arrives. One of the patches fixing this problem (along with the problem description) can be found here.

Non Blocking write, and blocking recv

You can use select() to block until the socket is readable. Leave it in non-blocking mode. If send() returns -1 and errno is set to EAGAIN or EWOULDBLOCK then you will need to use select() to tell you when the socket has become writable again. Only do that when send() reports EAGAIN/EWOULDBLOCK, as sockets are almost always writable.
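As a rough sketch of that approach (assuming POSIX and a socket already in non-blocking mode; send_all() is just a name chosen for this example), the helper below calls send() first and falls back to select() only when send() reports EAGAIN/EWOULDBLOCK:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

// Sends all (len) bytes of (data) over the non-blocking socket (fd).
// select() is consulted only after send() has reported EAGAIN/EWOULDBLOCK.
// Returns 0 once everything has been sent, or -1 on error.
int send_all(int fd, const char * data, size_t len)
{
   assert(fd >= 0 && fd < FD_SETSIZE);

   size_t numSent = 0;
   while(numSent < len)
   {
      const ssize_t n = send(fd, data+numSent, len-numSent, 0);
      if (n >= 0) numSent += (size_t) n;
      else if ((errno == EAGAIN)||(errno == EWOULDBLOCK))
      {
         // The socket's outgoing buffer is full; wait until it is writable again
         fd_set writeFDs;
         FD_ZERO(&writeFDs);
         FD_SET(fd, &writeFDs);
         if ((select(fd+1, NULL, &writeFDs, NULL, NULL) < 0)&&(errno != EINTR)) return -1;
      }
      else if (errno != EINTR) return -1;
   }
   return 0;
}
```

This version waits inside the helper for simplicity. A program that services several sockets would instead remember the unsent bytes, add the socket to its main select() call's write set, and resume sending when that call reports it writable.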
