What Happens When a TCP Server Binds and Forks Before Doing an Accept? Which Process Would Handle the Client Requests

What happens when a TCP server binds and forks before doing an accept? Which process would handle the client requests?

After the fork, parent and child share the same listening socket, and every process that calls accept() on it is put to sleep on that socket's wait queue. The Linux wait_queue_head implementation consists of an ordered data structure (a linked list serving as a queue). New waiting tasks are added to the end of the queue, and wakeups are done from the head (cf. __wake_up_common in kernel/sched.c). Furthermore, only a single task is woken up (as in many places besides socket code), because scheduling all the tasks is often pointless when only one of them can get the resource in question (cf. the comments in inet_csk_wait_for_connect in net/ipv4/inet_connection_sock.c).
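To make the bind-then-fork pattern concrete, here is a minimal sketch, assuming a loopback listener on a kernel-chosen port. The names `prefork_demo`, `make_listener` and `ask_who_served` are made up for illustration. Each worker blocks in accept() on the inherited socket and answers with its own pid. Which worker gets a given connection is up to the kernel's wakeup order, so the sketch only checks that some worker answered.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 16

/* Fill addr with 127.0.0.1:port (port in host byte order). */
static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Bind and listen on a kernel-chosen loopback port; store it in *port. */
static int make_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, 0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Connect once and read back the pid of whichever worker accepted. */
static pid_t ask_who_served(unsigned short port)
{
    struct sockaddr_in addr;
    pid_t who = -1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        read(fd, &who, sizeof who) != (ssize_t)sizeof who)
        who = -1;
    close(fd);
    return who;
}

/* Bind, fork nworkers children that all accept() on the inherited socket,
 * then make nconn connections. Returns how many a worker answered. */
int prefork_demo(int nworkers, int nconn)
{
    pid_t workers[MAX_WORKERS];
    unsigned short port;
    int started, served = 0, lfd;

    assert(nworkers >= 1 && nworkers <= MAX_WORKERS);
    lfd = make_listener(&port);
    if (lfd < 0)
        return -1;
    for (started = 0; started < nworkers; started++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (;;) {  /* worker: sleeps in accept() until woken */
                int cfd = accept(lfd, NULL, NULL);
                pid_t me = getpid();
                if (cfd < 0)
                    _exit(1);
                if (write(cfd, &me, sizeof me) < 0) { /* client went away */ }
                close(cfd);
            }
        }
        workers[started] = pid;
    }
    for (int c = 0; started > 0 && c < nconn; c++) {
        pid_t who = ask_who_served(port);
        for (int i = 0; i < started; i++)
            if (who == workers[i]) { served++; break; }
    }
    for (int i = 0; i < started; i++) {  /* stop and reap the workers */
        kill(workers[i], SIGKILL);
        waitpid(workers[i], NULL, 0);
    }
    close(lfd);
    return served;
}
```

Calling `prefork_demo(3, 5)` should report five served connections, each handled by exactly one of the three workers.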

In TCP, if the server uses another port to communicate, how will it inform the client?

A TCP connection is uniquely identified by the tuple (source IP, source port, destination IP, destination port). The OS uses this tuple to "bind" the TCP connection to a process, that is, to know which process it should deliver a given TCP packet to.

When the server accepts a TCP connection and then forks, the child process inherits the parent's open sockets, so the binding of that TCP connection effectively passes to the newly forked process. The client on the remote machine does not know, and does not need to know, that this happened. The network keeps seeing the same thing: packets with the same tuple flowing through it.

At this point the original process keeps listening for new TCP connections. When a new connection request arrives, even from the same machine as before, its source port must be different. From the OS's perspective it is a different tuple, so it can distinguish the TCP packets and deliver them to the right process.

You may ask how the client on the remote machine knows it has to use another port to initiate a new connection. The client OS knows (or is informed by the socket library) that the process is creating a separate new connection, and it assigns that connection another unique port number. That is how multiple processes can communicate with the same server port without their messages getting mixed up.
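The port assignment can be observed from the client side. The following is a rough sketch, assuming a loopback listener on a kernel-chosen port; `observe_client_ports` and `struct client_ports` are names invented for this example. It opens two connections to the same listener and records the local and remote port of each.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ports seen from the client side of two simultaneous connections. */
struct client_ports {
    unsigned short server_port;  /* port the listener is bound to */
    unsigned short src1, src2;   /* local (ephemeral) port of each client socket */
    unsigned short dst1, dst2;   /* remote port each client socket talks to */
};

/* Fill addr with 127.0.0.1:port (port in host byte order). */
static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Bind and listen on a kernel-chosen loopback port; store it in *port. */
static int make_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, 0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Connect to the listener and record both ends' ports. Returns the fd. */
static int connect_and_report(unsigned short port,
                              unsigned short *src, unsigned short *dst)
{
    struct sockaddr_in addr, local, remote;
    socklen_t llen = sizeof local, rlen = sizeof remote;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&local, &llen) < 0 ||
        getpeername(fd, (struct sockaddr *)&remote, &rlen) < 0) {
        close(fd);
        return -1;
    }
    *src = ntohs(local.sin_port);
    *dst = ntohs(remote.sin_port);
    return fd;
}

/* Open two connections to one listener and report the ports involved. */
int observe_client_ports(struct client_ports *out)
{
    int lfd, c1, c2;

    assert(out != NULL);
    lfd = make_listener(&out->server_port);
    if (lfd < 0)
        return -1;
    /* Both connections stay open at once, so their tuples must differ. */
    c1 = connect_and_report(out->server_port, &out->src1, &out->dst1);
    c2 = connect_and_report(out->server_port, &out->src2, &out->dst2);
    if (c1 >= 0)
        close(c1);
    if (c2 >= 0)
        close(c2);
    close(lfd);
    return (c1 >= 0 && c2 >= 0) ? 0 : -1;
}
```

Both connections report the same destination port, while the two source ports differ, which is what keeps the tuples distinct.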

In short, accept followed by fork on the server just transfers ownership of a TCP connection's binding to another process. Nothing changes about the server port used in the communication.

How does the client know the ephemeral port being used by the child TCP process?

How does the client know the ephemeral port being used by the child TCP process?

There is no ephemeral port to know. The client just keeps using the same target port that it connected to.

The child process will obviously have to bind to another ephemeral port to communicate with the client.

No. The child process inherits the accepted socket, which is bound to the same local port as the listening socket.

My question is how will the client know which port to send the data to in order to communicate once the child process is forked?

It communicates via the same port it connected to.

Does the parent TCP process listening on port 80 convey it to the client?

The child inherits the accepted socket via the FD inheritance mechanism, so nothing needs to be conveyed to the client.
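For what it's worth, the inheritance is easy to check. Below is a minimal sketch, assuming a loopback listener on a kernel-chosen port; `child_local_port` and the helpers are hypothetical names. The parent accepts a connection and forks, the child asks the kernel which local port the inherited socket uses and sends that number to the client, and the caller can compare it with the listening port.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill addr with 127.0.0.1:port (port in host byte order). */
static void loopback_addr(struct sockaddr_in *addr, unsigned short port)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(port);
}

/* Bind and listen on a kernel-chosen loopback port; store it in *port. */
static int make_listener(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    loopback_addr(&addr, 0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Accept one connection, fork, and let the child report the local port of
 * the inherited socket back to the client. Returns 0 on success. */
int child_local_port(unsigned short *listen_port, unsigned short *child_port)
{
    struct sockaddr_in addr;
    int lfd, cfd, afd;
    ssize_t got = -1;
    pid_t pid;

    assert(listen_port != NULL && child_port != NULL);
    lfd = make_listener(listen_port);
    if (lfd < 0)
        return -1;
    cfd = socket(AF_INET, SOCK_STREAM, 0);  /* this process is also the client */
    loopback_addr(&addr, *listen_port);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        (afd = accept(lfd, NULL, NULL)) < 0) {
        if (cfd >= 0)
            close(cfd);
        close(lfd);
        return -1;
    }
    pid = fork();
    if (pid == 0) {
        /* child: owns the accepted socket; nobody told it a port number */
        struct sockaddr_in local;
        socklen_t len = sizeof local;
        unsigned short port = 0;

        close(lfd);
        close(cfd);
        if (getsockname(afd, (struct sockaddr *)&local, &len) == 0)
            port = ntohs(local.sin_port);
        _exit(write(afd, &port, sizeof port) == (ssize_t)sizeof port ? 0 : 1);
    }
    close(afd);  /* parent: the child owns the connection now */
    if (pid > 0) {
        got = read(cfd, child_port, sizeof *child_port);
        waitpid(pid, NULL, 0);
    }
    close(cfd);
    close(lfd);
    return got == (ssize_t)sizeof *child_port ? 0 : -1;
}
```

The port the child reports matches the listening port, and the client never had to learn anything beyond the port it originally connected to.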

Linux API to list running processes?

http://procps.sourceforge.net/

http://procps.cvs.sourceforge.net/viewvc/procps/procps/proc/readproc.c?view=markup

This is the source of ps and the other process tools. They do indeed use /proc (which suggests it is probably the conventional and best way). Their source is quite readable. The file

/procps-3.2.8/proc/readproc.c

may be useful. Another useful suggestion, posted by ephemient, is to link against the API provided by libproc, which should be available in your distribution's repository (or probably already installed), but you will need the "-dev" variant for the headers and so on.
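If you just want the pids without pulling in libproc, walking /proc yourself is short. This is only a sketch of the general idea, not how procps does it internally: every directory under /proc whose name is all digits is treated as a process. The function names `list_pids` and `pid_is_listed` are invented here, and it assumes a Linux system with /proc mounted.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the pid encoded in a /proc entry name, or -1 if it is not numeric. */
static long parse_pid(const char *name)
{
    char *end;
    long pid;

    if (!isdigit((unsigned char)name[0]))
        return -1;
    pid = strtol(name, &end, 10);
    return (*end == '\0') ? pid : -1;
}

/* Store up to max pids in pids[]; return how many processes were seen
 * in total, or -1 if /proc cannot be opened. */
long list_pids(pid_t *pids, size_t max)
{
    DIR *dir = opendir("/proc");
    struct dirent *ent;
    size_t n = 0;

    assert(pids != NULL || max == 0);
    if (dir == NULL)
        return -1;
    while ((ent = readdir(dir)) != NULL) {
        long pid = parse_pid(ent->d_name);
        if (pid < 0)
            continue;  /* entries such as "meminfo" or "self" */
        if (n < max)
            pids[n] = (pid_t)pid;
        n++;
    }
    closedir(dir);
    return (long)n;
}

/* Return 1 if pid currently has an entry under /proc, 0 otherwise. */
int pid_is_listed(pid_t pid)
{
    DIR *dir = opendir("/proc");
    struct dirent *ent;
    int found = 0;

    if (dir == NULL)
        return 0;
    while (!found && (ent = readdir(dir)) != NULL)
        found = (parse_pid(ent->d_name) == (long)pid);
    closedir(dir);
    return found;
}
```

For example, `pid_is_listed(getpid())` should be true for the calling process, and `list_pids(NULL, 0)` just counts the processes.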

Good Luck

How does connect()'ing a bind()'ed socket in a forked process affect the socket in the parent?

What I am uncertain about (regarding option 1) is how connecting the listener_socket to the client affect the original listener_socket in the parent server. Will it prevent the parent from receiving further messages from other clients?

Yes.

If not, why? Don't they both refer ultimately to the same socket?

Yes, when the parent forks the child, the child receives copies of all the parent's open file descriptors, and these refer to open file descriptions managed by the kernel. Those open file descriptions are thus shared between parent and child. Therefore, if the child connect()s the socket to a client, then both parent and child will see the effect that

If the socket sockfd is of type SOCK_DGRAM, then [the specified address] is the address to which datagrams are sent by default, and the only address from which datagrams are received.

(Linux manual page; emphasis added)
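The sharing can be demonstrated with a small experiment. This is a sketch under the assumption of two UDP sockets on loopback with kernel-chosen ports; `child_connect_visible_in_parent` and `bound_udp_socket` are made-up names. The child connect()s the inherited socket and exits, and the parent then asks for the peer address on its own descriptor.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create a UDP socket bound to a kernel-chosen loopback port.
 * If out is non-NULL, the bound address is stored there. */
static int bound_udp_socket(struct sockaddr_in *out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    if (out != NULL)
        *out = addr;
    return fd;
}

/* Returns 1 if a connect() done in the child shows up on the parent's
 * descriptor, 0 if it does not, -1 on setup failure. */
int child_connect_visible_in_parent(void)
{
    struct sockaddr_in peer_addr, seen;
    socklen_t len = sizeof seen;
    int sock = bound_udp_socket(NULL);       /* the server's socket */
    int peer = bound_udp_socket(&peer_addr); /* stands in for a client */
    int before, after, result = -1;
    pid_t pid;

    if (sock < 0 || peer < 0)
        goto out;
    /* Not connected yet, so there is no peer address to report. */
    before = getpeername(sock, (struct sockaddr *)&seen, &len);
    pid = fork();
    if (pid < 0)
        goto out;
    if (pid == 0)  /* child: connect the inherited socket, then exit */
        _exit(connect(sock, (struct sockaddr *)&peer_addr,
                      sizeof peer_addr) == 0 ? 0 : 1);
    waitpid(pid, NULL, 0);
    len = sizeof seen;
    after = getpeername(sock, (struct sockaddr *)&seen, &len);
    result = (before < 0 && after == 0 &&
              seen.sin_port == peer_addr.sin_port);
out:
    if (sock >= 0)
        close(sock);
    if (peer >= 0)
        close(peer);
    return result;
}
```

The parent never calls connect() itself, yet its descriptor ends up with the peer the child chose, because both descriptors refer to the same open file description.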

As for option 2, this gives me, quite expectedly, "address already in use". Is there some functionality like in routers (i.e. longest matching prefix) that delivers datagrams to the "most fitting" socket?

What would make one socket more fitting than the other? In any case, no. A UDP endpoint is identified by address and port alone.

Regarding option 3, I think it's quite inefficient, because it implies that I should use a new port for each client.

Inefficient relative to what? Yes, your option (3) would require designating a separate port for each client, but you haven't presented any other viable alternatives.

But that's not to say that there aren't other viable alternatives. If you don't want to negotiate a separate port and open a separate socket per client (which is one of the ways that FTP can operate, for example) then you cannot rely on per-client UDP sockets. In that case, all incoming traffic to the service must go to the same socket. But you can have the process receiving messages on that socket dispatch each one to an appropriate cooperating process, based on the message's source address and port. And you should be able to have those processes all send responses via the same socket.

There are numerous ways that the details of such a system could be set up. They all do have the limitation that the one process receiving all the messages could be a bottleneck, but in practice, we're talking about I/O. It is very likely that the main bottleneck would be at the network level, not in the receiving process.
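As one possible shape for the receiving side, here is a sketch that keys clients by source address and port. The table layout, the `MAX_CLIENTS` limit, and the names `client_slot` and `dispatch_one` are all assumptions for illustration. Handing the datagram to a cooperating process (via a pipe, queue, or similar) is left out, since that part depends entirely on your design.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_CLIENTS 64

/* Known clients, identified by source address and port. */
struct client_table {
    struct sockaddr_in addr[MAX_CLIENTS];
    int count;
};

/* Return the slot for src, registering it if it has not been seen before.
 * Returns -1 when the table is full. */
int client_slot(struct client_table *t, const struct sockaddr_in *src)
{
    assert(t != NULL && src != NULL);
    for (int i = 0; i < t->count; i++)
        if (t->addr[i].sin_addr.s_addr == src->sin_addr.s_addr &&
            t->addr[i].sin_port == src->sin_port)
            return i;
    if (t->count == MAX_CLIENTS)
        return -1;
    t->addr[t->count] = *src;
    return t->count++;
}

/* Receive one datagram on the shared socket and return the slot of its
 * sender. The payload length is stored in *n. Returns -1 on error. */
int dispatch_one(int sock, struct client_table *t,
                 char *buf, size_t cap, ssize_t *n)
{
    struct sockaddr_in src;
    socklen_t len = sizeof src;

    *n = recvfrom(sock, buf, cap, 0, (struct sockaddr *)&src, &len);
    if (*n < 0)
        return -1;
    return client_slot(t, &src);
}
```

The slot index is what you would use to pick the worker for that client. A reply can go out through the same socket with sendto() and the stored address.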

Multiclient server using fork()

The main problem you have is that == has higher precedence than =, so this line:

if(pid=fork()==-1)

is assigning the result of fork() == -1 to pid, which isn't what you want: it'll always be 0 when fork() succeeds, in both the child and the parent. You need to use:

if((pid = fork()) == -1)
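To see the difference without actually forking, here is a tiny sketch. `fake_fork` is a hypothetical stand-in that behaves like a successful fork() in the parent by returning a positive pid.

```c
#include <assert.h>

/* Hypothetical stand-in for a successful fork() as seen by the parent. */
static int fake_fork(void)
{
    return 4242;
}

/* Without the extra parentheses the comparison runs first, so pid
 * receives the result of (fake_fork() == -1), which is 0 here. */
int pid_without_parens(void)
{
    int pid;

    pid = fake_fork() == -1;
    return pid;
}

/* With the parentheses the assignment runs first, so pid keeps the
 * value returned by fake_fork(). */
int pid_with_parens(void)
{
    int pid;

    if ((pid = fake_fork()) == -1)
        return -1;
    return pid;
}
```

The first version loses the pid entirely, which is why the parent and the child both ended up taking the pid == 0 branch.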

You should also close(new) in the parent after the fork() - the child owns that socket now. If you want to send the textual version of the counter, you need to use snprintf() to convert it to text. The child should also exit after it's finished - the easiest way to do that in your code is to break out of the loop. After these corrections, the inner loop in your server looks like:

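for(;;)
{
        /* len is an in/out argument: reset it before every accept() */
        len = sizeof clientaddr;
        new = accept(sockid, (struct sockaddr *)&clientaddr, &len);
        if (new == -1)
        {
            perror("accept");
            continue;
        }

        if ((pid = fork()) == -1)
        {
            /* fork failed: drop this client and keep serving */
            close(new);
            continue;
        }
        else if(pid > 0)
        {
            /* parent: the child owns the connection now */
            close(new);
            counter++;
            printf("here2\n");
            continue;
        }
        else if(pid == 0)
        {
            /* child: send the counter as text, then leave the loop */
            char buf[100];

            counter++;
            printf("here 1\n");
            snprintf(buf, sizeof buf, "hi %d", counter);
            send(new, buf, strlen(buf), 0);
            close(new);
            break;
        }
}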
