How to Open a Socket and Pass It to Another Process in Linux

Can I open a socket and pass it to another process in Linux

Yes, you can: one process sends the descriptor to another with sendmsg() and SCM_RIGHTS over a Unix domain socket. From the unix(7) man page:

SCM_RIGHTS - Send or receive a set of
open file descriptors from another
process. The data portion contains an
integer array of the file descriptors.
The passed file descriptors behave as
though they have been created with
dup(2).

http://linux.die.net/man/7/unix

That is not the typical usage though. More common is when a process inherits sockets from its parent (after a fork()). Any file handles (including sockets) not closed will be available to the child process. So the child process inherits the parent's sockets.
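As a rough illustration of that inheritance, here is a minimal sketch. The helper name read_from_forked_child is made up for this example, and a socketpair() stands in for whatever socket the parent would really hold. The child writes on a descriptor it never opened itself; it simply has it because the parent had it at fork() time.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the parent creates a socket pair, forks, and the
 * child writes "hello" on the descriptor it inherited.
 * Returns the number of bytes the parent read, or -1 on error. */
ssize_t read_from_forked_child(char *buf, size_t len)
{
    int sv[2];

    assert(buf != NULL && len > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: sv[1] is usable here without any explicit transfer. */
        static const char msg[] = "hello";
        close(sv[0]);
        ssize_t w = write(sv[1], msg, strlen(msg));
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: close the end it does not use, then read the child's message. */
    close(sv[1]);
    ssize_t n = read(sv[0], buf, len);
    close(sv[0]);

    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return n;
}
```

Calling read_from_forked_child(buf, sizeof buf) from main() should leave the child's message in buf.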

A server process that listens for connections usually runs in the background as a daemon. A classic design forks on each new connection, spawning a child process to handle each new request. An example of a typical daemon is here:

http://www.steve.org.uk/Reference/Unix/faq_8.html#SEC88

Scroll down to void process().

transfer socket between processes in Linux

The child process will inherit the file descriptor. So you have nothing to do except close the socket in the parent after you have forked the child.

If you exec another executable in the child, you may want to inform it of the file descriptor value by using a specific argument.
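A minimal sketch of that idea, under a few assumptions: the helper name exec_child_with_fd is invented here, /bin/sh stands in for "another executable", and a socketpair() stands in for the real socket. The child formats the descriptor number as text and passes it as an argument; the exec'd program then uses that number.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, then exec /bin/sh in the child and tell it
 * which descriptor to write to through a command-line argument.
 * Returns the number of bytes the parent read back, or -1 on error. */
ssize_t exec_child_with_fd(char *buf, size_t len)
{
    int sv[2];

    assert(buf != NULL && len > 0);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        char fdarg[16];
        close(sv[0]);
        snprintf(fdarg, sizeof fdarg, "%d", sv[1]);
        /* Inside the script, $1 is the descriptor number given as argument.
         * sv[1] stays open across exec because close-on-exec is not set. */
        execl("/bin/sh", "sh", "-c", "printf ping >&$1", "sh", fdarg,
              (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    close(sv[1]);
    ssize_t n = read(sv[0], buf, len);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```

A real program would more likely parse the argument with strtol() in its own main() and then call read()/write() on the resulting descriptor number.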

Share socket between unrelated processes like systemd

systemd is not unrelated to the processes that share the sockets. systemd starts up and supervises the entire system, so it can easily pass the socket file descriptors across exec(). systemd listens on behalf of the services, and whenever a connection comes in, an instance of the respective service is spawned. Here is the implementation:

int main(int argc, char **argv, char **envp) {
        int r, n;
        int epoll_fd = -1; 

        log_parse_environment();
        log_open();

        r = parse_argv(argc, argv);
        if (r <= 0)
                return r == 0 ? EXIT_SUCCESS : EXIT_FAILURE;

        r = install_chld_handler();
        if (r < 0)
                return EXIT_FAILURE;

        n = open_sockets(&epoll_fd, arg_accept);
        if (n < 0)
                return EXIT_FAILURE;
        if (n == 0) {
                log_error("No sockets to listen on specified or passed in.");
                return EXIT_FAILURE;
        }

        for (;;) {
                struct epoll_event event;

                r = epoll_wait(epoll_fd, &event, 1, -1);
                if (r < 0) {
                        if (errno == EINTR)
                                continue;

                        log_error_errno(errno, "epoll_wait() failed: %m");
                        return EXIT_FAILURE;
                }

                log_info("Communication attempt on fd %i.", event.data.fd);
                if (arg_accept) {
                        r = do_accept(argv[optind], argv + optind, envp, event.data.fd);
                        if (r < 0)
                                return EXIT_FAILURE;
                } else
                        break;
        }
        ...
}

Once a connection comes in, it will call do_accept():

static int do_accept(const char* name, char **argv, char **envp, int fd) {
        _cleanup_free_ char *local = NULL, *peer = NULL;
        _cleanup_close_ int fd_accepted = -1; 

        fd_accepted = accept4(fd, NULL, NULL, 0); 
        if (fd_accepted < 0)
                return log_error_errno(errno, "Failed to accept connection on fd:%d: %m", fd);

        getsockname_pretty(fd_accepted, &local);
        getpeername_pretty(fd_accepted, true, &peer);
        log_info("Connection from %s to %s", strna(peer), strna(local));

        return fork_and_exec_process(name, argv, envp, fd_accepted);
}

Finally, it calls execvpe(name, argv, envp), with the passed descriptor announced to the new program through envp. There is one trick: if fd_accepted is not equal to SD_LISTEN_FDS_START, it calls dup2() to make SD_LISTEN_FDS_START a copy of fd_accepted:

    if (start_fd != SD_LISTEN_FDS_START) {
            assert(n_fds == 1);

            r = dup2(start_fd, SD_LISTEN_FDS_START);
            if (r < 0)
                    return log_error_errno(errno, "Failed to dup connection: %m");

            safe_close(start_fd);
            start_fd = SD_LISTEN_FDS_START;
    }

So your application can simply use file descriptor 3, like this; sd_listen_fds() parses the LISTEN_FDS environment variable passed in through envp:

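// needs <errno.h>, <stdlib.h>, <sys/socket.h>, <unistd.h> and <systemd/sd-daemon.h>
int listen_sock;
int fd_count = sd_listen_fds(0);
if (fd_count == 1) { // assume one socket only
  listen_sock = SD_LISTEN_FDS_START; // SD_LISTEN_FDS_START is a macro defined to 3
} else {
  // error: negative error code, or an unexpected number of sockets
  return EXIT_FAILURE;
}
for (;;) {
  struct sockaddr_storage addr;
  socklen_t addrlen = sizeof(addr); // must hold the buffer size before each accept()
  int client_sock = accept(listen_sock, (struct sockaddr *) &addr, &addrlen);
  if (client_sock < 0) {
    if (errno == EINTR)
      continue; // interrupted by a signal, retry
    break; // accept() failed
  }
  // do something with client_sock
  close(client_sock);
}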
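To make the environment-variable handshake less magical, here is a simplified sketch of both sides. It is not the libsystemd implementation: count_passed_fds and announce_fds are hypothetical names, the upper bound of 1024 is an arbitrary choice for this example, and the real sd_listen_fds() does more (for instance, it also sets close-on-exec on the descriptors). The part taken from the protocol is that LISTEN_PID must match the receiving process and LISTEN_FDS holds the count of descriptors starting at 3.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define LISTEN_FDS_START 3 /* same value as SD_LISTEN_FDS_START */

/* Hypothetical launcher side: announce n descriptors for process pid.
 * A real launcher would do this in the child, right before exec. */
int announce_fds(pid_t pid, int n)
{
    char buf[32];

    assert(n >= 0);
    snprintf(buf, sizeof buf, "%ld", (long)pid);
    if (setenv("LISTEN_PID", buf, 1) < 0)
        return -1;
    snprintf(buf, sizeof buf, "%d", n);
    return setenv("LISTEN_FDS", buf, 1);
}

/* Hypothetical, simplified stand-in for sd_listen_fds(): returns the number
 * of descriptors passed (they start at LISTEN_FDS_START), 0 if none were
 * passed to this process, or -1 if the variables are malformed. */
int count_passed_fds(void)
{
    const char *pid_str = getenv("LISTEN_PID");
    const char *fds_str = getenv("LISTEN_FDS");
    char *end;

    if (pid_str == NULL || fds_str == NULL)
        return 0;

    /* LISTEN_PID guards against variables meant for some other process. */
    long pid = strtol(pid_str, &end, 10);
    if (end == pid_str || *end != '\0')
        return -1;
    if ((pid_t)pid != getpid())
        return 0;

    long n = strtol(fds_str, &end, 10);
    if (end == fds_str || *end != '\0' || n < 0 || n > 1024)
        return -1;
    return (int)n;
}
```

In production code, linking against libsystemd and calling sd_listen_fds() is the safer route; the sketch only shows what the variables mean.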

How can I pass a socket from parent to child processes

When you call fork, the child process inherits copies of all open file descriptors. The typical way of doing this is for a parent process to open a listening socket, call accept, which blocks until a connection arrives, and then call fork after receiving the connection. The parent then closes its copy of the file descriptor, while the new child process keeps using the file descriptor and does whatever processing is needed. Once the child is done, it also closes the socket.

It's important to remember two things. First, the file descriptor / socket is a resource in the operating system, and after the fork the parent and child each have a handle to that resource, which is kind of like a reference-counted smart pointer: the underlying socket is only released once every handle to it has been closed. Second, only file descriptors which are opened before calling fork are shared, because after forking, parent and child are completely separate processes, even though they may share some resources like file descriptors which existed prior to the fork.

If you're using a model where you want a parent handing out work to worker processes, it may be better to consider using threads and a thread pool.

By the way, you can download a lot of nice examples of servers and clients from the Unix Network Programming website.
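The accept-then-fork pattern described above could look roughly like this. It is a sketch under some assumptions: make_listener and accept_and_fork are names invented for the example, a Unix-domain listener is used only to keep it self-contained (a TCP listener behaves the same way for this purpose), and the child's "processing" is just writing a greeting.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening Unix-domain socket bound to path. */
int make_listener(const char *path)
{
    struct sockaddr_un addr;

    assert(strlen(path) < sizeof addr.sun_path);
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path); /* remove a stale socket file, if any */

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept one connection and hand it to a forked child.
 * Returns the child's pid in the parent, or -1 on error. */
pid_t accept_and_fork(int listen_fd)
{
    int conn = accept(listen_fd, NULL, NULL);
    if (conn < 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: owns the connection; the listening socket is not needed. */
        static const char reply[] = "hello\n";
        close(listen_fd);
        ssize_t w = write(conn, reply, sizeof reply - 1);
        close(conn);
        _exit(w < 0 ? 1 : 0);
    }

    /* Parent (also on fork failure, where pid is -1): drop its copy.
     * The connection stays open while the child still holds a descriptor. */
    close(conn);
    return pid;
}
```

A real server would call accept_and_fork() in a loop and reap finished children, for example with waitpid() or a SIGCHLD handler.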

Passing socket descriptor between two processes using shared memory

No, you can't just use some alternate method to transfer the same "stuff" that would have gone into the sendmsg call. When you "pass a file descriptor", what you're really transferring is access to the kernel-internal file object.

The cmsg structure is just a way of formatting a request to the kernel, in which you say "I want to duplicate this open file object, and allow the process that reads this socket to gain access to it". The name SCM_RIGHTS is a clue that what you're transferring is in essence a permission.

Since the request is for manipulation of a kernel-internal object with security implications, you can't sneak around it. You have to make a syscall. And sendmsg is it. (There have been other fd-passing APIs... something with Streams on SysV I think. I don't know if that one is still alive in any recent OSes. For BSD and Linux at least, sendmsg with SCM_RIGHTS is the way to go.)

In general, this is exactly the difference between msg and cmsg: cmsg is used for operations where the kernel is doing more than just copying some bytes from one end of the socket to the other.

how to pass a fd to another process?

Open file descriptors are inherited when using fork. There is nothing you need to do.

From the fork man page:

          The child inherits copies of the parent's set of open file
          descriptors.  Each file descriptor in the child refers to the same
          open file description (see open(2)) as the corresponding file
          descriptor in the parent.  This means that the two file
          descriptors share open file status flags, file offset, and signal-
          driven I/O attributes (see the description of F_SETOWN and
          F_SETSIG in fcntl(2)).

As for exec, that still holds true (as long as you didn't mark the fd as close-on-exec). From the execve man page (all exec* calls are just wrappers around this system call):

By default, file descriptors remain open across an execve(). File descriptors that are marked close-on-exec are closed; see the description of FD_CLOEXEC in fcntl(2).
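If you want to control this per descriptor, the flag can be read and changed with fcntl(). A small sketch follows; the helper names set_keep_across_exec and survives_exec are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make fd survive execve() (keep != 0) or have the
 * kernel close it on execve() (keep == 0).
 * Returns 0 on success, -1 on error. */
int set_keep_across_exec(int fd, int keep)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    if (keep)
        flags &= ~FD_CLOEXEC;
    else
        flags |= FD_CLOEXEC;
    return fcntl(fd, F_SETFD, flags) < 0 ? -1 : 0;
}

/* Returns 1 if fd would stay open across execve(), 0 if not, -1 on error. */
int survives_exec(int fd)
{
    assert(fd >= 0);

    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 0 : 1;
}
```

A parent that wants to hand exactly one socket to an exec'd child could set close-on-exec on everything else and clear it on that one socket just before exec.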

Can I share a file descriptor to another process on linux or are they local to the process?

You can pass a file descriptor to another process over Unix domain sockets.
Here's the code to pass such a file descriptor, taken from the book Unix Network Programming:

ssize_t
write_fd(int fd, void *ptr, size_t nbytes, int sendfd)
{
    struct msghdr   msg;
    struct iovec    iov[1];

#ifdef  HAVE_MSGHDR_MSG_CONTROL
    union {
      struct cmsghdr    cm;
      char              control[CMSG_SPACE(sizeof(int))];
    } control_un;
    struct cmsghdr  *cmptr;

    msg.msg_control = control_un.control;
    msg.msg_controllen = sizeof(control_un.control);

    cmptr = CMSG_FIRSTHDR(&msg);
    cmptr->cmsg_len = CMSG_LEN(sizeof(int));
    cmptr->cmsg_level = SOL_SOCKET;
    cmptr->cmsg_type = SCM_RIGHTS;
    *((int *) CMSG_DATA(cmptr)) = sendfd;
#else
    msg.msg_accrights = (caddr_t) &sendfd;
    msg.msg_accrightslen = sizeof(int);
#endif

    msg.msg_name = NULL;
    msg.msg_namelen = 0;

    iov[0].iov_base = ptr;
    iov[0].iov_len = nbytes;
    msg.msg_iov = iov;
    msg.msg_iovlen = 1;

    return(sendmsg(fd, &msg, 0));
}
/* end write_fd */

And here's the code to receive the file descriptor

ssize_t
read_fd(int fd, void *ptr, size_t nbytes, int *recvfd)
{
    struct msghdr   msg;
    struct iovec    iov[1];
    ssize_t         n;
    int             newfd;

#ifdef  HAVE_MSGHDR_MSG_CONTROL
    union {
      struct cmsghdr    cm;
      char              control[CMSG_SPACE(sizeof(int))];
    } control_un;
    struct cmsghdr  *cmptr;

    msg.msg_control = control_un.control;
    msg.msg_controllen = sizeof(control_un.control);
#else
    msg.msg_accrights = (caddr_t) &newfd;
    msg.msg_accrightslen = sizeof(int);
#endif

    msg.msg_name = NULL;
    msg.msg_namelen = 0;

    iov[0].iov_base = ptr;
    iov[0].iov_len = nbytes;
    msg.msg_iov = iov;
    msg.msg_iovlen = 1;

    if ( (n = recvmsg(fd, &msg, 0)) <= 0)
        return(n);

#ifdef  HAVE_MSGHDR_MSG_CONTROL
    if ( (cmptr = CMSG_FIRSTHDR(&msg)) != NULL &&
        cmptr->cmsg_len == CMSG_LEN(sizeof(int))) {
        if (cmptr->cmsg_level != SOL_SOCKET)
            err_quit("control level != SOL_SOCKET");
        if (cmptr->cmsg_type != SCM_RIGHTS)
            err_quit("control type != SCM_RIGHTS");
        *recvfd = *((int *) CMSG_DATA(cmptr));
    } else
        *recvfd = -1;       /* descriptor was not passed */
#else
/* *INDENT-OFF* */
    if (msg.msg_accrightslen == sizeof(int))
        *recvfd = newfd;
    else
        *recvfd = -1;       /* descriptor was not passed */
/* *INDENT-ON* */
#endif

    return(n);
}
/* end read_fd */
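The listings above depend on the book's build setup (the HAVE_MSGHDR_MSG_CONTROL macro and the err_quit helper). For a quick experiment on Linux, a trimmed-down sketch of the same technique might look like this. send_fd and recv_fd are names chosen for this example, not part of any library, and one dummy data byte is sent along because a stream socket needs at least one byte of real data to carry the ancillary message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor, aligned for cmsghdr. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd_to_send over the Unix-domain socket sock.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd_to_send)
{
    char byte = 'F'; /* dummy payload */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    assert(fd_to_send >= 0);
    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock.
 * Returns the new descriptor, or -1 if none arrived or on error. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS || cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    /* The number stored here is a fresh descriptor in the receiver. */
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The descriptor number returned by recv_fd() will usually differ from the one the sender used; what is shared is the underlying open file, not the number.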