Fork() and Pipes() in C

A pipe is a mechanism for interprocess communication: data written to the pipe by one process can be read by another process. The primitive for creating a pipe is the pipe function, which creates both the reading and writing ends of the pipe. It is not very useful for a single process to use a pipe to talk to itself. In typical use, a process creates a pipe just before it forks one or more child processes. The pipe is then used for communication either between the parent and child processes, or between two sibling processes.

A familiar example of this kind of communication can be seen in Unix shells. When you type a command at the shell, it spawns the executable represented by that command with a call to fork. When commands are chained with |, the shell opens a pipe between the child processes so that the output of one becomes the input of the next.

This page has a full example of the fork and pipe functions. For your convenience, the code is reproduced below:

 #include <sys/types.h>
 #include <unistd.h>
 #include <stdio.h>
 #include <stdlib.h>

 /* Read characters from the pipe and echo them to stdout. */

 void
 read_from_pipe (int file)
 {
   FILE *stream;
   int c;
   stream = fdopen (file, "r");
   while ((c = fgetc (stream)) != EOF)
     putchar (c);
   fclose (stream);
 }

 /* Write some random text to the pipe. */

 void
 write_to_pipe (int file)
 {
   FILE *stream;
   stream = fdopen (file, "w");
   fprintf (stream, "hello, world!\n");
   fprintf (stream, "goodbye, world!\n");
   fclose (stream);
 }

 int
 main (void)
 {
   pid_t pid;
   int mypipe[2];

   /* Create the pipe. */
   if (pipe (mypipe))
     {
       fprintf (stderr, "Pipe failed.\n");
       return EXIT_FAILURE;
     }

   /* Create the child process. */
   pid = fork ();
   if (pid == (pid_t) 0)
     {
       /* This is the child process.
          Close other end first. */
       close (mypipe[1]);
       read_from_pipe (mypipe[0]);
       return EXIT_SUCCESS;
     }
   else if (pid < (pid_t) 0)
     {
       /* The fork failed. */
       fprintf (stderr, "Fork failed.\n");
       return EXIT_FAILURE;
     }
   else
     {
       /* This is the parent process.
          Close other end first. */
       close (mypipe[0]);
       write_to_pipe (mypipe[1]);
       return EXIT_SUCCESS;
     }
 }

Just like other C functions, you can use both fork and pipe in C++.

Fork wait and pipe in C

First of all, it's better to start all the child processes first and then wait for all of them, instead of waiting for each one sequentially.

In addition, the child processes should exit immediately and not keep running the forked code.

Thirdly, you must wait for all children after the loop, not only for the first one that terminates:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main() {
  for (int i=0; i<3; i++) {
    pid_t child = fork();
    if (child > 0) {
      printf("Child %d created\n", child);
    }
    else if (child == 0) {
      printf("In child %d. Bye bye\n", i);
      return 0; // exit the child process
    }
  }

  while (wait(NULL) > 0); // wait for all child processes

  printf("Parent terminated\n");
  return 0;
}

EDIT:

The code above is just an improvement to the example given in the question. To pass information from the child processes to the parent, create a pipe (using pipe()) before forking; the write-end file descriptor is then accessible from the child processes.

Here's a good example to do so.
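Independently of the linked example, here is a minimal sketch of that idea. The function name collect_from_children is hypothetical, and the "information" is assumed to be just each child's loop index; treat it as an illustration rather than the one right way to do it.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks nkids children that each write their index
   (one int) to a shared pipe. The parent reads until EOF and returns the
   sum of the values, or -1 on error. */
int collect_from_children(int nkids)
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    int failed = 0;
    for (int i = 0; i < nkids && !failed; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            failed = 1;
        } else if (pid == 0) {
            /* Child: only writes, so close the read end, send, and exit. */
            close(fd[0]);
            ssize_t n = write(fd[1], &i, sizeof i);
            _exit(n == (ssize_t)sizeof i ? 0 : 1);
        }
    }

    /* Parent: close its own write end, otherwise read() never sees EOF. */
    close(fd[1]);

    int sum = 0;
    int value;
    ssize_t n;
    while ((n = read(fd[0], &value, sizeof value)) == (ssize_t)sizeof value)
        sum += value;
    close(fd[0]);

    while (wait(NULL) > 0)
        ; /* reap every child */

    return (failed || n != 0) ? -1 : sum;
}
```

With three children, the values 0, 1 and 2 arrive in some order, so collect_from_children(3) returns 3. The children call _exit() rather than exit() so that they do not flush stdio buffers inherited from the parent.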

pipe() and fork() in c

Your parent process waits for the sort process to finish before creating the ls process.

The sort process needs to read its input before it can finish. And its input is coming from the ls that won't be started until after the wait. Deadlock.

You need to create both processes, then wait for both of them.

Also, your file descriptor manipulations aren't quite right. In this pair of calls:

close(0);
dup2(fd[0], 0);

the close is redundant, since dup2 will automatically close the existing fd 0 if there is one. You should do a close(fd[0]) after the dup2, so you only have one file descriptor tied to that end of the pipe. And if you want to be really robust, you should test whether fd[0]==0 already, and in that case skip the dup2 and close.

Apply all of that to the other dup2 also.
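That advice can be wrapped in a small helper. This is only a sketch; move_fd is a name I made up, not something from your code or from the standard library.

```c
#include <unistd.h>

/* Hypothetical helper: make descriptor `to` refer to whatever `from`
   refers to, then close `from` so only one descriptor stays tied to it.
   If they are already the same descriptor, there is nothing to do.
   Returns 0 on success, -1 on error. */
int move_fd(int from, int to)
{
    if (from == to)
        return 0;               /* skip both the dup2 and the close */
    if (dup2(from, to) == -1)   /* silently closes `to` if it was open */
        return -1;
    return close(from);
}
```

In the sort child this would be move_fd(fd[0], 0), and in the ls child move_fd(fd[1], 1).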

Then there's the issue of the parent process holding the pipe open. I'd say you should close both ends of the pipe in the parent after you've passed them on to the children, but you have that weird read from fd[0] after the last wait... I'm not sure why that's there. If the ls|sort pipeline has run correctly, the pipe will be empty afterward, so there will be nothing to read. In any case, you definitely need to close fd[1] in the parent, otherwise the sort process won't finish because the pipe won't indicate EOF until all writers are closed.

After the weird read is a printf that will probably crash, since the read buffer won't be '\0'-terminated.

And the point of using execlp is that it does the $PATH lookup for you so you don't have to specify /bin/. My first test run failed because my sort is in /usr/bin/. Why hardcode paths when you don't have to?
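Putting those points together, a two-command pipeline could look roughly like the sketch below. The function name run_pair and its interface are my own assumptions; it uses execvp, which does the same $PATH lookup as execlp but takes an argument array.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `left | right` and returns the exit status of
   `right`, or -1 if that could not be determined. */
int run_pair(char *const left[], char *const right[])
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == 0) {
        /* Left command: stdout becomes the write end of the pipe. */
        close(fd[0]);
        if (fd[1] != STDOUT_FILENO) {
            dup2(fd[1], STDOUT_FILENO);
            close(fd[1]);
        }
        execvp(left[0], left);
        _exit(127); /* only reached if exec failed */
    }

    /* Create the second child before waiting for anything. */
    pid_t rpid = (lpid > 0) ? fork() : -1;
    if (rpid == 0) {
        /* Right command: stdin becomes the read end of the pipe. */
        close(fd[1]);
        if (fd[0] != STDIN_FILENO) {
            dup2(fd[0], STDIN_FILENO);
            close(fd[0]);
        }
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent: close both ends, or the reader never sees EOF. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    int result = -1;
    if (lpid > 0)
        waitpid(lpid, NULL, 0);
    if (rpid > 0 && waitpid(rpid, &status, 0) == rpid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    return result;
}
```

For the ls|sort case you would pass two arrays such as { "ls", NULL } and { "sort", NULL }. Both children exist before the parent waits, so sort can read while ls is still writing.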

Forking and piping processes in c

This is a diagram I drew for myself showing how the processes are to be interconnected:

                  p4
           C5 <--------- C4
          /               \
     p5  /              p3 \
        /                   \
o----> C0 ---->o            C3
        \                   /
     p0  \              p2 /
          \               /
           C1 ---------> C2
                  p1

The Cn represent the processes; C0 is the parent process. The pn represent the pipes; the other two lines are standard input and standard output. Each child has a simple task, as befits children. The parent has a more complex task, mainly ensuring that exactly the right number of file descriptors are closed. In fact, the close() is so important that I created a debugging function, fd_close(), to conditionally report on file descriptors being closed. I used that too when I had silly mistakes in the code.

The err_*() functions are simplified versions of code I use in most of my programs. They make error reporting less onerous by converting most error reports into a one-line statement, rather than requiring multiple lines. (These functions are normally in 'stderr.c' and 'stderr.h', but those files are 750 lines of code and comment and are more comprehensive. The production code has an option to support prefixing each message with a PID, which is also important with multi-process systems like this.)

#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

enum { BUFFER_SIZE = 1024 };

typedef int Pipe[2];

static int debug = 0;
static void fd_close(int fd);

/* These functions normally declared in stderr.h */
static void err_setarg0(const char *argv0);
static void err_sysexit(char const *fmt, ...);
static void err_usage(char const *usestr);
static void err_remark(char const *fmt, ...);

static void be_childish(Pipe in, Pipe out)
{
    /* Close irrelevant ends of relevant pipes */
    fd_close(in[1]);
    fd_close(out[0]);
    char buffer[BUFFER_SIZE];
    ssize_t nbytes;
    while ((nbytes = read(in[0], buffer, sizeof(buffer))) > 0)
    {
        buffer[0]++;
        if (write(out[1], buffer, nbytes) != nbytes)
            err_sysexit("%d: failed to write to pipe", (int)getpid());
    }
    fd_close(in[0]);
    fd_close(out[1]);
    exit(0);
}

int main(int argc, char **argv)
{
    err_setarg0(argv[0]);

    int nkids;
    if (argc != 2 || (nkids = atoi(argv[1])) <= 1 || nkids >= 10)
        err_usage("n   # for n in 2..9");

    err_remark("Parent  has PID %d\n", (int)getpid());

    Pipe pipelist[nkids];
    if (pipe(pipelist[0]) != 0)
        err_sysexit("Failed to create pipe #%d", 0);
    if (debug)
        err_remark("p[0][0] = %d; p[0][1] = %d\n", pipelist[0][0], pipelist[0][1]);

    for (int i = 1; i < nkids; i++)
    {
        pid_t pid;
        if (pipe(pipelist[i]) != 0)
            err_sysexit("Failed to create pipe #%d", i);
        if (debug)
            err_remark("p[%d][0] = %d; p[%d][1] = %d\n", i, pipelist[i][0], i, pipelist[i][1]);
        if ((pid = fork()) < 0)
            err_sysexit("Failed to create child #%d", i);
        if (pid == 0)
        {
            /* Close irrelevant pipes */
            for (int j = 0; j < i-1; j++)
            {
                fd_close(pipelist[j][0]);
                fd_close(pipelist[j][1]);
            }
            be_childish(pipelist[i-1], pipelist[i]);
            /* NOTREACHED */
        }
        err_remark("Child %d has PID %d\n", i, (int)pid);
    }

    /* Close irrelevant pipes */
    for (int j = 1; j < nkids-1; j++)
    {
        fd_close(pipelist[j][0]);
        fd_close(pipelist[j][1]);
    }

    /* Close irrelevant ends of relevant pipes */
    fd_close(pipelist[0][0]);
    fd_close(pipelist[nkids-1][1]);

    int w_fd = pipelist[0][1];
    int r_fd = pipelist[nkids-1][0];

    /* Main loop */
    char buffer[BUFFER_SIZE];

    while (printf("Input:  ") > 0 && fgets(buffer, sizeof(buffer), stdin) != 0)
    {
        int len = strlen(buffer);
        if (write(w_fd, buffer, len) != len)
            err_sysexit("Failed to write to children");
        if (read(r_fd, buffer, len) != len)
            err_sysexit("Failed to read from children");
        printf("Output: %.*s", len, buffer);
    }
    fd_close(w_fd);
    fd_close(r_fd);
    putchar('\n');

    int status;
    int corpse;
    while ((corpse = wait(&status)) > 0)
        err_remark("%d exited with status 0x%.4X\n", corpse, status);

    return 0;
}

static void fd_close(int fd)
{
    if (debug)
        err_remark("%d: close(%d)\n", (int)getpid(), fd);
    if (close(fd) != 0)
        err_sysexit("%d: Failed to close %d", (int)getpid(), fd);
}

/* Normally in stderr.c */
static const char *arg0 = "<undefined>";

static void err_setarg0(const char *argv0)
{
    arg0 = argv0;
}

static void err_usage(char const *usestr)
{
    fprintf(stderr, "Usage: %s %s\n", arg0, usestr);
    exit(1);
}

static void err_vsyswarn(char const *fmt, va_list args)
{
    int errnum = errno;
    fprintf(stderr, "%s:%d: ", arg0, (int)getpid());
    vfprintf(stderr, fmt, args);
    if (errnum != 0)
        fprintf(stderr, " (%d: %s)", errnum, strerror(errnum));
    putc('\n', stderr);
}

static void err_sysexit(char const *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    err_vsyswarn(fmt, args);
    va_end(args);
    exit(1);
}

static void err_remark(char const *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
}

Example output:

$  ./pipecircle 9
Parent  has PID 34473
Child 1 has PID 34474
Child 2 has PID 34475
Child 3 has PID 34476
Child 4 has PID 34477
Child 5 has PID 34478
Child 6 has PID 34479
Child 7 has PID 34480
Child 8 has PID 34481
Input:  Hello
Output: Pello
Input:  Bye
Output: Jye
Input:  ^D
34474 exited with status 0x0000
34477 exited with status 0x0000
34479 exited with status 0x0000
34476 exited with status 0x0000
34475 exited with status 0x0000
34478 exited with status 0x0000
34480 exited with status 0x0000
34481 exited with status 0x0000
$

Pipe() and fork()

Your program is a bit strange. The main problem seems to be that the second fork is executed in both the main program and the first child. In fact, you are running four processes: main, two children of main, and a child of the first child. This is probably not what you want. You probably wanted to put the first switch immediately after the first fork and to execute the second fork only in the main program.

And, of course, you are not checking the return values of read and write for unexpected situations.
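As an illustration of what checking those return values can involve, here is a sketch of two wrappers that handle errors, short transfers and EINTR. The names write_all and read_full are hypothetical, not standard functions.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: writes all len bytes, retrying after short writes
   and EINTR. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: reads until len bytes have arrived or EOF is hit.
   Returns the number of bytes read (less than len only at EOF), or -1. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break; /* EOF: every write end has been closed */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

A return of 0 from read is not an error; on a pipe it means all writers have closed their end.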

using a fork and pipe to mimic linux pipe command

ORIGINAL

Since you want to replace the stdin of the child process, you need to use the dup2() function.

Here is the manual section that explains why the dup() function will never work for your purposes:

The dup() system call creates a copy of the file descriptor oldfd,
using the lowest-numbered unused file descriptor for the new
descriptor.

Here is the manual section that explains why the dup2() function can solve your problem:

The dup2() system call performs the same task as dup(), but instead
of using the lowest-numbered unused file descriptor, it uses the file
descriptor number specified in newfd.

To solve your problem, replace the dup(rd) call with dup2(rd, STDIN_FILENO). You may also remove the close(0) call, since dup2() closes newfd if it is already in use.

If the file descriptor newfd was previously open, it is silently
closed before being reused.

EDIT #1

What I previously wrote does not fix the problem, since close(0); dup(rd); will have the same effect as dup2(rd, 0), as this user mentioned below. So, I compiled your code as it is and, after running it, I got this result:

$ gcc -std=c99 -o program program.c
$ ./program ls : sort
two args
argA: ls  
argB: sort  
18
$

As you can see, the last line shows 18, the result of 2*1*9*1.
Now, notice that the parent process exits right after it writes to the file descriptor wt, the write end of the pipe that feeds the new stdin of the bc command being executed in the child process. This means that the parent process may exit before the child process is done. I highly recommend that you test your code with a wait() or waitpid() call right before the parent process exits. For example:

// (...)

if (fork()) {
    close(rd);
    write(wt, "2*1*9*1", strlen("2*1*9*1"));
    write(wt, "\n", 1);
    close(wt);
    wait(NULL);
    exit(0);
} else {
    close(wt);
    close(0); // close zero
    dup(rd); // dup rd into lowest possible orbit
    close(rd);
    execlp("bc", "bc", NULL);
    exit(1);
}

I also replaced the line execlp("bc", "bc", 0, NULL); with the line execlp("bc", "bc", NULL);. The zero I removed is equivalent to NULL and means the end of the argument list for the command being executed with execlp().

EDIT #2 (IMPLEMENTATION)

Reading the entire code, we can divide your implementation into two parts:

1. Parsing the program's arguments to fit the syntax of the execlp() function;
2. Forking the process to execute the second command with the result of the first command as input.

If you read the man pages of the exec() function family, you will notice that execvp() is much more useful in this program, since its second argument has the same type as the program's arguments: a NULL-terminated array of strings.
By following these steps, you can easily parse the program's arguments to fit execvp():

1. Iterate through the program's arguments;
2. Find the position of the pipe symbol;
3. In that position, put NULL to signal the end of the first command's arguments;
4. Save the address of the next position as the start of the second command's arguments.

After parsing the program's arguments, it is time to create a pipe and fork the process. In the child process, replace the stdout with the write-end of the pipe before executing the first command. In the parent process, replace the stdin with the read-end of the pipe before executing the second command.

Here is the entire code I wrote, ran and tested:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>

#define PIPE_SYMBOL ":"

int main ( int argc , char **argv ) {
    /* Validate the usage: at least the program's name, two commands and the pipe symbol are needed */
    if ( argc < 4 ) {
        fprintf(stderr, "usage: command-1 [args-1...] : command-2 [args-2...]\n");
        return EXIT_FAILURE;
    }

    /* The first command always starts at the first program argument */
    char **command1 = &argv[1];

    /* The start of the second command is not known yet, since it depends on where the pipe symbol is located */
    char **command2 = NULL;

    /* Finds the position of the pipe symbol */
    for ( int i = 0 ; argv[i] != NULL ; i++ ) {
        /* When found, ... */
        if ( strcmp(PIPE_SYMBOL, argv[i]) == 0 ) {
            /* ... replace it with NULL, so the first command array is NULL-terminated and... */
            argv[i] = NULL;
            /* ... the next position is the start of the second command */
            command2 = &argv[i+1];
            break;
        }
    }

    /* If the pipe symbol is missing or if there is no command after the pipe symbol, bad usage */
    if ( command2 == NULL || command2[0] == NULL ) {
        fprintf(stderr, "usage: command-1 [args-1...] : command-2 [args-2...]\n");
        return EXIT_FAILURE;
    }

    pid_t pid;
    int pipefd[2];

    if ( pipe(pipefd) == -1 ) {
        perror("creating pipe");
        return EXIT_FAILURE;
    }

    if ( (pid = fork()) == -1 ) {
        perror("creating child process");
        return EXIT_FAILURE;
    }
    
    /* Child process executes the first command */
    if ( pid == 0 ) {
        close(pipefd[0]);
        close(STDOUT_FILENO);
        dup(pipefd[1]);
        close(pipefd[1]);
        execvp(command1[0], command1);
        perror("executing first command");
        return EXIT_FAILURE;
    }

    /* Parent process executes the second command */
    close(pipefd[1]);
    close(STDIN_FILENO);
    dup(pipefd[0]);
    close(pipefd[0]);
    execvp(command2[0], command2);
    perror("executing second command");
    return EXIT_FAILURE;
}

C: pipe() and fork()

No. There is only ever one pipe after you call pipe() once. However, fork() copies everything, including the file descriptor table, to the child process, so the parent and the child both have access to the same two ends of that one pipe. That's why you should close the unused end in the parent and in the child: if you are writing in the parent and reading in the child, you should close fd[0] in the parent and fd[1] in the child, because data in a pipe flows in only one direction. Think of it as a real pipe: what would happen if you poured water into both ends?
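One way to see why the leftover copies matter is to check when a reader actually gets EOF. The sketch below is only an illustration: pipe_at_eof is a hypothetical helper, it assumes the pipe is currently empty, and it temporarily switches the descriptor to non-blocking mode so the check cannot hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper for an EMPTY pipe: returns 1 if a read reports EOF
   (no write end is open anywhere), 0 if the read would block (some write
   end is still open), and -1 on any other outcome. */
int pipe_at_eof(int read_fd)
{
    int flags = fcntl(read_fd, F_GETFL);
    if (flags == -1 || fcntl(read_fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    char c;
    ssize_t n = read(read_fd, &c, 1);
    int saved_errno = errno;

    fcntl(read_fd, F_SETFL, flags); /* restore the original mode */

    if (n == 0)
        return 1;
    if (n == -1 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

If you duplicate the write end with dup() (standing in for the copy a forked process would hold), pipe_at_eof keeps returning 0 after the first copy is closed and only returns 1 once the last copy is closed. That is exactly why a forgotten write end in the parent or child keeps a reader waiting forever.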
