The Difference Between Fork(), Vfork(), Exec() and Clone()

vfork() is an obsolete optimization. Before modern virtual memory management, fork() made a full copy of the parent's memory, so it was pretty expensive. Since in many cases a fork() was followed by exec(), which discards the current memory map and creates a new one, that copy was a needless expense. Nowadays, fork() doesn't copy the memory; the pages are simply marked "copy on write", so fork()+exec() is nearly as efficient as vfork()+exec().
On Linux, clone() is the syscall used by fork(). With some flags it creates a new process; with others, it creates a thread. The difference between them is just which data structures (memory space, processor state, stack, PID, open files, etc.) are shared and which are not.
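
To make the "which data structures are shared" point concrete, here is a minimal Linux-only sketch using the glibc clone() wrapper. The names counter_after_clone, child_fn and STACK_SIZE are made up for this illustration. With CLONE_VM the child writes into the parent's memory, thread-style; without it the child gets its own copy, fork-style.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

static volatile int counter;

/* Runs in the child created by clone(): change the global and exit. */
static int child_fn(void *arg)
{
    (void)arg;
    counter = 1;
    return 0;
}

/*
 * Hypothetical helper: clone a child with the given extra flags, wait
 * for it, and return the value of counter that the parent sees
 * afterwards. Returns -1 on error.
 */
int counter_after_clone(int extra_flags)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    counter = 0;
    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + STACK_SIZE,
                      extra_flags | SIGCHLD, NULL);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return counter;
}
```

Calling counter_after_clone(CLONE_VM) should return 1, because the child shared the parent's address space, while counter_after_clone(0) should return 0, because the child only changed its own copy.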

What is the difference between fork() and vfork()?

The intent of vfork was to eliminate the overhead of copying the whole process image if you only want to do an exec* in the child. Because exec* replaces the whole image of the child process, there is no point in copying the image of the parent.

if ((pid = vfork()) == 0) {
  execl(..., NULL); /* after a successful execl the parent should be resumed */
  _exit(127); /* terminate the child in case execl fails */
}

For other kinds of uses, vfork is dangerous and unpredictable.

With most current kernels, however, including Linux, the primary benefit of vfork has disappeared because of the way fork is implemented. Rather than copying the whole image when fork is executed, copy-on-write techniques are used.
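
Copy-on-write is invisible to the program apart from its cost, but its effect is easy to observe. The following is a small sketch, with a made-up helper name parent_value_after_fork: the child's write lands in a private copy of the page, so the parent keeps seeing the old value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value;

/*
 * Hypothetical helper: fork, let the child overwrite value, and return
 * what the parent still sees afterwards. Returns -1 on error.
 */
int parent_value_after_fork(void)
{
    value = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* This write only touches the child's private copy of the page. */
        value = 2;
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) == -1)
        return -1;

    return value;   /* still 1 in the parent */
}
```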

Can a fork child determine whether it is a fork or a vfork?

A simple solution could use pthread_atfork(). The handlers registered with this service are triggered only upon fork(), not upon vfork(). So the handler passed as the 3rd parameter, which is called in the child process right after the fork, could update a global variable. The child can check the variable, and if it has been modified, then the child has been forked:

/*
  Simple program which demonstrates a solution to
  make the child process know if it has been forked or vforked
*/
#include <pthread.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

pid_t forked;

void child_hdl(void)
{
  forked = getpid();
}


int main(void)
{
  pid_t pid;

  pthread_atfork(0, 0, child_hdl);

  pid = fork();
  if (pid == 0) {
    if (forked != 0) {
      printf("1. It is a fork()\n");
    }
    exit(0);
  }

  // Parent continues here
  wait(NULL);

  pid = vfork();
  if (pid == 0) {
    if (forked != 0) {
      printf("2. It is a fork()\n");
    }
    _exit(0);
  }

  // Parent continues here
  wait(NULL);

  return 0;
}

Build/execution:

$ gcc fork_or_vfork.c
$ ./a.out
1. It is a fork()

File descriptor table for vfork vs. fork

The child gets its own copy of the file descriptor table because it's common to redirect stdin and/or stdout before calling exec in the child process. If parent and child shared the same file descriptor table, the redirection would also modify the parent process's I/O.
You shouldn't modify any variables in the child process after vfork(), because the child may be running in the parent's address space. vfork() should only be used if you're going to immediately call an exec function.

Note that vfork() is obsolete on modern operating systems. Instead of copying the address space, fork() now uses copy-on-write.

For more information, see "What is the difference between fork() and vfork()?" above.
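
The redirection case can be sketched as follows. The helper name parent_stdout_unchanged is made up for this example, and it assumes that /dev/null exists and that an echo program is on the PATH. The child points its own stdout at /dev/null before exec, and the parent then checks with fstat() that its stdout still refers to the same file as before.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the parent's stdout still refers to
 * the same file after the child redirected its own stdout, 0 if it
 * changed, and -1 on error.
 */
int parent_stdout_unchanged(void)
{
    struct stat before, after;

    if (fstat(STDOUT_FILENO, &before) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Redirect only the child's descriptor 1, then exec. */
        int fd = open("/dev/null", O_WRONLY);
        if (fd == -1 || dup2(fd, STDOUT_FILENO) == -1)
            _exit(126);
        execlp("echo", "echo", "discarded", (char *)NULL);
        _exit(127);     /* only reached if exec failed */
    }

    if (waitpid(pid, NULL, 0) == -1)
        return -1;
    if (fstat(STDOUT_FILENO, &after) == -1)
        return -1;

    return before.st_dev == after.st_dev && before.st_ino == after.st_ino;
}
```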

Clone, fork, vfork behaviour when followed by an exec

The shared objects are unmapped or detached, but only from the perspective of the process that calls exec.

Say you have 3 processes/threads, all of them sharing memory starting at 0x1000.

One of them does an execve(2). Its own mapping at 0x1000 is then dropped: the execve(2) manpage states that memory mappings are not preserved and that POSIX shared memory regions are unmapped.
The shared object itself is not destroyed at that point.

Now, for that memory range there is a counter of the processes/threads using it. In our case the counter is 3 before the execve(2) and 2 after it. No memory is lost.

The memory will be 'destroyed', as you put it, when no process is using it anymore, that is, when the counter reaches 0.

The same goes for all shared objects. For a list of what is released on exec and how, have a look at the execve(2) manpage and the pages it links to. Search for this phrase:

All process attributes are preserved during an execve(), except the following

If I have a process, and I clone it, is the PID the same?

Not quite. If you clone a process via fork/exec or vfork/exec, you will get a new process ID. fork() gives you a new process with a new process ID, and exec() replaces the program running in that process while keeping the same process ID.
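
One way to check this, sketched under the assumption that /bin/sh is available (the helper name exec_keeps_pid is made up): the child execs a shell that prints its own PID via $$, and the parent compares that number with the PID that fork() returned.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the program started by exec reports
 * the same PID that fork() gave the parent, 0 if not, -1 on error.
 */
int exec_keeps_pid(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Send the shell's output to the pipe, then replace the image. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", "echo $$", (char *)NULL);
        _exit(127);     /* only reached if exec failed */
    }

    close(fds[1]);
    char buf[32] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);

    if (waitpid(pid, NULL, 0) == -1 || n <= 0)
        return -1;

    /* The PID printed after exec should match the one fork() returned. */
    return strtol(buf, NULL, 10) == (long)pid;
}
```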

From here:

The vfork() function differs from fork() only in that the child process can share code and data with the calling process (parent process). This speeds cloning activity significantly at a risk to the integrity of the parent process if vfork() is misused.
