Time waste of execv() and fork()
What advantage does this combination offer (over some other solution) that makes people still use it despite the apparent waste?
You have to create a new process somehow, and there are very few ways for a userspace program to accomplish that. POSIX used to have vfork() alongside fork(), and some systems have their own mechanisms, such as the Linux-specific clone(), but since 2008 POSIX specifies only fork() and the posix_spawn() family. The fork + exec route is more traditional, is well understood, and has few drawbacks (see below). The posix_spawn family is designed as a special-purpose substitute for use in contexts that present difficulties for fork(); you can find details in the "Rationale" section of its specification.
This excerpt from the Linux man page for vfork() may be illuminating:

Under Linux, fork(2) is implemented using copy-on-write pages, so the only penalty incurred by fork(2) is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child. However, in the bad old days a fork(2) would require making a complete copy of the caller's data space, often needlessly, since usually immediately afterwards an exec(3) is done. Thus, for greater efficiency, BSD introduced the vfork() system call, which did not fully copy the address space of the parent process, but borrowed the parent's memory and thread of control until a call to execve(2) or an exit occurred. The parent process was suspended while the child was using its resources. The use of vfork() was tricky: for example, not modifying data in the parent process depended on knowing which variables are held in a register.

(Emphasis added)
Thus, your concern about waste is not well-founded for modern systems (not limited to Linux), but it was indeed an issue historically, and there were indeed mechanisms designed to avoid it. These days, most of those mechanisms are obsolete.
Some part of fork() mechanism
Why can't we simply allot a fresh chunk of space for all the resources needed by the new process?
The semantics of fork(2) say that when you call it, another process starts executing from that point. So if it starts executing, it will naturally have some expectations regarding declared variables, their values, and so on. You need to copy everything the parent had access to.
int x = 42;
pid_t pid = fork();
if (pid > 0) {
    /* parent: x == 42 */
} else {
    /* child: x == 42 as well -- it got a copy */
}
If the new process were to immediately execute a new image, why would that copying go to waste?
This copying is completely useless if the new process turns out not to need to continue from that point. For example, if the new process simply wants to start executing a new program, it won't need any of the variables mentioned above.
Two parallel program execv with fork with returning to code
This code should fix it (two wait() calls):
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>

int main(void)
{
    char *argv_for_program_1[] = { "ls", NULL };
    int pid_1 = fork();
    if (pid_1 == 0) {
        execv("/usr/bin/ls", argv_for_program_1);  /* print ls output */
        _exit(127);                                /* only if execv fails */
    }

    char *argv_for_program_2[] = { "pwd", NULL };
    int pid_2 = fork();
    if (pid_2 == 0) {
        execv("/usr/bin/pwd", argv_for_program_2); /* print pwd output */
        _exit(127);
    }

    wait(NULL);
    wait(NULL);                  /* !!! -- reap both children */
    printf("continue\n");        /* print continue */
    return 0;
}
Is the unix fork exec sequence really as expensive as it sounds?
Usually fork does not actually copy all the memory; it uses "copy on write", which means that as long as the memory is not modified, the same pages are shared. However, to guarantee there is enough memory later on (should the process write to the pages), enough memory must be reserved up front.
This means that when forking from a large process on a system that does not allow overcommitting memory, the memory must actually be available. So if you have an 8 GB process forking, then for at least a short period of time 16 GB must be available.
See also vfork and posix_spawn for other solutions.
fork()-parent ignores execv() of an expect script
By "interrupted", I presume you mean that the parent does not wait for the child to complete the operation before doing whatever it does next?
The parent and the child run in parallel after the fork. If you want the parent to "hang" until the child is done, then you need to wait for it.
See man wait and search for fork examples on the internet.
C execv() parameter issue
What execv wants is a char *[], i.e. an array of pointers, which is what you define with args.
If you add the & to args you get the type char *(*)[N], that is, a pointer to an array of pointers. Skip the & in the execv call and it will all work.
Running bash using posix instead of fork/execv
Let's look at what the code in the original fork/exec does:
- close all file descriptors
- open the tty, which will coincidentally get an fd of 0, i.e. it becomes stdin
- duplicate this fd twice (dup(0)), which puts it into stdout and stderr
- exec the shell command
Your new code doesn't follow the same pattern at all. What you want to do is repeat the process, but be more explicit:
close all the FDs:
for(i = 0; i <= maxfd; i++)
{
ret = posix_spawn_file_actions_addclose (&action, i);
}
open the tty into STDIN_FILENO:
ret = posix_spawn_file_actions_addopen (&action, STDIN_FILENO, tty_name, O_RDWR, 0);
duplicate STDIN_FILENO into STDOUT_FILENO and STDERR_FILENO:
ret = posix_spawn_file_actions_adddup2 (&action, STDIN_FILENO, STDOUT_FILENO);
ret = posix_spawn_file_actions_adddup2 (&action, STDIN_FILENO, STDERR_FILENO);
Then the posix_spawn should take place in the correct context.
For the old fork/exec process, you should have done something like:
int fd = open(tty_name, O_RDWR /*| O_NOCTTY*/);
if (fd != STDIN_FILENO) dup2(fd, STDIN_FILENO);
if (fd != STDOUT_FILENO) dup2(fd, STDOUT_FILENO);
if (fd != STDERR_FILENO) dup2(fd, STDERR_FILENO);
It's more explicit in intent.
The reason for the ifs is to prevent the accidental dup2 of the original fd onto its own fd number. The only call that could really run into this is the first one, because fd == STDIN_FILENO when no other file descriptors are open at that point.
To combine this into a small piece of code, with echo something rather than an invocation of bash, we have:
#include <stdio.h>
#include <spawn.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
void
do_spawn()
{
int ret;
posix_spawn_file_actions_t action;
int i;
pid_t pid;
int maxfd = 1024;
char *tty_name = ttyname (0);
ret = posix_spawn_file_actions_init (&action);
for (i = 0; i <= maxfd; i++) {
ret = posix_spawn_file_actions_addclose (&action, i);
}
ret = posix_spawn_file_actions_addopen (&action, STDIN_FILENO, tty_name, O_RDWR, 0);
ret = posix_spawn_file_actions_adddup2 (&action, STDIN_FILENO, STDOUT_FILENO);
ret = posix_spawn_file_actions_adddup2 (&action, STDIN_FILENO, STDERR_FILENO);
char *argv[] = { "/bin/sh", "-c", "echo something", NULL };
int status;
extern char **environ;
posix_spawnattr_t attr = { 0 };
posix_spawnattr_init(&attr);
posix_spawnattr_setflags(&attr, POSIX_SPAWN_USEVFORK);
status = posix_spawn(&pid, argv[0], &action /*__file_actions*/ , &attr, argv, environ);
printf ("%d %ld\n", status, (long) pid);
wait (0);
posix_spawn_file_actions_destroy (&action);
}
int
main(int argc, char **argv)
{
do_spawn();
}
This needs to be compiled with -D_GNU_SOURCE, as otherwise POSIX_SPAWN_USEVFORK is not defined.
Child process memory free problem with execvp
What's "definitely lost" memory?
First of all, let's discuss what Valgrind reports as "definitely lost": Valgrind will report allocated memory as "definitely lost" if all references to it are lost before the termination of the program. In other words, if your program reaches a state in which there is allocated memory that cannot be freed because no valid pointer to it exists, this counts as "definitely lost".
This means that a program like this:
int main(void) {
char *buf = malloc(10);
// ...
exit(0);
}
will cause no error from Valgrind, while a program like this:
void func(void) {
char *buf = malloc(10);
// ...
} // memory is definitely lost here
int main(void) {
func();
exit(0);
}
will cause a "definitely lost" error.
Why is the first version OK for Valgrind? Because memory is always reclaimed by the system on program exit. If you keep using an allocated chunk of memory until the end of your program, there really is no need to explicitly call free() on it; doing so can be considered a waste of time. For this reason, if you don't free some allocated block while still holding a reference to it, Valgrind assumes that you did so to avoid a "useless" free(), because you know the OS is going to take care of it anyway.
If however you forget to free() something and lose every reference to it, then Valgrind warns you, because you should have free()d the memory. If you don't, and the program keeps running, then the same thing happens each time the buggy block is entered, and you end up wasting memory. This is what is called a "memory leak". A very simple example is the following:
void func(void) {
char *buf = malloc(10);
// ...
} // memory is definitely lost here
int main(void) {
while (1) {
func();
}
exit(0);
}
This program will make your machine run out of memory and could ultimately end up killing or freezing your system (warning: do not test this if you don't want to risk freezing your PC). If you instead correctly call free(buf) before the end of func, then the program keeps running indefinitely without a problem.
What happens in your program
Now let's see where you are allocating memory and where the variables holding the references are declared. The only part of the program that allocates memory is inside the if (rc == 0) block, through strdup, here:
char *myargs[3];
myargs[0] = strdup("wc"); // program: "wc" (word count)
myargs[1] = strdup("p4.c"); // argument: file to count
The two calls to strdup() duplicate the strings, allocating new memory to do so. You then save references to the newly allocated memory in the myargs array, which is declared inside the if block. If your program exits the block without freeing the allocated memory, those references are lost, and there is no way for your program to free the memory.
With execvp(): your child process is replaced by the new process (wc p4.c), and the memory space of the old process image is thrown away by the operating system (for Valgrind, this is exactly the same as program termination). This memory is not counted as lost by Valgrind, because references to the allocated memory are still present when execvp() is called. NOTE: this is not because you pass the pointers to allocated memory to execvp(); it's because the original program effectively terminates and its memory is reclaimed by the OS.
Without execvp(): your child process continues execution, and as soon as it exits the code block where myargs is defined, it loses every reference to the allocated memory (since myargs[0] and myargs[1] were the only references). Valgrind then correctly reports this as "definitely lost": 8 bytes (3 for "wc" and 5 for "p4.c") in 2 blocks (2 allocations). The same thing happens if the call to execvp() fails for whatever reason.
Additional considerations
To be fair, there is no real need to call strdup() in the program you show. It's not as if those strings need to be copied because they are used somewhere else (or anything like that). The code could simply have been:
myargs[0] = "wc"; // program: "wc" (word count)
myargs[1] = "p4.c"; // argument: file to count
In any case, a good practice when using the exec*() family of functions is to put an exit() directly after the call, to ensure the program doesn't keep running in case the exec*() fails. Something like this:
execvp(myargs[0], myargs);
perror("execvp failed");
exit(1);
Does parent process lose write ability during copy on write?
Right, if either process writes a COW page, it triggers a page fault.
In the page fault handler, if the page is supposed to be writeable, the kernel allocates a new physical page and does a memcpy(newpage, shared_page, pagesize), then updates the page table of whichever process faulted to map the new page at that virtual address. It then returns to user-space for the store instruction to re-run.
This is a win for something like fork, because one process typically makes an execve system call right away, after touching typically one page (of stack memory). execve destroys all memory mappings for that process, effectively replacing it with a new process. The parent once again has the only copy of every page. (Except pages that were already copy-on-write; e.g. memory allocated with mmap is typically COW-mapped to a single physical page of zeros, so reads can hit in L1d cache.)
A smart optimization would be for fork to actually copy the page containing the top of the stack, but still do lazy COW for all the other pages, on the assumption that the child process will normally execve right away and thus drop its references to all the other pages. It still costs a TLB invalidation in the parent to temporarily flip all the pages to read-only and back, though.