How Does Copy-On-Write in Fork() Handle Multiple Fork

How does copy-on-write in fork() handle multiple fork?

If fork is called multiple times from the original parent process, the shared pages are mapped read-only in the parent and in every child. When a child process attempts to write, the page is copied to a new physical page that is mapped into the child's address space and marked as writeable in the child only. The parent's mapping stays read-only until the parent itself writes.

If fork is called from the child process and the grandchild attempts to write, only the grandchild gets a private, writeable copy of the page. The original parent and the first child keep sharing the original page read-only, and each of them gets its own copy only if and when it writes.

How does copy-on-write work in fork()?

Depends on the operating system, hardware architecture and libc. But yes, on a recent Linux with an MMU, fork(2) works with copy-on-write. It only allocates and copies a few kernel structures and the page tables; the process's writable pages (heap, stack, data) keep pointing to the physical pages of the parent until they are written.

More control over this can be exercised with the clone(2) call. vfork(2) is a special variant in which the child does not expect to use the pages at all: it borrows the parent's address space, and the parent is suspended until the child calls exec() or _exit(). It is typically used right before exec().

As for the allocation: malloc() keeps meta information about the requested memory blocks (address and size), and the C variable is a pointer; both live in process memory (heap and stack). Those two look the same to the child (same values, because the same underlying memory page is seen in the address space of both processes). So from a C program's point of view the array is already allocated and the variable initialized when the child process comes into existence. The underlying memory pages, however, still point to the original physical ones of the parent process, so no extra memory pages are needed until they are modified.

If the child allocates a new array, it depends on whether it fits into the already existing heap pages or whether the brk of the process needs to be increased. In both cases only the modified pages get copied, and the new pages get allocated only for the child.

This also means that physical memory might run out some time after malloc() has succeeded. (Which is bad, as the program cannot check the error return code of "an operation in a random line of code".) Some operating systems do not allow this form of overcommit: if you fork a process, they do not allocate the pages, but they require them to be available at that moment (they effectively reserve them) just in case. In Linux this is configurable and is called overcommit accounting.

Copy-on-write during fork

Not so.

The addresses seen by both the child and the parent are relative to their own address spaces, not relative to the system as a whole.

The operating system maps the memory used by each process to a different location in physical (or virtual) memory. But that mapping is not visible to the processes.

Does parent process lose write ability during copy on write?

Right, if either process writes a COW page, it triggers a page fault.

In the page fault handler, if the page is supposed to be writeable, the kernel allocates a new physical page and does a memcpy(newpage, shared_page, pagesize), then updates the page table of whichever process faulted to map the newpage to that virtual address. Then it returns to user-space for the store instruction to re-run.

This is a win for something like fork, because one process typically makes an execve system call right away, after touching typically one page (of stack memory). execve destroys all memory mappings for that process, effectively replacing it with a new process. The parent once again has the only copy of every page. (Except pages that were already copy-on-write, e.g. anonymous memory freshly allocated with mmap is typically COW-mapped to a single physical page of zeros until it is first written, so reads can hit in L1d cache.)

A smart optimization would be for fork to actually copy the page containing the top of the stack, but still do lazy COW for all the other pages, on the assumption that the child process will normally execve right away and thus drop its references to all the other pages. It still costs a TLB invalidation in the parent to temporarily flip all the pages to read-only and back, though.

Is a call to free() in the forked process causing a copy-on-write?

Yes, it certainly does.

Memory copy-on-write (CoW) happens on a different layer than malloc()/free().

When a process is forked, the child process has all its mapped pages marked as shared from the parent (and thus read-only). When the child modifies a shared page, it triggers a page fault and only then does the operating system copy the data to another area in the physical RAM (and change the mapping for the process).

malloc() and free() do not allocate physical RAM. They are memory management functions, with memory defined as "the (virtual) address space of a process". Thus, these C library functions keep track of an internal state of allocated memory chunks, and malloc() and free() only modify these libc-internal data structures (with the exception of requesting more address space from the OS when malloc()-ing). Physical RAM allocation only happens at page fault, most commonly when a process accesses newly assigned memory for the first time.

In this respect, yes. As free() must modify memory to mark a region as freed, it will write to the relevant region, and at the lower level cause a remapping (i.e. CoW).
