How to Create a Real Thread with Clone() on Linux

Take a look at the CLONE_THREAD flag: it places the new thread in the same thread group as the calling process.

Once you pass CLONE_THREAD, the new thread has the same PID and PPID as the calling process. This is what POSIX threads use. Below is output from my system. The LWP column shows that it is now a lightweight process with a different TID:

UID        PID  PPID   LWP  C NLWP    SZ   RSS PSR STIME TTY          TIME CMD
anukalp  18398  9638 18398  0    2   464   456   0 10:56 pts/3    00:00:00 ./a.out
anukalp  18398  9638 18399  0    2   464   456   1 10:56 pts/3    00:00:00 ./a.out

The output of /proc/self/status changes as well. I have added a couple of printfs:

[anukalp@localhost ~]$ ./a.out

This process pid: 18398
Creating new thread...
Done! Thread pid: 18399 /* This is now thread id, available to caller of clone */
getpid(): ad pid: 18399
Inside func.
getpid(): 18398
getppid(): 9638
Looking into /proc/self/status...
Name:   a.out
State:  R (running)
Tgid:   18398
Pid:    18398
PPid:   9638
TracerPid:      0
Uid:    500     500     500     500
Gid:    500     500     500     500
FDSize: 256
Groups: 7 19 22 80 81 82 83 100 490 500 
VmPeak:     1856 kB
VmSize:     1856 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:       248 kB
VmRSS:       248 kB
VmData:      168 kB
VmStk:       140 kB
VmExe:         4 kB
VmLib:      1516 kB
VmPTE:        16 kB
VmSwap:        0 kB
Threads:        2
SigQ:   1/14050
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   00000000,000000ff
Cpus_allowed_list:      0-7
voluntary_ctxt_switches:        1
nonvoluntary_ctxt_switches:     0
Inside thread: thread pid = 18398
Inside thread: thread ppid = 9638

Please let me know if this helps!

Correctly allocating a stack for a clone()'d thread

When you supply the CLONE_SETTLS, CLONE_PARENT_SETTID and CLONE_CHILD_CLEARTID flags you must provide the newtls, ptid and ctid arguments to clone() respectively.

If all you want is a normal thread with a separate FD table though, just use pthread_create() and call unshare(CLONE_FILES) as the first operation in the new thread.

Making a clone'd thread pthread compatible

Some of that data, for example, is essential to libc's function, such as the function pointer encryption key (pointer_guard) and locale pointer.

Correct. Don't forget about errno, which is also in there.

can I upgrade a clone'd thread to a full pthread via any mechanism?

No.

is there any way that I can call C functions from a clone'd thread

No.

If you have sources to the library, it should be relatively easy to replace direct clone calls with pthread_create.

If you do not, but the library is available in archive form, you may be able to use objcopy --redefine-sym to redirect its clone calls to a replacement (e.g. my_clone), which can then create a new thread via pthread_create and invoke the target function in that thread. Whether this will succeed greatly depends on how much the library cares about details of the clone.

It's also probably not worth the trouble.

A better alternative may be to implement the introspection without calling into libc. Since your printf and toupper probably only need to deal with ASCII and C locale, it's not hard to implement limited versions of these functions and use direct system calls to write the output.

Linux system call for creating process and thread

Processes are usually created with fork, threads (lightweight processes) are usually created with clone nowadays. However, anecdotally, there exist 1:N thread models, too, which don't do either.

Both fork and clone map to the same kernel function internally, do_fork (renamed kernel_clone in newer kernels). This function can create a lightweight process that shares the address space with the old one, or a separate process (and many other options), depending on what flags you feed to it. The clone syscall is more or less a direct forwarding of that kernel function (and is used by the higher-level threading libraries), whereas fork wraps do_fork in the semantics of the 50-year-old traditional Unix function.

The important difference is that fork guarantees that a complete, separate copy of the address space is made. This, as Basil points out correctly, is done with copy-on-write nowadays and therefore is not nearly as expensive as one would think.

When you create a thread, it just reuses the original address space and the same memory.

However, one should not assume that creating processes is generally "lightweight" on unix-like systems because of copy-on-write. It is somewhat less heavy than for example under Windows, but it's nowhere near free.

One reason is that although the actual pages are not copied, the new process still needs a copy of the page table. This can be several kilobytes to megabytes of memory for processes that use larger amounts of memory.
Another reason is that although copy-on-write is invisible and a clever optimization, it is not free, and it cannot do magic. When data is modified by either process, which inevitably happens, the affected pages fault and have to be copied after all.

Redis is a good example where you can see that fork is anything but lightweight (it uses fork to do background saves).

How to replace pthread_join() and pthread_create() with clone()

As alk said, if you use CLONE_THREAD you cannot use wait() to wait for your thread to finish. From the clone(2) man page:

A new thread created with CLONE_THREAD has the same parent process as the caller of clone() (i.e., like CLONE_PARENT), so that calls to getppid(2) return the same value for all of the threads in a thread group. When a CLONE_THREAD thread terminates, the thread that created it using clone() is not sent a SIGCHLD (or other termination) signal; nor can the status of such a thread be obtained using wait(2). (The thread is said to be detached.)

The man page also tells us:

After all of the threads in a thread group terminate the parent process of the thread group is sent a SIGCHLD (or other termination) signal.

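So if you have to use CLONE_THREAD, you need some other mechanism to learn that the thread has finished, for example pause() combined with a signal that the child sends when it is done. Note that the SIGCHLD described above goes to the parent process of the thread group, not to the thread that called clone(), so it will not wake the pause() below by itself.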

...
    /* CLONE_SETTLS, CLONE_PARENT_SETTID and CLONE_CHILD_CLEARTID are omitted:
       they would require the extra ptid, tls and ctid arguments. */
    ctid = clone(childfun, stackptr + getpagesize(),
                 CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                 CLONE_THREAD | CLONE_SYSVSEM, NULL);
    pause();
    printf("exit\n");
}

If you don't need the child to be in the caller's thread group (i.e. you don't use CLONE_THREAD), you can use wait() as you are used to from 'normal' process handling:

...
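    /* No CLONE_THREAD, so the child can be reaped. SIGCHLD is set as the exit
       signal so that a plain waitpid() sees the child. The TLS/TID flags are
       omitted because they would require extra arguments. */
    ctid = clone(childfun, stackptr + getpagesize(),
                 CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
                 CLONE_SYSVSEM | SIGCHLD, NULL);

    ctid = waitpid(ctid, 0, 0);
    if (ctid == -1) {
        perror("waitpid");
        exit(3);
    }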
}

Hope this helps!

When are clone() and fork better than pthreads?

The strength and weakness of fork (and company) is that they create a new process that's a clone of the existing process.

This is a weakness because, as you pointed out, creating a new process has a fair amount of overhead. It also means communication between the processes has to be done via some "approved" channel (pipes, sockets, files, shared-memory region, etc.).

This is a strength because it provides (much) greater isolation between the parent and the child. If, for example, a child process crashes, you can kill it and start another fairly easily. By contrast, if a child thread dies, killing it is problematic at best -- it's impossible to be certain what resources that thread held exclusively, so you can't clean up after it. Likewise, since all the threads in a process share a common address space, one thread that ran into a problem could overwrite data being used by all the other threads, so just killing that one thread wouldn't necessarily be enough to clean up the mess.

In other words, using threads is a little bit of a gamble. As long as your code is all clean, you can gain some efficiency by using multiple threads in a single process. Using multiple processes adds a bit of overhead, but can make your code quite a bit more robust, because it limits the damage a single problem can cause, and makes it much easier to shut down and replace a process if it does run into a major problem.

As far as concrete examples go, Apache might be a pretty good one. It will use multiple threads per process, but to limit the damage in case of problems (among other things), it limits the number of threads per process, and can/will spawn several separate processes running concurrently as well. On a decent server you might have, for example, 8 processes with 8 threads each. The large number of threads helps it service a large number of clients in a mostly I/O bound task, and breaking it up into processes means if a problem does arise, it doesn't suddenly become completely unresponsive, and can shut down and restart a process without losing a lot.
