fork and exec in bash
Use the ampersand just like you would from the shell.
#!/usr/bin/bash
function_to_fork() {
...
}
function_to_fork &
# ... execution continues in parent process ...
Trying to fork a process in bash script
fork is not directly available in bash, but it can be effectively simulated with &.
A couple of other things about this code:
To do a numeric comparison you should use -eq, not eq (note the extra dash -).
It is also considered a best practice to use [[ ]] for tests instead of [ ].
Fork exec and pipe with bash script
I modified your program to report errors and actually wait for the children to die, like so:
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char* argv[])
{
    if (argc > 2)
        fprintf(stderr, "Excess arguments ignored\n");
    int fd[2];
    pid_t pid1, pid2;
    char * input[] = {"/bin/bash", "sc.sh", argv[1], NULL};
    char * output[] = {"./cprog", argv[1], NULL};
    pipe(fd);
    pid1 = fork();
    if (pid1 == 0) {
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execv(input[0], input);
        perror(input[0]);
        return 1;
    }
    pid2 = fork();
    if (pid2 == 0) {
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        close(fd[1]);
        execv(output[0], output);
        perror(output[0]);
        return 1;
    }
    close(fd[0]);
    close(fd[1]);
    int status1;
    int corpse1 = waitpid(pid1, &status1, 0);
    printf("PID %d: %d (0x%.4X)\n", pid1, corpse1, status1);
    int status2;
    int corpse2 = waitpid(pid2, &status2, 0);
    printf("PID %d: %d (0x%.4X)\n", pid2, corpse2, status2);
    return 0;
}
I used a simple C program as cprog:
#include <stdio.h>

int main(void)
{
    int c;
    unsigned sum = 0;
    unsigned cnt = 0;
    while ((c = getchar()) != EOF)
        sum += c, cnt++;
    printf("sum of bytes: %u\n", sum);
    printf("num of bytes: %u\n", cnt);
    return 0;
}
Testing on the command line yielded:
$ bash sc.sh | cprog
sum of bytes: 325895667
num of bytes: 69926912
$
Running the main program (it was p19, created from p19.c) yielded:
$ ./p19
sum of bytes: 372818733
num of bytes: 70303744
PID 28575: 28575 (0x7C00)
PID 28576: 28576 (0x0000)
$
The status of the first child, 0x7C00, carries the exit code in its high byte: 0x7C is 124. That shows that timeout exited with status 124, which is what GNU documents as the exit status when the command times out.
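The status word printed above can also be decoded with the standard macros from <sys/wait.h> instead of reading the hex by eye. A minimal sketch follows; the helper names exit_code_from_status and describe_status are hypothetical and not part of the original program, and the 0x7C00 layout is what Linux uses rather than something POSIX promises.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: extract the exit code from a waitpid() status.
 * Returns -1 if the child did not exit normally (e.g. killed by a signal). */
int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}

/* Hypothetical helper: print a readable description of a wait status. */
void describe_status(pid_t pid, int status)
{
    if (WIFEXITED(status))
        printf("PID %d exited with status %d\n", (int)pid, WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("PID %d was killed by signal %d\n", (int)pid, WTERMSIG(status));
    else
        printf("PID %d: unrecognised status 0x%.4X\n", (int)pid, status);
}
```

In the main program, describe_status(pid1, status1) could be called right after the first waitpid() in place of the raw hex printf.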
So, in my reproduction of your environment, the code you provided works OK. That suggests that your environment is not set up as you think. Maybe the sc.sh script isn't there.
Differences between fork and exec
The use of fork and exec exemplifies the spirit of UNIX in that it provides a very simple way to start new processes.
The fork call basically makes a duplicate of the current process, identical in almost every way. Not everything is copied over (for example, resource limits in some implementations) but the idea is to create as close a copy as possible.
The new process (child) gets a different process ID (PID) and has the PID of the old process (parent) as its parent PID (PPID). Because the two processes are now running exactly the same code, they can tell which is which by the return value of fork: the child gets 0, the parent gets the PID of the child. This is all, of course, assuming the fork call works. If it does not, no child is created and the parent gets an error code (-1, with errno set).
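The three possible outcomes of fork described above can be sketched as follows. The function name fork_demo and the exit code 42 are arbitrary choices for illustration only.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork once and show which branch each process takes.
 * Returns the child's exit code as seen by the parent, or -1 on failure. */
int fork_demo(void)
{
    fflush(stdout);              /* avoid duplicating buffered output in the child */
    pid_t pid = fork();

    if (pid < 0) {               /* fork failed: no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {              /* child: fork returned 0 */
        printf("child:  pid=%d ppid=%d\n", (int)getpid(), (int)getppid());
        fflush(stdout);
        _exit(42);               /* arbitrary exit code for the demo */
    }

    /* parent: fork returned the child's PID */
    printf("parent: pid=%d child=%d\n", (int)getpid(), (int)pid);
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The order of the two printed lines is not guaranteed, since parent and child run concurrently after the fork.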
The exec call is a way to basically replace the entire current process with a new program. It loads the program into the current process space and runs it from the entry point.
So, fork and exec are often used in sequence to get a new program running as a child of a current process. Shells typically do this whenever you try to run a program like find: the shell forks, then the child loads the find program into memory, setting up all command line arguments, standard I/O and so forth.
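What a shell does for an external command can be sketched roughly as below. This is an illustration under simplifying assumptions (no redirections, no job control), not how any particular shell is implemented; the helper name spawn_and_wait is hypothetical, and 127 is the exit code shells conventionally report when a command cannot be found.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given argument vector, wait for
 * it to finish, and return its exit code. Returns -1 if the child could not
 * be created or waited for, or did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    fflush(stdout);
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execvp(argv[0], argv);   /* returns only if the exec failed */
        perror(argv[0]);
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build a NULL-terminated vector such as {"ls", "-l", NULL} and pass it to spawn_and_wait.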
But they're not required to be used together. It's perfectly acceptable for a program to fork itself without exec'ing if, for example, the program contains both parent and child code (you need to be careful what you do, each implementation may have restrictions). This was used quite a lot (and still is) for daemons which simply listen on a TCP port and fork a copy of themselves to process a specific request while the parent goes back to listening.
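The fork-without-exec pattern can be sketched without any networking. In this simplified stand-in for a daemon, the "requests" are plain integers, handle_request is a hypothetical placeholder for real work, and each child reports its result through its exit code (which only carries values 0 to 255).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical request handler: stands in for real per-request work. */
static int handle_request(int request)
{
    return request * 2;
}

/* Fork one child per request. Each child runs handle_request() from this
 * same program image (no exec) and exits with the result.
 * Returns the sum of all results, or -1 if anything went wrong. */
int serve_requests(const int *requests, int count)
{
    int forked = 0;
    int total = 0;
    int failed = 0;

    fflush(stdout);
    for (int i = 0; i < count; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            failed = 1;
            break;
        }
        if (pid == 0)
            _exit(handle_request(requests[i]));  /* child: handle one request */
        forked++;                                /* parent: on to the next request */
    }

    /* Reap every child that was started. */
    for (int i = 0; i < forked; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status))
            failed = 1;
        else
            total += WEXITSTATUS(status);
    }
    return failed ? -1 : total;
}
```

A real daemon would accept a connection in place of reading from an array and would usually reap children asynchronously, but the division of labour between parent and child is the same.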
Similarly, programs that know they're finished and just want to run another program don't need to fork, exec and then wait for the child. They can just call exec and load the new program directly into their own process space.
Some UNIX implementations have an optimized fork which uses what they call copy-on-write. This is a trick to delay the copying of the process space in fork until the program attempts to change something in that space. This is useful for those programs using only fork and not exec in that they don't have to copy an entire process space.
If exec is called following fork (and this is what mostly happens), the child's view of the process space is discarded and replaced by the new program, so only the few pages the child wrote to between fork and exec ever had to be copied.
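Whether the kernel copies pages eagerly or lazily cannot be observed from a portable program. What can be observed is the guarantee that copy-on-write has to preserve: a write made by the child never shows up in the parent. A minimal sketch, with a hypothetical function name:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child overwrites a variable, the parent checks
 * that its own copy is untouched. Returns the parent's value, or -1 on error. */
int child_write_is_private(void)
{
    volatile int value = 1;      /* volatile so the accesses are not optimised away */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 99;              /* this write goes to the child's own copy */
        _exit(value == 99 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                /* the parent never sees the child's write */
}
```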
Note that there is a whole family of exec calls (execl, execle, execve and so on) but exec in context here means any of them.
The following diagram illustrates the typical fork/exec operation where the bash shell is used to list a directory with the ls command:
+--------+
| pid=7  |
| ppid=4 |
| bash   |
+--------+
    |
    | calls fork
    V
+--------+             +--------+
| pid=7  |    forks    | pid=22 |
| ppid=4 | ----------> | ppid=7 |
| bash   |             | bash   |
+--------+             +--------+
    |                      |
    | waits for pid 22     | calls exec to run ls
    |                      V
    |                  +--------+
    |                  | pid=22 |
    |                  | ppid=7 |
    |                  | ls     |
    V                  +--------+
+--------+                 |
| pid=7  |                 | exits
| ppid=4 | <---------------+
| bash   |
+--------+
    |
    | continues
    V
Simple shell with fork and exec
- You have two fgets() calls. Remove the first one, fgets(line, BUFFER, stdin);.
- fgets() will read in the newline if there's space in the buffer. You need to remove it because when you type exit, the input is actually exit\n, and there is no command /bin/exit\n.
The code below demonstrates removing the newline character:
    if (!fgets(line, BUFFER, stdin))
        break;
    char *p = strchr(line, '\n');
    if (p) *p = 0;
- Your usage of execl is wrong. Check the manual of execl. You need to pass the arguments: execl(program, line, (char *)NULL); Notice the cast of the last argument, NULL. If NULL is defined as 0, the cast becomes necessary because execl is a variadic function.
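The argument layout that execl expects (path, then argv[0], then further arguments, then a null pointer cast to char *) can be sketched as follows. The helper name run_shell_command is hypothetical, and the sketch assumes a POSIX shell is installed at /bin/sh.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a command string through /bin/sh using execl.
 * Returns the command's exit code, or -1 on failure. */
int run_shell_command(const char *command)
{
    fflush(stdout);
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* path, argv[0], argv[1], argv[2], terminator.
         * The terminator must have type char *, hence the cast: a bare 0
         * passed through "..." would be an int, not a pointer. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        perror("execl");
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```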
A modified example using execvp:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <string.h>

#define BUFFER 1024

int main(void) {
    char line[BUFFER];

    while (1) {
        printf("$ ");
        fflush(stdout);                          /* make sure the prompt appears */
        if (!fgets(line, BUFFER, stdin)) break;  /* EOF or read error */
        char *p = strchr(line, '\n');
        if (p) *p = 0;                           /* strip the trailing newline */
        if (strcmp(line, "exit") == 0) break;
        char *args[] = {line, (char *)0};        /* whole line is the command name */
        pid_t pid = fork();                      /* fork child */
        if (pid < 0) {                           /* fork failed */
            perror("fork");
        } else if (pid == 0) {                   /* child */
            execvp(line, args);
            perror("exec");                      /* reached only if execvp fails */
            exit(1);
        } else {                                 /* parent */
            wait(NULL);
        }
    }
    return 0;
}
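The example above passes the whole input line as the command name, so a line such as ls -l would not work. One possible extension, not part of the original answer, is to split the line on whitespace before calling execvp. The helper name split_args is hypothetical, and this sketch does not handle quoting or escapes.

```c
#include <string.h>

/* Hypothetical helper: split line in place on spaces and tabs.
 * Stores at most max_args - 1 pointers into line in args, terminates the
 * array with NULL, and returns the number of arguments stored.
 * max_args must be at least 1. */
int split_args(char *line, char *args[], int max_args)
{
    int count = 0;
    char *tok = strtok(line, " \t");

    while (tok != NULL && count < max_args - 1) {
        args[count++] = tok;
        tok = strtok(NULL, " \t");
    }
    args[count] = NULL;          /* execvp requires a NULL-terminated vector */
    return count;
}
```

In the shell loop, the args array would then be filled by split_args(line, args, 64), and the child would call execvp(args[0], args) when at least one argument was found.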
Why does executing a simple command in a grouping command not fork a subshell process, while a compound command does
Bash optimizes the execution. It detects that only one command is inside the ( ) group and calls fork + exec instead of fork + fork + exec. That's why you see one fewer bash process in the list of processes. This is easier to observe with a command that takes more time, such as ( sleep 5 ), which eliminates timing effects. Also, you may want to read this thread on unix.stackexchange.
I think the optimization is done somewhere inside execute_cmd.c in the execute_in_subshell() function (arrows > added by me):
/* If this is a simple command, tell execute_disk_command that it
   might be able to get away without forking and simply exec.
>>>> This means things like ( sleep 10 ) will only cause one fork
   If we're timing the command or inverting its return value, however,
   we cannot do this optimization. */
and in the execute_disk_command() function we can also read:
/* If we can get away without forking and there are no pipes to deal with,
   don't bother to fork, just directly exec the command. */
Why the PID of a process created with fork() and exec() finally changed
It looks like @Barmar is correct here ... internally Sublime Text is creating one (well ... there is definitely more than one) child here ... most likely with fork(). You can tell from the clone call below that sublime is creating children.
[acripps@localhost Code]$ strace -e trace=%process /opt/sublime_text/sublime_text
execve("/opt/sublime_text/sublime_text", ["/opt/sublime_text/sublime_text"],
0x7ffff4607370 /* 56 vars */) = 0
arch_prctl(ARCH_SET_FS, 0x7fb6fa15b740) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7fb6fa15ba10) = 32653
exit_group(0) = ?
+++ exited with 0 +++
And here we can see the PIDs, as described in the question. Please note the return value of clone in the strace output (32653): it corresponds with the actual PID of sublime, rather than the PID of the process that was originally started.
[acripps@localhost Code]$ ps afx | grep sublime
32675 pts/0 S+ 0:00 | | \_ grep --color=auto sublime
32653 ? Ssl 0:00 \_ /opt/sublime_text/sublime_text
32670 ? Sl 0:00 \_ /opt/sublime_text/plugin_host 32653 --auto-shell-env
[acripps@localhost Code]$
If you were to use something a little simpler, like sleep for example, you would find that the PIDs line up with your expectations:
[acripps@localhost Code]$ ./exec_m1
pid = 1696
Press ENTER to continue ...
[acripps@localhost Code]$ ps afx | grep sleep
1696 pts/1 S+ 0:00 | | \_ /usr/bin/sleep 300
1711 pts/2 S+ 0:00 | \_ grep --color=auto sleep
or, using method 2:
[acripps@localhost Code]$ ./exec_m2
pid = 1774
Press ENTER to continue ...
[acripps@localhost Code]$ ps afx | grep sleep
1774 pts/1 S+ 0:00 | | \_ /usr/bin/sleep 300
1776 pts/2 S+ 0:00 | \_ grep --color=auto sleep
An interesting point to note is that you use "/bin/sh -c" in method 2 ... this step shouldn't be required. IIRC, when not attached to a TTY the shell will simply call one of the exec family of functions to replace itself with the executable ... if the shell is attached to a TTY, though, it will go through another fork call first.
There's lots of really good information in the POSIX spec, but it may take several readings to really sink in ... also, checking out the source code of a POSIX OS and trying to understand the process management pieces will really help solidify the understanding. I did this with QNX Neutrino, but FreeBSD is another really good one to check out.
For this exercise, I modified your main() function a little bit, to be easier to use:
int main()
{
    int pid = 0;
#if METHOD == 1
    //method 1
    char *name = "/usr/bin/sleep";
    char *argv[] = {name, "300", (char *)0};
    pid = create_process(name, argv);
#else
#if METHOD == 2
    //method 2
    char *cmdstring = "/usr/bin/sleep 300";
    pid = forkstyle_system(cmdstring);
#endif
#endif
    printf("pid = %d\n",pid);
    printf("Press ENTER to continue ...");
    getchar();
    return 0;
}
which can be compiled like so:
gcc -o exec_m1 -DMETHOD=1 exec.c
gcc -o exec_m2 -DMETHOD=2 exec.c
... I got lazy and used the preprocessor; ideally (if this were the start of a tool you want to keep around), you would want to parse main's argv to tell you which method to use, and where to find the executable/provide args for the executable. I leave that as an exercise for the reader ;-)