Why Does Ps O/P List the Grep Process After the Pipe

Why does ps o/p list the grep process after the pipe?

When you execute the command:

ps -ef | grep cron

the shell you are using

(I assume bash in your case. Judging by grep's color output, you are probably on a GNU system such as a Linux distribution, but the same holds for other Unix systems and shells.)

will call pipe() to create an anonymous pipe (a first-in, first-out buffer with a read end and a write end), then fork() (make a running copy of itself). This creates a new child process. The child closes its standard output file descriptor (fd 1) and attaches fd 1 to the write end of the pipe created by the parent process (the shell where you executed the command), typically with dup2(). This is possible because a child created by fork() inherits copies of all of the parent's open file descriptors (the pipe fds in this case). The child then calls exec() on the first ps command found in your PATH environment variable. With the exec() call, the process becomes the command you executed.

So, you now have the shell process with one child, which in your case is the ps command with the -ef options.

At this point, the parent (the shell) fork()s again. This second child close()s its standard input file descriptor (fd 0) and attaches fd 0 to the read end of the pipe created by the parent process (the shell where you executed the command).

It then exec()s the first grep command found in your PATH environment variable.

Now you have the shell process with two children (siblings of each other): the first is the ps command with the -ef options and the second is the grep command with the cron argument. The read end of the pipe is attached to the STDIN of grep and the write end is attached to the STDOUT of ps, so whatever ps writes to its standard output arrives on grep's standard input.
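The sibling relationship can be observed from the shell itself. This is only a minimal sketch, assuming a POSIX sh and mktemp are available; the scratch file names left and right are made up for the illustration. Each side of the pipeline records its parent PID, and both should report the same parent (the shell that built the pipeline):

```shell
# Scratch directory where each side of the pipeline records its parent PID.
tmp=$(mktemp -d)

# Each side is a small sh process that writes its own parent PID ($PPID) to a file.
# The right side then waits for end-of-file on the pipe, i.e. for the left side
# to exit, so both files exist once the pipeline returns.
sh -c 'echo "$PPID" > "$1/left"' _ "$tmp" |
  sh -c 'echo "$PPID" > "$1/right"; cat > /dev/null' _ "$tmp"

left=$(cat "$tmp/left")
right=$(cat "$tmp/right")
rm -rf "$tmp"

echo "left side parent:  $left"
echo "right side parent: $right"
```

If the two numbers are equal, both commands were forked by the same parent, as described above.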

Since ps writes information about each running process to its standard output, while grep reads its standard input looking for lines that match a given pattern, you have the answer to your first question:

  1. the shell runs: ps -ef;
  2. the shell runs: grep cron;
  3. ps sends data (which includes the string "grep cron") to grep;
  4. grep searches its STDIN for the pattern "cron" and matches the line containing "grep cron", because grep had already started, with that command line, by the time ps listed the running processes.

When you execute:

ps -ef | grep '[c]ron'

the argument instructs grep to match "c" followed by "ron", which is the same string as in the first example. This time, however, grep's own command line in the ps output no longer matches:

  1. the shell runs: ps -ef;
  2. the shell runs: grep [c]ron;
  3. ps sends data (which includes the string "grep [c]ron") to grep;
  4. grep does not match its own line, because that line does not contain "c" followed by "ron"; it contains "c" followed by "]ron".
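The difference between the two cases can be reproduced without a live process table. The sketch below is illustrative only: the two-line listing is a hand-written stand-in for ps -ef output (PIDs, times and the /usr/sbin/cron path are hypothetical), and match_count is a helper name invented here. It counts how many lines grep matches when its own command line appears in the listing:

```shell
# Count matching lines in a made-up ps -ef snapshot that contains the real
# cron daemon plus grep's own command line, spelled with the pattern in $1.
match_count() {
  pattern=$1
  printf '%s\n' \
    'root   612    1  0 09:00 ?     00:00:00 /usr/sbin/cron -f' \
    "user  4321 4000  0 09:05 pts/0 00:00:00 grep $pattern" |
    grep -c -- "$pattern"
}

plain=$(match_count 'cron')      # both lines contain "cron"
bracket=$(match_count '[c]ron')  # "grep [c]ron" has no literal "cron" in it

echo "plain pattern matches:   $plain"
echo "bracket pattern matches: $bracket"
```

With the plain pattern both lines match; with the bracketed pattern only the cron daemon's line does.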

GNU grep does not limit the length of the lines it matches, but on some platforms (I think Solaris, HP-UX, AIX) ps truncates each output line to the width given by the "$COLUMNS" variable or by the terminal's screen width.

Hopefully this long response clarifies the shell pipe process a bit.

TIP:

ps -ef | grep cron | grep -v grep

Linux: why, if you pipe ps to grep, does the grep filter appear in the process output? How do pipes work?

The reason is simple:

  1. The shell connects the standard output of ps to the standard input of grep through a pipe.
  2. The shell starts both processes at roughly the same time; neither waits for the other.
  3. When ps scans the process table, grep is usually already running, so ps lists it.

Process grep with pipe returns itself. How do I exclude it?

The traditional way would be:

ps x | grep 'vsftpd'| grep -v grep

in which grep -v expr returns every line that does not match expr.

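You can then use awk to extract the relevant field (the pid in your case):

ps x | grep 'vsftpd'| grep -v grep | awk '{ print $1 }'

(the $1 corresponds to the relevant field/column: with ps x the PID is normally the first column, while with ps aux or ps -ef it is the second, so you would print $2 there)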
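As a reproducible sketch of this two-stage filter, the listing below is a made-up stand-in for ps x output (columns PID, TTY, STAT, TIME, COMMAND; the PIDs and paths are hypothetical). In this layout the PID is the first field, so awk prints $1; for a ps aux or ps -ef listing it would be $2.

```shell
# Hypothetical ps x style listing: the daemon itself plus the grep that
# is searching for it.
listing='  812 ?        Ss     0:00 /usr/sbin/vsftpd /etc/vsftpd.conf
 4410 pts/0    S+     0:00 grep vsftpd'

# grep 'vsftpd' keeps both lines, grep -v grep drops the grep line,
# and awk prints the first whitespace-separated field (the PID here).
pid=$(printf '%s\n' "$listing" | grep 'vsftpd' | grep -v grep | awk '{ print $1 }')

echo "$pid"
```

Note that grep -v grep removes every line containing "grep", so it would also hide a real process whose command line happens to include that word.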

More elegant ps aux | grep -v grep

The usual technique is this:

ps aux | egrep '[t]erminal'

This matches lines containing terminal, but the egrep process's own command line, egrep '[t]erminal', does not contain that string, so it is not listed. It also works on many flavours of Unix.

How to always cut the PID from `ps aux` command?

-d ' ' means cut uses a single space as the delimiter, so every additional space starts a new (empty) field. Since there is 1 space before 2049 and 2 spaces before 12290, your command gets them with -f 2 and -f 3 respectively.

I recommend using ps aux | awk '{print $2}' to get those pids.

Or you can use tr to squeeze those spaces first:
ps aux | tr -s ' ' | cut -d ' ' -f 2
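To see the difference between the three approaches, here is a small sketch on two hand-written rows. The user names and the trailing columns are hypothetical; only the PIDs 2049 and 12290 and the spacing before them follow the description above.

```shell
# One space before 2049, two spaces before 12290.
rows='alice 2049 0.0 0.1 sleep 100
bob  12290 0.0 0.1 sleep 200'

# Plain cut: on the second row, field 2 is the empty string between the two spaces.
with_cut=$(printf '%s\n' "$rows" | cut -d ' ' -f 2)

# Squeezing runs of spaces first makes field 2 the PID on every row.
with_tr=$(printf '%s\n' "$rows" | tr -s ' ' | cut -d ' ' -f 2)

# awk splits on runs of whitespace by default, so no squeezing is needed.
with_awk=$(printf '%s\n' "$rows" | awk '{print $2}')

printf 'cut alone:\n%s\n' "$with_cut"
printf 'tr + cut:\n%s\n' "$with_tr"
printf 'awk:\n%s\n' "$with_awk"
```

Plain cut recovers only the first PID here, while the tr and awk variants return both.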

Find and kill a process in one line using bash and regex

In bash, you should be able to do:

kill $(ps aux | grep '[p]ython csp_build.py' | awk '{print $2}')

Details on its workings are as follows:

  • The ps gives you the list of all the processes.
  • The grep filters that based on your search string; [p] is a trick to stop you picking up the actual grep process itself.
  • The awk just gives you the second field of each line, which is the PID.
  • The $(x) construct means to execute x then take its output and put it on the command line. The output of that ps pipeline inside that construct above is the list of process IDs so you end up with a command like kill 1234 1122 7654.

Here's a transcript showing it in action:

pax> sleep 3600 &
[1] 2225
pax> sleep 3600 &
[2] 2226
pax> sleep 3600 &
[3] 2227
pax> sleep 3600 &
[4] 2228
pax> sleep 3600 &
[5] 2229
pax> kill $(ps aux | grep '[s]leep' | awk '{print $2}')
[5]+ Terminated sleep 3600
[1] Terminated sleep 3600
[2] Terminated sleep 3600
[3]- Terminated sleep 3600
[4]+ Terminated sleep 3600

and you can see it terminating all the sleepers.
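One thing the one-liner does not handle is the case where nothing matches: kill is then called with no PIDs and reports a usage error. A possible guard is sketched below. It is only an illustration: pids_matching is a helper name invented here, and the listing is a hard-coded stand-in for ps aux output (hypothetical values, PID in the second column) so that the result is reproducible.

```shell
# Read a ps-style listing on stdin and print the PID column (field 2, as in
# ps aux or ps -ef) of every line matching the pattern in $1.
pids_matching() {
  grep -- "$1" | awk '{print $2}'
}

# Hypothetical listing standing in for ps aux: two sleepers and the grep itself.
listing='pax  2225  0.0  0.0  5260  740 pts/0  S  10:00  0:00 sleep 3600
pax  2226  0.0  0.0  5260  740 pts/0  S  10:00  0:00 sleep 3600
pax  2301  0.0  0.0  6140  890 pts/0  S+ 10:01  0:00 grep [s]leep'

pids=$(printf '%s\n' "$listing" | pids_matching '[s]leep')

# Only build the kill command when at least one PID was found.
if [ -n "$pids" ]; then
  echo kill $pids   # prints the command; drop the echo to really send the signal
fi
```

Against a live system the same helper would be fed from ps, for example pids=$(ps aux | pids_matching '[s]leep').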


Explaining the grep '[p]ython csp_build.py' bit in a bit more detail:

When you do sleep 3600 & followed by ps -ef | grep sleep, you tend to get two processes with sleep in them, the sleep 3600 and the grep sleep (because they both have sleep in them, that's not rocket science).

However, ps -ef | grep '[s]leep' won't create a process with sleep in it; it instead creates grep '[s]leep', and here's the tricky bit: the grep doesn't find it because it's looking for the regular expression "any character from the character class [s] (which is s) followed by leep".

In other words, it's looking for sleep but the grep process is grep '[s]leep' which doesn't have sleep in it.

When I was shown this (by someone here on SO), I immediately started using it because

  • it's one less process than adding | grep -v grep; and
  • it's elegant and sneaky, a rare combination :-)

How does ps aux | grep '[p]attern' exclude grep itself?

The pattern [f]irefox will not match the literal string [f]irefox. Instead it will match strings with exactly one char from the 1-character class [f], followed by irefox.
