Need explanations for Linux bash builtin exec command behavior
In this particular case, you have the exec in a pipeline. In order to execute a series of pipeline commands, the shell must first fork, making a sub-shell. (Specifically, it has to create the pipe, then fork, so that everything running "on the left" of the pipe can have its output sent to whatever is "on the right" of the pipe.)
To see that this is in fact what is happening, compare:
{ ls; echo this too; } | cat
with:
{ exec ls; echo this too; } | cat
The former runs ls without leaving the sub-shell, so the sub-shell is still around afterward to run the echo. The latter runs ls by replacing the sub-shell, which is therefore no longer there to run the echo, so this too is not printed.
(The use of curly braces { cmd1; cmd2; } normally suppresses the sub-shell fork action that you get with parentheses (cmd1; cmd2), but in the case of a pipe, the fork is "forced", as it were.)
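You can watch that forced fork directly by printing BASHPID (a bash-specific variable that, unlike $$, changes in sub-shells); this sketch is not part of the original answer:

```shell
#!/bin/bash
# The braces alone do not fork, but adding a pipe does: compare the PIDs.
echo "parent: $BASHPID"
{ echo "braces: $BASHPID"; }          # same PID as the parent: no fork
{ echo "piped:  $BASHPID"; } | cat    # different PID: the pipe forced a fork
```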
Redirection of the current shell happens only if there is "nothing to run", as it were, after the word exec. Thus, e.g., exec >stdout 4<input 5>>append modifies the current shell, but exec foo >stdout 4<input 5>>append tries to exec the command foo. [Note: this is not strictly accurate; see addendum.]
Interestingly, in an interactive shell, after exec foo >output fails because there is no command foo, the shell sticks around, but stdout remains redirected to the file output. (You can recover with exec >/dev/tty. In a script, the failure to exec foo terminates the script.)
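You can reproduce the non-interactive case without a script file; this sketch assumes no command named no-such-cmd-xyz exists in PATH:

```shell
#!/bin/bash
# A failed exec terminates a non-interactive shell (with execfail unset),
# so the echo inside never runs; the inner shell exits with 127.
bash -c 'exec no-such-cmd-xyz; echo "still alive"' 2>/dev/null
echo "inner shell exited with status $?"
```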
With a tip of the hat to @Pumbaa80, here's something even more illustrative:
#! /bin/bash
shopt -s execfail
exec ls | cat -E
echo this goes to stdout
echo this goes to stderr 1>&2
(Note: cat -E is simplified down from my usual cat -vET, which is my handy go-to for "let me see non-printing characters in a recognizable way".) When this script is run, the output from ls has cat -E applied (on Linux this makes the end of each line visible as a $ sign), but the output sent to stdout and stderr (on the remaining two lines) is not redirected. Change the | cat -E to > out and, after the script runs, observe the contents of the file out: the final two echos are not in there.
Now change the ls to foo (or some other command that will not be found) and run the script again. This time the output is:
$ ./demo.sh
./demo.sh: line 3: exec: foo: not found
this goes to stderr
and the file out now has the contents produced by the first echo line.
This makes what exec "really does" as obvious as possible (but no more obvious, as Albert Einstein did not put it :-) ).
Normally, when the shell goes to execute a "simple command" (see the manual page for the precise definition, but this specifically excludes the commands in a "pipeline"), it prepares any I/O redirection operations specified with <, >, and so on by opening the files needed. Then the shell invokes fork (or some equivalent but more-efficient variant like vfork or clone, depending on the underlying OS, configuration, etc.), and, in the child process, rearranges the open file descriptors (using dup2 calls or equivalent) to achieve the desired final arrangements: > out moves the open descriptor to fd 1 (stdout), while 6> out moves the open descriptor to fd 6.
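The same descriptor-moving machinery is visible from the shell itself, using exec with only redirections; a small sketch (the file name is made up):

```shell
#!/bin/bash
# Open a file on fd 6 in the current shell, just as "6> out" would do
# for a child process; then write through fd 6 and close it again.
exec 6>fd6-demo.tmp     # open fd6-demo.tmp and install it as fd 6
echo "via fd 6" >&6     # this write lands in the file, not on stdout
exec 6>&-               # close fd 6
cat fd6-demo.tmp
rm -f fd6-demo.tmp
```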
If you specify the exec keyword, though, the shell suppresses the fork step. It does all the file opening and file-descriptor rearranging as usual, but this time it affects any and all subsequent commands. Finally, having done all the redirections, the shell attempts to execve() (in the system-call sense) the command, if there is one. If there is no command, or if the execve() call fails and the shell is supposed to continue running (it is interactive, or you have set execfail), the shell soldiers on. If the execve() succeeds, the shell no longer exists, having been replaced by the new command. If execfail is unset and the shell is not interactive, the shell exits.
(There's also the added complication of the command_not_found_handle shell function: bash's exec seems to suppress running it, based on test results. The exec keyword in general makes the shell not look at its own functions: if you have a shell function f, running f as a simple command runs the shell function, as does (f), which runs it in a sub-shell, but running (exec f) skips over it.)
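A quick sketch of exec bypassing shell functions (assuming no external command named f is in PATH):

```shell
#!/bin/bash
f() { echo "shell function f"; }
f            # runs the function
( f )        # still the function, just in a sub-shell
( exec f )   # exec skips functions: "exec: f: not found" goes to stderr
echo "sub-shell exec status: $?"   # 127 for a command that was not found
```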
As for why

ls>out1 ls>out2

creates two files (with or without an exec), this is simple enough: the shell opens each redirection, and then uses dup2 to move the file descriptors. If you have two ordinary > redirects, the shell opens both, moves the first one to fd 1 (stdout), then moves the second one to fd 1 (stdout again), closing the first in the process. Finally, it runs ls ls, because that's what's left after removing the >out1 >out2. As long as there is no file named ls, the ls command complains to stderr and writes nothing to stdout.

I don't understand bash exec

Yes, it sends any further output to the file named logfile. In other words, it redirects standard output (also known as stdout) to the file logfile.
Example
Let's start with this script:
$ cat >script.sh
#!/bin/bash
echo First
exec >>logfile
echo Second
If we run the script, we see output from the first but not the second echo statement:
$ bash script.sh
First
The output from the second echo statement went to the file logfile:
$ cat logfile
Second
$
If we had used exec >logfile, then logfile would be overwritten each time the script was run. Because we used >> instead of >, however, the output is appended to logfile. For example, if we run it once again:
$ bash script.sh
First
$ cat logfile
Second
Second
Documentation
This is documented in man bash:
exec [-cl] [-a name] [command [arguments]]
If command is specified, it replaces the shell. No new process is created. The arguments become the arguments to command. If the -l option is supplied, the shell places a dash at the beginning of the zeroth argument passed to command. This is what login(1) does. The -c option causes command to be executed with an empty environment. If -a is supplied, the shell passes name as the zeroth argument to the executed command. If command cannot be executed for some reason, a non-interactive shell exits, unless the execfail shell option is enabled. In that case, it returns failure. An interactive shell returns failure if the file cannot be executed. If command is not specified, any redirections take effect in the current shell, and the return status is 0. If there is a redirection error, the return status is 1. [Emphasis added.]
In your case, no command argument is specified. So the exec command performs redirections, which, in this case, means any further stdout is sent to the file logfile.
find command and -exec
The find command has a -exec option. For example:
find / -type f -exec grep -l "bash" {} \;
Other than the similarity in name, the -exec here has absolutely nothing to do with the shell command exec.
The construct -exec grep -l "bash" {} \; tells find to execute the command grep -l "bash" on any files that it finds. This is unrelated to the shell command exec >>logfile, which executes nothing but has the effect of redirecting output.
understanding bash exec 1>&2 command
exec is a Bash built-in command, so it can have special behavior that an external program couldn't have. In particular, it has the special behavior that:

If COMMAND is not specified, any redirections take effect in the current shell.

(That's quoting from the message given by help exec.)
This applies to any sort of redirection; you can also write, for example, any of these:
exec >tmp.txt
exec >>stdout.log 2>>stderr.log
exec 2>&1
(It does not, however, apply to pipes.)
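Because these redirections affect the current shell, you can save the original stdout on a spare descriptor and restore it later; a sketch (the log file name is made up):

```shell
#!/bin/bash
exec 3>&1            # save the current stdout on fd 3
exec >demo.log       # from here on, stdout goes to demo.log
echo "this line is logged"
exec 1>&3 3>&-       # restore stdout and close the spare descriptor
echo "back on the original stdout"
cat demo.log
rm -f demo.log
```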
What is the purpose of the : (colon) GNU Bash builtin?
Historically, Bourne shells didn't have true and false as built-in commands. true was instead simply aliased to :, and false to something like let 0.
: is slightly better than true for portability to ancient Bourne-derived shells. As a simple example, consider having neither the ! pipeline operator nor the || list operator (as was the case for some ancient Bourne shells). This leaves the else clause of the if statement as the only means for branching based on exit status:

if command; then :; else ...; fi
Since if requires a non-empty then clause and comments don't count as non-empty, : serves as a no-op.
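A runnable sketch of that pattern (the failing command here is just ls on a path assumed not to exist):

```shell
#!/bin/sh
# Branch only on failure: the then-clause must be non-empty, so use :.
if ls /nonexistent-dir-xyz >/dev/null 2>&1; then :; else
    echo "command failed"
fi
```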
Nowadays (that is, in a modern context) you can usually use either : or true. Both are specified by POSIX, and some find true easier to read. However, there is one interesting difference: : is a so-called POSIX special built-in, whereas true is a regular built-in.
Special built-ins are required to be built into the shell; regular built-ins are only "typically" built in, and that isn't strictly guaranteed. (There usually shouldn't be a regular program named : with the function of true in the PATH of most systems, though.)

Probably the most crucial difference is that with special built-ins, any variable set by the built-in, even in the environment during simple command evaluation, persists after the command completes, as demonstrated here using ksh93:

$ unset x; ( x=hi :; echo "$x" )
hi
$ ( x=hi true; echo "$x" )
$

Note that zsh ignores this requirement, as does GNU Bash except when operating in POSIX compatibility mode, but all other major "POSIX sh derived" shells observe it, including dash, ksh93, and mksh.
Another difference is that regular built-ins must be compatible with exec, demonstrated here using bash:

$ ( exec : )
-bash: exec: :: not found
$ ( exec true )
$

POSIX also explicitly notes that : may be faster than true, though this is of course an implementation-specific detail.
What does set -e and exec $@ do for docker entrypoint scripts?
It basically takes any command-line arguments passed to entrypoint.sh and execs them as a command. The intention is basically: "Do everything in this .sh script, then, in the same shell, run the command the user passed in on the command line."
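A minimal entrypoint sketch along those lines (the setup step is a placeholder):

```shell
#!/bin/sh
# entrypoint.sh: abort on any setup failure, then replace this shell
# with whatever command the user supplied (docker run IMAGE cmd...).
set -e
echo "running setup..."   # placeholder for real initialization work
exec "$@"                 # the command takes over this process; no shell remains
```

Because of the exec, signals sent to the container go straight to the user's command rather than to an intermediate shell.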
See:
- What are the special dollar sign shell variables?
- Need explanations for Linux bash builtin exec command behavior
sh -c irrationality and programmatically determining and running Linux builtin commands
You're looking for consistency in the wrong place because you're missing a critical aspect of what some of those commands are doing. Running the same command in different contexts might yield a different result. For example, if you run ls or pwd (with no arguments), the result depends on the current directory.
The dichotomy isn't between built-in commands and non-built-in commands, but between commands whose behavior is influenced by which shell runs them and commands whose behavior isn't. There is a correlation: most commands that are influenced by which shell runs them are built in, because an external command would not be able to access the state of the shell that runs it.
- The command alias prints out the list of aliases defined in the current shell. Aliases are part of the internal state of a shell. If you run a new shell instance, it starts out with no aliases defined, so alias prints an empty list. Typically, when you're running an interactive shell, your aliases are the ones defined by your startup file (e.g. ~/.bashrc) and that's what alias lists. But if you run alias or unalias on the command line, you can change the aliases of that shell instance, and that doesn't affect other shells (try it out to make sure that you understand what's going on).
- command alias does the same thing as alias, since alias is a builtin.
- builtin alias does the same thing as alias in bash. The builtin command is a bash builtin. builtin does not exist in other shells; on Ubuntu, /bin/sh is not bash but dash, a shell that's smaller, faster and also POSIX-compliant but lacks some of bash's more advanced features. This also explains type builtin: bash -c 'type builtin' would report that builtin is a builtin.
- type command and type type report that command and type are builtins because they are builtins in sh.
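You can check these resolutions yourself; a sketch run under bash (on systems where /bin/sh is dash, the same builtin query would fail there):

```shell
#!/bin/bash
type alias     # reports: alias is a shell builtin
type builtin   # reports: builtin is a shell builtin (bash-specific)
type command   # reports: command is a shell builtin
```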
You can't execute a builtin from a program: a builtin is a command of a particular shell. You can execute a shell that supports this builtin and tell it to execute that builtin, but of course the builtin is executed in the context of that shell.
You can no more execute the alias command from a Pascal program than you can call the Pascal write function from a shell program. A shell builtin is a library function of the shell. Shells blur the distinction between their own functions and external programs because you can call an external program using the same syntax, rather than going through something like the TProcess class, but at the end of the day the concepts are the same.
A “CLI helper GUI” already exists: it's called a terminal emulator. It sounds like you want to make a more constrained GUI that can only execute certain specific commands. In that case, I don't think it makes sense to expose features such as aliases. You aren't providing an interface to a shell here; you're providing an interface to run programs. You aren't interfacing with the shell, you're substituting for it. So don't think of shell commands; think of running programs. There's no program called alias.
'find -exec' a shell function in Linux
Since only the shell knows how to run shell functions, you have to run a shell to run a function. You also need to mark your function for export with export -f, otherwise the sub-shell won't inherit it:
export -f dosomething
find . -exec bash -c 'dosomething "$0"' {} \;
Bash: Head & Tail behavior with bash script
This is a fairly interesting issue! Thanks for posting it!
I assumed that this happens because head exits after processing the first few lines, so a SIGPIPE signal is sent to the bash instance running the script when it tries to echo $x the next time. I used RedX's script to test this theory:
#!/usr/bin/bash
rm x.log
for((x=0;x<5;++x)); do
echo $x
echo $x>>x.log
done
This works as you described! Using t.sh | head -n 2, it writes only 2 lines to the screen and to x.log. But trapping SIGPIPE changes this behavior...
#!/usr/bin/bash
trap "echo SIGPIPE>&2" PIPE
rm x.log
for((x=0;x<5;++x)); do
echo $x
echo $x>>x.log
done
Output:
$ ./t.sh |head -n 2
0
1
./t.sh: line 5: echo: write error: Broken pipe
SIGPIPE
./t.sh: line 5: echo: write error: Broken pipe
SIGPIPE
./t.sh: line 5: echo: write error: Broken pipe
SIGPIPE
The write error occurs because stdout is already closed, the other end of the pipe having been closed. Any attempt to write to the closed pipe causes a SIGPIPE signal, which terminates the program by default (see man 7 signal). x.log now contains 5 lines.
This also explains why /bin/echo solved the problem. See the following script:
rm x.log
for((x=0;x<5;++x)); do
/bin/echo $x
echo "Ret: $?">&2
echo $x>>x.log
done
Output:
$ ./t.sh |head -n 2
0
Ret: 0
1
Ret: 0
Ret: 141
Ret: 141
Ret: 141
Decimal 141 is hex 8D: the hex 80 bit means the process was terminated by a signal, and hex 0D (decimal 13) is SIGPIPE. So when /bin/echo tried to write to stdout, it got a SIGPIPE and was terminated (the default behavior) instead of the bash instance running the script.
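You can recover that 141 with bash's PIPESTATUS array; a small sketch using yes, which writes until its reader goes away:

```shell
#!/bin/bash
# head exits after one line; yes then dies of SIGPIPE (128 + 13 = 141).
yes | head -n 1 >/dev/null
echo "yes exited with status ${PIPESTATUS[0]}"
```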