Bash redirect stdout and stderr to seperate files with timestamps
The trick is to make sure that tee
, and the process substitution running your log
function, exits before the script as a whole does -- so that when the shell that started the script prints its prompt, there isn't any backgrounded process that might write more output after it's done.
As a working example (albeit one focused more on being explicit than terseness):
#!/usr/bin/env bash
stdout_log=stdout.log; stderr_log=stderr.log
log () {
file=$1; shift
while read -r line; do
printf '%(%s)T %s\n' -1 "$line"
done >> "$file"
}
# first, make backups of your original stdout and stderr
exec {stdout_orig_fd}>&1 {stderr_orig_fd}>&2
# for stdout: start your process substitution, record its PID, start tee, record *its* PID
exec {stdout_log_fd}> >(log "$stdout_log"); stdout_log_pid=$!
exec {stdout_tee_fd}> >(tee "/dev/fd/$stdout_log_fd"); stdout_tee_pid=$!
exec {stdout_log_fd}>&- # close stdout_log_fd so the log process can exit when tee does
# for stderr: likewise
exec {stderr_log_fd}> >(log "$stderr_log"); stderr_log_pid=$!
exec {stderr_tee_fd}> >(tee "/dev/fd/$stderr_log_fd" >&2); stderr_tee_pid=$!
exec {stderr_log_fd}>&- # close stderr_log_fd so the log process can exit when tee does
# now actually swap out stdout and stderr for the processes we started
exec 1>&$stdout_tee_fd 2>&$stderr_tee_fd {stdout_tee_fd}>&- {stderr_tee_fd}>&-
# ...do the things you want to log here...
echo "this goes to stdout"; echo "this goes to stderr" >&2
# now, replace the FDs going to tee with the backups...
exec >&"$stdout_orig_fd" 2>&"$stderr_orig_fd"
# ...and wait for the associated processes to exit.
while :; do
ready_to_exit=1
for pid_var in stderr_tee_pid stderr_log_pid stdout_tee_pid stdout_log_pid; do
# kill -0 just checks whether a PID exists; it doesn't actually send a signal
kill -0 "${!pid_var}" &>/dev/null && ready_to_exit=0
done
(( ready_to_exit )) && break
sleep 0.1 # avoid a busy-loop eating unnecessary CPU by sleeping before next poll
done
So What's With The File Descriptor Manipulation?
A few key concepts to make sure we have clear:
- All subshells have their own copies of the file descriptor table as created when they were
fork()
ed off from their parent. From that point forward, each file descriptor table is effectively independent. - A process reading from (the read end of) a FIFO (or pipe) won't see an EOF until all programs writing to (the write end of) that FIFO have closed their copies of the descriptor.
...so, if you create a FIFO pair, fork()
off a child process, and let the child process write to the write end of the FIFO, whatever's reading from the read end will never see an EOF until not just the child, but also the parent, closes their copies.
Thus, the gymnastics you see here:
- When we run
exec {stdout_log_fd}>&-
, we're closing the parent shell's copy of the FIFO writing to thelog
function for stdout, so the only remaining copy is the one used by thetee
child process -- so that whentee
exits, the subshell runninglog
exits too. - When we run
exec 1>&$stdout_tee_fd {stdout_tee_fd}>&-
, we're doing two things: First, we make FD 1 a copy of the file descriptor whose number is stored in the variablestdout_tee_fd
; second, we delete thestdout_tee_fd
entry from the file descriptor table, so only the copy on FD 1 remains. This ensures that later, when we runexec >&"$stdout_orig_fd"
, we're deleting the last remaining write handle to the stdouttee
function, causingtee
to get an EOF on stdin (so it exits, thus closing the handle it holds on thelog
function's subshell and letting that subshell exit as well).
Some Final Notes On Process Management
Unfortunately, how bash handles subshells created for process substitutions has changed substantially between still-actively-deployed releases; so while in theory it's possible to use wait "$pid"
to let a process substitution exit and collect its exit status, this isn't always reliable -- hence the use of kill -0
.
However, if wait "$pid"
worked, it would be strongly preferable, because the wait()
syscall is what removes a previously-exited process's entry from the process table: It is guaranteed that a PID will not be reused (and a zombie process-table entry is left as a placeholder) if no wait()
or waitpid()
invocation has taken place.
Modern operating systems try fairly hard to avoid short-term PID reuse, so wraparound is not an active concern in most scenarios. However, if you're worried about this, consider using the flock
-based mechanism discussed in https://stackoverflow.com/a/31552333/14122 for waiting for your process substitutions to exit, instead of kill -0
.
How to prepend stdout and stderr output with timestamp when redirecting into log files?
In any POSIX shell, try:
{ cmd | log_pipe >>stdout.log; } 2>&1 | log_pipe >>stderr.log
Also, if you have GNU awk (sometimes called gawk
), then log_pipe
can be made simpler and faster:
log_pipe() { awk '{print strftime("[%Y-%m-%d %H:%M:%S]"),$0}'; }
Example
As an example, let's create the command cmd
:
cmd() { echo "This is out"; echo "This is err">&2; }
Now, let's run our command and look at the output files:
$ { cmd | log_pipe >>stdout.log; } 2>&1 | log_pipe >>stderr.log
$ cat stdout.log
[2019-07-04 23:42:20] This is out
$ cat stderr.log
[2019-07-04 23:42:20] This is err
The problem
cmd | log_pipe >> "$stdout_log" 2>> "$stderr_log"
The above redirects stdout from cmd
to log_pipe
. The stdout of log_pipe
is redirected to $stdout_log
and the stderr of log_pipe
is redirected to $stderr_log
. The problem is that the stderr of cmd
is never redirected. It goes straight to the terminal.
As an example, consider this cmd
:
cmd() { echo "This is out"; echo "This is err">&2; }
Now, let's run the command:
$ cmd | log_pipe >>stdout.log 2>>stderr.log
This is err
We can see that This is err
is not sent to the file stderr.log. Instead, it appears on the terminal. It is never seen by log_pipe
. stderr.log
only captures error messages from log_pipe
.
How to prepend timestamp to the STDERR and redirect to a file
This can do:
OUT=$(df -h 2>&1 >/dev/null) || { date; echo "$OUT"; } >testErr.log
Test:
$ : > testErr.log; OUT=$(df -jj 2>&1 >/dev/null) || { date -u; echo "$OUT"; } >testErr.log; cat testErr.log
Fri Aug 1 12:10:29 UTC 2014
df: invalid option -- 'j'
Try 'df --help' for more information.
$ : > testErr.log; OUT=$(df -h 2>&1 >/dev/null) || { date -u; echo "$OUT"; } >testErr.log; cat testErr.log
(no output)
You can change >testErr.log
to >>testErr.log
to append.
And this can be more efficient:
OUT=$(exec df -h 2>&1 >/dev/null) || { date; echo "$OUT"; } >testErr.log
Windows Batch - how to add timestamp while redirecting stdout to file
To a first approximation you need a batch file like:
add_ts.bat
@for /F "usebackq delims==" %%i in (`%1`) do @echo %time% %%i
which you would run like:
add_ts.bat "make all -d --trace" > buildlog.txt
This however isn't good enough if you want to capture and
timestamp STDERR
as well as STDOUT
from the command passed as%1
, because the backticks operator around %1
will only capture STDOUT
To fix this you'll need to capture STDERR as well as STDOUT within the backticks, by using redirection in there, which in turns means
you need to run a subshell to understand the redirection, and you need to
escape the redirection operators so they're not interpreted by the toplevel
shell. Which all comes to:
@for /F "usebackq delims==" %%i in (`cmd /C %1 2^>^&1`) do @echo %time% %%i
Run just the same way.
Later
what I don't get is why the | itself wasn't enough to send STDOUT and STDERR to STDIN of add_ts.bat
It is. I think you are labouring under the combination of two misconceptions.
One: You believe that a program's commandline arguments are the same as its standard
input, or that it gets it commandline arguments from its standard input. It doesn't.
A program's commandline arguments, if any, are passed to it as a fixed list
of strings in the program-startup protocol. Its standard input is an input stream made
available to it at the same time by the OS and connected by default to the console in which the program
starts. This default can be overridden in the shell by redirection operators. The contents of that input stream are not fixed in advance. It will feed to the
the program whatever is input to the console, or from its redirected proxy, as long as the program is running, as and when the program reads it. The program
can parse or ignore its commandline arguments and, quite independently of that, it can read or ignore its standard input.
Your program add_ts.bat
is a program that parses the first of its commandline arguments
- it uses %1
- and ignores any more. And it ignores its standard input entirely.
Two: You believe that the effect of a pipeline, e.g.
a | b
is to start an a
process and then, for each line that it writes to the standard output, start
a distinct b
process which will automatically receive that one line written by a
as
a single commandline argument (no matter who many words are in the line) and do its stuff
with that single commandline argument.
That's not what happens. The OS starts one a
process and one b
process, and connects the
standard output of the one a
process to the standard input of the one b
process. For the
pipeline to work at all, b
has got to be a program that reads its standard input. Youradd_ts.bat
is not such a program. It's only a program that parses its first commandline
argument: the |
doesn't give it any, and the commandline:
make all --trace -d 2>&1 | add_ts.bat
doesn't give it any either. The commandline:
make all --trace -d 2>&1 | add_ts.bat "Hello World"
would give it one commandline argument and:
make all --trace -d 2>&1 | add_ts.bat Hello World
would give it two, not one, commandline arguments, the second being ignored. But in any case
it doesn't read its standard input so piping to it is futile.
The site ss64.com is perfectly good about CMD
redirection and piping
but it assumes you know what a program has to do to be a pipeline-able command: To be an upstream command,
it has to write its standard output; to be a downstream command it has to read its standard input.
Related Topics
Convert Bash 'Ls' Output to JSON Array
How Is the Microsecond Time of Linux Gettimeofday() Obtained and What Is Its Accuracy
Rename Files in Multiple Directories to the Name of the Directory
Gurus Say That Ld_Library_Path Is Bad - What's the Alternative
How to Detect When Someone Opens the Slave Side of a Pty (Pseudo-Terminal) in Linux
How to Add a Cron Job in Linux
Simulate a Faulty Block Device with Read Errors
Identifying Received Signal Name in Bash
Differencebetween Pthread_Self() and Gettid()? Which One Should I Use
Meaning of Exit Status 1 Returned by Linux Command
Syntax of for Loop in Linux Shell Scripting
Use Perf Inside a Docker Container Without --Privileged
Installing Jenkins Plugins to Docker Jenkins