How to Add Timestamp While Redirecting Stdout to File in Bash

Bash: redirect stdout and stderr to separate files with timestamps

The trick is to make sure that tee, and the process substitution running your log function, exit before the script as a whole does -- so that when the shell that started the script prints its prompt, there isn't any backgrounded process left that might write more output afterwards.

As a working example (albeit one that favors explicitness over terseness):

#!/usr/bin/env bash
stdout_log=stdout.log; stderr_log=stderr.log

# Prepend the current Unix timestamp to each line read on stdin and append
# the result to the given file; %(...)T with argument -1 means "now"
# (printf's %(fmt)T requires bash 4.2+).
log () {
  file=$1; shift
  while read -r line; do
    printf '%(%s)T %s\n' -1 "$line"
  done >> "$file"
}

# first, make backups of your original stdout and stderr
exec {stdout_orig_fd}>&1 {stderr_orig_fd}>&2
# for stdout: start your process substitution, record its PID, start tee, record *its* PID
exec {stdout_log_fd}> >(log "$stdout_log"); stdout_log_pid=$!
exec {stdout_tee_fd}> >(tee "/dev/fd/$stdout_log_fd"); stdout_tee_pid=$!
exec {stdout_log_fd}>&- # close stdout_log_fd so the log process can exit when tee does
# for stderr: likewise
exec {stderr_log_fd}> >(log "$stderr_log"); stderr_log_pid=$!
exec {stderr_tee_fd}> >(tee "/dev/fd/$stderr_log_fd" >&2); stderr_tee_pid=$!
exec {stderr_log_fd}>&- # close stderr_log_fd so the log process can exit when tee does
# now actually swap out stdout and stderr for the processes we started
exec 1>&$stdout_tee_fd 2>&$stderr_tee_fd {stdout_tee_fd}>&- {stderr_tee_fd}>&-

# ...do the things you want to log here...
echo "this goes to stdout"; echo "this goes to stderr" >&2

# now, replace the FDs going to tee with the backups...
exec >&"$stdout_orig_fd" 2>&"$stderr_orig_fd"

# ...and wait for the associated processes to exit.
while :; do
  ready_to_exit=1
  for pid_var in stderr_tee_pid stderr_log_pid stdout_tee_pid stdout_log_pid; do
    # kill -0 just checks whether a PID exists; it doesn't actually send a signal
    kill -0 "${!pid_var}" &>/dev/null && ready_to_exit=0
  done
  (( ready_to_exit )) && break
  sleep 0.1 # avoid a busy-loop eating unnecessary CPU by sleeping before next poll
done
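
After a run, each log file holds the captured lines prefixed with Unix epoch seconds (what %(%s)T produces); illustrative contents, with made-up timestamps:

$ ./example.sh
this goes to stdout
this goes to stderr
$ cat stdout.log
1562284940 this goes to stdout
$ cat stderr.log
1562284940 this goes to stderr

(tee still forwards both streams to the terminal, which is why the two echo lines appear there as well.)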

So What's With The File Descriptor Manipulation?

A few key concepts to make sure we have clear:

  • All subshells have their own copies of the file descriptor table as created when they were fork()ed off from their parent. From that point forward, each file descriptor table is effectively independent.
  • A process reading from (the read end of) a FIFO (or pipe) won't see an EOF until all programs writing to (the write end of) that FIFO have closed their copies of the descriptor.

...so, if you create a FIFO pair, fork() off a child process, and let the child process write to the write end of the FIFO, whatever's reading from the read end will never see an EOF until not just the child, but also the parent, closes their copies.
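
A minimal demonstration of that rule, using a scratch named FIFO (demo.fifo is a hypothetical throwaway file):

mkfifo demo.fifo
( cat demo.fifo; echo "reader got EOF" ) &   # reader: copy the FIFO, then announce EOF
exec 3>demo.fifo                             # parent opens the write end as FD 3
( sleep 2 ) &                                # forked child inherits its own copy of FD 3
exec 3>&-                                    # parent closes its copy right away...
wait                                         # ..."reader got EOF" still prints only ~2s
                                             # later, once the child's copy closes too
rm demo.fifo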

Thus, the gymnastics you see here:

  • When we run exec {stdout_log_fd}>&-, we're closing the parent shell's copy of the FIFO writing to the log function for stdout, so the only remaining copy is the one used by the tee child process -- so that when tee exits, the subshell running log exits too.
  • When we run exec 1>&$stdout_tee_fd {stdout_tee_fd}>&-, we're doing two things: first, we make FD 1 a copy of the file descriptor whose number is stored in the variable stdout_tee_fd; second, we delete the stdout_tee_fd entry from the file descriptor table, so only the copy on FD 1 remains. This ensures that later, when we run exec >&"$stdout_orig_fd", we're deleting the last remaining write handle to the tee process for stdout, causing tee to get an EOF on stdin (so it exits, thus closing the handle it holds on the log function's subshell and letting that subshell exit as well). A standalone sketch of this copy-then-close pattern follows.
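
In isolation, the copy-then-close pattern looks like this (copy_fd is a hypothetical variable name, and we duplicate stdout purely for illustration):

exec {copy_fd}>&1               # allocate a fresh FD (>= 10) duplicating FD 1
exec 1>&$copy_fd {copy_fd}>&-   # re-point FD 1 at the duplicate, then delete the
                                # named entry; FD 1 is now the only descriptor
                                # referring to that open file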

Some Final Notes On Process Management

Unfortunately, how bash handles subshells created for process substitutions has changed substantially between still-actively-deployed releases; so while in theory it's possible to use wait "$pid" to let a process substitution exit and collect its exit status, this isn't always reliable -- hence the use of kill -0.

However, if wait "$pid" worked, it would be strongly preferable, because the wait() syscall is what removes a previously-exited process's entry from the process table: It is guaranteed that a PID will not be reused (and a zombie process-table entry is left as a placeholder) if no wait() or waitpid() invocation has taken place.
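
On a bash where process substitutions are reliably waitable, the whole polling loop above could collapse to a single line; a sketch, valid only on such versions (otherwise keep the kill -0 loop):

# wait accepts several PIDs and blocks until all of them have exited
wait "$stdout_tee_pid" "$stdout_log_pid" "$stderr_tee_pid" "$stderr_log_pid"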

Modern operating systems try fairly hard to avoid short-term PID reuse, so wraparound is not an active concern in most scenarios. However, if you're worried about this, consider using the flock-based mechanism discussed in https://stackoverflow.com/a/31552333/14122 for waiting for your process substitutions to exit, instead of kill -0.

How to prepend stdout and stderr output with a timestamp when redirecting into log files?

In any POSIX shell, try:

{ cmd | log_pipe >>stdout.log; } 2>&1 | log_pipe >>stderr.log
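
This assumes a log_pipe helper that prepends a timestamp to every line it reads. A minimal POSIX-sh sketch (it forks date once per line, which is exactly why the gawk version below is preferable when available):

log_pipe() {
  while IFS= read -r line; do
    printf '%s %s\n' "$(date '+[%Y-%m-%d %H:%M:%S]')" "$line"
  done
}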

Also, if you have GNU awk (sometimes called gawk), then log_pipe can be made simpler and faster:

log_pipe() {  awk '{print strftime("[%Y-%m-%d %H:%M:%S]"),$0}'; }

Example

As an example, let's create the command cmd:

cmd() { echo "This is out"; echo "This is err">&2; }

Now, let's run our command and look at the output files:

$ { cmd | log_pipe >>stdout.log; } 2>&1 | log_pipe >>stderr.log
$ cat stdout.log
[2019-07-04 23:42:20] This is out
$ cat stderr.log
[2019-07-04 23:42:20] This is err

The problem

cmd | log_pipe >> "$stdout_log" 2>> "$stderr_log"

The above redirects stdout from cmd to log_pipe. The stdout of log_pipe is redirected to $stdout_log and the stderr of log_pipe is redirected to $stderr_log. The problem is that the stderr of cmd is never redirected. It goes straight to the terminal.

As an example, consider this cmd:

cmd() { echo "This is out"; echo "This is err">&2; }

Now, let's run the command:

$ cmd | log_pipe >>stdout.log 2>>stderr.log
This is err

We can see that This is err is not sent to the file stderr.log. Instead, it appears on the terminal. It is never seen by log_pipe. stderr.log only captures error messages from log_pipe.

How to prepend a timestamp to STDERR and redirect it to a file

This can do it (redirections apply left to right: 2>&1 first duplicates the command substitution's pipe onto stderr, then >/dev/null discards the command's own stdout, so OUT captures stderr alone):

OUT=$(df -h 2>&1 >/dev/null) || { date; echo "$OUT"; } >testErr.log

Test:

$ : > testErr.log; OUT=$(df -jj 2>&1 >/dev/null) || { date -u; echo "$OUT"; } >testErr.log; cat testErr.log
Fri Aug 1 12:10:29 UTC 2014
df: invalid option -- 'j'
Try 'df --help' for more information.
$ : > testErr.log; OUT=$(df -h 2>&1 >/dev/null) || { date -u; echo "$OUT"; } >testErr.log; cat testErr.log
(no output)

You can change >testErr.log to >>testErr.log to append.

And this can be more efficient, since exec lets df replace the command substitution's subshell outright instead of leaving an extra shell process around:

OUT=$(exec df -h 2>&1 >/dev/null) || { date; echo "$OUT"; } >testErr.log
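
If you want every stderr line stamped individually rather than one date for the whole block, the same 2>&1 >/dev/null swap composes with a per-line filter, e.g. the log_pipe helper sketched in the previous answer:

df -h 2>&1 >/dev/null | log_pipe >>testErr.log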

Windows Batch - how to add timestamp while redirecting stdout to file

To a first approximation you need a batch file like:

add_ts.bat

@for /F "usebackq delims==" %%i in (`%1`) do @echo %time% %%i

which you would run like:

add_ts.bat "make all -d --trace" > buildlog.txt

This however isn't good enough if you want to capture and timestamp STDERR as well as STDOUT from the command passed as %1, because the backticks operator around %1 will only capture STDOUT.

To fix this you'll need to capture STDERR as well as STDOUT within the backticks by using redirection in there, which in turn means you need to run a subshell to process the redirection, and you need to escape the redirection operators so they're not interpreted by the toplevel shell. Which all comes to:

@for /F "usebackq delims==" %%i in (`cmd /C %1 2^>^&1`) do @echo %time% %%i 

Run just the same way.

Later

"what I don't get is why the | itself wasn't enough to send STDOUT and STDERR to STDIN of add_ts.bat"

It is. I think you are labouring under the combination of two misconceptions.

One: You believe that a program's commandline arguments are the same as its standard input, or that it gets its commandline arguments from its standard input. It doesn't. A program's commandline arguments, if any, are passed to it as a fixed list of strings in the program-startup protocol. Its standard input is an input stream made available to it at the same time by the OS and connected by default to the console in which the program starts; this default can be overridden in the shell by redirection operators. The contents of that input stream are not fixed in advance: the stream feeds the program whatever is typed at the console, or supplied by its redirected proxy, for as long as the program runs and as and when the program reads it. The program can parse or ignore its commandline arguments and, quite independently of that, it can read or ignore its standard input.

Your program add_ts.bat is a program that parses the first of its commandline arguments - it uses %1 - and ignores any more. And it ignores its standard input entirely.

Two: You believe that the effect of a pipeline, e.g.

a | b

is to start an a process and then, for each line that it writes to its standard output, start a distinct b process which will automatically receive that one line written by a as a single commandline argument (no matter how many words are in the line) and do its stuff with that single commandline argument.

That's not what happens. The OS starts one a process and one b process, and connects the standard output of the one a process to the standard input of the one b process. For the pipeline to work at all, b has got to be a program that reads its standard input. Your add_ts.bat is not such a program. It's only a program that parses its first commandline argument: the | doesn't give it any, and the commandline:

make all --trace -d 2>&1 | add_ts.bat

doesn't give it any either. The commandline:

make all --trace -d 2>&1 | add_ts.bat "Hello World"

would give it one commandline argument and:

make all --trace -d 2>&1 | add_ts.bat Hello World

would give it two, not one, commandline arguments, the second being ignored. But in any case it doesn't read its standard input, so piping to it is futile.

The site ss64.com is perfectly good about CMD redirection and piping, but it assumes you know what a program has to do to be a pipeline-able command: to be an upstream command, it has to write to its standard output; to be a downstream command, it has to read its standard input.
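
For completeness, here is a sketch of a stdin-reading variant -- call it read_ts.bat, a hypothetical name -- that can sit downstream of a pipe. It leans on find /v "" as a copy-every-line filter over standard input, and on delayed expansion so !time! is re-evaluated per line (plain %time%, as in add_ts.bat above, is expanded just once, when cmd parses the for statement):

read_ts.bat

@setlocal EnableDelayedExpansion
@for /F "delims=" %%i in ('find /v ""') do @echo !time! %%i

which you would run like:

make all --trace -d 2>&1 | read_ts.bat > buildlog.txt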


