Python Read from Subprocess Stdout and Stderr Separately While Preserving Order

Here's a solution based on selectors, but one that preserves order and streams output in variable-length chunks (even single characters).

The trick is to use read1() instead of read().

import selectors
import subprocess
import sys

p = subprocess.Popen(
    ["python", "random_out.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            exit()
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)

If you want a test program (save it as the random_out.py referenced above), use this:

import sys
from time import sleep

for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)

Can you make a Python subprocess output stdout and stderr as usual, but also capture the output as a string?

This example seems to work for me:

# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4

import select
import subprocess
import sys

p = subprocess.Popen(["find", "/proc"],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)

stdout = []
stderr = []

while True:
    reads = [p.stdout.fileno(), p.stderr.fileno()]
    ret = select.select(reads, [], [])

    for fd in ret[0]:
        if fd == p.stdout.fileno():
            read = p.stdout.readline().decode()
            sys.stdout.write('stdout: ' + read)
            stdout.append(read)
        if fd == p.stderr.fileno():
            read = p.stderr.readline().decode()
            sys.stderr.write('stderr: ' + read)
            stderr.append(read)

    if p.poll() is not None:
        break

print('program ended')
print('stdout:', "".join(stdout))
print('stderr:', "".join(stderr))

In general, any situation where you want to do stuff with multiple file descriptors at the same time and you don't know which one will have stuff for you to read, you should use select or something equivalent (like a Twisted reactor).

subprocess stdout and stderr while doing ssh

To detect that an error happened, check the returncode attribute of the Popen object (ps).

To get the output from stderr, you have to pass stderr=subprocess.PIPE to Popen, just as you do for stdout.
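
A minimal sketch of both points together (the host and remote command here are placeholders, for illustration only):

import subprocess

ps = subprocess.Popen(["ssh", "user@example.com", "ls /nonexistent"],
                      stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = ps.communicate()  # wait for exit, collect both streams
if ps.returncode != 0:       # non-zero exit code means something went wrong
    print("ssh failed with code", ps.returncode)
    print("stderr:", err.decode(), end="")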

Python stream subprocess stdout and stderr: zip doesn't work

zip stops as soon as one of the iterators is exhausted.

In each of the examples you gave, one stream (stdout or stderr) is empty, so zip produces nothing.

To fix this, use itertools.zip_longest instead.
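
A minimal sketch of the difference (the stream contents are made up for illustration):

from itertools import zip_longest

stdout_lines = ["out 1\n", "out 2\n"]
stderr_lines = []  # one stream is empty

# zip() yields nothing: the empty iterator is exhausted immediately
print(list(zip(stdout_lines, stderr_lines)))  # []

# zip_longest() pads the shorter iterator with None and keeps going
for out, err in zip_longest(stdout_lines, stderr_lines):
    if out is not None:
        print("stdout:", out, end="")
    if err is not None:
        print("stderr:", err, end="")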

Run command and get its stdout, stderr separately in near real time like in a terminal

The stdout and stderr of the program being run can be logged separately.

You can't use pexpect because both stdout and stderr go to the same pty and there is no way to separate them after that.

The stdout and stderr of the program being run can be viewed in near-real time, so that if the child process hangs, the user can see it (i.e., we do not wait for execution to complete before printing stdout/stderr to the user).

If the output of a subprocess is not a tty, it is likely that it uses block buffering, and therefore if it doesn't produce much output it won't appear in "real time". For example, if the buffer is 4K, your parent Python process won't see anything until the child prints 4K characters and the buffer overflows, or until it is flushed explicitly (inside the subprocess). This buffer lives inside the child process and there is no standard way to manage it from outside. Here's a picture that shows the stdio buffers and the pipe buffer for the command1 | command2 shell pipeline:

[figure: stdio buffers and the pipe buffer in a shell pipeline]

The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output).

It seems you meant the opposite, i.e., it is likely that your child process chunks its output instead of flushing each output line as soon as possible when its output is redirected to a pipe (as it is when you use stdout=PIPE in Python). This means the default threading or asyncio solutions won't work as-is in your case.

There are several options to work around it:

  • the command may accept a command-line argument such as grep --line-buffered or python -u to disable block buffering (see the sketch after this list).

  • stdbuf works for some programs, i.e., you could run ['stdbuf', '-oL', '-eL'] + command using the threading or asyncio solution above, and you should get stdout and stderr separately, with lines appearing in near-real time:

    #!/usr/bin/env python3
    import os
    import sys
    from select import select
    from subprocess import Popen, PIPE

    with Popen(['stdbuf', '-oL', '-e0', 'curl', 'www.google.com'],
               stdout=PIPE, stderr=PIPE) as p:
        readable = {
            p.stdout.fileno(): sys.stdout.buffer,  # log separately
            p.stderr.fileno(): sys.stderr.buffer,
        }
        while readable:
            for fd in select(readable, [], [])[0]:
                data = os.read(fd, 1024)  # read available
                if not data:  # EOF
                    del readable[fd]
                else:
                    readable[fd].write(data)
                    readable[fd].flush()
  • finally, you could try a pty + select solution with two ptys:

    #!/usr/bin/env python3
    import errno
    import os
    import pty
    import sys
    from select import select
    from subprocess import Popen

    masters, slaves = zip(pty.openpty(), pty.openpty())
    with Popen([sys.executable, '-c', r'''import sys, time
    print('stdout', 1) # no explicit flush
    time.sleep(.5)
    print('stderr', 2, file=sys.stderr)
    time.sleep(.5)
    print('stdout', 3)
    time.sleep(.5)
    print('stderr', 4, file=sys.stderr)
    '''],
               stdin=slaves[0], stdout=slaves[0], stderr=slaves[1]):
        for fd in slaves:
            os.close(fd)  # no input
        readable = {
            masters[0]: sys.stdout.buffer,  # log separately
            masters[1]: sys.stderr.buffer,
        }
        while readable:
            for fd in select(readable, [], [])[0]:
                try:
                    data = os.read(fd, 1024)  # read available
                except OSError as e:
                    if e.errno != errno.EIO:
                        raise  # XXX cleanup
                    del readable[fd]  # EIO means EOF on some systems
                else:
                    if not data:  # EOF
                        del readable[fd]
                    else:
                        readable[fd].write(data)
                        readable[fd].flush()
    for fd in masters:
        os.close(fd)

    I don't know what the side effects are of using different ptys for stdout and stderr. You could try whether a single pty is enough in your case, e.g., set stderr=PIPE and use p.stderr.fileno() instead of masters[1]. A comment in the sh source suggests that there are issues if stderr is not in {STDOUT, pipe}.
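
As promised in the first bullet above, here is a minimal sketch of the command-line-flag approach; child.py is a hypothetical script, used only for illustration:

import subprocess
import sys

# '-u' puts the child Python interpreter in unbuffered mode, so its
# output reaches the pipes in near-real time; feed this Popen object
# to any of the select()/selectors loops shown above.
p = subprocess.Popen([sys.executable, '-u', 'child.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)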

subprocess.Popen handling stdout and stderr as they come

I was able to solve this by using select.select().

import subprocess
import sys
from select import select

# cmd and kw come from the surrounding code
process = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    close_fds=True,
    **kw
)

while True:
    reads, _, _ = select(
        [process.stdout.fileno(), process.stderr.fileno()],
        [], []
    )

    for descriptor in reads:
        if descriptor == process.stdout.fileno():
            read = process.stdout.readline()
            if read:
                print('stdout: %s' % read.decode(), end='')

        if descriptor == process.stderr.fileno():
            read = process.stderr.readline()
            if read:
                print('stderr: %s' % read.decode(), end='')
                sys.stdout.flush()

    if process.poll() is not None:
        break

It works by passing the file descriptors to select() as the reads argument (the first argument to select()) and looping over them, as long as process.poll() indicates that the process is still alive.

No need for threads. The code was adapted from another Stack Overflow answer.

How to unify stdout and stderr, yet be able to distinguish between them?

You can use select() to multiplex the output. Assuming you have stdout and stderr captured in pipes (pipe_stdout and pipe_stderr below), this code will work:

import select

# pipe_stdout and pipe_stderr are file objects wrapping the subprocess's
# stdout and stderr (text mode assumed)
inputs = set([pipe_stdout, pipe_stderr])

while inputs:
    readable, _, _ = select.select(inputs, [], [])
    for x in readable:
        line = x.readline()
        if len(line) == 0:
            inputs.discard(x)  # EOF on this pipe
            continue
        if x == pipe_stdout:
            print('STDOUT', line, end='')
        if x == pipe_stderr:
            print('STDERR', line, end='')

