Python read from subprocess stdout and stderr separately while preserving order
Here's a solution based on selectors, but one that preserves order and streams variable-length chunks (even single characters).
The trick is to use read1() instead of read().
import selectors
import subprocess
import sys

p = subprocess.Popen(
    ["python", "random_out.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            exit()
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)
If you want a test program, use this.
import sys
from time import sleep

for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)
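One caveat with the selector loop above: it calls exit() as soon as either stream reports EOF, so output still pending on the other stream can be dropped. A minimal sketch of a variant that unregisters each stream at its own EOF and also keeps the chunks in arrival order is shown here. The inline child command and the names child and chunks are assumptions for illustration, not part of the original answer, and like the original it relies on select working on pipes, which is not the case on Windows.

```python
import selectors
import subprocess
import sys

# Hypothetical inline child standing in for random_out.py; -u keeps it unbuffered.
child = [sys.executable, "-u", "-c",
         "import sys\n"
         "for i in range(3):\n"
         "    print(f'x{i}', file=sys.stderr)\n"
         "    print(f'y{i}')\n"]

p = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ, "stdout")
sel.register(p.stderr, selectors.EVENT_READ, "stderr")

chunks = []  # (stream name, bytes) tuples in the order they were read
while sel.get_map():  # keep going until both streams are unregistered
    for key, _ in sel.select():
        data = key.fileobj.read1(4096)  # returns whatever is available
        if not data:  # EOF on this stream only
            sel.unregister(key.fileobj)
            continue
        chunks.append((key.data, data))

p.wait()
sel.close()
```

Printing instead of collecting works the same way: replace the chunks.append(...) call with a write to sys.stdout or sys.stderr depending on key.data.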
Can you make a python subprocess output stdout and stderr as usual, but also capture the output as a string?
This example seems to work for me:
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4
import subprocess
import sys
import select

p = subprocess.Popen(["find", "/proc"],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                     text=True)  # text mode so readline() returns str

stdout = []
stderr = []

while True:
    reads = [p.stdout.fileno(), p.stderr.fileno()]
    ret = select.select(reads, [], [])

    for fd in ret[0]:
        if fd == p.stdout.fileno():
            read = p.stdout.readline()
            if read:  # an empty string means EOF
                sys.stdout.write('stdout: ' + read)
                stdout.append(read)
        if fd == p.stderr.fileno():
            read = p.stderr.readline()
            if read:
                sys.stderr.write('stderr: ' + read)
                stderr.append(read)

    if p.poll() is not None:
        break

print('program ended')
print('stdout:', "".join(stdout))
print('stderr:', "".join(stderr))
In general, in any situation where you want to work with multiple file descriptors at the same time and you don't know which one will have data for you to read, you should use select or something equivalent (like a Twisted reactor).
subprocess stdout and stderr while doing ssh
To detect that an error happened, you should check the returncode attribute of the Popen object (ps).
To get the output from stderr, you have to pass stderr=subprocess.PIPE to Popen, just as you do for stdout.
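As a minimal sketch of both points, the snippet below captures stderr through a pipe and checks returncode once the process has finished. The child command is a hypothetical stand-in for the ssh invocation; only the Popen arguments and the returncode check are the part being illustrated.

```python
import subprocess
import sys

# Hypothetical stand-in for the remote command: writes to stderr, exits with status 3.
cmd = [sys.executable, "-c",
       "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]

ps = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = ps.communicate()  # waits for exit and drains both pipes

if ps.returncode != 0:
    print("command failed with status", ps.returncode)
    print("stderr was:", err.decode().strip())
```

Note that returncode stays None until the process has been waited on, which communicate() does here.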
python stream subprocess stdout and stderr zip doesn't work
zip stops as soon as one of the iterators is exhausted.
In each of the examples you gave, one stream (stdout or stderr) is empty, so zip produces nothing.
To fix this, you should use itertools.zip_longest.
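The difference is easy to see with plain lists standing in for the two streams. The list names and contents below are made up for illustration; with real pipes the iterables would be p.stdout and p.stderr.

```python
from itertools import zip_longest

stdout_lines = ["out 1\n", "out 2\n"]  # stand-in for lines read from stdout
stderr_lines = []                      # stand-in for an empty stderr

# zip stops at the shortest input, so an empty stream hides the other one.
zipped = list(zip(stdout_lines, stderr_lines))

# zip_longest keeps going and pads the exhausted stream with fillvalue.
pairs = list(zip_longest(stdout_lines, stderr_lines, fillvalue=""))

for out_line, err_line in pairs:
    print(repr(out_line), repr(err_line))
```

Keep in mind that with live pipes each step still blocks until both streams have produced a line or reached EOF, so this pairs lines up but does not give real-time interleaving.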
Run command and get its stdout, stderr separately in near real time like in a terminal
"The stdout and stderr of the program being run can be logged separately."
You can't use pexpect because both stdout and stderr go to the same pty and there is no way to separate them after that.
"The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)"
If the output of a subprocess is not a tty, then it likely uses block buffering, so if it doesn't produce much output, it won't be "real time". For example, if the buffer is 4K, your parent Python process won't see anything until the child process prints 4K chars and the buffer overflows, or until it is flushed explicitly (inside the subprocess). This buffer is inside the child process and there are no standard ways to manage it from outside. The same layering applies to a command1 | command2 shell pipeline: each process has its own stdio buffers, and the kernel's pipe buffer sits between them.
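A quick way to confirm what the child sees is to ask it whether its stdout is a tty. This is only a probe under the assumption that the child is a Python interpreter; the names probe, result, and sees_tty are illustrative.

```python
import subprocess
import sys

# The child reports whether its own stdout is attached to a terminal.
probe = "import sys; print(sys.stdout.isatty())"

result = subprocess.run([sys.executable, "-c", probe],
                        stdout=subprocess.PIPE, text=True)
sees_tty = result.stdout.strip()
print(sees_tty)  # the child's stdout is a pipe here, not a terminal
```

Programs that use C stdio typically make the same check and switch stdout from line buffering to block buffering when the answer is no.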
"The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output)."
It seems you meant the opposite, i.e., it is likely that your child process chunks its output instead of flushing each output line as soon as possible if the output is redirected to a pipe (when you use stdout=PIPE in Python). It means that the default threading or asyncio solutions won't work as is in your case.
There are several options to work around it:
The command may accept a command-line argument that disables block buffering, such as grep --line-buffered or python -u.
stdbuf works for some programs, i.e., you could run ['stdbuf', '-oL', '-eL'] + command using the threading or asyncio solution above, and you should get stdout and stderr separately, with lines appearing in near-real time:
#!/usr/bin/env python3
import os
import sys
from select import select
from subprocess import Popen, PIPE

with Popen(['stdbuf', '-oL', '-e0', 'curl', 'www.google.com'],
           stdout=PIPE, stderr=PIPE) as p:
    readable = {
        p.stdout.fileno(): sys.stdout.buffer,  # log separately
        p.stderr.fileno(): sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            data = os.read(fd, 1024)  # read available
            if not data:  # EOF
                del readable[fd]
            else:
                readable[fd].write(data)
                readable[fd].flush()
Finally, you could try a pty + select solution with two ptys:
#!/usr/bin/env python3
import errno
import os
import pty
import sys
from select import select
from subprocess import Popen

masters, slaves = zip(pty.openpty(), pty.openpty())
with Popen([sys.executable, '-c', r'''import sys, time
print('stdout', 1) # no explicit flush
time.sleep(.5)
print('stderr', 2, file=sys.stderr)
time.sleep(.5)
print('stdout', 3)
time.sleep(.5)
print('stderr', 4, file=sys.stderr)
'''],
           stdin=slaves[0], stdout=slaves[0], stderr=slaves[1]):
    for fd in slaves:
        os.close(fd)  # no input
    readable = {
        masters[0]: sys.stdout.buffer,  # log separately
        masters[1]: sys.stderr.buffer,
    }
    while readable:
        for fd in select(readable, [], [])[0]:
            try:
                data = os.read(fd, 1024)  # read available
            except OSError as e:
                if e.errno != errno.EIO:
                    raise  # XXX cleanup
                del readable[fd]  # EIO means EOF on some systems
            else:
                if not data:  # EOF
                    del readable[fd]
                else:
                    readable[fd].write(data)
                    readable[fd].flush()
for fd in masters:
    os.close(fd)
I don't know what the side effects of using different ptys for stdout and stderr are. You could check whether a single pty is enough in your case, e.g., set stderr=PIPE and use p.stderr.fileno() instead of masters[1]. A comment in the sh source suggests that there are issues if stderr not in {STDOUT, pipe}.
subprocess.Popen handling stdout and stderr as they come
I was able to solve this by using select.select():
import subprocess
import sys
from select import select

# cmd and kw come from the caller: the command to run and extra Popen options.
process = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    close_fds=True,
    **kw
)

while True:
    reads, _, _ = select(
        [process.stdout.fileno(), process.stderr.fileno()],
        [], []
    )

    for descriptor in reads:
        if descriptor == process.stdout.fileno():
            read = process.stdout.readline()
            if read:
                print('stdout: %s' % read.decode())

        if descriptor == process.stderr.fileno():
            read = process.stderr.readline()
            if read:
                print('stderr: %s' % read.decode())
        sys.stdout.flush()

    if process.poll() is not None:
        break
This works by passing the file descriptors to select() as its first argument (the list of descriptors to watch for reading) and looping over the ones it reports as ready, for as long as process.poll() indicates that the process is still alive.
No need for threads. The code was adapted from a Stack Overflow answer.
How to unify stdout and stderr, yet be able to distinguish between them?
You can use select() to multiplex the output. Suppose you have stdout and stderr being captured in pipes (pipe_stdout and pipe_stderr below); then this code will work:
import select
import sys

inputs = set([pipe_stdout, pipe_stderr])

while inputs:
    readable, _, _ = select.select(inputs, [], [])
    for x in readable:
        line = x.readline()
        if len(line) == 0:
            inputs.discard(x)  # EOF: stop watching this pipe
            continue
        if x == pipe_stdout:
            print('STDOUT', line)
        if x == pipe_stderr:
            print('STDERR', line)