Rescuing command not found for IO::popen
Yes: upgrade to Ruby 1.9. If you run that in 1.9, an Errno::ENOENT will be raised instead, and you will be able to rescue it.
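A minimal sketch of what that looks like in 1.9+ (using the same made-up command name, 'qwe', as the 1.8 workaround below):

```ruby
begin
  # With no shell metacharacters in the string, IO.popen spawns the
  # command directly, so a missing executable raises Errno::ENOENT
  # in the parent process.
  pipe = IO.popen('qwe') # <- not a real command
  puts pipe.read
  pipe.close
rescue Errno::ENOENT => e
  puts "Command not found: #{e.message}"
end
```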
(Edit) Here is a hackish way of doing it in 1.8:
error = IO.pipe
$stderr.reopen error[1]
pipe = IO.popen 'qwe' # <- not a real command
$stderr.reopen IO.new(2)
error[1].close
if !select([error[0]], nil, nil, 0.1)
  # The command was found. Use `pipe' here.
  puts 'found'
else
  # The command could not be found.
  puts 'not found'
end
run external program in Ruby IO.popen : rescue not working
After your call to IO.popen you are passing the output from the child program to JSON.parse, regardless of whether it is valid. The exception you see is the JSON parser trying to parse the Java exception message, which is captured because you redirect stderr with 2>&1.
You need to check that the child process completed successfully before continuing. The simplest way is probably to use the $? special variable, which indicates the status of the last executed child process, after the call to popen. This variable is an instance of Process::Status. You could do something like this:
output = IO.popen(command + " 2>&1") do |io|
  io.read
end

unless $?.success?
  # Handle the error however you feel is best, e.g.
  puts "Tika had an error, the message was:\n#{output}"
  raise "Tika error"
end
For more control you could look at the Open3 module in the standard library. Since Tika is a Java program, another possibility might be to look into using JRuby and calling it directly.
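As a sketch of the Open3 route (run_checked is a made-up helper name; Open3.capture2e merges stderr into the captured output and returns a Process::Status, so no manual 2>&1 redirection is needed):

```ruby
require 'open3'

# Run a command, capturing stdout and stderr together, and raise
# if the child did not exit successfully.
def run_checked(command)
  output, status = Open3.capture2e(command)
  unless status.success?
    raise "Command failed, the message was:\n#{output}"
  end
  output
end
```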
Stopping IO.popen in the middle of execution using Exceptions given a bad condition
So - disclaimer: I am using Ruby 1.8.6 on Windows. It is the only Ruby that the software I use currently supports, so there may be a more elegant solution. Overall, it came down to making sure the process died, using Process.kill, before continuing execution.
IO.popen(cmdLineExecution) do |stream|
  stream.each do |line|
    puts line
    begin
      # if it finds an error, throws an exception
      analyzeLine(line)
    rescue CorrectionException
      # if it was able to handle the error
      puts "Handled the exception successfully"
      Process.kill("KILL", stream.pid) # stop the system process
    rescue CorrectionFailedException => failedEx
      # not able to handle the error
      puts "Failed handling the exception"
      Process.kill("KILL", stream.pid) # stop the system process
      raise "Was unable to make a known correction to the running environment: #{failedEx.message}"
    end
  end
end
I made both of the exceptions standard classes that inherit from Exception.
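For reference, a minimal sketch of defining exception classes like these (the names are illustrative; note that Ruby class names must start with an uppercase letter, and inheriting from StandardError rather than Exception is generally preferred, since a bare rescue only catches StandardError and its subclasses):

```ruby
# Raised when line analysis detects a problem it can repair.
class CorrectionException < StandardError; end

# Raised when line analysis detects a problem it cannot repair.
class CorrectionFailedException < StandardError; end

begin
  raise CorrectionException, 'recoverable problem'
rescue CorrectionException => e
  puts "Handled the exception successfully: #{e.message}"
end
```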
How to detect if shell failed to execute a command after popen call? Not to confuse with the command exit status
General comments about when to use errno
No standard C or POSIX library function ever sets errno to zero. Printing an error message based on errno when fd is not NULL is not appropriate; the error number is not from popen() (or is not set because popen() failed). Printing res after pclose() is OK; adding strerror(errno) runs into the same problem (the information in errno may be entirely irrelevant).

You can set errno to zero before calling a function. If the function returns a failure indication, it may be relevant to look at errno (check the specification of the function: is it defined to set errno on failure?). However, errno can be set non-zero by a function even if it succeeds. Solaris standard I/O used to set errno = ENOTTY if the output stream was not connected to a terminal, even though the operation succeeded; it probably still does. And Solaris setting errno even on success is perfectly legitimate; it is only legitimate to look at errno if (1) the function reports failure and (2) the function is documented to set errno (by POSIX or by the system manual).
See C11 §7.5 Errors <errno.h> ¶3:
The value of errno in the initial thread is zero at program startup (the initial value of errno in other threads is an indeterminate value), but is never set to zero by any library function.202) The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use of errno is not documented in the description of the function in this International Standard.
202) Thus, a program that uses errno for error checking should set it to zero before a library function call, then inspect it before a subsequent library function call. Of course, a library function can save the value of errno on entry and then set it to zero, as long as the original value is restored if errno's value is still zero just before the return.
POSIX is similar (errno):

Many functions provide an error number in errno, which has type int and is defined in <errno.h>. The value of errno shall be defined only after a call to a function for which it is explicitly stated to be set and until it is changed by the next function call or if the application assigns it a value. The value of errno should only be examined when it is indicated to be valid by a function's return value. Applications shall obtain the definition of errno by the inclusion of <errno.h>. No function in this volume of POSIX.1-2017 shall set errno to 0. The setting of errno after a successful call to a function is unspecified unless the description of that function specifies that errno shall not be modified.
popen() and pclose()

The POSIX specification for popen() is not dreadfully helpful. There's only one circumstance under which popen() 'must fail'; everything else is 'may fail'. However, the details for pclose() are much more helpful, including:

If the command language interpreter cannot be executed, the child termination status returned by pclose() shall be as if the command language interpreter terminated using exit(127) or _exit(127).
and
Upon successful return, pclose() shall return the termination status of the command language interpreter. Otherwise, pclose() shall return -1 and set errno to indicate the error.

That means that pclose() returns the value it received from waitpid(): the exit status from the command that was invoked. Note that it must use waitpid() (or an equivalently selective function; hunt for wait3() and wait4() on BSD systems); it is not authorized to wait for any other child processes than the one created by popen() for this file stream. There are prescriptions that pclose() must be sure that the child has exited, even if some other function waited on the dead child in the interim and thereby caused the system to lose the status for the child created by popen().
If you interpret decimal 32512 as hexadecimal, you get 0x7F00. And if you used the WIFEXITED and WEXITSTATUS macros from <sys/wait.h> on that, you'd find that the exit status is 127 (because 0x7F is 127 decimal, and the exit status is encoded in the high-order bits of the status returned by waitpid()).
int res = pclose(fd);  /* fd is the FILE * returned by popen() */
if (WIFEXITED(res))
    printf("Command exited with status %d (0x%.4X)\n", WEXITSTATUS(res), res);
else if (WIFSIGNALED(res))
    printf("Command exited from signal %d (0x%.4X)\n", WTERMSIG(res), res);
else
    printf("Command exited with unrecognized status 0x%.4X\n", res);
And remember that 0 is the exit status indicating success; anything else normally indicates an error of some sort. You can further analyze the exit status to look for 127 or relayed signals, etc. It's unlikely you'd get a 'signalled' status or an unrecognized status if popen() told you that the child failed.

Of course, it is possible that the executed command actually exited itself with status 127; that's unavoidably confusing, and the only way around it is to avoid exit statuses in the range 126 to 128 + 'maximum signal number' (which might mean 126..191 if there are 63 recognized signals). The value 126 is also used by POSIX to report when the interpreter specified in a shebang (#!/usr/bin/interpreter) is missing (as opposed to the program to be executed not being available). Whether that's returned by pclose() is a separate discussion. And the signal reporting is done by the shell because there's no (easy) way to report that a child died from a signal otherwise.
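The same 127 convention can be observed from Ruby, tying this back to the earlier answers: when the command string contains shell metacharacters, IO.popen hands it to the shell, and $? then carries the shell's exit(127) if the command could not be found (a small sketch; the command name is made up):

```ruby
# The 2>/dev/null redirection forces shell interpretation and
# silences the shell's "command not found" message.
IO.popen('no_such_command_xyz 2>/dev/null') { |io| io.read }
puts $?.exitstatus # => 127: the shell could not find the command
```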
Timeout within a popen works, but popen inside a timeout doesn't?
Aha, subtle.
There is a hidden, blocking ensure clause at the end of the IO.popen block in the second case. The Timeout::Error is raised in a timely fashion, but you cannot rescue it until execution returns from that implicit ensure clause.

Under the hood, IO.popen(cmd) { |io| ... } does something like this:
def my_illustrative_io_popen(cmd, &block)
  begin
    pio = IO.popen(cmd)
    block.call(pio) # This *is* interrupted...
  ensure
    pio.close # ...but then control goes here, which blocks on cmd's termination
  end
end
and the IO#close call is really more-or-less a pclose(3), which is blocking you in waitpid(2) until the sleeping child exits.
You can verify this like so:
#!/usr/bin/env ruby

require 'timeout'

BEGIN { $BASETIME = Time.now.to_i }

def xputs(msg)
  puts "%4.2f: %s" % [(Time.now.to_f - $BASETIME), msg]
end

begin
  Timeout.timeout(3) do
    begin
      xputs "popen(sleep 10)"
      pio = IO.popen("sleep 10")
      sleep 100 # or loop over pio.gets or whatever
    ensure
      xputs "Entering ensure block"
      #Process.kill 9, pio.pid # <--- This would solve your problem!
      pio.close
      xputs "Leaving ensure block"
    end
  end
rescue Timeout::Error => ex
  xputs "rescuing: #{ex}"
end
So, what can you do?
You'll have to do it the explicit way, since the interpreter doesn't expose a way to override the IO.popen ensure logic. You can use the above code as a starting template and uncomment the kill() line, for example.
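Putting that together, an explicit version might look like this (a sketch; popen_with_timeout is a made-up name, and SIGKILL gives the child no chance to clean up, which may or may not be acceptable for your command):

```ruby
require 'timeout'

# Read a command's output with a wall-clock limit. On timeout the
# child is killed first, so the pclose-style wait inside IO#close
# returns promptly instead of blocking.
def popen_with_timeout(cmd, seconds)
  pio = IO.popen(cmd)
  begin
    Timeout.timeout(seconds) { pio.read }
  rescue Timeout::Error
    Process.kill('KILL', pio.pid) # child is dead, so close won't block
    nil
  ensure
    pio.close
  end
end
```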
Output redirection to file using subprocess.Popen
I/O redirection with < and > is done by the shell. When you call subprocess.Popen() with a list as the first argument, or without shell=True, the program is executed directly, not using the shell to parse the command line. So you're executing the program and passing literal arguments < and > to it. It's as if you executed the shell command and quoted the < and > characters:
scriptname '<' infile.txt '>' outfile.txt
If you want to use the shell, you have to send a single string (just as with os.system()):
data = subprocess.Popen(" ".join([ shlex.quote(script.out), "<", shlex.quote(input_file[i]), ">", shlex.quote(output_file[i])]), shell=True)
Use shlex.quote() to escape arguments that shouldn't be treated as shell metacharacters.
Running shell command and capturing the output
In all officially maintained versions of Python, the simplest approach is to use the subprocess.check_output function:
>>> subprocess.check_output(['ls', '-l'])
b'total 0\n-rw-r--r-- 1 memyself staff 0 Mar 14 11:04 files\n'
check_output runs a single program that takes only arguments as input.1 It returns the result exactly as printed to stdout. If you need to write input to stdin, skip ahead to the run or Popen sections. If you want to execute complex shell commands, see the note on shell=True at the end of this answer.

The check_output function works in all officially maintained versions of Python. But for more recent versions, a more flexible approach is available.
Modern versions of Python (3.5 or higher): run
If you're using Python 3.5+, and do not need backwards compatibility, the new run function is recommended by the official documentation for most tasks. It provides a very general, high-level API for the subprocess module. To capture the output of a program, pass the subprocess.PIPE flag to the stdout keyword argument. Then access the stdout attribute of the returned CompletedProcess object:
>>> import subprocess
>>> result = subprocess.run(['ls', '-l'], stdout=subprocess.PIPE)
>>> result.stdout
b'total 0\n-rw-r--r-- 1 memyself staff 0 Mar 14 11:04 files\n'
The return value is a bytes object, so if you want a proper string, you'll need to decode it. Assuming the called process returns a UTF-8-encoded string:
>>> result.stdout.decode('utf-8')
'total 0\n-rw-r--r-- 1 memyself staff 0 Mar 14 11:04 files\n'
This can all be compressed to a one-liner if desired:
>>> subprocess.run(['ls', '-l'], stdout=subprocess.PIPE).stdout.decode('utf-8')
'total 0\n-rw-r--r-- 1 memyself staff 0 Mar 14 11:04 files\n'
If you want to pass input to the process's stdin, you can pass a bytes object to the input keyword argument:
>>> cmd = ['awk', 'length($0) > 5']
>>> ip = 'foo\nfoofoo\n'.encode('utf-8')
>>> result = subprocess.run(cmd, stdout=subprocess.PIPE, input=ip)
>>> result.stdout.decode('utf-8')
'foofoo\n'
You can capture errors by passing stderr=subprocess.PIPE (capture to result.stderr) or stderr=subprocess.STDOUT (capture to result.stdout along with regular output). If you want run to throw an exception when the process returns a nonzero exit code, you can pass check=True. (Or you can check the returncode attribute of result above.) When security is not a concern, you can also run more complex shell commands by passing shell=True as described at the end of this answer.
Later versions of Python streamline the above further. In Python 3.7+, the above one-liner can be spelled like this:
>>> subprocess.run(['ls', '-l'], capture_output=True, text=True).stdout
'total 0\n-rw-r--r-- 1 memyself staff 0 Mar 14 11:04 files\n'
Using run this way adds just a bit of complexity, compared to the old way of doing things. But now you can do almost anything you need to do with the run function alone.
Older versions of Python (3-3.4): more about check_output
If you are using an older version of Python, or need modest backwards compatibility, you can use the check_output function as briefly described above. It has been available since Python 2.7.
subprocess.check_output(*popenargs, **kwargs)
It takes the same arguments as Popen (see below), and returns a string containing the program's output. The beginning of this answer has a more detailed usage example. In Python 3.5+, check_output is equivalent to executing run with check=True and stdout=PIPE, and returning just the stdout attribute.
You can pass stderr=subprocess.STDOUT to ensure that error messages are included in the returned output. When security is not a concern, you can also run more complex shell commands by passing shell=True as described at the end of this answer.

If you need to pipe from stderr or pass input to the process, check_output won't be up to the task. See the Popen examples below in that case.
Complex applications and legacy versions of Python (2.6 and below): Popen
If you need deep backwards compatibility, or if you need more sophisticated functionality than check_output or run provide, you'll have to work directly with Popen objects, which encapsulate the low-level API for subprocesses.

The Popen constructor accepts either a single command without arguments, or a list containing a command as its first item, followed by any number of arguments, each as a separate item in the list. shlex.split can help parse strings into appropriately formatted lists. Popen objects also accept a host of different arguments for process IO management and low-level configuration.

To send input and capture output, communicate is almost always the preferred method. As in:
output = subprocess.Popen(["mycmd", "myarg"],
stdout=subprocess.PIPE).communicate()[0]
Or
>>> import subprocess
>>> p = subprocess.Popen(['ls', '-a'], stdout=subprocess.PIPE,
... stderr=subprocess.PIPE)
>>> out, err = p.communicate()
>>> print out
.
..
foo
If you set stdin=PIPE, communicate also allows you to pass data to the process via stdin:
>>> cmd = ['awk', 'length($0) > 5']
>>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
... stderr=subprocess.PIPE,
... stdin=subprocess.PIPE)
>>> out, err = p.communicate('foo\nfoofoo\n')
>>> print out
foofoo
Note Aaron Hall's answer, which indicates that on some systems, you may need to set stdout, stderr, and stdin all to PIPE (or DEVNULL) to get communicate to work at all.

In some rare cases, you may need complex, real-time output capturing. Vartec's answer suggests a way forward, but methods other than communicate are prone to deadlocks if not used carefully.

As with all the above functions, when security is not a concern, you can run more complex shell commands by passing shell=True.
Notes
1. Running shell commands: the shell=True argument
Normally, each call to run, check_output, or the Popen constructor executes a single program. That means no fancy bash-style pipes. If you want to run complex shell commands, you can pass shell=True, which all three functions support. For example:
>>> subprocess.check_output('cat books/* | wc', shell=True, text=True)
' 1299377 17005208 101299376\n'
However, doing this raises security concerns. If you're doing anything more than light scripting, you might be better off calling each process separately, and passing the output from each as an input to the next, via
run(cmd, [stdout=etc...], input=other_output)
Or
Popen(cmd, [stdout=etc...]).communicate(other_output)
The temptation to directly connect pipes is strong; resist it. Otherwise, you'll likely see deadlocks or have to do hacky things like this.