Difference between subprocess.Popen and os.system

If you check out the subprocess section of the Python docs, you'll notice there is an example of how to replace os.system() with subprocess.Popen():

sts = os.system("mycmd" + " myarg")

...does the same thing as...

sts = Popen("mycmd" + " myarg", shell=True).wait()

The "improved" code looks more complicated, but it's better because once you know subprocess.Popen(), you don't need anything else. subprocess.Popen() replaces several other tools (os.system() is just one of those) that were scattered throughout three other Python modules.

If it helps, think of subprocess.Popen() as a very flexible os.system().
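As a rough sketch of that equivalence, the snippet below runs the same shell command both ways. The command "exit 3" is only a stand-in for the hypothetical "mycmd myarg", chosen because both POSIX sh and Windows cmd understand it; the helper names are made up for illustration.

```python
import os
import subprocess

# Stand-in for "mycmd myarg": a shell command that exits with status 3.
COMMAND = "exit 3"


def exit_code_from_system(command):
    """Run the command through os.system and return its exit code."""
    status = os.system(command)
    # On POSIX, os.system returns an encoded wait status that has to be
    # decoded; on Windows it returns the exit code directly.
    return os.WEXITSTATUS(status) if os.name == "posix" else status


def exit_code_from_popen(command):
    """Run the same command through Popen with shell=True."""
    # Popen.wait() already returns the decoded exit code on every platform.
    return subprocess.Popen(command, shell=True).wait()


if __name__ == "__main__":
    print(exit_code_from_system(COMMAND), exit_code_from_popen(COMMAND))
```

Note the small practical difference: the Popen version hands back the exit code directly, while the os.system result needs platform-specific decoding.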

Advantages of subprocess over os.system

First of all, you are cutting out the middleman; subprocess.call by default avoids spawning a shell that examines your command, and directly spawns the requested process. This is important because, besides the efficiency side of the matter, you don't have much control over the default shell behavior, and it actually typically works against you regarding escaping.

In particular, do not do this:

subprocess.call('netsh interface set interface "Wi-Fi" enable')

since

If passing a single string, either shell must be True (see below) or else the string must simply name the program to be executed without specifying any arguments.

Instead, you'll do:

subprocess.call(["netsh", "interface", "set", "interface", "Wi-Fi", "enable"])

Notice that here all the escaping nightmares are gone. subprocess handles escaping (if the OS wants arguments as a single string - such as Windows) or passes the separated arguments straight to the relevant syscall (execvp on UNIX).

Compare this with having to handle the escaping yourself, especially in a cross-platform way (cmd doesn't escape in the same way as POSIX sh), and especially with the shell in the middle messing with your stuff (trust me, you don't want to know what an unholy mess it is to provide 100% safe escaping for your command when calling cmd /k).
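To illustrate the point about escaping, here is a small sketch that passes an argument full of shell metacharacters as a list element. It assumes a Python interpreter is available as sys.executable and uses it as a stand-in child program; echo_argument is a hypothetical helper, not part of subprocess.

```python
import subprocess
import sys


def echo_argument(argument):
    """Return the argument exactly as the child process received it."""
    # The child simply writes its first command-line argument back out.
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    output = subprocess.check_output([sys.executable, "-c", child_code, argument])
    return output.decode()


if __name__ == "__main__":
    # Quotes, ampersands, dollar signs and globs would all need escaping
    # if a shell were involved; as a list element they need none.
    tricky = 'Wi-Fi "home" & $HOME; *'
    print(echo_argument(tricky) == tricky)
```

Because no shell sits in the middle, the child receives the string unchanged and nothing in it is expanded or interpreted.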

Also, when using subprocess without the shell in the middle, you can be sure you are getting correct return codes. If launching the process fails, you get a Python exception; if you get a return code, it is actually the return code of the launched program. With os.system you have no way to know whether the return code you get comes from the launched command (which is generally the default behavior if the shell manages to launch it) or is some error from the shell (if it didn't manage to launch it).
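A minimal sketch of that distinction follows. The launch helper and the program name "no-such-program-for-this-demo" are hypothetical, and sys.executable is used as a child that is known to exist.

```python
import subprocess
import sys


def launch(args):
    """Report whether the program ran (with its exit code) or never started."""
    try:
        code = subprocess.call(args)
    except OSError as exc:
        # Raised when the program cannot be started at all,
        # e.g. FileNotFoundError for a missing executable.
        return ("not started", type(exc).__name__)
    return ("exit", code)


if __name__ == "__main__":
    # The program starts and fails on its own terms: we see its real exit code.
    print(launch([sys.executable, "-c", "import sys; sys.exit(5)"]))
    # The program cannot be found: we get an exception, not a fake exit code.
    print(launch(["no-such-program-for-this-demo"]))
```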


Besides arguments splitting/escaping and return code, you have way better control over the launched process. Even with subprocess.call (which is the most basic utility function over subprocess functionalities) you can redirect stdin, stdout and stderr, possibly communicating with the launched process. check_call is similar and it avoids the risk of ignoring a failure exit code. check_output covers the common use case of check_call + capturing all the program output into a string variable.
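For instance, check_output and check_call can be used roughly as follows. This is only a sketch: the helper names are invented, and the child programs are small Python one-liners run through sys.executable.

```python
import subprocess
import sys


def python_command(code):
    """Build an argument list that runs a Python snippet in a child process."""
    return [sys.executable, "-c", code]


def captured_output():
    """Capture everything the child prints, as text."""
    return subprocess.check_output(
        python_command("print('hello')"), universal_newlines=True
    )


def failing_check():
    """Return the exit code reported when check_call detects a failure."""
    try:
        subprocess.check_call(python_command("import sys; sys.exit(2)"))
    except subprocess.CalledProcessError as exc:
        # A non-zero exit status cannot be silently ignored.
        return exc.returncode
    return 0


if __name__ == "__main__":
    print(captured_output().strip(), failing_check())
```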

Once you get past call & friends (which block just like os.system), there are far more powerful facilities - in particular, the Popen object allows you to work with the launched process asynchronously. You can start it, possibly talk to it through the redirected streams, check from time to time whether it is still running while doing other stuff, wait for it to complete, send signals to it and kill it - all of which goes well beyond the mere synchronous "start process with default stdin/stdout/stderr through the shell and wait for it to finish" that os.system provides.
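Here is a small sketch of the asynchronous style, assuming a child that just sleeps for half a second. The function name and the tick counter are illustrative only; the counter stands in for whatever other work the parent wants to do.

```python
import subprocess
import sys
import time


def run_in_background():
    """Start a child, keep working while it runs, then collect its exit code."""
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(0.5)"])
    ticks = 0
    # poll() returns None while the child is still running.
    while proc.poll() is None:
        ticks += 1          # placeholder for useful work in the parent
        time.sleep(0.05)
    return ticks, proc.returncode


if __name__ == "__main__":
    ticks, code = run_in_background()
    print("did", ticks, "units of work; child exited with", code)
```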


So, to sum it up, with subprocess:

  • even at the most basic level (call & friends), you:
    • avoid escaping problems by passing a Python list of arguments;
    • avoid the shell messing with your command line;
    • get either an exception or the true exit code of the process you launched, with no confusion between program and shell exit codes;
    • have the possibility to capture stdout and in general redirect the standard streams;
  • when you use Popen:
    • you aren't restricted to a synchronous interface, but can actually do other stuff while the subprocess runs;
    • you can control the subprocess (check if it is running, communicate with it, kill it).

Given that subprocess does way more than os.system can do - and in a safer, more flexible (if you need it) way - there's just no reason to use system instead.

What is the difference between subprocess.Popen() and os.fork()?

subprocess.Popen lets you execute an arbitrary program/command/executable/whatever in its own process.

os.fork only allows you to create a child process that will execute the same script from the exact line in which you called it. As its name suggests, it "simply" forks the current process into 2.

os.fork is only available on Unix, and subprocess.Popen is cross-platform.
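The contrast can be sketched like this. Both helper names are hypothetical, the exit status 7 is arbitrary, and the fork variant only works where os.fork exists (Unix).

```python
import os
import subprocess
import sys


def child_via_fork():
    """Fork the current process; the child continues from the fork() call."""
    pid = os.fork()  # Unix only
    if pid == 0:
        # We are now in the child, running the same program at the same point.
        os._exit(7)  # leave immediately without running parent cleanup code
    # We are in the parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


def child_via_popen():
    """Start a separate program (here another Python) in its own process."""
    return subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(7)"]).wait()


if __name__ == "__main__":
    print(child_via_popen())
    if hasattr(os, "fork"):
        print(child_via_fork())
```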

The Difference between os.system and subprocess calls

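The difference between os.system and subprocess.Popen is that os.system hands the command to a subshell and blocks until it finishes, returning only its exit status, while Popen starts the process, returns immediately, and can connect pipes to its standard streams when you pass subprocess.PIPE. subprocess.call blocks like os.system, but it starts a subshell only if you pass shell=True. Windows only partially supports some of the pipe and shell features that *nix operating systems offer, but the difference is fundamentally the same. os.system doesn't let you communicate with the standard input and output of another process the way a pipe does.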

What you probably want is to keep using subprocess as you are, but then call the kill() method (see the docs) on the Popen object before your application terminates. That lets you decide when the process is terminated. You might also need to satisfy whatever I/O the process wants to do by calling pipe.communicate() (where pipe is the Popen object) and closing the pipe's file handles.
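A minimal sketch of that shutdown sequence, assuming a long-running child whose stdout is piped; the function name and the 60-second sleep are placeholders.

```python
import subprocess
import sys


def start_then_kill():
    """Start a long-running child, then kill it and clean up its pipe."""
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        stdout=subprocess.PIPE,
    )
    proc.kill()                      # we decide when the child stops
    output, _ = proc.communicate()   # drain any output and reap the process
    return output, proc.returncode


if __name__ == "__main__":
    output, code = start_then_kill()
    # The exit code is non-zero because the child was killed
    # (a negative signal number on POSIX).
    print(repr(output), code)
```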

Are os.system() and subprocess.call() any different?

If you're running Python (CPython) on Windows, os.system calls _wsystem under the hood, while on a non-Windows OS it calls system.

subprocess.call, on the other hand, uses CreateProcess on Windows and _posixsubprocess.fork_exec on POSIX-based operating systems.

The above points should answer your questions about the main differences (structurally)... That said, I'd suggest you follow the most important advice from the os.system docs, which is:

The subprocess module provides more powerful facilities for spawning
new processes and retrieving their results; using that module is
preferable to using this function. See the Replacing Older Functions
with the subprocess Module section in the subprocess documentation for
some helpful recipes.

When should I use subprocess.Popen instead of os.popen?

Short answer: Never use os.popen, always use subprocess!

As you can see from the Python 2.7 os.popen docs:

Deprecated since version 2.6: This function is obsolete. Use the
subprocess module. Check especially the Replacing Older Functions
with the subprocess Module section.

There were various limitations and problems with the old os.popen family of functions. And as the docs mention, the pre 2.6 versions weren't even reliable on Windows.

The motivation behind subprocess is explained in PEP 324 -- subprocess - New process module:

Motivation

Starting new processes is a common task in any programming language,
and very common in a high-level language like Python. Good support for
this task is needed, because:

  • Inappropriate functions for starting processes could mean a
    security risk: If the program is started through the shell, and
    the arguments contain shell meta characters, the result can be
    disastrous. [1]

  • It makes Python an even better replacement language for
    over-complicated shell scripts.


Currently, Python has a large number of different functions for
process creation. This makes it hard for developers to choose.

The subprocess module provides the following enhancements over
previous functions:

  • One "unified" module provides all functionality from previous
    functions.

  • Cross-process exceptions: Exceptions happening in the child
    before the new process has started to execute are re-raised in
    the parent. This means that it's easy to handle exec()
    failures, for example. With popen2, for example, it's
    impossible to detect if the execution failed.

  • A hook for executing custom code between fork and exec. This
    can be used for, for example, changing uid.

  • No implicit call of /bin/sh. This means that there is no need
    for escaping dangerous shell meta characters.

  • All combinations of file descriptor redirection is possible.
    For example, the "python-dialog" [2] needs to spawn a process
    and redirect stderr, but not stdout. This is not possible with
    current functions, without using temporary files.

  • With the subprocess module, it's possible to control if all open
    file descriptors should be closed before the new program is
    executed.

  • Support for connecting several subprocesses (shell "pipe").

  • Universal newline support.

  • A communicate() method, which makes it easy to send stdin data
    and read stdout and stderr data, without risking deadlocks.
    Most people are aware of the flow control issues involved with
    child process communication, but not all have the patience or
    skills to write a fully correct and deadlock-free select loop.
    This means that many Python applications contain race
    conditions. A communicate() method in the standard library
    solves this problem.

Please see the PEP link for the Rationale, and further details.
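As an illustration of the "connecting several subprocesses" point from the list above, here is a sketch of a two-stage pipeline built without a shell. The two stages are tiny Python one-liners run through sys.executable, chosen only so the example is self-contained.

```python
import subprocess
import sys


def pipeline():
    """Equivalent of the shell pipeline `producer | consumer`, without a shell."""
    producer = subprocess.Popen(
        [sys.executable, "-c", "print('alpha'); print('beta')"],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=producer.stdout,       # connect the two processes
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # Close our copy so the producer is notified if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


if __name__ == "__main__":
    print(pipeline())
```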

Aside from the safety & reliability issues, IMHO, the old os.popen family was cumbersome and confusing. It was almost impossible to use correctly without closely referring to the docs while you were coding. In comparison, subprocess is a godsend, although it's still wise to refer to the docs while using it. ;)

Occasionally, one sees people recommending the use of os.popen rather than subprocess.Popen in Python 2.7 (e.g. in "Python subprocess vs os.popen overhead") because it's faster. Sure, it's faster, but that's because it doesn't do various things that are vital to guarantee that it's working safely!


FWIW, os.popen itself still exists in Python 3, but it's safely implemented via subprocess.Popen, so you might as well just use subprocess.Popen directly yourself. The other members of the os.popen family no longer exist in Python 3. The os.spawn family of functions still exists in Python 3, but the docs recommend that the more powerful facilities provided by the subprocess module be used instead.

What is the difference between subprocess.Popen and subprocess.run

subprocess.run() was added in Python 3.5 as a simplification over subprocess.Popen for when you just want to execute a command and wait until it finishes, without doing anything else in the meantime. For other cases, you still need to use subprocess.Popen.

The main difference is that subprocess.run() executes a command and waits for it to finish, while with subprocess.Popen you can continue doing your stuff while the process runs and then call Popen.communicate() yourself to send input to the process and collect its output (communicate() waits for the process to exit). Secondly, subprocess.run() returns a subprocess.CompletedProcess object.

subprocess.run() just wraps Popen and Popen.communicate() so you don't need to make a loop to pass/receive data or wait for the process to finish.

Check the official documentation for info on which parameters subprocess.run() passes to Popen and communicate().
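A short sketch of subprocess.run() in use, again with a Python one-liner as a stand-in child; run_and_capture is an illustrative name.

```python
import subprocess
import sys


def run_and_capture():
    """Run a command to completion and return the CompletedProcess."""
    return subprocess.run(
        [sys.executable, "-c", "print('done')"],
        stdout=subprocess.PIPE,      # capture stdout instead of inheriting it
        universal_newlines=True,     # decode the captured output to text
        check=True,                  # raise CalledProcessError on failure
    )


if __name__ == "__main__":
    completed = run_and_capture()
    print(completed.returncode, completed.stdout.strip())
```

Everything you need afterwards (arguments, exit code, captured output) is available as attributes of the returned object.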

What's the difference between subprocess Popen and call (how can I use them)?

There are two ways to do the redirect. Both apply to either subprocess.Popen or subprocess.call.

  1. Set the keyword argument shell = True or executable = /path/to/the/shell and specify the command just as you have it there.

  2. Since you're just redirecting the output to a file, set the keyword argument

    stdout = an_open_writeable_file_object

    where the object points to the output file.
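The second option might look roughly like this. The file is a temporary one created just for the demonstration, the child is a Python one-liner, and both function names are made up.

```python
import os
import subprocess
import sys
import tempfile


def redirect_to_file(path):
    """Run a child with its stdout redirected into the file at `path`."""
    with open(path, "w") as out:
        # No shell and no ">" needed: the child writes straight to the file.
        return subprocess.call(
            [sys.executable, "-c", "print('to file')"], stdout=out
        )


def demo():
    """Redirect into a temporary file and read the result back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        code = redirect_to_file(path)
        with open(path) as result:
            return code, result.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The same stdout argument works with subprocess.Popen if you don't want to block.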

subprocess.Popen is more general than subprocess.call.

Popen doesn't block, allowing you to interact with the process while it's running, or continue with other things in your Python program. The call to Popen returns a Popen object.

call does block. While it supports all the same arguments as the Popen constructor, so you can still set the process's output, environment variables, etc., your script waits for the program to complete, and call returns a code representing the process's exit status.

returncode = call(*args, **kwargs) 

is basically the same as calling

returncode = Popen(*args, **kwargs).wait()

call is just a convenience function. Its implementation in CPython is in subprocess.py:

def call(*popenargs, timeout=None, **kwargs):
    """Run command with arguments. Wait for command to complete or
    timeout, then return the returncode attribute.

    The arguments are the same as for the Popen constructor. Example:

    retcode = call(["ls", "-l"])
    """
    with Popen(*popenargs, **kwargs) as p:
        try:
            return p.wait(timeout=timeout)
        except:
            p.kill()
            p.wait()
            raise

As you can see, it's a thin wrapper around Popen.


