Understanding Python subprocess.check_output's First Argument and shell=True

Actual meaning of 'shell=True' in subprocess

The benefit of not calling via the shell is that you are not invoking a 'mystery program.' On POSIX, subprocess runs /bin/sh when shell=True is given, and which shell that actually is (bash, dash, or something else) varies from system to system. On Windows, there is no Bourne shell descendant, only cmd.exe.

So invoking the shell invokes a program that is platform-dependent and not under your control. Generally speaking, avoid invocations via the shell.

Invoking via the shell does allow you to expand environment variables and file globs according to the shell's usual mechanism. On POSIX systems, the shell expands file globs to a list of files. On Windows, a file glob (e.g., "*.*") is not expanded by the shell, anyway (but environment variables on a command line are expanded by cmd.exe).
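If expansion is all you want from the shell, the standard library can do both jobs without one. The following is a minimal sketch, not a drop-in replacement for shell semantics; the helper name expand_args is hypothetical and exists only for illustration.

```python
import glob
import os


def expand_args(patterns):
    """Expand $VARS and file globs in Python, returning a flat argument list."""
    expanded = []
    for pattern in patterns:
        # Replaces $NAME and ${NAME}; unknown variables are left unchanged
        # (a real shell would substitute an empty string instead).
        pattern = os.path.expandvars(pattern)
        matches = sorted(glob.glob(pattern))
        # Like a POSIX shell, keep the pattern literally when nothing matches.
        expanded.extend(matches if matches else [pattern])
    return expanded
```

The result can be appended to an argument list, for example subprocess.check_output(["wc", "-l"] + expand_args(["$HOME/*.txt"])), so the file names reach the program as plain data and no shell is started.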

If you think you want environment variable expansions and file globs, research the IFS (internal field separator) attacks of 1992-ish on network services which performed subprogram invocations via the shell. Examples include the various sendmail backdoors involving IFS.

In summary, use shell=False.

How to avoid shell=True in subprocess

Just pass the arguments to check_output() as a list:

subprocess.check_output(["md5", "Downloads/test.txt"], stderr=subprocess.STDOUT)

From the docs:

args is required for all calls and should be a string, or a sequence
of program arguments. Providing a sequence of arguments is generally
preferred, as it allows the module to take care of any required
escaping and quoting of arguments (e.g. to permit spaces in file
names). If passing a single string, either shell must be True (see
below) or else the string must simply name the program to be executed
without specifying any arguments.
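If you start from a command written as a single string, shlex.split from the standard library turns it into such a sequence using shell-like quoting rules. A small sketch, with a made-up file name containing a space:

```python
import shlex

# The quoted path stays together as one argument.
command = "md5 'Downloads/my test.txt'"
args = shlex.split(command)
print(args)  # ['md5', 'Downloads/my test.txt']
```

Note that shlex.split only handles quoting. It does not interpret pipes, redirections, globs, or variables, so it is not a substitute for a shell when the string relies on those.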

Subprocess & Python what am I doing wrong?

The big thing you're running into here is that iw dev wlan0 link | grep signal | awk '{print $2}' isn't one process.

It's three processes, connected by pipes: iw, whose output is piped to grep, whose output is piped to awk.

Notably, the thing creating those pipes when you run this command is the shell - bash, most likely - and it's doing so based on the command you passed into it.

Python's subprocess functions, however, expect a list whose first item is the executable to run and whose remaining items are the arguments for that executable. Doing things the Python way, then, would look roughly like: subprocess.check_output(["/usr/sbin/iw", "dev", "wlan0"...]). This is why Python's confused: without shell=True, a single string is treated as the name of a program to run, and the big long string you passed to it isn't one.

However, you can tell python that you're passing it a big long shell command using an argument - the boolean shell arg.

Try this:

# note the "shell=True" bit
output_bytes = subprocess.check_output("iw dev wlan0 link | grep signal | awk '{print $2}'", shell=True)

and you should see it work. Python will take your command and hand it to a shell, which will execute it correctly, instead of trying to use the string you passed as the filename of an executable to run.
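If you would rather not involve a shell at all, one option is to run only iw as a child process and do the grep and awk steps in Python. This is a rough sketch under assumptions: both function names are hypothetical, and the filtering only approximates what grep signal | awk '{print $2}' does for plain substrings.

```python
import subprocess


def second_field_of_matching_lines(text, needle):
    """Rough equivalent of: grep needle | awk '{print $2}'."""
    fields = []
    for line in text.splitlines():
        if needle in line:
            parts = line.split()  # awk-style splitting on runs of whitespace
            # awk prints an empty line when there is no second field
            fields.append(parts[1] if len(parts) > 1 else "")
    return fields


def wifi_signal(interface="wlan0"):
    # Only iw runs as a child process; no shell is involved.
    output = subprocess.check_output(["iw", "dev", interface, "link"], text=True)
    return second_field_of_matching_lines(output, "signal")
```

The trade-off is a few more lines of Python in exchange for one child process instead of four (shell, iw, grep, awk) and no shell parsing of the command.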

How does subprocess.call() work with shell=False?

UNIX programs start each other with the following three calls, or derivatives/equivalents thereto:

  • fork() - Create a new copy of yourself.
  • exec() - Replace yourself with a different program (do this if you're the copy!).
  • wait() - Wait for another process to finish (optional, if not running in background).

Thus, with shell=False, you do just that (sketched below with Python's os module; leave out the wait if the invocation is not a blocking one such as subprocess.call()):

import os

pid = os.fork()
if pid == 0:  # we're the child process, not the parent
    os.execlp("ls", "ls", "-l")
else:
    _, status = os.waitpid(pid, 0)  # we're the parent; wait for the child to exit & get its exit status

whereas with shell=True, you do this:

import os

pid = os.fork()
if pid == 0:
    os.execlp("sh", "sh", "-c", "ls -l")
else:
    _, status = os.waitpid(pid, 0)

Note that with shell=False, the command we executed was ls, whereas with shell=True, the command we executed was sh.


That is to say:

subprocess.Popen(foo, shell=True)

is effectively the same (on POSIX) as:

subprocess.Popen(
    ["sh", "-c"] + ([foo] if isinstance(foo, str) else list(foo)),
    shell=False)

That is to say, you execute a copy of /bin/sh, and direct that copy of /bin/sh to parse the string into an argument list and execute ls -l itself.
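The equivalence can be checked directly on a POSIX system. The sketch below assumes sh and echo are available on the PATH and compares the two spellings, including the less obvious case where a list is combined with shell=True.

```python
import subprocess

# A string with shell=True ...
via_flag = subprocess.check_output("echo hello", shell=True)
# ... versus spelling out the shell invocation ourselves.
via_list = subprocess.check_output(["sh", "-c", "echo hello"])

# With a list and shell=True, the first item is the script and the extra
# items become the shell's own arguments ($0, $1, ...), not arguments of
# the command inside the script.
positional = subprocess.check_output(['echo "$0-$1"', "a", "b"], shell=True)
```

Here via_flag and via_list hold the same bytes, and positional shows that "a" and "b" were handed to the shell rather than to echo.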


So, why would you use shell=True?

  • You're invoking a shell builtin.

    For instance, the exit command is actually part of the shell itself, rather than an external command. That said, this is a fairly small set of commands, and it's rare for them to be useful in the context of a shell instance that only exists for the duration of a single subprocess.call() invocation.

  • You have some code with shell constructs (e.g., redirections) that would be difficult to emulate without it.

    If, for instance, your command is cat one two >three, the syntax >three is a redirection: It's not an argument to cat, but an instruction to the shell to set stdout=open('three', 'w') when running the command ['cat', 'one', 'two']. If you don't want to deal with redirections and pipelines yourself, you need a shell to do it.

    A slightly trickier case is cat foo bar | baz. To do that without a shell, you need to start both sides of the pipeline yourself: p1 = Popen(['cat', 'foo', 'bar'], stdout=PIPE), p2=Popen(['baz'], stdin=p1.stdout).

  • You don't give a damn about security bugs.

    ...okay, that's a little bit too strong, but not by much. Using shell=True is dangerous. You can't do this: Popen('cat -- %s' % (filename,), shell=True) without a shell injection vulnerability: If your code were ever invoked with a filename containing $(rm -rf ~), you'd have a very bad day. On the other hand, ['cat', '--', filename] is safe with all possible filenames: The filename is purely data, not parsed as source code by a shell or anything else.

    It is possible to write safe scripts in shell, but you need to be careful about it. Consider the following:

    filenames = ['file1', 'file2'] # these can be user-provided
    subprocess.Popen(['cat -- "$@" | baz', '_'] + filenames, shell=True)

    That code is safe (well -- as safe as letting a user read any file they want ever is), because it's passing your filenames out-of-band from your script code -- but it's safe only because the string being passed to the shell is fixed and hardcoded, and the parameterized content is external variables (the filenames list). And even then, it's "safe" only to a point -- a bug like Shellshock that triggers on shell initialization would impact it as much as anything else.
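The shell-free pipeline from the second bullet, combined with the point about filenames being pure data, can be sketched as follows. The helper name run_pipeline is hypothetical, and the two child commands are small Python one-liners standing in for real programs such as cat and baz, chosen only so the example runs wherever Python does.

```python
import subprocess
import sys


def run_pipeline(producer_args, consumer_args):
    """Run `producer | consumer` with no shell; return the consumer's stdout."""
    producer = subprocess.Popen(producer_args, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_args, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Drop our copy of the pipe so the producer sees a closed pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


# Stand-ins for real programs: one echoes its argument, one uppercases stdin.
ECHO_ARG = [sys.executable, "-c", "import sys; print(sys.argv[1])"]
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# A hostile-looking value is passed through as literal text, never evaluated.
hostile = "$(echo injected)"
result = run_pipeline(ECHO_ARG + [hostile], UPPERCASE)
```

Because no shell ever parses the value, the command substitution in hostile is never run; the text simply travels down the pipe and comes back uppercased.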

What is the difference between subprocess.run & subprocess.check_output?

This follows from the Python documentation for the run method: its first parameter, args, should be a sequence of program arguments rather than a single multi-word string (unless shell=True is also given).

So you can try passing the arguments in a list as:

result = subprocess.run(['adb', 'devices'], capture_output=True, text=True)

Also, check_output accepts args in the same way. A multi-word command string works there when the call passes shell=True.

If you want to use the run method without a list, pass shell=True to run as well. (I tried this with the "man ls" command and it worked.)
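To compare the two functions side by side, here is a short sketch that runs the same list-form command through both. The child command is a Python one-liner used as a stand-in so the example does not depend on any particular external tool.

```python
import subprocess
import sys

cmd = [sys.executable, "-c", "print('hello')"]

# run() returns a CompletedProcess; output is captured only on request.
completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
from_run = completed.stdout

# check_output() returns the captured stdout directly and raises
# CalledProcessError on a non-zero exit status.
from_check_output = subprocess.check_output(cmd, text=True)
```

Both calls take the same args and shell parameters; the difference is mainly in what they return and in whether a failing exit status raises by default.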

Correct incantation of subprocess with shell=True to get output and not hang

First off -- there's very little point to passing an array to:

subprocess.check_output(['/usr/bin/wc','-l','A-Z*/A-Z*.F*'], shell=True)

...as this simply runs wc with no arguments, in a shell that is itself passed -l and A-Z*/A-Z*.F* as arguments (to the shell, not to wc). Instead, you want:

subprocess.check_output('/usr/bin/wc -l A-Z*/A-Z*.F*', shell=True)

Before being corrected, this would hang because wc had no arguments and was reading from stdin. I would suggest ensuring that stdin is passed in closed, rather than passing along your Python program's stdin (as is the default behavior).

An easy way to do this, since you have shell=True:

subprocess.check_output(
    '/usr/bin/wc -l A-Z*/A-Z*.F* </dev/null',
    shell=True)

...alternately:

p = subprocess.Popen('/usr/bin/wc -l A-Z*/A-Z*.F*', shell=True,
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=None)
(output, _) = p.communicate(input=b'')

...which will ensure an empty stdin from Python code rather than relying on the shell.
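A third option, available in Python 3, is to pass stdin=subprocess.DEVNULL, which gives the child an empty stdin without a pipe or a shell redirection. A minimal sketch, using a Python one-liner as a stand-in child that reports how much it could read from stdin:

```python
import subprocess
import sys

# Stand-in child: prints the number of characters it read from stdin.
count_stdin = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.read()))"]

# DEVNULL means the child sees end-of-file immediately instead of
# inheriting (and possibly blocking on) the parent's stdin.
out = subprocess.check_output(count_stdin, stdin=subprocess.DEVNULL)
```

The same keyword works with shell=True, so it can be added to the wc call above without changing the command string.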


