Unable to see or modify value of PYTHONHASHSEED through a module
You can set PYTHONHASHSEED in a Python script, but it has no effect on the behavior of the hash()
function - it needs to be set in the environment of the interpreter before the interpreter starts up.
How to set its value using pure Python
The trick is to pass the environment variable to the Python interpreter in a subprocess.
import random
from subprocess import call
random.seed(37)
cmd = ['python', '-c', 'print(hash("abc"))']
for i in range(5):
    # Environment values must be strings, so convert the seed with str().
    hashseed = str(random.randint(0, 4294967295))
    print('\nhashseed', hashseed)
    call(cmd, env={'PYTHONHASHSEED': hashseed})
output
hashseed 2929187283
-972692480
hashseed 393430205
2066796829
hashseed 2653501013
1620854360
hashseed 3616018455
-599248233
hashseed 3584366196
-2103216293
You can change the cmd list so that it runs the hashtest.py script shown below:
cmd = ['python', 'hashtest.py']
or, if hashtest.py is executable,
cmd = './hashtest.py'
By passing a dict as the env argument we replace the default environment that would be passed to the command. If you need access to the other environment variables, modify os.environ in the calling script instead, e.g. os.environ['PYTHONHASHSEED'] = hashseed.
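A third option is to copy the parent's environment and override only PYTHONHASHSEED for the child. The following is a minimal sketch of that idea, not the only way to do it; the helper name hash_with_seed is made up for illustration, and it assumes Python 3.7+ (for capture_output).

```python
import os
import subprocess
import sys


def hash_with_seed(text, seed):
    """Return hash(text) as computed by a child interpreter started with
    PYTHONHASHSEED set to `seed` (hypothetical helper, for illustration)."""
    # Copy the parent's environment and override only PYTHONHASHSEED.
    # Environment values must be strings, hence str(seed).
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, '-c', 'import sys; print(hash(sys.argv[1]))', text],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


if __name__ == '__main__':
    for seed in (0, 1, 2):
        print('seed', seed, 'hash', hash_with_seed('abc', seed))
```

Using sys.executable rather than a bare 'python' makes the child run under the same interpreter as the calling script, which avoids surprises when several Python versions are installed.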
How to set its value using Bash
First, we have a short Bash script pyhashtest.bsh that uses Bash's RANDOM shell variable as the seed for PYTHONHASHSEED. PYTHONHASHSEED must be exported so that the Python interpreter can see it. Then we run our Python script hashtest.py. We do this in a loop 5 times so we can see that using different seeds has an effect on the hash value.
The Python script hashtest.py reads PYTHONHASHSEED from the environment and prints it to show that it has the value we expect it to have. We then calculate & print the hash of a short string.
pyhashtest.bsh
#!/usr/bin/env bash
for((i=0; i<5; i++)); do
    n=$RANDOM
    echo "$i: Seed is $n"
    export PYTHONHASHSEED="$n"
    python hashtest.py
    echo
done
hashtest.py
#!/usr/bin/env python
import os
s = 'abc'
print('Hashseed is', os.environ['PYTHONHASHSEED'])
print('hash of s is', hash(s))
typical output
0: Seed is 9352
Hashseed is 9352
hash of s is 401719638
1: Seed is 24945
Hashseed is 24945
hash of s is -1250185385
2: Seed is 17661
Hashseed is 17661
hash of s is -571990551
3: Seed is 24313
Hashseed is 24313
hash of s is 99658978
4: Seed is 21142
Hashseed is 21142
hash of s is -662114263
To run these programs, save them both into the same directory, e.g. the usual directory you run Python scripts from. Then open a Bash shell and navigate to that directory using the cd command.
E.g., if you've saved the scripts to /mnt/sda2/fred/python then you'd do
cd /mnt/sda2/fred/python
Next, make pyhashtest.bsh executable using this command:
chmod a+x pyhashtest.bsh
Then run it with
./pyhashtest.bsh
Perl string replace with backreferenced values and shell variables
The meaning of $1 is different in the shell and in Perl.
In the shell, it means the first positional argument. As double quotes expand variables, $1 in double quotes also means the first positional argument.
In Perl, $1 means the first capture group matched by a regular expression.
But if you use $1 in double quotes on the shell level, Perl never sees it: the shell expands $1 as the first positional argument and sends the expanded string to Perl.
You can use the %ENV hash in Perl to refer to environment variables:
aaa=5 perl -i.bak -pe 's/pm.max_children\s*=\s*\K([0-9]+)/($1 * $ENV{aaa})/ge' /usr/local/etc/php-fpm.d/www.conf
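For comparison, here is a rough Python sketch of the same substitution. It is not part of the Perl answer: the function name scale_max_children is invented for illustration, and unlike the Perl one-liner it escapes the dot in pm.max_children and works on a string rather than editing the file in place.

```python
import os
import re


def scale_max_children(config_text, factor):
    """Multiply the number after 'pm.max_children =' by `factor`."""
    pattern = re.compile(r'(pm\.max_children\s*=\s*)([0-9]+)')
    # Group 1 is kept unchanged (the role \K plays in the Perl version);
    # group 2 is the captured number, i.e. what Perl calls $1.
    return pattern.sub(
        lambda m: m.group(1) + str(int(m.group(2)) * factor),
        config_text,
    )


if __name__ == '__main__':
    # Read the multiplier from the environment, like $ENV{aaa} in Perl.
    factor = int(os.environ.get('aaa', '5'))
    print(scale_max_children('pm.max_children = 10\n', factor), end='')
```

Because the multiplier is read through os.environ, no shell quoting is involved in the pattern at all, which is the same reason the %ENV approach works in Perl.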
How to substitute shell variables in complex text files
It turns out that on my system there is an envsubst command, which is part of the gettext-base package.
So, this makes it easy:
envsubst < "source.txt" > "destination.txt"
Note that if you want to use the same file for both, you'll have to use something like moreutils' sponge, as suggested by Johnny Utahh: envsubst < "source.txt" | sponge "source.txt". (Otherwise the shell redirect will empty the file before it's read.)
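If envsubst or sponge is not available, something similar can be approximated in Python. This is only a sketch under stated assumptions: the function names are made up, and string.Template is not a drop-in replacement for envsubst (for example, this version leaves unknown variables untouched instead of replacing them with an empty string, and it treats $$ as an escaped dollar sign).

```python
import os
import string
import tempfile


def substitute_env(text, env=None):
    """Replace $NAME and ${NAME} in `text` with values from `env`."""
    env = os.environ if env is None else env
    # safe_substitute leaves placeholders it cannot resolve as they are.
    return string.Template(text).safe_substitute(env)


def substitute_file_in_place(path, env=None):
    """Rewrite `path` with variables substituted, without truncating it first."""
    with open(path, encoding='utf-8') as src:
        text = src.read()  # read everything before the file is touched
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then swap it in.
    # Note: this sketch does not preserve the original file's permissions.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, 'w', encoding='utf-8') as dst:
        dst.write(substitute_env(text, env))
    os.replace(tmp_path, path)


if __name__ == '__main__':
    print(substitute_env('Home is $HOME'))
```

Reading the whole file before writing plays the same role as sponge: the source is fully consumed before the destination is replaced.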
bash : Bad Substitution
The default shell (/bin/sh) under Ubuntu points to dash, not bash.
me@pc:~$ readlink -f $(which sh)
/bin/dash
So if you chmod +x your_script_file.sh and then run it with ./your_script_file.sh, or if you run it with bash your_script_file.sh, it should work fine.
Running it with sh your_script_file.sh will not work because the hashbang line will be ignored and the script will be interpreted by dash, which does not support that string substitution syntax.
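The same check as the readlink command above can be done from Python, which may be handy inside a test or setup script. A small sketch; resolve_sh is a made-up name:

```python
import os
import shutil


def resolve_sh():
    """Return the real path that `sh` resolves to, or None if sh is not on PATH."""
    path = shutil.which('sh')
    # realpath follows symlinks, much like `readlink -f`.
    return os.path.realpath(path) if path else None


if __name__ == '__main__':
    print(resolve_sh())
```

What it prints depends on the system: the answer above reports dash on Ubuntu, and other systems may link sh to a different shell.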
What's a concise way to check that environment variables are set in a Unix shell script?
Parameter Expansion
The obvious answer is to use one of the special forms of parameter expansion:
: ${STATE?"Need to set STATE"}
: ${DEST:?"Need to set DEST non-empty"}
Or, better (see section on 'Position of double quotes' below):
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
The first variant (using just ?) requires STATE to be set, but STATE="" (an empty string) is OK; that is not exactly what you want, but it is the alternative and older notation.
The second variant (using :?) requires DEST to be set and non-empty.
If you supply no message, the shell provides a default message.
The ${var?} construct is portable back to Version 7 UNIX and the Bourne Shell (1978 or thereabouts). The ${var:?} construct is slightly more recent: I think it was in System III UNIX circa 1981, but it may have been in PWB UNIX before that. It is therefore in the Korn Shell, and in the POSIX shells, including specifically Bash.
It is usually documented in the shell's man page in a section called Parameter Expansion. For example, the bash manual says:
${parameter:?word}
Display Error if Null or Unset. If parameter is null or unset, the expansion of word (or a message to that effect if word is not present) is written to the standard error and the shell, if it is not interactive, exits. Otherwise, the value of parameter is substituted.
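To see the difference between the two forms without touching your own shell session, you can drive sh from a small Python harness. This is just an illustrative sketch: run_check, STATE_CHECK and DEST_CHECK are names invented here, and it assumes an sh is available on PATH.

```python
import os
import subprocess

# Minimal environment for the child shell; PATH is kept so that sh can be found.
BASE_ENV = {'PATH': os.environ.get('PATH', os.defpath)}

STATE_CHECK = ': "${STATE?Need to set STATE}"; echo ok'
DEST_CHECK = ': "${DEST:?Need to set DEST non-empty}"; echo ok'


def run_check(script, extra_env):
    """Run `script` with sh -c and return (exit status, stdout, stderr)."""
    env = dict(BASE_ENV, **extra_env)
    result = subprocess.run(['sh', '-c', script], env=env,
                            capture_output=True, text=True)
    return result.returncode, result.stdout.strip(), result.stderr.strip()


if __name__ == '__main__':
    cases = [
        (STATE_CHECK, {}),               # unset: the shell exits with an error
        (STATE_CHECK, {'STATE': ''}),    # set but empty: accepted by ?
        (DEST_CHECK, {'DEST': ''}),      # set but empty: rejected by :?
        (DEST_CHECK, {'DEST': '/tmp'}),  # set and non-empty: accepted
    ]
    for script, extra in cases:
        print(extra, run_check(script, extra))
```

The exact exit status and the prefix of the error message differ between shells, so the harness only distinguishes zero from non-zero.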
The Colon Command
I should probably add that the colon command simply has its arguments evaluated and then succeeds. It is the original shell comment notation (before '#' to end of line). For a long time, Bourne shell scripts had a colon as the first character. The C Shell would read a script and use the first character to determine whether it was for the C Shell (a '#' hash) or the Bourne shell (a ':' colon). Then the kernel got in on the act and added support for '#!/path/to/program' and the Bourne shell got '#' comments, and the colon convention went by the wayside. But if you come across a script that starts with a colon, now you will know why.
Position of double quotes
blong asked in a comment:
Any thoughts on this discussion? https://github.com/koalaman/shellcheck/issues/380#issuecomment-145872749
The gist of the discussion is:
… However, when I shellcheck it (with version 0.4.1), I get this message:
In script.sh line 13:
: ${FOO:?"The environment variable 'FOO' must be set and non-empty"}
^-- SC2086: Double quote to prevent globbing and word splitting.
Any advice on what I should do in this case?
The short answer is "do as shellcheck suggests":
: "${STATE?Need to set STATE}"
: "${DEST:?Need to set DEST non-empty}"
To illustrate why, study the following. Note that the : command doesn't echo its arguments (but the shell does evaluate the arguments). We want to see the arguments, so the code below uses printf "%s\n" in place of :.
$ mkdir junk
$ cd junk
$ > abc
$ > def
$ > ghi
$
$ x="*"
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
abc
def
ghi
$ unset x
$ printf "%s\n" ${x:?You must set x} # Careless; not recommended
bash: x: You must set x
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
bash: x: You must set x
$ x="*"
$ printf "%s\n" "${x:?You must set x}" # Careful: should be used
*
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
abc
def
ghi
$ x=
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$ unset x
$ printf "%s\n" ${x:?"You must set x"} # Not quite careful enough
bash: x: You must set x
$
Note how the value in $x is expanded to first * and then a list of file names when the overall expression is not in double quotes. This is what shellcheck is recommending should be fixed. I have not verified that it doesn't object to the form where the expression is enclosed in double quotes, but it is a reasonable assumption that it would be OK.
How to store /etc/passwd in a hash or array?
Store it in a hash with usernames as keys, and the split array as value:
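my %passwd = ();
# Use a lexical filehandle and fail loudly if the file cannot be opened.
open(my $fh, '<', '/etc/passwd') or die "Cannot open /etc/passwd: $!";
while (<$fh>) {
    chomp;
    my @f = split /:/;
    # Key: username (field 0); value: all fields of that line.
    @{$passwd{$f[0]}} = @f;
}
close $fh;
# Field 3 is the numeric group ID of user 'Sjoerder'.
print $passwd{'Sjoerder'}[3];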
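For comparison, the same data structure can be built in Python with a dict keyed by username. This is a sketch rather than a translation of the Perl answer; parse_passwd and the sample user names are made up for illustration. (Python's standard library also has a pwd module for looking up account entries.)

```python
def parse_passwd(lines):
    """Map each username to the list of colon-separated fields on its line."""
    table = {}
    for line in lines:
        line = line.rstrip('\n')
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        fields = line.split(':')
        table[fields[0]] = fields
    return table


if __name__ == '__main__':
    with open('/etc/passwd', encoding='utf-8') as handle:
        passwd = parse_passwd(handle)
    # Field 3 is the numeric group ID, as in the Perl example.
    print(passwd['root'][3])
```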
Should I use quotes in environment path names?
Tip of the hat to @gniourf_gniourf and @chepner for their help.
tl;dr
To be safe, double-quote: it'll work in all cases, across all POSIX-like shells.
If you want to add a ~-based path, selectively leave the ~/ unquoted to ensure that ~ is expanded; e.g.: export PATH=~/"bin:$PATH". See below for the rules of ~ expansion in variable assignments.
Alternatively, simply use $HOME inside a single, double-quoted string:
export PATH="$HOME/bin:$PATH"
NOTE: The following applies to bash, ksh, and zsh, but NOT to (mostly) strictly POSIX compliant shells such as dash; thus, when you target /bin/sh, you MUST double-quote the RHS of export.[1]
- Double-quotes are optional, ONLY IF the literal part of your RHS (the value to assign) contains neither whitespace nor other shell metacharacters.
  - Whether the values of the variables referenced contain whitespace/metacharacters or not does not matter - see below.
  - Again: It does matter with sh, when export is used, so always double-quote there.
The reason you can get away without double-quoting in this case is that variable-assignment statements in POSIX-like shells interpret their RHS differently than arguments passed to commands, as described in section 2.9.1 of the POSIX spec:
- Specifically, even though initial word-splitting is performed, it is only applied to the unexpanded (raw) RHS (that's why you do need quoting with whitespace/metacharacters in literals), and not to its results.
- This only applies to genuine assignment statements of the form <name>=<value> in all POSIX-like shells, i.e., if there is no command name before the variable name; note that that includes assignments prepended to a command to define ad-hoc environment variables for it, e.g., foo=$bar cmd ....
- Assignments in the context of other commands should always be double-quoted, to be safe:
  - With sh (in a (mostly) strictly POSIX-compliant shell such as dash) an assignment with export is treated as a regular command, and the foo=$bar part is treated as the 1st argument to the export builtin and therefore treated as usual (subject to word-splitting of the result, too). (POSIX doesn't specify any other commands involving (explicit) variable-assignment; declare, typeset, and local are nonstandard extensions).
  - bash, ksh, zsh, in an understandable deviation from POSIX, extend the assignment logic to export foo=$bar and typeset/declare/local foo=$bar as well. In other words: in bash, ksh, zsh, export/typeset/declare/local commands are treated like assignments, so that quoting isn't strictly necessary.
    - Perhaps surprisingly, dash, which also chose to implement the non-POSIX local builtin[2], does NOT extend assignment logic to it; it is consistent with its export behavior, however.
  - Assignments passed to env (e.g., env foo=$bar cmd ...) are also subject to expansion as a command argument and therefore need double-quoting - except in zsh.
    - That env acts differently from export in ksh and bash in that regard is due to the fact that env is an external utility, whereas export is a shell builtin. (zsh's behavior fundamentally differs from that of the other shells when it comes to unquoted variable references).
- Tilde (~) expansion happens as follows in genuine assignment statements:
  - In addition to the ~ needing to be unquoted, as usual, it is also only applied:
    - If the entire RHS is ~; e.g.:
      foo=~ # same as: foo="$HOME"
    - Otherwise: only if both of the following conditions are met:
      - if ~ starts the string or is preceded by an unquoted :
      - if ~ is followed by an unquoted /.
      - e.g.,
        foo=~/bin # same as foo="$HOME/bin"
        foo=$foo:~/bin # same as foo="$foo:$HOME/bin"
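The rules for genuine assignments can be checked against whatever sh is installed with a small Python harness. This is a sketch under assumptions: run_sh is a made-up helper, /home/fred is a hypothetical HOME value that does not need to exist, and an sh must be on PATH.

```python
import os
import subprocess


def run_sh(script, home='/home/fred'):
    """Run `script` with sh -c under a minimal environment and return stdout."""
    env = {'PATH': os.environ.get('PATH', os.defpath), 'HOME': home}
    result = subprocess.run(['sh', '-c', script], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == '__main__':
    # Unquoted expansion in a genuine assignment: the result is not
    # word-split or globbed, so the two spaces and the * survive.
    print(repr(run_sh('bar="a  b *"; foo=$bar; printf "%s" "$foo"')))
    # Unquoted ~ at the start of the RHS, followed by /, is expanded.
    print(repr(run_sh('foo=~/bin; printf "%s" "$foo"')))
    # A quoted ~ stays literal.
    print(repr(run_sh('foo="~/bin"; printf "%s" "$foo"')))
```

Only plain assignments are exercised here; export and env are deliberately left out because, as described above, their behavior depends on which shell sh actually is.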
Example
This example demonstrates that in bash, ksh, and zsh you can get away without double-quoting, even when using export, but I do not recommend it.
#!/usr/bin/env bash
# or ksh or zsh - but NOT /bin/sh!
# Create env. variable with whitespace and other shell metacharacters
export FOO="b:c &|<> d"
# Extend the value - the double quotes here are optional, but ONLY
# because the literal part, 'a:', contains no whitespace or other shell metacharacters.
# To be safe, DO double-quote the RHS.
export FOO=a:$FOO # OK - $FOO now contains 'a:b:c &|<> d'
[1] As @gniourf_gniourf points out: Use of export to modify the value of PATH is optional, because once a variable is marked as exported, you can use a regular assignment (PATH=...) to change its value.
That said, you may still choose to use export, so as to make it explicit that the variable being modified is exported.
[2] @gniourf_gniourf states that a future version of the POSIX standard may introduce the local builtin.