In a Bash Script, What Would $'\0' Evaluate to and Why

In a bash script, what would $'\0' evaluate to and why?

To complement rici's helpful answer:

Note that this answer is about bash. ksh and zsh also support $'...' strings, but their behavior differs:

* zsh does create and preserve NUL (null bytes) with $'\0'.

* ksh, by contrast, has the same limitations as bash, and additionally interprets the first NUL in a command substitution's output as the string terminator (cuts off at the first NUL, whereas bash strips such NULs).

$'\0' is an ANSI C-quoted string that technically creates a NUL (0x0 byte), but effectively results in the empty (null) string (same as ''), because any NUL is interpreted as the (C-style) string terminator by Bash in the context of arguments and here-docs/here-strings.

As such, it is somewhat misleading to use $'\0' because it suggests that you can create a NUL this way, when you actually cannot:

  • You cannot create NULs as part of a command argument or here-doc / here-string, and you cannot store NULs in a variable:

    • echo $'a\0b' | cat -v # -> 'a' - string terminated after 'a'
    • cat -v <<<$'a\0b' # -> 'a' - ditto
  • In the context of command substitutions, by contrast, NULs are stripped:

    • echo "$(printf 'a\0b')" | cat -v # -> 'ab' - NUL is stripped
  • However, you can pass NUL bytes via files and pipes.

    • printf 'a\0b' | cat -v # -> 'a^@b' - NUL is preserved, via stdout and pipe
    • Note that it is printf that is generating the NUL via its single-quoted argument whose escape sequences printf then interprets and writes to stdout. By contrast, if you used printf $'a\0b', bash would again interpret the NUL as the string terminator up front and pass only 'a' to printf.
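
The cases above can also be checked by measuring lengths rather than reading cat -v output. The following is only a minimal sketch; the variable names (arg, arg_len, sub, pipe_len) are made up for illustration and do not come from the question.

```bash
#!/usr/bin/env bash
# Variable names are illustrative, not from the original question.

# 1) NUL inside an ANSI C-quoted string: the value ends at the NUL.
arg=$'a\0b'
arg_len=${#arg}

# 2) NUL in command-substitution output: bash drops it.
#    (Recent bash versions print a warning to stderr, silenced here.)
{ sub=$(printf 'a\0b'); } 2>/dev/null

# 3) NUL written by printf itself and sent through a pipe: preserved.
pipe_len=$(printf 'a\0b' | wc -c | tr -d '[:space:]')

echo "arg_len=$arg_len sub=$sub pipe_len=$pipe_len"
```

In bash this should report a length of 1 for the argument, 'ab' for the command substitution, and 3 bytes through the pipe.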

If we examine the sample code, whose intent is to read the entire input at once, across lines (I've therefore changed line to content):

while read -r -d $'\0' content; do  # same as: `while read -r -d '' ...`
  echo "${content}"
done <<< "${some_variable}"

This will never enter the while loop body, because stdin input is provided by a here-string, which, as explained, cannot contain NULs.

Note that read actually does look for NULs with -d $'\0', even though $'\0' is effectively ''. In other words: read by convention interprets the empty (null) string to mean NUL as -d's option-argument, because NUL itself cannot be specified for technical reasons.

In the absence of an actual NUL in the input, read's exit code indicates failure, so the loop is never entered.

However, even in the absence of the delimiter, the value is read, so to make this code work with a here-string or here-doc, it must be modified as follows:

while read -r -d $'\0' content || [[ -n $content ]]; do
  echo "${content}"
done <<< "${some_variable}"
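
To see the difference between the two loops, one can count how often each body runs. This is a sketch under the assumption of a two-line sample value; plain_runs, fixed_runs and captured are hypothetical helper names.

```bash
#!/usr/bin/env bash
some_variable=$'line 1\nline 2'   # sample input, assumed for illustration

# Original form: read fails (no NUL found), so the body never runs.
plain_runs=0
while read -r -d '' content; do
  plain_runs=$((plain_runs + 1))
done <<< "${some_variable}"

# Modified form: the body runs once, because content was still filled.
fixed_runs=0
captured=
while read -r -d '' content || [[ -n $content ]]; do
  fixed_runs=$((fixed_runs + 1))
  captured=$content
done <<< "${some_variable}"

echo "plain_runs=$plain_runs fixed_runs=$fixed_runs"
```

The second loop ends on its own: the next read hits end-of-input immediately and leaves content empty, so both conditions fail.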

However, as @rici notes in a comment, with a single (multi-line) input string, there is no need to use while at all:

read -r -d $'\0' content <<< "${some_variable}"

This reads the entire content of $some_variable, while trimming leading and trailing whitespace (which is what read does with $IFS at its default value, $' \t\n').

@rici also points out that if such trimming weren't desired, a simple content=$some_variable would do.
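
The trimming behavior can be made visible with a value that has surrounding whitespace. This is a sketch with an assumed sample value; trimmed and untrimmed are hypothetical names. Note that a here-string appends a newline, which only survives when IFS is empty.

```bash
#!/usr/bin/env bash
some_variable=$'  first line\nsecond line  '   # sample input, assumed

# read returns nonzero here (no NUL is found), hence the || true.
read -r -d '' trimmed <<< "${some_variable}" || true
IFS= read -r -d '' untrimmed <<< "${some_variable}" || true

# Brackets make leading and trailing whitespace visible.
printf '[%s]\n' "$trimmed" "$untrimmed"
```

With the default $IFS the surrounding spaces and the trailing newline are removed, while the interior newline is kept in both cases.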

Contrast this with input that actually contains NULs, in which case while is needed to process each NUL-separated token (but without the || [[ -n $<var> ]] clause). Here, find -print0 outputs filenames, each followed by a NUL:

while IFS= read -r -d $'\0' file; do
  echo "${file}"
done < <(find . -print0)

Note the use of IFS= read ... to suppress trimming of leading and trailing whitespace, which is undesired in this case, because input filenames must be preserved as-is.
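
A self-contained way to try this is to create awkward filenames in a scratch directory and collect them into an array. This is a sketch; the directory, the two filenames and the files array are invented for the example.

```bash
#!/usr/bin/env bash
# Scratch directory with deliberately awkward names (illustrative only).
tmpdir=$(mktemp -d)
touch "$tmpdir/ leading space" "$tmpdir/"$'two\nlines'

files=()
while IFS= read -r -d '' file; do
  files+=("${file##*/}")          # keep only the basename
done < <(find "$tmpdir" -type f -print0)

rm -rf "$tmpdir"
echo "found ${#files[@]} files"
```

Both names should come through unchanged, including the leading space and the embedded newline; the order in which find reports them is not guaranteed.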

Bash: Why does [[ zero -eq 0 ]] evaluate to true?

-eq runs a numeric comparison.

When a string is given as an operand, the value of a like-named variable is looked up.

Thus, this becomes equivalent to [[ $zero -eq 0 ]]. An unset or empty variable has a numeric value of 0. Thus, unless a different value has been assigned to the shell variable named zero, this is equivalent to [[ 0 -eq 0 ]], which is true.
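
A quick way to convince yourself is to run the test once with zero unset and once with a nonzero value. This is a minimal sketch; unset_case and five_case are hypothetical names.

```bash
#!/usr/bin/env bash
# With no variable named zero, the bare word evaluates to 0.
unset zero
if [[ zero -eq 0 ]]; then unset_case=true; else unset_case=false; fi

# Once zero holds 5, the same test compares 5 with 0.
zero=5
if [[ zero -eq 0 ]]; then five_case=true; else five_case=false; fi

echo "unset: $unset_case, zero=5: $five_case"
```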

What does the bash read -d '' do?

With bash's read builtin, an empty-string delimiter (-d '') makes read use the NUL byte as the delimiter. Writing -d $'\0' (an ANSI C-quoted string) or -d $'\x0' (the hex form) has the same effect, because both expand to the empty string.

-d '' specifies that each input record is terminated by a NUL byte. This means that each invocation of read consumes input up to the next NUL character.

Usually it is used with IFS= as:

IFS= read -r -d ''

so that leading and trailing whitespace in the input is preserved rather than trimmed.

A common example of processing NUL delimited input is:

while IFS= read -r -d '' file; do
  echo "$file"
done < <(find . -type f -print0)
  • The find command prints the files under the current directory, with a NUL after each entry.
  • read -d '' sets NUL (\0) as the delimiter, so that one entry at a time is read from the output of the find command.

Related: Why ‘read’ doesn’t accept \0 as a delimiter in this example?

Why $'\0' or $'\x0' is an empty string? Should be the null-character, isn't it?

It's a limitation. bash does not allow string values to contain interior NUL bytes.

Posix (and C) character strings cannot contain interior NULs. See, for example, the Posix definition of character string (emphasis added):

3.92 Character String

A contiguous sequence of characters terminated by and including the first null byte.

Similarly, standard C is reasonably explicit about the NUL character in character strings:

§5.2.1p2 …A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string.

Posix explicitly forbids the use of NUL (and /) in filenames (XBD 3.170) or in environment variables (XBD 8.1 "... are considered to end with a null byte.").

In this context, shell command languages, including bash, tend to use the same definition of a character string, as a sequence of non-NUL characters terminated by a single NUL.

You can pass NULs freely through bash pipes, of course, and nothing stops you from assigning a shell variable to the output of a program which outputs a NUL byte. However, the consequences are "unspecified" according to Posix (XSH 2.6.3 "If the output contains any null bytes, the behavior is unspecified."). In bash, the NULs are removed, unless you insert a NUL into a string using bash's C-escape syntax ($'\0'), in which case the NUL will end up terminating the value.

On a practical note, consider the difference between the two following ways of attempting to insert a NUL into the stdin of a utility:

$ # Prefer printf to echo -n
$ printf $'foo\0bar' | wc -c
3
$ printf 'foo\0bar' | wc -c
7
$ # %b (standard in the printf utility) is better for strings which might contain %
$ printf %b 'foo\0bar' | wc -c
7

The 'eval' command in Bash and its typical uses

eval takes a string as its argument, and evaluates it as if you'd typed that string on a command line. (If you pass several arguments, they are first joined with spaces between them.)

${$n} is a syntax error in bash. Inside the braces, you can only have a variable name, with some possible prefix and suffixes, but you can't have arbitrary bash syntax and in particular you can't use variable expansion. There is a way of saying “the value of the variable whose name is in this variable”, though:

echo ${!n}
one

$(…) runs the command specified inside the parentheses in a subshell (i.e. in a separate process that inherits all settings such as variable values from the current shell), and gathers its output. So echo $($n) runs $n as a shell command, and displays its output. Since $n evaluates to 1, $($n) attempts to run the command 1, which does not exist.

eval echo \${$n} runs the parameters passed to eval. After expansion, the parameters are echo and ${1}. So eval echo \${$n} runs the command echo ${1}.

Note that most of the time, you must use double quotes around variable substitutions and command substitutions (i.e. anytime there's a $): "$foo", "$(foo)". Always put double quotes around variable and command substitutions, unless you know you need to leave them off. Without the double quotes, the shell performs field splitting (i.e. it splits the value of the variable or the output from the command into separate words) and then treats each word as a wildcard pattern. For example:

$ ls
file1 file2 otherfile
$ set -- 'f* *'
$ echo "$1"
f* *
$ echo $1
file1 file2 file1 file2 otherfile
$ n=1
$ eval echo \${$n}
file1 file2 file1 file2 otherfile
$ eval echo \"\${$n}\"
f* *
$ echo "${!n}"
f* *

eval is not used very often. In some shells, the most common use is to obtain the value of a variable whose name is not known until runtime. In bash, this is not necessary thanks to the ${!VAR} syntax. eval is still useful when you need to construct a longer command containing operators, reserved words, etc.
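
As an illustration of that last use, here is a sketch in which a pipeline is stored in a string and then run. The pipeline variable and its contents are made up for the example, and eval should only ever be given strings you fully control.

```bash
#!/usr/bin/env bash
# Hypothetical command string; only eval strings you fully control.
pipeline='printf "%s\n" pear apple | sort | head -n 1'

# Expanding $pipeline without eval would pass | as a literal argument;
# eval re-parses the string, so the pipe operators take effect.
first=$(eval "$pipeline")
echo "$first"
```

Here the string is parsed as a real pipeline, so the result is the alphabetically first word.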

why integer result in 0 when I change declared integer to string in Bash script

From the declare section of man bash:

-i The variable is treated as an integer; arithmetic evaluation (see ARITHMETIC EVALUATION) is performed when the variable is assigned a value.

From the ARITHMETIC EVALUATION section of man bash:

The value of a variable is evaluated as an arithmetic expression when...a variable which has been given the integer attribute using declare -i is assigned a value. A null value evaluates to 0.

Together, these clearly state that the behavior you're seeing is the expected behavior. When the characters t h r e e are evaluated arithmetically, they are treated as the name of a variable. Since no variable named three is set, its value is null, which evaluates to 0, and that 0 is then assigned to the variable number.

All assignments in bash are interpreted first as strings. number=10 interprets the 1 0 as a string first, recognizes it as a valid integer, and leaves it as-is. number=three is just as syntactically and semantically valid as number=10, which is why your script continues without any error after assigning the evaluated value of 0 to number.
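
The effect is easy to reproduce. The following sketch assumes a variable declared with declare -i as in the question; the as_* names are hypothetical and only capture the intermediate values.

```bash
#!/usr/bin/env bash
declare -i number

number=10               # a valid integer literal stays as-is
as_literal=$number

unset three
number=three            # 'three' names an unset variable, so this yields 0
as_unset_name=$number

three=3
number=three            # now the name resolves to 3
as_set_name=$number

number='three * 2 + 1'  # any arithmetic expression is evaluated on assignment
as_expression=$number

echo "$as_literal $as_unset_name $as_set_name $as_expression"
```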

Getting command not found error while comparing two strings in Bash

This is the problem:

if [[$variable == $blanko]];

Spaces are required inside the square brackets. Use it like this:

[[ "$variable" == "$blanko" ]] && echo "Nichts da!" || echo "$variable"
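
The same comparison can also be written as a full if/else, which some find easier to read; unlike the && ... || form, the else branch cannot run by accident when the first echo fails. This is a sketch with assumed sample values, and message is a hypothetical name.

```bash
#!/usr/bin/env bash
variable=""     # sample values, assumed for illustration
blanko=""

# Note the spaces after [[ and before ]]; quoting the right-hand side
# makes == a literal string comparison rather than a pattern match.
if [[ "$variable" == "$blanko" ]]; then
  message="Nichts da!"
else
  message=$variable
fi
echo "$message"
```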

How can I have a newline in a string in sh?

If you're using Bash, you can use backslash-escapes inside of a specially-quoted $'string'. For example, adding \n:

STR=$'Hello\nWorld'
echo "$STR" # quotes are required here!

Prints:

Hello
World

If you're using pretty much any other shell, just insert the newline as-is in the string:

STR='Hello
World'
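
To check that these forms really produce the same value, they can be compared directly. This is a minimal sketch in bash; the variable names are illustrative, and printf -v is a bash-specific third option that is not part of plain sh.

```bash
#!/usr/bin/env bash
ansi=$'Hello\nWorld'                 # ANSI C quoting (bash)
literal='Hello
World'                               # literal newline (any shell)
printf -v formatted 'Hello\nWorld'   # printf -v is bash-specific

# Count the lines produced when the value is printed.
line_count=$(printf '%s\n' "$ansi" | wc -l | tr -d '[:space:]')

if [ "$ansi" = "$literal" ] && [ "$ansi" = "$formatted" ]; then
  echo "all three forms are identical ($line_count lines)"
fi
```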

Bash recognizes a number of other backslash escape sequences in the $'' string. Here is an excerpt from the Bash manual page:

Words of the form $'string' are treated specially. The word expands to
string, with backslash-escaped characters replaced as specified by the
ANSI C standard. Backslash escape sequences, if present, are decoded
as follows:
\a     alert (bell)
\b     backspace
\e
\E     an escape character
\f     form feed
\n     new line
\r     carriage return
\t     horizontal tab
\v     vertical tab
\\     backslash
\'     single quote
\"     double quote
\nnn   the eight-bit character whose value is the octal value
       nnn (one to three digits)
\xHH   the eight-bit character whose value is the hexadecimal
       value HH (one or two hex digits)
\cx    a control-x character

The expanded result is single-quoted, as if the dollar sign had not
been present.

A double-quoted string preceded by a dollar sign ($"string") will cause
the string to be translated according to the current locale. If the
current locale is C or POSIX, the dollar sign is ignored. If the
string is translated and replaced, the replacement is double-quoted.

