Bash Printf %Q Invalid Directive

The printf command is built into bash. It's also an external command, typically installed in /usr/bin/printf. On most Linux systems, /usr/bin/printf is the GNU coreutils implementation.

Older releases of the GNU coreutils printf command do not support the %q format specifier; it was introduced in version 8.25, released 2016-01-20. bash's built-in printf command does support it, and has for as long as bash has had a built-in printf command.

The error message implies that you're running script.sh using something other than bash.

Since the #!/bin/bash line appears to be correct, you're probably doing one of the following:

sh script.sh
. script.sh
source script.sh

Instead, just execute it directly (after making sure it has execute permission, using chmod +x if needed):

./script.sh
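
If the script might still be started with sh by accident, a small guard at the top can turn the confusing printf error into a clear message. This is only a sketch: the function name require_bash and the wording of the message are assumptions, not part of the original script.

```shell
#!/bin/bash

# Hypothetical guard: require_bash is an illustrative name.
require_bash() {
    # BASH_VERSION is set by bash itself and is normally unset in
    # other shells such as dash.
    if [ -z "${BASH_VERSION:-}" ]; then
        echo "error: run this script with bash, not sh" >&2
        return 1
    fi
}

require_bash || exit 1

# From here on the built-in printf, including %q, is available.
quoted=$(printf '%q' 'a b')
printf '%s\n' "$quoted"
```

Note that exit inside a sourced file ends the calling shell, so a guard like this suits scripts that are meant to be executed rather than sourced.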

Or you could just edit your .bashrc file manually. The script, if executed correctly, will add this line to your .bashrc:

PS1=\\u@\\h:\\w\$\

(The space at the end of that line is significant.) Or you can do it more simply like this:

PS1='\u@\h:\w\$ '

One problem with the script is that it will replace every line that mentions PS1. If you just set it once and otherwise don't refer to it, that's fine, but if you have something like:

if [ ... ] ; then
    PS1=this
else
    PS1=that
fi

then the script will thoroughly mess that up. It's just a bit too clever.

Is $(printf '%q ' ${@:1}) equivalent to ${*}?

No, it is not equivalent, because the result of the command substitution is word-split. For example, the following code:

check_args() {
  echo "\$#=$#"
  printf "%s\n" "$@";
}

# setting arguments
set -- "space notspace" "newline"$'\n'"newline"
echo '1: ---------------- "$*"'
check_args "$*"

echo '2: ---------------- $(printf '\''%q '\'' "${@:1}")'
check_args $(printf '%q ' "${@:1}")

echo '3: ---------------- "$(printf '\''%q '\'' "${@:1}")"'
check_args "$(printf '%q ' "${@:1}")"

echo '4: ---------------- IFS=@ and "$*"'
( IFS=@; check_args "$*"; )

echo "5: ---------------- duplicating quoted"
check_args "$(printf '%s'"${IFS:0:1}" "${@:1}" | sed 's/'"${IFS:0:1}"'$//')"

echo "6: ---------------- duplicating quoted IFS=@"
( IFS=@; check_args "$(printf '%s'"${IFS:0:1}" "${@:1}" | sed 's/'"${IFS:0:1}"'$//')"; )

echo "7: ---------------- duplicating eval unquoted"
eval check_args $(printf '%q"'"${IFS:0:1}"'"' "${@:1}" | sed 's/'"${IFS:0:1}"'$//')

echo "8: ---------------- duplicating eval unquoted IFS=@"
( eval check_args $(IFS=@ ; printf '%q"'"${IFS:0:1}"'"' "${@:1}" | sed 's/"'"${IFS:0:1}"'"$//'); )

will output:

1: ---------------- "$*"
$#=1
space notspace newline
newline
2: ---------------- $(printf '%q ' "${@:1}")
$#=3
space\
notspace
$'newline\nnewline'
3: ---------------- "$(printf '%q ' "${@:1}")"
$#=1
space\ notspace $'newline\nnewline'
4: ---------------- IFS=@ and "$*"
$#=1
space notspace@newline
newline
5: ---------------- duplicating quoted
$#=1
space notspace newline
newline
6: ---------------- duplicating quoted IFS=@
$#=1
space notspace@newline
newline
7: ---------------- duplicating eval unquoted
$#=1
space notspace newline
newline
8: ---------------- duplicating eval unquoted IFS=@
$#=1
space notspace@newline
newline

tested on repl.

"$*" joins the arguments with the first character of IFS. As test 4 shows, when IFS is set to something other than a space, the arguments are joined with that character, @ in this example.

Also when IFS is unset or set to space, the output of $* does not include a terminating space, while printf '%q ' will always print a trailing space on the end of the string.
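
A compact way to see both differences side by side (the variable names here are only for illustration):

```shell
#!/bin/bash

set -- 'a b' c

# "$*" joins with the first character of IFS and adds no trailing separator.
joined="$*"

# printf '%q ' escapes each argument and always appends a space,
# which command substitution does not strip (it only strips newlines).
quoted=$(printf '%q ' "$@")

# The subshell keeps the IFS change from leaking into the rest of the script.
joined_at=$(IFS=@; printf '%s' "$*")

printf '[%s]\n' "$joined" "$quoted" "$joined_at"
```

The brackets in the output make the trailing space of the printf variant visible.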

The output of an unquoted $(printf '%q ' "${@:1}") is still split on whitespace. So test case 2 receives 3 arguments, because the string "space notspace" is split at the space into two arguments. Enclosing the command substitution in double quotes does not help either: printf %q has already rewritten the arguments, for example replacing the newline with \n inside $'...'.

Cases 5, 6, 7 and 8 are my attempts to replicate the behavior of "$*" using printf. In cases 5 and 6 I quoted the command substitution; in cases 7 and 8 I used eval. The output of cases 5 and 7 should match case 1, and the output of cases 6 and 8 should match case 4.

To duplicate the behavior of "$*", special care needs to be taken so that IFS delimits the strings properly. I used sed 's/'"${IFS:0:1}"'$//' to remove the trailing IFS separator from the printf output. Cases 5 and 6 are the quoted $(printf ...) attempts, with case 6 using IFS=@ to show that the separator works. Cases 7 and 8 use eval with special handling of IFS: the IFS character itself needs to be enclosed in quotes so the shell will not split on it again, hence printf '%q"'"${IFS:0:1}"'"'.

doing $(printf '%q ' "${@:2}") (note the 2 instead of 1 as before) is not possible with pure bash $*?

You probably could just shift the arguments inside the substitution $(shift; printf "%s\n" "$*"), but as shown above, they are not equivalent anyway.
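
For the plain joining part, bash also accepts an offset directly on $*, so no printf is needed. A minimal sketch of both variants (the function names are made up for this example):

```shell
#!/bin/bash

# Join arguments 2..N with the first character of IFS, using an offset.
join_from_second() {
    printf '%s\n' "${*:2}"
}

# Same result via shift; the shift only affects the function's own arguments.
join_with_shift() {
    shift
    printf '%s\n' "$*"
}

join_from_second skipped 'b c' d
join_with_shift skipped 'b c' d
```

Neither variant escapes anything, so the result is for display or joining, not for feeding back to the shell.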

How to use printf %q in bash?

Bear this in mind when using %q:

ARGUMENT is printed in a format that can be reused as shell input, escaping non-printable characters with the proposed POSIX $'' syntax.

Emphasis mine. printf is free to reformat the arguments any way it likes as long as the input can be reused in the shell. However this is not the reason your input looks the way it does.

In Bash the ' character is a string delimiter, which is how you tell bash "the following string contains special characters like spaces, and these special characters should not be parsed by Bash". The quotes do not get passed to the commands that get called. What the command sees is something like this:

Command:
  printf "%q" a 'b c'

Received args:
  printf::arg0:  printf
  printf::arg1:  %q
  printf::arg2:  a
  printf::arg3:  b c

Note that arg3 does not have the quotes surrounding it. Bash does not pass them on.

When printf prints the args out, it does not know that there were quotes around b c, so it does not print them. But it does know that the space between 'b' and 'c' is a special shell character, and puts the \ in front to escape it.
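
The same thing can be observed from inside the shell. In this sketch, show_args is a hypothetical helper that prints each argument it receives between angle brackets:

```shell
#!/bin/bash

# Hypothetical helper: print every received argument on its own line.
show_args() {
    local i=0 arg
    for arg in "$@"; do
        i=$((i + 1))
        printf 'arg%d: <%s>\n' "$i" "$arg"
    done
}

# The quotes are consumed by bash; the function only sees "b c".
show_args a 'b c'

# %q re-escapes what it received: the space gets a backslash.
printf '%q\n' a 'b c'
```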

This is true for all bash functions/commands, so bear in mind that the same happens when you call print_augs too.

If you want to maintain quotes around your strings, you'll need to double quote them so that Bash doesn't parse them:

function print_augs2() {
  echo "$@" >> "${output_file}"
}

print_augs2 a "'b c'"

# Output: a 'b c'
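
If the goal is text that bash can read back later, rather than the literal quote characters, the escaped form produced by %q is already enough. A small round-trip sketch, with array names chosen only for this example:

```shell
#!/bin/bash

original=(a 'b c' "it's" $'two\nlines')

# Each argument is escaped so that the shell can parse it again.
serialized=$(printf '%q ' "${original[@]}")

# eval is acceptable here only because every word came out of %q.
eval "restored=($serialized)"

printf 'restored %d arguments\n' "${#restored[@]}"
```

The escaped text will not look like what was typed, but it expands to the same arguments.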

Correct usage of printf with variable

You want:

printf '%s\t%s\n' "$line" "${DESC}"

Check

help printf
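
As a rough illustration of why data is best passed as arguments rather than placed in the format string: % and \ inside an argument are printed literally by %s. The values below are invented for the example.

```shell
#!/bin/bash

# Hypothetical values containing characters that are special in a format string.
line='50%'
DESC='a\tb literal'

# Only the format string is interpreted; both variables are plain data.
row=$(printf '%s\t%s\n' "$line" "$DESC")
printf '%s\n' "$row"
```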

Btw, if you have xmllint installed, you can get the description more nicely with xpath:

curl "https://www.ebi.ac.uk/ena/data/view/${EXP}&display=xml" \
  | xmllint --xpath '//DESCRIPTION/text()' -

printf field width doesn't support multibyte characters?

Are these the only way? There's no way to do it with printf alone?

Well with the example from ninjalj (thx btw), I wrote a script to deal with this problem, and saved it as fprintf in /usr/local/bin:

#! /bin/bash

IFS=' '
declare -a Text=("${@}")

## Skip the whole thing if there are no multi-byte characters ##
if (( $(echo "${Text[*]}" | wc -c) > $(echo "${Text[*]}" | wc -m) )); then
    if echo "${Text[*]}" | grep -Eq '%[#0 +-]?[0-9]+(\.[0-9]+)?[sb]'; then
        IFS=$'\n'
        declare -a FormatStrings=($(echo -n "${Text[0]}" | grep -Eo '%[^%]*?[bs]'))
        IFS=$' \t\n'
        declare -i format=0

    ## Check every format string ##
        for fw in "${FormatStrings[@]}"; do
            (( format++ ))
            if [[ "$fw" =~ ^%[#0\ +-]?[1-9][0-9]*(\.[1-9][0-9]*)?[sb]$ ]]; then
                (( Difference = $(echo "${Text[format]}" | wc -c) - $(echo "${Text[format]}" | wc -m) ))

            ## If multi-byte characters ##
                if (( Difference > 0 )); then

                ## If a field width is entered then replace field width value ##
                    if [[ "$fw" =~ ^%[#0\ +-]?[1-9][0-9]* ]]; then
                        (( Width = $(echo -n "$fw" | gsed -re 's|^%[#0 +-]?([1-9][0-9]*).*[bs]|\1|') + Difference ))
                        declare -a Text[0]="$(echo -n "${Text[0]}" | gsed -rne '1h;1!H;${g;y|\n|\x1C|;s|(%[^%])|\n\1|g;p}' | gsed -rne $(( format + 1 ))'s|^(%[#0 +-]?)[1-9][0-9]*|\1'${Width}'|;1h;1!H;${g;s|\n||g;y|\x1C|\n|;p}')"
                    fi

                ## If a precision is entered then replace precision value ##
                    if [[ "$fw" =~ \.[1-9][0-9]*[sb]$ ]]; then
                        (( Precision = $(echo -n "$fw" | gsed -re 's|^%.*\.([1-9][0-9]*)[sb]$|\1|') + Difference ))
                        declare -a Text[0]="$(echo -n "${Text[0]}" | gsed -rne '1h;1!H;${g;y|\n|\x1C|;s|(%[^%])|\n\1|g;p}' | gsed -rne $(( format + 1 ))'s|^(%[#0 +-]?([1-9][0-9]*)?)\.[1-9][0-9]*([bs])|\1.'${Precision}'\3|;1h;1!H;${g;s|\n||g;y|\x1C|\n|;p}')"
                    fi
                fi
            fi
        done
    fi
fi

printf "${Text[@]}"
exit 0

Usage: fprintf "## %5s %5s %5s ##\n## %5s %5s %5s ##\n" '' '*' '' '' '•' ''

A few things to note:

I didn't write this script to deal with * (asterisk) values for formats because I never use them. I wrote this for me and didn't want to over-complicate things.
I wrote this to check only the format strings %s and %b as they seem to be the only ones that are affected by this problem. Thus, if somehow someone manages to get a multi-byte unicode character out of a number, it may not work without minor modification.
The script works well for basic uses of printf (nothing an old-school UNIX hacker would throw at it). Feel free to modify it or use it as is.
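
For simple cases, a much smaller alternative is to skip printf's field width and pad by hand. This is a sketch under one assumption: the script runs in a UTF-8 locale, where ${#text} counts characters rather than bytes. The name pad_left is made up for this example.

```shell
#!/bin/bash

# Right-align text in a field of the given width, counting characters.
pad_left() {
    local text=$1 width=$2 pad=''
    local chars=${#text}   # characters in a UTF-8 locale, bytes in the C locale
    if (( chars < width )); then
        # Build a string of (width - chars) spaces.
        printf -v pad '%*s' "$(( width - chars ))" ''
    fi
    printf '%s%s' "$pad" "$text"
}

printf '## %s ##\n' "$(pad_left '*' 5)"
printf '## %s ##\n' "$(pad_left '•' 5)"
```

Unlike the script above, this does not parse format strings at all; every padded field has to be wrapped in its own call.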

Which characters need to be escaped when using Bash?

There are two easy and safe rules which work not only in sh but also bash.

1. Put the whole string in single quotes

This works for all chars except single quote itself. To escape the single quote, close the quoting before it, insert the single quote, and re-open the quoting.

'I'\''m a s@fe $tring which ends in newline
'

sed command: sed -e "s/'/'\\\\''/g; 1s/^/'/; \$s/\$/'/"
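
The same rule can be sketched in pure bash for a string held in a variable. The function name sq_quote is an assumption for this example; it applies the close, insert, re-open trick to every single quote.

```shell
#!/bin/bash

# Wrap a string in single quotes, turning each ' into '\''.
sq_quote() {
    local s=$1 q="'" out=''
    while [[ $s == *"$q"* ]]; do
        # Text before the first quote, then the 4-character replacement '\''.
        out+=${s%%"$q"*}"'\\''"
        # Continue with the text after that quote.
        s=${s#*"$q"}
    done
    printf '%s' "'$out$s'"
}

sq_quote "I'm a s@fe \$tring"
echo
```

When the result is captured with $(...), trailing newlines in the input are lost to command substitution, so strings that end in a newline need extra care.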

2. Escape every char with a backslash

This works for all characters except newline. For newline characters use single or double quotes. Empty strings must still be handled - replace with ""

\I\'\m\ \a\ \s\@\f\e\ \$\t\r\i\n\g\ \w\h\i\c\h\ \e\n\d\s\ \i\n\ \n\e\w\l\i\n\e"
"

sed command: sed -e 's/./\\&/g; 1{$s/^$/""/}; 1!s/^/"/; $!s/$/"/'

2b. More readable version of 2

There's an easy safe set of characters, like [a-zA-Z0-9,._+:@%/-], which can be left unescaped to keep it more readable

I\'m\ a\ s@fe\ \$tring\ which\ ends\ in\ newline"
"

sed command: LC_ALL=C sed -e 's/[^a-zA-Z0-9,._+@%/-]/\\&/g; 1{$s/^$/""/}; 1!s/^/"/; $!s/$/"/'

Note that a sed program cannot know whether the last line of input ends with a newline byte (except when the input is empty). That's why both sed commands above assume it does not. You can add a quoted newline manually.

Note that shell variables are only defined for text in the POSIX sense. Processing binary data is not defined. For the implementations that matter, binary works with the exception of NUL bytes (because variables are implemented with C strings, and meant to be used as C strings, namely program arguments), but you should switch to a "binary" locale such as latin1.

(You can easily validate the rules by reading the POSIX spec for sh. For bash, check the reference manual linked by @AustinPhillips)

echo the exact command-line passed to a command with visible quotes

Based on BashFAQ/050:

#!/bin/bash

trap 'printf RUNNING:\ %s\\n "$BASH_COMMAND" >&2' DEBUG

foo () {
    printf '%s\n' "$@" > /dev/null
}

foo bar baz

foo 'qux bazinga' '1 2' 3 '4 5'

foo "I'm going home"

The DEBUG trap outputs:

RUNNING: foo bar baz
RUNNING: foo 'qux bazinga' '1 2' 3 '4 5'
RUNNING: foo "I'm going home"

This shows the quotes exactly as they appear in the script.

The function foo is set to output to /dev/null so we can ignore its output. It's just a stand-in for whatever commands your script may actually be running.

Here is a version that puts the trap code in a function and calls that function to turn the trap on and off. This is useful if you only want the trap to handle certain sections of code:

#!/bin/bash

dbt () {
    if [[ $1 == on ]]
    then
        trap 'printf RUNNING:\ %s\\n "$BASH_COMMAND" >&2' DEBUG
    elif [[ $1 == off ]]
    then
        trap '' DEBUG
    else
        printf '%s\n' "Invalid action: $1"
    fi
}

foo () {
    printf '%s\n' "$@" > /dev/null
}

foo bar baz

dbt on

foo 'qux bazinga' '1 2' 3 '4 5'

foo "I'm going home"

now=$(date)

dbt off

declare -a array_b

dbt on

array_b=(a b 'c d' e)

which outputs:

RUNNING: foo 'qux bazinga' '1 2' 3 '4 5'
RUNNING: foo "I'm going home"
RUNNING: now=$(date)
RUNNING: dbt off
RUNNING: array_b=(a b 'c d' e)

Another way to trace execution of a script is to use set -x.

Print a trace of simple commands, for commands, case commands, select commands, and arithmetic for commands and their arguments or associated word lists after they are expanded and before they are executed. The value of the PS4 variable is expanded and the resultant value is printed before the command and its expanded arguments.
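
As a rough illustration of the difference from the DEBUG trap: set -x prints each command after expansion, re-quoted by bash, rather than as it was written in the script. The sketch below captures the trace of one no-op command; the variable name trace is only for illustration, and the exact prefix on each line depends on PS4 and on the nesting depth.

```shell
#!/bin/bash

# Tracing is enabled only inside the command substitution's subshell;
# the trace goes to stderr, so it is redirected into the capture.
trace=$( { set -x; : 'a b' c; set +x; } 2>&1 )

printf '%s\n' "$trace"
```

The argument containing a space appears in the trace as 'a b', quoted by bash regardless of how it was quoted in the source.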
