Bash Arrays and Negative Subscripts, Yes or No

If you just want the last element

$ echo ${muh[*]: -1}
2

If you want next to last element

$ echo ${muh[*]: -2:1}
bleh
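
A minimal sketch of the two expansions above, assuming a hypothetical array muh whose contents are invented here to match the output shown. The space before the minus sign matters: without it, ${muh[*]:-1} would be parsed as the "use a default value" expansion instead of an offset.

```bash
#!/usr/bin/env bash
# Hypothetical contents; the original post does not show how muh was filled.
muh=(foo bar bleh 2)

# Offset -1: start one element from the end (note the space before the minus).
last=${muh[*]: -1}

# Offset -2, length 1: only the next-to-last element.
next_to_last=${muh[*]: -2:1}

echo "$last"          # 2
echo "$next_to_last"  # bleh
```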

Bad array subscript

So on line 38,

if (( times[$i] > times[$i - 1] + $SECS + 10 ))

would refer to times[-1] once during the iteration (when i is 0). Negative subscripts were only added to Bash arrays in fairly recent releases, so that is most likely why you are getting the error.

Likewise with lines 54 and 67 you're hitting a negative array subscript once. Adjust your loops to avoid [0 - 1].
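
One way to adjust such a loop, sketched under assumptions: the names times and SECS come from the line quoted above, while the sample values and the gaps array are invented for illustration. Starting the loop at index 1 means times[i - 1] never reaches a negative subscript.

```bash
#!/usr/bin/env bash
# Hypothetical sample data; the real script's values are not shown in the post.
times=(100 105 200 260)
SECS=30
gaps=()

# Start at 1 so that times[i - 1] is always a valid, non-negative index.
for (( i = 1; i < ${#times[@]}; i++ )); do
    if (( times[i] > times[i - 1] + SECS + 10 )); then
        gaps+=("$i")   # remember the index where a large gap begins
    fi
done

echo "${gaps[*]}"   # 2 3
```

Inside (( )), array elements and variables can be referenced without the leading $, which keeps the comparison readable.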

Add a new element to an array without specifying the index in Bash

Yes, there is:

ARRAY=()
ARRAY+=('foo')
ARRAY+=('bar')

Bash Reference Manual:

In the context where an assignment statement is assigning a value to a shell variable or array index (see Arrays), the ‘+=’ operator can be used to append to or add to the variable's previous value.

Also:

When += is applied to an array variable using compound assignment (see Arrays below), the variable's value is not unset (as it is when using =), and new values are appended to the array beginning at one greater than the array's maximum index (for indexed arrays)
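
The "one greater than the array's maximum index" rule is easiest to see with a sparse array. The following is a small sketch; the element values and the explicit index 10 are arbitrary choices for illustration.

```bash
#!/usr/bin/env bash
ARRAY=()
ARRAY+=('foo')      # lands at index 0
ARRAY+=('bar')      # lands at index 1

ARRAY[10]='far'     # leave a gap on purpose
ARRAY+=('next')     # appended at max index + 1, i.e. index 11

indices="${!ARRAY[@]}"
echo "$indices"          # 0 1 10 11
echo "${#ARRAY[@]}"      # 4
```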

Bash empty array expansion with `set -u`

The only safe idiom is ${arr[@]+"${arr[@]}"}

Unless you only care about Bash 4.4+, but you wouldn't be looking at this question if that were the case :)

This is already the recommendation in ikegami's answer, but there's a lot of misinformation and guesswork in this thread. Other patterns, such as ${arr[@]-} or ${arr[@]:0}, are not safe across all major versions of Bash.

As the table below shows, the only expansion that is reliable across all modern-ish Bash versions is ${arr[@]+"${arr[@]}"} (column +"). Of note, several other expansions fail in Bash 4.2, including (unfortunately) the shorter ${arr[@]:0} idiom, which doesn't just produce an incorrect result but actually fails. If you need to support versions prior to 4.4, and in particular 4.2, this is the only working idiom.

[Image: table of the different expansion idioms across Bash versions]

Unfortunately, other + expansions that look the same at a glance do in fact behave differently. Using :+ instead of + (:+" in the table), for example, does not work because the colon form treats an array with a single empty element (('')) as "null" and thus doesn't (consistently) expand to the same result.

Quoting the full expansion instead of the nested array ("${arr[@]+${arr[@]}}", "+ in the table), which I would have expected to be roughly equivalent, is similarly unsafe in 4.2.

You can see the code that generated this data along with results for several additional versions of bash in this gist.
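
Separately from that gist, here is a minimal sketch of the ${arr[@]+"${arr[@]}"} idiom under set -u. The helper count_args and the three sample arrays are hypothetical names made up for this example; the point is that the argument count is preserved for an empty array, an array holding one empty string, and an array whose elements contain spaces.

```bash
#!/usr/bin/env bash
set -u

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

empty=()
one_empty=('')
two=('a b' c)

# The outer expansion is unquoted; only the nested one is quoted.
n_empty=$(count_args ${empty[@]+"${empty[@]}"})
n_one=$(count_args ${one_empty[@]+"${one_empty[@]}"})
n_two=$(count_args ${two[@]+"${two[@]}"})

echo "$n_empty $n_one $n_two"   # 0 1 2
```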

Python negative subscripting

Python, Lua, and Ruby support negative subscripts. In Python, this feature was added as a footnote in version 1.4 and reaffirmed as extended slicing in version 2.3.

On p.264 of Sebesta's book (10th ed.) he claims Python does not support negative indexing on arrays. The original text was overhauled and republished as edition 6 in 2004, while Python 2.3 was released on July 29, 2003. I'm guessing extended slicing was overlooked and the text has been in error since the release of Sebesta's 6th edition.

I cannot find errata for the 10th edition. You may want to email the author and inform him.

Shell parameter expansion: $# vs. ${#@}

${#@} / ${#*} is the same as $# in most POSIX-like shells, but not all - a notable exception is dash, which acts as sh on Ubuntu systems.

$# is the POSIX-compliant form, so it is the safe (portable) choice (from the POSIX spec, prefix $ implied):

# Expands to the decimal number of positional parameters.


Optional background information

The POSIX shell spec is largely based on the historical Bourne shell, whose only array-like construct is the sequence of positional parameters ($1, $2, ...), with $# containing the count of positional parameters, $* expanding to a space-separated list of the parameter values that is then subject to word-splitting, and "$@" - in a double-quoted context - expanding to the positional parameters as originally specified (even if they contain embedded whitespace).

The following discusses bash, ksh, and zsh; dash, which acts fundamentally differently, is discussed at the bottom.

bash, ksh, and zsh:

POSIX-compatible shells such as ksh and bash later generalized this pseudo-array to provide bona fide array variables, whose syntax borrowed from the positional-parameter syntax (zsh supports this syntax too, but has its own, simpler syntax as well):

${arr[*]} and "${arr[@]}" function analogously to $* and "$@", and both ${#arr[@]} and ${#arr[*]} correspond to $#.

Perhaps in a nod to the original syntax, these shells (zsh included) also chose to support ${#@} and ${#*} for symmetry, where you can think of @ / * as the all-elements subscripts of the implied array, i.e., the pseudo-array of positional parameters.

As for symmetry regarding element extraction:

  • Something like ${@[2]} to mirror $2 works only in zsh, not in bash and ksh.

  • The equivalent slicing syntax works in all of them, however: ${@:2:1}
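
The symmetry described above can be sketched in bash as follows. The argument values and the array name arr are arbitrary. Note that the slice offset for $@ counts from 1 (offset 0 is $0), whereas array indices start at 0.

```bash
#!/usr/bin/env bash
# Set three positional parameters, then copy them into a real array.
set -- 'first arg' second third
arr=("$@")

count_pos=${#@}          # same as $# in bash
count_arr=${#arr[@]}

second_pos="${@:2:1}"    # slice: start at parameter 2, take 1
second_arr="${arr[1]}"   # arrays are zero-based, so index 1

echo "$count_pos $count_arr"     # 3 3
echo "$second_pos $second_arr"   # second second
```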


dash:

dash, the default shell (/bin/sh) on Ubuntu systems, is mostly restricted to POSIX-only features, and does not support arrays at all.

As a consequence, it treats ${#@} / ${#*} differently: it interprets @ and * as the scalar string list of the (expanded) parameters and returns that string's length.

In other words: in dash, echo "${#@}" / echo "${#*}" is the equivalent of: list="$@"; echo "${#list}".

In the absence of support for arrays altogether, dash fittingly neither supports ${@[2]} nor ${@:2:1}.
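
To make the difference concrete, the sketch below runs in bash and only emulates the behavior described for dash: it joins the parameters into one string and measures that string's length. Using "$*" for the join is a choice made here; the sample arguments are arbitrary, and the result has not been checked against dash itself.

```bash
#!/usr/bin/env bash
set -- 'a b' c

count=$#                 # portable: number of positional parameters

list="$*"                # parameters joined with spaces: "a b c"
joined_length=${#list}   # length of that string, not a parameter count

echo "$count"            # 2
echo "$joined_length"    # 5
```

Because the two numbers differ for almost any input, a script that must also run under /bin/sh is safer sticking with $#.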

With arrays, why is it the case that a[5] == 5[a]?

The C standard defines the [] operator as follows:

a[b] == *(a + b)

Therefore a[5] will evaluate to:

*(a + 5)

and 5[a] will evaluate to:

*(5 + a)

In an expression, a converts to a pointer to the first element of the array. a[5] is the element 5 positions past that, which is *(a + 5), while 5[a] is *(5 + a). From elementary school math we know those are equal (addition is commutative).

Is there a bash command that can tell the size of a shell variable

wc can tell you how many characters and bytes are in a variable, and bash itself can tell you how many elements are in an array. If what you're looking for is how large bash's internal structures are for holding a specific variable then I don't believe that's available anywhere.

$ foo=42
$ bar=(1 2 3 4)
$ echo -n "$foo" | wc -c -m
2 2
$ echo "${#bar[@]}"
4
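
A variation on the same idea, offered as a sketch: bash can report the character length itself with ${#foo}, and printf '%s' avoids the portability quirks of echo -n when piping to wc. The variable names chars, bytes, and elems are introduced here for illustration; the tr call strips the padding that some wc implementations add.

```bash
#!/usr/bin/env bash
foo=42
bar=(1 2 3 4)

chars=${#foo}                                             # characters, no external tool
bytes=$(printf '%s' "$foo" | wc -c | tr -d '[:space:]')   # bytes, via wc
elems=${#bar[@]}                                          # number of array elements

echo "$chars $bytes $elems"   # 2 2 4
```

For plain ASCII content the character and byte counts agree; with multibyte characters they can differ depending on the locale.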

