Dynamic indirect Bash array
Bash with Coreutils, grep and sed
If I understand your code correctly, you are trying to emulate multidimensional arrays, which Bash doesn't support. If I were to solve this problem from scratch, I'd use this mix of command line tools (see the security concerns at the end of the answer!):
#!/bin/bash
while IFS= read -r name; do
printf "%s=(\"%d\" \"%s\")\n" \
"$name" \
"$(grep -c "$name" "$1")" \
"$(grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//')"
done < <(cut -d ',' -f 2 "$1" | sort -u)
Sample output:
$ ./SO.sh infile
jack=("1" "log3,jack,time,etc")
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")
This uses process substitution to prepare the log file so we can loop over unique names; the output of the substitution looks like
$ cut -d ',' -f 2 "$1" | sort -u
jack
john
peter
i.e., a list of unique names.
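To make the process substitution step concrete, here is a minimal sketch that builds a throwaway log file and loops over its unique names. The helper name unique_names and the sample file contents are assumptions for illustration; the sample lines simply mirror the output shown above.

```bash
#!/bin/bash
# Hypothetical helper: print the unique values of the second
# comma-separated field of the file given as $1.
unique_names() {
    cut -d ',' -f 2 "$1" | sort -u
}

# Sample log, assumed to match the format used in this answer.
logfile=$(mktemp)
printf '%s\n' \
    'log1,john,time,etc' \
    'log2,peter,time,etc' \
    'log3,jack,time,etc' \
    'log4,peter,time,etc' > "$logfile"

# The loop reads from a process substitution instead of a pipe, so it
# runs in the current shell and variables set inside it survive the loop.
n=0
while IFS= read -r name; do
    echo "found: $name"
    (( n += 1 ))
done < <(unique_names "$logfile")
echo "unique names: $n"

rm -f "$logfile"
```

Had the loop been fed with a pipe (cut ... | while read ...), the counter n would have been updated in a subshell and lost after the loop.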
For each name, we then print the summarized log line with
printf "%s=(\"%d\" \"%s\")\n"
Where
- The %s string is just the name ("$name").
- The log line count is the output of a grep command, grep -c "$name" "$1", which counts the number of lines that contain "$name". If the name can occur elsewhere in the log line, we can limit the search to just the second field of the log lines with grep -c "$name" <(cut -d ',' -f 2 "$1")
Finally, to get all log lines on one line with proper quoting, we use
grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//'
This gets all lines containing "$name", replaces newlines with spaces, then surrounds the spaces with quotes and removes the extra quotes from the end of the line.
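As a quick check of the quoting pipeline, the sketch below runs it on two sample lines. The helper name quote_lines is hypothetical, and the approach assumes that log lines contain no spaces, since every space is treated as a line boundary.

```bash
#!/bin/bash
# Hypothetical helper: read lines on stdin and join them with `" "`,
# so the result can be wrapped in a single pair of double quotes.
quote_lines() {
    tr '\n' ' ' | sed 's/ /" "/g;s/" "$//'
}

joined=$(printf '%s\n' 'log2,peter,time,etc' 'log4,peter,time,etc' | quote_lines)

# Wrap the joined lines the same way the script's printf does.
printf 'peter=("%d" "%s")\n' 2 "$joined"
```

This prints the same peter=(...) line as in the sample output above.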
Pure Bash
After initially thinking that pure Bash would be too cumbersome, it turned out to be not all that complicated:
#!/bin/bash
declare -A count
declare -A lines
old_ifs=$IFS
IFS=,
while read -r -a line; do
name="${line[1]}"
(( ++count[$name] ))
lines[$name]+="\"${line[*]}\" "
done < "$1"
for name in "${!count[@]}"; do
printf "%s=(\"%d\" %s)\n" "$name" "${count[$name]}" "${lines[$name]% }"
done
IFS="$old_ifs"
This updates two associative arrays while looping over the input file: count keeps track of the number of times a certain name occurs, and lines appends the log lines to an entry per name.
To separate fields by commas, we set the input field separator IFS to a comma (but save it beforehand so it can be reset at the end).
read -r -a reads the lines into an array line with comma separated fields, so the name is now in ${line[1]}. We increase the count for that name in the arithmetic expression (( ... )), and append (+=) the log line in the next line.
${line[*]} prints all fields of the array separated by IFS, which is exactly what we want. We also add a space here; the unwanted space at the end of the line (after the last element) will be removed later.
The second loop iterates over all the keys of the count array (the names), then prints the properly formatted line for each. ${lines[$name]% } removes the space from the end of the line.
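The IFS handling can be tried in isolation. The sketch below is a variant under stated assumptions, not the script itself: it wraps the split in a hypothetical function split_and_join and uses local IFS, so nothing has to be saved and restored afterwards.

```bash
#!/bin/bash
# Hypothetical helper: split one log line on commas, then show the
# name field and the fields joined back together.
split_and_join() {
    local IFS=,          # the change is scoped to this function
    local -a line
    read -r -a line <<< "$1"
    # "${line[*]}" joins the fields with the first character of IFS.
    printf 'name=%s joined=%s\n' "${line[1]}" "${line[*]}"
}

split_and_join 'log2,peter,time,etc'
```

Scoping IFS with local is a design choice for this sketch; the save-and-restore approach in the script works just as well at the top level of a script, where local is not available.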
Security concerns
As it seems that the output of these scripts is supposed to be reused by the shell, we might want to prevent malicious code execution if we can't trust the contents of the log file.
A way to do that for the Bash solution (hat tip: Charles Duffy) would be the following: the for loop would have to be replaced by
for name in "${!count[@]}"; do
IFS=' ' read -r -a words <<< "${lines[$name]}"
printf -v words_str '%q ' "${words[@]}"
printf "%q=(\"%d\" %s)\n" "$name" "${count[$name]}" "${words_str% }"
done
That is, we split the combined log lines into an array words, print that with the %q formatting flag into a string words_str, and then use that string for our output, resulting in escaped output like this:
peter=("2" \"log2\,peter\,time\,etc\" \"log4\,peter\,time\,etc\")
jack=("1" \"log3\,jack\,time\,etc\")
john=("1" \"log1\,john\,time\,etc\")
The same could be done for the first solution.
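To see why %q makes the output safe to feed back to the shell, here is a small sketch. The helper name roundtrip and the sample log line are assumptions for illustration; the embedded command is deliberately harmless.

```bash
#!/bin/bash
# Hypothetical helper: quote a string with %q, evaluate the quoted form
# as an assignment, and print what the shell ended up with.
roundtrip() {
    local quoted restored
    printf -v quoted '%q' "$1"
    eval "restored=$quoted"
    printf '%s\n' "$restored"
}

# A log line with an embedded command substitution.
hostile='log5,$(echo INJECTED),time,etc'
roundtrip "$hostile"
```

The line comes back unchanged, with the $(...) still literal text: the quoted form is evaluated as data, so the embedded command never runs.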
How to iterate over an array using indirect reference?
${!ARRAYNAME[@]} means "the indices of ARRAYNAME". As stated in the bash man page, since ARRAYNAME is set, but as a string rather than an array, it returns 0.
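A short sketch of that behavior, using the same variable names as the question, for anyone who wants to confirm it:

```bash
#!/usr/bin/env bash
ARRAYNAME='FRUITS'
FRUITS=( APPLE BANANA ORANGE )

# ARRAYNAME is a plain string, so its only "index" is 0.
echo "indices of ARRAYNAME: ${!ARRAYNAME[@]}"

# FRUITS is a real array, so its indices are listed.
echo "indices of FRUITS: ${!FRUITS[@]}"
```

The first line prints 0 rather than anything related to FRUITS, which is why the original loop ran exactly once over the wrong value.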
Here's a solution using eval.
#!/usr/bin/env bash
ARRAYNAME='FRUITS'
FRUITS=( APPLE BANANA ORANGE )
eval "array=( \"\${${ARRAYNAME}[@]}\" )"
for fruit in "${array[@]}"; do
echo "${fruit}"
done
What you were originally trying to do was create an Indirect Reference. These were introduced in bash version 2 and were meant to largely replace the need for eval when trying to achieve reflection-like behavior in the shell.
What you have to do when using indirect references with arrays is include the [@] in your guess at the variable name:
#!/usr/bin/env bash
ARRAYNAME='FRUITS'
FRUITS=( APPLE BANANA ORANGE )
array="${ARRAYNAME}[@]"
for fruit in "${!array}"; do
echo "$fruit"
done
All that said, it's one thing to use Indirect References in this trivial example, but, as indicated in the link provided by Dennis Williamson, you should be hesitant to use them in real-world scripts. They are all but guaranteed to make your code more confusing than necessary. Usually you can get the functionality you need with an Associative Array.
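As a minimal sketch of the associative array alternative (bash 4 or newer), the lookup key takes the place of the dynamically chosen variable name. The array name color, its contents, and the helper color_of are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical data: instead of one variable per fruit selected by
# name, keep a single associative array keyed by that name.
declare -A color=(
    [APPLE]=red
    [BANANA]=yellow
    [ORANGE]=orange
)

# Look up a value by a key that is only known at runtime.
color_of() {
    printf '%s\n' "${color[$1]}"
}

for fruit in APPLE BANANA ORANGE; do
    echo "$fruit is $(color_of "$fruit")"
done
```

No eval or indirect expansion is involved, and the key is never interpreted as a variable name.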
Dynamic array variable name in bash
With bash 4.3 or newer, declare -n aliasName=destVarName will make aliasName refer to destVarName, even for arrays. This permits any kind of assignment, dereferencing, etc. that you would otherwise use.
#!/usr/bin/env bash
# ^^^^^^^^ - Use bash version from PATH; on MacOS, this should be newer
# than the system one if MacPorts, Homebrew, etc. is installed.
case $BASH_VERSION in
''|[1-3]*|4.[0-2]*) echo "This code requires bash 4.3 or newer" >&2; exit 1;;
esac
# to make "index0", "index1", &c. valid indexes, our arrays need to be associative
declare -A arrayFolder1 arrayFolder2
var1=1
declare -n curArrayFolder=arrayFolder$var1
curArrayFolder[index0]=file1
curArrayFolder[index1]=file2
curArrayFolder[index2]=file3
unset -n curArrayFolder
var1=2
declare -n curArrayFolder=arrayFolder$var1
curArrayFolder[index0]=file4
curArrayFolder[index1]=file5
curArrayFolder[index2]=file6
unset -n curArrayFolder
...will properly result in a situation where:
declare -p arrayFolder1 arrayFolder2
emits as output:
declare -A arrayFolder1=([index0]="file1" [index1]="file2" [index2]="file3" )
declare -A arrayFolder2=([index0]="file4" [index1]="file5" [index2]="file6" )
If you want to try to cut down the number of commands needed to switch over which folder is current, consider a function:
setCurArrayFolder() {
declare -p curArrayFolder &>/dev/null && unset -n curArrayFolder
declare -g -n curArrayFolder="arrayFolder$1"
var1=$1
}
Then the code becomes:
setCurArrayFolder 1
curArrayFolder[index0]=file1
curArrayFolder[index1]=file2
curArrayFolder[index2]=file3
setCurArrayFolder 2
curArrayFolder[index0]=file4
curArrayFolder[index1]=file5
curArrayFolder[index2]=file6
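Namerefs can also be scoped to a function with local -n, which avoids the global alias altogether. The following is a sketch under that assumption; the helper addFile and the nameref _target are hypothetical names, and bash 4.3 or newer is still required.

```bash
#!/usr/bin/env bash
# Hypothetical helper: store value $3 under key $2 in the array whose
# name is given as $1.
addFile() {
    local -n _target=$1   # _target is an alias for the named array
    _target[$2]=$3
}

declare -A arrayFolder1 arrayFolder2
addFile arrayFolder1 index0 file1
addFile arrayFolder2 index0 file4

echo "${arrayFolder1[index0]} ${arrayFolder2[index0]}"
```

The underscore prefix on _target is there to make a clash with the caller's array name unlikely, since a nameref cannot usefully point at a variable with its own name.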
Iterate over bash arrays, substitute array name dynamically, is this possible?
You can use bash's indirect expansion for this:
loopOverSomething()
{
looparray="$1[@]"
for i in "${!looparray}"
do
echo "value is $i"
done
}
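A possible way to call it is sketched below. The function is repeated so the example runs on its own, with local added as a small assumption to keep the helper variables out of the global scope; the FRUITS array is sample data.

```bash
#!/usr/bin/env bash
loopOverSomething() {
    local looparray="$1[@]"   # e.g. "FRUITS[@]"
    local i
    for i in "${!looparray}"; do
        echo "value is $i"
    done
}

# Sample data; one element contains a space to show it stays intact.
FRUITS=( APPLE "BLOOD ORANGE" )
loopOverSomething FRUITS
```

Because the indirect expansion is quoted, elements with embedded spaces are passed through as single values.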
dynamic name for associative array in bash
The array name and index together are needed for indirect parameter expansion.
echoValue () {
# $1 array name
# $2 array index
t="$1[$2]"
echo "${!t}"
}
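For illustration, here is the function applied to both an associative and an indexed array. The arrays ages and nums are invented sample data, and local is added as an assumption to keep t out of the global scope.

```bash
#!/usr/bin/env bash
echoValue() {
    # $1 array name
    # $2 array index
    local t="$1[$2]"   # e.g. "ages[peter]"
    echo "${!t}"
}

declare -A ages=( [peter]=42 [jack]=37 )   # sample data
nums=( ten twenty thirty )                 # sample data

echoValue ages peter
echoValue nums 1
```

The same function works for both kinds of array, because the subscript is simply part of the string that gets expanded indirectly.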