How to use a shell script to read an element from a file, do some calculation, and write it back?

Your requirement is unclear. What calculation do you mean?

Here is an example; I hope it helps:

kent$  cat test.txt
100 B1 C1
200 B2 C2
300 B3 C3

# in this example, the "calculation" is squaring the first field of each line
kent$ awk '{$1*=$1}1' test.txt
10000 B1 C1
40000 B2 C2
90000 B3 C3

#If you want to get the result in a newfile.txt, do this:

kent$ awk '{$1*=$1}1' test.txt >newfile.txt


kent$ cat newfile.txt
10000 B1 C1
40000 B2 C2
90000 B3 C3
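
If the result should go back into the same file rather than into a new one, a common pattern is to write to a temporary file and then move it over the original. A minimal sketch, assuming the sample data above is recreated in a scratch directory (the names workdir, file and tmp are only illustrative):

```shell
#!/usr/bin/env bash
# Sketch: square column 1 and write the result back to the same file.
workdir=$(mktemp -d)                 # scratch directory, an assumption for this demo
file="$workdir/test.txt"
printf '%s\n' '100 B1 C1' '200 B2 C2' '300 B3 C3' > "$file"

tmp=$(mktemp "$workdir/tmp.XXXXXX")
# Replace the original only if awk succeeded, so a failure cannot truncate it.
awk '{$1*=$1}1' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
echo "$result"
rm -rf "$workdir"
```

Redirecting straight to the input file (awk ... test.txt > test.txt) would empty it before awk reads it, which is why the temporary file is there.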


Here is an example of how to invoke date from awk (GNU date is needed for the -d @ syntax):

kent$ echo "1359579362 B1 C1" | awk '{"date -d @"$1|getline $1}1'
Wed Jan 30 21:56:02 CET 2013 B1 C1

I guess that is what you are looking for.

good luck.

How to select an element from a 2d array in a file in Linux shell

awk is great for this:

$ awk 'NR==row{print $col}' row=2 col=2 file
  • NR==row{} means: on the record whose number equals row, run the block in {}. By default a record is a line, so NR is the line number.
  • {print $col} means: print field number col.
  • row=2 col=2 passes both values to awk as variables.


One more little question: how can I turn this into a sh file so
that when I enter -r 2 -c 2 test.dat at the prompt, the script
reads from the file and echoes the output?

For example:

file=$1; row=$2; col=$3
awk 'NR==row{print $col}' row="$row" col="$col" "$file"

And you execute like:

./script a 3 2
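
To accept the -r 2 -c 2 test.dat form from the question instead of positional arguments, getopts can parse the flags. A rough sketch, where the function name pick_cell, the defaults of 1, and the demo data are assumptions made for this example:

```shell
#!/usr/bin/env bash
# Sketch: pick one field from a whitespace-separated file.
# pick_cell and its -r/-c flags are hypothetical names for this example.
pick_cell() {
    local row=1 col=1 opt OPTIND=1
    while getopts 'r:c:' opt; do
        case $opt in
            r) row=$OPTARG ;;
            c) col=$OPTARG ;;
            *) echo "usage: pick_cell -r ROW -c COL FILE" >&2; return 2 ;;
        esac
    done
    shift $((OPTIND - 1))            # drop the parsed options, leaving FILE in $1
    awk -v row="$row" -v col="$col" 'NR == row { print $col }' "$1"
}

# Demo data, created only for this example.
datafile=$(mktemp)
printf '%s\n' '100 B1 C1' '200 B2 C2' '300 B3 C3' > "$datafile"

cell=$(pick_cell -r 2 -c 2 "$datafile")
echo "$cell"
rm -f "$datafile"
```

In a standalone script the body of pick_cell would run at top level against "$@".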

How can I parse a YAML file from a Linux shell script?

My use case may or may not be quite the same as what this original post was asking, but it's definitely similar.

I need to pull in some YAML as bash variables. The YAML will never be more than one level deep.

YAML looks like so:

KEY:                value
ANOTHER_KEY: another_value
OH_MY_SO_MANY_KEYS: yet_another_value
LAST_KEY: last_value

Output like-a dis:


I achieved the output with this line:

sed -e 's/:[^:\/\/]/="/g;s/$/"/g;s/ *=/=/g' file.yaml >
  • s/:[^:\/\/]/="/g finds : and replaces it with =", while ignoring :// (for URLs)
  • s/$/"/g appends " to the end of each line
  • s/ *=/=/g removes all spaces before =
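
For input where the values are padded with several spaces after the colon, a slightly different expression may be easier to reason about. The sketch below is a variant, not the one-liner above: it assumes every line is a flat KEY: value pair whose key is a valid shell identifier, and the sample keys and URL are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: turn flat "KEY: value" lines into shell assignments, then load them.
yaml=$(mktemp)
cat > "$yaml" <<'EOF'
KEY:                value
ANOTHER_KEY: another_value
SITE_URL: http://example.com/path
EOF

# Capture the key, skip the colon and any padding, and quote the rest of the line.
assignments=$(sed -E 's/^([A-Za-z_][A-Za-z0-9_]*):[[:space:]]*(.*)$/\1="\2"/' "$yaml")
echo "$assignments"

# eval runs the generated text as shell code: only do this with trusted input.
eval "$assignments"
echo "$KEY / $SITE_URL"
rm -f "$yaml"
```

Because only the first colon on the line is consumed, the :// in a URL is left alone.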

Creating an array from a text file in Bash

Use the mapfile command:

mapfile -t myArray < file.txt

The error is using for -- the idiomatic way to loop over lines of a file is:

while IFS= read -r line; do echo ">>$line<<"; done < file.txt

See BashFAQ/005 for more details.
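
Putting the two together, a small sketch that loads a file into an array and then walks it by index. It assumes bash 4 or later (for mapfile); the sample file contents are made up.

```shell
#!/usr/bin/env bash
# Sketch: load a file into an array, then loop over it. Needs bash 4+ for mapfile.
input_file=$(mktemp)
printf '%s\n' 'first line' 'second  line' 'third' > "$input_file"

mapfile -t myArray < "$input_file"   # -t strips the trailing newline from each element

count=${#myArray[@]}
echo "read $count lines"
for i in "${!myArray[@]}"; do
    printf '%d: >>%s<<\n' "$i" "${myArray[$i]}"
done
rm -f "$input_file"
```

Each line becomes exactly one element, so inner whitespace such as the double space in the second line is preserved.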

Return value in a Bash function

Although Bash has a return statement, the only thing you can specify with it is the function's own exit status (a value between 0 and 255, 0 meaning "success"). So return is not what you want.

You might want to convert your return statement to an echo statement. That way the function's output can be captured with $() command substitution, which seems to be exactly what you want.

Here is an example:

function fun1(){
  echo 34
}

function fun2(){
  local res=$(fun1)
  echo $res
}

Another way to get the return value (if you just want to return an integer 0-255) is $?.

function fun1(){
  return 34
}

function fun2(){
  fun1            # call it first; $? holds the status of the last command
  local res=$?
  echo $res
}

Also, note that you can use the return value to use Boolean logic - like fun1 || fun2 will only run fun2 if fun1 returns a non-0 value. The default return value is the exit value of the last statement executed within the function.
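
A short sketch of both conventions side by side: exit status for yes/no answers, standard output for data. The helpers is_even and double are hypothetical names invented for this example.

```shell
#!/usr/bin/env bash
# Sketch: exit status for yes/no answers, stdout for data.
# is_even and double are hypothetical helpers.
is_even() {
    (( $1 % 2 == 0 ))      # the status of the last command becomes the function's status
}

double() {
    echo $(( $1 * 2 ))
}

if is_even 4; then
    verdict="even"
else
    verdict="odd"
fi

fallback=""
is_even 7 || fallback="7 is odd"   # right-hand side runs only on a non-zero status
doubled=$(double 21)               # capture the printed value

echo "$verdict, $fallback, $doubled"
```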

How can I quickly sum all numbers in a file?

For a Perl one-liner, it's basically the same thing as the awk solution in Ayman Hourieh's answer:

 % perl -nle '$sum += $_ } END { print $sum'

If you're curious what Perl one-liners do, you can deparse them:

 %  perl -MO=Deparse -nle '$sum += $_ } END { print $sum'

The result is a more verbose version of the program, in a form that no one would ever write on their own:

BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
    chomp $_;
    $sum += $_;
}
sub END {
    print $sum;
}
-e syntax OK

Just for giggles, I tried this with a file containing 1,000,000 numbers (in the range 0 - 9,999). On my Mac Pro, it returns virtually instantaneously. That's too bad, because I was hoping using mmap would be really fast, but it's just the same time:

use 5.010;
use File::Map qw(map_file);

map_file my $map, $ARGV[0];

$sum += $1 while $map =~ m/(\d+)/g;

say $sum;

Shell command to sum integers, one per line?

Bit of awk should do it?

awk '{s+=$1} END {print s}' mydatafile

Note: some versions of awk behave oddly when the total exceeds 2^31 - 1 (2147483647). See comments for more background. One suggestion is to use printf rather than print:

awk '{s+=$1} END {printf "%.0f\n", s}' mydatafile
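
As a quick check of the large-number case, the sketch below sums a small sample with awk and again with bash's own integer arithmetic. The sample values are invented, and the comparison assumes a bash build with 64-bit arithmetic, which is typical.

```shell
#!/usr/bin/env bash
# Sketch: sum one integer per line, once with awk and once with bash arithmetic.
numbers=$(mktemp)
printf '%s\n' 10 20 30 2147483647 > "$numbers"

awk_sum=$(awk '{s+=$1} END {printf "%.0f\n", s}' "$numbers")

bash_sum=0
while IFS= read -r n; do
    (( bash_sum += n ))
done < "$numbers"

echo "awk: $awk_sum, bash: $bash_sum"
rm -f "$numbers"
```

The bash loop is much slower than awk on big files; it is here only as a cross-check.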

How can I store the find command results as an array in Bash

Update 2020 for Linux Users:

If you have an up-to-date version of bash (4.4-alpha or better), as you probably do if you are on Linux, then you should be using Benjamin W.'s answer.

If you are on Mac OS, which —last I checked— still used bash 3.2, or are otherwise using an older bash, then continue on to the next section.

Answer for bash 4.3 or earlier

Here is one solution for getting the output of find into a bash array:

array=()
while IFS= read -r -d $'\0'; do
    array+=("$REPLY")
done < <(find . -name "${input}" -print0)

This is tricky because, in general, file names can have spaces, new lines, and other script-hostile characters. The only way to use find and have the file names safely separated from each other is to use -print0 which prints the file names separated with a null character. This would not be much of an inconvenience if bash's readarray/mapfile functions supported null-separated strings but they don't. Bash's read does and that leads us to the loop above.

[This answer was originally written in 2014. If you have a recent version of bash, please see the update below.]

How it works

  1. The first line creates an empty array: array=()

  2. Every time that the read statement is executed, a null-separated file name is read from standard input. The -r option tells read to leave backslash characters alone. The -d $'\0' tells read that the input will be null-separated. Since we omit the name to read, the shell puts the input into the default name: REPLY.

  3. The array+=("$REPLY") statement appends the new file name to the array array.

  4. The final line combines redirection and process substitution to provide the output of find to the standard input of the while loop.
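
The four steps can be tried end to end with a throwaway directory. In this sketch the directory, the file names and the '*.txt' pattern are all made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: collect find results into an array, safely for odd file names.
demo_dir=$(mktemp -d)              # scratch directory, made up for this demo
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir/skip.log"
input='*.txt'                      # pattern handed to find -name

array=()
while IFS= read -r -d $'\0'; do    # no variable name given, so read fills REPLY
    array+=("$REPLY")
done < <(find "$demo_dir" -name "${input}" -print0)

found=${#array[@]}
echo "found $found files"
printf '  %s\n' "${array[@]}"
rm -rf "$demo_dir"
```

The file name containing a space arrives as a single array element.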

Why use process substitution?

If we didn't use process substitution, the loop could be written as:

find . -name "${input}" -print0 >tmpfile
while IFS= read -r -d $'\0'; do
    array+=("$REPLY")
done <tmpfile
rm -f tmpfile

In the above, the output of find is stored in a temporary file, and that file is used as standard input to the while loop. The idea of process substitution is to make such temporary files unnecessary. So, instead of having the while loop get its stdin from tmpfile, we can have it get its stdin from <(find . -name "${input}" -print0).

Process substitution is widely useful. In many places where a command wants to read from a file, you can specify process substitution, <(...), instead of a file name. There is an analogous form, >(...), that can be used in place of a file name where the command wants to write to the file.

Like arrays, process substitution is a feature of bash and other advanced shells. It is not part of the POSIX standard.
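
One practical reason to feed a loop this way instead of through a pipe: with bash's default settings, a loop on the right-hand side of a pipe runs in a subshell, so variables it sets are lost. A small sketch with invented counter names, assuming lastpipe is off (the default):

```shell
#!/usr/bin/env bash
# Sketch: a pipe runs the loop in a subshell; process substitution does not.
pipe_count=0
printf '%s\n' a b c | while IFS= read -r line; do
    (( pipe_count++ ))             # changes a copy inside the subshell
done

ps_count=0
while IFS= read -r line; do
    (( ps_count++ ))               # changes the variable in the current shell
done < <(printf '%s\n' a b c)

echo "pipe: $pipe_count, process substitution: $ps_count"
```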

Alternative: lastpipe

If desired, lastpipe can be used instead of process substitution (hat tip: Caesar):

set +m
shopt -s lastpipe
find . -name "${input}" -print0 | while IFS= read -r -d $'\0'; do array+=("$REPLY"); done; declare -p array

shopt -s lastpipe tells bash to run the last command in the pipeline in the current shell rather than in a subshell. This way, the array remains in existence after the pipeline completes. Because lastpipe only takes effect if job control is turned off, we run set +m. (In a script, as opposed to the command line, job control is off by default.)

Additional notes

The following command creates a shell variable, not a shell array:

array=`find . -name "${input}"`

If you wanted to create an array, you would need to put parens around the output of find. So, naively, one could:

array=(`find . -name "${input}"`)  # don't do this

The problem is that the shell performs word splitting on the results of find so that the elements of the array are not guaranteed to be what you want.
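
The effect is easy to reproduce. This sketch creates a single file whose name contains a space and counts the elements the naive form produces; it assumes the temporary directory path itself contains no whitespace.

```shell
#!/usr/bin/env bash
# Sketch: word splitting breaks a file name that contains a space.
demo_dir=$(mktemp -d)              # assumed to have no whitespace in its path
touch "$demo_dir/two words.txt"

naive=( $(find "$demo_dir" -name '*.txt') )   # don't do this
naive_count=${#naive[@]}

echo "one file on disk, $naive_count array elements"
rm -rf "$demo_dir"
```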

Update 2019

Starting with version 4.4-alpha, bash now supports a -d option so that the above loop is no longer necessary. Instead, one can use:

mapfile -d $'\0' array < <(find . -name "${input}" -print0)

For more information on this, please see (and upvote) Benjamin W.'s answer.
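
A self-contained sketch of the mapfile form, assuming bash 4.4 or later. It writes the delimiter as '' (equivalent to $'\0' here) and adds -t to drop it from each element; the directory and file names are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: null-delimited find output straight into an array (bash 4.4+).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b c.txt" "$demo_dir/notes.md"
input='*.txt'

mapfile -d '' -t array < <(find "$demo_dir" -name "${input}" -print0)

mapped=${#array[@]}
echo "mapped $mapped files"
printf '  %s\n' "${array[@]}"
rm -rf "$demo_dir"
```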

Best way to simulate group by from bash?

sort ip_addresses | uniq -c

This will print the count first, but other than that it should be exactly what you want.
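
If the count should come after the value, or the groups should be ordered by size, a couple of extra pipeline stages can do it. A sketch with made-up addresses, assuming each line holds a single value with no embedded spaces:

```shell
#!/usr/bin/env bash
# Sketch: group, count, order by count (largest first), print "value count".
ips=$(mktemp)
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.1 10.0.0.2 > "$ips"

grouped=$(sort "$ips" | uniq -c | sort -rn | awk '{print $2, $1}')
echo "$grouped"
rm -f "$ips"
```

uniq only merges adjacent duplicates, which is why the first sort is needed.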

