Looping Through the Content of a File in Bash

One way to do it is:

while read p; do
  echo "$p"
done <peptides.txt

As pointed out in the comments, this has the side effects of trimming leading and trailing whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:

while IFS="" read -r p || [ -n "$p" ]
do
  printf '%s\n' "$p"
done < peptides.txt
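
To make the difference concrete, here is a small sketch that runs both loops over the same input. The temporary file and its contents are made up for illustration; it contains leading spaces, a backslash, and no final linefeed.

```shell
#!/bin/bash
# Hypothetical sample file: leading spaces, a backslash, no trailing newline.
sample=$(mktemp)
printf '  indented\nback\\slash\nlast line' > "$sample"

# Plain read: trims whitespace, eats the backslash, drops the unterminated line.
naive=$(while read p; do printf '[%s]\n' "$p"; done < "$sample")

# IFS= and -r keep each line intact; the || test catches the unterminated line.
robust=$(while IFS= read -r p || [ -n "$p" ]; do printf '[%s]\n' "$p"; done < "$sample")

printf 'naive:\n%s\nrobust:\n%s\n' "$naive" "$robust"
rm -f "$sample"
```

The naive loop should print only [indented] and [backslash], while the robust loop prints all three lines unchanged.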

Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:

while read -u 10 p; do
  ...
done 10<peptides.txt

Here, 10 is just an arbitrary number (different from 0, 1, 2).
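
As a rough illustration of why the separate descriptor helps, the sketch below has a loop body that itself reads from standard input. The sample file, the here-document, and the variable names are assumptions made up for the example.

```shell
#!/bin/bash
# Hypothetical sample data standing in for peptides.txt.
list=$(mktemp)
printf 'alpha\nbeta\n' > "$list"

collected=""
# The file is read on descriptor 10, so stdin stays free for the loop body.
while IFS= read -r -u 10 p; do
  # This read consumes standard input (the here-document), not the file.
  IFS= read -r answer
  collected+="$p:$answer "
done 10< "$list" <<'EOF'
yes
no
EOF

echo "$collected"
rm -f "$list"
```

If both reads shared standard input, the inner read would steal lines from the file being looped over.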

bash - loop through file contents and append to string

By putting the read loop in a pipe, you are building the string in a subprocess. Instead, do something like:

#!/bin/bash

some_string=""

while read line; do 
    some_string+="$line"
done < .env

echo "$some_string"

But, really, don't do any of that. Instead, do:

some_string=$(tr -d \\n < .env)

Sometimes you do want to keep the pipe. Just be aware that variables set inside it lose their values when the subprocess exits, so any work that needs them has to happen inside the same braces:

#!/bin/bash

some_string=""

cmd | {
    while read line; do 
        some_string+="$line"
    done 
    echo "in pipe, some_string=$some_string"
}
echo "after pipe, some_string=$some_string"
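
If the value is needed after the loop and the input has to come from a command, one alternative (not part of the answer above, offered only as a sketch) is bash process substitution, which keeps the loop in the current shell. Here printf stands in for the real command.

```shell
#!/bin/bash
some_string=""

# Process substitution feeds the loop without putting it in a subshell,
# so the variable is still set afterwards. printf is a placeholder command.
while IFS= read -r line; do
    some_string+="$line"
done < <(printf 'a\nb\nc\n')

echo "after loop, some_string=$some_string"
```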

Looping through a file but ignoring the first two lines in Bash

Given the small number of lines (2) to be ignored, I suggest the following:

{
  read; read
  while IFS="" read -r p || [ -n "$p" ] ; do
    ...
  done
} < diff.txt

The two read calls consume the first two lines from stdin (diff.txt) and discard them, so the while loop starts at the third line.
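
A self-contained version of the same idea, with a made-up input file in place of diff.txt and an array (kept, a name chosen for this sketch) to collect what the loop sees:

```shell
#!/bin/bash
# Hypothetical stand-in for diff.txt: two header lines, then data.
input=$(mktemp)
printf 'header 1\nheader 2\nfirst\nsecond\n' > "$input"

kept=()
{
  read -r _; read -r _          # discard the two header lines
  while IFS= read -r p || [ -n "$p" ]; do
    kept+=("$p")
  done
} < "$input"

printf '%s\n' "${kept[@]}"
rm -f "$input"
```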

Looping through array of filenames in bash and read them

The basic problem is that you seem to be mixing bash and tcsh syntax -- and just by chance you're using tcsh commands that happen not to be syntax errors in bash, but don't do what you want.

This:

set fl = `basename $filename`

is how you'd set $fl to the basename of $filename in tcsh. In bash, however, the set command is quite different. Since it's not what you need to use here anyway, I won't go into details; the documentation for the set builtin covers them.

In bash, the way to set a variable is just

var=value  # NO spaces around the "="

Also, bash, unlike tcsh, has a $(command) syntax to capture the output of a command, in addition to the older `command`.

So your command

set fl = `basename $filename`

should be

fl="$(basename "$filename")"

Adding double quotes around both the $filename reference and the $(...) command substitution ensures that the shell can handle odd characters in the file name and/or command output.
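
A quick check with a path that contains spaces shows the quoting at work. The path is hypothetical, and the second assignment is an alternative using parameter expansion instead of the external basename command.

```shell
#!/bin/bash
# Hypothetical path containing spaces, to show why the quotes matter.
filename="/tmp/my data/report final.txt"

fl="$(basename "$filename")"      # external command, both levels quoted
fl2="${filename##*/}"             # same result via parameter expansion

echo "$fl"
echo "$fl2"
```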

Bash Loop Through End of File

I suggest an awk script, which scans each file only once.

 awk 'FNR == NR {wordsArr[++wordsCount] = $0}  # read file1 lines into array
      FNR != NR && /example/ {                 # read file2 line matching regExp /example/
        for (i in wordsArr) {             # scan all words in array
           if ($0 ~ wordsArr[i]) {        # if a word matched in current line
              print;                      # print the current line
              next;                       # skip rest of words,read next line
           }
        }
      }' file1 file2
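
To try the approach end to end, here is a sketch with two small made-up input files. It relies on awk's built-in FNR and NR counters to tell the first file from the second; the file contents and variable names are assumptions for the example.

```shell
#!/bin/bash
# Hypothetical sample inputs: file1 holds the words, file2 the lines to search.
dir=$(mktemp -d)
printf 'foo\nbar\n' > "$dir/file1"
printf 'example foo here\nexample baz\nno match foo\nexample bar\n' > "$dir/file2"

matches=$(awk 'FNR == NR { words[++n] = $0; next }   # first file: collect words
     /example/ {                                     # second file: candidate lines
       for (i = 1; i <= n; i++)
         if ($0 ~ words[i]) { print; next }          # print once, go to next line
     }' "$dir/file1" "$dir/file2")

echo "$matches"
rm -rf "$dir"
```

Note that each word is used as a regular expression here, so words containing regex metacharacters would need escaping.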

bash - for loop through multiple directories and their files

Would you please try the following:

#!/bin/bash

for i in my_path/*/; do
    year=${i%/}; year=${year##*/}       # extract year
    year2=$(( year + 19 ))              # add 19
    for j in "$i"*.nc; do
        echo cdo "selyear,${year}/${year2}" "$j" "$j"2
    done
done

It outputs command lines as a dry run. If it looks good, drop echo and run.
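
The two parameter expansions can be checked in isolation. The directory name below is a hypothetical example of what the my_path/*/ glob might produce.

```shell
#!/bin/bash
# Hypothetical directory entry, as the glob my_path/*/ would produce it.
i="my_path/1990/"

year=${i%/}          # strip the trailing slash            -> my_path/1990
year=${year##*/}     # strip everything up to the last "/" -> 1990
year2=$(( year + 19 ))

echo "selyear,${year}/${year2}"
```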

How to efficiently loop through the lines of a file in Bash?

See "Why is using a shell loop to process text considered bad practice?" for some of the reasons why your script is so slow.

$ cat tst.awk
{ val2hits[$0] = val2hits[$0] FS NR }
END {
    for (val in val2hits) {
        numHits = split(val2hits[val],hits)
        if ( numHits > 1 ) {
            printf "found %d equal lines:", numHits
            for ( hitNr=1; hitNr<=numHits; hitNr++ ) {
                printf " index%d=%d ,", hitNr, hits[hitNr]
            }
            print " value=" val
        }
    }
}

$ awk -f tst.awk file
found 2 equal lines: index1=1 , index2=4 , value=saudifh
found 2 equal lines: index1=3 , index2=5 , value=sometextASLKJND

To give you an idea of the performance difference between a bash script that's written to be as efficient as possible and an equivalent awk script:

bash:

$ cat tst.sh
#!/bin/bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: bash 4.0 required" >&2; exit 1;; esac

# initialize an associative array, mapping each string to the last line it was seen on
declare -A lines=( )
lineNum=0

while IFS= read -r line; do
  (( ++lineNum ))
  if [[ ${lines[$line]} ]]; then
     printf 'Content previously seen on line %s also seen on line %s: %s\n' \
       "${lines[$line]}" "$lineNum" "$line"
  fi
  lines[$line]=$lineNum
done < "$1"

$ time ./tst.sh file100k > ou.sh
real    0m15.631s
user    0m13.806s
sys     0m1.029s

awk:

$ cat tst.awk
lines[$0] {
    printf "Content previously seen on line %s also seen on line %s: %s\n", \
       lines[$0], NR, $0
}
{ lines[$0]=NR }

$ time awk -f tst.awk file100k > ou.awk
real    0m0.234s
user    0m0.218s
sys     0m0.016s

The two scripts produce identical output:

$ diff ou.sh ou.awk
$

The timings above are from the third run, to avoid caching effects, and were measured against a file generated by the following awk script:

awk 'BEGIN{for (i=1; i<=10000; i++) for (j=1; j<=10; j++) print j}' > file100k

When the input file had zero duplicate lines (generated by seq 100000 > nodups100k) the bash script executed in about the same amount of time as it did above while the awk script executed much faster than it did above:

$ time ./tst.sh nodups100k > ou.sh
real    0m15.179s
user    0m13.322s
sys     0m1.278s

$ time awk -f tst.awk nodups100k > ou.awk
real    0m0.078s
user    0m0.046s
sys     0m0.015s

Iterate over a text file line by line within a for loop in a shell script

This is my attempt:

counter=1
while IFS= read -r line
do
    for (( i=1; i <= 5 && $counter <= $1; i++ ))
    do
     server create --location "$line" "node-$counter"
     counter=$((counter+1))
    done 
done < ~/Documents/files/locations.txt

The output when the script is called with 11 as its argument:

server create --location location-1a node-1
server create --location location-1a node-2
server create --location location-1a node-3
server create --location location-1a node-4
server create --location location-1a node-5
server create --location location-2c node-6
server create --location location-2c node-7
server create --location location-2c node-8
server create --location location-2c node-9
server create --location location-2c node-10
server create --location location-3d node-11
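
For experimenting without the real server command, a dry-run sketch along the same lines could look like this. The temporary locations file, the total of 7, and the output variable are assumptions made up for the example; the commands are only collected as text, never executed.

```shell
#!/bin/bash
# Hypothetical stand-ins: a temporary locations file and a target of 7 nodes.
locations=$(mktemp)
printf 'location-1a\nlocation-2c\nlocation-3d\n' > "$locations"
total=7

counter=1
output=""
while IFS= read -r line; do
    # At most 5 nodes per location, stopping once the total is reached.
    for (( i = 1; i <= 5 && counter <= total; i++ )); do
        output+="server create --location $line node-$counter"$'\n'
        counter=$((counter + 1))
    done
done < "$locations"

printf '%s' "$output"
rm -f "$locations"
```

With a total of 7, the first location gets five nodes and the second gets the remaining two.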
