Iterate Over Lines Instead of Words in a for Loop of Shell Script


The for loop is not designed to loop over "lines". Instead it loops over "words".

Short terminology: "lines" are things separated by newlines; "words" are things separated by spaces (and newlines, among other characters). In bash lingo, "words" are called "fields".

The idiomatic way to loop over lines is to use a while loop in combination with read.

ioscan -m dsf | while read -r line
do
    printf '%s\n' "$line"
done

Note that the while loop is in a subshell because of the pipe. This can cause some confusion with variable scope. In bash you can work around this by using process substitution.

while read -r line
do
    printf '%s\n' "$line"
done < <(ioscan -m dsf)
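
For example, a counter incremented inside a piped while loop is lost once the pipe ends, while the process substitution variant keeps the change. A minimal sketch of the difference (not from the original answer, using the same ioscan -m dsf command):

count=0
ioscan -m dsf | while read -r line; do
    count=$((count + 1))    # runs in a subshell because of the pipe
done
echo "$count"               # still 0 in bash

count=0
while read -r line; do
    count=$((count + 1))    # runs in the current shell
done < <(ioscan -m dsf)
echo "$count"               # number of lines read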

See also http://mywiki.wooledge.org/BashFAQ/024


The for loop splits the things to loop over using the characters in the $IFS variable as separators. IFS is short for Internal Field Separator. Usually $IFS contains a space, a tab, and a newline. That means the for loop will loop over the "words", not over the lines.
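
To see the effect, a quick sketch (using the same ioscan -m dsf command as above): each word of the output ends up on its own line, instead of each input line.

for word in $(ioscan -m dsf)
do
    printf '%s\n' "$word"    # one word per iteration, not one line
done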

If you insist on using a for loop to loop over lines, you have to change the value of $IFS to contain only a newline. But if you do this, you have to save the old value of $IFS and restore it after the loop, because many other things also depend on $IFS.

OLDIFS="$IFS"
IFS=$'\n' # bash specific
for line in $(ioscan -m dsf)
do
printf '%s\n' "$line"
done
IFS="$OLDIFS"

In POSIX shells, which have no ANSI-C quoting ($'\n'), you can do it like this:

IFS='
'

that is: put an actual newline between the quotes.
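
Put together, a POSIX sketch of the same loop looks like this (same save-and-restore pattern as the bash version above):

OLDIFS="$IFS"
IFS='
'
for line in $(ioscan -m dsf)
do
    printf '%s\n' "$line"
done
IFS="$OLDIFS"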

Alternatively you can use a subshell to contain the change to $IFS:

(
    # changes to variables in the subshell stay in the subshell
    IFS=$'\n'
    for line in $(ioscan -m dsf)
    do
        printf '%s\n' "$line"
    done
)
# $IFS is not changed outside of the subshell

But beware: the command in the loop may itself depend on a sane setting of $IFS. Then you have to restore $IFS before executing the command and set it again before the next iteration, or some such. I do not recommend messing with $IFS. Too many commands depend on sane values in $IFS, and changing it is an endless nightmare of obscure bug hunting.
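
To illustrate why this gets ugly, here is a sketch of the restore-and-set-again dance (some_command is a hypothetical stand-in for whatever the loop body runs):

OLDIFS="$IFS"
IFS=$'\n'
for line in $(ioscan -m dsf)
do
    IFS="$OLDIFS"        # restore sane field splitting for the command
    some_command "$line"
    IFS=$'\n'            # set it again before the next iteration
done
IFS="$OLDIFS"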

See also:

  • http://wiki.bash-hackers.org/syntax/ccmd/classic_for
  • http://wiki.bash-hackers.org/commands/builtin/read
  • http://mywiki.wooledge.org/IFS
  • http://mywiki.wooledge.org/SubShell
  • http://mywiki.wooledge.org/ProcessSubstitution

For loop reading words instead of lines in bash

As @donjuedo stated, space is usually regarded as a separator; that's why you don't get whole lines. There are several ways to work around this.
I list solutions for both reading from a file and reading from the output of a command. As input, create a file with the following content and name it testfile.txt (it has empty lines as well as lines with spaces in between and at both ends):

This is the first line
2
third line

fifth line
sixth line

Solution 1: most generally applicable

while IFS= read -u 3 -r line; do
    echo ">${line}<"
    read -p "press enter to show next line" var
    echo "read caught the following input: >$var<"
done 3< testfile.txt

The variant that reads from a pipe looks exactly the same; only the last line changes. As an example, I process all lines from the test file that contain spaces, which eliminates lines two and four (here with <&3 instead of -u 3; both are equivalent):

while IFS= read -r line <&3; do
    ...
done 3< <(grep " " testfile.txt)

This works for large input files. The 3< indirection looks awkward, but it makes sure that processes within the loop can still read from standard input (see the read statement "press enter..."). This might be important if you execute commands that may show user prompts themselves (like rm etc.).
So:

  • works for large input files as well without blowing up memory

  • stdin is not redirected (instead fd 3 is used)

  • spaces in between, and at both ends of lines, are retained (thanks to IFS=)

  • empty lines are retained as well

Thanks to @BenjaminW for proposing this.

Solution 2: with IFS and for (with restrictions)

OFS="$IFS"
IFS=$'\n'
for line in $(cat testfile.txt) ; do
echo ">${line}<"
read -p "press enter to show next line" var
echo "read caught the following input: >$var<"
done
IFS="$OFS"

This temporarily changes the field separator to a newline, so the file is only split on line breaks.

  • not recommended for large input because of the command substitution ($(cat testfile.txt))

  • stdin is not redirected either, so it is possible to use stdin in the body without restrictions

  • spaces in between, and at both ends of lines, are retained (thanks to IFS)

  • empty lines like line four are skipped here (line length 0 / matching ^$)

  • if you use this, you have to make sure that you reset IFS

  • it might get messy if your loop body needs a different interpretation of fields (e.g. needs to read something else that should be split on spaces).

Iterate over a text file line by line within a for loop in a shell script

This is my attempt:

counter=1
while IFS= read -r line
do
    for (( i=1; i <= 5 && counter <= $1; i++ ))
    do
        server create --location "$line" node-$counter
        counter=$((counter+1))
    done
done < ~/Documents/files/locations.txt

The output when called with an argument of 11:

server create --location location-1a node-1
server create --location location-1a node-2
server create --location location-1a node-3
server create --location location-1a node-4
server create --location location-1a node-5
server create --location location-2c node-6
server create --location location-2c node-7
server create --location location-2c node-8
server create --location location-2c node-9
server create --location location-2c node-10
server create --location location-3d node-11

Looping through the content of a file in Bash

One way to do it is:

while read p; do
    echo "$p"
done <peptides.txt

As pointed out in the comments, this has the side effects of trimming leading whitespace, interpreting backslash sequences, and skipping the last line if it's missing a terminating linefeed. If these are concerns, you can do:

while IFS="" read -r p || [ -n "$p" ]
do
printf '%s\n' "$p"
done < peptides.txt

Exceptionally, if the loop body may read from standard input, you can open the file using a different file descriptor:

while read -u 10 p; do
    ...
done 10<peptides.txt

Here, 10 is just an arbitrary number (different from 0, 1, 2).
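
A small sketch of why the separate descriptor matters (not part of the original answer): the loop body can still prompt on standard input while the loop itself reads the file on fd 10.

while read -u 10 p; do
    printf 'peptide: %s\n' "$p"
    read -p "continue? " answer    # reads from the terminal, not from peptides.txt
done 10<peptides.txt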

How can I iterate through text two lines at a time?

Assumptions:

  • counts are accumulated across the entire file (as opposed to restarting the counts for each new line)
  • word pairs can span lines, eg, one\nword is the same as one word
  • we're only interested in 2-word pairings, ie, no need to code for a dynamic number of words (eg, 3-words, 4-words)

Sample input data:

$ cat words.dat
I am a man
I am not a man I
am a man

One awk idea:

$ awk -v RS='' '                 # treat file as one loooong single record
{ for (i=1;i<NF;i++)             # loop through list of fields 1 - (NF-1)
      count[$(i)" "$(i+1)]++     # use field i and i+1 as array index
}
END { for (i in count)           # loop through array indices
          print count[i],i
}
' words.dat

This generates:

2 am a
3 a man
1 am not
3 I am
1 not a
2 man I

NOTE: no sorting requirement was stated; otherwise we could pipe the result to sort, or, if using GNU awk, add an appropriate PROCINFO["sorted_in"] statement.
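
For instance, one could pipe the result to sort, or, with GNU awk only, request a sorted traversal of the array; both are sketches beyond the stated requirement:

$ awk -v RS='' '{for (i=1;i<NF;i++) count[$(i)" "$(i+1)]++} END {for (i in count) print count[i],i}' words.dat | sort -rn

$ gawk -v RS='' '{for (i=1;i<NF;i++) count[$(i)" "$(i+1)]++}
                 END {PROCINFO["sorted_in"]="@val_num_desc"; for (i in count) print count[i],i}' words.dat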

OP's original input:

$ awk -v RS='' '{for (i=1;i<NF;i++) count[$(i)" "$(i+1)]++} END {for (i in count) print count[i],i}' <<< "I am a man"
1 am a
1 a man
1 I am

Removing the assumption about dynamic word counts ...

$ awk -v wcnt=2 -v RS='' '                   # <word_count> = 2; treat file as one loooong single record
NF>=wcnt { for (i=1;i<=(NF-wcnt+1);i++) {    # loop through fields 1 - (NF-<word_count>+1)
               pfx=key=""
               for (j=0;j<wcnt;j++) {        # build count[] index from <word_count> fields
                   key=key pfx $(j+i)
                   pfx=" "
               }
               count[key]++
           }
}

END { for (i in count)                       # loop through array indices
          print count[i],i
}
' words.dat

With -v wcnt=2:

2 am a
3 a man
1 am not
3 I am
1 not a
2 man I

With -v wcnt=3:

1 not a man
2 I am a
1 I am not
2 man I am
2 am a man
2 a man I
1 am not a

With -v wcnt=5:

1 I am a man I
1 I am not a man
1 am not a man I
1 am a man I am
1 man I am a man
1 man I am not a
1 a man I am not
1 not a man I am
1 a man I am a

With -v wcnt=3 and awk '...' <<< "I am a man":

1 I am a
1 am a man

With -v wcnt=5 and awk '...' <<< "I am a man":

# no output since less than wcnt=5 words to work with

How do I iterate over each line in a file with Bash?

cat is for concatenating or displaying files; there is no need for it here.

file="/path/to/file"
while read line; do
echo "${line}"
done < "${file}"

How to iterate through string one word at a time in zsh

In order to see the behavior compatible with Bourne shell, you'd need to set the option SH_WORD_SPLIT:

setopt shwordsplit      # this can be unset by saying: unsetopt shwordsplit
things="one two"

for one_thing in $things; do
    echo $one_thing
done

would produce:

one
two

However, it's recommended to use an array when you want word splitting, e.g.,

things=(one two)

for one_thing in $things; do
    echo $one_thing
done
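
If the words start out in a single string, zsh can also split it explicitly without changing any options; this is a sketch using the ${=...} expansion flag, which forces sh-style word splitting on that one expansion (use ${(f)...} to split on lines instead):

things="one two"
for one_thing in ${=things}; do
    echo $one_thing
done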

You may also want to refer to:

3.1: Why does $var where var="foo bar" not do what I expect?

Iterating over lines in a file with for in shell never matching target

The original code never looked inside a file at all: it merely compared the name of each .dat file to the target, and allowed only exact matches (not substrings).

Consider instead:

while read -r line; do
    if [[ $line = *"$1"* ]]; then
        echo "$1 present in $line"
    else
        echo "$1 not found in $line"
    fi
done < <(cat *.dat)
  • Using cat *.dat combines all the files into a single stream. Enclosing this in <(cat *.dat) generates a filename which can be read from to yield that stream; using < <(cat *.dat) redirects stdin from this file (within the scope of the while loop for which the redirection takes place).
  • Using while read processes an input stream line-by-line (see BashFAQ #1).
  • Using a test of [[ $line = *"$1"* ]] allows the target (contents of $1) to be found inside a line, instead of only matching when $1 matches the entire line as a whole. You can also have this effect with [[ $line =~ "$1" ]]. Note that the quotes are mandatory for correct operation in either of these cases.
  • Using a for loop to iterate over lines is extremely poor practice; see Don't Read Lines With For. If you want to use a for loop, use it to iterate over files instead:

    for f in *.dat; do
        # handle case where no files exist
        [[ -e "$f" ]] || continue
        # read each given file
        while read -r line; do
            if [[ $line = *"$1"* ]]; then
                echo "$1 present in $line in file $f"
            else
                echo "$1 not present in $line in file $f"
            fi
        done <"$f"
    done

How do I iterate through lines in an external file with shell?

One way would be:

while read NAME
do
    echo "$NAME"
done < names.txt

EDIT:
As Dennis Williamson pointed out, the loop does not run in a sub-shell here: with the < names.txt redirection it runs in the current shell, so modified variables keep their values after the loop. Only when the loop sits at the receiving end of a pipe does it run in a sub-shell, making its variable changes local.

(Dennis Williamson is right. Sorry, must have used piped constructs too often and got confused.)


