Extending a Script to Loop Over Multiple Files and Generate Output Names

#!/bin/sh
for f; do
    tempdir=$(mktemp -t -d gifdir.XXXXXX)
    ffmpeg -i "$f" "$tempdir/out%04d.gif"
    gifsicle --delay=10 --loop "$tempdir"/*.gif >"${f%.*}.gif"
    rm -rf "$tempdir"
done

Let's go over how this works:

  1. Iteration

    for f; do

    is equivalent to for f in "$@"; do; that is, it loops over all command-line arguments. To loop over all MP4s in the current directory instead, use for f in *.mp4; do. To loop over all MP4s in the directory passed as the first command-line argument, use for f in "$1"/*.mp4; do. To support both, falling back to the current directory when no argument is passed, use for f in "${1:-.}"/*.mp4; do.

  2. Temporary directory use

    Because the original script would reuse /tmp/gif for everything, you'd get files from one input source being used in others. This is best avoided by creating a new temporary directory for each input file, which mktemp will automate.

  3. Creating the .gif name

    "${f%.*}" is a parameter expansion that removes the last . and everything after it from the value of f; see BashFAQ #100 for documentation on string manipulation in bash in general, including this particular form.

    Thus, "${f%.*}.gif" strips the existing extension, and adds a .gif extension.
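
As a quick check of the two expansions discussed above, here is a minimal sketch that applies them to a few made-up names. The helper functions gif_name and scan_dir, and all file names, are hypothetical and only serve as illustration; they are not part of the original script.

```sh
# gif_name and scan_dir are hypothetical helpers, introduced only for illustration.

# Replace the last extension of a file name with .gif.
gif_name() {
    # ${1%.*} removes the shortest suffix matching ".*",
    # that is, the last dot and everything after it.
    printf '%s\n' "${1%.*}.gif"
}

# Return the directory to scan: the first argument, or "." if none is given.
scan_dir() {
    printf '%s\n' "${1:-.}"
}

gif_name clip.mp4
gif_name "my holiday.final.mov"
gif_name archive.tar.gz
scan_dir
scan_dir /videos
```

Only the last extension is removed, so archive.tar.gz becomes archive.tar.gif, and a name that contains no dot at all is left unchanged before .gif is appended.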

for loop for multiple extension and do something with each file

You are not using $file anywhere. Try

for file in "$arg"/*.{jpg,jpeg,png} ; do
    echo "$file"
done > z.txt

How to loop over files in directory and change path and add suffix to filename

A couple of notes first: when you use Data/data1.txt as an argument, should it really be /Data/data1.txt (with a leading slash)? Also, should the outer loop scan only for .txt files, or all files in /Data? Here's an answer, assuming /Data/data1.txt and .txt files only:

#!/bin/bash
for filename in /Data/*.txt; do
    for ((i=0; i<=3; i++)); do
        ./MyProgram.exe "$filename" "Logs/$(basename "$filename" .txt)_Log$i.txt"
    done
done

Notes:

  • /Data/*.txt expands to the paths of the text files in /Data (including the /Data/ part)
  • $( ... ) runs a shell command and inserts its output at that point in the command line
  • basename somepath .txt outputs the base part of somepath, with .txt removed from the end (e.g. /Data/file.txt -> file)

If you needed to run MyProgram with Data/file.txt instead of /Data/file.txt, use "${filename#/}" to remove the leading slash. On the other hand, if it's really Data not /Data you want to scan, just use for filename in Data/*.txt.
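
To see what these pieces produce without running MyProgram, here is a small sketch that only prints the argument pairs. The path /Data/data1.txt is taken from the discussion above; the variable names base and relative are my own and purely illustrative.

```sh
# Example path from the discussion above; nothing is read from disk.
filename=/Data/data1.txt

# basename strips the directory part and, optionally, a trailing suffix.
base=$(basename "$filename" .txt)

# ${filename#/} removes one leading slash, if there is one.
relative=${filename#/}

# Print the input/log pairs that the nested loop would generate.
for i in 0 1 2 3; do
    printf '%s -> Logs/%s_Log%s.txt\n' "$relative" "$base" "$i"
done
```

If spawning basename for every file is a concern, the same result can be had with parameter expansion alone: base=${filename##*/} followed by base=${base%.txt}.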

Generate multiple output files for loop

Please, use https://www.shellcheck.net/ to check your shell scripts.
If you use Visual Studio Code, you could install "ShellCheck" (by Timon Wong) extension.

About your program:

  • Assume bash
  • Define different extensions for the input and output files (really important if they are in the same directory)
  • Loop over the report (input) files only
  • Clear the output file
  • Read the input file
  • In the if sequence:
    • write if [[ ... ]] with a space after [[ and before ]]
    • put spaces before and after the operator (=~)
    • reverse the operand order for =~ (string on the left, pattern on the right)
  • Prevent globbing with "..."

#!/bin/bash

# Input file extension
declare -r EXT_REPORT=".txt"
# Output file extension
declare -r EXT_OUTPUT=".output"

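# Regular expressions (POSIX character classes; \s is not portable across regex libraries)
declare -r success='Compiling[[:space:]]".*"[[:space:]]-[[:space:]]Succeeded'
declare -r failure='Compiling[[:space:]]".*"[[:space:]]-[[:space:]]Failed'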

# Counters
declare -i count_success=0
declare -i count_failure=0

for REPORT_FILE in ~/Documents/reports/*"${EXT_REPORT}"; do
    # Clear output file
    : > "${REPORT_FILE}${EXT_OUTPUT}"

    # Read input file (see named file in "done" line)
    while read -r line; do

        # does the line match the success pattern ?
        if [[ $line =~ $success ]]; then
            echo "$line" >> "${REPORT_FILE}${EXT_OUTPUT}"
            count_success+=1
        # does the line match the failure pattern ?
        elif [[ $line =~ $failure ]]; then
            echo "$line" >> "${REPORT_FILE}${EXT_OUTPUT}"
            count_failure+=1
        fi

    done < "$REPORT_FILE"
done

echo "$count_success jobs ran successfully"
echo "$count_failure jobs didn't work"

loop over files and extract part of filename

Just re-assign the loop variable at the beginning of each iteration:

for sample in *.vcf; do
    sample=${sample%_*}
    # do stuff here
done
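
Which part gets extracted depends on whether you use % or %%. The sketch below shows the difference on a made-up file name containing two underscores (the name and the variables short and long are illustrative only):

```sh
# Hypothetical file name with two underscores.
sample=patientA_tumor_R1.vcf

# % removes the shortest suffix matching "_*" (from the last underscore on).
short=${sample%_*}

# %% removes the longest suffix matching "_*" (from the first underscore on).
long=${sample%%_*}

printf '%s\n%s\n' "$short" "$long"
```

Here short is patientA_tumor and long is patientA, so pick the operator according to how many underscore-separated fields you want to keep.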

Loop through all the files with a specific extension

No fancy tricks needed:

for i in *.java; do
    [ -f "$i" ] || break
    ...
done

The guard ensures that if there are no matching files, the loop will exit without trying to process a non-existent file name *.java.
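
To make the effect of the guard visible, here is a small sketch that runs the loop with and without it in a freshly created, empty directory. It assumes a shell with the default globbing behavior (sh or bash without nullglob or failglob set) and a mktemp that supports -d; the variable names are illustrative.

```sh
# Create an empty scratch directory so that *.java matches nothing.
workdir=$(mktemp -d) || exit 1

# Without a guard: the unmatched pattern is kept as literal text,
# so the body runs once with i='*.java'.
unguarded=$(cd "$workdir" && for i in *.java; do echo "$i"; done)

# With the guard: '*.java' is not a regular file, so the loop exits at once.
guarded=$(cd "$workdir" && for i in *.java; do [ -f "$i" ] || break; echo "$i"; done)

printf 'unguarded: [%s]\n' "$unguarded"
printf 'guarded:   [%s]\n' "$guarded"

rm -rf "$workdir"
```

The unguarded loop prints the literal pattern, while the guarded loop prints nothing.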

In bash (or shells supporting something similar), you can use the nullglob option
to simply ignore a failed match and not enter the body of the loop.

shopt -s nullglob
for i in *.java; do
    ...
done

Some more detail on the break-vs-continue discussion in the comments. I consider it somewhat out of scope whether you use break or continue, because what the first loop is trying to do is distinguish between two cases:

  1. *.java had no matches, and so is treated as literal text.
  2. *.java had at least one match, and that match might have included an entry named *.java.

In case #1, break is fine, because there are no other values of $i forthcoming, and break and continue would be equivalent (though I find break more explicit; you're exiting the loop, not just waiting for the loop to exit passively).

In case #2, you still have to do whatever filtering is necessary on any possible matches. As such, the choice of break or continue is less relevant than which test (-f, -d, -e, etc) you apply to $i, which IMO is the wrong way to determine if you entered the loop "incorrectly" in the first place.

That is, I don't want to be in the position of examining the value of $i at all in case #1, and in case #2 what you do with the value has more to do with your business logic for each file, rather than the logic of selecting files to process in the first place. I would prefer to leave that logic to the individual user, rather than express one choice or the other in the question.


As an aside, zsh provides a way to do this kind of filtering in the glob itself. You can match only regular files ending with .java (and disable the default behavior of treating unmatched patterns as an error, rather than as literal text) with

for f in *.java(.N); do
    ...
done

With the above, you are guaranteed that if you reach the body of the loop, then $f expands to the name of a regular file. The . makes *.java match only regular files, and the N causes a failed match to expand to nothing instead of producing an error.

There are also other such glob qualifiers for doing all sorts of filtering on filename expansions. (I like to joke that zsh's glob expansion replaces the need to use find at all.)

Loop over multiple file extensions from bash script

You can store the matches of one glob pattern in an array, iterate over the other pattern, and pick the corresponding file from the array by index:

waves=(*.wav)

k=0
for textgrid_file in *.TextGrid; do
    praat --run pitch.praat "$textgrid_file" "${waves[k++]}" >> output.txt
done

Shell script to loop over files with same names but different extensions

anubhava's solution is excellent if, as they do in your example, the extensions sort into the right order. For the more general case, where sorting cannot be relied upon, we can specify the argument order explicitly:

for f in *.ext1
do
    program "$f" "${f%.ext1}.ext2"
done

This will work even if the filenames have spaces or other difficult characters in them.
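
As a rough illustration of that claim, the sketch below creates a few empty files in a scratch directory, one of them with a space in its name, and prints the pairs instead of calling program. The existence check on the partner file is my own addition, not part of the loop above, and the file and variable names are hypothetical.

```sh
# Scratch directory with hypothetical inputs; one name contains a space on purpose.
workdir=$(mktemp -d) || exit 1
touch "$workdir/a.ext1" "$workdir/a.ext2" "$workdir/b c.ext1" "$workdir/b c.ext2"

pairs=$(
    cd "$workdir" || exit 1
    for f in *.ext1; do
        # Derive the partner name by swapping the extension.
        partner=${f%.ext1}.ext2
        # Added safety check (an assumption): only report pairs whose partner exists.
        if [ -f "$partner" ]; then
            printf '%s + %s\n' "$f" "$partner"
        fi
    done
)

printf '%s\n' "$pairs"
rm -rf "$workdir"
```

Because every expansion is double-quoted, the name containing a space is passed along as a single argument.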

bash script - loop through files and add to different variables

If you loop over all the R1 and R2 files, you'll run bowtie for all possible pairs of data files. If I understand correctly, that's not what you want - you only want to process the corresponding pairs.

To do that, loop over R1 files only, and try to find the corresponding R2 file for each:

#!/bin/bash
fqdir=...
for r1 in "$fqdir"/*_R1.fastq; do
    r2=${r1%_R1.fastq}_R2.fastq
    if [[ -f $r2 ]] ; then
        bowtie2 -x index_files -1 "$r1" -2 "$r2" -S "$N"_output.sam
    else
        echo "$r2 not found" >&2
    fi
done

I'm not sure what $N stands for. Maybe you can use $r1 instead?


