Loop Code for Each File in a Directory

scandir:

$files = scandir('folder/');
foreach ($files as $file) {
    if ($file === '.' || $file === '..') continue; // scandir also returns . and ..
    // do your work here
}

or glob may be even better for your needs:

$files = glob('folder/*.{jpg,png,gif}', GLOB_BRACE);
foreach ($files as $file) {
    // do your work here
}

Iterate all files in a directory using a 'for' loop

This lists all the files (and only the files) in the current directory and its subdirectories recursively:

for /r %i in (*) do echo %i

Also, if you run that command from a batch file, you need to double the % signs:

for /r %%i in (*) do echo %%i

(thanks @agnul)

Run the for loop for each file in directory using Python

Try changing os.walk(dir) to os.listdir(dir). This will give you a list of all the entries in the directory.

import os

with open('/user/folderlist.txt') as f:
    for line in f:
        line = line.strip("\n")
        dir = '/user/' + line
        for file in os.listdir(dir):
            if file.endswith("fileExtension"):
                print(file)

Hope it helps
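
A pathlib alternative to the same filtering, as a sketch (the folder and extension here are throwaway stand-ins for your own, built in a temp directory so the snippet runs as-is):

```python
import tempfile
from pathlib import Path

# Throwaway directory standing in for '/user/' + line
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "a.txt").touch()
(demo_dir / "b.log").touch()

# Path.glob filters by pattern, so no manual endswith() check is needed
matches = sorted(p.name for p in demo_dir.glob("*.txt"))
print(matches)  # ['a.txt']
```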

Write code to iterate all files in a directory and append each file's path to a dictionary

I changed the files_dir = {} line to

files_in_dir = {
    key: 0
    for key in os.listdir(args.source_dir)
    if os.path.abspath(key).endswith(".csv")
}

and it seems to work.

An if-else version inside a for statement, as requested:

for key in os.listdir(args.source_dir):
    if os.path.abspath(key).endswith(".csv"):
        files_in_dir[key] = 0
    else:
        print(f"The file {key} is not a .csv file and it will be ignored")
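
The same dictionary can also be built by letting glob do the extension filtering. A runnable sketch (source_dir is a throwaway folder standing in for args.source_dir):

```python
import glob
import os
import tempfile

# Throwaway folder standing in for args.source_dir
source_dir = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "notes.txt"):
    open(os.path.join(source_dir, name), "w").close()

# glob filters by pattern, so the comprehension needs no if-clause
files_in_dir = {
    os.path.basename(path): 0
    for path in glob.glob(os.path.join(source_dir, "*.csv"))
}
print(files_in_dir)  # keys are a.csv and b.csv, each mapped to 0
```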

For Loop in Python to move through each file sequentially and append it into an array

Your code could be simplified (aside from actions, which you don't show us) to something like

import os
from collections import defaultdict

root = "/tmp/mp_data"

# Map labels (subdirectories of root) to data
data_per_label = defaultdict(list)

# Get all top-level directories within `root`
label_dirs = [
    name for name in os.listdir(root) if os.path.isdir(os.path.join(root, name))
]
print(f"{label_dirs=}")

# Loop over each label directory
for label in label_dirs:
    label_dir = os.path.join(root, label)
    # Loop over each filename in the label directory
    for filename in os.listdir(label_dir):
        # Take care to only look at .npy files
        if filename.endswith(".npy"):
            filepath = os.path.join(label_dir, filename)
            print(f"{label=} {filename=} {filepath=}")
            data = filepath  # replace with np.load(filepath)
            data_per_label[label].append(data)

print(data_per_label)

Given a tree like

/tmp/mp_data
├── r1
│   └── a.npy
├── r2
│   └── b.npy
└── r3
    └── c.npy

this prints out

label_dirs=['r1', 'r3', 'r2']
label='r1' filename='a.npy' filepath='/tmp/mp_data/r1/a.npy'
label='r3' filename='c.npy' filepath='/tmp/mp_data/r3/c.npy'
label='r2' filename='b.npy' filepath='/tmp/mp_data/r2/b.npy'
defaultdict(<class 'list'>, {'r1': ['/tmp/mp_data/r1/a.npy'], 'r3': ['/tmp/mp_data/r3/c.npy'], 'r2': ['/tmp/mp_data/r2/b.npy']})
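
For comparison, a pathlib sketch of the same two-level traversal (the tree is rebuilt in a throwaway folder standing in for /tmp/mp_data, and file names are appended instead of np.load results):

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Throwaway tree standing in for /tmp/mp_data
root = Path(tempfile.mkdtemp())
for label, fname in (("r1", "a.npy"), ("r2", "b.npy"), ("r3", "c.npy")):
    (root / label).mkdir()
    (root / label / fname).touch()

data_per_label = defaultdict(list)
# iterdir() + is_dir() finds the label directories; glob() filters by suffix
for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    for filepath in sorted(label_dir.glob("*.npy")):
        data_per_label[label_dir.name].append(filepath.name)

print(dict(data_per_label))  # {'r1': ['a.npy'], 'r2': ['b.npy'], 'r3': ['c.npy']}
```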

How to loop through each file in a folder, do some action to the file and save output to a file in another folder Python

You need to specify what each new file is called. Python has some good string formatting methods for this, and fortunately your desired file names are easy to build in a loop:

import os

input_location = 'C:/Users/User/Desktop/mini_mouse'
output_location = 'C:/Users/User/Desktop/filter_mini_mouse/mouse'
stopword_list = os.path.join(input_location, 'NLTK-stop-word-list')

for root, dirs, files in os.walk(input_location):
    for file in files:
        new_file = "{}_filtered.txt".format(file)
        file_path = os.path.join(root, file)  # full path, so files in subdirectories work too
        with open(file_path, 'r') as f, open(stopword_list, 'r') as f2:
            mouse_file = f.read().split()
            stopwords = f2.read().split()
        x = ' '.join(i for i in mouse_file if i.lower() not in (w.lower() for w in stopwords))
        with open(os.path.join(output_location, new_file), 'w') as output_file:  # Changed 'append' to 'write'
            output_file.write(x)

If you're on Python 3.6 or newer, you can use an f-string:

new_file = f"{file}_filtered.txt"

and

with open(f"{output_location}/{new_file}", 'w') as output_file:
    output_file.write(x)

Loop through each subdirectory in a main directory and run code against each file using OS

I like using pure os:

import os

for fname in os.listdir(src):
    # build the path to the folder
    folder_path = os.path.join(src, fname)

    if os.path.isdir(folder_path):
        # we are sure this is a folder; now let's iterate over it
        for file_name in os.listdir(folder_path):
            file_path = os.path.join(folder_path, file_name)
            # now you can apply any function, assuming it is a file,
            # or double-check it if needed with os.path.isfile(file_path)
            pass  # replace with your per-file work

Note that this snippet only iterates over the folder given at src and one level deeper:

src/foo.txt  # this file is ignored
src/foo/a.txt # this file is processed
src/foo/foo_2/b.txt # this file is ignored; too deep.
src/foo/foo_2/foo_3/c.txt # this file is ignored; too deep.

In case you need to go as deep as possible, you can write a recursive function and apply it to every single file, as follows:

import os

def function_over_files(path):
    if os.path.isfile(path):
        pass  # do whatever you need with the file at path
    else:
        # this is a dir: list everything in it and recurse
        for fname in os.listdir(path):
            f_path = os.path.join(path, fname)
            # here is the trick: a recursive call to the same function
            function_over_files(f_path)

src = "path/to/your/dir"
function_over_files(src)

This way the function is applied to every file under path, no matter how deeply it is nested:

src/foo.txt  # this file is processed; as each file under src
src/foo/a.txt # this file is processed
src/foo/foo_2/b.txt # this file is processed
src/foo/foo_2/foo_3/c.txt # this file is processed
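
The same unlimited depth is also available without writing any recursion yourself: os.walk visits every directory level for you. A runnable sketch (the tree above is rebuilt in a throwaway folder standing in for src):

```python
import os
import tempfile

# Throwaway tree standing in for src
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, "foo", "foo_2"))
for rel in ("foo.txt",
            os.path.join("foo", "a.txt"),
            os.path.join("foo", "foo_2", "b.txt")):
    open(os.path.join(src, rel), "w").close()

visited = []
# os.walk yields (dirpath, dirnames, filenames) for src and every subdirectory
for dirpath, dirnames, filenames in os.walk(src):
    for filename in filenames:
        path = os.path.join(dirpath, filename)
        # do your per-file work here; we just record the relative path
        visited.append(os.path.relpath(path, src).replace(os.sep, "/"))

print(sorted(visited))  # ['foo.txt', 'foo/a.txt', 'foo/foo_2/b.txt']
```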

I'm trying to loop through all files in directory & its subfolders, get each file's text content & return an array of text content

You are confusing forEach with map and using an async function.

Sequential resolution

import {promises as fs} from "fs";

const readFileContent = async (files) => {
    const contents = [];
    for (const file of files) {
        // await inside a for...of loop reads the files one at a time
        contents.push(await fs.readFile(file, 'utf8'));
    }
    return contents;
};

Concurrent resolution

import {promises as fs} from "fs";

const readFileContent = async (files) => {
    return await Promise.all(
        files.map((file) => fs.readFile(file, 'utf8'))
    );
};

Looping through each file in directory - bash

The code snippet you've provided has a few problems, e.g. an unneeded nested for loop and an erroneous pipeline
(the whole line gunzip $file | grep 'word1\|word2' $filenoext > $filedone | rm -f $filenoext | gzip...).

Note also that your code will work correctly only if the *.gz files don't have spaces (or special characters) in their names.
Also, zgrep -c 'word1\|word2' will match strings like line_starts_withword1_orword2_.

Here is the working version of the script:

#!/bin/bash
for file in *.gz; do
    counter=$(zgrep -c -E 'word1|word2' $file)  # number of word1/word2 occurrences in $file
    if [[ $counter -gt 0 ]]; then
        name=$(basename $file .gz)
        zcat $file | grep -E 'word1|word2' > ${name}_done
        gzip -f -c ${name}_done > /donefiles/$file
        rm -f ${name}_done
    else
        echo 'nothing to do here'
    fi
done

What we can improve here is:

  • since we're unzipping the file anyway to check for word1|word2 presence, we can write the result to a temp file and avoid unzipping twice
  • we don't need to count how many occurrences of word1 or word2 are inside the file; we only need to check for their presence
  • ${name}_done can be a temp file that is cleaned up automatically
  • we can use a while loop to handle file names with spaces

#!/bin/bash
tmp=$(mktemp /tmp/gzip_demo.XXXXXX)           # create a temp file for us
trap "rm -f \"$tmp\"" EXIT INT TERM QUIT HUP  # clean up $tmp upon exit or termination

find . -maxdepth 1 -mindepth 1 -type f -name '*.gz' | while read f; do
    # quotes around $f are now required in case of spaces in it
    s=$(basename "$f")  # short name w/o dir
    gunzip -f -c "$f" | grep -P '\b(word1|word2)\b' > "$tmp"
    [ -s "$tmp" ] && gzip -f -c "$tmp" > "/donefiles/$s"  # create archive if anything was found
done


