Rename part of file name based on exact match in contents of another file
You can perfectly well combine your for and while loops so that only mv is needed:
while read -r from to ; do
    for i in test* ; do
        if [ "$i" != "${i/$from/$to}" ] ; then
            # Quote both arguments so names with spaces survive
            mv -- "$i" "${i/$from/$to}"
        fi
    done
done < replacements.txt
An alternative solution with sed could use the e flag of the s command (a GNU sed extension), which executes the result of a substitution as a shell command. Use with caution! Try it without the trailing e first to print the commands that would be executed.
Hence:
sed 's/\(\w\+\)\s\+\(\w\+\)/mv sample_\1\.txt sample_\2\.txt/e' replacements.txt
would parse your replacements.txt file and rename all your .txt files as desired.
We just have to add a loop to deal with the other extensions:
for j in .txt .bak .tsv .fq .fq.abc ; do
    sed "s/\(\w\+\)\s\+\(\w\+\)/mv 'sample_\1$j' 'sample_\2$j'/e" replacements.txt
done
(Note that you will get error messages when it tries to rename non-existing files, for example when it tries to execute mv sample_ACGT.fq sample_name1.fq but the file sample_ACGT.fq does not exist.)
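The advice to preview the commands before running them applies to any approach. As a minimal sketch in Python (the function name plan_renames, the second replacement pair and the sample file names are hypothetical, not taken from the question), you can compute the full list of renames first and only then apply it:

```python
def plan_renames(names, replacements):
    """Return (old, new) pairs for every name containing a 'from' string.

    names: iterable of file names.
    replacements: list of (from, to) pairs, e.g. parsed from the
    whitespace-separated lines of replacements.txt.
    """
    plan = []
    for name in names:
        for src, dst in replacements:
            if src in name:
                # Replace only the first occurrence, like ${i/$from/$to}.
                plan.append((name, name.replace(src, dst, 1)))
                break  # first matching pair wins: one rename per file
    return plan


pairs = [("ACGT", "name1"), ("TTGA", "name2")]
files = ["sample_ACGT.txt", "sample_ACGT.fq.abc", "sample_TTGA.bak", "other.txt"]

# Dry run: print what would happen. Swap print for os.rename(old, new)
# once the plan looks right.
for old, new in plan_renames(files, pairs):
    print(f"mv {old} {new}")
```

Because the plan is built only from names that actually exist, this also avoids the error messages about missing files.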
Replace exact part of file name with Powershell
The dot in regex means Any Character. Without escaping that, things go wrong.
Try
Rename-Item -NewName {$_.Name -replace ('{0}$' -f [regex]::Escape('.L.wav')),'_L.wav'}
or manually escape the regex metacharacters:
Rename-Item -NewName {$_.Name -replace '\.L\.wav$','_L.wav'}
The $ at the end anchors the match to the end of the string.
Also, instead of doing ls *.* | Rename-Item {...}, better use
(Get-ChildItem -Filter '*.L.wav' -File) | Rename-Item {...}
(ls is an alias for Get-ChildItem.)
- Using the -Filter parameter, you can specify which files you're looking for.
- Using the -File switch, you make sure you do not also try to rename folder objects.
- By surrounding the Get-ChildItem part of the code in parentheses, you make sure the gathering of the files is complete before you start renaming them. Otherwise, chances are the code will keep trying to rename files that have already been processed.
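To see why the escaping matters, here is a small sketch of the same idea in Python, where re.escape plays the role of [regex]::Escape. The file names and the helper fix_name are made up for illustration:

```python
import re

suffix = ".L.wav"
unescaped = re.compile(suffix + "$")           # each dot matches ANY character
escaped = re.compile(re.escape(suffix) + "$")  # each dot matches a literal dot

name = "trackXLYwav"  # hypothetical name that does NOT end in ".L.wav"
print(bool(unescaped.search(name)))  # True: the loose pattern matches anyway
print(bool(escaped.search(name)))    # False: the escaped pattern does not


def fix_name(filename):
    """Replace a trailing '.L.wav' with '_L.wav'; leave other names alone."""
    return escaped.sub("_L.wav", filename)


print(fix_name("song.L.wav"))      # song_L.wav
print(fix_name("song.L.wav.bak"))  # unchanged, the suffix is not at the end
```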
Replacing / removing specific part of file names using regex
You can use the regex:
_\d{6}(\.[^.]+)$
and replace with $1 instead.
The regex matches an underscore followed by 6 digits, then group 1, (\.[^.]+), matches the extension, which you put back with $1 in the replacement string. The extension is matched by "a dot followed by a bunch of non-dots". Also note the end-of-string anchor $, which asserts that all of this must be at the end of the string.
Change your code to:
string newName = Regex.Replace(f.FullName, @"_\d{6}(\.[^.]+)$", "$1");
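For a quick way to try the pattern outside of your program, here is the same substitution as a Python sketch. Python writes the group reference as \1 where .NET uses $1. The function name and sample file names are invented for the example, and it works on bare file names rather than full paths:

```python
import re

# Underscore, six digits, then the extension captured as group 1,
# all anchored at the end of the string.
PATTERN = re.compile(r"_\d{6}(\.[^.]+)$")


def strip_suffix(name):
    """Drop a trailing _NNNNNN that sits right before the extension."""
    return PATTERN.sub(r"\1", name)


print(strip_suffix("report_123456.pdf"))  # report.pdf
print(strip_suffix("report_12345.pdf"))   # unchanged: only five digits
print(strip_suffix("a_123456_b.txt"))     # unchanged: digits not before the extension
```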
How to rename files using certain string inside each file?
You can use the following script to achieve your goal. Note that the script relies on grep -P, which the grep shipped with macOS does not support. To run it on macOS, install GNU grep via Homebrew and substitute the grep call with ggrep, the name Homebrew installs it under.
- The script will search the current directory and all its subdirectories for *.html files.
- It will rename only the files that contain the specific tag.
- For multiple files that contain the same tag, each subsequent file apart from the first will have an identifier appended to its name, e.g. 1_234.html, 1_234_1.html, 1_234_2.html.
- For files that contain multiple tags, the first tag encountered will be used.
#!/bin/bash

rename_file ()
{
    # Check that file name received is an existing regular file
    file_name="$(realpath "${1}")"
    if [ ! -f "${file_name}" ]; then
        echo "No argument or non existing file or non regular file provided"
        return 1
    fi

    # Get the tag number. If the number does not exist, the variable tag will be
    # empty. The first tag on a file will be used if there are multiple tags
    # within a file.
    tag="$(grep -oP -m 1 '(?<=<div id="myID" style="display:none">).*?(?=</div>)' \
        -- "${file_name}")"

    # Rename the file only if it contained a tag
    if [ -n "${tag}" ]; then
        file_path="$(dirname "${file_name}")"

        # Change directory to the file's location silently
        pushd "${file_path}" > /dev/null || return

        # Check for multiple occurrences of files with the same tag.
        # The parentheses make -type f apply to both -name tests.
        if [ -e "${tag}.html" ]; then
            counter="$(find ./ -maxdepth 1 -type f \( -name "${tag}.html" -o -name "${tag}_*.html" \) | wc -l)"
            tag="${tag}_${counter}"
        fi

        # Rename the file
        mv "${file_name}" "${tag}.html"

        # Return to previous directory silently
        popd > /dev/null || return
    fi
}

# Necessary in order to call rename_file from find command within main
export -f rename_file

# The entry point function of the script. This function searches for all the
# html files in the directory that the script is run, and all subdirectories.
# The function calls rename_file upon each of the found files.
main ()
{
    find ./ -type f -name "*.html" -exec bash -c 'rename_file "${1}"' _ {} \;
}

main
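The naming rule for files that share a tag can be sketched on its own in Python. Note that this is a slightly different technique from the script above: instead of counting the matching files with find, it probes candidate names until it finds a free one. The function name unique_name and the tag 5_678 are hypothetical:

```python
def unique_name(tag, existing, ext=".html"):
    """Return tag.html, or tag_N.html if earlier names are already taken.

    existing: a set of names already present in the target directory.
    """
    candidate = f"{tag}{ext}"
    counter = 1
    while candidate in existing:
        candidate = f"{tag}_{counter}{ext}"
        counter += 1
    return candidate


taken = set()
for tag in ["1_234", "1_234", "1_234", "5_678"]:
    name = unique_name(tag, taken)
    taken.add(name)
    print(name)
# prints 1_234.html, 1_234_1.html, 1_234_2.html, 5_678.html (one per line)
```

Probing has the advantage that it never picks a name that is already in use, even if some numbered files were deleted in between.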
mv/rename files with common part but unknown file pattern
I wasn't looking in the right place for my problem: the if was my real problem. This works:
ABC_Files=$(ls "$DOSSIER/$OLD_NAME"*.abc 2> /dev/null | wc -l)
if [ "$ABC_Files" != "0" ]; then
    for i in "${DOSSIER}/$OLD_NAME"*.abc; do
        [ -f "$i" ] || continue
        mv "$i" "${i/$OLD_NAME/$NEW_NAME}"
    done
fi
This of course assumes that:
$DOSSIER is the path
$OLD_NAME is the current file name
$NEW_NAME is the new file name
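One thing to watch out for: ${i/$OLD_NAME/$NEW_NAME} replaces the first match anywhere in the string, so if the old name also occurs in the directory part, the directory is what gets rewritten. A minimal Python sketch of restricting the replacement to the file name (the function name renamed_path and the sample path are hypothetical):

```python
from pathlib import PurePosixPath


def renamed_path(path, old, new):
    """Replace the first occurrence of old in the file name only."""
    p = PurePosixPath(path)
    return str(p.with_name(p.name.replace(old, new, 1)))


# The directory is also called "abc", but only the file name changes.
print(renamed_path("/data/abc/abc_01.abc", "abc", "xyz"))
# /data/abc/xyz_01.abc
```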
Match file names and replace with new name
A simple solution in Python (written here for Python 3):
from collections import OrderedDict

LINES_PER_CYCLE = 1000

with open('output.txt', 'w') as output, open('test_2.txt') as fin:
    fin_line = ''
    # Loop until fin reaches EOF.
    while True:
        cache = OrderedDict()
        # Fill the cache with up to LINES_PER_CYCLE entries.
        for _ in range(LINES_PER_CYCLE):
            fin_line = fin.readline()
            if not fin_line:
                break
            key, rest = fin_line.strip().split(' ', 1)
            cache[key] = ['', rest]
        # Loop over test_1.txt to find tags with the given id.
        with open('test_1.txt') as tags:
            for line in tags:
                tag, _ = line.split(' ', 1)
                _, idx = tag.rsplit('_', 1)
                if idx in cache:
                    cache[idx][0] = tag
        # Write matched lines to the output file, in the same order
        # as the lines were inserted into the cache.
        for tag, rest in cache.values():
            output.write('{} {}\n'.format(tag, rest))
        # If fin has reached EOF, break.
        if not fin_line:
            break
It reads up to LINES_PER_CYCLE entries from test_2.txt, finds the matching entries in test_1.txt, and writes them to the output. Because the memory available for the cache is limited, test_1.txt is searched through multiple times.
This assumes that the tag/id part is separated by whitespace from the -------, and that the tag and id are separated from each other by an underscore, i.e. 'tag_idx blah blah'.