Have One Folder with Files That Have the Same Name But Different File

Your if [ "$file_hash" == "$a" ]; compares a hash with a filename. You need something like

if [ "$file_hash" == "$(md5sum "$a" | cut -d ' ' -f 1)" ];

to compute the hash for each of the files in the destination folder.

Furthermore, your for loop, in its current version, runs only once; you need something like

for a in "$destination_folder"/*

to get all files in that folder rather than just the folder name.

Based on your edits, a solution would look like

#!/bin/bash -xv

file=$1
destination_folder=$2
file_hash=$(md5sum "$file" | cut -d ' ' -f 1)

# test that the filename exists in the destination dir
if [[ -f "$destination_folder/$file" ]] ; then
    dest_hash=$(md5sum "$destination_folder/$file" | cut -d ' ' -f 1)
    # same name but different content: copy under another name
    if [[ "$file_hash" != "$dest_hash" ]] ; then
        cp "$file" "$destination_folder/$file.JPG"
    fi
    # same name and same content: nothing to do
else
    # destination does not exist, copy file
    cp "$file" "$destination_folder/$file"
fi

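This does not ensure that there are no duplicates. It simply ensures that distinct files with identical names do not overwrite each other. To also skip a file whose content already exists in the destination under another name, compare its hash against every file there: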

#!/bin/bash -xv

file=$1
destination_folder=$2
file_hash=$(md5sum "$file" | cut -d ' ' -f 1)
exists=0

# test each file in destination
for a in "$destination_folder"/*
do
    [ -f "$a" ] || continue   # skip subdirectories and an unmatched glob
    curr_hash=$(md5sum "$a" | cut -d ' ' -f 1)
    if [ "$file_hash" == "$curr_hash" ];
    then
        # an identical file exists (maybe under another name): do nothing
        exists=1
        break
    fi
done

if [[ $exists != 1 ]] ; then
    if [[ -f "$destination_folder/$file" ]] ; then
        # the name is taken by a different file: copy under another name
        cp "$file" "$destination_folder/$file.JPG"
    else
        cp "$file" "$destination_folder"
    fi
fi

Not tested.

Files with same name in different folders

Try this to get the files which have names in common:

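# list files relative to each directory so the paths are comparable;
# the subshells keep each cd from affecting the next command
(cd dir1 && find . -type f | sort) > /tmp/dir1.txt
(cd dir2 && find . -type f | sort) > /tmp/dir2.txt
comm -12 /tmp/dir1.txt /tmp/dir2.txt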

Then use a loop to do whatever you need:

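# read one path per line; quoting the whole command substitution
# would make the loop run once with all names in a single string
comm -12 /tmp/dir1.txt /tmp/dir2.txt | while IFS= read -r filename; do
    cat "dir1/$filename"
    cat "dir2/$filename"
done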

Find same file name in different folders, and utilize two files for a certain task in python

Use Python's os.walk() to get a list of the files in both folders. Then match them as needed and run whichever function you want on each pair.
The os.walk() entry in the Python standard library documentation explains what it yields for each directory.
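
As a minimal sketch of that approach (not the original poster's code), the following matches files by their path relative to each folder and passes each matching pair to a caller-supplied function. The names relative_files, common_files, process_pairs, and task are hypothetical, and matching on the relative path rather than the bare file name is an assumption.

```python
import os


def relative_files(root):
    """Return the set of file paths found under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def common_files(dir1, dir2):
    """Return the sorted relative paths that exist under both directories."""
    return sorted(relative_files(dir1) & relative_files(dir2))


def process_pairs(dir1, dir2, task):
    """Call task(path_in_dir1, path_in_dir2) for every shared relative path."""
    return [
        task(os.path.join(dir1, rel), os.path.join(dir2, rel))
        for rel in common_files(dir1, dir2)
    ]
```

A call such as process_pairs("dir1", "dir2", my_function) would then run my_function once per pair, where my_function stands for whatever task needs both files.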

How to pack files into a ZIP file which have same name but different suffixes?

The task could be done with the following batch file:

@echo off
setlocal EnableExtensions DisableDelayedExpansion
cd /D "C:\folder" || exit /B
for %%I in ("video\*-video.*") do (
set "VideoFileName=%%~nxI"
setlocal EnableDelayedExpansion
set "CommonName=!VideoFileName:-video%%~xI=!"
"%ProgramFiles%\7-Zip\7z.exe" a -bd -bso0 -mx0 -y -- "!CommonName!.zip" "cover\!CommonName!-art.*" "image\!CommonName!-screen.*" "mix\!CommonName!-HDphoto.*" "video\!VideoFileName!"
endlocal
)
endlocal

The first two command lines of the batch file define the required execution environment:

  • command echo mode turned off
  • command extensions enabled
  • delayed expansion disabled

The next command changes the current directory to C:\folder. If that command fails because the directory does not exist, or because a UNC path is used instead of a path starting with a drive letter and a colon, batch file processing exits without any message other than the error message output by the command CD.

The FOR loop searches in subdirectory video for non-hidden files matching the wildcard pattern *-video.*. The file name with file extension of a found video file matching this wildcard pattern is assigned to the environment variable VideoFileName.

Next, delayed expansion is enabled. This is required to use the file name assigned to the environment variable VideoFileName, because a variable set inside the FOR command block must be referenced as !VideoFileName! rather than %VideoFileName%. The help output of setlocal /? and endlocal /? describes what happens in the background when setlocal EnableDelayedExpansion and the corresponding endlocal are executed.

A case-insensitive string substitution using delayed expansion removes -video and the file extension that follows it (.mp4, .mpg, .mpeg, etc.) from the video file name. VideoFileName is defined with the file extension in case the file name itself contains the string -video somewhere else, which the substitution must not remove. For example, My Home-Video-video.mp4 assigned to VideoFileName results in My Home-Video being assigned to the environment variable CommonName, because the file extension is part of the search string.

Next, 7-Zip is executed with the command a and the switches as posted in the question. It creates a ZIP file in the current directory, or adds to an existing one, named after the common name part of the files with the file extension .zip. The archive receives the video file and the matching image files from the other three directories, whatever file extension those images have.

Then endlocal is executed to restore the previous environment with delayed expansion disabled again.

The commands setlocal EnableDelayedExpansion and endlocal are used inside the FOR loop so that a collection of files whose common name contains an exclamation mark, like Superman & Batman (+ Robin!), is also processed correctly.

The video files in subdirectory video define which files are packed into a ZIP file. Image files in the three image directories for which no video file exists in directory video are therefore ignored.

Note: The batch file cannot correctly process video file names that contain an exclamation mark in the file extension. I doubt that this limitation is ever a problem, as I have never seen a video file extension with an exclamation mark.

To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.

  • cd /?
  • echo /?
  • endlocal /?
  • exit /?
  • set /?
  • setlocal /?

See also "Single line with multiple commands using Windows batch file" for an explanation of the conditional operator ||.

Merge files with same name in more than 100 folders

The cat command is just about the simplest there is, so there is no obvious and portable way to make the copying of file contents any faster. The bottleneck is probably going to be finding the files, anyway, not in copying them. If indeed the files are all in subdirectories immediately below the root directory,

cat /*/replaced_txt >merged_txt

will expand the wildcard alphabetically (so /folder10/replaced_txt comes before /folder2/replaced_txt) but might run into "Argument list too long" and/or take a long time to expand the wildcard if some of these directories are large (especially on an older Linux system with an ext3 filesystem, which doesn't scale to large directories very well). A more general solution is find, which is better at finding files in arbitrarily nested subdirectories, and won't run into "Argument list too long" because it never tries to assemble all the file names into an alphabetized list; instead, it just enumerates the files it finds as it traverses directories in whichever order the filesystem reports them, and creates a new cat process when the argument list fills up to the point where the system's ARG_MAX limit would be exceeded.

find / -xdev -type f -name replaced_txt -exec cat {} + >merged_txt

If you want to limit how far subdirectories will be traversed or you only want to visit some directories, look at the find man page for additional options.
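
For comparison, a rough Python sketch of the same merge with a depth limit could look like the following. The function merge_named_files and its max_depth parameter are hypothetical names chosen for illustration, not part of any existing tool. Unlike find, this sketch visits subdirectories in alphabetical order, and it counts depth in directory levels below the starting directory.

```python
import os
import shutil


def merge_named_files(root, name, output, max_depth=None):
    """Append every file called `name` under root to output.

    max_depth is the number of directory levels below root to descend
    (None means no limit). Returns the list of paths that were merged.
    """
    root = os.path.abspath(root)
    out_path = os.path.abspath(output)
    merged = []
    with open(out_path, "wb") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # visit subdirectories in alphabetical order
            if dirpath == root:
                depth = 0
            else:
                depth = os.path.relpath(dirpath, root).count(os.sep) + 1
            if max_depth is not None and depth >= max_depth:
                dirnames[:] = []  # prune: do not descend any further
            if name in filenames:
                src = os.path.join(dirpath, name)
                if src == out_path:
                    continue  # never copy the output file into itself
                with open(src, "rb") as f:
                    shutil.copyfileobj(f, out)
                merged.append(src)
    return merged
```

With max_depth=1, only the starting directory and its immediate subdirectories are searched, which matches the layout described above where every replaced_txt sits one level below the root.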


