Trying to Search Files from User Keyword in Bash

Bash scanning for filenames containing keywords and move them

This could be done by:

  • create a script that does what you want to do, once.
  • run the script from cron at a fixed interval: every couple of minutes or every couple of hours, depending on the volume of files you receive.
  • there is no need for a continually running daemon.

Ex:

#!/bin/bash

start_dir="/start/directory"
if [[ ! -d "$start_dir" ]]
then
    echo "ERROR: start_dir ($start_dir) not found." >&2
    exit 1
fi

target_dir="/target/directory"
if [[ ! -d "$target_dir" ]]
then
    echo "ERROR: target_dir ($target_dir) not found." >&2
    exit 1
fi

# Move all MP4 and MKV files whose name contains the keyword to the target directory
# (IFS= and -d '' make read return each NUL-terminated filename unchanged)
find "$start_dir" -type f \( -name "*keyword*.MP4" -o -name "*keyword*.MKV" \) -print0 | while IFS= read -r -d '' file
do
    # add any processing here...
    filename=$(basename "$file")
    echo "Moving $filename to $target_dir..."
    mv -- "$file" "$target_dir/$filename"
done

# That being done, all that is left in start_dir can be deleted
# (-prune stops find from descending into a directory that is about to be removed)
find "$start_dir" -type d ! -path "$start_dir" -prune -exec /bin/rm -fr {} +

Details:

  • scanning for files is most efficient with the find command
  • the -print0 with read ... method is to ensure all valid filenames are processed, even if they include spaces or other "weird" characters.
  • the result of the above code is that each file that matches your keyword, with extensions MP4 or MKV will be processed once.
  • you can then use "$file" to access the file being processed in the current loop.
  • make sure you ALWAYS double-quote $file, otherwise any weird filename will break your code. You should always double-quote your variables anyway.
  • more complex logic can be added for your specific needs. Ex. create the target directory if it does not exist. Create a different target directory depending on your keyword. etc.
  • to delete all sub-directories under $start_dir, I use find. Again this will process weird directory names.
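
As an illustration of the points above, here is a minimal sketch of the -print0 / read loop handling awkward filenames and creating a per-keyword target directory on demand. The keyword "holiday", the directory layout and the file names are all invented for the demo, and everything happens under a temporary directory.

```shell
#!/usr/bin/env bash
# Sketch only: keyword, paths and file names are invented for this demo.
work=$(mktemp -d)
start_dir="$work/incoming"
target_root="$work/sorted"
mkdir -p "$start_dir/sub dir"

# Sample files, including names with a space and with a newline.
touch "$start_dir/holiday clip.MP4" \
      "$start_dir/sub dir/holiday"$'\n'"part2.MKV" \
      "$start_dir/notes.txt"

keyword="holiday"
moved=0
# Process substitution keeps the loop in the current shell, so $moved survives.
while IFS= read -r -d '' file
do
    target_dir="$target_root/$keyword"
    mkdir -p "$target_dir"          # create the target directory if missing
    mv -- "$file" "$target_dir/"
    moved=$((moved + 1))
done < <(find "$start_dir" -type f \( -name "*$keyword*.MP4" -o -name "*$keyword*.MKV" \) -print0)

echo "Moved $moved file(s) into $target_root/$keyword"
```

Feeding the loop with < <(find ...) instead of a pipe is a design choice: a piped while loop runs in a subshell, so counters set inside it would be lost afterwards.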

One point, some will argue that it could all be done in 1 find command with -exec option. True, but IMHO the version with the while loop is easier to code, understand, debug, learn.

And this construct is good to have in your bash toolbox.
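
For comparison, the single-command variant mentioned above could look roughly like this. It is only a sketch: the paths and file names are placeholders created under a temporary directory, and mv is run once per matching file.

```shell
#!/usr/bin/env bash
# Sketch only: paths and file names are placeholders for this demo.
work=$(mktemp -d)
start_dir="$work/start"
target_dir="$work/target"
mkdir -p "$start_dir/a" "$target_dir"
touch "$start_dir/a/my keyword clip.MP4" "$start_dir/keyword.MKV" "$start_dir/other.MP4"

# find runs mv once per match; {} is replaced by the path of the file.
find "$start_dir" -type f \( -name "*keyword*.MP4" -o -name "*keyword*.MKV" \) \
    -exec mv -- {} "$target_dir/" \;

ls "$target_dir"
```

This is shorter, but there is no obvious place to add per-file logic, which is where the while loop is more comfortable.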


When you create a script, only one #! line is needed.

And I fixed the indentation in your question, much easier to read your code properly indented and formatted (see the edit help in the question editor).


Last point to discuss: let's say you have a LARGE number of directories and files to process, and it is possible that new files are added while the script is running. Ex. you are moving many MP4 files, and while the script is doing so, new files are deposited in the directories. Then when you do the deletion you could potentially lose files.

If such a case is possible, you could add a check for new files just before you do the /bin/rm; it would help. To be absolutely certain, you could set up a script that processes 1 file, and have it triggered by inotify. But that is another ball game, more complicated and out of scope for this answer.
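
One way such a check might look, as a rough sketch: before removing each sub-directory, look once more for matching files and skip the directory if any turned up. The directory and file names are invented for the demo, and -mindepth, -maxdepth and -quit are assumed to be available (they are in GNU and BSD find). This only narrows the window in which a file can be lost; it does not close it.

```shell
#!/usr/bin/env bash
# Sketch only: start_dir and the file names are placeholders for this demo.
start_dir=$(mktemp -d)
mkdir -p "$start_dir/done" "$start_dir/busy"
touch "$start_dir/done/leftover.nfo"
touch "$start_dir/busy/late keyword arrival.MP4"   # simulates a file that arrived late

deleted=0
while IFS= read -r -d '' dir
do
    # -print -quit stops at the first match, so the check stays cheap.
    pending=$(find "$dir" -type f \( -name "*keyword*.MP4" -o -name "*keyword*.MKV" \) -print -quit)
    if [[ -n "$pending" ]]
    then
        echo "Skipping $dir: unprocessed file found"
    else
        rm -rf -- "$dir"
        deleted=$((deleted + 1))
    fi
done < <(find "$start_dir" -mindepth 1 -maxdepth 1 -type d -print0)

echo "Deleted $deleted directory(ies)"
```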

How to achieve AJAX(interactive) kind of SEARCH in LINUX to FIND files?

Stumbled across this old question, found it interesting and thought I'd give it a try. This Bash script worked for me:

#!/bin/bash
# Set MINLEN to the minimum number of characters needed to start the
# search.
MINLEN=2
clear
echo "Start typing (minimum $MINLEN characters)..."
# get one character without need for return; IFS= keeps a typed space intact
while IFS= read -r -n 1 -s i
do
    # get ascii value of character to detect backspace
    printf -v n '%d' "'$i"
    if (( n == 127 ))               # if character is a backspace...
    then
        if (( ${#in} > 0 ))         # ...and search string is not empty
        then
            in=${in:0:${#in}-1}     # shorten search string by one
                                    # could use ${in:0:-1} for bash >= 4.2
        fi
    elif (( n == 27 ))              # if character is an escape...
    then
        exit 0                      # ...then quit
    else                            # if any other char was typed...
        in=$in$i                    # add it to the search string
    fi
    clear
    echo "Search: \"$in\""          # show search string on top of screen
    if (( ${#in} >= MINLEN ))       # if search string is long enough...
    then
        find "$@" -iname "*$in*"    # ...call find, pass it any parameters given
    fi
done

Hope this does what you intend(ed) to do. I included a "start dir" option, because the listings can get quite unwieldy if you search through a whole home folder or something. Just drop the "$@" if you don't need it.
Using the ASCII value in $n, it should be easy to add some hotkey functionality like quitting or saving results, too.

EDIT:

If you start the script it will display "Start typing..." and wait for keys to be pressed. If the search string is long enough (as defined by the variable MINLEN), any key press will trigger a find run with the current search string (the grep seems kind of redundant here). The script passes any parameters given to find. This allows for better search results and shorter result lists. -type d, for example, will limit the search to directories, -xdev will keep the search on the current file system, etc. (see man find). Backspace will shorten the search string by one character, while pressing Escape will quit the script. The current search string is displayed on top. I used -iname for the search to be case-insensitive. Change this to -name to get case-sensitive behaviour.
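
To illustrate the hotkey idea, here is a small sketch that maps a key to an action by its character code. The helper names key_code and describe_key are hypothetical, and using Tab as a "save results" hotkey is just an invented example.

```shell
#!/usr/bin/env bash
# Sketch only: key_code and describe_key are hypothetical helper names.

# Print the numeric code of the first character of $1.
key_code() {
    printf '%d' "'$1"
}

# Map a key to an action name; Tab as a hotkey is an invented example.
describe_key() {
    case "$(key_code "$1")" in
        127) echo "backspace" ;;
        27)  echo "escape" ;;
        9)   echo "save-results" ;;   # 9 is the Tab character
        *)   echo "append" ;;
    esac
}

describe_key $'\x7f'
describe_key $'\e'
describe_key $'\t'
describe_key "a"
```

Inside the read loop, the case statement would take the place of the if/elif chain that tests for 127 and 27.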

Bash - find a keyword in a file and delete its line

Use the stream editor, sed:

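sed -i.bak '/culpa/d' test.txt

The above will delete lines containing culpa in test.txt. It will create a backup of the original (named test.txt.bak) and will modify the original file in-place. The suffix is attached directly to -i because GNU sed does not accept it as a separate argument; the attached form also works with BSD/macOS sed.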
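
A quick way to try this out safely is on a throwaway file. The sketch below uses made-up file content in a temporary directory and the attached-suffix spelling -i.bak, which both GNU and BSD sed accept.

```shell
#!/usr/bin/env bash
# Sketch only: the file content is made up for this demo.
work=$(mktemp -d)
printf '%s\n' "lorem ipsum" "mea culpa" "dolor sit" > "$work/test.txt"

# Delete every line containing "culpa", keeping a backup with the .bak suffix.
sed -i.bak '/culpa/d' "$work/test.txt"

echo "--- edited file ---"
cat "$work/test.txt"
echo "--- backup ---"
cat "$work/test.txt.bak"
```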

How to find all files containing specific text (string) on Linux?

Do the following:

grep -rnw '/path/to/somewhere/' -e 'pattern'
  • -r or -R searches recursively,
  • -n prints the line number of each match, and
  • -w matches the pattern only as a whole word.
  • -l (lower-case L) can be added to print just the names of matching files.
  • -e introduces the pattern to search for.

Along with these, --exclude, --include, --exclude-dir flags could be used for efficient searching:

  • This will only search through those files which have .c or .h extensions:

    grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
  • This will exclude searching all the files ending with .o extension:

    grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
  • For directories it's possible to exclude one or more directories using the --exclude-dir parameter. For example, this will exclude the dirs dir1/, dir2/ and all of them matching *.dst/:

    grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/search/' -e "pattern"

This works very well for me, for almost the same purpose as yours.

For more options, see man grep.
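
To see how these flags interact, here is a small self-contained sketch. The directory tree, file contents and the search word are made up for the demo, and --include / --exclude-dir are assumed to be available (they are in GNU grep).

```shell
#!/usr/bin/env bash
# Sketch only: the directory tree and the pattern are made up for this demo.
root=$(mktemp -d)
mkdir -p "$root/src" "$root/build"
printf 'int pattern = 1;\n'  > "$root/src/main.c"
printf 'int patterns = 2;\n' > "$root/src/util.c"    # not a whole-word match
printf 'pattern in object\n' > "$root/build/main.o"

# Recursive, whole-word search with line numbers.
grep -rnw "$root" -e 'pattern'

# Only the names of matching .c/.h files, skipping the build directory.
matches=$(grep -rlw --include='*.c' --include='*.h' --exclude-dir=build "$root" -e 'pattern')
echo "$matches"
```

Quoting the globs ('*.c') keeps the shell from expanding them before grep sees them, which is the same job the backslash does in --include=\*.{c,h}.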


