Search MS Word Files in a Directory for Specific Content in Linux

How to find all files containing specific text (string) on Linux?

Do the following:

grep -rnw '/path/to/somewhere/' -e 'pattern'
  • -r or -R is recursive,
  • -n is line number, and
  • -w stands for match the whole word.
  • -l (lower-case L) can be added to just give the file name of matching files.
  • -e is the pattern used during the search

Along with these, --exclude, --include, --exclude-dir flags could be used for efficient searching:

  • This will only search through those files which have .c or .h extensions:
grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
  • This will exclude searching all the files ending with .o extension:
grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
  • For directories it's possible to exclude one or more directories using the --exclude-dir parameter. For example, this will exclude the dirs dir1/, dir2/ and all of them matching *.dst/:
grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/somewhere/' -e "pattern"

This works very well for me, and it serves almost the same purpose as yours.

For more options, see man grep.
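
As a minimal sketch of how these flags combine, the function below lists only the names of .c and .h files under a directory that contain a whole-word match, while skipping one directory. The function name find_in_sources and the skipped directory name build are made up for illustration; the flags themselves are the ones described above.

```shell
#!/bin/sh
# Hypothetical helper: list .c/.h files under DIR that contain WORD as a whole word.
# Usage: find_in_sources DIR WORD
find_in_sources() {
    dir=$1
    word=$2
    # -r recurse, -l print file names only, -w match whole words
    # --include limits the search to *.c and *.h files
    # --exclude-dir skips any directory named "build" (an assumed name)
    grep -rlw --include='*.c' --include='*.h' --exclude-dir='build' -e "$word" "$dir"
}
```

For example, find_in_sources /path/to/somewhere pattern prints one matching file path per line.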

How to search multiple DOCX files for a string within a Word field?

This script should accomplish what you are trying to do. Let me know if that isn't the case. I don't usually write entire scripts because it can hurt the learning process, so I have commented each command so that you might learn from it.

#!/bin/sh

# Create ~/tmp/WORDXML folder if it doesn't exist already
mkdir -p ~/tmp/WORDXML

# Iterate through each file passed to this script
# ("$@" is quoted so that file names containing spaces survive)
for FILE in "$@"; do
{
# unzip it into ~/tmp/WORDXML (-d) without changing directory,
# so that relative file names keep working; -o overwrites without prompting
# > /dev/null 2>&1 discards all output to the terminal
unzip -o "$FILE" -d ~/tmp/WORDXML > /dev/null 2>&1

# find all of the xml files
find ~/tmp/WORDXML -type f -name '*.xml' -print0 | \

# open them in xmllint to make them pretty. Discard errors.
xargs -0 xmllint --recover --format 2> /dev/null | \

# search for and report if found
grep 'getProposalTranslations' && echo " [^ found in file '$FILE']"

# remove the temporary contents
rm -rf ~/tmp/WORDXML/*

}; done

# remove the temporary folder
rm -rf ~/tmp/WORDXML

Save the script wherever you like. Name it whatever you like. I'll name it docxfind. Make it executable by running chmod +x docxfind. Then you can run the script like this (assuming your terminal is running in the same directory): ./docxfind filenames...
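
If the temporary folder is not needed, unzip can stream archive members straight to standard output with -p. The sketch below assumes the text of interest lives in the word/*.xml members of the archive (the usual .docx layout) and that unzip is installed; docx_contains is a made-up name. Keep in mind that Word can split a visible string across several XML elements, so a pattern that looks contiguous in the document may not be contiguous in the XML.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any word/*.xml member of a .docx contains PATTERN.
# Usage: docx_contains PATTERN FILE.docx
docx_contains() {
    pattern=$1
    file=$2
    # unzip -p writes the selected members to stdout instead of extracting to disk.
    # The function's exit status is grep's: 0 = found, 1 = not found.
    unzip -p "$file" 'word/*.xml' 2>/dev/null | grep -q -e "$pattern"
}
```

A call such as docx_contains 'getProposalTranslations' report.docx && echo found would then report a hit without leaving any files behind (report.docx is only a placeholder name).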

Find specific folders then search specific files inside them for a word

Try this. I am assuming you have multiple files named test.log in folders whose names start with k0:

for file in $(find ./k0* -name 'test.log'); do
    grep -w 'ERROR' "$file"
done

You can make this into a one-liner command like this:

for file in $(find ./k0* -name 'test.log'); do grep -w 'ERROR' "$file"; done

You can run it in a terminal simply by pasting it in.
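
A variation that avoids word splitting on file names containing spaces is to let find run grep itself. This is only a sketch, not a replacement for the loop above; the function name grep_test_logs is invented, and it assumes the same k0* directory layout.

```shell
#!/bin/sh
# Hypothetical helper: search every test.log under ROOT/k0* for the whole word ERROR.
# Usage: grep_test_logs ROOT
grep_test_logs() {
    root=$1
    # -exec ... {} + hands the found paths to grep directly, so names with
    # spaces are safe; -H prints the file name even if only one file is found
    find "$root"/k0* -name 'test.log' -exec grep -Hw 'ERROR' {} +
}
```

Running grep_test_logs . from the parent directory prints each matching line prefixed with the path of the log it came from.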

Search all .htaccess files in a Linux server webroot for a word and return file paths to a text file

Following your example, try using:

find . \( -type f -name .htaccess \) -print0 | xargs -0 grep -H ldap > /tmp/results.txt

Here find prints the path of every file named .htaccess under the current directory, null-terminated (-print0), and xargs -0 passes those paths to grep. grep -H ldap then prints each line containing the string ldap, prefixed with the name of the file it came from.
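
Since the goal is a list of file paths, grep's -l option (print only the names of matching files) may be closer to what is wanted than -H. A sketch under that assumption; list_htaccess_with is a made-up name, and the output file is whatever path you pass in.

```shell
#!/bin/sh
# Hypothetical helper: write the paths of .htaccess files under ROOT
# that contain WORD to OUTFILE, one path per line.
# Usage: list_htaccess_with ROOT WORD OUTFILE
list_htaccess_with() {
    root=$1
    word=$2
    outfile=$3
    # -print0 / -0 keep paths with spaces intact;
    # grep -l prints each matching file's path once instead of the matching lines
    find "$root" -type f -name .htaccess -print0 | xargs -0 grep -l -e "$word" > "$outfile"
}
```

For instance, list_htaccess_with . ldap /tmp/results.txt would fill /tmp/results.txt with paths only.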

Command line tool to search docx file under MS-DOS or Cygwin

After trying out several approaches, I found the easiest way to do this is to use a Linux utility to batch convert all the docx files into txt files, and then simply grep those txt files.
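
The answer does not name the utility. As one possible stand-in (an assumption, not the tool the author used), unzip and sed can produce a rough text dump of each document. The sketch assumes the body text is in word/document.xml and that unzip is installed; both function names are invented.

```shell
#!/bin/sh
# Hypothetical helper: replace every XML tag on stdin with a space, leaving the text.
strip_xml_tags() {
    sed -e 's/<[^>]*>/ /g'
}

# Hypothetical helper: write a rough FILE.docx.txt next to each .docx in DIR.
# Usage: docx_dir_to_txt DIR
docx_dir_to_txt() {
    dir=$1
    for f in "$dir"/*.docx; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # unzip -p streams the document body to stdout without extracting it
        unzip -p "$f" word/document.xml 2>/dev/null | strip_xml_tags > "$f.txt"
    done
}
```

After docx_dir_to_txt some_dir, a plain grep -l 'pattern' some_dir/*.docx.txt lists the converted files that match. The dump is crude (no paragraph breaks are restored), so -l is more useful here than printing the matching lines.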

Search for a word in all Makefiles on Linux from the terminal

You can use the command below to search for the keyword send in all the Makefiles recursively.

find /home/mypath -name "Makefile" | xargs grep -r "send"

Here the find command lists all the files named Makefile under the specified directory. The xargs command then passes the listed files to grep, which searches them for the string send.

'find' some text including before and after lines

You can do exactly what you are doing, but add the -A and -B options to grep. E.g.:

find /directory -name "*.log" | xargs grep -B ABOVE -A BELOW "something"

Replace ABOVE with the number of lines you would like to see above (before) the match, and similarly replace BELOW with the number of lines you would like to see below (after) the match.
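
To make the direction of the two options concrete: -B counts lines before (above) the match and -A counts lines after (below) it. A small sketch; the helper name show_context is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print matches of PATTERN in FILE with ABOVE lines of
# leading context and BELOW lines of trailing context.
# Usage: show_context ABOVE BELOW PATTERN FILE
show_context() {
    # -B = lines Before the match, -A = lines After the match
    grep -B "$1" -A "$2" -e "$3" "$4"
}
```

For a file whose five lines are one, two, ERROR three, four and five, show_context 1 1 ERROR file prints the three lines two, ERROR three and four.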

Linux cmd to search for a class file among jars irrespective of jar path

Where are your jar files? Is there a pattern to find where they are?

1. Are they all in one directory?

For example, foo/a/a.jar and foo/b/b.jar are all under the folder foo/, in this case, you could use find with grep:

find foo/ -name "*.jar" | xargs grep Hello.class

You could even search from the root directory /, but it will be slow.

As @loganaayahee said, you could also use the command locate. locate searches for files using an index, so it will be faster. But the command should be:

locate "*.jar" | xargs grep Hello.class

Since you want to search the content of the jar files.
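
With grep alone, the output for a jar is typically just a note that the binary file matches. Because a jar is a zip archive, listing its entries is another way to check which jar holds the class. A sketch assuming unzip is installed; find_class_in_jars is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print the path of every jar under ROOT whose
# listing mentions CLASSFILE.
# Usage: find_class_in_jars ROOT CLASSFILE
find_class_in_jars() {
    root=$1
    classfile=$2
    find "$root" -type f -name '*.jar' | while IFS= read -r jar; do
        # unzip -l lists the archive's entries without extracting them;
        # grep -F treats CLASSFILE as a fixed string, so "." is not a wildcard
        if unzip -l "$jar" 2>/dev/null | grep -q -F -e "$classfile"; then
            printf '%s\n' "$jar"
        fi
    done
}
```

For example, find_class_in_jars foo/ Hello.class prints the jars under foo/ that list an entry containing Hello.class.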

2. Are the paths stored in an environment variable?

Typically, Java stores the paths used to find jar files in an environment variable such as CLASSPATH; I don't know if this is what you want. But if your variable looks like CLASSPATH=/lib:/usr/lib:/bin, with : separating the paths, then you could use this command to search for the class:

for P in `echo $CLASSPATH | sed 's/:/ /g'`; do grep Hello.class $P/*.jar; done
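
An alternative to the echo | sed pipeline is to let the shell split the list on : itself by setting IFS. A sketch assuming the same colon-separated layout of directories; both function names are invented, and directories without jars are skipped.

```shell
#!/bin/sh
# Hypothetical helper: print each entry of a colon-separated list on its own line.
# The body runs in a subshell, so IFS and globbing are unchanged for the caller.
# Usage: split_path_list "/lib:/usr/lib:/bin"
split_path_list() (
    IFS=:
    set -f                         # no globbing while the list is split
    for entry in $1; do
        printf '%s\n' "$entry"
    done
)

# Hypothetical helper: print the jars in the listed directories that contain CLASSFILE.
# Usage: grep_class_in_path "/lib:/usr/lib:/bin" Hello.class
grep_class_in_path() {
    split_path_list "$1" | while IFS= read -r dir; do
        for jar in "$dir"/*.jar; do
            [ -e "$jar" ] || continue    # no jars in this directory
            # -l prints only the jar's path, -F matches the name literally
            grep -l -F -e "$2" "$jar"
        done
    done
}
```

Quoting the variable when calling it, as in grep_class_in_path "$YOUR_VARIABLE" Hello.class, keeps directory names with spaces intact (YOUR_VARIABLE is a placeholder).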

