Error: grep: Argument list too long
Use find:
find /home/*/public_html -type f -exec grep -l 'pattern' {} +
The + terminator makes find group the filenames into manageable chunks, running grep once per chunk.
Alternatively, you can do it with grep -r. The arguments to this should be directory names, not filenames:
grep -rl 'pattern' /home/*/public_html
This will have just the 500+ directory arguments, not thousands of filenames.
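A minimal sketch of the recursive form, using a throwaway temporary tree as a stand-in for /home/*/public_html (the directory and file names here are invented for the demo):

```shell
# Stand-in tree: two "sites", only one contains the pattern.
tmp=$(mktemp -d) || exit 1
mkdir -p "$tmp/site1" "$tmp/site2"
printf 'hello pattern world\n' > "$tmp/site1/index.html"
printf 'nothing here\n'        > "$tmp/site2/index.html"

# Recursive grep: the arguments are directories, so the argument
# list stays short no matter how many files live inside them.
grep -rl 'pattern' "$tmp"/*    # prints $tmp/site1/index.html

rm -rf "$tmp"
```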
bash /bin/grep: Argument list too long (using --file option)
If there are no regular expressions in the input file, switch to grep -F, which can read a significantly larger number of input patterns. Failing that, splitting the input file is hugely more efficient than running 30,000+ iterations of grep over the same file. Here's splitting in chunks of 10,000 lines; adapting to a different chunk size should be trivial.
#!/bin/sh
t=$(mktemp -d -t fgrepsplit.XXXXXXXXXXXX) || exit
trap 'rm -rf "$t"' EXIT # Remove temp dir when done
trap 'exit 127' HUP INT TERM # Remove temp dir if interrupted, too
split -l 10000 "$1" "$t"/pat    # $1 = file of search patterns
for p in "$t"/pat*; do
    grep -F -f "$p" "$2"        # $2 = file to search
done
How to grep a large number of files?
This makes David sad...
Everyone so far is wrong (except for anubhava).
Shell scripting is unlike other programming languages because much of a command line's interpretation comes from the shell interpolating it before the command is ever executed.
Let's take something simple:
$ set -x
$ ls
+ ls
bar.txt foo.txt fubar.log
$ echo The text files are *.txt
+ echo The text files are bar.txt foo.txt
The text files are bar.txt foo.txt
$ set +x
$
The set -x lets you see how the shell actually interpolates the glob and then passes the result to the command as input. The line prefixed with + shows what is actually being executed. You can see that the echo command isn't interpreting the *. Instead, the shell grabs the * and replaces it with the names of the matching files; then and only then does the echo command actually execute.
When you have 40,000-plus files and you run grep *, the shell expands that * into the names of all 40,000-plus files before grep even has a chance to execute, and that's where the error message /usr/bin/grep: Argument list too long comes from.
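The limit itself comes from the kernel; on most systems you can inspect it with getconf (the value varies by platform):

```shell
# ARG_MAX is the maximum combined size, in bytes, of the argument
# list plus environment that the kernel will accept when executing
# a program. Expanding a huge glob past this limit is what triggers
# "Argument list too long".
getconf ARG_MAX
```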
Fortunately, Unix has a way around this dilemma:
$ find . -name "*.kaks" -type f -maxdepth 1 | xargs grep -f A01/genes.txt
The find . -name "*.kaks" -type f -maxdepth 1 finds all of your *.kaks files; -maxdepth 1 limits the search to the current directory, and -type f makes sure you pick up only files, not directories.
The find command pipes the names of the files into xargs, and xargs appends those names to the grep -f A01/genes.txt command. However, xargs has a trick up its sleeve: it knows how long the command-line buffer is, executes grep when the buffer is full, and then passes another batch of files to the next grep. This way, grep gets executed maybe three or ten times (depending on the size of the command-line buffer), and all of the files are searched.
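A sketch of that batching behavior, shrunk to five arguments, with -n 2 forcing tiny batches (real xargs sizes its batches from the command-line buffer rather than a fixed count):

```shell
# xargs collects stdin lines into batches and runs the command once
# per batch; here -n 2 caps each batch at two arguments, so echo
# runs three times for five inputs.
printf '%s\n' a b c d e | xargs -n 2 echo
# prints:
# a b
# c d
# e
```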
Unfortunately, xargs uses whitespace as the separator for file names, so if your files contain spaces or tabs, you'll have trouble with xargs. Fortunately, there's another fix:
$ find . -name "*.kaks" -type f -maxdepth 1 -print0 | xargs -0 grep -f A01/genes.txt
The -print0 causes find to print the names of the files separated not by newlines but by the NUL character, and the -0 parameter tells xargs that the file separator isn't whitespace but the NUL character. This fixes the issue.
You could also do this too:
$ find . -name "*.kaks" -type f -maxdepth 1 -exec grep -f A01/genes.txt {} \;
This executes grep once for each and every file found, instead of stuffing as many file names as possible onto each command line the way xargs does. The advantage is that it avoids shell interference entirely; however, it may or may not be less efficient.
What would be interesting is to experiment and see which one is more efficient. You can use time to find out:
$ time find . -name "*.kaks" -type f -maxdepth 1 -exec grep -f A01/genes.txt {} \;
This executes the command and then tells you how long it took. Try it with the -exec form and with xargs and see which is faster. Let us know what you find.
How can I grep while avoiding 'Too many arguments'
Run several instances of grep. Instead of
grep -i user@domain.com 1US* | awk '{...}' | xargs rm
do
(for i in 1US*; do grep -li user@domain "$i"; done) | xargs rm
Note the -l flag: we only want the file name of the match. This both speeds up grep (it can stop at the first match) and makes your awk script unnecessary. It could be improved further by checking grep's return status and calling rm directly rather than through xargs (xargs is very fragile, IMO). I'll give you the better version if you ask.
Hope it helps.
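A sketch of the improvement hinted at above, keeping the answer's hypothetical 1US* file glob and address pattern: test grep's exit status per file and call rm directly, with no xargs in the pipeline at all.

```shell
# grep -q exits 0 on the first match and prints nothing, so each
# matching file can be removed immediately; non-matching files are
# left alone. Run this in the directory containing the 1US* files.
for i in 1US*; do
    if grep -qi 'user@domain' "$i"; then
        rm -- "$i"
    fi
done
```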
Getting argument list too long error
Try this:
$ find /file/collection/*/logs/ -name "*.log" -type f -maxdepth 1 | xargs grep hello
Argument list too long error in while loop reading from infinite input stream
The issue is that inside the print_volume function, I was repeatedly sourcing a file full of exports. As Charles Duffy pointed out, this caused the environment size to grow too large.
grep: argument list too long
You can try this sed:
sed 'N;/^[^\n]*\n[^\n]*$/N; /.*\n.*\n.*Possible/{$q;N;N;N;d};P;D;' structure > final