How to Grep While Avoiding 'Too Many Arguments'

How can I grep while avoiding 'Too many arguments'

Run several instances of grep. Instead of

grep -i user@domain.com 1US* | awk '{...}' | xargs rm

do

(for i in 1US*; do grep -li user@domain "$i"; done) | xargs rm

Note the -l flag, since we only want the names of the matching files. This both speeds up grep (it stops reading a file at the first match) and makes your awk script unnecessary. This could be improved further by checking the return status of grep and calling rm directly instead of using xargs (xargs is very fragile, IMO). I'll give you the better version if you ask.
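A minimal sketch of that improvement, checking grep's exit status and calling rm directly. The scratch directory, file names, and file contents are made up for the demonstration; only the 1US* glob and the address come from the question.

```shell
# Hypothetical demo data in a scratch directory (not from the original question).
demo_dir=$(mktemp -d)
printf 'From: user@domain.com\n' > "$demo_dir/1US match one"
printf 'From: someone@else.org\n' > "$demo_dir/1US keep"

for f in "$demo_dir"/1US*; do
    # -q: report only via exit status; -i: ignore case; -F: fixed string, not a regex
    if grep -qiF 'user@domain.com' -- "$f"; then
        rm -- "$f"
    fi
done

ls "$demo_dir"
```

Because the glob is expanded inside the shell's for loop rather than passed to an external command, no argument list is built, and quoting "$f" keeps file names with spaces intact.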

Hope it helps.

grep returns Too many argument specified on command

$ find ./ -type f -name '20110101*' -print0 | xargs -0 grep -l "search_pattern"

You can use find and xargs. xargs runs grep on batches of the files found by find. Use -P to run several grep processes in parallel and -n to set how many files each grep invocation receives. The -print0 option of find terminates each file name with a null character, and -0 tells xargs to expect that, so spaces or newlines in file names cause no confusion. Note that -type f must come before -print0, otherwise it filters nothing. If you are sure that no file name contains whitespace, you can drop both -print0 and -0.
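The -P and -n options mentioned above can be sketched as follows. The batch size of 100, the 4 parallel processes, and the demo files are arbitrary assumptions; -P is an extension available in GNU and BSD xargs, not part of POSIX.

```shell
# Hypothetical demo files; one contains the pattern, one does not.
demo_dir=$(mktemp -d)
printf 'search_pattern here\n' > "$demo_dir/20110101 a.log"
printf 'nothing to see\n' > "$demo_dir/20110101 b.log"

# At most 100 files per grep invocation, up to 4 grep processes at once.
matches=$(find "$demo_dir" -type f -name '20110101*' -print0 |
    xargs -0 -n 100 -P 4 grep -l 'search_pattern')

printf '%s\n' "$matches"
```

With parallel runs the order of the output lines is not guaranteed, so sort the result if the order matters.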

Error: grep: Argument list too long

Use find

find /home/*/public_html -type f -exec grep -l 'pattern' {} +

The + modifier makes it group the filenames in manageable chunks.

Alternatively, you can do it with grep -r. Its arguments should be directory names, not file names.

grep -rl 'pattern' /home/*/public_html

This will just have 500+ arguments, not thousands of filenames.

too many arguments using []

too many arguments

Just as the error says, you are supplying too many arguments.

The test should have the structure [ arg1 = arg2 ]. In your case, $(cat /etc/dhcpcd.conf | grep 'interface wlan0') must have produced multi-line or multi-word output, which the shell splits into several arguments, and that is why you get the error.

To avoid that error, you can simply enclose the result of the command in quotes which will make the complete result a single argument.

[ "$(cat /etc/dhcpcd.conf | grep 'interface wlan0')" = 'interface wlan0' ]

That said, I think you only want the first match from grep.
The question "Grep only the first match and stop" should help you in that case.
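A minimal sketch of both variants, using a temporary file as a hypothetical stand-in for /etc/dhcpcd.conf (its contents are invented). -m 1 is supported by GNU and BSD grep; -q, -x, and -F are POSIX.

```shell
# Hypothetical stand-in for /etc/dhcpcd.conf with the line present twice.
conf=$(mktemp)
printf 'interface wlan0\nstatic ip_address=192.0.2.10/24\ninterface wlan0\n' > "$conf"

# -m 1 stops after the first matching line, so the result is a single line.
first=$(grep -m 1 'interface wlan0' "$conf")
[ "$first" = 'interface wlan0' ] && echo 'first match equals the expected line'

# If only presence matters, skip [ ] entirely and test grep's exit status:
# -q prints nothing, -x matches whole lines, -F treats the pattern literally.
if grep -qxF 'interface wlan0' "$conf"; then
    result=found
else
    result=missing
fi
echo "$result"
```

The second form avoids the string comparison altogether, which also sidesteps the quoting problem.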

How to get around the Linux Too Many Arguments limit


edit:

I was finally able to pass <= 256 KB as a single command line argument (see edit (4) at the bottom). However, please read carefully how I did it and decide for yourself whether this is a way you want to go. At the very least, what I found out should help you understand why you are 'stuck' otherwise.


With the coupling of ARG_MAX to ulimit -s / 4 came the introduction of MAX_ARG_STRLEN as the maximum length of a single argument:

/*
* linux/fs/exec.c
*
* Copyright (C) 1991, 1992 Linus Torvalds
*/

...

#ifdef CONFIG_MMU
/*
* The nascent bprm->mm is not visible until exec_mmap() but it can
* use a lot of memory, account these pages in current->mm temporary
* for oom_badness()->get_mm_rss(). Once exec succeeds or fails, we
* change the counter back via acct_arg_size(0).
*/

...

static bool valid_arg_len(struct linux_binprm *bprm, long len)
{
    return len <= MAX_ARG_STRLEN;
}

...

#else

...

static bool valid_arg_len(struct linux_binprm *bprm, long len)
{
    return len <= bprm->p;
}

#endif /* CONFIG_MMU */

...

static int copy_strings(int argc, struct user_arg_ptr argv,
                        struct linux_binprm *bprm)
{

...

    str = get_user_arg_ptr(argv, argc);

...

    len = strnlen_user(str, MAX_ARG_STRLEN);
    if (!len)
        goto out;

    ret = -E2BIG;
    if (!valid_arg_len(bprm, len))
        goto out;

...

}

...

MAX_ARG_STRLEN is defined as 32 times the page size in linux/include/uapi/linux/binfmts.h:

...

/*
* These are the maximum length and maximum number of strings passed to the
* execve() system call. MAX_ARG_STRLEN is essentially random but serves to
* prevent the kernel from being unduly impacted by misaddressed pointers.
* MAX_ARG_STRINGS is chosen to fit in a signed 32-bit integer.
*/
#define MAX_ARG_STRLEN (PAGE_SIZE * 32)
#define MAX_ARG_STRINGS 0x7FFFFFFF

...

The default page size is 4 KB so you cannot pass arguments longer than 128 KB.
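The arithmetic can be checked on a running system with getconf. This is only a sketch: the factor 32 is copied from the MAX_ARG_STRLEN definition quoted above and applies to an unmodified Linux kernel; other systems enforce different limits.

```shell
page_size=$(getconf PAGESIZE)
# Mirrors MAX_ARG_STRLEN = PAGE_SIZE * 32 (assumes a stock Linux kernel).
max_arg_strlen=$((page_size * 32))
# ARG_MAX is the total budget for all arguments plus the environment.
arg_max=$(getconf ARG_MAX)

echo "page size:               $page_size bytes"
echo "longest single argument: about $max_arg_strlen bytes"
echo "ARG_MAX (argv + environ): $arg_max bytes"
```

Note the two separate limits: a command line can fail either because one argument exceeds the per-argument cap or because all arguments together exceed ARG_MAX.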

I can't try it now, but switching to huge page mode (page size 4 MB), if that is possible on your system, might solve this problem.

For more detailed information and references see this answer to a similar question on Unix & Linux SE.


edits:

(1)
According to this answer one can change the page size of x86_64 Linux to 1 MB by enabling CONFIG_TRANSPARENT_HUGEPAGE and setting CONFIG_TRANSPARENT_HUGEPAGE_MADVISE to n in the kernel config.

(2)
After recompiling my kernel with the above configuration changes getconf PAGESIZE still returns 4096.
According to this answer CONFIG_HUGETLB_PAGE is also needed which I could pull in via CONFIG_HUGETLBFS. I am recompiling now and will test again.

(3)
I recompiled my kernel with CONFIG_HUGETLBFS enabled, and /proc/meminfo now contains the HugePages_* entries mentioned in the corresponding section of the kernel documentation.
However, the page size reported by getconf PAGESIZE is still unchanged. So while I should now be able to request huge pages via mmap calls, the kernel's default page size, which determines MAX_ARG_STRLEN, is still fixed at 4 KB.

(4)
I modified linux/include/uapi/linux/binfmts.h to #define MAX_ARG_STRLEN (PAGE_SIZE * 64), recompiled my kernel and now your code produces:

...

117037
123196
123196
129680
129680
136505
143689
151251
159211

...

227982
227982
239981
239981
252611
252611
265906
./testCL: line 11: ./foo: Argument list too long
279901
./testCL: line 11: ./foo: Argument list too long
294632
./testCL: line 11: ./foo: Argument list too long

So now the limit moved from 128 KB to 256 KB as expected.
I don't know about potential side effects though.
As far as I can tell, my system seems to run just fine.

Argument list too long error for rm, cp, mv commands

This occurs because bash expands the asterisk to every matching file name, producing a very long command line.

Try this:

find . -name "*.pdf" -print0 | xargs -0 rm

Warning: this is a recursive search and will find (and delete) files in subdirectories as well. Tack on -f to the rm command only if you are sure you don't want confirmation.

You can do the following to make the command non-recursive:

find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm

Another option is to use find's -delete flag:

find . -name "*.pdf" -delete
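Since -delete is just as recursive as the xargs variant and asks for no confirmation, a dry run with -print first is a cheap safeguard. A sketch under assumptions: the scratch directory and file names are invented, and -maxdepth and -delete are GNU/BSD find extensions rather than POSIX.

```shell
# Hypothetical scratch directory with PDFs at two depths.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/a.pdf" "$demo_dir/b c.pdf" "$demo_dir/sub/deep.pdf" "$demo_dir/keep.txt"

# Dry run: list what would be removed, top level only.
find "$demo_dir" -maxdepth 1 -type f -name '*.pdf' -print

# Same expression with -delete instead of -print.
find "$demo_dir" -maxdepth 1 -type f -name '*.pdf' -delete

remaining=$(find "$demo_dir" -type f | sort)
printf '%s\n' "$remaining"
```

Keeping the two find expressions identical apart from the final action ensures the dry run shows exactly what the second command deletes.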

Too many arguments issue in if statement in Bash

If a variable contains whitespace and is not quoted, the shell splits it into separate words inside [ ... ], so the number of arguments passed to [ increases:

three_words='a b c'
[ $three_words = 'a b c' ]

is actually interpreted as

[ a b c = 'a b c' ]
# 1 2 3 4 5

I used = here, as -eq is used to compare numbers, not strings.

Do you see? 5 words!

Solution? Double quote the variable:

[ "$three_words" = 'a b c' ]

Or, if you are in bash and don't care about portability to other shells, use [[ ... ]], which does not word-split the variable, so it needs no quoting:

[[ $three_words = 'a b c' ]]
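The word count can be made visible with set --, which replaces the positional parameters with whatever the expansion produces. A small sketch, assuming the default IFS (space, tab, newline):

```shell
three_words='a b c'

# Unquoted: the value is split on whitespace into separate words.
set -- $three_words
unquoted_count=$#

# Quoted: the value stays one word.
set -- "$three_words"
quoted_count=$#

echo "unquoted: $unquoted_count words, quoted: $quoted_count word"

# With quotes, [ receives exactly three arguments and the comparison works.
if [ "$three_words" = 'a b c' ]; then
    verdict=equal
else
    verdict=different
fi
echo "$verdict"
```

The three unquoted words plus = and the right-hand string add up to the five arguments counted above.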

Does argument list too long restriction apply to shell builtins?

In bash, the OS-enforced limitation on command-line length which causes the error argument list too long is not applied to shell builtins.

This error is triggered when the execve() syscall returns the error code E2BIG. There is no execve() call involved when invoking a builtin, so the error cannot take place.

Thus, both of your proposed operations are safe: cmd <<< "$string" writes $string to a temporary file, which does not require that it be passed as an argv element (or an environment variable, which is stored in the same pool of reserved space); and printf '%s\n' "$cmd" takes place internal to the shell unless the shell's configuration has been modified, as with enable -n printf, to use an external printf implementation.
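The difference can be observed with a string that is too long for a single argv element. This is a sketch under assumptions: a Linux system with 4 KB pages (so a 128 KB cap per argument), and /bin/echo standing in for any external command. On other systems the external call may succeed.

```shell
# Build a 200000-byte string, larger than 128 KB.
big=$(head -c 200000 /dev/zero | tr '\0' 'x')

# Builtin printf: no execve() is involved, so the length is not an issue.
builtin_bytes=$(printf '%s' "$big" | wc -c | tr -d ' ')
echo "builtin printf wrote $builtin_bytes bytes"

# External command: the string must pass through execve() as an argument.
if /bin/echo "$big" >/dev/null 2>&1; then
    external=ok
else
    external=failed
fi
echo "external /bin/echo: $external"
```

On a stock Linux kernel the second call is expected to fail with "Argument list too long", while the builtin handles the same string.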


