Bash Script Pattern Matching

Bash script pattern matching

Use a character class: [0-9] matches 0, 9, and every character between them in the character set, which - at least in Unicode (e.g. UTF-8) and subset character sets (e.g. US-ASCII, Latin-1) - are the digits 1 through 8. So it matches any one of the 10 Latin digits.

if [[ $var1 == *,123[0-9][0-9][0-9],* ]] ; then echo "Pattern matched"; fi

Using =~ instead of == changes the pattern type from shell standard "glob" patterns to regular expressions ("regexes" for short). You can make an equivalent regex a little shorter:

if [[ $var1 =~ ,123[0-9]{3}, ]] ; then echo "Pattern matched"; fi

The first shortening comes from the fact that a regex only has to match any part of the string, not the whole thing. Therefore you don't need the equivalent of the leading and trailing *s that you find in the glob pattern.

The second length reduction is due to the {n} syntax, which lets you specify an exact number of repetitions of the previous pattern instead of actually repeating the pattern itself in the regex. (You can also match any of a range of repetition counts by specifying a minimum and maximum, such as [0-9]{2,4} to match either two, three, or four digits in a row.)
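
As a quick check of the points above, the following sketch compares the glob and regex forms on a few sample strings. The function names and the sample values are made up for illustration and are not part of the original question.

```bash
#!/usr/bin/env bash

# Glob form: the pattern must cover the whole string, hence the outer *s.
glob_match() {
  [[ $1 == *,123[0-9][0-9][0-9],* ]]
}

# Regex form: a match anywhere in the string is enough.
regex_match() {
  [[ $1 =~ ,123[0-9]{3}, ]]
}

# Range form: two to four digits between two commas.
range_match() {
  [[ $1 =~ ,[0-9]{2,4}, ]]
}

# Illustrative inputs: exactly three digits after 123, too few, too many.
for s in 'a,123456,b' 'a,12345,b' 'a,1234567,b'; do
  glob_match  "$s" && g=yes || g=no
  regex_match "$s" && r=yes || r=no
  echo "$s glob=$g regex=$r"
done
```

Both forms should agree on every input: only the first sample has exactly three digits between 123 and the closing comma.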

It's worth noting that you could also use a named character class to match digits. Depending on your locale, [[:digit:]] may be exactly equivalent to [0-9], or it may include characters from other scripts with the Unicode "Number, Decimal Digit" property.

if [[ $var1 =~ ,123[[:digit:]]{3}, ]] ; then echo "Pattern matched"; fi

Regex for pattern matching in shell script and extract the match part


I wanted to make it more generic like select everything which appears after the last equal sign.

You may use:

[[ $PM_path =~ .*=([^/]+) ]] && echo "${BASH_REMATCH[1]}"
R9NEXRERNFVCSS01_PS

.* matches the longest possible text from the start of the string, and then we match a =. Finally, we match and capture the following run of one or more non-/ characters in ([^/]+), which we print using echo "${BASH_REMATCH[1]}".
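
The same idea can be wrapped in a small function. This is only a sketch: the value assigned to PM_path is a hypothetical stand-in (the real input is not shown here), and the function names are invented for the example. The second function shows a regex-free alternative based on parameter expansion.

```bash
#!/usr/bin/env bash

# Hypothetical input; the real value of PM_path is not shown in the question.
PM_path='/opt/pm/site=LAB/node=R9NEXRERNFVCSS01_PS/stats.xml'

# Print whatever follows the last "=" up to the next "/" (or end of string).
after_last_equals() {
  if [[ $1 =~ .*=([^/]+) ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1   # no "=" followed by a non-slash character
  fi
}

# Alternative using parameter expansion only, no regex involved.
after_last_equals_pe() {
  [[ $1 == *=* ]] || return 1
  local tail=${1##*=}           # drop everything through the last "="
  printf '%s\n' "${tail%%/*}"   # drop the first "/" and what follows
}

after_last_equals "$PM_path"
after_last_equals_pe "$PM_path"
```

For the sample value both calls print the segment after the last equal sign. The two functions can differ on edge cases, for example when the last = is immediately followed by a /.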

Pattern matching in if statement in bash

Using grep - this is pretty simple to do.

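#!/bin/bash

# Count the words with at least two vowels across all .txt files.
wordcount=0
for file in ./*.txt
do
    # xargs -n1 prints one word per line; grep -c counts the matching lines.
    count=$(xargs -n1 < "$file" | grep -ci "[aeiou].*[aeiou]")
    wordcount=$((wordcount + count))
done

echo "$wordcount"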
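
If you would rather avoid the external commands, a bash-only variant is possible. The sketch below is an assumption about what "word" means here (anything separated by whitespace); unlike xargs it does no quote processing, and the function name is made up.

```bash
#!/usr/bin/env bash

# Count whitespace-separated words on stdin that contain at least two vowels.
count_two_vowel_words() {
  local word count=0
  local -a words
  # The "|| ((...))" part keeps a final line that lacks a trailing newline.
  while read -r -a words || ((${#words[@]})); do
    for word in "${words[@]}"; do
      if [[ $word =~ [aeiouAEIOU].*[aeiouAEIOU] ]]; then
        count=$((count + 1))
      fi
    done
  done
  echo "$count"
}

# Small inline sample instead of real files.
printf 'apple sky tree\nby idea\n' | count_two_vowel_words
```

To process files, redirect or pipe them into the function, for example cat ./*.txt | count_two_vowel_words.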

How to pattern match a script argument in Linux bash

As pointed out in the comments, case only supports globs (also known as wildcards). To check regular expression as in your script you could use [[ … =~ … ]] but that would be overkill. Since you only want to check the first letter, globs are sufficient.

Also, I would rephrase the warning message. Tell the user what to do, not just “You messed up. Good luck guessing on your next try.”.

if [ "$#" -ne 1 ]; then
    echo "Expected 1 argument but found $#."
    exit 1
fi
case "$1" in
    [0-9]*) echo "Argument starts with number" ;;
    [a-zA-Z]*) echo "Argument starts with letter" ;;
    *) echo "Argument is a string" ;;
esac
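
For comparison, here is roughly what the regex form mentioned above could look like. It is a sketch only; the function name classify_arg is invented for the example, and the case version remains the simpler choice for this task.

```bash
#!/usr/bin/env bash

# Same classification as the case statement, written with [[ ... =~ ... ]].
classify_arg() {
  if [[ $1 =~ ^[0-9] ]]; then
    echo "Argument starts with number"
  elif [[ $1 =~ ^[a-zA-Z] ]]; then
    echo "Argument starts with letter"
  else
    echo "Argument is a string"
  fi
}

classify_arg 42abc
classify_arg hello
classify_arg _x
```

Note that the regex needs an explicit ^ anchor to test the first character, whereas the glob [0-9]* is implicitly anchored at both ends.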

I have pattern matching problem in case command in bash

The error is most likely because you have turned on the extglob option in your current shell. A sourced script runs in the current shell and inherits its options, including the extended ones, so the pattern works when you source the script.

But when you run ./t.sh, you launch a new shell that does not have the option turned on by default. Since the [[ operator with == behaves as if extglob were enabled, the first test works, but the case statement fails. To enable the option explicitly in scripts, run shopt -s extglob at the top of your script.

As you can see below, the pattern works with case only if the option is enabled. Try removing -O extglob from the command below and you will see that it no longer works.

bash -O extglob -c 'case apple79 in apple@(14|38|79|11)) echo ok 2;; *) ;; esac'

As for why your attempt didn't work, try adding the line shopt extglob to your t.sh and repeat your tests. You'll notice that when the script is sourced you see extglob on, and when it is executed you get extglob off.
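
A minimal sketch of a script that enables the option itself, so that it behaves the same whether it is sourced or executed. The function name check_fruit is made up for the example.

```bash
#!/usr/bin/env bash
# Enable extended globs before any line that uses them is parsed.
shopt -s extglob

check_fruit() {
  case $1 in
    apple@(14|38|79|11)) echo "ok" ;;
    *)                   echo "no match" ;;
  esac
}

shopt extglob          # reports the current state of the option
check_fruit apple79
check_fruit apple80
```

The shopt -s extglob line has to come on an earlier line than the pattern, because bash needs the option active when it parses the @(...) syntax.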

Check if a string matches a regex in Bash script

You can use the test construct, [[ ]], along with the regular expression match operator, =~, to check if a string matches a regex pattern (documentation).

For your specific case, you can write:

[[ "$date" =~ ^[0-9]{8}$ ]] && echo "yes"

Or a more accurate test:

[[ "$date" =~ ^[0-9]{4}(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])$ ]] && echo "yes"
#             |\______/\______*______/\______*__________*______/|
#             |   |           |                   |             |
#             | --year--  --month--            --day--          |
#             |         either 01..09       either 01..09       |
#       start of line   or 10,11,12         or 10..29           |
#                                           or 30, 31           |
#                                                          end of line

That is, you can define a regex in Bash matching the format you want. This way you can do:

[[ "$date" =~ ^regex$ ]] && echo "matched" || echo "did not match"

where commands after && are executed if the test is successful, and commands after || are executed if the test is unsuccessful.
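
The check can also be wrapped in a function with the regex kept in a variable, which avoids quoting surprises inside [[ ]]. This is a sketch; the names date_re and is_yyyymmdd are invented for the example. It uses if/else instead of the && ... || chain, because in a chain the command after || also runs when the command after && fails.

```bash
#!/usr/bin/env bash

# Format check only: it does not know how many days each month has.
date_re='^[0-9]{4}(0[1-9]|1[0-2])(0[1-9]|[1-2][0-9]|3[0-1])$'

is_yyyymmdd() {
  [[ $1 =~ $date_re ]]
}

# Illustrative inputs: a valid date, month 13, day 32, and a dashed format.
for d in 20240229 20241301 20240132 2024-01-01; do
  if is_yyyymmdd "$d"; then
    echo "$d: matched"
  else
    echo "$d: did not match"
  fi
done
```

Keep in mind that a value such as 20240231 still passes, since the regex validates the shape of the string and not the calendar.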

Note this is based on the solution by Aleks-Daniel Jakimenko in User input date format verification in bash.


In other shells you can use grep. If your shell is POSIX compliant, do

(echo "$date" | grep -Eq '^regex$') && echo "matched" || echo "did not match"

In fish, which is not POSIX-compliant, you can do

echo "$date" | grep -Eq "^regex\$"; and echo "matched"; or echo "did not match"

Caveat: These portable grep solutions are not watertight! For example, they can be tricked by input that contains newlines. The bash-specific regex check shown first does not have this issue.
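
The newline problem can be reproduced with a short sketch. The input value and the function names are hypothetical; an eight-digit pattern stands in for the regex.

```bash
#!/usr/bin/env bash

re='^[0-9]{8}$'

# grep works line by line, so one matching line is enough for success.
check_with_grep() {
  printf '%s\n' "$1" | grep -Eq "$re"
}

# [[ =~ ]] applies ^ and $ to the whole string, newlines included.
check_with_bash() {
  [[ $1 =~ $re ]]
}

# Hypothetical hostile input: a valid-looking first line plus extra text.
input=$'20240101\nunexpected'

if check_with_grep "$input"; then echo "grep: matched"; else echo "grep: did not match"; fi
if check_with_bash "$input"; then echo "bash: matched"; else echo "bash: did not match"; fi
```

Here grep accepts the input because its first line matches, while the bash test rejects it because the string as a whole does not.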

how to shell script regex perfect matching?

This issue is solved.

Following the answer from @The fourth bird, I found that I had missed the anchor (^).
To make the start and end of the match explicit, the pattern should sit between '^' and '$'.

You can refer to the answer:

if [[ "$image" =~ ^[0-9]+(\.[0-9]+){3}\-[0-9]+$ ]];
(@The fourth bird, Jul 11 at 8:43)

Thanks to everyone who replied XD
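
To see what the anchors change, the sketch below runs the same pattern with and without them. The sample image values and the function names are made up, and the hyphen is written without a backslash, which matches the same literal character.

```bash
#!/usr/bin/env bash

loose='[0-9]+(\.[0-9]+){3}-[0-9]+'   # may match anywhere in the string
strict="^${loose}\$"                 # must match the whole string

matches_loose()  { [[ $1 =~ $loose ]]; }
matches_strict() { [[ $1 =~ $strict ]]; }

# Illustrative values: a clean tag and one with extra text around it.
for image in '1.2.3.4-5' 'v1.2.3.4-5-rc'; do
  matches_loose  "$image" && l=yes || l=no
  matches_strict "$image" && s=yes || s=no
  echo "$image unanchored=$l anchored=$s"
done
```

The second value passes the unanchored test because it contains a matching substring, but it fails once ^ and $ are added.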


