How to Match "Any Character" in Regular Expression

How to match any character in regular expression?

Yes, you can: the dot (.) matches any character, and a quantifier after it controls how many. The common forms are:

  • . = any char except newline
  • \. = the actual dot character
  • .? = .{0,1} = match any char except newline zero or one times
  • .* = .{0,} = match any char except newline zero or more times
  • .+ = .{1,} = match any char except newline one or more times
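The list above can be checked with a short sketch. It assumes Python's built-in re module and a made-up sample string; other regex flavors may treat newlines slightly differently.

```python
import re

# Made-up sample text containing a literal dot and a newline.
text = "a.b\ncd"

# . matches any character except a newline (the default behavior).
any_char = re.findall(r".", text)

# \. matches only the literal dot character.
literal_dots = re.findall(r"\.", text)

# .? is optional: the pattern "ab.?" accepts "ab" with nothing after it.
optional_ok = re.fullmatch(r"ab.?", "ab") is not None

# .* may match the empty string; .+ needs at least one character.
star_on_empty = re.fullmatch(r".*", "") is not None
plus_on_empty = re.fullmatch(r".+", "") is not None

print(any_char, literal_dots, optional_ok, star_on_empty, plus_on_empty)
```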

How can I match anything up until this sequence of characters in a regular expression?

You didn't specify which flavor of regex you're using, but this will
work in any of the most popular ones that can be considered "complete".

/.+?(?=abc)/

How it works

The .+? part is the un-greedy version of .+ (one or more of
anything). When we use .+, the engine will basically match everything.
Then, if there is something else in the regex it will go back in steps
trying to match the following part. This is the greedy behavior,
meaning as much as possible to satisfy.

When using .+?, instead of matching everything at once and going back
for the rest of the pattern (if any), the engine matches one character
at a time until the subsequent part of the regex matches (again, if
any). This is the un-greedy behavior, meaning match as few characters
as possible to satisfy the pattern.

/.+X/  ~ "abcXabcXabcX"        /.+/  ~ "abcXabcXabcX"
          ^^^^^^^^^^^^                  ^^^^^^^^^^^^

/.+?X/ ~ "abcXabcXabcX"        /.+?/ ~ "abcXabcXabcX"
          ^^^^                          ^
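The same four cases can be reproduced in code. This is a minimal sketch assuming Python's re module; the variable names are only illustrative.

```python
import re

text = "abcXabcXabcX"

# Greedy: take as much as possible, then back off only as far as needed.
greedy_with_x = re.match(r".+X", text).group()
greedy_alone = re.match(r".+", text).group()

# Un-greedy (lazy): take as little as possible, growing one character at a time.
lazy_with_x = re.match(r".+?X", text).group()
lazy_alone = re.match(r".+?", text).group()

print(greedy_with_x, greedy_alone, lazy_with_x, lazy_alone)
```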

Following that we have (?={contents}), a zero-width assertion, more
specifically a lookahead (one kind of lookaround). This grouped
construction matches its contents, but does not count them as
characters matched (zero width). It only reports whether its contents
match or not (assertion).

Thus, in other terms the regex /.+?(?=abc)/ means:

Match as few characters as possible until an "abc" is found,
without counting the "abc".
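To see the lookahead leave "abc" out of the match, here is a small sketch. It assumes Python's re module, and the input strings are invented for illustration.

```python
import re

# Lazy "one or more of anything", stopping right before the first "abc".
pattern = re.compile(r".+?(?=abc)")

found = pattern.search("xyz123abc456")
prefix = found.group()   # the text before "abc"
stop_at = found.end()    # index where "abc" starts; "abc" is not consumed

# Without an "abc" anywhere in the input, the lookahead can never succeed.
missing = pattern.search("no marker here")

print(prefix, stop_at, missing)
```

Because .+? needs at least one character, an input that starts with "abc" and contains no later "abc" does not match either; the next answer covers that case with .*?.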

regex to match any character or none?

Use .*? instead of .+?.

+ means "1 or more"

* means "0 or more"

If you want a more efficient regex, use a negated character class [^"] instead of the lazy quantifier (the ? after *). You should also use a raw string (the r prefix) and \d for digits.

r'"[^"]*" \d{3}'
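A rough comparison of the two approaches, assuming Python's re module. The lazy pattern ".*?" \d{3} and both sample strings are assumptions made for illustration, since the original question is not shown here.

```python
import re

lazy = re.compile(r'".*?" \d{3}')
negated = re.compile(r'"[^"]*" \d{3}')

# On simple input both patterns find the same quoted fields.
simple = 'result "hello world" 200 and "bye" 404'
lazy_simple = lazy.findall(simple)
negated_simple = negated.findall(simple)

# The lazy dot may still run across a closing quote to reach the digits;
# the negated class can never cross a quote.
tricky = 'a "x" b "y" 123'
lazy_tricky = lazy.findall(tricky)
negated_tricky = negated.findall(tricky)

print(lazy_simple, negated_simple, lazy_tricky, negated_tricky)
```

Besides avoiding step-by-step expansion, the negated class also pins the match to a single quoted field, which is usually what is wanted.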

Regex match any single character (one character only)

Match any single character

  • Use the dot . character as a wildcard to match any single character.

Example regex: a.c

abc   // match
a c   // match
azc   // match
ac    // no match
abbc  // no match
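The table above can be verified with a quick sketch, assuming Python's re module. re.fullmatch is used so that the whole string has to fit the pattern.

```python
import re

pattern = re.compile(r"a.c")
samples = ["abc", "a c", "azc", "ac", "abbc"]

# Map each sample to True (match) or False (no match).
results = {s: pattern.fullmatch(s) is not None for s in samples}

for sample, ok in results.items():
    print(repr(sample), "match" if ok else "no match")
```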

Match any specific character in a set

  • Use square brackets [] to match any single character from a set.
  • Use \w to match any single alphanumeric character: 0-9, a-z, A-Z, and _ (underscore).
  • Use \d to match any single digit.
  • Use \s to match any single whitespace character.

Example 1 regex: a[bcd]c

abc   // match
acc   // match
adc   // match
ac    // no match
abbc  // no match

Example 2 regex: a[0-7]c

a0c   // match
a3c   // match
a7c   // match
a8c   // no match
ac    // no match
a55c  // no match
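Both set examples, plus the \w, \d and \s shorthands, can be tried with the sketch below. It assumes Python's re module; the helper name matches and the shorthand samples are hypothetical.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that fit the pattern from start to end."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

# Example 1: exactly one of b, c or d between a and c.
letters = matches(r"a[bcd]c", ["abc", "acc", "adc", "ac", "abbc"])

# Example 2: exactly one digit in the range 0-7 between a and c.
digits = matches(r"a[0-7]c", ["a0c", "a3c", "a7c", "a8c", "ac", "a55c"])

# Shorthands: one word character, one digit, one whitespace character.
shorthand = matches(r"\w\d\s", ["a1 ", "_9\t", "a1b", "-1 "])

print(letters, digits, shorthand)
```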

Match any character except ...

Use the hat in square brackets [^] to match any single character except for any of the characters that come after the hat ^.

Example regex: a[^abc]c

aac   // no match
abc   // no match
acc   // no match
a c   // match
azc   // match
ac    // no match
azzc  // no match

(Don't confuse the ^ here in [^] with its other usage as the start of line character: ^ = line start, $ = line end.)
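A short sketch of the negated set, assuming Python's re module. It also shows one difference from the dot: in Python, a negated class matches a newline, while the dot does not by default.

```python
import re

pattern = re.compile(r"a[^abc]c")
samples = ["aac", "abc", "acc", "a c", "azc", "ac", "azzc"]

# Keep only the samples where the middle character is not a, b or c.
matched = [s for s in samples if pattern.fullmatch(s)]

# A newline is "not a, b or c", so the negated class accepts it ...
newline_in_class = pattern.fullmatch("a\nc") is not None
# ... whereas the plain dot rejects it unless re.DOTALL is set.
newline_with_dot = re.fullmatch(r"a.c", "a\nc") is not None

print(matched, newline_in_class, newline_with_dot)
```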

Match any character optionally

Use the optional character ? after any character to specify zero or one occurrence of that character. Thus, you would use .? to match any single character optionally.

Example regex: a.?c

abc   // match
a c   // match
azc   // match
ac    // match
abbc  // no match
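The optional dot can be checked the same way. This is a minimal sketch assuming Python's re module.

```python
import re

pattern = re.compile(r"a.?c")
samples = ["abc", "a c", "azc", "ac", "abbc"]

# .? allows zero or one character between a and c, so "ac" now matches,
# but "abbc" has two characters in between and still fails.
accepted = [s for s in samples if pattern.fullmatch(s)]
rejected = [s for s in samples if not pattern.fullmatch(s)]

print(accepted, rejected)
```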

See also

  • A quick tutorial to teach you the basics of regex
  • A practice sandbox to try things out

Symbol for any number of any characters in regex?

.*

. is any char (except a newline, by default), * means repeated zero or more times.
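A small sketch of .* in practice, assuming Python's re module and an invented two-line string. It also shows the re.DOTALL flag, which lets the dot match newlines as well.

```python
import re

text = "first line\nsecond line"

# By default .* stops at the newline.
default_match = re.match(r".*", text).group()

# With re.DOTALL the dot also matches newlines, so .* takes everything.
dotall_match = re.match(r".*", text, flags=re.DOTALL).group()

# "Zero or more" means the empty string matches too.
empty_match = re.match(r".*", "").group()

print(repr(default_match), repr(dotall_match), repr(empty_match))
```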

Regular Expression to match only alphabetic characters

You may use either of these two variants:

/^[A-Z]+$/i
/^[A-Za-z]+$/

to match an input string of ASCII letters.

  • [A-Za-z] will match any ASCII letter (both lowercase and uppercase).
  • ^ and $ will make sure that nothing but these letters is matched.

Code:

preg_match('/^[A-Z]+$/i', "abcAbc^Xyz", $m);
var_dump($m);

Output:

array(0) {
}

The test case addresses the OP's comment that the match should succeed only if the input consists of one or more letters. As you can see, the match failed because of the ^ in the input string abcAbc^Xyz.

Note: The above answer only matches ASCII letters and doesn't match Unicode letters. If you want to match Unicode letters, use:

/^\p{L}+$/u

Here, \p{L} matches any kind of letter from any language.
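For comparison, here is a sketch of the same checks in Python. Python's built-in re module does not support \p{L}, so the Unicode variant below swaps in str.isalpha() as a stand-in; that substitution and the function names are assumptions for illustration, not part of the answer above.

```python
import re

ASCII_LETTERS = re.compile(r"[A-Za-z]+")

def is_ascii_letters(value):
    """True if value is one or more ASCII letters and nothing else."""
    # fullmatch plays the role of the ^ and $ anchors.
    return ASCII_LETTERS.fullmatch(value) is not None

def is_unicode_letters(value):
    """True if value is non-empty and consists only of letters."""
    # Stand-in for /^\p{L}+$/u, which the re module cannot express.
    return value.isalpha()

print(is_ascii_letters("abcAbc^Xyz"))   # the test input from above
print(is_ascii_letters("abcAbcXyz"))
print(is_unicode_letters("naïve"))
```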

Regular expression to match any character being repeated more than 10 times

The regex you need is /(.)\1{9,}/.

Test:

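#!perl
use warnings;
use strict;
my $regex = qr/(.)\1{9,}/;
# No character repeats ten times in a row here, so nothing is printed.
print "NO" if "abcdefghijklmno" =~ $regex;
# Long runs of a single character match, so both lines print "YES".
print "YES" if "------------------------" =~ $regex;
print "YES" if "========================" =~ $regex;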

Here the \1 is called a backreference. It references what is captured by the dot . inside the parentheses (.) and then the {9,} asks for nine or more of the same character. Thus this matches ten or more of any single character.

Although the above test script is in Perl, this is very standard regex syntax and should work in most regex flavors that support backreferences. In some variants you might need to use more backslashes, e.g. Emacs would make you write \(.\)\1\{9,\} here.

If a whole string should consist of 10 or more identical characters, add anchors around the pattern:

my $regex = qr/^(.)\1{9,}$/;
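The same pattern carries over to other languages. As a sketch, assuming Python's re module and invented test strings:

```python
import re

# One captured character followed by nine or more copies of it.
repeated = re.compile(r"(.)\1{9,}")

no_run = repeated.search("abcdefghijklmno")
dash_run = repeated.search("ab" + "-" * 24 + "cd")

# The boundary: nine in a row is not enough, ten is.
nine = repeated.search("=" * 9)
ten = repeated.search("=" * 10)

# Anchored variant: the whole string must be a single repeated character.
whole = re.compile(r"^(.)\1{9,}$")
mixed = whole.search("-" * 12 + "x")

print(no_run, dash_run.group(), nine, ten.group(), mixed)
```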

Matching special characters and letters in regex

Add them to the allowed characters, but you'll need to escape some of them, such as -]/\

var pattern = /^[a-zA-Z0-9!@#$%^&*()_+\-=\[\]{};':"\\|,.<>\/?]*$/

That way you can remove any individual character you want to disallow.

Also, you want to include the start-of-string and end-of-string anchors ^ and $, so that the whole input has to match.

Update:

As elclanrs understood (and the rest of us didn't, initially), the only special characters needing to be allowed in the pattern are &-._

/^[\w&.\-]+$/

[\w] is the same as [a-zA-Z0-9_]

Though the dash doesn't need escaping when it's at the start or end of the list, I prefer to do it in case other characters are added. Additionally, the + means you need at least one of the listed characters. If zero is OK (i.e. an empty value), then replace it with a * instead:

/^[\w&.\-]*$/
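A sketch of the same validation in Python, assuming the re module. One assumption to note: Python's \w matches Unicode word characters by default, so the re.ASCII flag is used here to keep it equal to [a-zA-Z0-9_]. The function name is hypothetical.

```python
import re

# Word characters plus &, . and -; at least one character required.
ALLOWED = re.compile(r"[\w&.\-]+", flags=re.ASCII)

def is_allowed(value):
    """True if value consists only of the allowed characters."""
    # fullmatch stands in for the ^ and $ anchors.
    return ALLOWED.fullmatch(value) is not None

print(is_allowed("AT&T_co.-1"))    # only allowed characters
print(is_allowed("hello world"))   # contains a space
print(is_allowed(""))              # empty; would pass with * instead of +
```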

