Regular expression to match A, AB, ABC, but not AC. (starts with)
Try this regular expression:
^(A(B(C)?)?)?$
I think you can see the pattern and expand it for ABCD and ABCDE like this:
^(A(B(C(D)?)?)?)?$
^(A(B(C(D(E)?)?)?)?)?$
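The expansion can also be generated mechanically. Here is a minimal JavaScript sketch (the helper name nestedPrefixPattern is made up for this illustration, and it assumes every part is a literal string) that builds the nested pattern from a list of parts and tries a few inputs:

```javascript
// Hypothetical helper: builds ^(A(B(C)?)?)?$ from ["A", "B", "C"].
function nestedPrefixPattern(parts) {
  // Escape regex metacharacters so each part is matched literally.
  const escape = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  let body = "";
  // Wrap from the innermost (last) part outwards.
  for (let i = parts.length - 1; i >= 0; i--) {
    body = "(" + escape(parts[i]) + body + ")?";
  }
  return new RegExp("^" + body + "$");
}

const re = nestedPrefixPattern(["A", "B", "C", "D"]);
console.log(re.source); // ^(A(B(C(D)?)?)?)?$
for (const s of ["A", "AB", "ABC", "ABCD", "AC", "ABD", "B"]) {
  console.log(s, re.test(s));
}
```

Only the first four inputs should report true; AC, ABD and B skip a required earlier part.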
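Now each part depends on the preceding parts (B depends on A, C depends on B, etc.). Note that the outermost ? also lets the empty string match; use ^A(B(C)?)?$ if at least A is required.
Regular expression not allowing a and c to be next to each other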
Here is a much simpler, more straightforward regex. Rather than trying to exclude the pattern, you can match the unwanted strings and ignore them, as in the following example:
public class AdjacentAc {
    public static void main(String[] args) {
        String[] str = {
            "a", "b", "c", "ba", "ca", "ab", "cb", "ac", "bc", "baa",
            "caa", "aba", "cba", "aca", "bca", "bab", "cab", "abb",
            "cbb", "acb", "bcb", "bac", "cac", "abc", "cbc", "acc", "bcc"
        };
        // matches() must match the whole string, so .? allows at most one extra character
        for (String s : str) {
            if (s.matches("ac.?|.?ac|ca.?|.?ca")) {
                System.out.println("MATCH: " + s);
            } else {
                System.out.println(s);
            }
        }
    }
}
This produces the following output:
a
b
c
ba
MATCH: ca
ab
cb
MATCH: ac
bc
baa
MATCH: caa
aba
cba
MATCH: aca
MATCH: bca
bab
MATCH: cab
abb
cbb
MATCH: acb
bcb
MATCH: bac
MATCH: cac
abc
cbc
MATCH: acc
bcc
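Because .? allows at most one extra character, the alternation above only flags strings of two or three characters. For longer input, one possible generalization (an assumption on my part, not part of the answer above) is a negative lookahead that rejects "ac" or "ca" anywhere in a single-line string. A small JavaScript sketch comparing the two, with made-up variable names:

```javascript
// Same alternation as the Java example; ^...$ mimics String.matches(),
// which has to match the entire string.
const shortOnly = /^(?:ac.?|.?ac|ca.?|.?ca)$/;

// Assumed generalization: fail if "ac" or "ca" occurs anywhere in the string.
const noAdjacentAC = /^(?!.*(?:ac|ca)).*$/;

for (const s of ["abc", "bac", "cab", "abcb", "bbacbb"]) {
  console.log(s,
    "| flagged by short pattern:", shortOnly.test(s),
    "| passes lookahead:", noAdjacentAC.test(s));
}
```

For "bbacbb" the short pattern does not flag anything because the string is too long for it, while the lookahead version still rejects it.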
Regular expression for only characters a-z, A-Z
/^[a-zA-Z]*$/
Change the * to + if you don't want to allow empty matches.
References: Character classes ([...]), Anchors (^ and $), Repetition (+, *)
The / characters are just delimiters; they denote the start and the end of the regex. One use of this is that you can then apply modifiers (flags) to it.
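A short JavaScript sketch of the difference between * and +, plus one modifier placed after the closing delimiter (the variable names are only for illustration):

```javascript
const allowEmpty = /^[a-zA-Z]*$/; // * : zero or more letters
const nonEmpty = /^[a-zA-Z]+$/;   // + : at least one letter
const withFlag = /^[a-z]+$/i;     // the i modifier makes a-z match A-Z as well

console.log(allowEmpty.test(""));     // true
console.log(nonEmpty.test(""));       // false
console.log(nonEmpty.test("Hello"));  // true
console.log(nonEmpty.test("Hello1")); // false
console.log(withFlag.test("HeLLo"));  // true
```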
JavaScript Regular Expressions /ab*c/
"*" means "Matches the preceding expression 0 or more times". So /ab*c/ matches any string that contains an a, followed by zero or more b's, followed by a c: "ac" (b zero times in this case), "abc", "abbc", and so on.
Regular expression for a-b, a-c but not a-a?
Note that \w already matches \d and _, so \w[\w\d_]+ = \w{2,}.
You can capture the first "word" (before ::) and check with a negative lookahead that the "word" after :: is not equal to it:
\b(\w+)::(?!\b\1\b)\w+\b
See the regex demo.
Explanation:
\b - leading word boundary
(\w+) - Group 1: one or more alphanumeric and underscore characters
:: - 2 consecutive colons
(?!\b\1\b) - the next "word" cannot be the same as the value in Group 1
\w+\b - one or more alphanumeric and underscore characters followed by a trailing word boundary.
To require at least two characters in each word, as \w{2,} does, use:
\b(\w{2,})::(?!\b\1\b)\w{2,}\b
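JavaScript supports backreferences, word boundaries and negative lookaheads, so the first pattern can be tried there directly. A quick sketch with made-up sample strings:

```javascript
// Group 1 captures the word before "::"; the lookahead rejects an identical word after it.
const differentWords = /\b(\w+)::(?!\b\1\b)\w+\b/;

for (const s of ["foo::bar", "foo::foo", "foo::foobar", "a::b", "a::a"]) {
  console.log(s, differentWords.test(s));
}
```

Note that "foo::foobar" still matches: the \b after \1 inside the lookahead means only an identical whole word is rejected, not a word that merely starts with the same characters.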
Regex that matches xa?b?c? but not x alone
Here’s the shortest version:
(a)?(b)?(c)?(?(1)|(?(2)|(?(3)|(*FAIL))))
If you need to keep around the match in a separate group, write this:
((a)?(b)?(c)?)(?(2)|(?(3)|(?(4)|(*FAIL))))
But that isn’t very robust in case a, b, or c contain capture groups. So instead write this:
(?<A>a)?(?<B>b)?(?<C>c)?(?(<A>)|(?(<B>)|(?(<C>)|(*FAIL))))
And if you need a group for the whole match, then write this:
(?<M>(?<A>a)?(?<B>b)?(?<C>c)?(?(<A>)|(?(<B>)|(?(<C>)|(*FAIL)))))
And if like me you prefer multi-lettered identifiers and also think this sort of thing is insane without being in /x mode, write this:
(?x)
(?<Whole_Match>
    (?<Group_A> a) ?
    (?<Group_B> b) ?
    (?<Group_C> c) ?
    (?(<Group_A>)             # Succeed
      | (?(<Group_B>)         # Succeed
          | (?(<Group_C>)     # Succeed
              | (*FAIL)
            )
        )
    )
)
And here is the full testing program to prove that those all work:
#!/usr/bin/perl
use 5.010_000;
my @pats = (
    qr/(a)?(b)?(c)?(?(1)|(?(2)|(?(3)|(*FAIL))))/,
    qr/((a)?(b)?(c)?)(?(2)|(?(3)|(?(4)|(*FAIL))))/,
    qr/(?<A>a)?(?<B>b)?(?<C>c)?(?(<A>)|(?(<B>)|(?(<C>)|(*FAIL))))/,
    qr/(?<M>(?<A>a)?(?<B>b)?(?<C>c)?(?(<A>)|(?(<B>)|(?(<C>)|(*FAIL)))))/,
    qr{
        (?<Whole_Match>
            (?<Group_A> a) ?
            (?<Group_B> b) ?
            (?<Group_C> c) ?
            (?(<Group_A>)             # Succeed
              | (?(<Group_B>)         # Succeed
                  | (?(<Group_C>)     # Succeed
                      | (*FAIL)
                    )
                )
            )
        )
    }x,
);
for my $pat (@pats) {
    say "\nTESTING $pat";
    $_ = "i can match bad crabcatchers from 34 bc and call a cab";
    while (/$pat/g) {
        say "$`<$&>$'";
    }
}
All five versions produce this output:
i <c>an match bad crabcatchers from 34 bc and call a cab
i c<a>n match bad crabcatchers from 34 bc and call a cab
i can m<a>tch bad crabcatchers from 34 bc and call a cab
i can mat<c>h bad crabcatchers from 34 bc and call a cab
i can match <b>ad crabcatchers from 34 bc and call a cab
i can match b<a>d crabcatchers from 34 bc and call a cab
i can match bad <c>rabcatchers from 34 bc and call a cab
i can match bad cr<abc>atchers from 34 bc and call a cab
i can match bad crabc<a>tchers from 34 bc and call a cab
i can match bad crabcat<c>hers from 34 bc and call a cab
i can match bad crabcatchers from 34 <bc> and call a cab
i can match bad crabcatchers from 34 bc <a>nd call a cab
i can match bad crabcatchers from 34 bc and <c>all a cab
i can match bad crabcatchers from 34 bc and c<a>ll a cab
i can match bad crabcatchers from 34 bc and call <a> cab
i can match bad crabcatchers from 34 bc and call a <c>ab
i can match bad crabcatchers from 34 bc and call a c<ab>
Sweet, eh?
EDIT: For the x in the beginning part, just put whatever x you want at the start of the match, before the very first optional capture group for the a part, so like this:
x(a)?(b)?(c)?(?(1)|(?(2)|(?(3)|(*FAIL))))
or like this:
(?x)                          # enable non-insane mode
(?<Whole_Match>
    x                         # first match some leader string
    # now match a, b, and c, in that order, and each optional
    (?<Group_A> a ) ?
    (?<Group_B> b ) ?
    (?<Group_C> c ) ?
    # now make sure we got at least one of a, b, or c
    (?(<Group_A>)             # SUCCEED!
      | (?(<Group_B>)         # SUCCEED!
          | (?(<Group_C>)     # SUCCEED!
              | (*FAIL)
            )
        )
    )
)
The test sentence was constructed without the x part, so it won’t work for that, but I think I’ve shown how I mean to go at this. Note that all of x, a, b, and c can be arbitrarily complex patterns (yes, even recursive), not merely single letters, and it doesn’t matter if they use numbered capture groups of their own, even.
If you want to go at this with lookaheads, you can do this:
(?x)
(?(DEFINE)
    (?<Group_A> a)
    (?<Group_B> b)
    (?<Group_C> c)
)
x
(?= (?&Group_A)
  | (?&Group_B)
  | (?&Group_C)
)
(?&Group_A) ?
(?&Group_B) ?
(?&Group_C) ?
And here is what to add to the @pats array in the test program to show that this approach also works:
qr{
    (?(DEFINE)
        (?<Group_A> a)
        (?<Group_B> b)
        (?<Group_C> c)
    )
    (?= (?&Group_A)
      | (?&Group_B)
      | (?&Group_C)
    )
    (?&Group_A) ?
    (?&Group_B) ?
    (?&Group_C) ?
}x
You’ll notice please that I still manage never to repeat any of a, b, or c, even with the lookahead technique.
Do I win? ☺
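For engines without conditionals, (*FAIL) or (?(DEFINE)...), such as JavaScript, only the lookahead idea carries over. Here is a rough sketch of that, not a port of the Perl patterns above: the subpatterns are kept in strings (the names a, b and c are stand-ins) and interpolated, so each is still written only once in the source.

```javascript
// Stand-ins for the subpatterns; each could be a longer pattern.
const a = "a", b = "b", c = "c";

// x, then a lookahead requiring a, b, or c to follow, then each one optional, in order.
const re = new RegExp(
  `x(?=(?:${a})|(?:${b})|(?:${c}))(?:${a})?(?:${b})?(?:${c})?`
);

for (const s of ["x", "xa", "xab", "xabc", "xbc", "xc", "xd"]) {
  const m = re.exec(s);
  console.log(s, m ? m[0] : null);
}
```

A bare "x" and "xd" yield null, while every other sample matches in full.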
Partial matching a string against a regex
Looks like you're lucky, I've already implemented that stuff in JS (which works for most patterns - maybe that'll be enough for you). See my answer here. You'll also find a working demo there.
There's no need to duplicate the full code here, I'll just state the overall process:
- Parse the input regex, and perform some replacements. There's no need for error handling as you can't have an invalid pattern in a RegExp object in JS.
- Replace abc with (?:a|$)(?:b|$)(?:c|$)
- Do the same for any "atoms". For instance, a character group [a-c] would become (?:[a-c]|$)
- Keep anchors as-is
- Keep negative lookaheads as-is
^(\w+)\s+\1$
against hello hel
).
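To make the abc replacement concrete, here is a toy JavaScript sketch. It is not the implementation referred to above: it assumes the pattern is a plain literal string, treats every character as one atom, and always anchors both ends. The function name partialMatcher is made up.

```javascript
// Toy version: treats every character of `literal` as one atom.
function partialMatcher(literal) {
  const escape = (ch) => ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Each atom may either match itself or hit the end of the input.
  const atoms = Array.from(literal, (ch) => `(?:${escape(ch)}|$)`);
  // Anchored on both sides so the whole input has to be a prefix of a full match.
  return new RegExp("^" + atoms.join("") + "$");
}

const re = partialMatcher("abc");
console.log(re.source); // ^(?:a|$)(?:b|$)(?:c|$)$
for (const s of ["", "a", "ab", "abc", "abd", "b", "abcd"]) {
  console.log(JSON.stringify(s), re.test(s));
}
```

The inputs "", "a", "ab" and "abc" are accepted as partial matches; "abd", "b" and "abcd" are rejected because they can never be extended into a match of abc.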