Regular expression to match a line that doesn't contain a word
The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:
^((?!hede).)*$
Non-capturing variant:
^(?:(?!hede).)*$
The regex above will match any string, or line without a line break, not containing the (sub)string 'hede'. As mentioned, this is not something regex is "good" at (or should do), but still, it is possible.
And if you need to match line break chars as well, use the DOT-ALL modifier (the trailing s in the following pattern):
/^((?!hede).)*$/s
or use it inline:
/(?s)^((?!hede).)*$/
(where the /.../ are the regex delimiters, i.e., not part of the pattern)
If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:
/^((?!hede)[\s\S])*$/
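A minimal sketch of the three variants in Python, assuming the built-in re module. The helper name lacks_word and the sample strings are made up for illustration. Python has no /.../ delimiters, so the DOT-ALL modifier is passed as the re.DOTALL flag instead of a trailing s.

```python
import re

# Tempered-dot pattern from above (non-capturing variant).
NO_HEDE = re.compile(r'^(?:(?!hede).)*$')
# Same pattern with DOT-ALL, so the dot also matches line breaks.
NO_HEDE_DOTALL = re.compile(r'^(?:(?!hede).)*$', re.DOTALL)
# Same idea without DOT-ALL: [\s\S] matches any character, line breaks included.
NO_HEDE_CLASS = re.compile(r'^(?:(?!hede)[\s\S])*$')

def lacks_word(pattern, text):
    """Return True if the whole text matches, i.e. 'hede' does not occur in it."""
    return pattern.match(text) is not None

print(lacks_word(NO_HEDE, "hoho"))             # True
print(lacks_word(NO_HEDE, "ABhedeCD"))         # False
print(lacks_word(NO_HEDE, "ho\nho"))           # False: the dot stops at the line break
print(lacks_word(NO_HEDE_DOTALL, "ho\nho"))    # True
print(lacks_word(NO_HEDE_CLASS, "ho\nhede"))   # False
```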
Explanation
A string is just a list of n characters. Before, and after each character, there's an empty string. So a list of n characters will have n+1 empty strings. Consider the string "ABhedeCD":
    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘
index    0      1      2      3      4      5      6      7
where the e's are the empty strings. The regex (?!hede). looks ahead to see if there's no substring "hede" to be seen, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width assertions because they don't consume any characters. They only assert/validate something.
So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group and repeated zero or more times, which gives ((?!hede).)* as the pattern. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$
As you can see, the input "ABhedeCD" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
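To make the walk over the empty strings concrete, here is a small Python sketch (assuming the built-in re module) that tries the bare lookahead (?!hede) at every position of "ABhedeCD". The labels e1..e9 follow the diagram above; the variable names are only illustrative.

```python
import re

lookahead = re.compile(r'(?!hede)')
s = "ABhedeCD"

# Try the zero-width assertion at every position e1..e9 (string index 0..8).
results = [lookahead.match(s, i) is not None for i in range(len(s) + 1)]

for i, ok in enumerate(results):
    status = "ok" if ok else "fails: 'hede' is up ahead"
    print(f"e{i + 1}: {status}")
```

Only e3 fails, which is enough to make the anchored pattern ^((?!hede).)*$ fail for the whole input.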
Match string if it does not contain a specific word
You need the following regex
/^(?!.*TOTO)(.*)$/s
Try it at Regex 101 to get an explanation as to what it does
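For a quick local check, the same pattern can be tried in Python. This is only a sketch: it assumes the built-in re module, uses re.DOTALL in place of the trailing s, and the helper name match_without_toto and the sample strings are invented for the example.

```python
import re

# A single lookahead at the start scans the whole input once for TOTO.
pattern = re.compile(r'^(?!.*TOTO)(.*)$', re.DOTALL)

def match_without_toto(text):
    """Return the captured text, or None when TOTO occurs anywhere in it."""
    m = pattern.match(text)
    return m.group(1) if m else None

print(match_without_toto("first line\nsecond line"))  # prints both lines
print(match_without_toto("first line\nTOTO here"))    # None
```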
How to match a text in a string, that does not contain a specific word and does not contain a word with letters and digits?
Perhaps this will match your values using a word boundary and a negative lookahead:
\b(?!\w*abc)[^\W\d]+\b
\b Word boundary
(?!\w*abc) Assert what is on the right does not contain abc
[^\W\d]+ Negated character class, match 1+ times a word character except a digit
\b Word boundary
Regex demo
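A short Python sketch of how this pattern behaves, assuming the built-in re module. The sample text is made up; it contains a plain word, two words containing abc, and one word with digits.

```python
import re

pattern = re.compile(r'\b(?!\w*abc)[^\W\d]+\b')
text = "hello abcdef xabcx test123 world"

# Words containing "abc" are rejected by the lookahead; "test123" is rejected
# because [^\W\d]+ cannot reach a word boundary without crossing a digit.
words = pattern.findall(text)
print(words)  # ['hello', 'world']
```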
How to match all strings found unless it contains a specific word?
TL;DR Full Regex
http.{5,10}(?:media.tumblr)(?:(?!avatar).)+?(?:png|jpg|jpeg|gif|swf)
Why it fails
.+?(?!avatar).+?<anything else>
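The first .+? matches one character (because it is lazily quantified).
If the string avatar is found next, the lookahead fails at that position, so the engine backtracks and the first .+? also matches the a of avatar. The lookahead then sees vatar and succeeds.
The second .+? matches everything else until <anything else> can be matched.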
A solution
Replace the part with
(?:(?!avatar).)+?<anything else>
Why it works
(?!avatar). matches a single character that is not the start of the string avatar.
The part (?:(?!avatar).)+? (lazily) matches all characters that fulfill this property. And if none of the matched characters is the starting character of avatar, then that string cannot be contained in the match.
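The difference can be checked with a small Python sketch (built-in re module). Note the assumptions: the "broken" pattern is reconstructed here by putting .+?(?!avatar).+? into the full regex, and both URLs are invented samples.

```python
import re

EXT = r'(?:png|jpg|jpeg|gif|swf)'
# Assumed reconstruction of the failing attempt.
broken = re.compile(r'http.{5,10}(?:media.tumblr).+?(?!avatar).+?' + EXT)
# The tempered-dot version from the TL;DR.
fixed = re.compile(r'http.{5,10}(?:media.tumblr)(?:(?!avatar).)+?' + EXT)

urls = [
    "http://25.media.tumblr.com/tumblr_xyz.jpg",
    "http://25.media.tumblr.com/avatar_abc.png",
]

broken_hits = [u for u in urls if broken.search(u)]
fixed_hits = [u for u in urls if fixed.search(u)]

print(len(broken_hits))  # 2: the avatar URL slips through
print(fixed_hits)        # only the tumblr_xyz.jpg URL
```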
Regex: Match word not containing
Your ^((?!Drive).)*$ did not work at all because you tested against a multiline input.
You should use the /m modifier to see what the regex matches. It just matches lines that do not contain Drive, but that tempered greedy token does not check if EFI is inside the string.
Actually, the $ anchor is redundant here since .* matches any zero or more characters other than line break characters. You may simply remove it from your pattern.
(NOTE: In .NET, you will need to use [^\r\n]* instead of .* since . in a .NET pattern matches any char but a newline (LF) char, so it also matches all other line break chars, like a carriage return, CR, etc.).
Use something like
^(?!.*Drive).*EFI.*
Or, if you need to only fail the match if Drive is present as a whole word:
^(?!.*\bDrive\b).*EFI.*
Or, if there are more words you want to signal the failure with:
^(?!.*(?:Drive|SomethingElse)).*EFI.*
^(?!.*\b(?:Drive|SomethingElse)\b).*EFI.*
See regex demo
Here,
^ - matches start of string
(?!.*Drive) - makes sure there is no "Drive" in the string (so, Drives are NOT allowed)
(?!.*\bDrive\b) - makes sure there is no "Drive" as a whole word in the string (so, Drives are allowed)
.* - any 0+ chars other than line break chars, as many as possible
EFI - an EFI substring
.* - any 0+ chars other than line break chars, as many as possible.
If your string has newlines, either use a /s dotall modifier or replace . with [\s\S].
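As a rough illustration in Python (built-in re module, with re.MULTILINE standing in for the /m modifier), the two variants can be compared on a multiline input. The sample lines are made up for the example.

```python
import re

text = "EFI System Partition\nEFI Drive\nDrives with EFI\nBoot"

# Rejects any line containing "Drive", including "Drives".
any_drive = re.compile(r'^(?!.*Drive).*EFI.*', re.MULTILINE)
# Rejects only lines containing "Drive" as a whole word.
whole_word = re.compile(r'^(?!.*\bDrive\b).*EFI.*', re.MULTILINE)

any_drive_lines = any_drive.findall(text)
whole_word_lines = whole_word.findall(text)

print(any_drive_lines)   # ['EFI System Partition']
print(whole_word_lines)  # ['EFI System Partition', 'Drives with EFI']
```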
Regex: How to find substring that does NOT contain a certain word
Using a tempered dot, we can try:
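import re  # needed for re.findall

string = "STARTcandyFINISH STARTsugarFINISH STARTpoisonFINISH STARTBlobpoisonFINISH STARTpoisonBlobFINISH"
# Capture the text between START and FINISH unless it contains "poison"
matches = re.findall(r'START((?:(?!poison).)*?)FINISH', string)
print(matches)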
This prints:
['candy', 'sugar']
For an explanation of how the regex pattern works, we can have a closer look at:
(?:(?!poison).)*?
This uses a tempered dot trick. It will match, one character at a time, so long as what follows is not poison.
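To see what the tempered dot adds, it may help to compare it with a plain lazy dot on the same input. A small sketch, assuming Python's built-in re module:

```python
import re

string = "STARTcandyFINISH STARTsugarFINISH STARTpoisonFINISH STARTBlobpoisonFINISH STARTpoisonBlobFINISH"

# A plain lazy dot accepts any content between the markers.
plain = re.findall(r'START(.*?)FINISH', string)
# The tempered dot refuses to step onto the start of "poison".
tempered = re.findall(r'START((?:(?!poison).)*?)FINISH', string)

print(plain)     # ['candy', 'sugar', 'poison', 'Blobpoison', 'poisonBlob']
print(tempered)  # ['candy', 'sugar']
```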
Regex to match a string which does not contain a specific word next to the match string
I want regex which does not contain not(in first string), I want to match only 2nd string.
That means you should check if the This is... pattern is not followed by newline sequence + spaces* + not as a whole word with backtracking disabled. We can disable backtracking using an atomic group in .NET:
(?>This\s+is(?:\s+\d+)+ *)(?![\r\n]+\p{Zs}*not\b)
See the regex demo
Part 1 of the regex, This\s+is(?:\s+\d+)+ *, matches This is followed with one or more sequences of one or more whitespaces followed with one or more digits, then followed with zero or more spaces. The (?>...) prevents backtracking inside this part of the pattern. The lookahead (?![\r\n]+\p{Zs}*not\b) fails the match if the previously matched text is followed with one or more line break chars, then zero or more spaces, then the whole word not (where \b stands for a word boundary).
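The effect of disabling backtracking can be sketched in Python, with two substitutions that are not part of the .NET answer: the atomic group is emulated with a capturing lookahead plus a backreference, (?=(...))\1, because older versions of Python's re module have no (?>...), and [ \t] stands in for \p{Zs}, which re does not support. The sample strings are invented.

```python
import re

BODY = r'This\s+is(?:\s+\d+)+ *'
TAIL = r'(?![\r\n]+[ \t]*not\b)'  # [ \t] is a stand-in for .NET's \p{Zs}

# Without an atomic group the engine may give back part of BODY.
backtracking = re.compile(BODY + TAIL)
# Emulated atomic group: capture inside a lookahead, then consume it via \1.
atomic_like = re.compile(r'(?=(' + BODY + r'))\1' + TAIL)

first = "This is 1 2 3\n   not wanted"
second = "This is 4 5 6\n   wanted"

partial = backtracking.search(first)
print(repr(partial.group(0)))          # 'This is 1 2 ' (an unwanted partial match)
print(atomic_like.search(first))       # None
print(atomic_like.search(second).group(0))  # This is 4 5 6
```

The first pattern still matches the first string by giving up the last number, which is exactly what the atomic group is there to prevent.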
How to match a range of string that doesn't contain a specific word using only regular expression?
This seems to work:
var regex = /\[(?!dog)([a-z]+) (?!dog)([a-z]+)\]/gi;
var string = "[cat dog] [dog cow] [cow cat] [cat tiger] [tiger lion] [monkey dog]";
console.log(string.match(regex));
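For comparison, a Python sketch of the same pattern (built-in re module, with re.IGNORECASE for the i flag and finditer for the g flag). One thing worth keeping in mind: (?!dog) only checks the start of each word, so the last two inputs below, which are made up, behave differently from what "does not contain dog" might suggest.

```python
import re

regex = re.compile(r'\[(?!dog)([a-z]+) (?!dog)([a-z]+)\]', re.IGNORECASE)
string = "[cat dog] [dog cow] [cow cat] [cat tiger] [tiger lion] [monkey dog]"

# group(0) is the whole bracketed pair, like the entries of string.match(regex).
pairs = [m.group(0) for m in regex.finditer(string)]
print(pairs)  # ['[cow cat]', '[cat tiger]', '[tiger lion]']

# The lookahead is anchored at the start of each word:
print(regex.search("[hotdog cat]") is not None)  # True: "dog" is not at the start
print(regex.search("[dogma cat]") is not None)   # False: the word starts with "dog"
```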
Regular expression to match strings that do NOT contain all specified elements
Nice question. It looks like you are looking for some AND logic. I am sure someone can come up with something better, but I thought of two ways:
^(?=(?!.*\btwo\b)|(?!.*\bthree\b)).*$
See the online demo
Or:
^(?=.*\btwo\b)(?=.*\bthree\b)(*SKIP)(*F)|^.*$
See the online demo
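In both cases we are using positive lookahead to mimic the AND logic to prevent both words being present in a text irrespective of their position in the full string. If just one of those words is present, the string will pass. Note that (*SKIP)(*F) in the second pattern are PCRE backtracking control verbs, so that variant only works in flavors that support them.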
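A quick Python check of the first pattern, assuming the built-in re module. The second pattern is left out because re has no (*SKIP)(*F) verbs. The sample strings are made up.

```python
import re

# Passes when at least one of the two words is missing.
not_both = re.compile(r'^(?=(?!.*\btwo\b)|(?!.*\bthree\b)).*$')

samples = ["one two three", "three then two", "one two", "three four", "one"]
passing = [s for s in samples if not_both.match(s)]

print(passing)  # ['one two', 'three four', 'one']
```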
How to match a line not containing a word
This should work:
/^((?!PART).)*$/
Edit (by request): How this works
The (?!...) syntax is a negative lookahead, which I've always found tough to explain. Basically, it means "whatever follows this point must not match the regular expression /PART/." The site I've linked explains this far better than I can, but I'll try to break this down:
^         #Start matching from the beginning of the string.
(?!PART)  #This position must not be followed by the string "PART".
.         #Matches any character except line breaks (it will include those in single-line mode).
$         #Match all the way until the end of the string.
The ((?!xxx).)* idiom is probably hardest to understand. As we saw, (?!PART) looks at the string ahead and says that whatever comes next can't match the subpattern /PART/. So what we're doing with ((?!xxx).)* is going through the string letter by letter and applying the rule to all of them. Each character can be anything, but if you take that character and the next few characters after it, you'd better not get the word PART.
The ^ and $ anchors are there to demand that the rule be applied to the entire string, from beginning to end. Without those anchors, any piece of the string that didn't begin with PART would be a match. Even PART itself would have matches in it, because (for example) the text starting at the letter A is ART, not the exact string PART.
Since we do have ^ and $, if PART were anywhere in the string, the (?!PART) check would fail at the position where PART begins, and the overall match would fail. Hope that's clear enough to be helpful.
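The role of the anchors can be checked with a short Python sketch (built-in re module). The unanchored variant uses + instead of * purely so the example does not report an empty match; that change and the sample strings are assumptions made for the illustration.

```python
import re

anchored = re.compile(r'^((?!PART).)*$')
# Unanchored variant, with + so that at least one character must be matched.
unanchored = re.compile(r'((?!PART).)+')

clean_ok = anchored.match("SOME TEXT") is not None
dirty_ok = anchored.match("A PART HERE") is not None
inside_part = unanchored.search("PART").group(0)

print(clean_ok)     # True
print(dirty_ok)     # False
print(inside_part)  # ART: without anchors, a piece of PART itself still matches
```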