This Regex Doesn't Work in C++

Regex doesn't work as expected

The regex you're looking for is

#define NICK_PATTERN "^[]a-z0-9|{}^[]{4,36}$"

and you want to compile it like so:

regcomp(&re, pattern, REG_ICASE | REG_EXTENDED | REG_NOSUB)

regcomp, regexec etc. use the POSIX regular expression syntax, which means slightly different escaping rules, no slashes for matching, and no flags like in Perl -- these are replaced, in a manner of speaking, by the functions and their flags. There's a rundown of POSIX syntax here. It is written for a C++ interface, but the rules are the same.

Differences from what you expected in this particular case: You don't have to escape in a [] set, and if there's a ] to be part of the set, it has to come at the front.

Regex doesn't match in c but works with online interpreter

You need to escape your backslashes. Change

char * regex = "<sql\s+db=(.+)\s+query=(.+;)\s*\\>";

to

char * regex = "<sql\\s+db=(.+)\\s+query=(.+;)\\s*\\\\>";

Note that this is extremely inefficient. A much more efficient regex uses non-greedy quantification, with ?:

<sql\s+db=(.+?)\s+query=(.+;)\s*\\>
// ^ key change

That becomes:

char * regex = "<sql\\s+db=(.+?)\\s+query=(.+;)\\s*\\\\>";

Also note: Your string to be matched also includes \. You need to escape it there, too:

char * str = "<sql db=../serverTcp/Testing.db query=SELECT * From BuyMarsians;\\>";

Here's a working demo of your corrected code.

regular expression doesnt work while execute in c

Your pattern contains a $ anchor, capturing groups with (...) and the interval quantifier {m,n}, so you need to pass REG_EXTENDED to the regex compile method:

regex_t regex;
int reti;
reti = regcomp(®ex, "([0-9a-fA-F]{8}( |$))+$", REG_EXTENDED); // <-- See here
if (reti)
{
fprintf(stderr, "Could not compile regex\n");
exit(1);
}

reti = regexec(®ex, "00001206 ffffff00 00200800 00001044", 0, NULL, 0);
if (!reti)
{
printf("Match");
}
else if (reti == REG_NOMATCH) {
printf("No match bla bla\n");
}

See the online C demo printing Match.

However, I believe you need to match the entire string, and disallow whitespace at the end, so probably

reti = regcomp(®ex, "^[0-9a-fA-F]{8}( [0-9a-fA-F]{8})*$", REG_EXTENDED);

will be more precise as it will not allow any arbitrary text in front and won't allow a trailing space.

When using regex in C, \d does not work but [0-9] does

The regex flavor you're using is GNU ERE, which is similar to POSIX ERE, but with a few extra features. Among these are support for the character class shorthands \s, \S, \w and \W, but not \d and \D. You can find more info here.

This regex doesn't work in c++

Your problem is your backslashes are escaping the '1''s in your string. You need to inform std::regex to treat them as '\' 's. You can do this by using a raw string R"((.+)\1\1+)", or by escaping the slashes, as shown here:

#include <regex>
#include <string>
#include <iostream>

int main(){

std::string s ("xaxababababaxax");
std::smatch m;
std::regex e ("(.+)\\1\\1+");

while (std::regex_search (s,m,e)) {
for (auto x:m) std::cout << x << " ";
std::cout << std::endl;
s = m.suffix().str();
}

return 0;
}

Which produces the output

abababab ab 

Why my regex doesn't work in c++ but it works with Python?

\ is used as a escape sequence in C++. You have to write \ as \\ to pass it to regex engine.

    regex rgx("([0-9]+|[\\/*+-])\\s+"); 

Another option is using raw string literal (since C++11):

    regex rgx(R"(([0-9]+|[\/*+-])\s+)"); 

Regex pattern not working on my C# code however it works on an online tester

The lookbehind (?<=\bAmount.*) is always true in the example data after Amount.

The first 7 matches are empty, as (\d+\,*\.*)* can not consume a character where there is no digit, but it matches at the position as the quantifier * matches 0 or more times.

See this screenshot of the matches:

Sample Image

You might use

(?<=\bAmount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?\b
  • (?<=\bAmount\D*) Positive lookbehind, assert Amount to the left followed by optional non digits
  • \d{1,3} Match 1-3 digits
  • (?:,\d{3})* Optionally repeat , and 3 digits
  • (?:\.\d{1,2})? Optionally match . and 1 or 2 digits
  • \b A word boundary

See a .NET regex demo

For example

double doubleVal = 0;
var pattern = @"(?<=\bAmount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?\b";
var textToParse = "Amount : USD 3,747,190.67";
var matchPattern = Regex.Match(textToParse, pattern);
if (matchPattern.Success)
{
double.TryParse(matchPattern.Value.ToString(), out doubleVal);
}
Console.WriteLine(doubleVal);

Output

3747190.67

You can omit the word boundaries if needed if a partial match is also valid

(?<=Amount\D*)\d{1,3}(?:,\d{3})*(?:\.\d{1,2})?

| (or operator is not working in the regular expression in C language)

Your regex 1 | 2c | 3c matches 1 in I have 1 dollar, but from your comments I see you need to match whole strings, not just part of them. For that, you need to use anchors ^ (start of string) and $ (end of string) that work only when you use REG_EXTENDED flag.

When you use alternatives, you need to either repeat the anchors, or set a group with the help of parentheses:

^1$|^2c$|^3c$

Or

^(1|2c|3c)$

These expressions will safely match whole strings like 1, 2c, or 3c.

Regex C doesn't work as expected with wild card * and caret

In a nutshell, * is a quantifier in a regular expression that means 0 or more occurrences. Replacing it with .* should yield expected results.

Note that n1* also matches any n inside an input string, as it means n and an optional 1 (0 or more occurrences). n1.* already needs n1 to be present in a string in order to return a match.

run("n1.*", "to-dallas");          // => No match
run("n1.*", "newyork-to-dallas1"); // => No match
run("n1.*", "n1"); // => Match "n1"

As for newyork-to-dallas* (and newyork-to-dallas.*), it will match newyork-to-dallas1:

run("newyork-to-dallas*","newyork-to-dallas1"); // => Matches "newyork-to-dallas"
run("newyork-to-dallas.*","newyork-to-dallas1"); // => Matches "newyork-to-dallas1" as .* matches "1"

As for caret, it matches at the beginning of the string.

// Caret
run("^to-da.*", "to-dallas"); => Matches "to-dallas"
run("^to-da.*", "newyork-to-dallas1"); => No match (not at the beginning)
run("^to-da.*", "n1"); => No match

See the full demo program on CodingGround



Related Topics



Leave a reply



Submit