C++ Regular Expressions with Boost Regex

Compare two regular expressions using boost::regex (c++)

Convert the two regexes, r1 and r2, to DFA graphs g1 and g2.

Define g1' and g2' as the same graphs with the accepting states inverted.

Define a = g1 x g2' and b = g1' x g2 as product graphs, where each state is a pair that tracks the current state of both source graphs for the input read so far. A state of a or b is accepting only if both of its component states are accepting in their source graphs.

Strings accepted by a are those that r1 accepts and r2 does not.

Strings accepted by b are those that r2 accepts and r1 does not.

r1 is a subset of r2 if and only if every string r1 accepts is also accepted by r2.

So simply prove that a accepts no strings to prove r1 is a subset of r2.

If you want strict subset, also show that b accepts at least one string.

I am unaware of a way to do any of this easily with boost. I don't know if these steps qualify as "easy". I suspect not, because this problem is PSPACE-complete.
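The product-graph steps above can be sketched in plain C++ on hand-built DFAs. This is only an illustration under stated assumptions: the `Dfa` struct, `is_subset`, `is_strict_subset`, and the two sample automata (for `a+` and `[ab]+` over the alphabet {a, b}) are hypothetical names introduced here, both DFAs are assumed to be complete and to share one alphabet, and the regex-to-DFA conversion itself is not shown.

```cpp
#include <cstddef>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// A complete DFA over the alphabet {0, ..., alphabet_size - 1}.
// next[s][c] is the state reached from state s on symbol c.
struct Dfa {
    std::size_t alphabet_size;
    std::size_t start;
    std::vector<std::vector<std::size_t>> next;
    std::vector<bool> accepting;
};

// Explores the reachable part of the product g1 x g2'. Returns true when no
// reachable pair is accepting in g1 and rejecting in g2, i.e. when the
// product accepts no string.
bool is_subset(const Dfa& g1, const Dfa& g2) {
    typedef std::pair<std::size_t, std::size_t> State;
    std::set<State> seen;
    std::queue<State> todo;
    State first(g1.start, g2.start);
    seen.insert(first);
    todo.push(first);
    while (!todo.empty()) {
        State cur = todo.front();
        todo.pop();
        if (g1.accepting[cur.first] && !g2.accepting[cur.second]) {
            return false;  // some string is accepted by g1 but not by g2
        }
        for (std::size_t c = 0; c < g1.alphabet_size; ++c) {
            State nxt(g1.next[cur.first][c], g2.next[cur.second][c]);
            if (seen.insert(nxt).second) {
                todo.push(nxt);
            }
        }
    }
    return true;
}

// Strict subset: g1 is contained in g2, and g2 accepts something g1 does not.
bool is_strict_subset(const Dfa& g1, const Dfa& g2) {
    return is_subset(g1, g2) && !is_subset(g2, g1);
}

// Sample DFA for a+ (symbol 0 = 'a', symbol 1 = 'b'); state 2 is a dead state.
Dfa make_a_plus() {
    return Dfa{2, 0, {{1, 2}, {1, 2}, {2, 2}}, {false, true, false}};
}

// Sample DFA for [ab]+ over the same alphabet.
Dfa make_ab_plus() {
    return Dfa{2, 0, {{1, 1}, {1, 1}}, {false, true}};
}
```

With these samples, `is_strict_subset(make_a_plus(), make_ab_plus())` holds, because every string of a's is also a string over {a, b} while a string such as "b" is accepted only by the second automaton. The search visits at most |g1| x |g2| state pairs; the expensive part in general is building the DFAs, which can be exponentially larger than the regexes.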

My Boost regular expression is not matching anything

If the pattern does not match the string you expected it to match, the pattern itself is wrong. Here is a corrected version:

(\\w{3}) (\\d{1,2}) (\\d{2}):(\\d{2}):(\\d{2}).*SOFTLOADSERVICE;Install started\\s*

You can test your regular expressions yourself at:

https://regex101.com/

https://www.regextester.com/

https://regexr.com/

C++ Regular Expressions with Boost Regex

Perhaps you're looking for something like this. It uses regex_iterator to get all matches of the current pattern. See reference.

#include <boost/regex.hpp>
#include <iostream>
#include <string>

int main()
{
    std::string text(" 192.168.0.1 abc 10.0.0.255 10.5.1 1.2.3.4a 5.4.3.2 ");
    const char* pattern =
        "\\b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
        "\\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
    boost::regex ip_regex(pattern);

    boost::sregex_iterator it(text.begin(), text.end(), ip_regex);
    boost::sregex_iterator end;
    for (; it != end; ++it) {
        std::cout << it->str() << "\n";
        // v.push_back(it->str()); or something similar
    }
}

Output:

192.168.0.1
10.0.0.255
5.4.3.2

Side note: you probably meant \\b instead of \b. In a C++ string literal, \b is the backspace character, and I doubt you wanted to match that.
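The difference is visible without any regex library at all. The following small sketch uses two hypothetical helper functions to show what the compiler actually stores for each spelling.

```cpp
#include <string>

// "\b" in an ordinary string literal is a single character: backspace (0x08).
std::string unescaped_b() { return "\b"; }

// "\\b" is two characters, a backslash followed by 'b'. This is the text the
// regex engine needs to see in order to treat it as a word boundary.
std::string escaped_b() { return "\\b"; }
```

So a pattern written as "\b..." hands the regex engine a literal backspace, and the word-boundary assertion never reaches it.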

Recursive regular expression match with boost

You may declare the regex using a raw string literal, using R"(...)" syntax. This way, you won't have to escape backslashes twice.

Cf., these are equal declarations:

std::string my_pattern("\\w+");
std::string my_pattern(R"(\w+)");

The parentheses are not part of the regex pattern, they are raw string literal delimiter parts.

However, your regex is not quite correct: you need to recurse only the first alternative and not the whole regex.

Here is the fix:

std::string my_pattern(R"((\((?:[^()]++|(?1))*\))|\w+)");

Here, (\((?:[^()]++|(?1))*\)) matches a (, then zero or more repetitions of either 1+ chars other than ( and ) or a recursion of the whole Group 1 pattern via the (?1) regex subroutine, and then a closing ).

See the regex demo.

Boost regular expression for match whole word does not work

The boost::regex documentation states that a lookbehind needs to be a fixed length. Your lookbehind matches zero or one character.

Boost regexp match

In a C++ string literal, the \ character must itself be escaped. So if you want to escape anything in the regex, you need to write \\. That should fix the problem. Whenever you use a backslash in a string literal, you need to escape it like that. If you ever need to match a literal backslash with the regex, you'll need to write \\\\: the compiler turns it into \\, which the regex engine reads as one escaped backslash.

C# Regex to C++ boost::regex

You can use \w, \s, and \d in your regular expressions. However, that's not what you're doing; you're trying to use \w as a character in the string. For there to be a \ followed by a w in the actual string, you need to escape the \ (same for s and d, of course):

boost::regex regex("[\\.\\w],\\s*\\d{1,3},\\s*\\d{1,3},\\s*\\d{1,3}");

As of C++11, you can use raw string literals to make your code even more similar to the C# version:

boost::regex regex(R"del([\.\w],\s*\d{1,3},\s*\d{1,3},\s*\d{1,3})del");

