Understanding Boost.Spirit's String Parser

It's not the string matcher per se. It's [attribute propagation] + [backtracking] in action.

A string attribute is a container attribute and many elements could be assigned into it by different parser subexpressions. Now for efficiency reasons, Spirit doesn't rollback the values of emitted attributes on backtracking.

Often this is no problem at all, but as you can see, the 'a' from the failed first branch of the alternative sticks around.

Either reword or employ the 'big gun' qi::hold[] directive:

(qi::hold [ string("a")  >> string("a") ] | string("a")),

Rewording could look like:

qi::string("a") >> -qi::string("a"),

Also, if you're really just trying to match certain textual strings, consider:

(qi::raw [ qi::lit("aa") | "a" ]), 
// or even just
qi::string("aa") | qi::string("a"),

Now which one of these applies most, depends on your grammar.

Boost::Spirit struggle with parsing a String

I'd suggest starting out with the desired AST as always.

Spirit works well with static polymorphism, so I'd use a variant to represent commands:

namespace AST {
namespace Cmd {
struct Move { int x,y,z; };
struct Bomb { int x,y; };
struct Inc { int amount; };
struct Msg { std::string text; };
struct Wait {};
}

using Command = boost::variant<Cmd::Move, Cmd::Bomb, Cmd::Inc, Cmd::Msg, Cmd::Wait>;
using Commands = std::vector<Command>;
}

Now, write the most straightforward grammar to match it:

template <typename It>
struct ScriptGrammar : qi::grammar<It, AST::Commands()>
{
ScriptGrammar() : ScriptGrammar::base_type(start) {
using namespace qi;
start = skip(space) [ script ];
script = command % ";";
command = move|bomb|inc|msg|wait;

move = "MOVE" >> int_ >> int_ >> int_;
bomb = "BOMB" >> int_ >> int_;
inc = "INC" >> int_;
msg = "MSG" >> text;
wait = "WAIT" >> qi::attr(AST::Cmd::Wait{});

text = +~char_(";");
BOOST_SPIRIT_DEBUG_NODES((start)(script)(command)(move)(bomb)(inc)(msg)(wait)(text))
}
private:
using Skipper = qi::space_type;
qi::rule<It, AST::Commands(), Skipper> script;
qi::rule<It, AST::Command(), Skipper> command;
qi::rule<It, AST::Cmd::Move(), Skipper> move;
qi::rule<It, AST::Cmd::Bomb(), Skipper> bomb;
qi::rule<It, AST::Cmd::Inc(), Skipper> inc;
qi::rule<It, AST::Cmd::Msg(), Skipper> msg;
qi::rule<It, AST::Cmd::Wait(), Skipper> wait;
// lexeme
qi::rule<It, AST::Commands()> start;
qi::rule<It, std::string()> text;
};

Add in some glue for debug (Fusion adaptation and output streaming), and we have a working sample:

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <iomanip>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/phoenix/phoenix.hpp>

namespace AST {
namespace Cmd {
struct Move { int x,y,z; };
struct Bomb { int x,y; };
struct Inc { int amount; };
struct Msg { std::string text; };
struct Wait {};
}

using Command = boost::variant<Cmd::Move, Cmd::Bomb, Cmd::Inc, Cmd::Msg, Cmd::Wait>;
using Commands = std::vector<Command>;
}

BOOST_FUSION_ADAPT_STRUCT(AST::Cmd::Move, x,y,z)
BOOST_FUSION_ADAPT_STRUCT(AST::Cmd::Bomb, x,y)
BOOST_FUSION_ADAPT_STRUCT(AST::Cmd::Inc, amount)
BOOST_FUSION_ADAPT_STRUCT(AST::Cmd::Msg, text)
BOOST_FUSION_ADAPT_STRUCT(AST::Cmd::Wait)

namespace AST { namespace Cmd { // For demo/debug
std::ostream& operator<<(std::ostream& os, Move const& cmd) { return os << "MOVE " << boost::fusion::as_vector(cmd); }
std::ostream& operator<<(std::ostream& os, Bomb const& cmd) { return os << "BOMB " << boost::fusion::as_vector(cmd); }
std::ostream& operator<<(std::ostream& os, Inc const& cmd) { return os << "INC " << boost::fusion::as_vector(cmd); }
std::ostream& operator<<(std::ostream& os, Msg const& cmd) { return os << "MSG " << boost::fusion::as_vector(cmd); }
std::ostream& operator<<(std::ostream& os, Wait const& cmd) { return os << "WAIT " << boost::fusion::as_vector(cmd); }
} }

namespace qi = boost::spirit::qi;

template <typename It>
struct ScriptGrammar : qi::grammar<It, AST::Commands()>
{
ScriptGrammar() : ScriptGrammar::base_type(start) {
using namespace qi;
start = skip(space) [ script ];
script = command % ";";
command = move|bomb|inc|msg|wait;

move = "MOVE" >> int_ >> int_ >> int_;
bomb = "BOMB" >> int_ >> int_;
inc = "INC" >> int_;
msg = "MSG" >> text;
wait = "WAIT" >> qi::attr(AST::Cmd::Wait{});

text = +~char_(";");
BOOST_SPIRIT_DEBUG_NODES((start)(script)(command)(move)(bomb)(inc)(msg)(wait)(text))
}
private:
using Skipper = qi::space_type;
qi::rule<It, AST::Commands(), Skipper> script;
qi::rule<It, AST::Command(), Skipper> command;
qi::rule<It, AST::Cmd::Move(), Skipper> move;
qi::rule<It, AST::Cmd::Bomb(), Skipper> bomb;
qi::rule<It, AST::Cmd::Inc(), Skipper> inc;
qi::rule<It, AST::Cmd::Msg(), Skipper> msg;
qi::rule<It, AST::Cmd::Wait(), Skipper> wait;
// lexeme
qi::rule<It, AST::Commands()> start;
qi::rule<It, std::string()> text;
};

int main() {
std::string const testInput = "MOVE 1 2 43;BOMB 0 3;INC 6;MOVE 2 3 99;MSG MOVE ZIG;WAIT;MSG FOR GREAT JUSTICE!;MOVE 1 2 6";

typedef std::string::const_iterator iter;

iter start = testInput.begin(), end = testInput.end();

AST::Commands script;

bool match = qi::parse(start, end, ScriptGrammar<iter>(), script);

if (match) {
std::cout << "Parsed " << script.size() << " commands\n";
std::copy(script.begin(), script.end(), std::ostream_iterator<AST::Command>(std::cout, ";"));
} else {
std::cout << "Parse failed\n";
}

if (start != end)
std::cout << "Remaining unparsed input: '" << std::string(start, end) << "'\n";
}

Which prints:

Parsed 8 commands
MOVE (1 2 43);BOMB (0 3);INC (6);MOVE (2 3 99);MSG (MOVE ZIG);WAIT ();MSG (FOR GREAT JUSTICE!);MOVE (1 2 6);

And optionally the BOOST_SPIRIT_DEBUG output:

<start>
<try>MOVE 1 2 43;BOMB 0 3</try>
<script>
<try>MOVE 1 2 43;BOMB 0 3</try>
<command>
<try>MOVE 1 2 43;BOMB 0 3</try>
<move>
<try>MOVE 1 2 43;BOMB 0 3</try>
<success>;BOMB 0 3;INC 6;MOVE</success>
<attributes>[[1, 2, 43]]</attributes>
</move>
<success>;BOMB 0 3;INC 6;MOVE</success>
<attributes>[[1, 2, 43]]</attributes>
</command>
<command>
<try>BOMB 0 3;INC 6;MOVE </try>
<move>
<try>BOMB 0 3;INC 6;MOVE </try>
<fail/>
</move>
<bomb>
<try>BOMB 0 3;INC 6;MOVE </try>
<success>;INC 6;MOVE 2 3 99;M</success>
<attributes>[[0, 3]]</attributes>
</bomb>
<success>;INC 6;MOVE 2 3 99;M</success>
<attributes>[[0, 3]]</attributes>
</command>
<command>
<try>INC 6;MOVE 2 3 99;MS</try>
<move>
<try>INC 6;MOVE 2 3 99;MS</try>
<fail/>
</move>
<bomb>
<try>INC 6;MOVE 2 3 99;MS</try>
<fail/>
</bomb>
<inc>
<try>INC 6;MOVE 2 3 99;MS</try>
<success>;MOVE 2 3 99;MSG MOV</success>
<attributes>[[6]]</attributes>
</inc>
<success>;MOVE 2 3 99;MSG MOV</success>
<attributes>[[6]]</attributes>
</command>
<command>
<try>MOVE 2 3 99;MSG MOVE</try>
<move>
<try>MOVE 2 3 99;MSG MOVE</try>
<success>;MSG MOVE ZIG;WAIT;M</success>
<attributes>[[2, 3, 99]]</attributes>
</move>
<success>;MSG MOVE ZIG;WAIT;M</success>
<attributes>[[2, 3, 99]]</attributes>
</command>
<command>
<try>MSG MOVE ZIG;WAIT;MS</try>
<move>
<try>MSG MOVE ZIG;WAIT;MS</try>
<fail/>
</move>
<bomb>
<try>MSG MOVE ZIG;WAIT;MS</try>
<fail/>
</bomb>
<inc>
<try>MSG MOVE ZIG;WAIT;MS</try>
<fail/>
</inc>
<msg>
<try>MSG MOVE ZIG;WAIT;MS</try>
<text>
<try>MOVE ZIG;WAIT;MSG FO</try>
<success>;WAIT;MSG FOR GREAT </success>
<attributes>[[M, O, V, E, , Z, I, G]]</attributes>
</text>
<success>;WAIT;MSG FOR GREAT </success>
<attributes>[[[M, O, V, E, , Z, I, G]]]</attributes>
</msg>
<success>;WAIT;MSG FOR GREAT </success>
<attributes>[[[M, O, V, E, , Z, I, G]]]</attributes>
</command>
<command>
<try>WAIT;MSG FOR GREAT J</try>
<move>
<try>WAIT;MSG FOR GREAT J</try>
<fail/>
</move>
<bomb>
<try>WAIT;MSG FOR GREAT J</try>
<fail/>
</bomb>
<inc>
<try>WAIT;MSG FOR GREAT J</try>
<fail/>
</inc>
<msg>
<try>WAIT;MSG FOR GREAT J</try>
<fail/>
</msg>
<wait>
<try>WAIT;MSG FOR GREAT J</try>
<success>;MSG FOR GREAT JUSTI</success>
<attributes>[[]]</attributes>
</wait>
<success>;MSG FOR GREAT JUSTI</success>
<attributes>[[]]</attributes>
</command>
<command>
<try>MSG FOR GREAT JUSTIC</try>
<move>
<try>MSG FOR GREAT JUSTIC</try>
<fail/>
</move>
<bomb>
<try>MSG FOR GREAT JUSTIC</try>
<fail/>
</bomb>
<inc>
<try>MSG FOR GREAT JUSTIC</try>
<fail/>
</inc>
<msg>
<try>MSG FOR GREAT JUSTIC</try>
<text>
<try>FOR GREAT JUSTICE!;M</try>
<success>;MOVE 1 2 6</success>
<attributes>[[F, O, R, , G, R, E, A, T, , J, U, S, T, I, C, E, !]]</attributes>
</text>
<success>;MOVE 1 2 6</success>
<attributes>[[[F, O, R, , G, R, E, A, T, , J, U, S, T, I, C, E, !]]]</attributes>
</msg>
<success>;MOVE 1 2 6</success>
<attributes>[[[F, O, R, , G, R, E, A, T, , J, U, S, T, I, C, E, !]]]</attributes>
</command>
<command>
<try>MOVE 1 2 6</try>
<move>
<try>MOVE 1 2 6</try>
<success></success>
<attributes>[[1, 2, 6]]</attributes>
</move>
<success></success>
<attributes>[[1, 2, 6]]</attributes>
</command>
<success></success>
<attributes>[[[1, 2, 43], [0, 3], [6], [2, 3, 99], [[M, O, V, E, , Z, I, G]], [], [[F, O, R, , G, R, E, A, T, , J, U, S, T, I, C, E, !]], [1, 2, 6]]]</attributes>
</script>
<success></success>
<attributes>[[[1, 2, 43], [0, 3], [6], [2, 3, 99], [[M, O, V, E, , Z, I, G]], [], [[F, O, R, , G, R, E, A, T, , J, U, S, T, I, C, E, !]], [1, 2, 6]]]</attributes>
</start>

Use a trait for parsing a date in boost::spirit

After posting in "Why does using a stream in boost spirit penalize performance so much?", I re-read your post and added the approach here.

There were a considerable number of issues with the way in which the trait and the parser rule were declared.

  • notably, repeat(2)[digit_] doesn't transform to an integer attribute. I suspect you might have gotten a lot of values like 49, 50, etc. (the ASCII codes of '1', '2', etc.) and perhaps some indeterminate values too

  • you subtracted 1900 from the month value

The Parser

Simplified it to:

namespace QiParsers {

struct Months : qi::symbols<char, int> {
Months() { this->add
("Jan", 0)
("Feb", 1)
("Mar", 2)
("Apr", 3)
("May", 4)
("Jun", 5)
("Jul", 6)
("Aug", 7)
("Sep", 8)
("Oct", 9)
("Nov", 10)
("Dec", 11);
}
} static const mmm_;

static const qi::uint_parser<int, 10, 4, 4> yyyy_;
static const qi::uint_parser<int, 10, 2, 2> dd_, hh_, mm_, ss_;
static const qi::uint_parser<int, 10, 6, 6> fff_;

}

Now the parser can be written legibly¹ like:

template <typename It>
struct Parser2 : qi::grammar<It, structs::Record2()>
{
Parser2() : Parser2::base_type(start) {
using namespace qi;

date = '[' >> yyyy_ >> '-' >> mmm_ >> '-' >> dd_
>> ' ' >> hh_ >> ':' >> mm_ >> ':' >> ss_ >> '.' >> fff_ >> ']';

start =
date //'[' >> raw[*~char_(']')] >> ']'
>> " - " >> double_ >> " s"
>> " => String: " >> raw[+graph]
>> eol;
}

private:
qi::rule<It, structs::Record2()> start;
qi::rule<It, boost::fusion::vector<int, int, int, int, int, int, int>()> date;
};

The Trait

Basically what you had, but ironing out a few quirks:

template <typename Attr>
struct transform_attribute<structs::Timestamp, Attr, qi::domain> {
using type = Attr;
static type pre(structs::Timestamp) { return type(); }
static void fail(structs::Timestamp&) { }
static void post(structs::Timestamp& timestamp, type const& v) {
/*
* struct tm
* {
* int tm_sec; [> Seconds. [0-60] (1 leap second) <]
* int tm_min; [> Minutes. [0-59] <]
* int tm_hour; [> Hours. [0-23] <]
* int tm_mday; [> Day. [1-31] <]
* int tm_mon; [> Month. [0-11] <]
* int tm_year; [> Year - 1900. <]
* int tm_wday; [> Day of week. [0-6] <]
* int tm_yday; [> Days in year.[0-365] <]
* int tm_isdst; [> DST. [-1/0/1]<]
*
* # ifdef __USE_MISC
* long int tm_gmtoff; [> Seconds east of UTC. <]
* const char *tm_zone; [> Timezone abbreviation. <]
* # else
* long int __tm_gmtoff; [> Seconds east of UTC. <]
* const char *__tm_zone; [> Timezone abbreviation. <]
* # endif
* };
*/
std::tm time = { fusion::at_c<5>(v), // seconds
fusion::at_c<4>(v), // minutes
fusion::at_c<3>(v), // hours
fusion::at_c<2>(v), // day (1-31)
fusion::at_c<1>(v), // month
fusion::at_c<0>(v) - 1900, // year - 1900
0, 0, // wday, yday
0, 0, 0 // isdst, tm_gmtoff, tm_zone
};

timestamp.date = std::mktime(&time);
timestamp.ms = fusion::at_c<6>(v)/1000000.0;
}
};
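The std::tm conventions that the trait has to respect (month index 0-11, years counted from 1900) can be checked in isolation with the standard library alone. This is a sketch, not the trait itself: make_local_time is a hypothetical helper, and unlike the listing above it sets tm_isdst to -1 so that mktime decides about daylight saving time.

```cpp
#include <ctime>

// Hypothetical helper: build a local-time std::time_t from calendar fields.
// `month_index` is 0-based (0 = Jan), matching both std::tm and the
// values stored in the Months symbol table.
std::time_t make_local_time(int year, int month_index, int day,
                            int hour, int minute, int second) {
    std::tm t = std::tm();      // zero every field, including platform extensions
    t.tm_sec   = second;
    t.tm_min   = minute;
    t.tm_hour  = hour;
    t.tm_mday  = day;           // 1-31
    t.tm_mon   = month_index;   // 0-11, no offset applied
    t.tm_year  = year - 1900;   // the 1900 offset belongs to the year only
    t.tm_isdst = -1;            // assumption of this sketch: let mktime work out DST
    return std::mktime(&t);
}
```

Converting the result back with std::localtime should reproduce the same calendar fields, which is a quick way to catch an offset applied to the wrong member.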

Benchmark It!

The benchmark runs, and parses correctly:

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/chrono/chrono.hpp>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
#include <ctime>

namespace structs {
struct Timestamp {
std::time_t date;
double ms;
};

struct Record1 {
std::string date;
double time;
std::string str;
};

struct Record2 {
Timestamp date;
double time;
std::string str;
};

typedef std::vector<Record1> Records1;
typedef std::vector<Record2> Records2;
}

BOOST_FUSION_ADAPT_STRUCT(structs::Record1,
(std::string, date)
(double, time)
(std::string, str))

BOOST_FUSION_ADAPT_STRUCT(structs::Record2,
(structs::Timestamp, date)
(double, time)
(std::string, str))

namespace boost { namespace spirit { namespace traits {
template <typename It>
struct assign_to_attribute_from_iterators<std::string, It, void> {
static inline void call(It f, It l, std::string& attr) {
attr = std::string(&*f, std::distance(f,l));
}
};

template <typename Attr>
struct transform_attribute<structs::Timestamp, Attr, qi::domain> {
using type = Attr;
static type pre(structs::Timestamp) { return type(); }
static void fail(structs::Timestamp&) { }
static void post(structs::Timestamp& timestamp, type const& v) {
/*
* struct tm
* {
* int tm_sec; [> Seconds. [0-60] (1 leap second) <]
* int tm_min; [> Minutes. [0-59] <]
* int tm_hour; [> Hours. [0-23] <]
* int tm_mday; [> Day. [1-31] <]
* int tm_mon; [> Month. [0-11] <]
* int tm_year; [> Year - 1900. <]
* int tm_wday; [> Day of week. [0-6] <]
* int tm_yday; [> Days in year.[0-365] <]
* int tm_isdst; [> DST. [-1/0/1]<]
*
* # ifdef __USE_MISC
* long int tm_gmtoff; [> Seconds east of UTC. <]
* const char *tm_zone; [> Timezone abbreviation. <]
* # else
* long int __tm_gmtoff; [> Seconds east of UTC. <]
* const char *__tm_zone; [> Timezone abbreviation. <]
* # endif
* };
*/
std::tm time = { fusion::at_c<5>(v), // seconds
fusion::at_c<4>(v), // minutes
fusion::at_c<3>(v), // hours
fusion::at_c<2>(v), // day (1-31)
fusion::at_c<1>(v), // month
fusion::at_c<0>(v) - 1900, // year - 1900
0, 0, // wday, yday
0, 0, 0 // isdst, tm_gmtoff, tm_zone
};

timestamp.date = std::mktime(&time);
timestamp.ms = fusion::at_c<6>(v)/1000000.0;
}
};

} } }

namespace qi = boost::spirit::qi;

namespace QiParsers {

struct Months : qi::symbols<char, int> {
Months() { this->add
("Jan", 0)
("Feb", 1)
("Mar", 2)
("Apr", 3)
("May", 4)
("Jun", 5)
("Jul", 6)
("Aug", 7)
("Sep", 8)
("Oct", 9)
("Nov", 10)
("Dec", 11);
}
} static const mmm_;

static const qi::uint_parser<int, 10, 4, 4> yyyy_;
static const qi::uint_parser<int, 10, 2, 2> dd_, hh_, mm_, ss_;
static const qi::uint_parser<int, 10, 6, 6> fff_;

template <typename It>
struct Parser1 : qi::grammar<It, structs::Record1()>
{
Parser1() : Parser1::base_type(start) {
using namespace qi;

start = '[' >> raw[*~char_(']')] >> ']'
>> " - " >> double_ >> " s"
>> " => String: " >> raw[+graph]
>> eol;
}

private:
qi::rule<It, structs::Record1()> start;
};

template <typename It>
struct Parser2 : qi::grammar<It, structs::Record2()>
{
Parser2() : Parser2::base_type(start) {
using namespace qi;

date = '[' >> yyyy_ >> '-' >> mmm_ >> '-' >> dd_
>> ' ' >> hh_ >> ':' >> mm_ >> ':' >> ss_ >> '.' >> fff_ >> ']';

start =
date //'[' >> raw[*~char_(']')] >> ']'
>> " - " >> double_ >> " s"
>> " => String: " >> raw[+graph]
>> eol;
}

private:
qi::rule<It, structs::Record2()> start;
qi::rule<It, boost::fusion::vector<int, int, int, int, int, int, int>()> date;
};

template <typename It>
struct Parser3 : qi::grammar<It, structs::Records1()>
{
Parser3() : Parser3::base_type(start) {
using namespace qi;
using boost::phoenix::push_back;

line = '[' >> raw[*~char_(']')] >> ']'
>> " - " >> double_ >> " s"
>> " => String: " >> raw[+graph];

ignore = *~char_("\r\n");

start = (line[push_back(_val, _1)] | ignore) % eol;
}

private:
qi::rule<It> ignore;
qi::rule<It, structs::Record1()> line;
qi::rule<It, structs::Records1()> start;
};

template <typename It>
struct Parser4 : qi::grammar<It, structs::Records2()>
{
Parser4() : Parser4::base_type(start) {
using namespace qi;
using boost::phoenix::push_back;

date = '[' >> yyyy_ >> '-' >> mmm_ >> '-' >> dd_
>> ' ' >> hh_ >> ':' >> mm_ >> ':' >> ss_ >> '.' >> fff_ >> ']';

line = date
>> " - " >> double_ >> " s"
>> " => String: " >> raw[+graph];

ignore = *~char_("\r\n");

start = (line[push_back(_val, _1)] | ignore) % eol;
}

private:
qi::rule<It> ignore;
qi::rule<It, structs::Record2()> line;
qi::rule<It, structs::Records2()> start;
qi::rule<It, boost::fusion::vector<int, int, int, int, int, int, int>()> date;
};
}

template <typename Parser> static const Parser s_instance {};

template<template <typename> class Parser, typename Container, typename It>
Container parse_seek(It b, It e, const std::string& message)
{
Container records;

auto const t0 = boost::chrono::high_resolution_clock::now();
parse(b, e, *boost::spirit::repository::qi::seek[s_instance<Parser<It> >], records);
auto const t1 = boost::chrono::high_resolution_clock::now();

auto elapsed = boost::chrono::duration_cast<boost::chrono::milliseconds>(t1 - t0);
std::cout << "Elapsed time: " << elapsed.count() << " ms (" << message << ")\n";

return records;
}

template<template <typename> class Parser, typename Container, typename It>
Container parse_ignoring(It b, It e, const std::string& message)
{
Container records;

auto const t0 = boost::chrono::high_resolution_clock::now();
parse(b, e, s_instance<Parser<It> >, records);
auto const t1 = boost::chrono::high_resolution_clock::now();

auto elapsed = boost::chrono::duration_cast<boost::chrono::milliseconds>(t1 - t0);
std::cout << "Elapsed time: " << elapsed.count() << " ms (" << message << ")\n";

return records;
}

static const std::string input1 = "[2018-Mar-01 00:01:02.012345] - 1.000 s => String: Valid_string\n";
static const std::string input2 = "[2018-Mar-02 00:01:02.012345] - 2.000 s => I dont care\n";

std::string prepare_input() {
std::string input;
const int N1 = 10;
const int N2 = 1000;

input.reserve(N1 * (input1.size() + N2*input2.size()));

for (int i = N1; i--;) {
input += input1;
for (int j = N2; j--;)
input += input2;
}

return input;
}

int main() {
auto const input = prepare_input();

auto f = input.data(), l = f + input.length();

for (auto& r: parse_seek<QiParsers::Parser1, structs::Records1>(f, l, "std::string + seek")) {
std::cout << r.date << "\n";
break;
}
for (auto& r: parse_seek<QiParsers::Parser2, structs::Records2>(f, l, "stream + seek")) {
auto tm = *std::localtime(&r.date.date);
std::cout << std::put_time(&tm, "%Y-%b-%d %H:%M:%S") << " " << r.date.ms << "\n";
break;
}
for (auto& r: parse_ignoring<QiParsers::Parser3, structs::Records1>(f, l, "std::string + ignoring")) {
std::cout << r.date << "\n";
break;
}
for (auto& r: parse_ignoring<QiParsers::Parser4, structs::Records2>(f, l, "stream + ignoring")) {
auto tm = *std::localtime(&r.date.date);
std::cout << std::put_time(&tm, "%Y-%b-%d %H:%M:%S") << " " << r.date.ms << "\n";
break;
}
}

Prints

Elapsed time: 14 ms (std::string + seek)
2018-Mar-01 00:01:02.012345
Elapsed time: 42 ms (stream + seek)
2018-Mar-01 00:01:02 0.012345
Elapsed time: 2 ms (std::string + ignoring)
2018-Mar-01 00:01:02.012345
Elapsed time: 31 ms (stream + ignoring)
2018-Mar-01 00:01:02 0.012345

Conclusion

Parsing and mktime have a significant cost (10% of the profile run below). You will not do much better than boost::posix_time::time_from_string unless you're willing to opt out of std::time_t.

One notable advantage of the approach here is that the call to mktime is not done if a line is ignored. And it shows:

  • Parser1: 21.12 %
  • Parser2: 47.60 %
  • Parser3: 8.91 %
  • Parser4: 20.57 %

The ignoring parser is indeed quicker than the string-based non-ignoring parser now.

Profiling graph:


¹ I took the code from the other answer, so it's easy to compare benchmark results

Boost.spirit: parsing number char and string

What you probably haven't realized is that parser expressions stop having automatic attribute propagation in the presence of semantic actions*.

* Documentation background: How Do Rules Propagate Their Attributes?

You're using a semantic action to 'manually' propagate the attribute of the uint_ parser:

[ref(num) = _1]   // this is a Semantic Action

So the easiest way to fix this would be to propagate num automatically too (the way the qi::parse and qi::phrase_parse APIs were intended):

bool ok = qi::phrase_parse(first, last,               // input iterators
uint_ >> lit('X') >> lexeme[+(char_ - ' ')], // parser expr
space, // skipper
num, parsed_str); // output attributes

Or, addressing some off-topic points, even cleaner:

bool ok = qi::phrase_parse(first, last,
uint_ >> 'X' >> lexeme[+graph],
blank,
num, parsed_str);

As you can see, you can pass multiple lvalues as output attribute recipients.

See a live demo on Coliru.

There's a whole lot of magic going on, which in practice leads to my rule of thumb:

Avoid using semantic actions in Spirit Qi expressions unless you absolutely have to

I have written about this before, in an answer specifically about this: Boost Spirit: "Semantic actions are evil"?

In my experience, it's almost always cleaner to use the Attribute Customization Points to tweak the automatic propagation than to abandon auto rules and resort to manual attribute handling.


