Boost::Spirit::Qi Parser: Index of Parsed Element


There are a number of approaches.

  • What I'd usually recommend is using well-thought-out repeat(n) expressions with directly exposed container attributes (like vector<vector<double> >).

  • What you seem to be looking for is semantic actions with state. (This is common practice for people coming from lex/yacc.)

I treat these approaches in three full demos below (1., 2. and 3.).

  • An advanced technique is using customization points so that Spirit treats your Matrix type directly as a container attribute, overriding its insertion logic via spirit::traits. For this approach I refer to this answer: pass attribute to child rule in boost spirit.

Using inherited attributes

Here is a relatively straightforward approach:

  1. parsing directly into a vector<vector<double> > (full code live online)

    qi::rule<It, Matrix::value_type(size_t cols), qi::blank_type> row;
    qi::rule<It, Matrix(size_t rows,size_t cols), qi::blank_type> matrix;

    row    %= skip(char_(" \t,")) [ repeat(_r1) [ double_ ] ];
    matrix %= eps // [ std::cout << phx::val("debug: ") << _r1 << ", " << _r2 << "\n" ]
           >> repeat(_r1) [ row(_r2) >> (eol|eoi) ];

    Usage:

    if (qi::phrase_parse(f, l, parser(10, 4), qi::blank, m))
        std::cout << "Wokay\n";
    else
        std::cerr << "Uhoh\n";

  2. Similarly, but adapting a Matrix struct (full code live here)

    struct Matrix
    {
        Matrix(size_t rows, size_t cols) : _cells(), _rows(rows), _cols(cols) { }

        double       & data(size_t col, size_t row)       { return _cells.at(row).at(col); }
        const double & data(size_t col, size_t row) const { return _cells.at(row).at(col); }

        size_t columns() const { return _cols; }
        size_t rows()    const { return _rows; }

        std::vector<std::vector<double> > _cells;
        size_t _rows, _cols;
    };

    BOOST_FUSION_ADAPT_STRUCT(Matrix, (std::vector<std::vector<double> >,_cells))

    Usage:

    Matrix m(10, 4);

    if (qi::phrase_parse(f, l, parser(m.rows(), m.columns()), qi::blank, m))
        std::cout << "Wokay\n";
    else
        std::cerr << "Uhoh\n";

Using semantic actions/qi::locals

3. This is more work, but potentially more flexible. You'd define a polymorphic callable type to insert a value at a given cell:

struct MatrixInsert
{
    template <typename, typename, typename, typename> struct result { typedef bool type; };
    template <typename Matrix, typename Row, typename Col, typename Value>
    bool operator()(Matrix &m, Row& r, Col& c, Value v) const
    {
        if (r < m.rows() && c < m.columns())
        {
            m.data(r, c++) = v;
            return true; // parse continues
        }
        return false; // fail the parse
    }
};

BOOST_PHOENIX_ADAPT_CALLABLE(matrix_insert, MatrixInsert, 4)

The last line makes this a phoenix lazy function, so you can use it without weird bind syntax in your semantic actions:

qi::rule<It, Matrix(), qi::blank_type, qi::locals<size_t /*_a: row*/, size_t/*_b: col*/> > matrix;
matrix = eps    [ _a = 0 /*current row*/ ]
    >> (
            eps     [ _b = 0 /*current col*/ ]
         >> double_ [ _pass = matrix_insert(_val, _a, _b, _1) ] % ','
       ) % (eol     [ ++_a /*next row*/])
    ;

Full code is, again, live on liveworkspace.org.

How to capture the value parsed by a boost::spirit::x3 parser to be used within the body of a semantic action?

In X3, semantic actions are much simpler. They're unary callables that take just the context.

Then you use free functions to extract information from the context:

  • x3::_val(ctx) is like qi::_val
  • x3::_attr(ctx) is like qi::_0 (or qi::_1 for simple parsers)
  • x3::_pass(ctx) is like qi::_pass

So, to get your semantic action, you could do:

auto qstring
    = x3::rule<struct rule_type, std::string> {"qstring"}
    = x3::lexeme[quote > *("\\" >> x3::char_(quote) | ~x3::char_(quote)) > quote]
    ;

Now to make a very odd (as in peculiar) string rule that reverses the text (after de-escaping) and only passes when the number of characters is even (that is what the val.size() % 2 == 0 check enforces):

auto odd_reverse = [](auto& ctx) {
    auto& attr = x3::_attr(ctx);
    auto& val  = x3::_val(ctx);
    x3::traits::move_to(attr, val);
    std::reverse(val.begin(), val.end());

    x3::_pass(ctx) = val.size() % 2 == 0;
};

auto odd_string
    = x3::rule<struct odd_type, std::string> {"odd_string"}
    = qstring [ odd_reverse ]
    ;

DEMO

Live On Coliru

#include <boost/spirit/home/x3.hpp>
#include <algorithm>
#include <iomanip>
#include <iostream>
#include <string>

int main() {
    namespace x3 = boost::spirit::x3;

    auto constexpr quote = '"';
    auto qstring
        = x3::rule<struct rule_type, std::string> {"qstring"}
        = x3::lexeme[quote > *("\\" >> x3::char_(quote) | ~x3::char_(quote)) > quote]
        ;

    auto odd_reverse = [](auto& ctx) {
        auto& attr = x3::_attr(ctx);
        auto& val  = x3::_val(ctx);
        x3::traits::move_to(attr, val);
        std::reverse(val.begin(), val.end());

        x3::_pass(ctx) = val.size() % 2 == 0; // passes only for an even length
    };

    auto odd_string
        = x3::rule<struct odd_type, std::string> {"odd_string"}
        = qstring [ odd_reverse ]
        ;

    for (std::string const input : {
            R"("test \"hello\" world")",
            R"("test \"hello\" world!")",
        }) {
        std::string output;
        auto f = begin(input), l = end(input);
        if (x3::phrase_parse(f, l, odd_string, x3::blank, output)) {
            std::cout << "[" << output << "]\n";
        } else {
            std::cout << "Failed\n";
        }
        if (f != l) {
            std::cout << "Remaining unparsed: " << std::quoted(std::string(f,l)) << "\n";
        }
    }
}

Printing

[dlrow "olleh" tset]
Failed
Remaining unparsed: "\"test \\\"hello\\\" world!\""

UPDATE

To the added question:

EDIT: it seems that whenever I attach any semantic action in general
to the parser, the value is nullified. I suppose the question now is
how could I access the value before that happens? I just need to be
able to manipulate the parsed string before it is given to the AST.

Yes, if you attach an action, automatic attribute propagation is inhibited. This is the same in Qi, where you could assign rules with %= instead of = to force automatic attribute propagation.

To get the same effect in X3, use the third template argument to x3::rule: x3::rule<X, T, true> to indicate you want automatic propagation.

Really, try not to fight the system. In practice, the automatic transformation system is way more sophisticated than I am willing to re-discover on my own, so I usually post-process the whole AST or at most apply some minor tweaks in an action. See also Boost Spirit: "Semantic actions are evil"?

Using boost::spirit defaulting a parsed value to an earlier value when parsing into a struct

First of all, instead of spelling out omit[+space], just use a skipper:

bool parsed = qi::phrase_parse(iter, end, (
        qi::lexeme[+(alnum | '_')]
        >> uint_ >> (uint_ | attr(0))
        >> (("//" >> lexeme[+qi::char_]) | attr(""))
    ), qi::space, result);

Here, qi::space is the skipper. lexeme[] avoids skipping inside the sub-expression (see Boost spirit skipper issues).

Next up, you can do it more than one way.

  1. use a local attribute to temporarily store a value:

    Live On Coliru

    rule<It, record_struct(), locals<uint8_t>, space_type> g;

    g %= lexeme[+(alnum | '_')]
    >> uint_ [_a = _1] >> (uint_ | attr(_a))
    >> -("//" >> lexeme[+char_]);

    parsed = phrase_parse(iter, end, g, space, result);

    This requires

    • a qi::rule declaration to declare the qi::locals<uint8_t>; qi::_a is the placeholder for that local attribute
    • initialize the rule as an "auto-rule" (docs), i.e. with %= so that semantic actions do not overrule attribute propagation
  2. There's a wacky hybrid where you don't actually use locals<> but refer to an external variable instead. This is in general a bad idea, but since your parser is not recursive/reentrant you could get away with it

    Live On Coliru

    parsed = phrase_parse(iter, end, (
    lexeme[+(alnum | '_')]
    >> uint_ [ phx::ref(dist_) = _1 ] >> (uint_ | attr(phx::ref(dist_)))
    >> (("//" >> lexeme[+char_]) | attr(""))
    ), space, result);
  3. You could go full Boost Phoenix and juggle the values right from the semantic actions

    Live On Coliru

    parsed = phrase_parse(iter, end, (
    lexeme[+(alnum | '_')]
    >> uint_ >> (uint_ | attr(phx::at_c<1>(_val)))
    >> (("//" >> lexeme[+char_]) | attr(""))
    ), space, result);
  4. You could parse into optional<uint8_t> and postprocess the information

    Live On Coliru

    std::string              name;
    uint8_t                  distance;
    boost::optional<uint8_t> travelDistance;
    std::string              comment;

    parsed = phrase_parse(iter, end, (
    lexeme[+(alnum | '_')]
    >> uint_ >> -uint_
    >> -("//" >> lexeme[+char_])
    ), space, name, distance, travelDistance, comment);

    result = { name, distance, travelDistance? *travelDistance : distance, comment };

Post Scriptum

I noticed this a little late:

If possible, I'd like to isolate the parsing of the two integers to a separate parser rule, without giving up using the fusion struct.

Well, of course you can:

rule<It, uint8_t(uint8_t)> def_uint8 = uint_parser<uint8_t>() | attr(_r1);

This is at once more accurate, because it doesn't parse unsigned values that don't fit in a uint8_t. Mixing and matching from the above: Live On Coliru

share local data in boost spirit actions inside the parse step

The solution with a custom iterator works. Just add this iterator (code below) and use the "iter_pos" boost::spirit helper to access the iterator inside the rules. You can then call iter.getData() to reach information that is shared across all rules:

class custom_iterator {
public:
    typedef wchar_t value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const value_type* pointer;
    typedef const value_type& reference;
    typedef std::forward_iterator_tag iterator_category;

    custom_iterator() : iter_(), handler_(nullptr) { }

    // initializers listed in member declaration order (iter_, then handler_)
    custom_iterator(parserDataS* handler, scriptSTDWStringType::const_iterator iter)
        : iter_(iter), handler_(handler)
    { }

    custom_iterator& operator++() {
        ++iter_;
        return *this;
    }

    custom_iterator operator++(int) {
        custom_iterator tmp = *this;
        ++iter_;
        return tmp;
    }

    value_type operator*() const {
        return *iter_;
    }

    friend bool operator==(custom_iterator a, custom_iterator b) {
        return a.iter_ == b.iter_;
    }

    friend bool operator!=(custom_iterator a, custom_iterator b) {
        return a.iter_ != b.iter_;
    }

    scriptSTDWStringType::const_iterator getIter() const { return iter_; }

    // shared parser state, reachable from every rule through the iterator
    parserDataS* getData() const { return handler_; }

private:
    scriptSTDWStringType::const_iterator iter_;
    parserDataS* handler_;
};

How do I parse end-of-line with boost::spirit::qi?

You are using space as the skipper for your calls to phrase_parse. This parser matches any character for which std::isspace returns true (assuming you're doing ascii based parsing). For this reason the \r\n in the input are eaten by your skipper before they can be seen by your eol parser.

Using semantic actions in boost spirit to set fields

Yes it's possible.

One pitfall is that the presence of semantic actions usually suppresses automatic attribute propagation. Since you'd like to have both, you will want to assign the parser expression using %= instead of = (see docs).

Alternatively you can generate a value on the fly and use the adaptation that you showed.

Proof Of Concept: SA + Adaptation

Here I'd simply exclude id from the adaptation. Note also that you don't need to repeat types since c++11:

BOOST_FUSION_ADAPT_STRUCT(
client::employee, age, /*id, */ surname, forename, salary)

I'd prefer to write the SA with some phoenix function helpers:

auto _id  = px::function { std::mem_fn(&client::employee::id) };
auto _gen = px::function { client::employee::generate_id };

start %= skip(space) [
    age >> name >> name >> salary >> eps
        [ _id(_val) = _gen() ]
];

Live On Coliru

#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <functional>
#include <iomanip>
#include <iostream>
#include <string_view>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;
namespace fus = boost::fusion;

namespace client {
struct employee {
std::string id;
int age;
std::string surname;
std::string forename;
double salary;

static std::string generate_id() {
static int _next{0};
return "autoid#" + std::to_string(_next++);
}
};
using fus::operator<<;
}

BOOST_FUSION_ADAPT_STRUCT(
client::employee, age, /*id, */ surname, forename, salary)

template <typename It>
struct Parser : qi::grammar<It, client::employee()> {
Parser() : Parser::base_type(start) {
using namespace qi;
name = +graph;
age = uint_ >> eps(_val < 110 && _val > 16);
salary = double_;

auto _id = px::function { std::mem_fn(&client::employee::id) };
auto _gen = px::function { client::employee::generate_id };

start %= skip(space) [
age >> name >> name >> salary >> eps
[ _id(_val) = _gen() ]
];

BOOST_SPIRIT_DEBUG_NODES((start)(name)(age)(salary))
}
private:
qi::rule<It, client::employee()> start;
qi::rule<It, unsigned()> age;
qi::rule<It, std::string()> name;
qi::rule<It, double()> salary;
};

static auto qview(auto f, auto l) {
return std::quoted(
std::string_view(std::addressof(*f), std::distance(f, l)));
}

int main() {
Parser<std::string::const_iterator> p;
std::cout << fus::tuple_delimiter(',');

for (std::string const& input: {
//age surname forename salary
"55 Astley Rick 7232.88",
"23 Black Rebecca 0.00",
"77 Waters Roger 24815.07",
})
{
auto f = begin(input), l = end(input);

client::employee emp;
if (parse(f, l, p, emp)) {
std::cout << "Parsed: " << emp.id << " " << emp << "\n";
} else {
std::cout << "Parse failed\n";
}

if (f != l) {
std::cout << "Remaining input: " << qview(f,l) << "\n";
}
}
}

Prints

Parsed: autoid#0 (55,Astley,Rick,7232.88)
Parsed: autoid#1 (23,Black,Rebecca,0)
Parsed: autoid#2 (77,Waters,Roger,24815.1)

Alternative: Inline Generation

You'd keep the full adaptation:

BOOST_FUSION_ADAPT_STRUCT(
client::employee, age, id, surname, forename, salary)

And respell the rule using qi::attr() on the right spot:

auto _gen = px::function { client::employee::generate_id };

start %= skip(space) [
    age >> attr(_gen()) >> name >> name >> salary
];

Live On Coliru (omitting rest of unchanged listing)

Printing (again):

Parsed: autoid#0 (55,autoid#0,Astley,Rick,7232.88)
Parsed: autoid#1 (23,autoid#1,Black,Rebecca,0)
Parsed: autoid#2 (77,autoid#2,Waters,Roger,24815.1)

Conclusion

In retrospect, I think the alternative has more appeal.

Boost Spirit X3: skip parser that would do nothing

You can use either no_skip[] or lexeme[]. They're almost identical, except for pre-skip (Boost Spirit lexeme vs no_skip).

Are there no skip parsers that would simply do nothing? Am I missing something?

A wild guess, but you might be missing the parse API that doesn't accept a skipper in the first place:

Live On Coliru

#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;

int main() {
    std::string const input{ "2,4,5" };
    auto f = begin(input), l = end(input);

    const auto parser = x3::int_ % ',';
    std::vector<int> numbers;

    // parse (not phrase_parse): no skipper is involved at all
    auto r = parse(f, l, parser, numbers);

    if (r) {
        // success
        for (const auto& item : numbers)
            std::cout << item << std::endl;
    } else {
        std::cerr << "Input was not parsed successfully" << std::endl;
        return 1;
    }

    if (f!=l) {
        std::cout << "Remaining input " << std::quoted(std::string(f,l)) << "\n";
        return 2;
    }
}

Prints

2
4
5

Why does boost::spirit::qi::parse() not set this boost::variant's value?

The rule's attribute must be specified using the function declaration syntax:

qi::rule<Iterator, Variant()> m_rule;

I have not tried, but I believe it will work after this change (the same is required for the grammar, btw).

Boost spirit floating number parser precision

So this is probably a limitation (or bug) of the float_ parser. Try the double_ parser instead.

#include<iostream>
#include<iomanip>
#include<string>
#include<boost/spirit/include/qi.hpp>

int main()
{
std::cout.precision(20);

//float x=219721.03839999999f;
//std::cout << x*1.0f << std::endl;
//gives 219721.03125

double resultD;
std::string arg="219721.03839999999";

auto itBeg = arg.begin();
auto itEnd = arg.end();
if(!boost::spirit::qi::parse(itBeg, itEnd,boost::spirit::qi::double_,resultD) || itBeg != itEnd)
std::cerr << "Cannot convert from std::string to double" << std::endl;
else
std::cout << "qi::double_:" << resultD << std::endl;

float resultF;
itBeg = arg.begin();
itEnd = arg.end();
if(!boost::spirit::qi::parse(itBeg, itEnd,boost::spirit::qi::float_,resultF) || itBeg != itEnd)
std::cerr << "Cannot convert from std::string to float" << std::endl;
else
std::cout << "qi::float_ :" << resultF << std::endl;

return 0;
}

Output:

qi::double_:219721.03839999999036

qi::float_:219721.109375


