Detecting the Parameter Types in a Spirit Semantic Action

For clarity: the error here is that base_ >> int_ >> int_ was used as the expression for a rule that creates a myderived. Since base_ is fixed to type mybase, we'd have to create a myderived from a mybase and two ints, but there's nothing to tell Spirit how to do that.

You can get Boost to print the type of the value it creates from parsing base_ >> int_ >> int_ by defining a functor that accepts any parameters and tells you what they are (the following code is adapted from some code sehe put on SO chat):

struct what_is_the_attr
{
template <typename> struct result { typedef bool type; };

template <typename T>
static void print_the_type()
{
std::cout << " ";
std::cout << typeid(T).name();
if(std::is_const<typename std::remove_reference<T>::type>::value)
std::cout << " const";
if(std::is_rvalue_reference<T>::value)
std::cout << " &&";
else if(std::is_lvalue_reference<T>::value)
std::cout << " &";
}

template <typename Th, typename Th2, typename... Tt>
static void print_the_type()
{
print_the_type<Th>();
std::cout << ",\n";
print_the_type<Th2, Tt...>();
}

template <typename... Ts>
void operator()(Ts&&...) const
{
std::cout << "what_is_the_attr(\n";
print_the_type<Ts...>();
std::cout << ")" << std::endl;
}
};

Then, to use it, attach the above actor as a semantic action on the initializer of your faulty rule:

std::string input = "1 2 3 4";
auto f(std::begin(input)), l(std::end(input));

rule<decltype(f), mybase() , space_type> base_ = int_ >> int_;
rule<decltype(f), myderived(), space_type> derived_ = (base_ >> int_ >> int_)[what_is_the_attr()];

myderived data;
bool ok = phrase_parse(f,l,derived_,space,data);

Note, you cannot use automatic attribute propagation with %= (unless you remove the exposed attribute type from the rule's declared type).

Running this should then yield an encoded type, which can be decoded with c++filt -t: Live On Coliru

$ g++ 9404189.cpp -std=c++0x
$ ./a.out |c++filt -t
what_is_the_attr(
boost::fusion::vector3<mybase, int, int> &,
boost::spirit::context<boost::fusion::cons<boost::spirit::unused_type&, boost::fusion::nil>, boost::fusion::vector0<void> > &,
bool &)

The first line, boost::fusion::vector3<mybase, int, int>, at least tells you that Boost is trying to create your return type from 3 objects of types mybase, int and int.
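If piping the output through c++filt is inconvenient, the name can also be demangled inside the program. The following is a minimal sketch, assuming GCC or Clang: it relies on abi::__cxa_demangle from <cxxabi.h>, which is not available on MSVC. The helper names demangle and type_name are made up for illustration and are not part of Spirit.

```cpp
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang only; an assumption of this sketch)
#include <memory>
#include <string>
#include <typeinfo>

// Hypothetical helper: turns a mangled type name into a readable one.
inline std::string demangle(char const* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer, so it is released with std::free.
    std::unique_ptr<char, void (*)(void*)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    // On failure, fall back to the mangled name rather than returning nothing.
    return (status == 0 && text) ? std::string(text.get()) : std::string(mangled);
}

// Convenience wrapper. Like typeid itself, it drops references and
// top-level cv-qualifiers, which is why print_the_type reports those separately.
template <typename T>
std::string type_name() { return demangle(typeid(T).name()); }
```

Inside print_the_type, std::cout << typeid(T).name() could then become std::cout << type_name<T>().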

boost spirit reporting semantic error

I'd use filepos_iterator and just throw an exception, so you have complete control over the reporting.

Let me see what I can come up with in the remaining 15 minutes I have

Ok, took a little bit more time but think it's an instructive demo:

Live On Coliru

#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/support_line_pos_iterator.hpp>
#include <boost/spirit/repository/include/qi_iter_pos.hpp>
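#include <boost/lexical_cast.hpp>
#include <cstring>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>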

namespace qi = boost::spirit::qi;
namespace qr = boost::spirit::repository::qi;
namespace px = boost::phoenix;
namespace qi_coding = boost::spirit::ascii;
using It = boost::spirit::line_pos_iterator<std::string::const_iterator>;

namespace ast {
enum actionid { f_unary, f_binary };
enum param_type { int_param, string_param };

static inline std::ostream& operator<<(std::ostream& os, actionid id) {
switch(id) {
case f_unary: return os << "f_unary";
case f_binary: return os << "f_binary";
default: return os << "(unknown)";
} }
static inline std::ostream& operator<<(std::ostream& os, param_type t) {
switch(t) {
case int_param: return os << "integer";
case string_param: return os << "string";
default: return os << "(unknown)";
} }

using param_value = boost::variant<int, std::string>;
struct parameter {
It position;
param_value value;

friend std::ostream& operator<<(std::ostream& os, parameter const& p) { return os << p.value; }
};
using parameters = std::vector<parameter>;

struct action {
/*
*action() = default;
*template <typename Sequence> action(Sequence const& seq) { boost::fusion::copy(seq, *this); }
*/
actionid id;
parameters params;
};
}

namespace std {
static inline std::ostream& operator<<(std::ostream& os, ast::parameters const& v) {
std::copy(v.begin(), v.end(), std::ostream_iterator<ast::parameter>(os, " "));
return os;
}
}

BOOST_FUSION_ADAPT_STRUCT(ast::action, id, params)
BOOST_FUSION_ADAPT_STRUCT(ast::parameter, position, value)

struct BadAction : std::exception {
It _where;
std::string _what;
BadAction(It it, std::string msg) : _where(it), _what(std::move(msg)) {}
It where() const { return _where; }
char const* what() const noexcept { return _what.c_str(); }
};

struct ValidateAction {
std::map<ast::actionid, std::vector<ast::param_type> > const specs {
{ ast::f_unary, { ast::int_param } },
{ ast::f_binary, { ast::int_param, ast::string_param } },
};

ast::action operator()(It source, ast::action parsed) const {
auto check = [](ast::parameter const& p, ast::param_type expected_type) {
if (p.value.which() != expected_type) {
auto name = boost::lexical_cast<std::string>(expected_type);
throw BadAction(p.position, "Type mismatch (expecting " + name + ")");
}
};

std::size_t i = 0; // initialized so the catch block below never reads an indeterminate value
try {
auto& formals = specs.at(parsed.id);
auto& actuals = parsed.params;
auto arity = formals.size();

for (i=0; i<arity; ++i)
check(actuals.at(i), formals.at(i));

if (actuals.size() > arity)
throw BadAction(actuals.at(arity).position, "Excess parameters");
} catch(std::out_of_range const&) {
throw BadAction(source, "Missing parameter #" + std::to_string(i+1));
}
return parsed;
}
};

template <typename It, typename Skipper = qi::space_type>
struct Parser : qi::grammar<It, ast::action(), Skipper> {
Parser() : Parser::base_type(start) {
using namespace qi;
parameter = qr::iter_pos >> (int_ | lexeme['"' >> *~qi_coding::char_('"') >> '"']);
parameters = -(parameter % ',');
action = actions_ >> '(' >> parameters >> ')';
start = (qr::iter_pos >> action) [ _val = validate_(_1, _2) ];

BOOST_SPIRIT_DEBUG_NODES((parameter)(parameters)(action))
}
private:
qi::rule<It, ast::action(), Skipper> start, action;
qi::rule<It, ast::parameters(), Skipper> parameters;
qi::rule<It, ast::parameter(), Skipper> parameter;
px::function<ValidateAction> validate_;

struct Actions : qi::symbols<char, ast::actionid> {
Actions() { this->add("f_unary", ast::f_unary)("f_binary", ast::f_binary); }
} actions_;

};

int main() {
for (std::string const input : {
// good
"f_unary( 0 )",
"f_binary ( 47, \"hello\")",
// errors
"f_binary ( 47, \"hello\") bogus",
"f_unary ( 47, \"hello\") ",
"f_binary ( 47, \r\n 7) ",
})
{
std::cout << "-----------------------\n";
Parser<It> p;
It f(input.begin()), l(input.end());

auto printErrorContext = [f,l](std::ostream& os, It where) {
auto line = get_current_line(f, where, l);

os << " line:" << get_line(where)
<< ", col:" << get_column(line.begin(), where) << "\n";
while (!line.empty() && std::strchr("\r\n", *line.begin()))
line.advance_begin(1);
os << line << "\n";
os << std::string(std::distance(line.begin(), where), ' ') << "^ --- here\n";
};

ast::action data;
try {
if (qi::phrase_parse(f, l, p > qi::eoi, qi::space, data)) {
std::cout << "Parsed: " << boost::fusion::as_vector(data) << "\n";
}
} catch(qi::expectation_failure<It> const& e) {
printErrorContext(std::cerr << "Expectation failed: " << e.what_, e.first);
} catch(BadAction const& ba) {
printErrorContext(std::cerr << "BadAction: " << ba.what(), ba.where());
}

if (f!=l) {
std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
}
}
}

Printing:

-----------------------
Parsed: (f_unary 0 )
-----------------------
Parsed: (f_binary 47 hello )
-----------------------
Expectation failed: <eoi> line:1, col:25
f_binary ( 47, "hello") bogus
^ --- here
Remaining unparsed: 'f_binary ( 47, "hello") bogus'
-----------------------
BadAction: Excess parameters line:1, col:15
f_unary ( 47, "hello")
^ --- here
Remaining unparsed: 'f_unary ( 47, "hello") '
-----------------------
BadAction: Type mismatch (expecting string) line:2, col:8
7)
^ --- here
Remaining unparsed: 'f_binary ( 47,
7) '
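Independent of Spirit, the line/column bookkeeping that line_pos_iterator and get_current_line take care of can be sketched with the standard library alone. The names Location, locate and caret_report below are hypothetical, and treating '\n' as the only line terminator is a simplifying assumption (the demo above also copes with "\r\n").

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

struct Location {
    std::size_t line;    // 1-based line number
    std::size_t column;  // 1-based column within that line
    std::string text;    // the full line containing the offset
};

// Maps a character offset into `input` to line, column and line text.
inline Location locate(std::string const& input, std::size_t offset) {
    offset = std::min(offset, input.size());
    auto const first = input.begin();
    auto const where = first + static_cast<std::ptrdiff_t>(offset);

    // Line number: one more than the number of newlines before the offset.
    std::size_t const line =
        1 + static_cast<std::size_t>(std::count(first, where, '\n'));

    // Start of line: just past the last newline before the offset.
    auto line_begin = where;
    while (line_begin != first && *(line_begin - 1) != '\n')
        --line_begin;
    auto const line_end = std::find(where, input.end(), '\n');

    return { line,
             static_cast<std::size_t>(where - line_begin) + 1,
             std::string(line_begin, line_end) };
}

// Renders the offending line with a caret under the error column.
inline std::string caret_report(Location const& loc) {
    return loc.text + "\n" + std::string(loc.column - 1, ' ') + "^ --- here\n";
}
```

An exception type like BadAction could carry such an offset instead of an iterator if the input is always available as a single string.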

Questions about Spirit.Qi sequence operator and semantic actions

First, blow-by-blow. See below for an out-of-the-box answer.

Question 1: Why do I have to add a semantic action to the rule sign above?
Isn't char convertible to std::string?

Erm, no: char is not convertible to std::string. See below for other options.
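A quick compile-time check makes the point. The sketch below uses only the standard library; to_string_attr is a hypothetical stand-in for the conversion a semantic action would otherwise have to perform.

```cpp
#include <string>
#include <type_traits>

// There is no implicit conversion from a lone char to std::string...
static_assert(!std::is_convertible<char, std::string>::value,
              "char does not implicitly convert to std::string");

// ...so the conversion has to be spelled out, for example with the
// (count, char) constructor.
inline std::string to_string_attr(char c) { return std::string(1, c); }
```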

Question 2: Why does compilation fail when I try to merge the last two rules
like this:

rule<Iterator, std::string()> floating = -sign >> 
(mantissa >> -(exp | suffix) | +digit >> (exp | suffix));

This is due to the rules for atomic attribute assignment. The parser exposes something like

vector2<optional<string>, variant<
vector2<string, optional<string> >,
vector2<std::vector<char>, optional<string> > > >

or similar (see the documentation for the parsers; I typed this in the browser from memory). This is, obviously, not assignable to string. Use qi::as<> to coerce atomic assignment. For convenience, there is qi::as_string:

floating = qi::as_string [ -sign >> (mantissa >> -(exp | suffix) | 
+digit >> (exp | suffix)) ]

Question 3: Let's say I want to let the attribute of floating be double and
write a semantic action to do the conversion from string to double. How can I
refer to the entire string matched by the rule from inside the semantic
action?

You could use qi::as_string again, but the most appropriate would seem to be to use qi::raw:

floating = qi::raw [ -sign >> (mantissa >> -(exp | suffix) | 
+digit >> (exp | suffix)) ]
[ _val = parse_float(_1) ];

This parser directive exposes the matched input as a single attribute, a boost::iterator_range over the source iterators, so _1 lets you refer to the exact input sequence matched.
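What the conversion itself might look like is independent of Spirit. Here is a minimal sketch, assuming the matched text is handed over either as an iterator pair or as any range with begin()/end() (boost::iterator_range has that shape). The name parse_float_impl is hypothetical, and suffix handling is left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts the characters in [first, last) to double.
// Throws std::invalid_argument if the text is not entirely a number
// (std::stod may also throw std::out_of_range for huge values).
template <typename It>
double parse_float_impl(It first, It last) {
    std::string const text(first, last);
    std::size_t used = 0;
    double const value = std::stod(text, &used);
    if (used != text.size())
        throw std::invalid_argument("trailing characters after number: " + text);
    return value;
}

// Range overload: forwards to the iterator-pair version.
template <typename Range>
double parse_float_impl(Range const& matched) {
    return parse_float_impl(matched.begin(), matched.end());
}
```

To call something like this lazily from a semantic action it would still need wrapping, for example in a boost::phoenix::function.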

Question 4: In the rule floating of Question 2, what does the placeholder _2
refer to and what is its type?

In general, to detect attribute types - that is, when the documentation has you confused or you want to double check your understanding of it - see the answers here:

  • Detecting the parameter types in a Spirit semantic action

Out-of-the-box

Have you looked at using Qi's built-in real_parser<> template, which can be comprehensively customized? It sure looks like you'd want to use that instead of doing custom parsing in your semantic action.

The real_parser template with policies is both fast and very flexible and robust. See also the recent answer Is it possible to read infinity or NaN values using input streams?.

For models of RealPolicies the following expressions must be valid:

Expression                | Semantics
==========================+=============================================================================
RP::allow_leading_dot     | Allow leading dot.
RP::allow_trailing_dot    | Allow trailing dot.
RP::expect_dot            | Require a dot.
RP::parse_sign(f, l)      | Parse the prefix sign (e.g. '-'). Return true if successful, otherwise false.
RP::parse_n(f, l, n)      | Parse the integer at the left of the decimal point. Return true if successful, otherwise false. If successful, place the result into n.
RP::parse_dot(f, l)       | Parse the decimal point. Return true if successful, otherwise false.
RP::parse_frac_n(f, l, n) | Parse the fraction after the decimal point. Return true if successful, otherwise false. If successful, place the result into n.
RP::parse_exp(f, l)       | Parse the exponent prefix (e.g. 'e'). Return true if successful, otherwise false.
RP::parse_exp_n(f, l, n)  | Parse the actual exponent. Return true if successful, otherwise false. If successful, place the result into n.
RP::parse_nan(f, l, n)    | Parse a NaN. Return true if successful, otherwise false. If successful, place the result into n.
RP::parse_inf(f, l, n)    | Parse an Inf. Return true if successful, otherwise false. If successful, place the result into n.

See the example for a compelling idea of how you'd use it.

Compound Attribute generation in Boost::Spirit parse rule

The attribute types resulting from parser expressions are quite well documented, but working them out from the documentation can be disorienting and time-consuming.

Here's a trick: send in a sentinel to detect the attribute type:

struct Sniffer
{
typedef void result_type;

template <typename T>
void operator()(T const&) const { std::cout << typeid(T).name() << "\n"; }
};

Then using the following parser expression

 (input >> (qi::repeat(0,2)[qi::char_(';') >> input])) [ Sniffer() ]

will dump:

N5boost6fusion7vector2ISt6vectorIsSaIsEES2_INS1_IcS4_EESaIS5_EEEE

which c++filt -t will tell you represents:

boost::fusion::vector2<
std::vector<short, std::allocator<short> >,
std::vector<boost::fusion::vector2<char, std::vector<short, std::allocator<short> > >,
std::allocator<boost::fusion::vector2<char, std::vector<short, std::allocator<short> > >
> >
>

See it live on Coliru: http://coliru.stacked-crooked.com/view?id=3e767990571f8d0917aae745bccfa520-5c1d29aa57205c65cfb2587775d52d22

boost::fusion::vector2<std::vector<short, std::allocator<short> >, std::vector<std::vector<short, std::allocator<short> >, std::allocator<std::vector<short, std::allocator<short> > > > >

It might be so surprisingly complicated, in part, because char_(";") could have been ';' (or, more explicitly, lit(';')). Contrast with this (Coliru):

boost::fusion::vector2<
std::vector<short, ... >,
std::vector<std::vector<short, std::allocator<short> >, ... > >

This should answer your question.

Sidenotes: parsing things

Don't underestimate automatic attribute propagation in Spirit. Frequently, you don't have to bother with the exact exposed types of attributes. Instead, rely on the (many) attribute transformations that Spirit uses to assign them to your supplied attribute references.

I trust you know the list operator (%) in Spirit? I'll show you how you can use it without further ado:

vector<vector<short>> data;

qi::parse(f, l, qi::short_ % ',' % ';', data);
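To make the target shape concrete: the nested % expression fills a vector of vectors, with one inner vector per ';'-separated group. Below is a Spirit-free sketch of the same result using only the standard library. The name split_groups is hypothetical, and error handling is reduced to whatever std::stoi throws.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits "1,2,3;2,3,4" into {{1,2,3},{2,3,4}}.
inline std::vector<std::vector<short>> split_groups(std::string const& input) {
    std::vector<std::vector<short>> data;
    std::istringstream groups(input);
    std::string group;
    while (std::getline(groups, group, ';')) {       // outer list: ... % ';'
        std::vector<short> row;
        std::istringstream items(group);
        std::string item;
        while (std::getline(items, item, ','))       // inner list: short_ % ','
            row.push_back(static_cast<short>(std::stoi(item)));
        data.push_back(row);
    }
    return data;
}
```

The hand-written loops are only there for comparison; the single Spirit expression expresses the same structure declaratively.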

Now, if you need to enforce the fact that it may be 1-3 elements, you might employ an eps with a Phoenix action to assert the maximum size:

const string x = "1,2,3;2,3,4;3,4,5";
auto f(begin(x)), l(end(x));

if (qi::parse(f, l,
(qi::eps(phx::size(qi::_val) < 2) > (qi::short_ % ',')) % ';'
, data))
{
cout << karma::format(karma::short_ % ',' % ';', data) << "\n";
}
cout << "remaining unparsed: '" << std::string(f,l) << "'\n";

Prints:

1,2,3;2,3,4
remaining unparsed: ';3,4,5'

Problems with boost::phoenix::bind and boost::phoenix::actors in a semantic action for boost::spirit::qi

A few observations:

  • You don't seem to be using a skipper, so using lexeme is redundant (see Boost spirit skipper issues)

  • You want to know how to detect the type of the attribute exposed by a parser expression: see Detecting the parameter types in a Spirit semantic action

    The types are documented with the parser directives, though, so e.g. as_string[(qi::char_("1-9") >> +qi::char_("0-9"))] results in boost::fusion::vector2<char, std::vector<char> >, which is directly reflected in the error message on GCC:

    boost/phoenix/bind/detail/preprocessed/function_ptr_10.hpp|50 col 39| error: could not convert ‘a0’ from ‘boost::fusion::vector2<char, std::vector<char> >’ to ‘std::vector<char>’

  • Prefer not to mix and match library placeholders/wrappers, e.g. boost::ref and boost::phoenix::ref

  • You seem to be reinventing integer parsing; consider using qi::int_parser instead

  • It seems that the case to parse 0 is missing :)

Assuming you want my_str to simply reflect the input string including number base prefix, I could suggest using:

number =
as_string[(qi::char_("1-9") >> +qi::char_("0-9"))] [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
| as_string[("0x" >> +qi::char_("0-9a-fA-F")) ] [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
| as_string[("0b" >> +qi::char_("0-1")) ] [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
| as_string[("0" >> +qi::char_("0-7")) ] [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
//some other junk
;

However, this could be simplified to:

number = as_string[
(qi::char_("1-9") >> +qi::char_("0-9"))
| ("0x" >> +qi::char_("0-9a-fA-F"))
| ("0b" >> +qi::char_("01"))
| ("0" >> +qi::char_("0-7"))
] [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
;

Now, you would probably like to just parse an integer value instead:

number = 
(
("0x" >> qi::int_parser<int, 16, 1>())
| ("0b" >> qi::int_parser<int, 2, 1>())
| ("0" >> qi::int_parser<int, 8, 1>())
| qi::int_ /* accepts "0" */) [phx::bind(&fPushIntCV, qi::_1, phx::ref(code), phx::ref(variables))]
;

Which handsomely does the conversions[1], and you can just take an int:

void fPushIntCV (int my_number, Code& c, Variables& v) {
std::cout << "fPushIntCV: " << my_number << "\n";
}

[1] (there's also uint_parser and you can parse long, long long etc.; even big integers like boost::multiprecision::cpp_int should be no issue)

Here's a demo program using this, showing that the values are converted correctly (and: "0" is accepted :)): Live On Coliru

int main()
{
Code code;
Variables variables;
Calculator g(code, variables);

for (std::string const input : { "0", "0xef1A", "010", "0b10101" })
{
It f(input.begin()), l(input.end());

if(qi::parse(f, l, g))
std::cout << "Parse success ('" << input << "')\n";
else std::cout << "Parse failed ('" << input << "')\n";

if (f != l)
std::cout << "Input remaining: '" << std::string(f, l) << "'\n";
}
}

Prints

fPushIntCV: 0
Parse success ('0')

fPushIntCV: 61210
Parse success ('0xef1A')

fPushIntCV: 8
Parse success ('010')

fPushIntCV: 21
Parse success ('0b10101')
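For comparison, the prefix dispatch that the alternatives above encode can be written by hand. This is a sketch using only std::stoi. The name parse_number is hypothetical, the 0b prefix is stripped manually because std::stoi does not recognise it, and the input is assumed to be non-negative and well-formed.

```cpp
#include <string>

// Mirrors the grammar's alternatives: "0x" hex, "0b" binary,
// leading "0" octal, anything else decimal (including a lone "0").
inline int parse_number(std::string const& text) {
    if (text.size() > 2 && text[0] == '0' && text[1] == 'x')
        return std::stoi(text.substr(2), nullptr, 16);
    if (text.size() > 2 && text[0] == '0' && text[1] == 'b')
        return std::stoi(text.substr(2), nullptr, 2);
    if (text.size() > 1 && text[0] == '0')
        return std::stoi(text.substr(1), nullptr, 8);
    return std::stoi(text, nullptr, 10);
}
```

Fed the four demo inputs, it yields the same values as printed above.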

Spirit X3, semantic action makes compilation fails with: Attribute does not have the expected size

This is surprising to me too. I'd report it on the mailing list (or the bug tracker) as a potential bug.

Meanwhile, you can "fix" it by supplying an attribute type for dest:

Live On Coliru

#include <boost/fusion/adapted/std_tuple.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>

namespace x3 = boost::spirit::x3;

template <typename T>
void parse(T begin, T end) {
auto dest = x3::rule<struct dest_type, std::tuple<int, int> > {} = '[' >> x3::int_ >> ';' >> x3::int_ >> ']';

auto on_portal = [&](auto& ctx) {
int a, b;
if (auto tup = x3::_attr(ctx)) {
std::tie(a, b) = *tup;
std::cout << "Parsed [" << a << ", " << b << "]\n";
}
};
auto portal = ('P' >> -dest)[on_portal];

auto tiles = +portal;
x3::phrase_parse(begin, end, tiles, x3::eol);
}

int main() {
std::string x = "P[1;2]P[3;4]P[5;6]";
parse(x.begin(), x.end());
}

Prints:

Parsed [1, 2]
Parsed [3, 4]
Parsed [5, 6]

NOTE I changed char_('P') into just lit('P') because I didn't want to complicate the sample dealing with the character in the attribute. Perhaps you didn't mean to have it in the exposed attribute anyways.

Boost spirit, why the as directive is required? (a.k.a. help me understand the attributes compatibility rules)

Otherwise

*(+(qi::char_ - ';') >> ';')

would just expose a std::vector<char> (each kleene-+ would append into the same attribute). As a rule of thumb, kleene-operators always directly pushback into the referenced attribute, which also implies that it expects that attribute to be of container type (the boost::spirit::traits::container_value<> trait is used to detect what the attribute of the repeated parser expression should convert to).
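The difference between the two attribute shapes can be illustrated without Spirit. In the sketch below (hypothetical names, standard library only), flat mimics every repetition appending into one std::vector<char>, while grouped mimics the effect of making each element its own string. Empty runs such as ";;" are not treated the way the grammar would treat them; that is a simplification.

```cpp
#include <string>
#include <vector>

// All characters end up in one container; the element boundaries are lost.
inline std::vector<char> flat(std::string const& input) {
    std::vector<char> out;
    for (char c : input)
        if (c != ';') out.push_back(c);
    return out;
}

// Each ';'-terminated run becomes its own string.
inline std::vector<std::string> grouped(std::string const& input) {
    std::vector<std::string> out;
    std::string current;
    for (char c : input) {
        if (c == ';') { out.push_back(current); current.clear(); }
        else          { current += c; }
    }
    return out;  // a trailing run without ';' is dropped
}
```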

In this case, you might find fusion-adaptation with qi::as_string more elegant: Live On Coliru

struct Block {
vector<string> value;
};

BOOST_FUSION_ADAPT_STRUCT(t::Block,(std::vector<std::string>,value))
// ...

start =
qi::lit('{')
>> *qi::as_string [ +(qi::char_ - ';') >> ';' ]
>> '}'
;

With this, beware of

  • Spirit Qi attribute propagation issue with single-member struct (a limitation/bug)
  • see also: Detecting the parameter types in a Spirit semantic action

Boost spirit x3 tuple member in fusion-adapted struct

It's a nested structure. However you parse into a flat synthesized tuple that doesn't match the AST structure:

In other words ((int, int), int) is not structurally compatible with (int, (int, int)) or even (int, int, int) as your rule would parse.

The mentioned workarounds help with the attribute coercion to reflect the desired (and required) structure.
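The structural mismatch can be reproduced with plain std::tuple. The sketch below uses hypothetical aliases, and regroup stands for the kind of explicit coercion such workarounds perform; it does not use Spirit or Fusion at all.

```cpp
#include <tuple>
#include <type_traits>

using flat_t  = std::tuple<int, int, int>;              // (int, int, int)
using left_t  = std::tuple<std::tuple<int, int>, int>;  // ((int, int), int)
using right_t = std::tuple<int, std::tuple<int, int>>;  // (int, (int, int))

// Same three ints, but none of these shapes converts into another.
static_assert(!std::is_convertible<flat_t, right_t>::value,
              "a flat triple is not a nested pair");
static_assert(!std::is_convertible<left_t, right_t>::value,
              "the nesting has to match element by element");

// The grouping has to be stated explicitly.
inline right_t regroup(flat_t const& f) {
    return right_t(std::get<0>(f),
                   std::make_tuple(std::get<1>(f), std::get<2>(f)));
}
```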


