Parsing into several vector members
There are several ways :)
- Custom attribute traits
- The same using semantic actions
- Everything in semantic actions, at detail level
1. Custom attribute traits
The cleanest, IMO, would be to replace the Fusion Sequence Adaptation (BOOST_FUSION_ADAPT_STRUCT) with custom container attribute traits for Spirit:
namespace boost { namespace spirit { namespace traits {
template<>
struct is_container<ElemParseData, void> : mpl::true_ { };
template<>
struct container_value<ElemParseData, void> {
typedef boost::variant<float, unsigned int> type;
};
template <>
struct push_back_container<ElemParseData, std::vector<float>, void> {
static bool call(ElemParseData& c, std::vector<float> const& val) {
c.verts.insert(c.verts.end(), val.begin(), val.end());
return true;
}
};
template <>
struct push_back_container<ElemParseData, std::vector<unsigned int>, void> {
static bool call(ElemParseData& c, std::vector<unsigned int> const& val) {
c.idx.insert(c.idx.end(), val.begin(), val.end());
return true;
}
};
}}}
Without changes to the grammar, this will simply result in the same effect. However, now you can modify the parser to expect the desired grammar:
vertex = 'v' >> qi::double_ >> qi::double_ >> qi::double_;
elements = 'f' >> qi::int_ >> qi::int_ >> qi::int_;
start = *(vertex | elements);
And because of the traits, Spirit will "just know" how to insert into ElemParseData. See it live on Coliru
2. The same using semantic actions
You can wire it up in semantic actions:
start = *(
vertex [phx::bind(insert, _val, _1)]
| elements [phx::bind(insert, _val, _1)]
);
With insert a member of type inserter:
struct inserter {
template <typename,typename> struct result { typedef void type; };
template <typename Attr, typename Vec>
void operator()(Attr& attr, Vec const& v) const { dispatch(attr, v); }
private:
static void dispatch(ElemParseData& data, std::vector<float> vertices) {
data.verts.insert(data.verts.end(), vertices.begin(), vertices.end());
}
static void dispatch(ElemParseData& data, std::vector<unsigned int> indices) {
data.idx.insert(data.idx.end(), indices.begin(), indices.end());
}
};
This looks largely the same, and it does the same: live on Coliru
3. Everything in semantic actions, at detail level
This is the only solution that doesn't require any kind of plumbing, except perhaps inclusion of boost/spirit/include/phoenix.hpp:
struct objGram : qi::grammar<std::string::const_iterator, ElemParseData(), iso8859::space_type>
{
objGram() : objGram::base_type(start)
{
using namespace qi;
auto add_vertex = phx::push_back(phx::bind(&ElemParseData::verts, _r1), _1);
auto add_index = phx::push_back(phx::bind(&ElemParseData::idx, _r1), _1);
vertex = 'v' >> double_ [add_vertex] >> double_ [add_vertex] >> double_ [add_vertex];
elements = 'f' >> int_ [add_index] >> int_ [add_index] >> int_ [add_index] ;
start = *(vertex(_val) | elements(_val));
}
qi::rule<std::string::const_iterator, ElemParseData(), iso8859::space_type> start;
qi::rule<std::string::const_iterator, void(ElemParseData&), iso8859::space_type> vertex, elements;
} objGrammar;
Note:
- One slight advantage here would be that there is less copying of values
- A disadvantage is that you lose 'atomicity' (if a line fails to parse after, say, the second value, the first two values will have been pushed into the ElemParseData members irrevocably).
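To make the atomicity point concrete, here is a minimal sketch that uses plain iostreams instead of Spirit. The ElemParseData layout is assumed from the snippets above, and parse_line_atomically is a hypothetical helper. Each line is parsed into a local buffer and committed only once the whole line has succeeded:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Assumed stand-in for the ElemParseData struct from the question.
struct ElemParseData {
    std::vector<float> verts;
    std::vector<unsigned int> idx;
};

// Parses one line such as "v 1.0 2.0 3.0" or "f 1 2 3".
// Values go into a local buffer first and are appended to `data`
// only if the whole line parsed, so a failing line leaves `data` untouched.
bool parse_line_atomically(const std::string& line, ElemParseData& data) {
    std::istringstream in(line);
    char tag = 0;
    if (!(in >> tag)) return false;

    if (tag == 'v') {
        std::vector<float> buf(3);
        if (!(in >> buf[0] >> buf[1] >> buf[2])) return false; // nothing committed
        data.verts.insert(data.verts.end(), buf.begin(), buf.end());
        return true;
    }
    if (tag == 'f') {
        std::vector<unsigned int> buf(3);
        if (!(in >> buf[0] >> buf[1] >> buf[2])) return false; // nothing committed
        data.idx.insert(data.idx.end(), buf.begin(), buf.end());
        return true;
    }
    return false; // unknown line type
}
```

Approaches 1 and 2 behave like this sketch because the sub-rule's attribute is only inserted once the sub-rule has matched. Approach 3 pushes each value as soon as it is parsed.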
Side note
There is a bug in the read loop; prefer one of the simpler options:
std::filebuf fb;
if (fb.open("parsetest.txt", std::ios::in))
{
ss << &fb;
fb.close();
}
Or consider boost::spirit::istream_iterator
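Another simple option, sketched here with the standard library only, is to read the whole stream through std::istreambuf_iterator. The helper name slurp is hypothetical:

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Reads everything that is left in `in` into a string, whitespace included.
// istreambuf_iterator reads the raw stream buffer, so the skipws flag
// does not affect it.
std::string slurp(std::istream& in) {
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Given a std::ifstream ifs("parsetest.txt"), calling slurp(ifs) would yield the file contents, which can then be handed to the parser as a std::string.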
Auto concatenation of parse results into vectors
You misplaced the grouping parentheses: expanding
vertexList = *(vertex | comment);
normalList = *(normal | comment);
by eliminating subrules leads to
vertex = *(('v' >> qi::double_ >> qi::double_ >> qi::double_) | comment);
normal = *(("vn" >> qi::double_ >> qi::double_ >> qi::double_) | comment);
or, as I'd prefer, with the comment alternative listed first (this is the #else branch of the sample below).
Full working sample (please make your code samples SSCCE next time? https://meta.stackexchange.com/questions/22754/sscce-how-to-provide-examples-for-programming-questions):
#include <iostream>
#include <iterator>
#include <fstream>
#include <string>
#include <vector>
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx = boost::phoenix;
struct ObjParseData
{
ObjParseData() : verts(), norms() {}
std::vector<float> verts;
std::vector<float> norms;
};
BOOST_FUSION_ADAPT_STRUCT(ObjParseData, (std::vector<float>, verts)(std::vector<float>, norms))
template <typename It, typename Skipper = qi::space_type>
struct parser : qi::grammar<It, ObjParseData(), Skipper>
{
parser() : parser::base_type(start)
{
using namespace qi;
vertex = 'v' >> qi::double_ >> qi::double_ >> qi::double_;
normal = "vn" >> qi::double_ >> qi::double_ >> qi::double_;
comment = '#' >> qi::skip(qi::blank)[ *(qi::print) ];
#if 0
vertexList = *(vertex | comment);
normalList = *(normal | comment);
start = vertexList >> normalList;
#else
vertex = *(comment | ('v' >> qi::double_ >> qi::double_ >> qi::double_));
normal = *(comment | ("vn" >> qi::double_ >> qi::double_ >> qi::double_));
start = vertex >> normal;
#endif
BOOST_SPIRIT_DEBUG_NODE(start);
}
private:
// Use the grammar's own template parameters so the rules match any It/Skipper.
qi::rule<It, ObjParseData(), Skipper> start;
qi::rule<It, std::vector<float>(), Skipper> vertexList;
qi::rule<It, std::vector<float>(), Skipper> normalList;
qi::rule<It, std::vector<float>(), Skipper> vertex;
qi::rule<It, std::vector<float>(), Skipper> normal;
qi::rule<It, Skipper> comment;
};
bool doParse(const std::string& input)
{
typedef std::string::const_iterator It;
auto f(begin(input)), l(end(input));
parser<It, qi::space_type> p;
ObjParseData data;
try
{
bool ok = qi::phrase_parse(f,l,p,qi::space,data);
if (ok)
{
std::cout << "parse success\n";
std::cout << "data: " << karma::format_delimited(
"v: " << karma::auto_ << karma::eol <<
"n: " << karma::auto_ << karma::eol, ' ', data);
}
else std::cerr << "parse failed: '" << std::string(f,l) << "'\n";
if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
return ok;
} catch(const qi::expectation_failure<It>& e)
{
std::string frag(e.first, e.last);
std::cerr << e.what() << "'" << frag << "'\n";
}
return false;
}
int main()
{
std::ifstream ifs("input.txt", std::ios::binary);
ifs.unsetf(std::ios::skipws);
std::istreambuf_iterator<char> f(ifs), l;
bool ok = doParse({ f, l });
}
Output:
parse success
data: v: -1.57 33.809 0.359 -24.012 0.005 21.744
n: 0.0 0.535 0.845 0.833 0.553 0.0
Right way to split an std::string into a vector<string>
For space-separated strings, you can do this:
std::string s = "What is the right way to split a string into a vector of strings";
std::stringstream ss(s);
std::istream_iterator<std::string> begin(ss);
std::istream_iterator<std::string> end;
std::vector<std::string> vstrings(begin, end);
std::copy(vstrings.begin(), vstrings.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
Output:
What
is
the
right
way
to
split
a
string
into
a
vector
of
strings
For a string that has both commas and spaces:
struct tokens: std::ctype<char>
{
tokens(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
typedef std::ctype<char> cctype;
static const cctype::mask *const_rc= cctype::classic_table();
static cctype::mask rc[cctype::table_size];
std::memcpy(rc, const_rc, cctype::table_size * sizeof(cctype::mask));
rc[','] = std::ctype_base::space;
rc[' '] = std::ctype_base::space;
return &rc[0];
}
};
std::string s = "right way, wrong way, correct way";
std::stringstream ss(s);
ss.imbue(std::locale(std::locale(), new tokens()));
std::istream_iterator<std::string> begin(ss);
std::istream_iterator<std::string> end;
std::vector<std::string> vstrings(begin, end);
std::copy(vstrings.begin(), vstrings.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
Output:
right
way
wrong
way
correct
way
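If imbuing a custom locale feels heavy, the same split can be sketched without streams. This is a different technique from the ctype facet above. The helper name split_any is hypothetical, and it relies only on std::string::find_first_of and find_first_not_of:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits `s` at any character contained in `delims`; empty tokens are dropped.
std::vector<std::string> split_any(const std::string& s, const std::string& delims) {
    std::vector<std::string> out;
    std::string::size_type pos = 0;
    // Skip leading delimiters; npos means no token is left.
    while ((pos = s.find_first_not_of(delims, pos)) != std::string::npos) {
        const std::string::size_type end = s.find_first_of(delims, pos);
        out.push_back(s.substr(pos, end - pos)); // end == npos takes the rest
        pos = end;                               // npos here ends the loop
    }
    return out;
}
```

Calling split_any("right way, wrong way, correct way", ", ") yields the same six tokens as the facet-based version.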
How to push all the arguments into result vector when parsing with Spirit::Qi?
Enabling your debugging shows: https://godbolt.org/z/o3nvjz9bG
Not clear enough for me. Let's add an argument rule:
struct Command {
using Arg = std::string;
using Args = std::vector<Arg>;
enum TYPE { NONE, CMD1, CMD2, FAIL };
TYPE type = NONE;
Args args;
};
qi::rule<It, Command::Arg()> arg;
And
none = omit[*blank] >> &(eol | eoi)
>> attr(Command::NONE)
/*>> attr(Command::Args{})*/;
arg = raw[double_] | +~char_(",)\r\n");
cmd1 = lit("CMD1") >> attr(Command::CMD1) //
>> '(' >> arg >> ')';
cmd2 = lit("CMD2") >> attr(Command::CMD2) //
>> '(' >> arg >> ',' >> arg >> ')';
fail = omit[*~char_("\r\n")] //
>> attr(Command::FAIL);
Now we can see https://godbolt.org/z/3Kqr3K41v
<cmd2>
<try>CMD2(identity, 25.5)</try>
<arg>
<try>identity, 25.5)</try>
<success>, 25.5)</success>
<attributes>[[i, d, e, n, t, i, t, y]]</attributes>
</arg>
<arg>
<try>25.5)</try>
<success>)</success>
<attributes>[[2, 5, ., 5]]</attributes>
</arg>
<success></success>
<attributes>[[CMD2, [[i, d, e, n, t, i, t, y]]]]</attributes>
</cmd2>
Clearly, both arguments are parsed, but only one is assigned. The sad fact is that you're actively confusing the rule, by adapting a two-element struct and parsing a sequence of 3 elements.
You can get this to work, but you'd have to help it (e.g. with transform_attribute, attr_cast<> or a separate rule):
arg = raw[double_] | +~char_(",)\r\n");
args = arg % ',';
cmd1 = lit("CMD1") >> attr(Command::CMD1) //
>> '(' >> arg >> ')';
cmd2 = lit("CMD2") >> attr(Command::CMD2) //
>> '(' >> args >> ')';
Now you get:
<cmd2>
<try>CMD2(identity, 25.5)</try>
<args>
<try>identity, 25.5)</try>
<arg>
<try>identity, 25.5)</try>
<success>, 25.5)</success>
<attributes>[[i, d, e, n, t, i, t, y]]</attributes>
</arg>
<arg>
<try> 25.5)</try>
<success>)</success>
<attributes>[[ , 2, 5, ., 5]]</attributes>
</arg>
<success>)</success>
<attributes>[[[i, d, e, n, t, i, t, y], [ , 2, 5, ., 5]]]</attributes>
</args>
<success></success>
<attributes>[[CMD2, [[i, d, e, n, t, i, t, y], [ , 2, 5, ., 5]]]]</attributes>
</cmd2>
This hints at an obvious improvement, which is to simplify the grammar:
none = omit[*blank] >> &(eol | eoi) >> attr(Command{Command::NONE, {}});
fail = omit[*~char_("\r\n")] >> attr(Command::FAIL);
arg = raw[double_] | +~char_(",)\r\n");
args = '(' >> arg % ',' >> ')';
cmd = no_case[type] >> -args;
start = skip(blank)[(cmd|none|fail) % eol] > eoi;
Then add validation to the commands after the fact.
Demo
Live On Compiler Explorer
//#define BOOST_SPIRIT_DEBUG
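#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>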
namespace qi = boost::spirit::qi;
struct Command {
using Arg = std::string;
using Args = std::vector<Arg>;
enum Type { NONE, CMD1, CMD2, FAIL };
Type type = NONE;
Args args;
friend std::ostream& operator<<(std::ostream& os, Type type) {
switch(type) {
case NONE: return os << "NONE";
case CMD1: return os << "CMD1";
case CMD2: return os << "CMD2";
case FAIL: return os << "FAIL";
default: return os << "???";
}
}
friend std::ostream& operator<<(std::ostream& os, Command const& cmd) {
os << cmd.type << "(";
auto sep = "";
for (auto& arg : cmd.args)
os << std::exchange(sep, ", ") << std::quoted(arg);
return os << ")";
}
};
using Commands = std::vector<Command>;
BOOST_FUSION_ADAPT_STRUCT(Command, type, args)
template <typename It> struct Parser : qi::grammar<It, Commands()> {
Parser() : Parser::base_type(start) {
using namespace qi;
none = omit[*blank] >> &(eol | eoi) >> attr(Command{Command::NONE, {}});
fail = omit[*~char_("\r\n")] >> attr(Command::FAIL);
arg = raw[double_] | +~char_(",)\r\n");
args = '(' >> arg % ',' >> ')';
cmd = no_case[type] >> -args;
start = skip(blank)[(cmd|none|fail) % eol] > eoi;
BOOST_SPIRIT_DEBUG_NODES((start)(fail)(none)(cmd)(arg)(args))
}
private:
struct type_sym : qi::symbols<char, Command::Type> {
type_sym() { this->add//
("cmd1", Command::CMD1)
("cmd2", Command::CMD2);
}
} type;
qi::rule<It, Command::Arg()> arg;
qi::rule<It, Command::Args()> args;
qi::rule<It, Command(), qi::blank_type> cmd, none, fail;
qi::rule<It, Commands()> start;
};
Commands parse(std::string const& text)
{
using It = std::string::const_iterator;
static const Parser<It> parser;
Commands commands;
It first = text.begin(), last = text.end();
if (!qi::parse(first, last, parser, commands))
throw std::runtime_error("command parse error");
return commands;
}
int main()
{
try {
for (auto& cmd : parse(R"(
CMD1(some ad hoc text)
this is a bogus line
cmd2(identity, 25.5))"))
std::cout << cmd << "\n";
} catch (std::exception const& e) {
std::cout << e.what() << "\n";
}
}
Prints
NONE()
CMD1("some ad hoc text")
FAIL()
CMD2("identity", " 25.5")
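The "validation after the fact" step is not shown in the demo. A minimal sketch of what it might look like follows. It assumes the arities implied by the original cmd1 and cmd2 rules (one and two arguments), and is_valid is a hypothetical helper:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors the Command struct from the demo (printing operators omitted).
struct Command {
    using Arg  = std::string;
    using Args = std::vector<Arg>;
    enum Type { NONE, CMD1, CMD2, FAIL };
    Type type = NONE;
    Args args;
};

// Assumed arities: NONE takes no arguments, CMD1 one, CMD2 two.
// A FAIL entry is never valid.
bool is_valid(const Command& cmd) {
    switch (cmd.type) {
        case Command::NONE: return cmd.args.empty();
        case Command::CMD1: return cmd.args.size() == 1;
        case Command::CMD2: return cmd.args.size() == 2;
        case Command::FAIL: return false;
    }
    return false; // unreachable for valid enum values
}
```

After parsing, each element of the resulting Commands vector can be passed to is_valid, which keeps the grammar simple and leaves the reporting of bad lines to the caller.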
Split vector to multiple array/vector C++
Edit: I removed a verbose transposing function.
I assume that you want to convert std::vector<std::string> to a 2D matrix std::vector<std::vector<int>>.
For instance, for your example, the desired result is assumed to be arr1 = {0,1,...}, arr2 = {14,2,...} and arr3 = {150,220,...}.
First, we can use std::istream_iterator to extract integers from strings. We can also apply the range constructor to create a std::vector<int> corresponding to each string.
So the following function should work for you, and it does not look like spaghetti code, at least to me.
This function first extracts the two integer arrays {0,14,150,...} and {1,2,220,...} from the passed string vector v, one row per string.
Since a default-constructed std::istream_iterator is an end-of-stream iterator, each range constructor reads its string until it fails to read the next value.
Finally, the transposed matrix is returned:
#include <vector>
#include <string>
#include <sstream>
#include <iterator>
template <typename T>
auto extractNumbers(const std::vector<std::string>& v)
{
std::vector<std::vector<T>> extracted;
extracted.reserve(v.size());
for(auto& s : v)
{
std::stringstream ss(s);
std::istream_iterator<T> begin(ss), end; //defaulted end-of-stream iterator.
extracted.emplace_back(begin, end);
}
// this also validates following access to extracted[0].
if(extracted.empty()){
return extracted;
}
decltype(extracted) transposed(extracted[0].size());
for(std::size_t i=0; i<transposed.size(); ++i){
for(std::size_t j=0; j<extracted.size(); ++j){
transposed.at(i).push_back(std::move(extracted.at(j).at(i)));
}
}
return transposed;
}
Then you can extract integers from a string vector as follows:
DEMO
std::vector<std::string> v(n);
v[0] = "0 14 150";
v[1] = "1 2 220";
...
v[n-1] = "...";
auto matrix = extractNumbers<int>(v);
where matrix[0] is arr1, matrix[1] is arr2, and so on.
We can also cheaply take over a row's storage, without copying its elements, with auto arr1 = std::move(matrix[0]);.
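To illustrate how the range constructor stops at the first value it cannot read, here is a small sketch of the per-string step on its own. The helper name leading_ints is hypothetical:

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated ints from `s` until extraction fails.
std::vector<int> leading_ints(const std::string& s) {
    std::istringstream ss(s);
    std::istream_iterator<int> first(ss), last; // `last` is the end-of-stream iterator
    return std::vector<int>(first, last);
}
```

For "0 14 150" this gives {0, 14, 150}, while "1 2 x 3" gives only {1, 2}, because reading stops at x. In extractNumbers, a row that comes out shorter than the first one makes the .at(i) access throw std::out_of_range, so malformed input is reported instead of being silently accepted.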
(R) Parse character vector and split into two separate columns
Use separate as shown below. Note that this requires tidyr 0.8.2 or later. Earlier versions did not support NA in the into argument.
library(dplyr)
library(tidyr)
table %>%
separate(var1, into = c("mean1", "sd1", NA), sep = "[ ()]+") %>%
separate(var2, into = c("mean2", "sd2", NA), sep = "[ ()]+")
giving:
# A tibble: 3 x 4
  mean1 sd1   mean2 sd2
  <chr> <chr> <chr> <chr>
1 27.0  3.1   171.4 9.0
2 27.0  3.2   176.8 7.2
3 27.1  3.0   165.0 6.2
Parse into a vector<vector<double>> with boost::spirit
The answer is yes.
It is actually quite trivial to parse into vector<vector<double> >.
The rule definition requires a function type (a signature such as vector<vector<double> >()), not the bare type. This is simply explained here. A more thorough explanation can probably be found in the documentation of boost::phoenix.
The output of the program now shows the parsed values nicely:
parse success.
0, 5011, 10000, 15000, 20000, 25000,
-40, 0, 20, 40,
Base:
200, 175, 170, 165, 160, 150,
200, 175, 170, 165, 160, 150,
165, 165, 160, 155, 145, 145,
160, 155, 150, 145, 145, 140,
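The point about the rule needing a function type can be checked without Spirit. The sketch below only shows what the syntax T() means as a type. The attribute_of helper is hypothetical and is not how Spirit extracts the attribute internally:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

using Row    = std::vector<double>;
using Matrix = std::vector<Row>;

// `Matrix()` names a function type: no parameters, returns Matrix.
// This is the form a Qi rule signature takes, e.g. qi::rule<It, Matrix()>.
static_assert(std::is_function<Matrix()>::value, "Matrix() is a function type");
static_assert(!std::is_function<Matrix>::value, "Matrix alone is an object type");

// Illustration only: the attribute is the return type of the signature.
template <typename Signature> struct attribute_of;
template <typename R, typename... Args>
struct attribute_of<R(Args...)> { using type = R; };

static_assert(std::is_same<attribute_of<Matrix()>::type, Matrix>::value,
              "the signature's return type is the attribute type");
```

So in a declaration such as qi::rule<It, Matrix()>, the parentheses are part of the type: they state that the rule synthesizes a Matrix and takes no inherited attributes.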