Is There a Good Python Library That Can Parse C++

Is there a good Python library that can parse C++?

C++ is notoriously hard to parse. Most people who try to do this properly end up taking apart a compiler. In fact this is (in part) why Clang, the LLVM project's C/C++/Objective-C front end, was started: Apple needed a way to parse C++ for use in Xcode that matched the way the compiler parsed it.

That's why there are projects like GCC-XML, which you could combine with a Python XML library.
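As a rough sketch of that combination, the snippet below reads an XML dump with the standard library's xml.etree.ElementTree and rebuilds function signatures from it. The SAMPLE_XML fragment is hand-written and heavily simplified; its element and attribute names (Function, Argument, FundamentalType, id, returns, type) are assumptions for illustration, so check them against the output of the tool you actually run.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified fragment (an assumption, not real tool output).
SAMPLE_XML = """\
<GCC_XML>
  <FundamentalType id="_1" name="int"/>
  <FundamentalType id="_2" name="double"/>
  <Function id="_3" name="scale" returns="_2">
    <Argument name="value" type="_2"/>
    <Argument name="factor" type="_1"/>
  </Function>
</GCC_XML>
"""

def function_signatures(xml_text):
    """Return one 'ret name(args)' string per Function element."""
    root = ET.fromstring(xml_text)
    # Map type ids to readable names so references can be resolved.
    type_names = {el.get("id"): el.get("name")
                  for el in root.iter("FundamentalType")}
    signatures = []
    for fn in root.iter("Function"):
        args = ", ".join(
            f'{type_names.get(arg.get("type"), "?")} {arg.get("name")}'
            for arg in fn.findall("Argument")
        )
        ret = type_names.get(fn.get("returns"), "?")
        signatures.append(f'{ret} {fn.get("name")}({args})')
    return signatures

print(function_signatures(SAMPLE_XML))
```

The hard part (understanding C++) stays in the compiler; the Python side only walks a tree of already-resolved declarations.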

Some non-compiler projects that seem to do a pretty good job at parsing C++ are:

  • Eclipse CDT
  • OpenGrok
  • Doxygen

Parsing C code using Python

Take a look at this link for an extensive list of parsing tools available for Python. Specifically, for parsing C code, try pycparser.
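A minimal sketch of what that can look like, assuming pycparser is installed and the input is already preprocessed (pycparser does not handle #include or macros itself). The struct in SOURCE and the StructCollector helper are made up for illustration.

```python
from pycparser import c_ast, c_generator, c_parser

# Hypothetical, already-preprocessed C input.
SOURCE = """
struct header {
    unsigned char version;
    unsigned short length;
    int payload[4];
};
"""

class StructCollector(c_ast.NodeVisitor):
    """Collect (field name, rendered declaration) pairs per struct."""

    def __init__(self):
        self.structs = {}
        self._generator = c_generator.CGenerator()

    def visit_Struct(self, node):
        if node.decls:  # skip forward declarations and bare references
            self.structs[node.name] = [
                (decl.name, self._generator.visit(decl))
                for decl in node.decls
            ]
        self.generic_visit(node)  # descend into nested structs

def collect_structs(source):
    ast = c_parser.CParser().parse(source, filename="<string>")
    collector = StructCollector()
    collector.visit(ast)
    return collector.structs

for struct_name, members in collect_structs(SOURCE).items():
    print(struct_name, [name for name, _ in members])
```

Rendering each member through CGenerator avoids hand-walking the type nodes, which differ for arrays, pointers, and plain identifiers.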

How to choose a proper Python parser generator to parse C struct definitions?

An alternative that might feel over-ambitious at first, but could serve you very well in the long term, is to:

  • Redefine the protocol in some higher-level language, for instance some custom XML
  • Generate both the C struct definitions and any required Python versions from the same source.
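The two steps above can be sketched with the standard library alone. Everything specific here is an assumption for illustration: the XML schema (message and field elements), the TYPES table, and the little-endian, unpadded layout. A real protocol would need to pin down byte order and padding so that the C compiler and Python's struct module agree.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical single source of truth for the protocol.
PROTOCOL_XML = """\
<message name="header">
  <field name="version" type="uint8"/>
  <field name="flags" type="uint8"/>
  <field name="length" type="uint16"/>
  <field name="sequence" type="uint32"/>
</message>
"""

# Assumed mapping: XML type -> (C type, struct format character).
TYPES = {
    "uint8": ("uint8_t", "B"),
    "uint16": ("uint16_t", "H"),
    "uint32": ("uint32_t", "I"),
}

def load_fields(xml_text):
    """Return the message name and its (field name, type) pairs."""
    root = ET.fromstring(xml_text)
    fields = [(f.get("name"), f.get("type")) for f in root.findall("field")]
    return root.get("name"), fields

def c_struct(name, fields):
    """Render the C struct definition as text."""
    lines = [f"struct {name} {{"]
    lines += [f"    {TYPES[ftype][0]} {fname};" for fname, ftype in fields]
    lines.append("};")
    return "\n".join(lines)

def python_struct(fields, byte_order="<"):
    """Build the matching struct.Struct ('<' = little-endian, no padding)."""
    return struct.Struct(byte_order + "".join(TYPES[t][1] for _, t in fields))

name, fields = load_fields(PROTOCOL_XML)
print(c_struct(name, fields))

packer = python_struct(fields)
payload = packer.pack(1, 0, 8, 42)
print(packer.size, packer.unpack(payload))
```

With this approach no C parser is needed at all, since the C header becomes an output rather than an input.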

How to parse C++ source in Python?

I'll simply recommend Clang.

It's a compiler built as a set of C++ libraries, designed with ease of reuse in mind. Notably, this means you can use it solely for parsing and generating an Abstract Syntax Tree. It takes care of all the tedious operator overloading resolution, template instantiation and so on.

Clang exports a C-based interface, which is extended with Python bindings. The interface is supposed to be quite rich, but I haven't used it myself. Anyway, contributions are welcome if you wish to help extend it.
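A small sketch of using those bindings, assuming the clang Python package and a matching libclang shared library are installed. The declared_functions helper, the sample.cpp file name, and the -std=c++17 flag are illustrative choices, not anything the bindings require.

```python
def declared_functions(source, filename="sample.cpp"):
    """Return (name, line) pairs for functions and methods in `source`."""
    # Imported lazily: the bindings need the native libclang library,
    # which may not be present on every machine.
    from clang import cindex

    index = cindex.Index.create()
    # Parse from memory by passing the text as an "unsaved file".
    tu = index.parse(
        filename,
        args=["-std=c++17"],
        unsaved_files=[(filename, source)],
    )
    wanted = (cindex.CursorKind.FUNCTION_DECL, cindex.CursorKind.CXX_METHOD)
    return [
        (cursor.spelling, cursor.location.line)
        for cursor in tu.cursor.walk_preorder()
        if cursor.kind in wanted
        # Skip declarations pulled in from included headers.
        and cursor.location.file is not None
        and cursor.location.file.name == filename
    ]
```

Called on a string of C++ source, it returns one pair per function or method declared in that string; the same traversal can be extended to classes, templates, or any other cursor kind.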

What is the best way to parse a Python script file in C/C++ code

If you want to do syntax analysis, you should look into Python's grammar (and maybe use Bison as a parser generator).

Python grammar specs:

  • http://docs.python.org/reference/grammar.html
  • http://inst.eecs.berkeley.edu/~cs164/sp10/python-grammar.html

Which tool to use to parse programming languages in Python?

I really like pyPEG. Its error reporting isn't very friendly, but it can add source code locations to the AST.

pyPEG doesn't have a separate lexer, which would make parsing Python itself hard (I think CPython recognises indent and dedent in the lexer), but I've used pyPEG to build a parser for a subset of C# with surprisingly little work.
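The point about indentation can be checked with the standard library's tokenize module, which reports layout changes as INDENT and DEDENT tokens before any grammar rule is applied. A short sketch; the SOURCE snippet and the layout_tokens helper are made up for illustration.

```python
import io
import tokenize

SOURCE = "if x:\n    y = 1\nz = 2\n"

def layout_tokens(source):
    """Return the names of the INDENT/DEDENT tokens, in order."""
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]

print(layout_tokens(SOURCE))  # ['INDENT', 'DEDENT']
```

A scannerless PEG parser has no such stage, so block structure based on indentation would have to be encoded in the grammar rules themselves.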

An example adapted from fdik.org/pyPEG/: A simple language like this:

function fak(n) {
    if (n==0) { // 0! is 1 by definition
        return 1;
    } else {
        return n * fak(n - 1);
    };
}

A pyPEG parser for that language:

# pyPEG 1.x style grammar: each function returns the pattern for one rule.
import re
from pyPEG import keyword

def comment():        return [re.compile(r"//.*"),
                              re.compile(r"/\*.*?\*/", re.S)]
def literal():        return re.compile(r'\d*\.\d*|\d+|".*?"')
def symbol():         return re.compile(r"\w+")
def operator():       return re.compile(r"\+|\-|\*|\/|\=\=")
def operation():      return symbol, operator, [literal, functioncall]
def expression():     return [literal, operation, functioncall]
def expressionlist(): return expression, -1, (",", expression)
def returnstatement(): return keyword("return"), expression
def ifstatement():    return (keyword("if"), "(", expression, ")", block,
                              keyword("else"), block)
def statement():      return [ifstatement, returnstatement], ";"
def block():          return "{", -2, statement, "}"
def parameterlist():  return "(", symbol, -1, (",", symbol), ")"
def functioncall():   return symbol, "(", expressionlist, ")"
def function():       return keyword("function"), symbol, parameterlist, block
def simpleLanguage(): return function

C / C++ equivalents to the Python Standard Library

The Poco library is more like other languages' standard libraries.

Actually the Poco web site's logo says "C++ now comes with batteries included!", which seems to be precisely what you're asking for.

I didn't like it when I tried because I found it too C-like and with too many dependencies between parts (difficult to single out just the functionality you want).

But there are many people and firms using it, so it seems I'm in the minority and you will perhaps find it very useful.

In addition, as others have mentioned, check out Boost for data structures, parsers, and indeed an interface to Python.


Generating C declarations

Eli Bendersky's Python library pycparser has an example that parses C code into an AST, pretty-prints the AST, and converts it back to C code.

The specific example I am thinking of can be found here, and the part I think is of interest to you is this:

from pycparser import c_parser, c_generator

def _zz_test_translate():
    # internal use
    src = r'''
    void f(char * restrict joe){}
    int main(void)
    {
        unsigned int long k = 4;
        int p = - - k;
        return 0;
    }
    '''
    parser = c_parser.CParser()
    ast = parser.parse(src)        # C source -> AST
    ast.show()                     # pretty-print the AST
    generator = c_generator.CGenerator()

    print(generator.visit(ast))    # AST -> C source

This example parses C source from a string (src) into an AST, pretty-prints the AST with ast.show(), then turns the AST back into C source code and prints it.

This example uses the c_generator module which can be found here.
It is a class with a visit-method that traverses an AST and spits out C code.

I've used it for instrumenting C code, e.g. inserting code at the beginning of function calls or return-points. It should be possible to define struct-decls programmatically and then convert them to C.

From your description it sounds like this could be of use? Either directly: from AST-format -> C source, or as an inspirational source for your own work.

Can you provide a few examples of the struct-decls you wish to convert to C code?


