Extracting C/C++ Function Prototypes

The tool cproto does what you want and lets you tune the output to your requirements.

Note: cproto only works for C files, not C++.

List all available function prototypes from within C/C++?

Neither C nor C++ supports reflection yet, so a program cannot list its own function prototypes at run time. If you are looking for starting points, the Boost.Reflect library can add reflection to your code to a certain extent. Clang's LibTooling (ClangTooling) and libclang are libraries that let you do automated code analysis.

Easy way to get function prototypes?

You can use the clang libraries to parse C/C++ source code and extract any information you want, in particular function prototypes.

Thanks to clang's library-based architecture, it is easy to reuse just the parts you need. In your case these are the frontend libraries (liblex, libparse, libsema). I think this is a more feasible approach than using a hand-written scanner, considering the difficulties you mentioned (typedefs, defines, etc.).

clang can also be used as a tool to parse the source code and output the AST in XML form. For example, if you have the file test.cpp:

void foo() {}

int main()
{
    foo();
}

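and invoke clang++ -Xclang -ast-print-xml -fsyntax-only test.cpp, you'll get a file test.xml similar to the following (irrelevant parts omitted for brevity). Note that -ast-print-xml exists only in old clang releases; newer versions dropped it and offer -Xclang -ast-dump (or -Xclang -ast-dump=json) instead.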

<?xml version="1.0"?>
<CLANG_XML>
  <TranslationUnit>
    <Function id="_1D" file="f2" line="1" col="6" context="_2"
              name="foo" type="_12" function_type="_1E" num_args="0">
    </Function>
    <Function id="_1F" file="f2" line="3" col="5" context="_2"
              name="main" type="_21" function_type="_22" num_args="0">
    </Function>
  </TranslationUnit>
  <ReferenceSection>
    <Types>
      <FunctionType result_type="_12" id="_1E"/>
      <FundamentalType kind="int" id="_21"/>
      <FundamentalType kind="void" id="_12"/>
      <FunctionType result_type="_21" id="_22"/>
      <PointerType type="_12" id="_10"/>
    </Types>
    <Files>
      <File id="f2" name="test.cpp"/>
    </Files>
  </ReferenceSection>
</CLANG_XML>
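
To show how the ids in such a dump tie together, here is a minimal Python sketch that walks the XML above and prints one prototype per Function element. It assumes exactly the layout shown in this answer (Function, FunctionType and FundamentalType elements linked by id); the helper name list_prototypes is made up for illustration, and parameters are not resolved because the sample contains none.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the XML dump shown above.
SAMPLE = """<?xml version="1.0"?>
<CLANG_XML>
  <TranslationUnit>
    <Function id="_1D" name="foo" type="_12" function_type="_1E" num_args="0"/>
    <Function id="_1F" name="main" type="_21" function_type="_22" num_args="0"/>
  </TranslationUnit>
  <ReferenceSection>
    <Types>
      <FunctionType result_type="_12" id="_1E"/>
      <FundamentalType kind="int" id="_21"/>
      <FundamentalType kind="void" id="_12"/>
      <FunctionType result_type="_21" id="_22"/>
    </Types>
  </ReferenceSection>
</CLANG_XML>
"""


def list_prototypes(xml_text):
    """Return 'ret name()' strings for every Function element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Index every type element by its id so references can be followed.
    types = {t.get("id"): t for t in root.find("ReferenceSection/Types")}
    protos = []
    for fn in root.iter("Function"):
        ftype = types[fn.get("function_type")]
        result = types[ftype.get("result_type")]
        # Only fundamental return types appear in the sample; others show as '?'.
        ret = result.get("kind", "?")
        n = fn.get("num_args", "0")
        args = "" if n == "0" else "/* %s args */" % n
        protos.append("%s %s(%s)" % (ret, fn.get("name"), args))
    return protos


if __name__ == "__main__":
    for proto in list_prototypes(SAMPLE):
        print(proto)
```

For the sample this follows foo to FunctionType _1E and on to the void type, and main to FunctionType _22 and on to int.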

I don't think extracting this information from binaries is possible, at least for symbols with C linkage: they are not name-mangled, so the symbol name carries no information about parameter or return types.

How to extract a single function from a source file

Why don't you write a small Perl/PHP/Python script, or even a small C++, Java or C# program, that does that?

I don't know of any ready-made tools for this, but code that reads the file and extracts a function body from a C++ source file should not take more than about 20 lines. The only difficult part is locating the beginning of the function, and that should be a relatively simple task with a regular expression. After that, all you need to do is iterate through the rest of the file, keeping track of opening and closing curly braces; when you reach the brace that closes the function body, you're done.
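
A minimal Python sketch of that approach might look like the following. The function name extract_function is invented for this example, and the code makes simplifying assumptions: braces inside strings, character literals or comments are not skipped, parameter lists containing parentheses (function pointers) are not matched, and a return type on a separate line is not included in the result.

```python
import re


def extract_function(source, name):
    """Return the text of the first definition of `name`, or None if not found."""
    # A definition is taken to be: name, a parameter list without nested
    # parentheses, then an opening brace. A call such as `foo();` has no
    # brace after the closing parenthesis, so it is not matched.
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*\([^;{}()]*\)\s*\{")
    match = pattern.search(source)
    if match is None:
        return None
    # Back up to the start of the line so the return type is included.
    start = source.rfind("\n", 0, match.start()) + 1
    depth = 0
    # Start at the opening brace and count until it is balanced again.
    for i in range(match.end() - 1, len(source)):
        ch = source[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return source[start:i + 1]
    return None  # unbalanced braces


if __name__ == "__main__":
    SRC = "void foo() {}\n\nint main()\n{\n    foo();\n}\n"
    print(extract_function(SRC, "main"))
```

For anything beyond quick one-off use, a real parser such as clang handles the cases this sketch ignores.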

Function prototype filters

Apparently ctags can do that!

ctags -x --c-kinds=f main.c

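There's some extra stuff in the output (tag name, kind, line number and file name), but it can be stripped with sed, awk, or cut. With cut, the column offset depends on how wide those leading fields are in your output: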

ctags -x --c-kinds=f cards.c | cut -c 51-
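
Because a fixed column offset is brittle, here is a small Python sketch that splits each line on whitespace instead. It assumes the cross-reference layout of tag name, kind, line number, file name and source line, and that the file name contains no spaces; the sample lines and the helper name prototypes_from_xref are invented for illustration, not real output.

```python
def prototypes_from_xref(xref_text, kind="function"):
    """Return the source-line field of every tag of the given kind."""
    protos = []
    for line in xref_text.splitlines():
        # Four splits leave the source line (which contains spaces) intact.
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # blank or malformed line
        name, tag_kind, lineno, filename, source = fields
        if tag_kind == kind:
            protos.append(source.strip())
    return protos


# Made-up sample in the assumed `ctags -x` layout.
SAMPLE_XREF = (
    "deal             function     12 cards.c          void deal(int n)\n"
    "main             function     30 cards.c          int main(void)\n"
    "MAX_CARDS        macro         3 cards.c          #define MAX_CARDS 52\n"
)

if __name__ == "__main__":
    for proto in prototypes_from_xref(SAMPLE_XREF):
        print(proto)
```

Note that ctags only reports the line on which the tag appears, so a prototype that spans several source lines comes out truncated.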

How to extract function prototypes from an elf file?

GDB knows the signature of a function through DWARF debug information, so this only works if the ELF file was built with debug info. readelf -w ELF would dump it. You'd probably want to read Introduction to the DWARF Debugging Format by Michael J. Eager. Using pyelftools you can explore and experiment with DWARF from an interactive Python session.

To extract function prototypes, you want the subprogram debug information entries. An example in the DWARF format tutorial is:

strndup.c

#include "ansidecl.h"
#include <stddef.h>

extern size_t strlen (const char*);
extern PTR malloc (size_t);
extern PTR memcpy (PTR, const PTR, size_t);

char *
strndup (const char *s, size_t n)
{
  char *result;
  size_t len = strlen (s);

  if (n < len)
    len = n;

  result = (char *) malloc (len + 1);
  if (!result)
    return 0;

  result[len] = '\0';
  return (char *) memcpy (result, s, len);
}

DWARF description for strndup.c

<1>: DW_TAG_base_type
    DW_AT_name = int
    DW_AT_byte_size = 4
    DW_AT_encoding = signed
<2>: DW_TAG_typedef
    DW_AT_name = size_t
    DW_AT_type = <3>
<3>: DW_TAG_base_type
    DW_AT_name = unsigned int
    DW_AT_byte_size = 4
    DW_AT_encoding = unsigned
<4>: DW_TAG_base_type
    DW_AT_name = long int
    DW_AT_byte_size = 4
    DW_AT_encoding = signed
<5>: DW_TAG_subprogram
    DW_AT_sibling = <10>
    DW_AT_external = 1
    DW_AT_name = strndup
    DW_AT_prototyped = 1
    DW_AT_type = <10>
    DW_AT_low_pc = 0
    DW_AT_high_pc = 0x7b
<6>: DW_TAG_formal_parameter
    DW_AT_name = s
    DW_AT_type = <12>
    DW_AT_location = (DW_OP_fbreg: 0)
<7>: DW_TAG_formal_parameter
    DW_AT_name = n
    DW_AT_type = <2>
    DW_AT_location = (DW_OP_fbreg: 4)
<8>: DW_TAG_variable
    DW_AT_name = result
    DW_AT_type = <10>
    DW_AT_location = (DW_OP_fbreg: -28)
<9>: DW_TAG_variable
    DW_AT_name = len
    DW_AT_type = <2>
    DW_AT_location = (DW_OP_fbreg: -24)
<10>: DW_TAG_pointer_type
    DW_AT_byte_size = 4
    DW_AT_type = <11>
<11>: DW_TAG_base_type
    DW_AT_name = char
    DW_AT_byte_size = 1
    DW_AT_encoding = signed char
<12>: DW_TAG_pointer_type
    DW_AT_byte_size = 4
    DW_AT_type = <13>
<13>: DW_TAG_const_type
    DW_AT_type = <11>
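
To make the link between these entries and a prototype concrete, the following Python sketch rebuilds the strndup signature by following the DW_AT_type references. The DIES table is a hand-transcribed subset of the listing above, not output from a real DWARF reader, and the "params" key is an assumption of this sketch (in real DWARF the formal parameters are children of the subprogram entry). Void pointers and const-qualified pointers such as char *const are not handled.

```python
# Hand-transcribed subset of the listing above, keyed by entry number.
DIES = {
    2: {"tag": "typedef", "name": "size_t", "type": 3},
    3: {"tag": "base_type", "name": "unsigned int"},
    # "params" stands in for the child entries of the subprogram (assumption).
    5: {"tag": "subprogram", "name": "strndup", "type": 10, "params": [6, 7]},
    6: {"tag": "formal_parameter", "name": "s", "type": 12},
    7: {"tag": "formal_parameter", "name": "n", "type": 2},
    10: {"tag": "pointer_type", "type": 11},
    11: {"tag": "base_type", "name": "char"},
    12: {"tag": "pointer_type", "type": 13},
    13: {"tag": "const_type", "type": 11},
}


def type_name(ref):
    """Follow DW_AT_type references until a named type is reached."""
    die = DIES[ref]
    tag = die["tag"]
    if tag in ("base_type", "typedef"):
        return die["name"]  # a typedef is printed by name, not expanded
    if tag == "pointer_type":
        return type_name(die["type"]) + " *"
    if tag == "const_type":
        return "const " + type_name(die["type"])
    raise ValueError("unhandled tag: " + tag)


def declare(type_str, name):
    """Join a type and a name in the usual C style ('char *p', 'int n')."""
    sep = "" if type_str.endswith("*") else " "
    return type_str + sep + name


def prototype(ref):
    """Build a C prototype string for a subprogram entry."""
    die = DIES[ref]
    params = ", ".join(
        declare(type_name(DIES[p]["type"]), DIES[p]["name"]) for p in die["params"]
    )
    return declare(type_name(die["type"]), die["name"]) + "(" + params + ")"


if __name__ == "__main__":
    print(prototype(5))
```

The same walk applies when the entries come from a real reader such as pyelftools: start at each DW_TAG_subprogram, then resolve the return type and each DW_TAG_formal_parameter through its DW_AT_type chain.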

For a more complete sample implementation, take a look at this C reflection library by Petr Machata. It has the code to do what you want with the following caveats:

  • Reflection runs in-process instead of out-of-process like GDB
  • It depends on libdw and libdwfl from elfutils. Not sure how you'd feel about growing those external library dependencies.

