Is There a C++ Decompiler

Is there a C++ decompiler?

You can use IDA Pro by Hex-Rays. You will usually not get good C++ out of a binary unless you compiled in debugging information. Prepare to spend a lot of manual labor reversing the code.

If you didn't strip the binaries, there is some hope, as IDA Pro can produce C-like code for you to work with. It is usually very rough though, or at least it was when I used it a couple of years ago.

Is it possible to decompile a C++ executable file

Duplicate of this question here.

Yes, it is possible. However, when it comes to recovering function bodies and the like, you might have a little less luck. Distributions like Kali Linux ship with many decompilation and reverse-engineering tools preinstalled, so maybe look into running one in a VM. And of course, Windows has plenty of applications you can use to inspect the application code as well.

Look over the other question for specific app suggestions. :)

  • Edit: You will most likely have lost the original form of your logic and function bodies, but you might be able to recover the overall structure. It's your EXE, so you might be more familiar with how it was all connected up.

Decompile a C file to Pro*C

There is no decompiler from C back to Pro*C. A C file with SQL statements embedded in it is a Pro*C file; Pro*C is the precompiler provided by Oracle. When you run that embedded file through the precompiler (proc), it produces a plain C file in which the SQL statements have been replaced by calls to Oracle-provided library APIs.

My advice is to find all the functions in that C file (generated from the Pro*C file) and work out what each one does in the Oracle database, especially the database transactions. Once you understand that, write your own C or C++ file using the ODBC driver API provided by Oracle.

Why is there no accurate C++ decompiler?

There are several reasons:

  1. Inlining. A lot of C++ code gets inlined in optimized builds. That plays havoc with any form of decompiler. To figure out that a function was inlined, the decompiler would have to analyze the specifics of the inlined code and match them up. And post-inlining optimization steps can make code very different, depending on where it was inlined.

  2. Templates. Templates use #1 exclusively, but they create additional problems. It is at least theoretically possible that a function that gets inlined in two places would compile to the same sequence of assembly instructions. But for template code, which was instantiated with different template arguments? Different instantiations will usually have to compile down to different sequences of instructions. And this becomes even more difficult, since template code can call different sets of functions based on the template parameters. And those functions themselves could be inlined.

  3. Compile-time execution. Template metaprogramming allows the compiler to actually execute code. But C++11's constexpr provides a more natural way to do some computations at compile time. Obviously, compile-time function calls or metafunction instantiations cannot be part of the compiled executable. Only the results of them will be (since that's kinda the point).

  4. Lack of comprehensive runtime reflection. C# and Java both lace their bytecode with a lot of information about the nature of the original source code. Object definitions are easily detectable, as are object names, member variable types and names, etc. C++ compiles down to machine language, which is not required to have any such information. And since it isn't required, compilers don't generate it. Even the reflection study group of the ISO C++ committee is focused on compile-time reflection, which is information that won't be available at runtime.

    Even std::type_info doesn't offer anything. The reason being that, if the compiler does not detect that a particular type will have typeid called on it, then the compiler doesn't need to generate a std::type_info object for it. And even if it did, all that gives you is an object's name (and an identifier). Nothing more.

Decompile C code with debug info?

I am unable to find a convincing answer as to why the information from the -g option is insufficient for de-compilation, but sufficient for debugging?

The debugging information basically contains only a mapping between the addresses in the generated code and the source file line numbers. The debugger does not need to decompile code; it just shows you the original sources. If the source files are missing, the debugger won't magically show them.

That said, presence of debugging info does make decompilation easier. If the debug info includes the layout of the used types and function prototypes, the decompiler can use it and provide a much more precise decompilation. In many cases, however, it will still likely be different from the original source.

For example, here's a function decompiled with the Hex-Rays decompiler without using the debug info:

    int __stdcall sub_4050A0(int a1)
    {
      int result; // eax@1

      result = a1;
      if ( *(_BYTE *)(a1 + 12) )
      {
        result = sub_404600(*(_DWORD *)a1);
        *(_BYTE *)(a1 + 12) = 0;
      }
      return result;
    }

Since it does not know the type of a1, the accesses to its fields are represented as additions and casts.

And here's the same function after the symbol file has been loaded:

    void __thiscall mytree::write_page(mytree *this, PAGE *src)
    {
      if ( src->isChanged )
      {
        cache::set_changed(this->cache, src->baseAddr);
        src->isChanged = 0;
      }
    }

You can see that it's been improved quite a lot.

As for why decompiling bytecode is usually easier, in addition to NPE's answer check also this.

reverse engineering c programs

You can never get back to the exact same source, since no metadata about the original source is saved with the compiled code.

But you can re-create equivalent code from the assembly code.

Check out this book if you are interested in these things: Reversing: Secrets of Reverse Engineering.

Edit

Some compilers 101 here: if you had to describe a compiler with a less technical word than "compiler", what would it be?

Answer: Translator

A compiler translates the syntax and phrases you have written into another language. A C compiler translates to assembly or even machine code, C# code is translated to IL, and so forth.

The executable you have is just a translation of your original text. If you want to "reverse it", that is, "translate it back", you will most likely not get the same structure you had at the start.

A more real-life example: if you translate from English to German and then from German back to English, the sentence structure will most likely be different and other words might be used, but the meaning and the context will most likely not have changed.

The same goes for a compiler / translator: if you go from C to ASM, the logic is the same, it's just a different way of reading it (and of course it's optimized).

Reconstructing Control Flow of Decompiled Program

Reko is a decompiler that tries to reconstruct C-like code from machine code. It has a pass that reconstructs high-level constructs like if, while and switch statements. The code is based on the paper "Native x86 Decompilation using Semantics-Preserving Structural Analysis and Iterative Control-Flow Structuring" by Edward J. Schwartz, JongHyup Lee, Maverick Woo and David Brumley.

Although it is written in C#, it shouldn't be terribly hard to port the one class to C++.
If you have further questions, you can reach out to me at https://gitter.im/uxmal/reko, or email me at johnkal at gmail dot com.

Is there an easy way to modify a decompiled file without having to deal with its dependencies?

You cannot build without the dependencies; however, there is no need to decompile the dependencies. Just add the DLLs themselves as references to the project.

This is always fine if the decompiled assembly depends on other DLLs; however, if the other DLLs depend on the decompiled assembly, this will only work if the assemblies are not signed, i.e. if they are not using strong names. The purpose of signing is precisely to disallow such hacks.


