LLVM IR back to human-readable source language?
There is an issue here: it might not be possible to easily translate the IR back into the source language.
I mean, you'll probably be able to get some representation, but it is likely to be much less readable than what you wrote.
The issue is that the IR is not concerned with high-level semantics, and without them the original constructs cannot be reconstructed faithfully.
I'd rather advise you to learn to read the IR. I can read a bit of it without much effort, and I am far from being an LLVM expert.
Otherwise, you can generate C code from the IR. It won't look much like your C++ code, but you'll perhaps feel better without SSA form and phi nodes.
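To give a feel for what "C code from the IR" tends to look like, here is a hand-written sketch, not actual backend output. It shows a simple loop next to an approximation of how a C-emitting backend might lower it: one label per basic block, gotos for branches, and each phi node replaced by plain assignments in the predecessor block. All function, label, and variable names are made up for illustration.

```cpp
#include <cassert>

// The loop as one would normally write it in C or C++.
int sum_source(int n) {
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += i;
    return total;
}

// Hand-written approximation of backend-style output for the same function.
// In the IR, 'i' and 'total' would each be a phi node in the loop header;
// here each phi becomes a variable assigned in every predecessor block.
int sum_lowered(int n) {
    int i_phi, total_phi;    // stand-ins for the two phi nodes
    int i_next, total_next;  // values computed inside the loop body
    int result;

    // entry block: supply the phi operands that come from 'entry'
    i_phi = 0;
    total_phi = 0;
    goto loop_header;

loop_header:
    if (i_phi < n) goto loop_body;
    goto loop_exit;

loop_body:
    total_next = total_phi + i_phi;
    i_next = i_phi + 1;
    i_phi = i_next;          // phi operands that come from 'loop_body'
    total_phi = total_next;
    goto loop_header;

loop_exit:
    result = total_phi;
    return result;
}

// Small self-check: both versions must agree on a range of inputs.
void check_equivalence() {
    for (int n = 0; n < 10; ++n)
        assert(sum_source(n) == sum_lowered(n));
}
```

Both functions compute the same value. The second one just makes the control flow explicit, which is why lifted code is usually correct but tedious to read.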
Converting LLVM-IR into a C like language
The C backend was dropped in release 3.1 because it was not maintained and started developing code rot, becoming a burden. Since no maintainer stepped up, it was removed from the tree. From the release notes of 3.1:
The C backend has been removed. It had numerous problems, to the point
of not being able to compile any nontrivial program.
In August 2012 a thread on llvmdev discussed reviving the C backend, but I don't think it ended up anywhere useful.
You can still download LLVM version 3.0 (from the releases page), build it and see the C backend in action, study its code, etc. For your specific purpose - looking at the code and figuring out how it works, the 3.0 C backend should be good enough.
Is it possible to recompile LLVM IR into another triplet and data layout?
No, the LLVM toolchain has no tool to convert IR between incompatible triples.
The best third-party option I am aware of is to lift the IR back to source code and recompile that for the new target.
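One reason IR cannot simply be relabelled with a new triple is that target-dependent facts are already baked into it as plain numbers. The sketch below (the struct and function names are hypothetical) collects a few values that a C or C++ front end evaluates for the target it is compiling for. Once these are emitted as constants, the IR no longer records that they came from sizeof or offsetof.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record type, used only to show layout-dependent values.
struct Record {
    char tag;
    long value;
    void* next;
};

// Every expression below is a compile-time constant whose value depends on
// the target's data layout. The front end folds each one to a number before
// any IR is produced.
std::string layout_summary() {
    return "sizeof(long)=" + std::to_string(sizeof(long)) +
           " sizeof(void*)=" + std::to_string(sizeof(void*)) +
           " offsetof(Record,value)=" + std::to_string(offsetof(Record, value)) +
           " sizeof(Record)=" + std::to_string(sizeof(Record));
}
```

The printed numbers differ between targets; for example, long is 8 bytes on x86-64 Linux and 4 bytes on 32-bit x86. IR generated for one of them carries the wrong constants for the other, which is why going back to source and recompiling is the safer route.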
Compiler output language - LLVM IR vs C
I've used LLVM IR for a few compiler back ends and have worked with compilers that use C as a back end. One thing that I found that gave the LLVM IR an advantage is that it is typed. It is hard to make completely ill-formed output without getting errors from the LLVM libraries.
It is also easier to keep a close correlation between the source code and the IR for debugging, in my opinion.
Plus, you get all the cool LLVM command line tools to analyse and process the IR your front end emits.
Parsing and Modifying LLVM IR code
First, to fix an obvious misunderstanding: LLVM is a framework for manipulating code in IR format. There are no ASTs in sight (*) - you read IR, transform/manipulate/analyze it, and you write IR back.
Reading IR is really simple:
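#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include <memory>

using namespace llvm;

int main(int argc, char** argv)
{
  if (argc < 2) {
    errs() << "Expected an argument - IR file name\n";
    return 1;
  }
  // Recent LLVM releases: own the context locally and use parseIRFile().
  // Older releases spelled these getGlobalContext() and ParseIRFile(),
  // and returned a raw Module*.
  LLVMContext Context;
  SMDiagnostic Err;
  std::unique_ptr<Module> Mod = parseIRFile(argv[1], Err, Context);
  if (!Mod) {
    // Parsing failed; Err holds the location and message.
    Err.print(argv[0], errs());
    return 1;
  }
  [...]
}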
This code accepts a file name. This should be an LLVM IR file (textual). It then goes on to parse it into a Module, which represents a module of IR in LLVM's internal in-memory format. This can then be manipulated with the various passes LLVM has or that you add on your own. Take a look at some examples in the LLVM code base (such as lib/Transforms/Hello/Hello.cpp) and read this - http://llvm.org/docs/WritingAnLLVMPass.html.
Spitting IR back into a file is even easier. The Module class just writes itself to a stream:
some_stream << *Mod;
That's it.
Now, if you have any specific questions about specific modifications you want to do to IR code, you should really ask something more focused. I hope this answer shows you how to parse IR and write it back.
(*) IR doesn't have an AST representation inside LLVM, because it's a simple assembly-like language. If you go one step up, to C or C++, you can use Clang to parse that into ASTs, and then do manipulations at the AST level. Clang then knows how to produce LLVM IR from its AST. However, you do have to start with C/C++ here, and not LLVM IR. If LLVM IR is all you care about, forget about ASTs.
How to convert LLVM IR br back to a while loop
During the translation from C to LLVM IR, instructions that are deemed necessary can be decorated with metadata. This metadata can then be used when converting LLVM IR to JavaScript, e.g. to indicate whether the circular branching between basic blocks is a while loop or not (this information is present in the C context). See Intrinsics & Metadata Attributes.
For more information regarding LLVM Metadata see LLVM-Metadata.
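When no such metadata is available, a different and purely structural approach is to look for branches that jump back to a block that is still "open" in a depth-first walk of the control-flow graph. The sketch below does this on a toy graph of block indices rather than on real LLVM data structures; the type and function names are hypothetical. It assumes the graph is reducible, which is the usual case for code compiled from structured C, so each edge it finds corresponds to a loop.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A toy control-flow graph: block i branches to every block in cfg[i].
using Cfg = std::vector<std::vector<int>>;

namespace {
enum class Mark { Unvisited, OnStack, Done };

void visit(const Cfg& cfg, int block, std::vector<Mark>& mark,
           std::vector<std::pair<int, int>>& back_edges) {
    mark[block] = Mark::OnStack;
    for (int target : cfg[block]) {
        assert(target >= 0 && static_cast<std::size_t>(target) < cfg.size());
        if (mark[target] == Mark::OnStack)
            back_edges.emplace_back(block, target);  // jump to an open ancestor
        else if (mark[target] == Mark::Unvisited)
            visit(cfg, target, mark, back_edges);
    }
    mark[block] = Mark::Done;
}
}  // namespace

// Returns (source block, loop header) pairs found by a depth-first walk
// that starts at block 0, which is taken to be the entry block.
std::vector<std::pair<int, int>> find_back_edges(const Cfg& cfg) {
    std::vector<Mark> mark(cfg.size(), Mark::Unvisited);
    std::vector<std::pair<int, int>> back_edges;
    if (!cfg.empty())
        visit(cfg, 0, mark, back_edges);
    return back_edges;
}
```

For a graph such as {{1}, {2, 3}, {1}, {}} (entry, loop header, loop body, exit), the only edge reported is the one from block 2 back to block 1. A header that tests a condition and either enters the body or leaves is the shape of a while loop. Deciding between while, do-while, and for still needs more analysis than this sketch performs.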