CMakeLists file to generate LLVM bitcode file from C source file

I think what you ultimately want is to be able to build a C-program
project with CMake and clang in which source files are compiled to LLVM bitcode
and the executable is linked from the bitcode files.

With CMake, asking clang to link bitcode files means asking it to link in LTO mode,
with the -flto linker option.

And you can get clang to compile to LLVM bitcode with the -flto compilation
option, or with the -emit-llvm option.

For illustration here is a Hello World project comprising two source files and one header:

$ ls -R
.:
CMakeLists.txt hello.c hello.h main.c

Here is the CMakeLists.txt:

cmake_minimum_required(VERSION 3.0.2)
# Select the compiler before project(), which is where CMake detects it
set(CMAKE_C_COMPILER clang)
project(hello)
# Append -flto to the linker flags as one space-separated string
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -flto")
add_executable(hello main.c hello.c)
# CMAKE_C_FLAGS is already applied to every C target, so only -flto is added here
target_compile_options(hello PRIVATE -flto)
#target_compile_options(hello PRIVATE -emit-llvm)

It will work equally well with:

#target_compile_options(hello PRIVATE -flto)
target_compile_options(hello PRIVATE -emit-llvm)

Make a build directory for CMake and go there:

$ mkdir build
$ cd build

Generate the build system:

$ cmake ..

Build:

$ make
Scanning dependencies of target hello
[ 33%] Building C object CMakeFiles/hello.dir/main.c.o
[ 66%] Building C object CMakeFiles/hello.dir/hello.c.o
[100%] Linking C executable hello
[100%] Built target hello

You will not find any *.bc targets in the Makefiles, nor any *.bc files
generated:

$ egrep -r '.*\.bc'; echo Done
Done
$ find -name '*.bc'; echo Done
Done

because the compilation option -flto or -emit-llvm results in an output
file:

CMakeFiles/hello.dir/main.c.o
CMakeFiles/hello.dir/hello.c.o

that adheres to the usual CMake naming convention but is in fact not an object file
but an LLVM bitcode file, as you see:

$ file $(find -name '*.o')
./CMakeFiles/hello.dir/hello.c.o: LLVM IR bitcode
./CMakeFiles/hello.dir/main.c.o: LLVM IR bitcode
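
If you would rather have files that actually carry a .bc name, one option is to drive clang yourself with add_custom_command. The following is only a sketch under some assumptions: clang is on the PATH, the sources are the main.c, hello.c and hello.h of this example, and the names hello_bc, CLANG_EXE, BC_FILES and bitcode are made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.10)
project(hello_bc C)

# Assumption: clang is installed; CLANG_EXE is a hypothetical variable name.
find_program(CLANG_EXE clang)
if(NOT CLANG_EXE)
  message(FATAL_ERROR "clang not found")
endif()

set(BC_FILES "")
foreach(src main.c hello.c)
  set(bc "${CMAKE_CURRENT_BINARY_DIR}/${src}.bc")
  # One rule per source file: compile it straight to LLVM bitcode.
  add_custom_command(
    OUTPUT "${bc}"
    COMMAND "${CLANG_EXE}" -emit-llvm -c "${CMAKE_CURRENT_SOURCE_DIR}/${src}" -o "${bc}"
    DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/${src}" "${CMAKE_CURRENT_SOURCE_DIR}/hello.h"
    COMMENT "Emitting LLVM bitcode ${src}.bc"
    VERBATIM)
  list(APPEND BC_FILES "${bc}")
endforeach()

# Hypothetical target name: `make bitcode` builds every .bc file listed above.
add_custom_target(bitcode DEPENDS ${BC_FILES})
```

With this in place, make bitcode should leave main.c.bc and hello.c.bc in the build directory. These rules sit beside the normal executable target and do not replace it.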

The program does the usual thing:

$ ./hello 
Hello World!

Later

When I try "make hello.o" it should generate the object file, right?
The command executes successfully, but I could not find the generated object file. Am I doing it right?

You are doing it in one way that is right, though not the only way that is right, but
your expectations are wrong. Look again at:

$ file $(find -name '*.o')
./CMakeFiles/hello.dir/hello.c.o: LLVM IR bitcode
./CMakeFiles/hello.dir/main.c.o: LLVM IR bitcode

You can see there that the .o files that are made from hello.c and main.c
by the CMake-generated makefile are not called hello.o and main.o but hello.c.o
and main.c.o. CMake prefers a compiled filename that preserves the extension of the
source file and appends .o. That is a fairly common practice. So if you wanted
to use the makefile to compile hello.c, the most obviously right way would be
make hello.c.o.

Let's see what actually happens. In my CMake build directory:

$ make VERBOSE=1 hello.c.o
make -f CMakeFiles/hello.dir/build.make CMakeFiles/hello.dir/hello.c.o
make[1]: Entering directory '/home/imk/develop/so/scrap/build'
make[1]: 'CMakeFiles/hello.dir/hello.c.o' is up to date.
make[1]: Leaving directory '/home/imk/develop/so/scrap/build'

There was nothing to be done, because my hello.c.o was up to date. So I'll
delete it and repeat:

$ rm CMakeFiles/hello.dir/hello.c.o
$ make VERBOSE=1 hello.c.o
make -f CMakeFiles/hello.dir/build.make CMakeFiles/hello.dir/hello.c.o
make[1]: Entering directory '/home/imk/develop/so/scrap/build'
Building C object CMakeFiles/hello.dir/hello.c.o
clang -flto -o CMakeFiles/hello.dir/hello.c.o -c /home/imk/develop/so/scrap/hello.c
make[1]: Leaving directory '/home/imk/develop/so/scrap/build'

Now it has been recompiled.

However, because many people - like you - would expect hello.o to be compiled
from hello.c, CMake helpfully defines hello.o as a .PHONY target
that depends on hello.c.o:

$ egrep  -A3 'hello.o.*:.*hello.c.o' Makefile 
hello.o: hello.c.o

.PHONY : hello.o

So in fact I can do:

$ rm CMakeFiles/hello.dir/hello.c.o
$ make VERBOSE=1 hello.o
make -f CMakeFiles/hello.dir/build.make CMakeFiles/hello.dir/hello.c.o
make[1]: Entering directory '/home/imk/develop/so/scrap/build'
Building C object CMakeFiles/hello.dir/hello.c.o
clang -flto -o CMakeFiles/hello.dir/hello.c.o -c /home/imk/develop/so/scrap/hello.c
make[1]: Leaving directory '/home/imk/develop/so/scrap/build'

make hello.o is another way of making hello.c.o.

Emitting a single IR bitcode file with LLVM LLD using CMake

You can use the -save-temps option.

clang -flto -fuse-ld=lld -Wl,-save-temps a.o b.o -o myprogram

This will generate myprogramXYZ.precodegen.bc among other files. You can then use llvm-dis to get it in readable IR format.
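
In a CMake project the same flags can be attached to the target instead of being typed on the command line. This is a minimal sketch, assuming CMake 3.13 or newer, a build configured to use clang, and lld installed. The source names a.c and b.c are hypothetical stand-ins for whatever produced a.o and b.o above.

```cmake
cmake_minimum_required(VERSION 3.13)  # target_link_options() needs 3.13+
project(myprogram C)

# Assumption: the build was configured with clang, e.g. `CC=clang cmake ..`
add_executable(myprogram a.c b.c)

# Compile each source to bitcode so the link step runs in LTO mode.
target_compile_options(myprogram PRIVATE -flto)

# Link with lld and ask it to keep its intermediate LTO files.
target_link_options(myprogram PRIVATE -flto -fuse-ld=lld -Wl,-save-temps)
```

The saved intermediate files should then show up in the build directory alongside the executable.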

How to generate bitcode (.bc file) using emscripten with a cmake project?

When the Emscripten build system is used to build a project, it will always generate a bitcode file, regardless of the file extension of the default output file. It can't generate a differently named file, because that would confuse Make: the file Make was told to expect would never be created. On the Emscripten website there is a note a short way down the page that says:

The file output from make might have a different suffix: .a for a static library archive, .so for a shared library, .o or .bc for object files (these file extensions are the same as gcc would use for the different types). Irrespective of the file extension, these files contain linked LLVM bitcode that emcc can compile into JavaScript in the final step. If the suffix is something else - like no suffix at all, or something like .so.1 - then you may need to rename the file before sending it to emcc.

Whatever files the build is supposed to create, even ones that are usually shared libraries, will always contain the bitcode, and can be linked directly with the rest of your project.

Edit:
I can only assume that the reason for the .js output file is because the CMake project is set up to produce an executable. It is possible that Emscripten is smart enough to create .js in that case, but I don't know for sure.

How to apply llvm passes using CMake

You can have a look at, or even use, this repo, which implements the various steps as CMake commands.

The gist of it is that it creates various commands (using CMake's add_custom_command) that do essentially what you're looking for: they combine the LLVM subtools with the CMake target properties to build the commands that generate IR from source and then native binary code (i.e. .o) from that IR.

For example, llvmir_attach_bc_target() attaches to a top-level CMake target and creates an (unoptimized) .bc file for each source file in its SOURCES property.

It contains various examples in the same repo that should be enough to get you started.
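
To make the underlying idea concrete without relying on that repo, here is a hand-written sketch of the same kind of chain using plain add_custom_command. It is not the repo's API. It assumes clang and opt are on the PATH, an LLVM recent enough to accept opt's -passes= syntax, and a source file hello.c with header hello.h. The names hello_passes, CLANG_EXE, OPT_EXE and hello_opt_bc are made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.10)
project(hello_passes NONE)  # only custom commands, so no language is enabled

# Assumption: both tools are installed; the variable names are hypothetical.
find_program(CLANG_EXE clang)
find_program(OPT_EXE opt)
if(NOT CLANG_EXE OR NOT OPT_EXE)
  message(FATAL_ERROR "clang and opt are required")
endif()

set(src    "${CMAKE_CURRENT_SOURCE_DIR}/hello.c")
set(hdr    "${CMAKE_CURRENT_SOURCE_DIR}/hello.h")
set(raw_bc "${CMAKE_CURRENT_BINARY_DIR}/hello.bc")
set(opt_bc "${CMAKE_CURRENT_BINARY_DIR}/hello.opt.bc")

# Step 1: source -> unoptimized bitcode. -disable-O0-optnone keeps clang from
# marking functions optnone, which would make later passes skip them.
add_custom_command(
  OUTPUT "${raw_bc}"
  COMMAND "${CLANG_EXE}" -O0 -Xclang -disable-O0-optnone
          -emit-llvm -c "${src}" -o "${raw_bc}"
  DEPENDS "${src}" "${hdr}"
  VERBATIM)

# Step 2: run the chosen passes over the bitcode with opt.
add_custom_command(
  OUTPUT "${opt_bc}"
  COMMAND "${OPT_EXE}" -passes=mem2reg,instcombine "${raw_bc}" -o "${opt_bc}"
  DEPENDS "${raw_bc}"
  VERBATIM)

# Hypothetical target name: `make hello_opt_bc` runs both steps.
add_custom_target(hello_opt_bc DEPENDS "${opt_bc}")
```

The pass list here (mem2reg, instcombine) is only an example; any pipeline that opt accepts could be substituted.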


