What is the most efficient and elegant way to develop/debug the Linux kernel
Here are my notes on how to build and run a custom kernel.
Obtaining sources
Linus Torvalds' tree is [1].
It's marked as "mainline" on [2].
To clone it use information from [1]:
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Now go to the linux/ dir and check out the master branch (we need to use the most
recent changes as the starting point for development):
$ cd linux
$ git checkout master
Before actual development don't forget to update your branch:
$ git pull --rebase
Building
Kernel version on my machine:
$ uname -r
3.16.0-4-amd64
To obtain config from the system running on my machine:
$ cp /boot/config-`uname -r` ./.config
To update my config (with default answers) I used the following command:
$ make olddefconfig
To disable (to not build) modules which are not loaded in my current system:
$ make localmodconfig
To answer all questions with the default answer, I just pressed Enter until it
finished (just two times actually).
Next I did:
$ make menuconfig
and chose the following config options:
CONFIG_LOCALVERSION_AUTO=y
CONFIG_LOCALVERSION="-joe"
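One way to confirm that the options above actually ended up in .config is to read them back. This is a minimal sketch; the helper name config_value is hypothetical and not part of the kernel build system.

```shell
#!/bin/sh
# Print the value of one CONFIG_ option from a kernel config file.
# Usage: config_value <config-file> <OPTION>   (hypothetical helper)
config_value() {
    # Config lines look like CONFIG_FOO=y or CONFIG_BAR="text";
    # print whatever follows "<OPTION>=" and nothing for unset options.
    sed -n "s/^$2=//p" "$1"
}

# Example (run from the kernel source tree):
#   config_value .config CONFIG_LOCALVERSION
```

An empty result means the option is not set in that file.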
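Setting up ccache:
$ ccache -C
$ ccache -M 4G
Build kernel (using ccache). CC is passed on the make command line, because the
kernel Makefile assigns CC itself and so ignores a CC exported in the environment:
$ reset
$ make CC="ccache gcc" -j4
$ make CC="ccache gcc" -j4 modules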
Built kernel image is:
arch/x86_64/boot/bzImage
Installing
Installing modules for your kernel:
$ sudo make modules_install
Installing new kernel:
$ sudo make install
Installed modules reside at /lib/modules/*-joe*/kernel/.
Installed kernel files reside at /boot/*joe*:
- config-*joe*
- initrd.img-*joe*
- System.map-*joe*
- vmlinuz-*joe*
update-grub was run as part of the make install script, so there is no need to
run it manually.
NOTE: modules_install must be run before install, because the install rule
builds the initramfs image and populates it with the already installed modules.
Check the size of the /boot/initrd.img-*joe* file: it must be >= 15 MiB
(if it's smaller, chances are the modules are not in there).
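The size check above can be scripted. This is a minimal sketch: the function name initrd_big_enough is hypothetical, and the 15 MiB default is simply the figure from the note above, not a hard rule.

```shell
#!/bin/sh
# Succeed if FILE is at least MIN_MIB mebibytes in size.
# Usage: initrd_big_enough <file> [min-mib]   (hypothetical helper, default 15)
initrd_big_enough() {
    file=$1
    min_mib=${2:-15}
    # Exit status 2 means "no such file", so it can be told apart from "too small".
    [ -f "$file" ] || return 2
    size=$(wc -c < "$file")
    [ "$((size))" -ge "$((min_mib * 1024 * 1024))" ]
}

# Example (pass one concrete file name, not a glob matching several files):
#   initrd_big_enough /boot/initrd.img-4.0.0-rc7-joe-00061-g3259b12 || echo "initrd looks too small"
```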
Start custom built kernel
Usually your custom kernel has a higher version than your distro kernel,
so the custom kernel should be booted by default. If not, read further.
Reboot, go to GRUB, select the following entries:
-> Advanced options for Debian GNU/Linux
-> Debian GNU/Linux, with Linux 4.0.0-rc7-joe-00061-g3259b12
Make your distro kernel load by default
Since video may not work in your custom kernel (video drivers must be
rebuilt for this), you may want to make GRUB load the distro kernel by default.
For this just edit the /etc/default/grub file:
$ sudo vim /etc/default/grub
and change line
GRUB_DEFAULT=0
to
GRUB_DEFAULT="1>3"
where "1>3" means:
- go to 2nd line in GRUB, enter
- and boot using 4th line.
After this run:
$ sudo update-grub
NOTE: do not edit the /boot/grub/grub.cfg file, as it's auto-generated and will
be replaced after each update-grub command.
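To double-check which menu lines a numeric GRUB_DEFAULT value points at, the zero-based indices can be converted to the 1-based positions you see on screen. This is a minimal sketch: the function name is hypothetical, and it only handles numeric values such as "1>3", not menu entry titles.

```shell
#!/bin/sh
# Translate a numeric GRUB_DEFAULT value such as "1>3" into 1-based menu positions.
# GRUB counts entries from 0, so every index is incremented by one.
grub_default_positions() {
    out=""
    old_ifs=$IFS
    IFS='>'              # split the value on the submenu separator
    for idx in $1; do
        out="$out${out:+ }$((idx + 1))"
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

grub_default_positions "1>3"   # prints: 2 4
```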
Removing custom kernel
If you don't need your custom kernel anymore, you may want to remove it.
To remove the installed kernel, do the following.
Remove all files installed to /boot:
$ sudo rm -f /boot/*joe*
Remove all modules installed:
$ sudo rm -rf /lib/modules/*joe*
Update GRUB:
$ sudo update-grub
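Because the removal commands use globs and run as root, it may be worth listing what they match before deleting anything. This is a minimal sketch; the function name list_kernel_files and its optional directory arguments are hypothetical, and it only prints paths.

```shell
#!/bin/sh
# List (without deleting) everything that a local-version tag matches.
# Usage: list_kernel_files <tag> [boot-dir] [modules-dir]   (hypothetical helper)
list_kernel_files() {
    tag=$1
    boot_dir=${2:-/boot}
    mod_dir=${3:-/lib/modules}
    for path in "$boot_dir"/*"$tag"* "$mod_dir"/*"$tag"*; do
        # An unmatched glob stays literal, so keep only paths that exist.
        [ -e "$path" ] && printf '%s\n' "$path"
    done
    return 0
}

# Example:
#   list_kernel_files joe
```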
Cleaning up built kernel
If you don't need an incremental build and want to do a clean build instead
(e.g. you checked out another version), you may want to clean your built
files first:
$ make -j4 distclean
Links
[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/
[2] https://kernel.org/
[3] http://kernelnewbies.org/FAQ/KernelCompilation
How to debug hacked Linux Kernel code
Depending on what you are hacking, it might be better to use UML (User Mode Linux). If you're messing with non-hardware dependent code, then I think it will help a great deal.
UML allows you to compile the kernel as an ordinary user mode program, and run it as any other application on your system. Because it runs like a regular application, you can very easily debug it with gdb, or any other debugger of choice.
Here's a good start for UML
Best way to test a custom kernel for hardware performance counters
I found a solution: KVM + QEMU emulator.
To use the PMU, I changed this parameter in the VM definition (XML format):
<cpu mode='host-passthrough'/>
Or you can add this option on the command line:
-cpu host
I followed in part this page for building the kernel on qemu and for the counters this page.
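For reference, a QEMU invocation using that option might look like the sketch below. The function only prints the command instead of running it. The memory size, the console argument and the default image path are assumptions, and a real run also needs a root filesystem or initrd, which is not shown.

```shell
#!/bin/sh
# Print (not run) a QEMU command line that boots a kernel image with the host CPU
# model, which is what exposes the host PMU to the guest under KVM.
# Usage: qemu_cmdline [kernel-image]   (hypothetical helper)
qemu_cmdline() {
    kernel=${1:-arch/x86_64/boot/bzImage}
    printf '%s\n' "qemu-system-x86_64 -enable-kvm -cpu host -m 1024 -kernel $kernel -append console=ttyS0 -nographic"
}

qemu_cmdline
```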
Learning Kernel Programming
Try to get hold of Robert Love's book on Linux Kernel Programming. It's very concise and easy to follow.
After that or along with that, you may want to take a look at "Understanding the Linux Kernel". But I wouldn't recommend it during the early stages.
Also, look at the Linux kernel programming guide. Since a lot can be learnt from programming kernel modules, that guide will help you. And yes, for a lot of information, consult the 'Documentation' sub-directory of the kernel sources tarball.
Look up the lock statistics in the Linux kernel
Enable lock statistics
First of all be sure to enable lock statistics collecting. From Documentation/locking/lockstat.txt:
- CONFIGURATION
Lock statistics are enabled via CONFIG_LOCK_STAT.
- USAGE
Enable collection of statistics:
# echo 1 >/proc/sys/kernel/lock_stat
Disable collection of statistics:
# echo 0 >/proc/sys/kernel/lock_stat
Look at the current lock statistics:
( line numbers not part of actual output, done for clarity in the explanation
below )
# less /proc/lock_stat
So, be sure to enable collection of lock statistics first.
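A quick way to see which state you are in is to read the control file back. This is a minimal sketch; the function name lock_stat_state is hypothetical, and the optional path argument exists only so the helper can be tried on an ordinary file.

```shell
#!/bin/sh
# Report whether lock statistics collection is on.
# Usage: lock_stat_state [control-file]   (hypothetical helper)
# Prints "enabled", "disabled", or "unavailable" (file missing, which is what
# you get on a kernel built without CONFIG_LOCK_STAT).
lock_stat_state() {
    ctl=${1:-/proc/sys/kernel/lock_stat}
    if [ ! -r "$ctl" ]; then
        echo unavailable
    elif [ "$(cat "$ctl")" = 1 ]; then
        echo enabled
    else
        echo disabled
    fi
}

lock_stat_state
```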
Check lockdep warnings in kernel log
Now, let's take a look at the code where your warning message is printed:
kernel/locking/lockdep_proc.c : seq_header():
if (unlikely(!debug_locks))
	seq_printf(m, "*WARNING* lock debugging disabled!! - possibly due to a lockdep warning\n");
This debug_locks variable is set to 0 (disabled) by the debug_locks_off() function, which in turn can be called from a lot of places. Let's take a look at the place where this variable is defined:
lib/debug_locks.c:
/*
* We want to turn all lock-debugging facilities on/off at once,
* via a global flag. The reason is that once a single bug has been
* detected and reported, there might be cascade of followup bugs
* that would just muddy the log. So we report the first one and
* shut up after that.
*/
int debug_locks = 1;
EXPORT_SYMBOL_GPL(debug_locks);
The comment for that variable explains why you see that warning.
So check your kernel log (via the dmesg command) for actual bugs found by the lockdep mechanism. You will probably find one, which will explain why lock debugging is disabled.
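To narrow a long kernel log down to likely lockdep reports, a simple text filter can help. This is a minimal sketch: the function name is hypothetical, and the patterns are a non-exhaustive guess at typical lockdep report wording, so an empty result does not prove that nothing was reported.

```shell
#!/bin/sh
# Keep only log lines that look lockdep-related. Reads the log on stdin.
# (hypothetical helper; the pattern list is an assumption, extend it as needed)
lockdep_hits() {
    grep -i -E 'lockdep|locking dependency|recursive locking|inconsistent lock state'
}

# Example:
#   dmesg | lockdep_hits
```

Note that grep exits with status 1 when nothing matches.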
UPDATE
Regarding this message:
[ 0.084000] SMP alternatives: lockdep: fixing up alternatives
It seems like it has nothing to do with your actual issue. This message is printed by the following code:
arch/x86/kernel/alternative.c : alternatives_enable_smp():
#ifdef CONFIG_LOCKDEP
/*
* Older binutils section handling bug prevented
* alternatives-replacement from working reliably.
*
* If this still occurs then you should see a hang
* or crash shortly after this line:
*/
pr_info("lockdep: fixing up alternatives\n");
#endif
This code is old and obsolete, and it was dropped in newer kernel versions by this commit:
lockdep, x86/alternatives: Drop ancient lockdep fixup message
So I think it's something else that disables the collection of lock statistics. It's hard to tell what exactly, though. The only thing I can think of is to modify the kernel so that you can see what disabled lock debugging, rebuild the kernel, and look into the kernel log to figure out where the disabling was called from.
So, if you are up to this, modify the __debug_locks_off() function as follows:
static inline int __debug_locks_off(void)
{
	/* ---- Add this code ---- */
	pr_err("### __debug_locks_off() called!\n");
	dump_stack();
	/* ----------------------- */
	return xchg(&debug_locks, 0);
}
Also, add an #include <linux/printk.h> line at the top of that file, just in case.
Then rebuild your kernel, run it and provide the whole dmesg output. It should be enough to tell where it's been disabled.
UPDATE 2
As I can see from the backtraces you provided, all the dump_stack() calls (that you added to the __debug_locks_off() function) are called from the locking_selftest() function, which is in turn just a unit-testing routine. Let's take a look at it: locking_selftest(). What is important to notice here is this code:
} else {
	printk("-------------------------------------------------------\n");
	printk("Good, all %3d testcases passed! |\n",
		testcase_successes);
	printk("---------------------------------\n");
	debug_locks = 1;
}
So you can see that in your case the debug_locks variable is enabled ("on") at the end of locking_selftest(). Taking into account that all the messages you provided were actually triggered from locking_selftest(), I can say that those messages couldn't have led to your issue (the disabled debug_locks variable).
So you still need to figure out where and why (in your case) this debug_locks variable is disabled. Let's start with this: please share the complete dmesg output with us (you can use a pastebin service and add the link to your question or as a new comment). You may have overlooked some __debug_locks_off() call in your dmesg output that was actually relevant (not called from that self-testing routine).
What is the most efficient way to monitor the number of context switches in linux kernel?
Check Linux's perf subsystem: it is what you need to read performance counters, software or hardware, on a Linux system.
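With perf installed, context switches are available as a software event, e.g. perf stat -e context-switches -- <command>. Where perf is not available, the kernel also exposes per-process counters in /proc/<pid>/status. The sketch below reads those; it is an alternative to perf rather than part of it, and the function name ctxt_switches is hypothetical.

```shell
#!/bin/sh
# Print the voluntary and involuntary context switch counters of a process.
# Usage: ctxt_switches [status-file]   (hypothetical helper)
# Defaults to the status file of the current shell.
ctxt_switches() {
    status=${1:-/proc/$$/status}
    # Matching lines look like "voluntary_ctxt_switches:<TAB>12";
    # strip the colon from the name and print name=value.
    awk '/^(non)?voluntary_ctxt_switches:/ { sub(":", "", $1); print $1 "=" $2 }' "$status"
}

# Example:
#   ctxt_switches /proc/1234/status
```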