Debugging by printing
Probably the simplest way to get some debug information from your kernel code is by printing out various information with the kernel's equivalent of printf - the printk function and its derivatives.
printk works more or less the same way as printf in userspace, so if you ever debugged your userspace program using printf, you are ready to do the same with your kernel code, e.g. by adding:
printk("My Debugger is Printk\n");
This wasn't that difficult, was it?
Usually you would print out some more interesting information like
printk("Var1 %d var2 %d\n", var1, var2);
just like in userspace.
In order to see the kernel messages, just use the
command in one of your shells - this one will print out the whole kernel log buffer to you.
The supported format strings are also quite similar to the standard ones supported in printf and are quite extensively documented in Documentation/printk-formats.txt
However please note: always use %zu, %zd or %zx for printing size_t and ssize_t values. ssize_t and size_t are quite common values in the kernel, so please use the %z to avoid annoying compile warnings.
Author's practical advice:
If you want to debug an oops (e.g caused by releasing a resource twice) in your driver and you don't have a clue where the oops happens, simply add this line
printk(KERN_ALERT "DEBUG: Passed %s %d \n",__FUNCTION__,__LINE__);
after each possibly offending statement. Recompile and (re-)load the module and trigger the error condition - your log now shows you the last line that was successfully executed before the oops happened.
Of course you should remove these 'rude' statements before shipping your module ;)
If you look into real kernel code you will always see something like:
printk(KERN_ERR "something went wrong, return code: %d\n",ret);
where "KERN_ERR" is one of the eight different log levels defined in include/linux/printk.h and specifes the severity of the error message.
Note that there is NO comma between the KERN_ERR and the format string (as the preprocessor concatenates both strings)
The log levels are:
|KERN_EMERG||"<0>"||Emergency messages, system is about to crash or is unstable||pr_emerg|
|KERN_ALERT||"<1>"||Something bad happened and action must be taken immediately||pr_alert|
|KERN_CRIT||"<2>"||A critical condition occurred like a serious hardware/software failure||pr_crit|
|KERN_ERR||"<3>"||An error condition, often used by drivers to indicate difficulties with the hardware||pr_err|
|KERN_WARNING||"<4>"||A warning, meaning nothing serious by itself but might indicate problems||pr_warning|
|KERN_NOTICE||"<5>"||Nothing serious, but notably nevertheless. Often used to report security events.||pr_notice|
|KERN_INFO||"<6>"||Informational message e.g. startup information at driver initialization||pr_info|
|KERN_DEBUG||"<7>"||Debug messages||pr_debug, pr_devel if DEBUG is defined|
The pr_* macros (with exception of pr_debug) are simple shorthand definitions in include/linux/printk.h for their respective printk call and should probably be used in newer drivers.
pr_devel and pr_debug are replaced with printk(KERN_DEBUG ... if the kernel was compiled with DEBUG, otherwise replaced with an empty statement.
For drivers the pr_debug should not be used anymore (use dev_dbg instead).
If you don't specify a log level in your message it defaults to DEFAULT_MESSAGE_LOGLEVEL (usually "<4>"=KERN_WARNING) which can be set via the CONFIG_DEFAULT_MESSAGE_LOGLEVEL kernel config option (make menuconfig-> Kernel Hacking -> Default message log level)
The log level is used by the kernel to determine the importance of a message and to decide whether it should be presented to the user immediately, by printing it to the current console (where console could also be a serial line or even a printer, not an xterm).
For this the kernel compares the log level of the message to the console_loglevel (a kernel variable) and if the priority is higher (i.e. a lower value) than the console_loglevel the message will be printed to the current console.
To determine your current console_loglevel you simply enter:
$ cat /proc/sys/kernel/printk 7 4 1 7 current default minimum boot-time-default
The first integer shows you your current console_loglevel; the second the default log level that you have seen above.
To change your current console_loglevel simply write to this file, so in order to get all messages printed to the console do a simple
# echo 8 > /proc/sys/kernel/printk
and every kernel message will appear on your console.
Another way to change the console log level is to use dmesg with the -n parameter
# #set console_loglevel to print KERN_WARNING (4) or more severe messages # dmesg -n 5
Only messages with a value lower (not lower equal) than the console_loglevel will be printed.
You can also specify the console_loglevel at boot time using the loglevel boot parameter. (see Documentation/kernel-parameters.txt for more details)
Author's practical advice:
Of course you should always specify an appropriate log level for your messages, but for debugging, I guess most developers leave the console_loglevel unchanged and simply use KERN_ERR or KERN_CRIT to ensure the message reaches the console.
pr_err("REMOVE ME: my debug statement that I swear to remove when I'm done\");
Please make sure to remove these 'inappropriately' tagged messages before shipping the module ;)
Rate limiting and one time messages
Occasionally you have to insert a message in a section which gets called quite often. This not only might have a severe performance impact, it also could overwrite and spam your kernel buffer so it should be avoided.
As always the kernel already provides you with nice functions that solve your problems:
printk_once is fairly trivial - no matter how often you call it, it prints once and never again.
printk_ratelimited is a little bit more complicated - it prints by default not more than 10 times in every 5 seconds (for each function it is called in).
If you need other values for the maximum burst count and the timeout, you can always setup your own ratelimit using the DEFINE_RATELIMIT_STATE macro and the __ratelimit function - see the implementation of printk_ratelimited for an example.
Be sure to #include <kernel/ratelimit.h> in order to use printk_ratelimited()
Both functions have also their pr_* equivalents like pr_info_ratelimited for printk_ratelimited(KERN_INFO... and pr_crit_once for printk_once(KERN_CRIT...
Note: both did not work as expected in my tests here, will probably investigate further
Printk from userspace
Sometimes, especially when doing automated testing, it is quite useful to insert some messages in the kernel log buffer in order to annotate what's going on.
It is as simple as
# echo "Hello Kernel-World" > /dev/kmsg
Of course this messages gets the default log level assigned, if you want e.g. to issue a KERN_CRIT message you have to use the string representation of the log level - in this case "<2>"
# echo "<2>Writing critical printk messages from userspace" >/dev/kmsg
The message will appear like any other kernel message - there is no way to distinguish them!
Note: Don't confuse this with printf - we are printing a kernel message from userspace here.
Printk is implemented by using a ring buffer in the kernel with a size of __LOG_BUF_LEN bytes where __LOG_BUF_LEN equals (1 << CONFIG_LOG_BUF_SHIFT) (see kernel/printk.c for details).
You can specify the size of the buffer in your kernel config by setting CONFIG_LOG_BUF_SHIFT to an appropriate value (e.g. 17 for 128Kb) (make menuconfig -> General Setup -> Kernel log buffer size).
Using a ring buffer implies that older messages get overwritten once the buffer fills up, but this is only a minor drawback compared to the robustness of this solution (i.e. minimum memory footprint, callable from every context, not many resources wasted if nobody reads the buffer, no filling up of disk space/ram when some kernel process goes wild and spams the buffer, ...). Using a reasonably large buffer size should give you enough time to read your important messages before they are overwritten.
The kernel log buffer is accessible for reading from userspace by /proc/kmsg. /proc/kmsg behaves more or less like a FIFO and blocks until new messages appear.
Please note - reading from /proc/kmsg consumes the messages in the ring buffer so they may not be available for other programs. It is usually a good idea to let klogd or syslog do this job and read the content of the buffer via dmesg.
Pros and Cons
The main advantage of printk over other debugging solutions is that it requires no sophisticated setup and can be called anywhere from any time. Printk can be called while holding a lock, from interrupt and process context, is SMP safe and does not need any special preparation. It is just there and just works. The only precondition is that you have some kind of working console to display the messages.
For the early stages in the boot process, where no console is available yet, there is a special function named early_printk, this function writes directly to the VGA buffer or a serial line but otherwise works just like printk -- you have to enable this function by setting CONFIG_EARLY_PRINTK in your kernel config (make menuconfig -> Kernel Hacking -> Early printk).
The major drawback is that printk is quite static, so you have to figure out what you want to trace beforehand and if you want to trace something different you have to recompile your code - which can become quite cumbersome. (And of course printk is not interactive at all, so you can't modify any variables or the like.)
The other drawback is that printing usually consumes quite some processing power and io time, so if you're trying to debug a timing critical section or a timing bug, you're probably out of luck.
Sometimes you are in the unlucky situation that the machine crashes or hangs and you have no way to access the console afterwards, e.g. the graphic driver hangs and and the kernel dies soon after. In this case you could either hook up a serial line to your crashing target machine (if a serial port is available) or use the kernels netconsole feature to enable printk logging via UDP.
In order to use it you have to enable the CONFIG_NETCONSOLE kernel config option (make menuconfig -> Device Drivers -> Network device support -> Network core driver support -> Network console logging support) and configure it appropriately by using the netconsole configuration parameter
netconsole=[src-port]@[src-ip]/[<dev>],[tgt-port]@<tgt-ip>/[tgt-macaddr] where src-port source for UDP packets (defaults to 6665) src-ip source IP to use (interface address) dev network interface (eth0) tgt-port port for logging agent (6666) tgt-ip IP address for logging agent tgt-macaddr ethernet MAC address for logging agent (broadcast)
e.g. specify at kernel commandline (in your bootloader)
to send messages from 10.0.0.1 port 4444 via eth1 to 10.0.0.2 port 9353 with the MAC 12:34:56:78:9a:bc
or while loading the module e.g
# insmod netconsole netconsole=@/,@10.0.0.2/
to send messages via broadcast to 10.0.0.2 port 6666
On the other machine you can now simply fire up
# netcat -u -l -p <port>
$ netcat -u -l -p 6666
and see the printk messages from your target dribbling in.
If you don't see any messages you might have to set the console_loglevel to a higher value (see above) or test the connection via telnet e.g. from the target type
$ telnet 10.0.0.2 6666
see Documentation/networking/netconsole.txt for more details.
clears the dmesg buffer. Sometimes it is nice to start with a blank buffer, so you will only see new messages when you invoke dmesg
Setting this kernel config option prepends every printk statement with a timestamp representing the time since boot. This is particularly useful to get a general idea about the timings of your code.
You can also specify an argument on the kernel command line to enable this, or you can enable it any time at runtime by doing the following:
$echo 1 >/sys/module/printk/parameters/time
Also, there are tools available to use the information to show relative times between printks (scripts/show_delta) and create graphs of durations in the kernel (scripts/bootgraph.pl)
See Printk Times for more details
Printing buffers as hex
If you want to print a buffer as hex within the kernel, don't reinvent the wheel use printk_hex_dump_bytes() instead.
print_hex_dump_bytes(const char *prefix_str, int prefix_type, const void *buf, size_t len)
this function prints a buffer as hex values to the kernel log buffer (with level KERN_DEBUG) Example:
Kernel Code: char mybuf = "abcdef"; print_hex_dump_bytes("", DUMP_PREFIX_NONE, mybuf, ARRAY_SIZE(mybuf));
dmesg output: 61 62 63 64 65 66 00 abcdef.
If you need something more sophisticated and flexible maybe have a look at print_hex_dump() and hex_dump_to_buffer()
It is also possible to enable/disable debug information at runtime using the dynamic debug functionality of the kernel. For this the CONFIG_DYNAMIC_DEBUG kernel config option must be set. See Documentation/dynamic-debug-howto.txt for more information.
- Linux Kernel Development, Robert Love, 3rd Edition, Chapter 18 Debugging
- Linux Device Drivers, Corbet, Rubini and Kroah-Hartmann, 3rd Edition, Chapter 4 Section 2
- Essential Linux Device Drivers