Here is a listing of some instrumentation systems for the kernel:
Existing Instrumentation Systems
Andrew Morton's system for measuring intervals between kernel events:
Adds a timestamp to each printk message. As of kernel 2.6.11 this is part of the mainline kernel, enabled by CONFIG_PRINTK_TIME. Earlier kernels can add it with a very simple patch. It works for bootup time measurements, and for other places where you can just jam in a printk or two.
See Printk Times
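As an illustration of how the timestamps can be turned into intervals, here is a minimal sketch that reads log lines and reports the time elapsed since the previous stamped line. The function name printk_deltas and the sample messages are made up for this example; the only thing assumed about the input is the "[seconds.fraction]" prefix that CONFIG_PRINTK_TIME adds, optionally preceded by a "<n>" log level.

```python
import re

# Matches an optional "<level>" prefix followed by the "[   seconds.fraction]"
# timestamp that CONFIG_PRINTK_TIME puts in front of each message.
TIMESTAMP = re.compile(r"^(?:<\d+>)?\[\s*(\d+\.\d+)\]\s?(.*)$")


def printk_deltas(lines):
    """Return (timestamp, delta_from_previous, message) for each stamped line."""
    results = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line.rstrip("\n"))
        if match is None:
            continue  # no printk timestamp on this line; ignore it
        stamp = float(match.group(1))
        delta = 0.0 if previous is None else stamp - previous
        results.append((stamp, delta, match.group(2)))
        previous = stamp
    return results


# Hypothetical sample input, only to show the call.
sample = [
    "[    0.000000] first message",
    "[    0.250000] second message",
]
for stamp, delta, message in printk_deltas(sample):
    print(f"{stamp:12.6f} (+{delta:.6f}) {message}")
```

Large deltas point at the stretch of boot (or other code) between two printks that is worth a closer look.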
Kernel Function Instrumentation (KFI)
A system which uses a compiler flag to instrument most of the functions in the kernel. Timing data is recorded at each function entry and exit. The data can be extracted and displayed later with a command-line program.
The kernel portion of this is available in the CELF tree now.
Grep for CONFIG_KFI.
See the Kernel Function Instrumentation page for some preliminary notes.
FIXTHIS - need to isolate this as a patch.
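To make the entry/exit idea concrete, here is a rough sketch of the kind of post-processing a display tool could do: pair each function entry with its exit and compute how long the call took. The record layout (time in microseconds, "entry" or "exit", function name) and the name call_durations are assumptions made for this example; this is not KFI's actual data format or tool.

```python
def call_durations(events):
    """Pair entry/exit events and return (function, depth, duration) per call.

    events is an iterable of (time_us, kind, function) tuples, where kind is
    "entry" or "exit". This layout is hypothetical, chosen for illustration.
    """
    stack = []  # calls that have been entered but not yet exited
    calls = []
    for time_us, kind, function in events:
        if kind == "entry":
            stack.append((function, time_us))
        elif kind == "exit":
            if not stack or stack[-1][0] != function:
                continue  # unmatched exit, e.g. tracing started mid-call
            _, entry_time = stack.pop()
            # len(stack) after the pop is the nesting depth of this call.
            calls.append((function, len(stack), time_us - entry_time))
    return calls
```

Calls are reported in the order they finish, so nested calls appear before the function that called them.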
Linux Trace Toolkit
Kernel Tracer (in IKD patch)
This is part of a general kernel tools package, maintained by Andrea Arcangeli.
The ktrace implementation is in the file kernel/debug/profiler.c. It was originally written by Ingo Molnar, Richard Henderson and/or Andrea Arcangeli.
It uses the compiler flag -pg to add profiling instrumentation to the kernel.
Function trace in KDB
In January 2002, Jim Houston sent a patch to the kernel mailing list which provides support for compiler-instrumented function calls.
Ftrace
Ftrace is a simple function tracer that initially came from the -rt patches and was mainlined in 2.6.27. Compiler profiling features are used to insert an instrumentation call, which can be overwritten with a NOP sequence so that overhead is minimal when tracing is disabled. A number of tracers in the kernel use ftrace to trace high-level events such as irq enabling/disabling, preemption enabling/disabling, scheduler events and branch profiling.
The interface to ftrace is found in the tracing directory of a mounted debugfs (for example /debugfs/tracing), and is documented in Documentation/ftrace.txt (Documentation/trace/ftrace.txt in later kernels).
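Because the interface is just a set of files, it can be driven from any language. The sketch below selects a tracer, switches tracing on around a workload, and returns the contents of the trace file. It assumes the control files current_tracer, tracing_on and trace are present in the tracing directory (tracing_on may be missing on the earliest ftrace kernels), that the caller has permission to write them, and that the directory path is passed in by the caller. The function name trace_with is made up for this example.

```python
import os


def trace_with(tracer, tracing_dir, workload):
    """Select a tracer, run workload() with tracing on, return the trace text."""

    def write(name, value):
        # Each control file takes a short text value.
        with open(os.path.join(tracing_dir, name), "w") as f:
            f.write(value)

    write("current_tracer", tracer)
    write("tracing_on", "1")
    try:
        workload()
    finally:
        write("tracing_on", "0")  # stop recording even if the workload fails

    with open(os.path.join(tracing_dir, "trace")) as f:
        text = f.read()

    write("current_tracer", "nop")  # "nop" selects no tracer
    return text


# Example call (needs root and a mounted debugfs; the path depends on the mount):
# print(trace_with("function", "/sys/kernel/debug/tracing", lambda: os.listdir("/")))
```

Reading the trace only after tracing is switched off keeps the act of reading from adding its own events to the buffer.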
SystemTap / Kprobes
SystemTap is a sophisticated kernel instrumentation tool that can be scripted with its own language to gather information about a running kernel. It uses the Kprobes infrastructure to implement its tracing.
Some random thoughts on instrumentation:
- Most instrumentation systems need lots of memory to buffer the data produced
- Some instrumentation systems support filters or triggers to allow for better control over the information saved
- Instrumentation systems tend to introduce overhead or otherwise interfere with the thing they are measuring
- Instrumentation systems tend to pollute the processor's cache lines
- There doesn't seem to be a single API for in-kernel timing instrumentation that is supported across many different architectures. This is the main reason for CELF's current project to define an Instrumentation API.