Tracing Collaboration Project

Introduction
This page holds information and, where available, documents used for collaboration between some of the major tracing systems for Linux. For a list of tracing systems available for the Linux kernel, see: Kernel Tracing Systems

''NOTE: This work was started in 2006, and since then the list of tracing systems for the kernel has changed somewhat, with some of the tracing systems being obsoleted by ftrace or perf. This material is left here for reference.''

Tracing Collaboration
In order to avoid duplication of effort, we would like to have members of the major tracing systems collaborate.

Ideas for collaborating
Here are some proposed areas of tracing collaboration:
 * sharing of trace point definitions
 * sharing of in-kernel high-res/low cost timestamp services
 * sharing of post-processing tools
   * requires common input (standard trace data format)
 * collaboration on static tracepoint mainlining

Some of these have been discussed:
 * Jose Santos proposes standardization on LKST post-processing tools: http://sources.redhat.com/ml/systemtap/2006-q3/msg00193.html
 * Jian Gui is measuring timer cost of various kernel timers: http://sources.redhat.com/ml/systemtap/2006-q3/msg00232.html
 * Masami Hiramatsu has ported LKST probe points to System Tap: http://sources.redhat.com/ml/systemtap/2006-q3/msg00196.html

Kernel markers

 * Masami Hiramatsu has a proposal for lightweight kernel markers for runtime-inserted tracepoints at arbitrary kernel locations: http://sources.redhat.com/ml/systemtap/2006-q3/msg00273.html
 * Frank Ch Eigler has a proposal for static probe markers (disambiguated from the above by calling them "conditional markers"). He describes them in his System Tap paper for OLS 2006. See page 264 of the proceedings.

Action Items
This section lists the things that need to happen next:
 * need to assemble presentations from OLS 2006 tracing BOF
 * need to document OLS 2006 tracing BOF issues
 * need to determine an area to work together on
 * Tim to propose finalized tracer terminology
   * Tim has a more finished document he prepared for the OLS 2006 BOF, but he needs to publish it
 * use terminology to categorize tracer elements and issues

Linux Trace leaders
Here are some people influential in tracing projects:
 * Mathieu Desnoyers - LTTng lead
 * Tim Bird - KFT maintainer
 * Tohru Nojiri - LKST lead
 * Ingo Molnar - Latency-trace author
 * Vara Prasad - manager for IBM System Tap team
 * Frank Ch Eigler - Red Hat System Tap developer
 * Masami Hiramatsu - Hitachi djprobes/LKST developer
 * Jose Santos - IBM System Tap developer
 * Will Cohen - Red Hat System Tap developer

Meetings and Conferences

 * BOFs and presentations at OLS 2006:
   * Tracing BOF - William Cohen of Red Hat - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=117
     * More information (coming soon) is at: Tracing BOF at OLS 2006
   * Probing the Guts of KProbes - Ananth N Mavinakayanahalli of IBM - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=138
   * Improving the Approach to Linux Performance Analysis - Jose Santos of IBM - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=199
   * Problem Solving With Systemtap - Frank Ch. Eigler of Red Hat - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=17
   * The Frysk Execution Analysis Architecture - Andrew Cagney of Red Hat - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=171
   * The LTTng tracer: A Low Impact Performance and Behavior Monitor for GNU/Linux - Mathieu Desnoyers - http://www.linuxsymposium.org/2006/view_abstract.php?content_key=119
 * Tracing BOF at ELC 2006
   * See notes for this at: Tracing BOF at ELC 2006

Terminology
I think we should reach consensus on tracer terminology. Following is a list of terms and their definitions, as I (Tim Bird) understand them:
 * event - an instruction location or system state at a specific point in time
 * capture - the act of recording event information
 * trace buffer - location where trace data is stored at time of capture
 * trace log - location where trace data is stored long-term
 * post-processing - manipulation of the trace data after the trace is collected
 * configuration interface - the API used to configure the tracing engine
 * control interface - the API used to control the tracing engine
 * transfer interface - the API or mechanism used to move the trace data from kernel to user space
 * configuration - the set of constraints which determine what events are collected and how they are processed in a trace
 * static tracepoint - a trace point statically compiled into the software being traced
 * dynamic tracepoint - a trace point dynamically added to the software being traced
 * aggregation - updating statistics or other analytical information, based on trace events
 * trace time - the time when the trace is active
   * e.g. System Tap can do aggregation at trace time, while KFT and LTTng (mostly) do aggregation during post-processing.
 * filters - criteria used to limit the events that are processed or captured
 * triggers - criteria used to start and stop tracing automatically

[other terms???]

Sentences using these terms:
 * filtering and aggregation can substantially reduce the size of the trace log that is required for a trace.
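
To make the terms above concrete, here is a minimal toy model in Python. It is only a sketch for discussing terminology, not how any of the tracers listed on this page is implemented. All names in it (Event, ToyTracer, run_trace, the "sched_switch" and "irq_entry" locations, and the integer timestamps) are hypothetical and chosen for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An event: a location observed at a specific point in time."""
    timestamp: int  # hypothetical tick count
    location: str   # hypothetical tracepoint name


class ToyTracer:
    """Toy model of the terminology; every name here is an assumption."""

    def __init__(self, event_filter=None, start_trigger=None,
                 stop_trigger=None, aggregate=False):
        # Configuration: constraints on what is collected and how.
        self.event_filter = event_filter
        self.start_trigger = start_trigger
        self.stop_trigger = stop_trigger
        self.aggregate = aggregate
        # Without a start trigger, tracing is active from the beginning.
        self.active = start_trigger is None
        self.trace_buffer = []   # where data is stored at time of capture
        self.trace_log = []      # where data is stored long-term
        self.stats = Counter()   # aggregation done at trace time

    def on_event(self, event):
        # Triggers: criteria that start and stop tracing automatically.
        if not self.active and self.start_trigger is not None:
            if self.start_trigger(event):
                self.active = True
        if not self.active:
            return
        if self.stop_trigger is not None and self.stop_trigger(event):
            self.active = False
            return
        # Filters: criteria that limit which events are processed.
        if self.event_filter is not None and not self.event_filter(event):
            return
        if self.aggregate:
            # Aggregation: update statistics instead of storing the event.
            self.stats[event.location] += 1
        else:
            # Capture: record the event information in the trace buffer.
            self.trace_buffer.append(event)

    def flush(self):
        # Transfer: move captured data from the buffer to the trace log.
        self.trace_log.extend(self.trace_buffer)
        self.trace_buffer.clear()


def run_trace(tracer, events):
    """Feed every event to the tracer, then flush the buffer to the log."""
    for event in events:
        tracer.on_event(event)
    tracer.flush()
    return tracer


# Ten hypothetical events alternating between two locations.
EVENTS = [Event(t, "sched_switch" if t % 2 == 0 else "irq_entry")
          for t in range(10)]

unfiltered = run_trace(ToyTracer(), EVENTS)
filtered = run_trace(
    ToyTracer(event_filter=lambda e: e.location == "irq_entry"), EVENTS)
aggregated = run_trace(ToyTracer(aggregate=True), EVENTS)

print(len(unfiltered.trace_log), len(filtered.trace_log),
      len(aggregated.trace_log), dict(aggregated.stats))
```

In this model the filtered run keeps only the events that match the filter, and the aggregating run keeps per-location counts and writes nothing to the trace log, which is the size reduction described in the sentence above.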

Tracer Taxonomy
I'm working on building a taxonomy of kernel tracer attributes.

See Tracer Taxonomy

See also Tracer Survey Questions for survey questions for trace system leaders.