Crash handler

This page describes Sony's embedded crash_handler program.

This is a new crash_handler written by Tim Bird, of Sony, for use in embedded Linux products.

It is derived from Android's debuggerd, a crash handler and debugging aid, and is licensed under the Apache license.

Description
Crash_handler is a crash report generator for embedded Linux systems. It uses features of recent Linux kernels to capture process crash events and save individual crash reports, and it also records information about the overall crash history of a device.

It was originally based on Android's debuggerd, which provides similar functionality. However, debuggerd requires a dedicated debugging process to run permanently on the system, whereas crash_handler does not.

When a crash occurs, crash_handler collects information about the dying process from /proc, possibly from the kernel message log, and from the process memory image, which it queries using ptrace. This information is saved in a crash report. Up to 10 crash reports are kept before the oldest ones start being overwritten. A crash journal file is also maintained, which records information about the crash history of the device.
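As a rough illustration of the /proc collection step, here is a minimal Python sketch. It is not crash_handler's own code. The page does not say which /proc entries are read, so the choice of `cmdline` and `maps` is an assumption, and the function names `read_proc_field` and `collect_proc_info` are hypothetical. The ptrace and kernel log steps are left out.

```python
import os


def read_proc_field(proc_root, pid, name, limit=4096):
    """Return up to `limit` bytes of <proc_root>/<pid>/<name>, or b"" if unreadable."""
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path, "rb") as f:
            return f.read(limit)
    except OSError:
        # The process may already be gone, or the entry may not exist.
        return b""


def collect_proc_info(pid, proc_root="/proc"):
    """Gather a few per-process entries (assumed set: cmdline and maps)."""
    cmdline = read_proc_field(proc_root, pid, "cmdline")
    # /proc/<pid>/cmdline separates the arguments with NUL bytes.
    args = [a.decode("utf-8", "replace") for a in cmdline.split(b"\0") if a]
    maps = read_proc_field(proc_root, pid, "maps").decode("utf-8", "replace")
    return {"pid": pid, "args": args, "maps": maps.splitlines()}
```

The `proc_root` parameter exists only so the sketch can be tried against a directory of sample files instead of a live /proc. The read limit reflects the small per-report size budget, although the value 4096 is arbitrary.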

Features

 * Ability to capture crash reports on processes, with no modification to the process or its libraries
   * crash_handler is completely transparent to the crashing program
 * Crash report has information from the process image, /proc, and the kernel log
 * Automatic log rotation (fixed number of crash reports)
 * Crash journal to record crash patterns
 * Very small crash report storage budget
   * Crash journal is under 4 KB
   * Each crash report is under 12 KB, limited to 10 crash reports
 * ARM stack unwinding on target, even without symbol or unwind information
 * Requires no separate running process or daemon
   * crash_handler is started on demand, and exits when done
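
The fixed-size rotation can be pictured with the following Python sketch. It is only one plausible way to choose which report to overwrite, not necessarily what crash_handler does. Picking the oldest file by modification time is an assumption, and `report_path` and `next_report_slot` are hypothetical names. The file name pattern 'crash_report_0x' and the limit of 10 come from this page.

```python
import os

MAX_REPORTS = 10  # from the text: at most 10 crash reports are kept


def report_path(directory, slot):
    # Names follow the 'crash_report_0x' pattern, with x from 0 to 9.
    return os.path.join(directory, "crash_report_0%d" % slot)


def next_report_slot(directory):
    """Pick the slot to write: the first unused one, else the oldest by mtime."""
    oldest_slot, oldest_mtime = 0, None
    for slot in range(MAX_REPORTS):
        try:
            mtime = os.stat(report_path(directory, slot)).st_mtime_ns
        except FileNotFoundError:
            return slot  # free slot, nothing needs to be overwritten
        if oldest_mtime is None or mtime < oldest_mtime:
            oldest_slot, oldest_mtime = slot, mtime
    return oldest_slot  # all slots used (assumption: overwrite the oldest)
```

With the limits listed above, the worst-case footprint stays below 10 x 12 KB + 4 KB = 124 KB for the reports and the journal together.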

Usage Guide (outline)

 * Build using your cross-compiler, and place the binary somewhere on your target system
 * Install by running the following as root: 'crash_handler --install'
   * This sets /proc/sys/kernel/core_pattern to the correct string
 * Output:
   * The crash journal is at: /tmp/crash_journal
   * Individual crash reports are in: /tmp/crash_reports/, and have the names 'crash_report_0x', where x is a number from 0 to 9
 * Retrieve the crash reports and process them on the host with 'crash_syms'
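
For background on the install step: when the core_pattern string starts with '|', the kernel runs the named program on each crash and feeds it the core image on standard input, expanding specifiers such as %p (PID), %s (signal number) and %t (time of dump). The Python sketch below builds such a string. The exact string written by 'crash_handler --install' is not given on this page, so the specifier list and the names `build_core_pattern` and `install` are assumptions for illustration only.

```python
CORE_PATTERN_FILE = "/proc/sys/kernel/core_pattern"
CORE_PATTERN_MAX = 127  # length limit given in the kernel sysctl documentation


def build_core_pattern(handler_path, specifiers=("%p", "%s", "%t")):
    """Build a piped core_pattern: '|' + program path + space-separated arguments."""
    # This sketch insists on an absolute path for the handler program.
    if not handler_path.startswith("/"):
        raise ValueError("handler path must be absolute")
    pattern = "|" + " ".join((handler_path,) + tuple(specifiers))
    if len(pattern) > CORE_PATTERN_MAX:
        raise ValueError("core_pattern too long")
    return pattern


def install(handler_path, pattern_file=CORE_PATTERN_FILE):
    """Write the pattern; needs root when pattern_file is the real /proc entry."""
    with open(pattern_file, "w") as f:
        f.write(build_core_pattern(handler_path) + "\n")
```

Because the kernel starts the program only when a process dumps core, this mechanism fits the "no separate daemon" property listed under Features.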

Presentations

 * [[Media:Embedded-Appropriate Crash Handling in Linux.pdf|Embedded-Appropriate Crash Handling in Linux (PDF)]]
   * Presentation from ELC 2012, by Tim Bird

Required kernel patches
The patch is included here for quick reference. The patch and its instructions are also in the source tarball, starting with version 0.6.



To Do

 * Add support for x86
   * Currently, the code only handles stack backtraces for ARM processors. However, there is a generic library for this at http://www.nongnu.org/libunwind that might make it possible to support multiple processor architectures.

Resources
To discuss this crash handler, use the celinux-dev mailing list. See http://lists.celinuxforum.org/mailman/listinfo/celinux-dev