
Profilers

From eLinux.org

Latest revision as of 12:24, 18 January 2012

Linux systems come with a wide variety of profilers, each with its own pros and cons. There is no magic bullet, so it is recommended to use more than one tool when analyzing your application. I recommend using at least OProfile and Valgrind.

OProfile

OProfile is a non-intrusive, system-wide profiler for Linux. It can use the system's hardware performance counters to give you insight into where to optimize your code.

Helper script for simple runs (remember to run opcontrol --setup ... once before this script):

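#!/bin/sh
# Usage: pass the command to profile (and its arguments) to this script.
sudo opcontrol --init     # load the OProfile module and driver interface
sudo opcontrol --start    # start the daemon and begin collecting samples
sudo opcontrol --dump     # flush samples collected so far
sudo opcontrol --reset    # clear them so the report covers only this run
"$@"                      # run the command being profiled (quoted to keep arguments intact)
sudo opcontrol --stop     # stop collecting samples
opreport --symbols > oprof.txt   # write the per-symbol report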

see also oprof_start

User Interfaces

oprof_start

A Qt3 GUI for OProfile that ships in the official distribution; see http://oprofile.sourceforge.net/doc/oprofile-gui.html

KCachegrind

OProfile reports can be converted into a format that KCachegrind can read (for more instructions see http://docs.kde.org/kde3/en/kdesdk/kcachegrind/using-kcachegrind.html):

opreport -gdf | op2callgrind 

OProfileUI

Another useful application is OProfileUI. It provides a GTK+2 user interface to run profiles and show statistics, including a basic call graph.

Documentation

In depth explanation can be found at:

Caveats

  • JFFS2 does not meet OProfile's requirements, so running OProfile with its output on a JFFS2 partition will fail. Instead, use a ramfs or an external card formatted with ext2/3:
mkdir /var/lib/oprofile
# ram
mount -t ramfs none /var/lib/oprofile
# ext3
mount -t ext3 /dev/sda1 /var/lib/oprofile
  • Setting an event counter too low or too high may hang your test machine. Check the recommended value for each event on your specific platform.
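
One way to catch the JFFS2 problem before a run is to check which filesystem backs the OProfile session directory. The following is only a sketch: it assumes a Linux-style /proc/mounts table, and the function names fs_type_of and check_oprofile_dir are made up for illustration and are not part of OProfile.

```shell
#!/bin/sh
# Sketch only: fs_type_of and check_oprofile_dir are hypothetical helpers,
# not OProfile commands.

# Print the filesystem type of the mount containing path $1.
# $2 is a /proc/mounts-style table (defaults to /proc/mounts).
fs_type_of() {
    awk -v p="$1" '
        {
            mp = $2
            # A mount point matches if it is "/", equals p, or is a
            # prefix of p that ends on a path boundary.
            if (mp == "/" || p == mp || index(p, mp "/") == 1) {
                # Keep the longest match; ">=" lets later mounts win ties.
                if (length(mp) >= best) { best = length(mp); fstype = $3 }
            }
        }
        END { print fstype }
    ' "${2:-/proc/mounts}"
}

# Warn (and return non-zero) if directory $1 lives on JFFS2.
check_oprofile_dir() {
    dir=${1:-/var/lib/oprofile}
    fstype=$(fs_type_of "$dir" "${2:-/proc/mounts}")
    if [ "$fstype" = "jffs2" ]; then
        echo "warning: $dir is on jffs2; mount a ramfs or ext2/3 filesystem there" >&2
        return 1
    fi
    echo "$dir is on ${fstype:-unknown}"
}
```

For example, check_oprofile_dir /var/lib/oprofile prints the filesystem type, or a warning with a non-zero exit status when the directory is on jffs2.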


GProf

Unlike OProfile, GProf requires applications to be compiled with the special flag -pg, so that the compiler inserts instrumentation into the generated binary to measure runs. After a run, a gmon.out file is created with the measured data; this file can be viewed with tools like gprof or kprof.
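
The compile, run and report steps described above can be sketched as a small shell function. This is an illustration, not a canonical workflow: it assumes GCC as the compiler, and the function name gprof_run and the file names in the usage line (app.c, ./app, gprof.txt) are placeholders.

```shell
#!/bin/sh
# Sketch only: gprof_run is a hypothetical helper name.
# Usage: gprof_run source.c ./binary report.txt [program args...]
gprof_run() {
    src=$1
    bin=$2
    report=$3
    shift 3
    # -pg makes GCC add the profiling instrumentation; it is needed
    # when compiling and when linking.
    gcc -pg -o "$bin" "$src" || return 1
    # Running the instrumented binary writes gmon.out into the current
    # directory when the program exits normally.
    "$bin" "$@" || return 1
    # gprof combines the binary's symbols with gmon.out into a text report.
    gprof "$bin" gmon.out > "$report"
}
```

A call such as gprof_run app.c ./app gprof.txt would leave the flat profile and call graph in gprof.txt.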

User Interfaces

KProf

Outdated, but exists: http://kprof.sourceforge.net/

Documentation

Caveats

  • Imprecise
  • Huge overhead

ARM Streamline

ARM Streamline (http://www.arm.com/streamline) is a commercial sample-based system performance analyzer which brings together performance counters from the core(s) and OS, time- and event-based profiling, context switch tracepoints, and instrumented messages (like printf) to provide developers with a system-to-instruction drill-down ability.

Streamline is a component of the ARM DS-5 suite, which has a 30-day evaluation version available for download (http://www.arm.com/products/tools/software-tools/ds-5/ds-5-downloads.php).


Valgrind Cachegrind/Callgrind

Valgrind is a tool suite for debugging and profiling. It is most famous for its memcheck tool, which checks for memory leaks, invalid accesses, double frees and more, but it also ships with the callgrind and cachegrind profiling tools.

cachegrind simulates the first-level instruction and data caches (I1 and D1) and the L2 (last-level) cache, so you can figure out which parts of your code are thrashing the caches.

callgrind is an extension of cachegrind that will collect call graphs.
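
As a rough sketch of how the two tools are typically invoked, the functions below run a program under each tool and print an annotated summary. The function names profile_cache and profile_calls are made up for illustration; the output file is named explicitly so it does not depend on the process ID.

```shell
#!/bin/sh
# Sketch only: profile_cache and profile_calls are hypothetical helper names.

# Usage: profile_cache out-file program [args...]
profile_cache() {
    out=$1
    shift
    # Run the program under the cache simulator and write the results to $out.
    valgrind --tool=cachegrind --cachegrind-out-file="$out" "$@" || return 1
    # Summarize cache misses per function.
    cg_annotate "$out"
}

# Usage: profile_calls out-file program [args...]
profile_calls() {
    out=$1
    shift
    # Same idea, but callgrind also records the call graph.
    valgrind --tool=callgrind --callgrind-out-file="$out" "$@" || return 1
    # Summarize costs per function.
    callgrind_annotate "$out"
}
```

For example, profile_calls callgrind.out ./app would produce a callgrind.out file that can also be opened in KCachegrind.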

User Interfaces

KCachegrind

The best-known user interface is KCachegrind.

Documentation

Caveats

  • Since these are simulators, your application will run very slowly: 20 to 100 times slower, or more.