- memcheck traps every memory access and reports double frees, invalid reads and writes, and leaks. It is very helpful and will dramatically cut down your bug-hunting time, since it provides easy-to-follow backtraces.
- cachegrind and its extension callgrind make cache-usage optimization much easier; you can even use the kcachegrind user interface for easier visualization.
Your embedded system will usually not be supported by valgrind (see supported platforms), but you can still test most of the code on your desktop. To do that, write tests for the hardware-independent libraries and replace hardware-dependent code with stubs. It looks like extra work, but it really pays off.
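As a rough illustration of the stub approach, here is a minimal sketch in C. Everything in it is hypothetical and not taken from any real project: the adc_driver structure, the average_millivolts() function and the 10-bit ADC are assumptions made only for the example. The idea is that the logic under test reaches the hardware through a function pointer, so a desktop build can plug in a stub and run under valgrind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware abstraction: the only place that touches a device. */
typedef struct {
    uint16_t (*read_adc)(void *ctx);   /* returns one raw 10-bit sample */
    void *ctx;                         /* driver-private state */
} adc_driver;

/* Hardware-independent logic under test: average n samples, scale to mV. */
uint32_t average_millivolts(const adc_driver *drv, size_t n, uint32_t vref_mv)
{
    assert(drv != NULL && drv->read_adc != NULL && n > 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += drv->read_adc(drv->ctx);
    /* Assumed 10-bit ADC: full scale is 1023 counts. */
    return (sum / (uint32_t)n) * vref_mv / 1023u;
}

/* Desktop stub: replays canned samples instead of reading a register. */
typedef struct {
    const uint16_t *samples;
    size_t count;
    size_t next;
} stub_state;

uint16_t stub_read_adc(void *ctx)
{
    stub_state *s = ctx;
    uint16_t value = s->samples[s->next % s->count];
    s->next++;
    return value;
}
```

A desktop test would fill a stub_state with known samples, build an adc_driver that points at stub_read_adc, call average_millivolts() and run the resulting binary under valgrind. On the target, the same structure would point at the real register-reading function.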
Memcheck is the most famous tool of the suite; it checks for memory-related problems. Its usage is very simple: compile your application with debug information (-g) so the reports are more useful, and run:
valgrind --tool=memcheck --leak-check=full ./binary arg1 arg2
It reports the number of allocations and frees, and the leak status. You can also run with --show-reachable=yes to report globals and other memory that is not released but is still referenced when the program finishes. If you need better resolution of backtraces, try --leak-resolution=high. Child processes are not traced by default; use --trace-children=yes to do so. If you find it hard to read the output (written to stderr) mixed with your own program's messages, use --log-file=filename and it will be saved there.
==19182== Invalid write of size 4
==19182==    at 0x804838F: f (example.c:6)
==19182==    by 0x80483AB: main (example.c:11)
==19182==  Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182==    by 0x8048385: f (example.c:5)
==19182==    by 0x80483AB: main (example.c:11)
First of all, it is a backtrace, so read it like this: the last line is the topmost (parent/root) call, while the first line is the leaf, where the problem happened. In this case we see 2 traces from process 19182 (the number between == is the process ID):
- Error message: Invalid write of size 4 means that something 4 bytes wide, possibly an integer or a pointer on a 32-bit platform, was stored to memory that was neither allocated with malloc() nor on the stack. This happened at line 6 of example.c, in f(), which was called from main() at line 11 of example.c. If you look at the code in the valgrind quickstart you will see how easy it is to spot the problem, but we can even guess it without looking at the code!
- Error cause: Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd means the invalid write went to address 0x1BA45050, which lies immediately past the end of a 40-byte buffer allocated at line 5 of example.c, i.e. an overflow. Again, looking at the code makes the problem very easy to spot, but you can guess what is wrong just by reading the backtrace: go there and fix the overflow!
If it were not an overflow but a write to a bogus address, it would read:
==19182== Address 0x1BA45050 is not stack'd, malloc'd or free'd
If the memory had already been freed somewhere else, it would read:
==19182== Address 0x1BA45050 is 0 bytes inside a block of size 4 free'd
==19182==    at 0x4004FFDF: free (vg_clientmalloc.c:577)
==19182==    by 0x80484C7: main (x.c:10)
A leak would be reported as:
==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182==    by 0x8048385: f (a.c:5)
==19182==    by 0x80483AB: main (a.c:11)
That means the memory allocated at line 5 of a.c was never released, and since function f() has returned and nothing else (e.g. a global) refers to it, it is definitely lost!
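A minimal sketch of the pattern behind a "definitely lost" report, and its fix, might look like this. The copied_length() function is hypothetical and exists only for the example: it allocates a temporary buffer whose only reference is a local variable, so the buffer has to be released before the function returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of a temporary copy of s, or -1 if malloc() fails. */
long copied_length(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, n);
    long len = (long)strlen(copy);
    /* Without this free(), the only pointer to the block disappears when
     * the function returns and memcheck reports it as definitely lost. */
    free(copy);
    return len;
}
```

If the pointer were stored in a global instead, the block would still be referenced at exit, which is the case that --show-reachable=yes reports.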
Double frees are also easy to spot:
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
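One common defensive habit against double frees can be sketched as below. It is only one possible approach, and free_and_clear() is a hypothetical helper rather than a standard function. It resets the pointer after freeing it, so a second call does nothing, since free(NULL) is defined as a no-op.

```c
#include <assert.h>
#include <stdlib.h>

/* Frees *pp and resets it to NULL, so calling it again on the same
 * variable is harmless. It only protects this one variable: any other
 * copy of the pointer still dangles. */
void free_and_clear(void **pp)
{
    assert(pp != NULL);
    free(*pp);
    *pp = NULL;
}
```

The helper takes a void **, so it is meant to be used with variables declared as void *. It does not replace running memcheck, because a stale copy of the pointer that is freed elsewhere would still show up as an Invalid free().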
Remember to read the messages carefully! You may find false positives, especially in system libraries that are heavily optimized or keep internal caches, such as libx11, libpython, libdl and others. If you do find false positives, write a suppression file and pass it with --suppressions=filename; running with --gen-suppressions=all prints ready-made entries you can copy into that file.
Cachegrind simulates the I1, D1 and L2 caches as well as branch prediction, providing enough information to optimize critical/hot paths.
Its usage is similar to memcheck:
valgrind --tool=cachegrind --branch-sim=yes ./binary arg1 arg2
Cachegrind detects your machine's cache configuration with the CPUID instruction, or falls back to defaults if that is not available. To simulate a different configuration, add the options --I1=size,associativity,linesize for the instruction cache, --D1=... for the data cache, and so on.
After the run it shows accesses and misses for the instruction and data caches and saves a file called cachegrind.out.PID. You can feed this data file to kcachegrind or cg_annotate. See the Valgrind user manual for in-depth documentation of every column printed by cg_annotate, as well as the options to change them, but the usual invocation is:
cg_annotate --auto=yes cachegrind.out.PID
It is nicer to use kcachegrind, which shows the same data with colors and other graphical details.
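To have something concrete to profile, a small sketch like the following can help. Both functions are hypothetical examples, not part of valgrind. They compute the same sum over a matrix stored row by row, but the second one walks it column by column, which typically touches a different cache line on every access once the matrix is large enough.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a rows x cols matrix stored row by row, visiting elements in the
 * order they sit in memory (consecutive addresses). */
long sum_row_major(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            sum += a[r * cols + c];
    return sum;
}

/* Same result, but the inner loop jumps cols elements at a time, so
 * successive reads are far apart in memory. */
long sum_col_major(const int *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            sum += a[r * cols + c];
    return sum;
}
```

Calling both on a matrix of a few thousand rows and columns and running the program under cachegrind should show noticeably more D1 read misses attributed to sum_col_major() in the cg_annotate output. The exact numbers depend on the simulated cache sizes.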