Valgrind

Valgrind is an award-winning tool suite that helps find memory and threading bugs. Its tools include the well-known memcheck and cachegrind:


 * memcheck traps every memory access and reports double frees, invalid reads and writes, and leaks. It is very helpful and can dramatically cut down your bug-hunting time, since it provides easy-to-follow backtraces.
 * cachegrind and its extension callgrind make cache usage optimizations much easier; you can even use the kcachegrind user interface for easier visualization.

As of August 2020, valgrind supports the following architectures and platforms:
 * {x86,amd64,arm32,arm64,ppc32,ppc64le,ppc64be,s390x,mips32,mips64}-linux
 * {arm32,arm64,x86,mips32}-android
 * {x86,amd64}-solaris
 * {x86,amd64}-darwin (Mac OS X 10)

See the supported platforms list on the Valgrind website for details.

If your embedded platform is not supported, you can still use valgrind on your main application by compiling it for your desktop (or some other system where valgrind is supported). You can do that by rewriting portions of your application on top of hardware-independent libraries and replacing hardware-dependent code with stubs. This might be a lot of work, depending on your application, but it really pays off when hunting memory bugs.
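As a rough sketch of the stubbing idea, the hardware access can be hidden behind one function that has a desktop stand-in. Everything named here (`adc_read`, `sample_channels`, the `TARGET_BUILD` macro) is hypothetical and only illustrates the shape; your application will have its own hardware layer.

```c
#include <stdint.h>
#include <stdlib.h>

/* TARGET_BUILD is a hypothetical macro you would define yourself
   (e.g. -DTARGET_BUILD) for the embedded build; valgrind does not provide it. */
#ifdef TARGET_BUILD
/* Target build: the real, hardware-dependent implementation lives elsewhere. */
uint16_t adc_read(unsigned channel);
#else
/* Desktop stub: returns canned data instead of touching hardware. */
uint16_t adc_read(unsigned channel)
{
    return (uint16_t)(100u * channel);
}
#endif

/* Hardware-independent logic: this is the part worth running under valgrind. */
uint16_t *sample_channels(unsigned count)
{
    uint16_t *samples = malloc(count * sizeof *samples);
    if (samples == NULL)
        return NULL;
    for (unsigned i = 0; i < count; i++)
        samples[i] = adc_read(i);
    return samples; /* caller must free() */
}
```

Built for the desktop without `TARGET_BUILD`, the allocation and indexing in `sample_channels` can be checked by memcheck even though the real ADC is not present.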

memcheck
This is the most famous tool of the suite; it checks for memory-related problems. Its usage is very simple: compile your application with debug information so the reports are more useful, and run:

valgrind --tool=memcheck --leak-check=full ./binary arg1 arg2

It will report the number of allocations and frees and the leak status. Memcheck can also report globals and other memory that is not released but still referenced when the program finishes. If you need deeper backtraces, the number of callers shown can be increased. Child processes are not traced by default, but tracing them can be enabled. If you find it hard to read the output (written to stderr) mixed with your own program's messages, you can have the report saved to a log file instead.
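The option names did not survive in the paragraph above. My assumption is that it refers to these standard valgrind options: `--show-reachable=yes` for still-referenced memory, `--num-callers=<n>` for deeper backtraces, `--trace-children=yes` for child processes, and `--log-file=<file>` for saving the report. The sketch below is a hypothetical program (the names `g_banner` and `init_banner` are made up) whose allocation is still referenced by a global at exit, which is the kind of block the first option reports.

```c
#include <stdlib.h>
#include <string.h>

/* Global pointer: the block it refers to is still referenced when the
   program exits, so it is "still reachable" rather than lost. */
char *g_banner = NULL;

/* Allocates the banner once and never frees it on purpose. */
void init_banner(void)
{
    g_banner = malloc(32);
    if (g_banner != NULL)
        strcpy(g_banner, "hello from init_banner");
}
```

A program that calls `init_banner()` and exits without freeing `g_banner` should show the block only when reachable blocks are requested, assuming the options listed above.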

Output example is easy to understand; for more explanation see the Valgrind quickstart or user manual:

==19182== Invalid write of size 4
==19182==    at 0x804838F: f (example.c:6)
==19182==    by 0x80483AB: main (example.c:11)
==19182== Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd
==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182==    by 0x8048385: f (example.c:5)
==19182==    by 0x80483AB: main (example.c:11)

First of all, it is a backtrace, so read it like this: the last line is the topmost (parent/root) call, while the first line is the leaf, where the problem happened. In this case we see 2 traces from process 19182 (the number between == is the process id):
 * Error message: Invalid write of size 4 means that possibly an integer or pointer on a 32-bit platform was stored in memory that was not allocated with malloc or on the stack. This happened in f at line 6 of example.c, which was called from main at line 11. If you look at the code in the Valgrind quickstart you'll see how easy it is to spot the problem, but we can even guess without looking at the code!
 * Error cause: Address 0x1BA45050 is 0 bytes after a block of size 40 alloc'd means the invalid write was to address 0x1BA45050, which is just past the end of a 40-byte buffer allocated in f at line 5 of example.c. Again, looking at the code makes the problem easy to spot, but you can guess what it is just by reading the backtrace. No need to think, just go there and fix the overflow!
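The kind of code that produces such a report can be sketched as follows. This is a hypothetical reconstruction, not the quickstart source itself: the function name `make_array` is made up, and the offending write is left in a comment so the function as written is valid.

```c
#include <stdlib.h>

/* Allocates room for 10 ints (40 bytes where int is 4 bytes). */
int *make_array(void)
{
    int *x = malloc(10 * sizeof(int));
    if (x == NULL)
        return NULL;
    /* x[10] = 0;  <- the bug: index 10 starts 0 bytes after the block,
                      a 4-byte write past a 40-byte allocation */
    x[9] = 0;      /* last valid element */
    return x;      /* caller must free() */
}
```

Valid indices run from 0 to 9, so a write to `x[10]` lands exactly where the report says: 0 bytes after the block.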

If it was not an overflow, but a write to bogus memory, it would read:

==19182== Address 0x1BA45050 is not stack'd, malloc'd or free'd

If it was free'd in some other place it would read:

==19182== Address 0x1BA45050 is 0 bytes inside a block of size 4 free'd
==19182==    at 0x4004FFDF: free (vg_clientmalloc.c:577)
==19182==    by 0x80484C7: main (x.c:10)

A leak would be reported as:

==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19182==    at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182==    by 0x8048385: f (a.c:5)
==19182==    by 0x80483AB: main (a.c:11)

That means that the memory allocated at line 5 of a.c is not released, and since function f finished and nothing else refers to it (e.g. globals), it is definitely lost!
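A "definitely lost" block typically comes from a function whose only pointer to the allocation is a local variable that goes out of scope. As a minimal sketch of the fix, the hypothetical function below (the name `sum_squares` is made up) uses a temporary heap buffer and releases it before every return.

```c
#include <stdlib.h>

/* Sums the squares 0*0 + 1*1 + ... + (n-1)*(n-1) using a temporary heap
   buffer. Returns -1 if n is not positive or the allocation fails. */
int sum_squares(int n)
{
    if (n <= 0)
        return -1;

    int *x = malloc((size_t)n * sizeof *x);
    if (x == NULL)
        return -1;

    int total = 0;
    for (int i = 0; i < n; i++) {
        x[i] = i * i;
        total += x[i];
    }

    free(x); /* without this, the block is lost as soon as x goes out of scope */
    return total;
}
```

Removing the `free(x)` line should reproduce the leaky shape, since nothing else holds the pointer after the function returns.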

Double frees are also easy:

Invalid free
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
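One common way to make a repeated free harmless is to reset the pointer after releasing it, since `free(NULL)` is defined to do nothing. The helper below is a hypothetical sketch (the name `release` is made up), not something taken from the test program in the trace.

```c
#include <stdlib.h>

/* Frees *pp and resets it to NULL, so calling release() again on the
   same pointer ends up as free(NULL), which is a no-op. */
void release(char **pp)
{
    free(*pp);
    *pp = NULL;
}
```

This only protects callers that go through the same pointer variable; a second copy of the pointer would still be dangling, and memcheck would still flag a free through it.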

Remember to read messages carefully! You may find false positives, especially with system libraries that are optimized or cache data in some way, like libx11, libpython, libdl and others. If you find false positives, write a suppression file and use it.

Cachegrind
This tool simulates the I1, D1 and L2 caches as well as branch prediction, providing enough information to optimize critical/hot paths.

Its usage is similar to memcheck:

valgrind --tool=cachegrind --branch-sim=yes ./binary arg1 arg2

Cachegrind uses the cache parameters of your current machine, queried from the CPU where possible, or some defaults if they are not available. If you wish to simulate something else, there are command-line options to set the instruction cache, the data cache, and so on.

After the run it shows usage and misses for the instruction and data caches and saves a file called cachegrind.out.PID. You can feed this data file to kcachegrind or cg_annotate. See the Valgrind user manual for in-depth documentation of every column printed by cg_annotate, as well as options to change them, but the usual invocation is:

cg_annotate --auto=yes cachegrind.out.PID

It is nicer to use kcachegrind, which shows the same data with full colors and other graphical details.
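To get a feel for what the data cache numbers mean, it can help to profile two loops that do the same work with different access patterns. The sketch below is a hypothetical test program (all names and the grid size are made up): both functions compute the same sum, but one walks memory sequentially and the other jumps a full row between accesses.

```c
#include <stddef.h>

#define N 512

static int grid[N][N];

/* Fills the grid with small, repeatable values. */
void fill_grid(void)
{
    for (size_t r = 0; r < N; r++)
        for (size_t c = 0; c < N; c++)
            grid[r][c] = (int)((r + c) % 7);
}

/* Row-major traversal: consecutive accesses are adjacent in memory. */
long sum_by_rows(void)
{
    long total = 0;
    for (size_t r = 0; r < N; r++)
        for (size_t c = 0; c < N; c++)
            total += grid[r][c];
    return total;
}

/* Column-major traversal: consecutive accesses are N ints apart. */
long sum_by_cols(void)
{
    long total = 0;
    for (size_t c = 0; c < N; c++)
        for (size_t r = 0; r < N; r++)
            total += grid[r][c];
    return total;
}
```

On typical cache configurations the column-wise loop should show noticeably more data cache read misses in the per-function output, though the exact counts depend on the simulated cache parameters.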