Kernel Size Tuning Guide

From eLinux.org
Revision as of 21:05, 11 December 2006 by Wmat (Talk | contribs) (Kernel Configuration Options)

This document describes how to configure the Linux kernel to use a small amount of memory and flash.

Note: This document is a work in progress. Please feel free to add material anywhere you have additional information or data. Sections of this document that need additional work are marked with [FIXTHIS].

Introduction

One big problem area when using Linux in an embedded project is the size of the Linux kernel.

Measuring the kernel

There are 3 aspects of kernel size which are important:

  1. the size of the kernel image stored in flash (or other persistent storage)
  2. the static size of kernel image in RAM (usually, this will be the size of the uncompressed image)
    • This includes the text, data, and BSS segments of the kernel at the time it is loaded. The text and BSS segments will stay the same size for the kernel throughout its execution. However, the data and stack segments may grow according to the needs of the system.
  3. the amount of dynamic RAM used by the kernel.
    • This will fluctuate during system execution. However, there is a baseline amount of memory which is allocated at system startup. Application-specific RAM can be calculated to be above this minimal amount of required RAM.

For now, this document ignores Execute-In-Place (XIP) and Data-Read-In-Place (DRIP) techniques, the use of which has an impact on the amount of flash and RAM used by the kernel. See the following online resources for more information about these techniques: [Kernel XIP] and [Data Read In Place].

Measuring the kernel image size

The compressed kernel image is what is stored in the flash or ROM of the target device. The size of this image can be obtained by examining the size of the image file in the host filesystem with the 'ls -l' command:
  • for example: 'ls -l vmlinuz' or 'ls -l bzImage' (or whatever the compressed image name is for your platform).
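
As a small convenience, the measurement above can be wrapped in a shell function that reports the image size in bytes and in KiB. This is only a sketch: the function name image_size is made up for this example, and it uses 'wc -c' instead of 'ls -l' so that the byte count is easy to capture.

```shell
#!/bin/sh
# Print the size of a kernel image file in bytes and in KiB (rounded up).
# image_size is an illustrative helper, not an existing tool.
image_size() {
    bytes=$(wc -c < "$1") || return 1
    bytes=$((bytes + 0))               # normalize padding that some wc versions emit
    kib=$(( (bytes + 1023) / 1024 ))   # round up to whole KiB
    echo "$1: $bytes bytes ($kib KiB)"
}

# Example (run in the directory holding the image):
#   image_size bzImage
```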

Measuring the kernel text, data and bss segments

Use the 'size' command to determine the size of the text, data, and BSS segments of a kernel image.

Note that the BSS segment is not stored in the kernel image because it can be synthesized at boot time by filling a block of memory with zeros. Note also that portions of the kernel text and data are set aside in special initialization segments, which are discarded when the kernel finishes booting. Because of these factors, the size command does not give you an exactly correct value for the static kernel RAM size. However, it can be used as a reasonable estimate.

To use the size command, run it with the filename of the uncompressed kernel image (which is usually 'vmlinux').
  • for example: 'size vmlinux'

Example output:

   text    data     bss     dec     hex filename
2921377  369712  132996 3424085  343f55 vmlinux
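
The following sketch turns the 'size' output into the two estimates discussed above: text plus data (roughly what is stored, before compression) and text plus data plus BSS (roughly the static RAM footprint). The function name size_summary is hypothetical, and both numbers are estimates because the initialization sections are discarded after boot.

```shell
#!/bin/sh
# Read the output of 'size <image>' on stdin and print two estimates.
# size_summary is an illustrative helper, not an existing tool.
size_summary() {
    awk 'NR == 2 {
        # columns: text data bss dec hex filename
        printf "stored (text+data): %d bytes\n", $1 + $2
        printf "static RAM (text+data+bss): %d bytes\n", $1 + $2 + $3
    }'
}

# Example:
#   size vmlinux | size_summary
```

For the example output above, text plus data is 3291089 bytes, and adding the BSS gives the 3424085 bytes shown in the "dec" column.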

Measuring and comparing sub-parts of the kernel

In order to find areas where the kernel size can be reduced, it is often useful to break down the static size of the kernel by sub-system or by kernel symbol. The following sections describe how to see the size of each kernel sub-system, how to see the size of individual kernel symbols, and how to compare the size of symbols between two kernel versions. This is useful because, as you make changes to the kernel configuration, you can determine which part of the kernel is affected by each change. From this information you may be able to predict what the effect of the change will be, and decide whether the change is acceptable.

Measuring major kernel subsystems

The major sub-systems of the kernel are put into library object files named 'built-in.o' in the corresponding sub-directory for that sub-system within the kernel build directory. The major sub-directories, at the time of this writing (for kernel 2.6.17), are:
  • init, usr, kernel, mm, fs, ipc, security, crypto, block, ltt, drivers, sound, net, lib

To see the size of the major kernel sections (code, data, and BSS), use the 'size' command, with a wildcard for the first level of sub-directory:
  • size */built-in.o

You can pipe this output through 'sort' to sort by the largest libraries:
  • size */built-in.o | sort -n -r -k 4

Example output (the header line sorts to the bottom because of the reverse numeric sort):

 731596   53144   33588  818328   c7c98 drivers/built-in.o
 687960   24972    2648  715580   aeb3c fs/built-in.o
 547844   19508   28052  595404   915cc net/built-in.o
 184072    6256   32440  222768   36630 kernel/built-in.o
 141956    3300    2852  148108   2428c mm/built-in.o
  68048    1804    1096   70948   11524 block/built-in.o
  26216     768       0   26984    6968 crypto/built-in.o
  17744    2412    2124   22280    5708 init/built-in.o
  20780     292     124   21196    52cc ipc/built-in.o
  18768      68       0   18836    4994 lib/built-in.o
   2116       0       0    2116     844 security/built-in.o
    134       0       0     134      86 usr/built-in.o
   text    data     bss     dec     hex filename

To see even greater detail, you can examine the size of 'built-in.o' files even deeper in the kernel build hierarchy, using the 'find' command:
  • find . -name "built-in.o" | xargs size | sort -n -r -k 4

Example output:

 731596   53144   33588  818328   c7c98 ./drivers/built-in.o
 687960   24972    2648  715580   aeb3c ./fs/built-in.o
 547844   19508   28052  595404   915cc ./net/built-in.o
 260019    9824    4944  274787   43163 ./net/ipv4/built-in.o
 184072    6256   32440  222768   36630 ./kernel/built-in.o
...

Note: Please be careful when interpreting the sizes of the 'built-in.o' files in sub-directories. In general, the object files are aggregated into the libraries of parent directories, meaning that many object files will have their size counted twice. You cannot simply add the columns for an indication of the total kernel size.
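
One way to avoid the double counting is to add up only the first-level 'built-in.o' files. The sketch below does this for the output of the 'find' pipeline; the function name toplevel_total is hypothetical. Even this sum is only a rough figure, because code whose 'built-in.o' files exist only in deeper directories (for example, architecture-specific code) is not included. Use 'size vmlinux' for the real total.

```shell
#!/bin/sh
# Sum the "dec" column for first-level built-in.o files only, so that
# nested libraries are not counted twice. Reads 'size' output on stdin.
# toplevel_total is an illustrative helper, not an existing tool.
toplevel_total() {
    awk '{
        path = $6
        sub(/^\.\//, "", path)            # drop a leading "./" added by find
        n = split(path, parts, "/")
        if (n == 2 && parts[2] == "built-in.o")
            total += $4
    } END { printf "%d\n", total }'
}

# Example:
#   find . -name "built-in.o" | xargs size | toplevel_total
```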

Measuring individual kernel symbols

You can measure the size of individual kernel symbols using the 'nm' command.

Run 'nm --size -r vmlinux' to list the kernel symbols sorted by size, largest first. For example:

[tbird@crest ebony]$ nm --size -r vmlinux | head -10
00008000 b read_buffers
00004000 b __log_buf
00003100 B ide_hwifs
000024f8 T jffs2_garbage_collect_pass
00002418 T journal_commit_transaction
00002400 b futex_queues
000021a8 t jedec_probe_chip
00002000 b write_buf
00002000 D init_thread_union
00001e6c t tcp_ack

Legend: The columns of this output are:

  1. size in bytes (in hexadecimal)
  2. symbol type
  3. symbol name.

The symbol type is usually one of:

  • 'b' or 'B' for a symbol in the BSS segment (uninitialized data),
  • 't' or 'T' for a symbol in the text segment (code), or
  • 'd' or 'D' for a symbol in the data segment.

Use 'man nm' for additional information on the 'nm' command.
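
Because 'nm' prints sizes in hexadecimal, it can help to convert them to decimal and total them by segment. The sketch below does this with plain shell arithmetic; the function name nm_totals is hypothetical, and symbol types other than b/B, t/T, and d/D are simply ignored.

```shell
#!/bin/sh
# Read 'nm --size' output (hex size, type, name) on stdin and print
# decimal totals for the text, data, and BSS symbols.
# nm_totals is an illustrative helper, not an existing tool.
nm_totals() {
    bss=0 text=0 data=0
    while read -r size type name; do
        case $type in
            b|B) bss=$((bss + 0x$size)) ;;
            t|T) text=$((text + 0x$size)) ;;
            d|D) data=$((data + 0x$size)) ;;
        esac
    done
    echo "text=$text data=$data bss=$bss"
}

# Example:
#   nm --size -r vmlinux | nm_totals
```

Applied to the ten symbols in the example output above, this prints "text=35108 data=8192 bss=79104".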

Comparing kernel symbols between two kernel images

Use the 'bloat-o-meter' command, found in the kernel source 'scripts' directory, to compare the symbol sizes between two kernel images.
  • <kernel-src>/scripts/bloat-o-meter vmlinux.default vmlinux.altconfig

If you get a permission error when running the script, make it executable with: 'chmod a+x <kernel-src>/scripts/bloat-o-meter'
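
To illustrate the idea behind such a comparison, here is a rough sketch that works on two saved 'nm --size' listings and prints the size change of each symbol, largest growth first. It is not the bloat-o-meter script and does not reproduce its output format; the function name symbol_deltas is hypothetical, and symbols that share a name (such as static functions in different files) are not distinguished.

```shell
#!/bin/sh
# Compare two 'nm --size' listings (hex size, type, name) and print
# "<delta in bytes> <symbol>" for every symbol whose size changed.
# symbol_deltas is an illustrative helper, not the real bloat-o-meter.
symbol_deltas() {
    awk '
        function hex(s,    i, n) {        # convert a hex string to a number
            s = tolower(s); n = 0
            for (i = 1; i <= length(s); i++)
                n = n * 16 + index("0123456789abcdef", substr(s, i, 1)) - 1
            return n
        }
        NR == FNR { old[$3] = hex($1); next }   # first file: old sizes
        { cur[$3] = hex($1) }                   # second file: new sizes
        END {
            for (s in cur)
                if (cur[s] != old[s]) printf "%d %s\n", cur[s] - old[s], s
            for (s in old)
                if (!(s in cur)) printf "%d %s\n", -old[s], s
        }
    ' "$1" "$2" | sort -n -r
}

# Example:
#   nm --size vmlinux.default   > default.sizes
#   nm --size vmlinux.altconfig > altconfig.sizes
#   symbol_deltas default.sizes altconfig.sizes
```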

Example output:

[FIXTHIS - need bloat-o-meter output]

Kernel Size Tuning features

The Linux kernel includes a number of options to control the features it supports. Over time, the kernel has accumulated a large set of features and capabilities, but many of them are not needed in Consumer Electronics products. By carefully tuning the kernel options, you can omit many parts of the kernel and save memory in your product.

Linux-tiny patches

The Linux-tiny patch set is a set of patches, maintained by Matt Mackall, intended to help developers reduce the size of the Linux kernel.

The CELF wiki page describing these patches is at: Linux Tiny

The Linux-tiny patch set includes a number of different patches to allow the kernel to be reduced in size. Sometimes, the size reductions are accomplished by reducing the number of objects for a particular feature (like the number of possible swap areas, or the number of tty discipline structures). Sometimes, the size reductions are achieved by removing features or functions from the kernel.

Here is a list of the individual Linux-tiny patches that are available for the 2.6.16 kernel:

|| Patch || Type || Description || Kernel option ||
|| use-funit-at-a-time.patch || compiler flag || Add -funit-at-a-time to the gcc compilation flags for building the kernel || None ||
|| config-net-small.patch || add option || Add CONFIG_NET_SMALL configuration option || Adds CONFIG_NET_SMALL ||
|| cache_defer_hash.patch || smaller data || Reduce RPC cache hash table size from PageSize to 512 || Uses CONFIG_NET_SMALL ||
|| unix_socket_table.patch || smaller data || Reduce AF_UNIX socket hash table from 256 to 16 entries || Uses CONFIG_NET_SMALL ||
|| inet_protos.patch || smaller data || Reduce number of internet protocols supported from 256 to 32 || Uses CONFIG_NET_SMALL ||
|| flow-cache-small.patch || smaller data || Reduce flow cache hash table from 2^10 (1024) to 2^3 (8) || Uses CONFIG_NET_SMALL ||
|| tg3-oops.patch || bugfix || Handle tg3 ring allocation correctly || None ||
|| namei-inlines.patch || smaller code || Uninline various functions in namei.c || None ||
|| buffer-inlines.patch || smaller code || Uninline function in buffer.c || None ||
|| ext2namei-inlines.patch || smaller code || Uninline ext2_add_nondir function || None ||
|| kmalloc-accounting.patch || measurement feature || Add kmalloc accounting feature || CONFIG_KMALLOC_ACCOUNTING ||
|| audit-bootmem.patch || measurement feature || Cause bootmem code to print callers and sizes for allocations || CONFIG_AUDIT_BOOTMEM ||
|| deprecate-inline.patch || measurement feature || Add system for counting inline usage by generating deprecation warnings || CONFIG_MEASURE_INLINES ||
|| func-size.patch || measurement feature || Adds a script to count inline function sizes || None ||
|| tiny-panic.patch || reduced debug feature || Add option to use smaller panic code || CONFIG_FULL_PANIC ||
|| nopanic.patch || omit debug feature || Make code for kernel panic configurable || CONFIG_PANIC ||
|| tiny-crc.patch || smaller data || Allow using function instead of table for CRC32 calculations || CONFIG_CRC32_TABLES ||
|| threadinfo-ool.patch || smaller code || Inline current() and current_thread_info() on UP (configurable) || CONFIG_INLINE_THREADINFO ||
|| slob-accounting.patch || measurement feature || Add kmalloc accounting to SLOB allocator || Uses CONFIG_SLOB, CONFIG_KMALLOC_ACCOUNTING ||
|| mempool-shrink.patch || reduced feature || Allow disabling mempool allocator feature || CONFIG_MEMPOOL ||
|| no-translations.patch || omit feature || Allow omitting support for console charset translation || CONFIG_CONSOLE_TRANSLATIONS ||
|| sysenter.patch || omit feature || Allow disabling syscalls via sysenter (x86-only) || CONFIG_SYSENTER ||
|| no-aio.patch || omit feature || Allow disabling Asynchronous IO syscalls and support || CONFIG_AIO ||
|| no-xattr.patch || omit feature || Allow disabling Extended Attributes syscalls and support || CONFIG_XATTR ||
|| fslock.patch || omit feature || Allow disabling POSIX file locking syscalls and support || CONFIG_FILE_LOCKING ||
|| ethtool.patch || omit feature || Allow disabling support for configuring network devices with ethtool program || CONFIG_ETHTOOL ||
|| inetpeer.patch || omit feature || Allow disabling INET peer data tracking || CONFIG_INETPEER ||
|| net-filter.patch || omit feature || Allow disabling old-style packet filtering support || CONFIG_NET_SK_FILTER ||
|| dev_mcast.patch || omit feature || Allow disabling netdev multicast support || CONFIG_NET_DEV_MULTICAST ||
|| igmp.patch || omit feature || Allow disabling IGMP (Internet Group Management Protocol) support - used for multicasts || CONFIG_IGMP ||
|| binfmt-script.patch || omit feature || Allow disabling support to run shell scripts via standard "#!" syntax || CONFIG_BINFMT_SCRIPT ||
|| elf-no-aout.patch || omit feature || Allow disabling support for ELF programs with a.out format loader or libraries || CONFIG_BINFMT_ELF_AOUT ||
|| max-swapfiles.patch || smaller data || Make the number of swapfiles configurable || CONFIG_MAX_SWAPFILES_SHIFT ||
|| ldiscs.patch || smaller data || Make the number of tty line disciplines configurable || CONFIG_NR_LDISCS ||
|| max_user_rt_prio.patch || smaller data || Make the number of RT priority O(1) scheduling queues configurable || CONFIG_MAX_USER_RT_PRIO ||
|| ide-hwif.patch || smaller data || Make the number of supported IDE interfaces configurable || CONFIG_IDE_HWIFS ||
|| sbf.patch || omit feature || Allow disabling simple bootflag support (x86-only) || CONFIG_BOOTFLAG ||
|| serial-pci.patch || omit feature || Allow disabling support for PCI serial devices || CONFIG_SERIAL_PCI ||
|| dmi_blacklist.patch || omit feature || Allow disabling DMI scanning (x86-only) || CONFIG_DMI_SCAN ||
|| pci-quirks.patch || omit feature || Allow disabling of workarounds for various PCI chipset bugs and quirks || CONFIG_PCI_QUIRKS ||
|| tsc.patch || omit feature || Allow disabling use of TSC as kernel timer (x86-only) || CONFIG_X86_TSC_TIMER ||
|| cpu-support.patch || omit feature || Allow disabling vendor-specific x86 CPU features (x86-only) || CONFIG_PROCESSOR_SELECT, CONFIG_CPU_SUP_* (many) ||
|| mtrr.patch || continuation patch || Make MTRR support depend on vendor-specific CPU selection (x86-only) || None ||
|| movsl-mask.patch || continuation patch || Make movsl mask usage depend on vendor-specific CPU selection (x86-only) || None ||
|| do-printk.patch || reduced feature || Allow fine-grained control of printk message compilation || CONFIG_PRINTK_FUNC, uses CONFIG_PRINTK ||

Please note that the last patch in this list ("do-printk") is available separately from the main Linux-tiny patch set. Please find this patch at: [Do Printk]

The patches listed in this table can be applied to a 2.6.16 Linux kernel. However, as of version 2.6.16, many options for reducing the kernel size were already available in Linux. A list of options that are interesting for tuning the kernel size, both from these patches and from existing code, is provided in the section "Kernel Configuration Options".

How to configure the kernel

[FIXTHIS - need detailed kernel configuration instructions]

  • use 'make menuconfig'
  • perform thorough testing of your library and applications with the smaller config
  • development vs. deployment configurations
  • describe all_no config - most times it won't boot.

Kernel Configuration Options

Here is a table of kernel configuration options, including a description, the default value for a kernel, and the recommended value for a smaller configuration of the kernel:


|| CONFIG option || Description || Default || Small ||
|| CONFIG_CORE_SMALL || tune some kernel data sizes || N || Y ||
|| CONFIG_NET_SMALL || tune some net-related data sizes || N || Y ||
|| CONFIG_KMALLOC_ACCOUNTING || turn on kmalloc accounting || N || Y * ||
|| CONFIG_AUDIT_BOOTMEM || print out all bootmem allocations || N || Y * ||
|| CONFIG_DEPRECATE_INLINES || cause compiler to emit info about inlines || N || Y * ||
|| CONFIG_PRINTK || allow disable of printk code and message data || Y || N ||
|| CONFIG_BUG || allow elimination of BUG (and BUG_ON??) code || Y || N ||
|| CONFIG_ELF_CORE || allow disabling of ELF core dumps || Y || N ||
|| CONFIG_PROC_KCORE || allow disabling of /proc/kcore || Y || N ||
|| CONFIG_AIO || allow disabling of async IO syscalls || Y || N ||
|| CONFIG_XATTR || allow disabling of xattr syscalls || Y || N ||
|| CONFIG_FILE_LOCKING || allow disabling of file locking syscalls || Y || N ||
|| CONFIG_DIRECTIO || allow disabling of direct IO support || Y || N ||
|| CONFIG_MAX_SWAPFILES_SHIFT || number of swapfiles || 5 || 0 ||
|| CONFIG_NR_LDISCS || number of tty line disciplines || 16 || 2 ||
|| CONFIG_MAX_USER_RT_PRIO || number of RT priority levels (schedule slots) || 100 || 5 ||
|| Other config options || These are not in Linux-tiny, but help with size || Default || Small ||
|| CONFIG_KALLSYMS || load all symbols for debugging/kksymoops || Y || N ||
|| CONFIG_SHMEM || allow disabling of shmem filesystem || Y || N + ||
|| CONFIG_SWAP || allow disabling of support for a swap segment (virtual memory) || Y || N ||
|| CONFIG_SYSV_IPC || allow disabling of support for System V IPC || Y || N + ||
|| CONFIG_POSIX_MQUEUE || allow disabling of POSIX message queue support || Y || N + ||
|| CONFIG_SYSCTL || allow disabling of sysctl support || Y || N + ||
|| CONFIG_LOG_BUF_SHIFT || control size of kernel printk buffer || 14 || 11 ||
|| CONFIG_UID16 || allow support for 16-bit uids || Y || ?? ||
|| CONFIG_CC_OPTIMIZE_FOR_SIZE || use gcc -Os to optimize for size || Y || Y ||
|| CONFIG_MODULES || allow support for kernel loadable modules || Y || N + ||
|| CONFIG_KMOD || allow support for automatic kernel module loading || Y || N ||
|| CONFIG_PCI || allow support for PCI bus and devices || Y || Y - ||
|| CONFIG_XIP_KERNEL || allow support for kernel Execute-in-Place || N || N ||
|| CONFIG_MAX_RESERVE_AREA || ?? || ?? || ?? ||
|| CONFIG_BLK_DEV_LOOP || support for loopback block device || Y || Y - ||
|| CONFIG_BLK_DEV_RAM || support for block devices for RAM filesystems || Y || Y - ||
|| CONFIG_BLK_DEV_RAM_COUNT || number of block devices for RAM filesystems || 16 || 2? ||
|| CONFIG_BLK_DEV_RAM_SIZE || size of block device struct for RAM filesystems || 4096 || ?? ||
|| CONFIG_IOSCHED_AS || include Anticipatory IO scheduler || Y || Y ||
|| CONFIG_IOSCHED_DEADLINE || include Deadline IO scheduler || Y || N + ||
|| CONFIG_IOSCHED_CFQ || include CFQ IO scheduler || Y || N + ||
|| CONFIG_IP_PNP || support for IP autoconfiguration || Y || N + ||
|| CONFIG_IP_PNP_DHCP || support for IP autoconfiguration via DHCP || Y || N + ||
|| CONFIG_IDE || support for IDE devices || Y || N + ||
|| CONFIG_SCSI || support for SCSI devices || Y || N + ||


Legend:

  • "Y *" - Set to 'Y' for measurement during development, and set to 'N' for deployment.
  • "N +" - Whether you can set this to 'N' depends on whether this feature is needed by your applications.
  • "Y -" - You probably need this, but it might be worth checking to see if you don't.
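
A quick way to see how an existing configuration compares with the "Small" column is to scan the kernel '.config' file. The sketch below checks only four of the options from the table, as an illustration; the function names config_state and check_small are hypothetical, and the list of options should be adapted to your product. It relies on the usual '.config' conventions: enabled options appear as 'CONFIG_FOO=y' (or with a value), and disabled options appear as '# CONFIG_FOO is not set' or are absent.

```shell
#!/bin/sh
# Report the state of one option in a kernel .config file.
# Prints the value after "=", or "n" when the option is unset or absent.
# config_state and check_small are illustrative helpers.
config_state() {
    if line=$(grep "^$2=" "$1"); then
        echo "${line#*=}"
    else
        echo "n"
    fi
}

# Compare a few options against the "Small" column of the table.
check_small() {
    for pair in CONFIG_PRINTK=n CONFIG_BUG=n CONFIG_KALLSYMS=n CONFIG_LOG_BUF_SHIFT=11; do
        opt=${pair%%=*}
        want=${pair#*=}
        have=$(config_state "$1" "$opt")
        [ "$have" = "$want" ] || echo "$opt: have $have, table suggests $want"
    done
}

# Example:
#   check_small .config
```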

Special Instructions for some kernel options

How to use CONFIG_PRINTK

If the "do-printk" patch is applied, there are two options which control the compilation of printk elements in the kernel: CONFIG_PRINTK_FUNC and CONFIG_PRINTK. You can use these options to control how much printk support the kernel provides, and to control on a global basis whether any printk messages at all are compiled into the kernel. Another special preprocessor variable is also available, called DO_PRINTK, which provides the ability to enable printk messages inside a single C compilation unit, even if printk messages are disabled globally.

This section explains how to use these features to reduce the kernel size, while still enabling sufficient printk messages to be useful during development and deployment.

The CONFIG_PRINTK option controls whether the kernel printk calls are compiled. When this option is set to 'N' in your kernel configuration, all uses of "printk" throughout the kernel source are turned into empty statements and omitted when the kernel is compiled. This provides a substantial size savings, since the kernel messages often account for more than 100 kilobytes of space in the kernel image. Setting this option to 'N' will not, however, remove the actual printk code itself (just the calls to printk).

The CONFIG_PRINTK_FUNC option controls whether the printk function and various helper functions are compiled into the Linux kernel. When this is set to 'N', CONFIG_PRINTK is automatically set to 'N', and no printk messages are compiled into the kernel. This usually saves about another 4K of size in the kernel image.

By using both CONFIG_PRINTK and CONFIG_PRINTK_FUNC, you can reduce the size of the kernel image (and the flash and RAM it requires). However, there is a drawback to eliminating all the messages: it is then not possible to get any status, diagnostic, or debug messages from the kernel. Another mechanism is available, which allows you to control on a per-file basis which printk calls are compiled into the kernel. This is the preprocessor variable DO_PRINTK.

To use DO_PRINTK, set CONFIG_PRINTK to 'N' and CONFIG_PRINTK_FUNC to 'Y' in your kernel configuration. This will globally disable all printk calls in the kernel. Now, determine the C files where you wish to enable printk messages, and add the line:

#define DO_PRINTK 1

at the top of each file. Now, the printk calls in those files will be compiled normally. Printk calls in other modules will be omitted.

Important Note: The DO_PRINTK variable controls how the preprocessor will treat printk statements in the code. For this reason, this statement MUST appear at the top of the file, before any '#include' lines.

In order to change the set of printk messages preserved in the code, you will need to modify the DO_PRINTK lines and recompile the kernel. (There is no runtime control of the printk calls.) This is a simple mechanism, but it does provide a way to omit most of the printk messages from the kernel while still preserving some messages that may be useful during development or on a deployed product.
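
Since a DO_PRINTK line placed after the first '#include' will not have the intended effect, a simple check can catch misplaced lines. The sketch below is one possible check, under the assumption that the define is written on a single line as shown above; the function name do_printk_ok is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) only if "#define DO_PRINTK" appears in the
# given C file before the first #include line.
# do_printk_ok is an illustrative helper, not part of any patch.
do_printk_ok() {
    awk '
        /^[ \t]*#[ \t]*include/               { exit }             # include seen first
        /^[ \t]*#[ \t]*define[ \t]+DO_PRINTK/ { found = 1; exit }  # define seen first
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example: list files that define DO_PRINTK in the wrong place
#   for f in $(grep -rl 'define DO_PRINTK' drivers/); do
#       do_printk_ok "$f" || echo "check $f"
#   done
```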

In review, there are basically 3 different settings combinations for CONFIG_PRINTK_FUNC and CONFIG_PRINTK that make sense:

|| CONFIG_PRINTK_FUNC || CONFIG_PRINTK || Explanation ||
|| Y || Y || This is the default setting for the kernel configuration. In this setting the printk code is compiled into the kernel, and all printk calls throughout the entire source code are also compiled as part of the kernel. ||
|| Y || N || This leaves the actual printk() routine in the kernel, but disables all calls to printk throughout the entire source code. However, you can use DO_PRINTK in individual modules to enable the printk calls from those modules. ||
|| N || N || This removes the printk() routine from the kernel, and disables all kernel printk messages, and gives the smallest kernel code and data size. DO_PRINTK will NOT enable any module-specific printk calls. ||

Booting without SysFS

(copied from linux-tiny wiki)

Turning off sysfs support can save a substantial amount of memory in some setups. One big downside is that it breaks the normal boot process because the kernel can no longer map a symbolic device name to the internal device numbers.

Thus, you will need to pass a numeric device number in hex. For example, to boot off /dev/hda1, which has major number 3 and minor number 1, you'll need to append a root= option like this:

/boot/vmlinuz root=0x0301 ro
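
The hex value is the major number followed by the minor number, each written as two hex digits. The sketch below builds the option from the two numbers; the function name root_arg is hypothetical, and it assumes the old 16-bit device number layout, where both major and minor fit in the range 0-255.

```shell
#!/bin/sh
# Build the root= boot option from a major and minor device number.
# Assumes the old 16-bit layout: one byte for major, one byte for minor.
# root_arg is an illustrative helper.
root_arg() {
    printf 'root=0x%02x%02x\n' "$1" "$2"
}

root_arg 3 1    # /dev/hda1, the example used above
```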

Booting without /proc fs

It is also possible to boot without the /proc filesystem, but many programs expect this pseudo-filesystem to be present and mounted. For example, 'free' and 'ps' are two commands which retrieve information from /proc in order to run.

[FIXTHIS - list some workarounds here]

Using kernel memory measurement features

[FIXTHIS - need instruction on kmalloc accounting, bootmem auditing and counting inlines]

Kmalloc Accounting
Bootmem Auditing
Counting Inlines

Outline

[FIXTHIS - need to review outline and fill in missing material]

* Tuning the kernel
 * how to measure kernel size
   * in-kernel size reporting - kmalloc accounting
   * bloat-o-meter
   
 * kernel configuration options
   * mainline options
   * optional features
   * minimal config
   * sufficient API?
     * POSIX compliance
     * LSB compliance
     * LTP compliance
 * file systems
   * comparison of file system sizes
 * compiler options for reducing size
   * gcc -Os
   * gcc -fwhole-program
* online resources:
   * bloatwatch
   * kconfigsize

References

* Linux-tiny project web site: http://www.selenic.com/linux-tiny/
* CELF page about Linux-tiny: http://tree.celinuxforum.org/CelfPubWiki/LinuxTiny
* Matt Mackall's Linux-tiny presentation
* CE Linux Forum resources for reducing system size: http://tree.celinuxforum.org/CelfPubWiki/SystemSizeResources

Appendices

Appendix A - Sample minimum configuration for ARM

[FIXTHIS - need ARM minimum config.]

Appendix B - Configuration Option Details

[Want to fill in this section with details about configuration options]

For each option, would like to document:

* what the size effect is for different option values
 * KernelSizeTuningGuide/ConfigOptionImpact describes the kernel size and RAM usage impact, on i386, of each configuration option listed in "Kernel Configuration Options" above.
* what the effect is on performance, functionality, etc.
* what programs (if any) will stop working if the option is turned off (or reduced)

Appendix C - Things to research

* miniconfigs
* how to use an initramfs (to avoid using NFS-mounted rootfs)
* how to use a local fs (to avoid using NFS-mounted rootfs)
* Eric Biederman's turning off CONFIG_BLOCK - will any FS work after this??
  * he got a 2.6.1 kernel (presumably all_no) to: "191K bzImage and a 323K text segment".  See here.
* why is networking so big??
* why are file systems so big??
* capture serial output from kernel for size measurement (see grabserial program)

Modification History

|| Date || Version || Description ||
|| 2006-11-01 || 0.5.1 || Added link to configuration option impact data page on appendix B ||
|| 2006-10-27 || 0.5.0 || Renamed appendix B, small modifications for Jamboree 11 presentation ||
|| 2006-09-14 || 0.4.1 || Moved to CELF public wiki for forum collaboration ||
|| 2006-09-14 || 0.4.0 || Added linux-tiny patch table, config option table, and CONFIG_PRINTK usage ||
|| 2006-09-11 || 0.2.0 || Added content for memory measurement ||
|| 2006-09-07 || 0.1.0 || Created document and initial outline. ||