Difference between revisions of "Improving Android Boot Time Outline"
From eLinux.org
(add rules between slides)
Revision as of 18:03, 28 July 2010
Here is the outline for my talk:
- Title
- Outline
- Android boot overview
- Measuring boot times
- Problem areas
- Some gory details
- Ideas for improvements
- Android boot overview
- bootloader
- kernel
- init
- zygote
- building preload heap
- start package manager
- service manager
- start services
- measuring bootup time
- systems measured: adp1, n1, evm
- adp1 with donut
- n1 with eclair
- evm with eclair
- NOTE: used nfs root filesystem (file IO timings might be bogus)
- Tools for measuring and tracing boot time
- stopwatch
- grabserial
- printk times
- bootchart
- strace
- logcat
- method tracer*
- ftrace*
- stopwatch
- grabserial
- printk times
- bootchart
- strace
- logcat
- extra instrumentation for preloading classes
- PARSE_CHATTY flag for package scanning
- mention my own tool 'logdelta'
- method tracer*
- method traces is built in to
- ftrace*
- I could have really used ftrace for some things
- (especially to see page faults intermingled with system calls)
- kernel version for my primary development board (2.6.29) didn't support it
- should be usable in future versions of Android (Froyo is at 2.6.32)
- NOTE: ARM is missing function graph tracing - see Ftrace Function Graph ARM
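The printk times and log-based tools above all come down to the same step: find the biggest gaps between timestamped lines. A minimal sketch of that step follows. It is not the actual 'logdelta' tool; the function names and the assumed `[   seconds.micros] message` line format are illustrative assumptions only.

```python
import re

# Matches a printk-style timestamp prefix such as "[    1.234567] ".
# The exact format is an assumption made for this sketch.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")

def line_deltas(lines):
    """Return (timestamp, delta_from_previous, message) per timestamped line."""
    results = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip lines that carry no timestamp
        stamp = float(match.group(1))
        # The delta is the time that elapsed before this line was printed.
        delta = 0.0 if previous is None else stamp - previous
        results.append((stamp, delta, match.group(2)))
        previous = stamp
    return results

def slowest(lines, count=5):
    """Return the `count` entries with the largest delta, biggest first."""
    ranked = sorted(line_deltas(lines), key=lambda entry: entry[1], reverse=True)
    return ranked[:count]
```

Feeding a captured boot log to `slowest()` gives a short list of places to look first; the gap is attributed to the line that ends it, so the slow work happened just before that message.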
- Problem Areas
- First, a bootchart for EVM board
- bootloader init
- kernel init
- zygote class preloading
- package scanning
- service initialization
- bootloader init
- outside scope of this talk
- didn't measure commercial bootloader, only development one (U-Boot)
- kernel init
- is mostly the usual suspects
- (initcall_debug results)
- USB
- zygote class preloading
- zygote pre-loads just under 2000 classes, and instantiates them in its heap
- controlled by file:
- package manager package scan
- EVERY package is scanned at boot time
- Very deep nesting, with abstraction
- Not sure of exact set of purposes
- But I see validation of certificates, permissions, capabilities and dependencies, etc.
- Very difficult to trace
- It bounces between Java, C++ and kernel
- And uses mmapped files (meaning accesses cause page faults)!!
- So it's not even using syscalls for reading the data
- Package scan call tree
[put call tree here]
- parseZipArchive()
- evil routine that builds an in-memory data structure for accessing a package file
- scans the entire package, checking the content headers
- caused read of almost entire package
- touches every page in mmapped file, even if a sub-file in the archive won't be read later
- e.g. entire package is scanned, when only the AndroidManifest.xml file is requested
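For contrast, a zip archive carries a central directory at its end, so one member can be located without walking the whole file. The sketch below uses Python's standard `zipfile` module to show that kind of index-based access. It is not how Android's parseZipArchive() works and it uses ordinary reads rather than mmap; `CountingFile`, `read_member` and `build_sample_archive` are hypothetical names, and the sample archive is made up for illustration.

```python
import io
import zipfile

class CountingFile(io.BytesIO):
    """In-memory file that records how many bytes are read from it."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk

def read_member(archive_file, name):
    """Read one member via the central directory, without a full scan."""
    with zipfile.ZipFile(archive_file) as archive:
        return archive.read(name)

def build_sample_archive():
    """Build a made-up package: a small manifest plus a large payload."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"<manifest/>")
        archive.writestr("classes.dex", b"\0" * 1_000_000)
    return buffer.getvalue()
```

Wrapping the sample archive in `CountingFile` and calling `read_member(f, "AndroidManifest.xml")` leaves `f.bytes_read` far below the archive size, since only the directory and the one requested member are touched.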
- new Resources()
- reads all resources in a file
- is this really necessary?
- ideas for enhancements
- kernel speedups
- optimize package scan
- optimize class preloading
- miscellaneous optimizations
- sreadahead??
- kernel speedups
- outside the scope of this presentation
- see http://elinux.org/Boot_Time
- should really be able to get kernel up in 1 second
- modulo network delays
- Optimize Class Preloading
- Preload less, and let apps pay penalty for shared class and resource use
- move some classes to services, and have preload be a thin stub
- figure out how to share heap back with zygote
- Thread the heap construction
- There is some evidence that class preloading has I/O waits, and would benefit from threading
- Don't know if this is possible
- NOTE: all threads need to terminate before spawning Android apps
- Use pre-constructed Dalvik heap
- (more on next slide)
- Use pre-constructed Dalvik heap
- Basic operation:
- Snapshot the heap at end of preloading
- Check for modifications to any class in preload list; if any are found, do regular preload
- Otherwise, load pre-constructed heap
- Issues:
- Don't know if this is possible
- Need to find parts of heap that are identical on each boot
- Probably need separate "always-the-same" and "may-change" classes
- Needs careful analysis, and knowledge of each class
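The basic operation above is a cache-validation decision, which can be sketched independently of Dalvik. Everything below is hypothetical: `fingerprint`, `load_heap`, the `class_blobs` mapping and the `regular_preload` callback are stand-ins, and the sketch says nothing about whether a real heap can be snapshotted and restored.

```python
import hashlib

def fingerprint(class_blobs):
    """Hash class names and contents so any modification changes the digest."""
    digest = hashlib.sha256()
    for name in sorted(class_blobs):
        blob = class_blobs[name]
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        # Include the length so adjacent entries cannot blur together.
        digest.update(len(blob).to_bytes(8, "big"))
        digest.update(blob)
    return digest.hexdigest()

def load_heap(class_blobs, snapshot, regular_preload):
    """Return (heap, snapshot, reused).

    `snapshot` is None or a (fingerprint, heap) pair saved at the end of an
    earlier preload. `regular_preload` is the slow path that builds a heap.
    The snapshot is reused only when no preloaded class has changed.
    """
    current = fingerprint(class_blobs)
    if snapshot is not None and snapshot[0] == current:
        return snapshot[1], snapshot, True
    heap = regular_preload(class_blobs)
    return heap, (current, heap), False
```

The first boot takes the slow path and saves a snapshot; later boots reuse it until any class in the preload list changes, at which point the regular preload runs again.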
- Optimize Package Scan
- Most definitely! Should be the first thing attacked
- Need to continue analysis
- Most likely, should switch to a compressed flash file system
- Miscellaneous
- zoneinfo inefficiencies
- discovered with strace
- routines that do read syscall for 40 bytes, 8 bytes, 8 bytes (hundreds of times)
- no buffering at user level, sloppy loop coding
- linear scan of timezone file
- for a file not present!!
- probably only a few hundred milliseconds, but worth changing
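The cost of the 40/8/8-byte read pattern can be shown with a small model. In the sketch below, each `read()` call on `CountingRaw` stands in for one read syscall on an unbuffered file. The record layout (a 40-byte name plus two 8-byte integers) is a guess made to match the read sizes seen in strace, not the real zoneinfo format, and all names are hypothetical.

```python
import io
import struct

# Assumed layout for illustration only: 40-byte name, two 8-byte integers.
RECORD = struct.Struct(">40sQQ")

class CountingRaw(io.BytesIO):
    """Stands in for an unbuffered file; each read() models one syscall."""

    def __init__(self, data):
        super().__init__(data)
        self.read_calls = 0

    def read(self, size=-1):
        self.read_calls += 1
        return super().read(size)

def find_unbuffered(raw, wanted):
    """Linear scan issuing three small reads per record (the wasteful way)."""
    while True:
        name = raw.read(40)
        if len(name) < 40:
            return None  # end of file, name not present
        first = raw.read(8)
        second = raw.read(8)
        if name.rstrip(b"\0") == wanted:
            return struct.unpack(">QQ", first + second)

def find_buffered(raw, wanted):
    """Same scan, but the whole index is fetched with a single read.

    Assumes the data length is a whole number of records.
    """
    data = raw.read()
    for name, first, second in RECORD.iter_unpack(data):
        if name.rstrip(b"\0") == wanted:
            return (first, second)
    return None
```

With 100 records, finding the last one costs 300 modeled reads unbuffered and 1 buffered; a lookup for a name that is not present always pays for the full scan.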
- Sreadahead??
- could use sreadahead to pre-fill page cache
- however, this just masks bad behavior
- Contacts.apk (half of 1.6M) is read 4 times! during boot
- filling page cache makes reads after first one fast, but it would be better to avoid (most of) the reads altogether
- better to just optimize or eliminate parseZipArchive()
- sreadahead should be used dead last (after all other enhancements)
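To make "pre-fill page cache" concrete, here is a rough sketch of the general idea using `os.posix_fadvise`, which is available in Python 3.3+ on Unix-like systems. It is not sreadahead's implementation; `prefill_page_cache` is a hypothetical helper, and the list of files to hint would have to come from tracing a real boot.

```python
import os

def prefill_page_cache(paths):
    """Ask the kernel to start reading each file into the page cache.

    Returns the paths that were successfully hinted. Files that are
    missing or unreadable are skipped.
    """
    hinted = []
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue
        try:
            # Offset 0 with length 0 means "through the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            hinted.append(path)
        except OSError:
            pass  # the hint is advisory; ignore filesystems that reject it
        finally:
            os.close(fd)
    return hinted
```

This only hides the cost of later reads behind earlier I/O. It does nothing about a file being parsed four times, which is why it belongs after the other fixes.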
- Conclusions
- Sorry - no speedups yet
- But, have a good foundation and set of tools for improving things going forward
- Premature optimization is the root of all evil
- Be very careful of optimizing wasteful operations
- Better to improve or eliminate the operations, than hide the wasteful operations with caching
- Observations
- Beware of systemic problems
- Package management basically builds a persistent container and compression architecture in user space
- Except, it rebuilds the in-core data structure for it over and over
- Just use the file system, for heaven's sake!