Difference between revisions of "Improving Android Boot Time Outline"

From eLinux.org
(more text)
m (Add category)
 
(4 intermediate revisions by one other user not shown)
Line 3: Line 3:
 
----
 
----
 
* Outline  
 
* Outline  
** Android boot overview
+
** Android boot sequence overview
 
** Measuring boot times
 
** Measuring boot times
 
** Problem areas
 
** Problem areas
Line 9: Line 9:
 
** Ideas for improvements
 
** Ideas for improvements
 
----
 
----
* Android boot overview
+
* Boot Sequence Overview
 
** bootloader
 
** bootloader
 
** kernel
 
** kernel
 
** init
 
** init
 +
*** loads several daemons and services, including zygote
 +
*** see /init.rc and init.''<platform>''.rc
 
** zygote
 
** zygote
*** building preload heap
+
*** preloads classes
*** start package manager
+
*** starts package manager
 
** service manager
 
** service manager
 
*** start services
 
*** start services
 +
[ diagram would be nice ]
 
----
 
----
* measuring bootup time
+
* Measuring Boot Time
** systems measured: adp1, n1, evm
+
* Systems measured:
*** adp1 with donut
+
*** Android Developer Phone (same hardware as G1), running Donut (1.6)
*** n1 with eclair
+
*** Nexus 1, running Eclair (2.1)
*** evm with eclair
+
*** OMAP Evaluation Module (EVM), running  Eclair
**** NOTE: used nfs root filesystem (file IO timings might be bogus)
+
**** NOTE: Used NFS root filesystem (which means file IO timings might be off)
 +
* Tools and techniques used (next slide)
 
----
 
----
 
* Tools for measuring and tracing boot time
 
* Tools for measuring and tracing boot time
 
** stopwatch
 
** stopwatch
** grabserial
+
*** It's kind of sad that it takes so long that you can use a stopwatch
** printk times
+
** Message loggers
 +
*** grabserial
 +
*** printk times
 +
*** logcat
 
** bootchart
 
** bootchart
 
** strace
 
** strace
** logcat
 
 
** method tracer*
 
** method tracer*
 
** ftrace*
 
** ftrace*
 
----
 
----
* stopwatch
+
* Grabserial
 +
** Tool for measuring time of printouts on a serial port, from a host machine
 +
*** Only useful with EVM board, which has serial console
 +
** Shows timestamp for each line received over serial console
 +
** See [[Grabserial]]
 
----
 
----
* grabserial
+
* Printk Times
 +
** Kernel option for adding time stamp to each printk
 +
*** Set CONFIG_PRINTK_TIME=y
 +
**** Option is on "Kernel hacking" menu, "Show timing information on printks"
 +
** Can save from terminal on serial console, on host
 +
** Can also show after boot with 'dmesg'
 +
** Useful to use 'initcall_debug' on kernel command line
 +
** See [[Printk Times]]
 +
** Can change loglevel of 'init' program
 +
*** Change "loglevel 3" to "loglevel 8" in /init.rc
 
----
 
----
* printk times
+
* Bootchart
----
+
** 'init' gathers data on startup
* bootchart
+
*** Must re-compile 'init' with support for bootchart data collection
 +
** A tool on the host produces a nice graphic
 +
** See [[Bootchart]] and [[Using Bootchart on Android]]
 
----
 
----
 
* strace
 
* strace
 +
** Shows system calls for a process (or set of processes)
 +
** Is part of AOSP since Eclair
 +
** Can add to init.rc to trace initialization.
 +
*** For example, to trace zygote startup, in /init.rc change:
 +
    service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
 +
to
 +
    service zygote /system/xbin/strace -tt -o/data/boot.strace /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
 
----
 
----
* logcat
+
* Android system log
** extra instrumentation for preloading classes
+
** Android has a built-in logging system
** PARSE_CHATTY flag for package scanning
+
** Use logcat to see messages
** mention my own tool 'logdelta'
+
** I added extra instrumentation for class preloading and package scanning
 +
*** PARSE_CHATTY flag for package scanning
 +
** I built my own tool (logdelta) to scan log and produce 'delta' times
 
----
 
----
 
* method tracer*
 
* method tracer*
Line 60: Line 90:
 
** NOTE: ARM is missing function graph tracing - see [[Ftrace Function Graph ARM]]
 
** NOTE: ARM is missing function graph tracing - see [[Ftrace Function Graph ARM]]
 
----
 
----
 +
* Measurement results
 +
** stopwatch
 +
** grabserial
 +
** printk times
 +
** strace
 +
** bootchart
 +
** logcat
 +
---
 +
* stopwatch
 +
** ADP1: 4, 32, 57 | 39 apps
 +
** Nexus 1: 3.5, 20, 36 | 79 apps
 +
** EVM: 17, 37, 62 | 45 apps (very small)
 +
---
 +
* grabserial
 +
---
 +
* printk times
 +
---
 +
* strace
 +
---
 +
* bootchart
 +
---
 +
* logcat
 +
---
 
* Problem Areas
 
* Problem Areas
 
** First, a bootchart for EVM board
 
** First, a bootchart for EVM board
Line 99: Line 152:
 
** scans the entire package, checking the content headers
 
** scans the entire package, checking the content headers
 
*** caused read of almost entire package
 
*** caused read of almost entire package
*** touches every page in mmaped file, even if a sub-file in the archive won't be
+
*** touches every page in mmaped file, even if a sub-file in the archive won't be read later
read later
+
 
*** e.g. entire package is scanned, when only the AndroidManifest.xml file is requested
 
*** e.g. entire package is scanned, when only the AndroidManifest.xml file is requested
 
----
 
----
Line 166: Line 218:
 
*** probably only a few hundred milliseconds, but worth changing
 
*** probably only a few hundred milliseconds, but worth changing
 
----
 
----
* Sreadahead??
+
* readahead??
 +
** Interesting result from just pre-filling the page cache
 +
[put page cache sequence here]
 
** could use sreadahead to pre-fill page cache
 
** could use sreadahead to pre-fill page cache
 
** however, this just masks bad behavior
 
** however, this just masks bad behavior
Line 177: Line 231:
 
** Sorry - no speedups yet
 
** Sorry - no speedups yet
 
** But, have a good foundation and set of tools for improving things going forward
 
** But, have a good foundation and set of tools for improving things going forward
 +
----
 +
* Observations
 
** Premature optimization is the root of all evil
 
** Premature optimization is the root of all evil
 
*** Be very careful of optimizing wasteful operations
 
*** Be very careful of optimizing wasteful operations
 
*** Better to improve or eliminate the operations, than hide the wasteful operations with caching
 
*** Better to improve or eliminate the operations, than hide the wasteful operations with caching
----
+
** Beware of systemic or architectural problems
* Observations
+
** Beware of systemic problems
+
 
*** Package management basically builds a persistent container and compression architecture in user space
 
*** Package management basically builds a persistent container and compression architecture in user space
*** Except, it rebuilds the in-core data structure for it over and over
+
*** Except, it does it poorly.  (It rebuilds the in-memory data structure for indexing an archive over and over.)
*** Just use the file system, for heaven's sake!
+
*** Just use a file system, for heaven's sake!
 +
----
 +
* Resources
 +
** Wiki page for this talk: http://elinux.org/Improving_Android_Boot_Time
 +
** Android Porting, Android Platform, and Android Kernel mailing lists, depending on where your issue is
 +
*** See http://elinux.org/Android_Web_Resources#Mailing_Lists
 +
** My e-mail: tim(dot)bird (at) am(dot)sony(dot)com
 +
 
 +
[[Category:Android]]

Latest revision as of 21:36, 27 October 2011

Here is the outline for my talk:

  • Title

  • Outline
    • Android boot sequence overview
    • Measuring boot times
    • Problem areas
      • Some gory details
    • Ideas for improvements

  • Boot Sequence Overview
    • bootloader
    • kernel
    • init
      • loads several daemons and services, including zygote
      • see /init.rc and init.<platform>.rc
    • zygote
      • preloads classes
      • starts package manager
    • service manager
      • start services

[ diagram would be nice ]


  • Measuring Boot Time
    • Systems measured:
      • Android Developer Phone (same hardware as G1), running Donut (1.6)
      • Nexus 1, running Eclair (2.1)
      • OMAP Evaluation Module (EVM), running Eclair
        • NOTE: Used NFS root filesystem (which means file IO timings might be off)
    • Tools and techniques used (next slide)

  • Tools for measuring and tracing boot time
    • stopwatch
      • It's kind of sad that it takes so long that you can use a stopwatch
    • Message loggers
      • grabserial
      • printk times
      • logcat
    • bootchart
    • strace
    • method tracer*
    • ftrace*

  • Grabserial
    • Tool for measuring time of printouts on a serial port, from a host machine
      • Only useful with EVM board, which has serial console
    • Shows timestamp for each line received over serial console
    • See Grabserial
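
The timestamping idea can be sketched in a few lines of Python. This is a minimal illustration, not grabserial itself: the function name `timestamp_lines` and the injectable `clock` parameter are assumptions made for the example, and the lines may come from any iterable (for instance a file object opened on a serial device).

```python
import time


def timestamp_lines(lines, clock=time.monotonic):
    """Yield (elapsed_seconds, delta_seconds, text) for each line as it arrives.

    elapsed is measured from the moment iteration starts; delta is the
    time since the previous line (or since the start, for the first line).
    """
    start = clock()
    previous = start
    for line in lines:
        now = clock()
        yield now - start, now - previous, line.rstrip("\r\n")
        previous = now
```

Feeding it the console stream from the host side gives one timestamp per received line, which is the same kind of information the tool reports.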

  • Printk Times
    • Kernel option for adding time stamp to each printk
      • Set CONFIG_PRINTK_TIME=y
        • Option is on "Kernel hacking" menu, "Show timing information on printks"
    • Can save from terminal on serial console, on host
    • Can also show after boot with 'dmesg'
    • Useful to use 'initcall_debug' on kernel command line
    • See Printk Times
    • Can change loglevel of 'init' program
      • Change "loglevel 3" to "loglevel 8" in /init.rc
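
Once the timestamps are in the log, the long pauses are the interesting part. The following is a rough sketch, assuming the usual `[    1.234567] message` prefix that CONFIG_PRINTK_TIME adds; the helper name `largest_gaps` is hypothetical, and the input can be saved console output or the output of 'dmesg'.

```python
import re

# Matches the "[  seconds.micros]" prefix added by CONFIG_PRINTK_TIME.
PRINTK_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gaps(dmesg_lines, top=5):
    """Return up to `top` (gap_seconds, timestamp, message), biggest gap first.

    The gap for a message is the time elapsed since the previous
    timestamped message, so a large gap points at whatever ran in between.
    """
    gaps = []
    previous = None
    for line in dmesg_lines:
        match = PRINTK_RE.search(line)
        if not match:
            continue  # line carries no printk timestamp
        stamp = float(match.group(1))
        if previous is not None:
            gaps.append((stamp - previous, stamp, match.group(2)))
        previous = stamp
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```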

  • Bootchart
    • 'init' gathers data on startup
      • Must re-compile 'init' with support for bootchart data collection
    • A tool on the host produces a nice graphic
    • See Bootchart and Using Bootchart on Android

  • strace
    • Shows system calls for a process (or set of processes)
    • Is part of AOSP since Eclair
    • Can add to init.rc to trace initialization.
      • For example, to trace zygote startup, in /init.rc change:
   service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
to
   service zygote /system/xbin/strace -tt -o/data/boot.strace /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
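
A trace like /data/boot.strace gets large quickly, so a first pass that just counts calls per syscall is useful. A minimal sketch, assuming the line layout produced by `-tt` (a wall-clock timestamp, then `name(args) = result`, optionally preceded by a pid when forks are followed); `syscall_counts` is a hypothetical helper, not part of strace.

```python
import re
from collections import Counter

# Optional pid, then HH:MM:SS.usec (from -tt), then the syscall name.
STRACE_RE = re.compile(r"^(?:\d+\s+)?\d{2}:\d{2}:\d{2}\.\d+\s+(\w+)\(")


def syscall_counts(strace_lines):
    """Count how many times each syscall name appears in strace -tt output."""
    counts = Counter()
    for line in strace_lines:
        match = STRACE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Sorting the result with `counts.most_common()` shows at a glance whether a process is issuing an unusual number of small calls.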

  • Android system log
    • Android has a built-in logging system
    • Use logcat to see messages
    • I added extra instrumentation for class preloading and package scanning
      • PARSE_CHATTY flag for package scanning
    • I built my own tool (logdelta) to scan log and produce 'delta' times
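
The 'delta' idea can be illustrated as follows. This is only a sketch and is not the logdelta tool: it assumes log lines in the `logcat -v time` layout (`MM-DD HH:MM:SS.mmm` followed by the message), ignores midnight rollover, and the name `log_deltas` is made up for the example.

```python
import re

# "MM-DD HH:MM:SS.mmm rest-of-line", as printed by `logcat -v time`.
LOGCAT_RE = re.compile(r"^\d{2}-\d{2}\s+(\d{2}):(\d{2}):(\d{2})\.(\d{3})\s+(.*)$")


def log_deltas(lines):
    """Return (delta_seconds, rest_of_line) for each timestamped log line.

    delta is the time since the previous timestamped line (0.0 for the first).
    """
    results = []
    previous_ms = None
    for line in lines:
        match = LOGCAT_RE.match(line)
        if not match:
            continue  # skip lines without a timestamp
        hours, minutes, seconds, millis = (int(match.group(i)) for i in range(1, 5))
        stamp_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
        delta = 0.0 if previous_ms is None else (stamp_ms - previous_ms) / 1000.0
        results.append((delta, match.group(5)))
        previous_ms = stamp_ms
    return results
```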

  • method tracer*
    • method traces is built in to
  • ftrace*
    • I could have really used ftrace for some things
    • (especially to see page faults intermingled with system calls)
    • kernel version for my primary development board (2.6.29) didn't support it
    • should be usable in future versions of Android (Froyo is at 2.6.32)
    • NOTE: ARM is missing function graph tracing - see Ftrace Function Graph ARM

  • Measurement results
    • stopwatch
    • grabserial
    • printk times
    • strace
    • bootchart
    • logcat

---

  • stopwatch
    • ADP1: 4, 32, 57 | 39 apps
    • Nexus 1: 3.5, 20, 36 | 79 apps
    • EVM: 17, 37, 62 | 45 apps (very small)

---

  • grabserial

---

  • printk times

---

  • strace

---

  • bootchart

---

  • logcat

---

  • Problem Areas
    • First, a bootchart for EVM board
    • bootloader init
    • kernel init
    • zygote class preloading
    • package scanning
    • service initialization

  • bootloader init
    • outside scope of this talk
    • didn't measure commercial bootloader, only development one (U-boot)

  • kernel init
    • is mostly the usual suspects
    • (initcall_debug results)
    • USB

  • zygote class preloading
    • zygote pre-loads just under 2000 classes, and instantiates them in its heap
    • controlled by file:

  • package manager package scan
    • EVERY package is scanned at boot time
    • Very deep nesting, with abstraction
    • Not sure of exact set of purposes
      • But I see validation of certificates, permissions, capabilities and dependencies, etc.
    • Very difficult to trace
      • It bounces between java, c++ and kernel
      • And uses mmaped files (meaning accesses cause page faults)!!
        • So it's not even using syscalls for reading the data

  • Package scan call tree

[put call tree here]


  • parseZipArchive()
    • evil routine that builds an in-memory data structure for accessing a package file
    • scans the entire package, checking the content headers
      • caused read of almost entire package
      • touches every page in mmaped file, even if a sub-file in the archive won't be read later
      • e.g. entire package is scanned, when only the AndroidManifest.xml file is requested
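
For contrast, a ZIP archive keeps a central directory at the end of the file, so one member can be located without touching the rest. The sketch below uses Python's standard `zipfile` module on a made-up stand-in package to show how few bytes are needed to fetch only AndroidManifest.xml. It says nothing about how Android's own code works; `CountingBytesIO`, `build_fake_package` and `read_manifest_only` are names invented for the illustration.

```python
import io
import os
import zipfile


class CountingBytesIO(io.BytesIO):
    """In-memory file that records how many bytes are actually read."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_fake_package():
    """Build a stand-in package: a tiny manifest plus one large stored blob."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr("AndroidManifest.xml", b"<manifest/>")
        archive.writestr("resources.bin", os.urandom(1_000_000))
    return buffer.getvalue()


def read_manifest_only(package_bytes):
    """Read one member via the central directory; return (data, bytes_read)."""
    source = CountingBytesIO(package_bytes)
    with zipfile.ZipFile(source) as archive:
        manifest = archive.read("AndroidManifest.xml")
    return manifest, source.bytes_read
```

Comparing `bytes_read` with `len(package_bytes)` shows that only the directory and the requested member are read, which is the behavior one would want when only the manifest is requested.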

  • new Resources()
    • reads all resources in a file
    • is this really necessary?

  • Ideas for Enhancements
    • First, a side note on toothpaste..
    • kernel speedups
    • optimize package scan
    • optimize class preloading
    • miscellaneous optimizations
    • sreadahead??

  • Toothpaste
    • The problem with optimizations is that a reduction in one area causes a new problem (in either speed or size) in some other area
    • i.e. when you squeeze the tube of toothpaste, it just moves somewhere else, and doesn't actually come out the end
    • This is demonstrated with class preloading and page cache effects
      • I tried to improve things, but the I/O delays just moved somewhere else in the system, sometimes making things worse
      • e.g. AutoText - eliminated it and gained 4 seconds during class preloading, but /system/framework/framework-res.apk was just loaded later in the boot
      • e.g. Contacts.apk - moved AndroidManifest.xml to its own package, to avoid reading the entire (1.6M) package to read this one file, but next reference to contents of Contacts.apk caused the index rebuild again (costing the entire page cache load)

  • kernel speedups
    • outside the scope of this presentation
    • see http://elinux.org/Boot_Time
    • should really be able to get kernel up in 1 second
      • modulo network delays

  • Optimize Class Preloading
    • Preload less, and let apps pay penalty for shared class and resource use
      • move some classes to services, and have preloaded class be an accessor stub
      • figure out how to share heap back with zygote
      • This needs a lot of analysis - Google Android devs know that the whole process of selecting what classes to preload is a black art
    • Thread the heap construction
      • There is some evidence that class preloading has I/O waits, and would benefit from threading
      • Don't know if this is possible
      • NOTE: all threads need to terminate before spawning Android apps
    • Use pre-constructed dalvik heap
      • (more on next slide)

  • Use pre-constructed dalvik heap
    • Basic operation:
      • Snapshot the heap at end of preloading
      • Check for modifications to any class in preload list, and do regular preload
      • Otherwise, load pre-constructed heap
    • Issues:
      • Don't know if this is possible
        • Need to find parts of heap that are identical on each boot
        • Probably need separate "always-the-same" and "may-change" classes
      • Needs careful analysis, and knowledge of each class
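
The basic operation above amounts to a validity check in front of the snapshot. The following is a purely hypothetical sketch of that decision, assuming a cheap per-file stamp (size and modification time) is enough to detect a changed class. Nothing here corresponds to real Dalvik code, and `preload_stamp` and `choose_preload_path` are invented names.

```python
def preload_stamp(class_sources):
    """Build a comparable stamp from {path: (size_bytes, mtime_seconds)}."""
    return tuple(sorted((path, size, mtime)
                        for path, (size, mtime) in class_sources.items()))


def choose_preload_path(class_sources, saved_stamp):
    """Decide between loading the heap snapshot and a regular preload.

    Returns (decision, current_stamp). The snapshot is only used when a
    stamp was saved earlier and nothing in the preload list has changed;
    after a regular preload the caller would save current_stamp with the
    new snapshot.
    """
    current = preload_stamp(class_sources)
    if saved_stamp is not None and current == saved_stamp:
        return "load-snapshot", current
    return "regular-preload", current
```

The hard part listed under "Issues" is untouched by this sketch: it only decides whether a snapshot may be used, not whether a reusable snapshot can be produced at all.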

  • Optimize Package Scan
    • Most definitely! This should be the first thing attacked
    • Need to continue analysis
      • Most likely, should switch to a compressed flash file system

  • Miscellaneous
    • zoneinfo inefficiencies
      • discovered with strace
      • routines that issue read syscalls of 40 bytes, 8 bytes, 8 bytes (hundreds of times)
        • no buffering at user level, sloppy loop coding
      • linear scan of timezone file
        • for a file not present!!
      • probably only a few hundred milliseconds, but worth changing
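
The cost of the 40/8/8 read pattern, and what user-level buffering would save, can be shown with a small simulation. This is a sketch under stated assumptions, not the zoneinfo code: `CountingRaw` stands in for the kernel side and counts every request that reaches it, and the record count of 300 and the 4096-byte buffer are arbitrary.

```python
import io


class CountingRaw(io.RawIOBase):
    """Raw stream over bytes that counts each request reaching it."""

    def __init__(self, data):
        super().__init__()
        self._source = io.BytesIO(data)
        self.calls = 0

    def readable(self):
        return True

    def readinto(self, buffer):
        self.calls += 1  # one call here stands for one read syscall
        return self._source.readinto(buffer)


def read_records(stream, records):
    """Read `records` groups of 40 + 8 + 8 bytes, the pattern seen in strace."""
    for _ in range(records):
        for size in (40, 8, 8):
            stream.read(size)


def count_raw_reads(records=300, buffered=False):
    """Return how many raw reads are needed with or without buffering."""
    raw = CountingRaw(bytes(56 * records))
    stream = io.BufferedReader(raw, buffer_size=4096) if buffered else raw
    read_records(stream, records)
    return raw.calls
```

Without buffering every small read goes all the way down; with a buffer in front, the same loop is served by a handful of larger reads.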

  • readahead??
    • Interesting result from just pre-filling the page cache

[put page cache sequence here]

    • could use sreadahead to pre-fill page cache
    • however, this just masks bad behavior
      • Contacts.apk (half of 1.6M) is read 4 times during boot!
      • filling page cache makes reads after first one fast, but it would be better to avoid (most of) the reads altogether
        • better to just optimize or eliminate parseZipArchive()
    • sreadahead should be used dead last (after all other enhancements)
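
Pre-filling the page cache needs nothing more than reading the files once and throwing the data away. A minimal sketch of that idea follows; it is not sreadahead, and the function name `prefill_page_cache` and the 1 MiB chunk size are assumptions for the example.

```python
def prefill_page_cache(paths, chunk_size=1 << 20):
    """Read each file once and discard the data; return total bytes read.

    Reading pulls the file contents into the kernel page cache, so later
    reads of the same files are served from memory (until evicted).
    """
    total = 0
    for path in paths:
        try:
            with open(path, "rb") as handle:
                while True:
                    chunk = handle.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
        except OSError:
            continue  # a missing or unreadable file is skipped, not fatal
    return total
```

As noted above, this only hides the repeated reads; it does not remove them.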

  • Conclusions
    • Sorry - no speedups yet
    • But, have a good foundation and set of tools for improving things going forward

  • Observations
    • Premature optimization is the root of all evil
      • Be very careful of optimizing wasteful operations
      • Better to improve or eliminate the operations, than hide the wasteful operations with caching
    • Beware of systemic or architectural problems
      • Package management basically builds a persistent container and compression architecture in user space
      • Except, it does it poorly. (It rebuilds the in-memory data structure for indexing an archive over and over.)
      • Just use a file system, for heaven's sake!