Yocto Project Introduction

This is an introduction to the Yocto Project, written from the point of view of someone somewhat familiar with embedded Linux development and embedded Linux distribution maintenance. In my own investigation of the Yocto Project, I found a few key concepts that were not discussed or presented anywhere. I'd like to fix that with this page. Its purpose is to give a broad overview of how the project works technically, so that the learning curve is less steep for people approaching the Yocto Project for the first time.

[As of August 2012, this document is still under construction...]

The big picture
The Yocto Project is a collection of tools and meta-data (defined in a bit) that allows a developer to build their own custom distribution of Linux for their embedded platform. This could be a developer at a semiconductor company who wishes to develop board support for one of their hardware platforms, or it could be an independent developer writing a complete software stack for a product they are making. It could also be a group of engineers developing a distribution for use in multiple devices or products, such as an embedded Linux distribution company or the "systems" team at a company that produces multiple embedded Linux products.

The main parts
The main parts of the Yocto Project are the build system, the package meta-data, and the developer tools. The build system uses a tool called "bitbake" to process the meta-data and produce a complete Linux distribution. By design, the build system produces not just the software that will run on the target, but also the development tools used to build that software. It starts completely from scratch, building all the tools needed to construct the software, and then uses those to build the kernel, libraries, and programs that comprise a Linux distribution. Finally, it prepares the resulting software by placing it into appropriate bundles (packages, images, or both) for deployment to the target device and for application development and debugging. The Yocto Project also includes various additional tools used to develop embedded Linux or applications on top of it, such as emulators, IDEs, and host/target (cross) agents and debug tools.

Let's start by describing some of the concepts of the build system...

Build system
The primary tool used in the build system is called 'bitbake'. Bitbake has a user manual at http://docs.openembedded.org/bitbake/html/. (As of August 2012, this document appears to be a bit dated and is missing a few items of importance.) Bitbake can be thought of as "make" on steroids. It performs the same type of function as make: it determines the actions to perform based on 1) what the user requests at the command line, 2) the project data and 3) the existing build state, and then performs those actions.

Bitbake uses files with its own syntax for expressing:
 * the tasks to perform
 * the relationships (dependencies) between those tasks
 * the variables that control how the tasks are performed
 * the actual build instructions (e.g. compiler commands, linker commands, packaging commands, etc.)

Bitbake differs from 'make' in several key ways. The first is that it has a global view of the task list for a distribution. That is, it reads the entire set of files related to the distribution, and determines the global task list for a particular high-level build operation. This is considerably different from 'make', which processes just a single Makefile at a time. (Admittedly, 'make' can be made to work with extremely large projects, using complex include schemes and nested invocations. However, even these types of systems are rarely used for something as complex as a complete Linux distribution build.)
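As a rough illustration of the "global view" idea, here is a toy model in Python (not bitbake's actual scheduler). It merges the task dependencies of two hypothetical recipes into one graph and derives a single run order from it. The recipe names (libfoo, app), the "recipe:task" labels, and the cross-recipe dependency are all assumptions made up for this example; the ordering itself comes from Python's standard graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical per-recipe task graphs: task -> set of tasks that must run first.
# Nothing here is read from real bitbake meta-data.
LIBFOO_TASKS = {
    "libfoo:do_fetch": set(),
    "libfoo:do_configure": {"libfoo:do_fetch"},
    "libfoo:do_compile": {"libfoo:do_configure"},
    "libfoo:do_install": {"libfoo:do_compile"},
}
APP_TASKS = {
    "app:do_fetch": set(),
    # Assumed cross-recipe dependency: app is configured against an installed libfoo.
    "app:do_configure": {"app:do_fetch", "libfoo:do_install"},
    "app:do_compile": {"app:do_configure"},
}


def global_task_order(*recipe_graphs, already_done=()):
    """Merge every recipe's task graph and return one ordered run list.

    Tasks named in already_done are dropped from the result, a crude
    stand-in for consulting the existing build state.
    """
    merged = {}
    for graph in recipe_graphs:
        merged.update(graph)
    order = TopologicalSorter(merged).static_order()
    done = set(already_done)
    return [task for task in order if task not in done]


if __name__ == "__main__":
    for task in global_task_order(LIBFOO_TASKS, APP_TASKS):
        print(task)
```

The point of the sketch is only that one merged graph lets dependencies cross component boundaries, which is awkward to express with one Makefile per component. A real build tool also has to decide when previously completed work is stale, which this model ignores.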

Another apparent difference is that the syntax of the files that bitbake processes allows for a very high degree of flexibility in defining the tasks that should be performed and the variables that control the build process and output.

[note to consider: maybe too much flexibility?]

A variety of mechanisms (described in the bitbake manual) are used to control what operations are performed and what variables are used to control them. Bitbake supports inheritance mechanisms, which allow for a class-like definition of common operations. These common operations can be inherited and customized for specific situations. Bitbake also supports conditional definition of new tasks, and has the ability to customize or eliminate (mask) tasks based on variables computed during the build.

Bitbake is written in Python, and some aspects of the build system can be written using short Python snippets. Many other aspects of the system are written as snippets of shell code.

The meta-data
A key element of the Yocto Project is the meta-data which is used to construct a Linux distribution.

Meta-data refers to the build instructions themselves, as well as the data used to control what things get built and to affect how they are built. The meta-data also includes commands and data used to indicate what versions of software are used, and where they are obtained from. Finally, the meta-data includes changes or additions to the software itself (patches or auxiliary files), which are used to fix bugs or customize the software for use in a particular situation.

The build instructions consist of commands to execute the compiler, linker, archiver, packaging tools and other programs.

The Yocto Project provides the build tool (bitbake), along with the meta-data for a few distributions. Much of this meta-data can be re-used when someone is building their own distribution. The Yocto Project is related to the OpenEmbedded project, where the bitbake tool, much of the meta-data, and many of the meta-data concepts originated.

Each software component on the system (such as an individual program) has one or more files associated with it that express its meta-data (dependencies, patches, build instructions). A top-level (usually single) file that defines the 'tasks' for the software is provided in a ".bb" (bitbake) file. This is referred to as that component's "recipe" file. This file may be terse, as the system allows for inheritance and inclusion. Class files (.bbclass) are used to express the meta-data for commonly-used types of build and packaging operations. These files are 'inherit'ed into a recipe. Include files (.inc) are sometimes used to provide a common set of definitions, which can be customized for a particular version of the software. These files are included into a recipe using a 'require' command. Finally, patches and auxiliary files may be associated with a recipe. These files can be referenced by the recipe's build instructions (e.g. in the case of patches, applied to the software after fetching it and before compiling it).

A set of bitbake files that are related to a particular feature area can be organized into a "layer". This represents a collection of software or build tasks that can be included into an overall distribution.

Finally, a user selects the individual bitbake files to include, or sets of files from different layers, by referencing them in their local configuration (conf/bblayers.conf), and defines build control variables, also in their local configuration (conf/local.conf).

Build stages
The recipes for a distribution define a number of discrete tasks that are performed to accomplish the build. The tasks have names such as 'fetchall', 'configure', 'compile'. Internally, these tasks are prefixed with 'do_' (e.g. do_configure).

There can be a large number of tasks associated with a software component. By default, bitbake performs all tasks associated with building a software component and preparing it for deployment. However, bitbake can be used to perform just a single task relative to a component, using the '-c' command line option.

For example, to perform just the 'configure' task for the busybox software, you can do the following:

 bitbake busybox -c configure

Note that if busybox has already been built, this command might not do anything (because bitbake, like 'make', avoids re-running tasks that are not needed). To force it, you could use the '-f' option:

 bitbake busybox -c configure -f
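If you drive builds from a script, the same invocations can be assembled programmatically. The following is a minimal sketch, assuming only the '-c' and '-f' options described above; the helper name bitbake_command is my own and is not part of bitbake. It prints the commands rather than running them, so it works without a configured build environment.

```python
import shlex


def bitbake_command(recipe, task=None, force=False):
    """Return the argument list for a bitbake invocation.

    Only the '-c' (single task) and '-f' (force) options are modelled.
    """
    args = ["bitbake", recipe]
    if task is not None:
        args += ["-c", task]
    if force:
        args.append("-f")
    return args


if __name__ == "__main__":
    # Show the command lines that would be run.
    print(shlex.join(bitbake_command("busybox", task="configure")))
    print(shlex.join(bitbake_command("busybox", task="configure", force=True)))
```

To actually run a command, the returned list could be passed to subprocess.run(args, check=True) from a shell where the build environment has already been set up.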

Not every software component needs every build stage, so some might not be defined for a component. To see the list of tasks that a recipe defines for a component, execute the 'listtasks' task for that component:

 bitbake busybox -c listtasks

This is a list of the tasks associated with busybox (as of Poky version 7.0):

 do_fetchall
 do_build
 do_devshell
 do_cleansstate
 do_configure
 do_cleanall
 do_populate_lic
 do_package_write
 do_populate_sysroot
 do_buildall
 do_package_write_rpm
 do_menuconfig
 do_populate_lic_setscene
 do_patch
 do_listtasks
 do_compile
 do_package_setscene
 do_fetch
 do_checkuri
 do_clean
 do_package_write_rpm_setscene
 do_package
 do_unpack
 do_install
 do_populate_sysroot_setscene
 do_checkuriall

In general, a build proceeds through the stages in order: fetching, configuring, compiling, installing, and packaging the software.

Build work areas
The area where each software component is built depends on the recipe name, the distro name, the build tool and the target architecture for the component. During the build, some items are built for the host machine and some are built for the target machine. In non-embedded Linux, the machine you build and debug on and the machine you compile for (run the software on) are very often the same. However, in embedded Linux these are often different machines, and often they even have different CPU architectures.

The directory used for an individual component is something like: /tmp/work//

For example, the work directory for a build of the busybox code for the ARM qemu emulator platform with Yocto version 7.0 was located in: /tmp/work/armv5te-poky-linux-gnueabi/busybox-1.19.4-r2

The build directory for the sqlite database tool, for use on my Ubuntu 12.04 (x86_64) host, was located at: /tmp/work/x86_64-linux/sqlite3-native-3.7.10-r2
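To make the structure of these paths concrete, here is a small sketch that splits a work directory path into its parts. It assumes, based only on the two examples above, that the final directory name ends in "-version-revision" and that its parent directory names the target. The field names (target, recipe, version, revision) are my own labels, not official terminology.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class WorkDir(NamedTuple):
    target: str    # e.g. "armv5te-poky-linux-gnueabi"
    recipe: str    # e.g. "busybox"
    version: str   # e.g. "1.19.4"
    revision: str  # e.g. "r2"


def parse_work_dir(path):
    """Split a work directory path into target, recipe, version and revision.

    Assumes the last two '-'-separated fields of the final directory
    name are the version and the revision.
    """
    p = PurePosixPath(path)
    recipe, version, revision = p.name.rsplit("-", 2)
    return WorkDir(p.parent.name, recipe, version, revision)


if __name__ == "__main__":
    print(parse_work_dir("/tmp/work/armv5te-poky-linux-gnueabi/busybox-1.19.4-r2"))
    print(parse_work_dir("/tmp/work/x86_64-linux/sqlite3-native-3.7.10-r2"))
```

Splitting from the right is deliberate: recipe names such as sqlite3-native contain hyphens themselves, so splitting from the left would cut the name in the wrong place.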

The work directory for a software component contains not only the source code for the item, but also additional data related to building it and staging it for installation into a package or image, as well as a 'temp' directory which has logs of the various build tasks. The work directory is useful to know about if you want to customize the software (for example if you need to make a bugfix or apply some external patch), or determine why some build operation (such as fetching, compiling, packaging, etc.) is failing.

In the 'temp' directory under the work directory are both 'run' files and log files. The 'run' files are actually shell scripts or Python scripts that are executed during the different build stages. The log files have the output from the commands that were used to build the software.
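When a task fails, the first thing you usually want is its most recent log. Here is a minimal sketch for finding it. It assumes the log files are named 'log.' followed by the task name (for example log.do_compile), optionally followed by a dot and a suffix such as a process id; that naming is my assumption and may differ between versions. The function name latest_task_log is hypothetical.

```python
from pathlib import Path


def latest_task_log(work_dir, task):
    """Return the most recently modified log file for a task, or None.

    Looks in '<work_dir>/temp' for files named 'log.<task>' or
    'log.<task>.<suffix>' (assumed naming convention).
    """
    temp = Path(work_dir) / "temp"
    if not temp.is_dir():
        return None
    prefix = f"log.{task}"
    candidates = [
        p for p in temp.iterdir()
        if p.is_file() and (p.name == prefix or p.name.startswith(prefix + "."))
    ]
    if not candidates:
        return None
    # Newest by modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    log = latest_task_log(
        "/tmp/work/armv5te-poky-linux-gnueabi/busybox-1.19.4-r2", "do_compile"
    )
    print(log if log else "no log found")
```

The exact-prefix check matters because some task names are prefixes of others (do_package and do_package_write, for instance), so a plain wildcard match would mix their logs together.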

Other directories under the work directory are used for packaging the software's output files, or staging them for inclusion in a system image, or for staging other information about the package, such as its software license.

Build output
The software for an embedded device can be packaged in several different ways in preparation for deployment to the actual device. For some systems, the built software is packaged into a kernel and a filesystem image, which can then be directly written to flash or storage media on the device. For other systems, the software may be bundled as packages (similar to the way desktop distributions of Linux are delivered). These might require developer or even end-user installation on the target device. The Yocto Project build system is capable of producing images and packages in several formats, which the developer selects in the local configuration for the distribution. For example, three of the package formats that can be built by the Yocto Project are: Debian packages (.deb), Red Hat Package Manager files (.rpm), and Itsy packages (.ipk).

The directories where the software resides after a typical build are /tmp/deploy and /tmp/sysroots. The 'sysroots' directory contains the directory structure and contents for the root filesystem for a target build. (Except that it seems to be missing /dev, and it doesn't have root ownership; it probably needs pseudo to see the full contents.) The 'deploy' directory has the images or packages that are ready to be installed on the device.
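A quick way to see which package formats a build actually produced is to count the files under the deploy directory by extension. This is a small sketch that makes no assumption about the subdirectory layout; it simply walks the tree recursively and looks for the three package extensions mentioned earlier. The function name count_packages is my own.

```python
from collections import Counter
from pathlib import Path

# The three package formats mentioned in the text.
PACKAGE_SUFFIXES = {".deb", ".rpm", ".ipk"}


def count_packages(deploy_dir):
    """Count package files under deploy_dir, grouped by file extension."""
    counts = Counter()
    for path in Path(deploy_dir).rglob("*"):
        if path.is_file() and path.suffix in PACKAGE_SUFFIXES:
            counts[path.suffix] += 1
    return counts


if __name__ == "__main__":
    for suffix, number in sorted(count_packages("/tmp/deploy").items()):
        print(f"{suffix}: {number}")
```

Image files are ignored here on purpose, since their extensions vary with the image type selected.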

Development tools
[need to write about building and SDK, and the IDE here]

Now for some details
[it would be good to fill out the following sections]

fetching tips
[info about fetch mechanisms, proxies, mirrors, etc.]

recipe features that control the build
[document the 'important' variables that show up everywhere in the .bb files (e.g. PN)] [how does append work, how does masking work, how do constraints work]

extracting stuff
[where to pull stuff out - maybe covered already]

changing stuff
[how to adjust your recipe list, how to adjust the configurations of the packages, how to adjust the included files]

adding stuff
[how to add a 'hello world' package, your own kernel, your own library]

More neat stuff
[mention HOB, the build appliance, build servers, shared state]