
Virtualization on Renesas R-Car Platforms

Virtualization with Xen hypervisor

Xen supports R-Car chips out of the box. Example instructions on how to run Xen on R-Car H3/H2 can be found on the XenProject Wiki.


Virtualization with QEMU

QEMU is a generic and open source machine emulator and virtualizer. It emulates CPUs and provides a set of emulated devices to support running unmodified guest operating systems. When running on Linux hosts, it can use the host's hardware virtualization extensions (if supported) through the KVM API to run guests at near-native speed.

Quickstart

  • Install QEMU (e.g. "apt-get install qemu-system-arm")
  • Download a guest image (e.g. "openwrt-arm64-qemu-virt.Image" from OpenWRT)
  • Start QEMU with

    $ qemu-system-aarch64 -m 1024 -cpu cortex-a57 -M virt -nographic -kernel openwrt-arm64-qemu-virt.Image

    and press <ENTER> to enjoy your new ARM64 system!

Installing QEMU

When not using a binary distribution that provides a QEMU package, QEMU can be cross-compiled from sources, manually or with the help of a cross-compilation build system such as BuildRoot or OpenEmbedded.

Manual cross-compilation

QEMU supports a wide variety of options, in particular related to emulation of architectures or various classes of devices. The instructions given here only enable a minimal set of options to reduce the required dependencies. We only enable the aarch64-softmmu target architecture, FDT for ARM Linux guest support, KVM for hardware virtualization acceleration and SDL for display acceleration.

The following packages are needed on the host to cross-compile QEMU.

  • pkg-config
  • Python 2.x

The following packages are needed in the host build environment to cross-compile and on the target to run QEMU.

  • DTC (Device Tree Compiler)
  • GLib 2.x
  • Pixman
  • SDL (1.x or 2.x)
  • zlib

Other options can be enabled as needed and may require extra dependencies.

After obtaining the QEMU sources, the build must be set up. The following environment variables must be set.

CROSS_COMPILE
Cross-compilation toolchain prefix (e.g. aarch64-linux-gnu-).
PKG_CONFIG
Path to the pkg-config script for the cross-compilation environment (typically ${HOST_DIR}/bin/pkg-config).
The script must set the PKG_CONFIG_LIBDIR environment variable and execute the pkg-config binary, as shown in the example script below.
SDL_CONFIG
Path to the sdl-config script (typically ${STAGING_DIR}/usr/bin/sdl-config).
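
For reference, a minimal PKG_CONFIG wrapper script could look like the following sketch (the ${STAGING_DIR}/usr/lib/pkgconfig path is an assumption and depends on your cross-compilation environment):

#!/bin/sh
# Restrict pkg-config to the target's .pc files so host libraries
# are never picked up during cross-compilation.
export PKG_CONFIG_LIBDIR="${STAGING_DIR}/usr/lib/pkgconfig"
exec pkg-config "$@"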

Additionally, if not using the host system Python, the following environment variables must be set.

PYTHON
Path to the host build environment Python 2 binary.
PYTHONPATH
Search path for Python module files for the Python interpreter referenced by $PYTHON.
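
As an example, a typical environment setup before configuring could look like this (all paths are placeholders for your actual toolchain, host and staging directories):

$ export CROSS_COMPILE=aarch64-linux-gnu-
$ export PKG_CONFIG=${HOST_DIR}/bin/pkg-config
$ export SDL_CONFIG=${STAGING_DIR}/usr/bin/sdl-config
$ export PYTHON=/usr/bin/python2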

You can then run the configure script.

$ ./configure                                                  \
    --prefix=/usr --cross-prefix=${CROSS_COMPILE}              \
    --target-list=aarch64-softmmu                              \
    --enable-attr       --enable-fdt       --enable-kvm        \
    --enable-sdl        --enable-system    --enable-tools      \
    --audio-drv-list=                                          \
    --disable-bluez     --disable-brlapi   --disable-bsd-user  \
    --disable-cap-ng    --disable-curl     --disable-curses    \
    --disable-docs      --disable-libiscsi --disable-linux-aio \
    --disable-rbd       --disable-seccomp  --disable-slirp     \
    --disable-sparse    --disable-spice    --disable-strip     \
    --disable-usb-redir --disable-vde      --disable-virtfs    \
    --disable-vnc       --disable-werror   --disable-xen

QEMU can then be compiled and installed to the destination directory $DESTDIR.

$ make
$ make DESTDIR=${DESTDIR} install

Cross-compilation with BuildRoot

BuildRoot provides a QEMU package (v2.10.1 in BuildRoot 2017.11). To compile it, the following configuration options are needed.

BR2_PACKAGE_QEMU=y
BR2_PACKAGE_QEMU_CUSTOM_TARGETS="aarch64-softmmu"
BR2_PACKAGE_QEMU_HAS_EMULS=y
BR2_PACKAGE_QEMU_SDL=y
BR2_PACKAGE_QEMU_FDT=y
BR2_PACKAGE_QEMU_TOOLS=y
BR2_PACKAGE_SDL2=y

Additionally, if you want to enable the KMSDRM backend in SDL2, the following configuration options are needed.

BR2_PACKAGE_LIBDRM=y                    # for libdrm
BR2_PACKAGE_MESA3D=y                    # for libgbm
BR2_PACKAGE_MESA3D_GALLIUM_DRIVER_SWRAST=y
BR2_PACKAGE_MESA3D_DRI_DRIVER_SWRAST=y
BR2_PACKAGE_SDL2_KMSDRM=y

Running QEMU

The bare minimum command line to start a virtualized Linux guest in QEMU without any emulated device other than a serial console is as follows.

$ qemu-system-aarch64 -m 1G -cpu cortex-a57 -M virt --nographic -kernel qemu/Image
-m 1G
Sets the initial memory for the guest to 1GB.
-cpu cortex-a57
Sets the CPU model to Cortex-A57, same as the host's big CPUs. Having an identical CPU type on the guest and host allows accelerating execution of the guest by running instructions natively instead of emulating them (note that such acceleration requires using KVM).
-M virt
Sets the emulated machine type that will be exposed to the guest. The virt machine type is a generic ARM virtual machine that is a good starting point. Many other machine types are available and can be listed with '-M help'.
--nographic
Completely disables the graphical output from QEMU and redirects the guest emulated serial port to the console, muxed with the QEMU monitor. The QEMU man page gives more information about how to switch between the console and monitor.
-kernel qemu/Image
Selects the kernel image to boot on the guest. The image can contain a RAM disk for the root file system.

In addition to this minimal configuration, various classes of virtual devices can be exposed to the guest. QEMU supports emulating a wide range of standard devices (such as PCI network adapters, disk controllers or graphics cards). It also implements support for para-virtualized devices, collectively referred to as 'virtio', that usually offer improved performance over the emulated devices by enabling direct communication between the guests and host. The following sections explain how to instantiate virtio devices for disks, serial ports and graphics cards.

Disk virtualization

The guest can boot from a root file system stored in a disk image. To do so, a disk drive needs to be set up and a corresponding option passed to the Linux kernel. The corresponding virtio device is called virtio-blk and requires the guest kernel to be compiled with the CONFIG_VIRTIO_BLK=y option.

    -drive if=none,file=qemu/rootfs.ext2,format=raw,id=root-disk \
    -device virtio-blk-device,drive=root-disk \
    -append "root=/dev/vda"
-drive if=none,file=qemu/rootfs.ext2,format=raw,id=root-disk
Declare a disk drive backed by a raw disk image stored in the qemu/rootfs.ext2 file. The drive is given the name 'root-disk'.
-device virtio-blk-device,drive=root-disk
Instantiate a virtio block device backed by the disk drive 'root-disk' and expose it to the guest.
-append "root=/dev/vda"
Set the guest Linux kernel root device to /dev/vda, corresponding to the first virtio-blk disk.

Note that the shorthand version '-drive if=virtio' assumes PCI virtio (using virtio-blk, which is an alias for virtio-blk-pci) and thus can't be used for virtio-mmio. The device has to be explicitly set to the MMIO-based virtio-blk-device, requiring both the '-drive' and '-device' options.
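
Putting it all together with the minimal command line from above, a disk-backed guest could be started as follows (paths as in the earlier examples):

$ qemu-system-aarch64 -m 1G -cpu cortex-a57 -M virt --nographic \
    -kernel qemu/Image \
    -drive if=none,file=qemu/rootfs.ext2,format=raw,id=root-disk \
    -device virtio-blk-device,drive=root-disk \
    -append "root=/dev/vda"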

Console virtualization

The default QEMU configuration creates a serial port for the console only when the -nographic option is set. If graphics emulation is desired while the guest still uses a serial port for its console, the console serial port has to be instantiated explicitly. The corresponding virtio device is called virtio-serial and requires the guest kernel to be compiled with the CONFIG_VIRTIO_CONSOLE=y option.

    -device virtio-serial-device \
    -device virtconsole,chardev=virtiocon0 \
    -chardev stdio,signal=off,id=virtiocon0 \
    -append "console=hvc0"
-device virtio-serial-device
Instantiate a virtio serial device, creating a serial bus (note that, as for block devices, virtio-serial is an alias for virtio-serial-pci; we need to specify virtio-serial-device for the MMIO-based version). The serial bus is used as transport for the virtio console.
-device virtconsole,chardev=virtiocon0
Instantiate a virtual console that connects to the previously instantiated serial bus and binds to the 'virtiocon0' character device.
-chardev stdio,signal=off,id=virtiocon0
Create a character device named 'virtiocon0' using the stdio backend. Character devices are QEMU frameworks that implement data exchange between QEMU and the host. The stdio backend is a bidirectional pipe between QEMU and its standard input and output on the console in which it runs. The 'signal=off' option keeps signals within the guest, avoiding for instance a ctrl-C being propagated to the QEMU process and killing it.
-append "console=hvc0"
Set the guest Linux kernel console device to hvc0, corresponding to the first virtconsole device.
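
For example, combined with the minimal command line from above (without --nographic, since a graphics window is presumably wanted when instantiating an explicit console):

$ qemu-system-aarch64 -m 1G -cpu cortex-a57 -M virt -kernel qemu/Image \
    -device virtio-serial-device \
    -device virtconsole,chardev=virtiocon0 \
    -chardev stdio,signal=off,id=virtiocon0 \
    -append "console=hvc0"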

Network virtualization

QEMU can expose one or more network interfaces to the guest, and route traffic from those interfaces to the host network in various ways (from simply bridging the two networks to implementing private networks connecting multiple virtual machines). The corresponding virtio device is called virtio-net and requires the guest kernel to be compiled with the CONFIG_VIRTIO_NET=y option.

    -device virtio-net-device
-device virtio-net-device
Instantiate a virtio network interface and expose it to the guest. The interface will be named eth0.

As for other devices, the shorthand option '-net nic,model=virtio' creates a virtio PCI device, so we need to use long options.
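
With the long options an explicit host backend also has to be named. A tap-based setup could look like the following sketch (the backend choice and script=no settings are assumptions; tap networking needs root or CAP_NET_ADMIN on the host, and user-mode networking is unavailable here since slirp was disabled in the configure options above):

    -netdev tap,id=hostnet0,script=no,downscript=no \
    -device virtio-net-device,netdev=hostnet0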

Graphics virtualization

When not passed the '-nographic' option, QEMU will create a graphics window to display the QEMU monitor, the guest console (unless instructed otherwise with an explicit console configuration) and the guest graphics. The virtio device used for guest graphics is called virtio-gpu and requires the guest kernel to be compiled with the CONFIG_DRM_VIRTIO_GPU=y option.

    -device virtio-gpu-device
-device virtio-gpu-device
Instantiate a virtio graphics device and expose it to the guest. The device will be available through the DRM/KMS device /dev/dri/card0.

As before, virtio-gpu-device is the MMIO version, while virtio-gpu is a shortcut for virtio-gpu-pci.

QEMU uses SDL (v1 or v2) to display the guest graphics on the host. SDL supports different backends that can be selected at runtime through the SDL_VIDEODRIVER environment variable (and some of those backends in turn support multiple output systems of their own, increasing the complexity of the graphics stack). The following SDL backends have been tested.

X11 (SDL1 and SDL2)
The X11 backend creates an X window and renders all guest graphics there. It requires a running X server. This is the default backend and has been tested successfully with SDL2.
directfb (SDL1 and SDL2)
The directfb backend uses the directfb framework to render graphics. directfb provides a lightweight alternative to X11 to manage windows. By default directfb uses X11, DRM/KMS or FBDEV for display based on availability, and the choice can be overridden through the directfb configuration file or the DFBARGS environment variable. While directfb can thus be used for display without an X server, tests of directfb with direct DRM/KMS were unsuccessful on both SDL1 and SDL2. QEMU initializes correctly and displays the directfb mouse pointer (with SDL2) or a blank screen (with SDL1), but no guest graphics.
fbcon (SDL1 only)
The fbcon backend accesses the framebuffer device directly through the FBDEV API. It thus requires FBDEV emulation support in the host kernel. All tests were unsuccessful, with the first run resulting in an error when starting QEMU and subsequent tests hanging QEMU completely.
KMSDRM (SDL2 only)
The KMSDRM backend accesses the display device directly through the DRM/KMS API. The backend uses libgbm for buffer allocation and expects support for the host display device in libgbm. When such support is not available (which is the case for mainline Mesa on R-Car platforms) software rendering can be forced by setting the GBM_ALWAYS_SOFTWARE environment variable to 1. All tests were unsuccessful, the display always stayed blank.
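
A backend can be selected explicitly when launching QEMU, for example (x11 shown here, as it is the backend that has been tested successfully):

$ SDL_VIDEODRIVER=x11 qemu-system-aarch64 -m 1G -cpu cortex-a57 -M virt \
    -kernel qemu/Image -device virtio-gpu-device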

As X11 is the default, we expect most tests to be run with that backend, and failures with the other less common backends are not entirely surprising. They could be fixed provided enough interest.

While the X11 backend works, it incurs a high cost due to all the components involved in the display stack.

,--------------------------------.
|        Guest Userspace         |
`--------------------------------'
            [DRM/KMS]
,--------------------------------.
| Guest Kernel virtio-gpu Driver |
`--------------------------------'
            [VirtIO]
,--------------------------------.
|  Host QEMU virtio-gpu Driver   |
+--------------------------------+
|              SDL2              |
+--------------------------------+
|        SDL2 X11 Backend        |
`--------------------------------'
           [X protocol]
,--------------------------------.
|             X.org              |
+--------------------------------+
|       X.org fbdev Driver       |
`--------------------------------'
            [DRM/KMS]
,--------------------------------.
|   Host Kernel rcar-du Driver   |
`--------------------------------'
           [MMIO & DMA]
,--------------------------------.
|       R-Car DU Hardware        |
`--------------------------------'

To improve performance, a QEMU host virtio-gpu driver implementation that interfaces directly with the host graphics device through the DRM/KMS API could be developed. This could be further enhanced by making use of the KVM_SET_USER_MEMORY_REGION API to share memory buffers with the host, removing the need for the costly memcpy() operation when transferring video frames from the guest to the host. As the DRM/KMS API only allows for a single master device, a helper process that controls and shares access to the display devices would be needed on the host side. Such a process would be much more lightweight than a full X11 or Wayland server, as its only purpose would be to give access to independent display pipelines to separate guests running in separate QEMU instances.

Common issues

Recent kernels may access ARM64 registers that require a recent version of QEMU. If your kernel crashes with

Code: ........ ........ ........ ........ (d5380740)
Internal error: undefined instruction: 0 [#1] SMP
...
pc : __cpuinfo_store_cpu+0x68/0x1b0

you have to upgrade QEMU, or comment out the following two lines in arch/arm64/kernel/cpuinfo.c:__cpuinfo_store_cpu():

info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);

Virtualization acceleration

Multiple techniques can be used to accelerate virtualization, targeting both instruction execution and device emulation.

Accelerating instruction execution with KVM

By default QEMU executes guest software on emulated CPUs using dynamic binary translation. When the guest and host CPU architectures are identical, guest instructions can be run directly on the host CPU, provided there is a way to catch attempts to escape the guest's view of system resources. Such a mechanism is implemented by the KVM Linux kernel module when the host CPUs support it (using the Intel VT-x, AMD-V or ARM Virtualization Extensions hardware mechanisms). This drastically improves guest performance.

Host Kernel Configuration

Make sure your host kernel has the following options enabled:

CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y

"renesas_defconfig" should be fine.

Firmware Support

To use KVM, the firmware on your board must start Linux in hypervisor (EL2 / HYP) mode.

If your host kernel prints:

CPU: All CPU(s) started at EL2
...
kvm [1]: Hyp mode initialized successfully

during boot up, everything is fine, and you can skip the next section.

If your host kernel prints:

CPU: All CPU(s) started at EL1
...
kvm [1]: HYP mode not available

during bootup, HYP mode is not available, and KVM cannot be used, unless you first replace the firmware of your board with a version that supports HYP mode.
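
On a running system, the relevant boot messages can be checked with, for example:

$ dmesg | grep -iE 'started at EL|hyp mode'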

Enabling HYP Support

To build your own ARM Trusted Firmware with hypervisor mode support:

$ git clone https://github.com/renesas-rcar/arm-trusted-firmware.git
$ cd arm-trusted-firmware
$ make CROSS_COMPILE=aarch64-linux-gnu- PLAT=rcar RCAR_DRAM_SPLIT=3 RCAR_BL33_EXECUTION_EL=1 LSI=H3 # or LSI=M3 

Replace the BL2 and BL31 binaries in your firmware package by the generated "build/rcar/release/bl2.srec" and "build/rcar/release/bl31.srec", and follow your normal firmware upgrade procedure.

Running QEMU with KVM

To enable KVM acceleration, add the following option to the qemu command line.

    -enable-kvm
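
For example, extending the minimal command line from above:

$ qemu-system-aarch64 -enable-kvm -m 1G -cpu cortex-a57 -M virt --nographic -kernel qemu/Image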

Note that using KVM requires running on the same CPU type as specified in the QEMU command line. Hence when launched with '-enable-kvm -cpu cortex-a57', and running on a Cortex-A53 CPU core in a big.LITTLE configuration, QEMU will fail to start with the following error message:

kvm_init_vcpu failed: Invalid argument

Just retry, or force your luck by offlining all Cortex-A53 cores first:

for i in $(grep -lr arm,cortex-a53 /sys/bus/cpu/devices/cpu*/of_node/compatible); do
	echo 0 > $(dirname $(dirname $i))/online
done

Note: The same is true for running with '-cpu cortex-a53', with a53 and a57 exchanged.

Device Pass-Through

Instead of emulating devices, QEMU can pass ownership of a hardware device to the guest. This allows the guest to access the device directly, removing any costly emulation layer. Support for pass-through access to devices is implemented using the Linux VFIO API on the host, which exposes direct device access to the userspace QEMU process in an IOMMU/device-agnostic way. For more information about using VFIO with QEMU, see VFIO.

A downside of pass-through access to devices is that a device passed to a guest won't be available to other guests, while device emulation can allow multiple guests to access the same hardware device managed by the host. Another potential issue is that some hardware devices can potentially affect system stability when used incorrectly, or interact with other devices in the system in adverse ways. This is the case when dependencies exist between two instances of a hardware device, for instance when two display outputs are driven by display controllers sharing some control signals. Device pass-through should thus be used with care.

Device para-virtualization

To emulate devices QEMU needs to expose an interface for the guest to communicate with the host. The traditional way is for that interface to emulate commonly supported peripherals (such as various types of PCI devices). The guest will then be unaware that it runs in a virtualized environment, and will perform I/O accesses to the device as if running directly on the hardware. Those accesses will be caught by QEMU, which will interpret them based on the emulated hardware operation. This mechanism is slow as the interface isn't designed for virtualization, and can cause subtle issues as QEMU might not always behave exactly as the real hardware it emulates.

Interfaces more suitable for virtualization have been developed by the OASIS Virtual I/O Device (VIRTIO) Technical Committee. Those interfaces improve device emulation performance by creating optimized mechanisms to communicate between the guest and host. These mechanisms are called para-virtualization. The downside is that they need specific drivers in the guest, preventing unmodified virtualization-agnostic guest operating systems from using those devices.

Para-virtualization of devices is best suited for devices that move large quantities of data or need low latency communication between the guest and host, when pass-through device access is not possible or desirable. This is for instance the case with disks (in most cases we will want to expose virtual disks to the guest, not a full SATA controller), network interfaces (similarly multiple guests should run on the system with all their networks connected in a configurable way, not with a physical network interface passed to each of them), or display (individual outputs of a single display controller should be assigned to separate guests, without a single guest controlling the whole display controller).

More information about how QEMU implements para-virtualization for devices is available from R-Car/IO-Virtualization.


Managing virtual machines with libvirt

See Libvirt.


Building your own guest kernel

Kernel Config

The kernel config file used for the OpenWRT guest image is a good starting point. You can extract it from '/proc/config.gz' on the running system.
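
For example (this relies on the guest kernel having been built with CONFIG_IKCONFIG_PROC, which is the case whenever /proc/config.gz exists):

$ zcat /proc/config.gz > .config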

RAM Disk

You can extract the initramfs from an OpenWRT guest image using 'binwalk'. Make sure to truncate the initramfs file after the first 256-byte boundary after "TRAILER!!!" (see the sketch below), otherwise the kernel will crash with

Kernel panic - not syncing: broken padding
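
A minimal sketch of the truncation step, assuming the extracted initramfs is stored in 'initramfs.cpio' (the arithmetic rounds the end of the 10-byte "TRAILER!!!" marker up to the next 256-byte boundary):

$ off=$(grep -abo 'TRAILER!!!' initramfs.cpio | head -n1 | cut -d: -f1)
$ truncate -s $(( (off + 10 + 255) / 256 * 256 )) initramfs.cpio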