Board Management Layer Notes

From eLinux.org

Latest revision as of 16:44, 10 October 2019

Here are notes about the different ways that systems do board management:

different systems

SLAV

Uses the following verbs or functions:

  • dut_boot
  • dut_login
  • dut_exec
  • dut_copyfrom
  • dut_copyto
  • fota = "flash-over-the-air" for flashing a board

Questions:

  • what layer provides these?
  • what layer calls these?

ttc (Tiny Target Control)

Uses the following verbs or functions:

  • ttc kinstall - install kernel (either to target or tftp area on server)
  • ttc fsinstall - install root filesystem (either to target or nfs area on server)
  • ttc reboot - reboot the target
  • ttc console - get access to console on the target
  • ttc login - get to login shell on the target
  • ttc run - execute a command on the target
  • ttc cp - copy files to or from the target
    • example: ttc [<board>] cp file target:/tmp
    • example: ttc [<board>] cp target:/some/dir/file hostfile
  • ttc rm - remove files from target
  • ttc waitfor - wait for a command to complete successfully
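Because the ttc verbs are plain command lines, a test harness can drive them by building argument lists. The sketch below assembles argument vectors for `ttc cp`, following the `ttc [<board>] cp ...` form shown in the examples above. The helper names (`ttc_argv`, `ttc_copy_to`, `ttc_copy_from`) and the board name are hypothetical, and the code only builds the list; it does not invoke ttc.

```python
from typing import List, Optional


def ttc_argv(verb: str, *args: str, board: Optional[str] = None) -> List[str]:
    """Build an argument vector of the form: ttc [<board>] <verb> <args...>."""
    argv = ["ttc"]
    if board is not None:
        argv.append(board)  # the board name is optional, as in the examples above
    argv.append(verb)
    argv.extend(args)
    return argv


def ttc_copy_to(local_path: str, target_path: str,
                board: Optional[str] = None) -> List[str]:
    """Host-to-target copy; the 'target:' prefix marks the target side."""
    return ttc_argv("cp", local_path, "target:" + target_path, board=board)


def ttc_copy_from(target_path: str, local_path: str,
                  board: Optional[str] = None) -> List[str]:
    """Target-to-host copy."""
    return ttc_argv("cp", "target:" + target_path, local_path, board=board)


if __name__ == "__main__":
    # "myboard" is a made-up board name used only for illustration.
    print(ttc_copy_to("file", "/tmp", board="myboard"))
    print(ttc_copy_from("/some/dir/file", "hostfile"))
```

A list built this way could be handed to `subprocess.run` on a host where ttc is installed and configured.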

reservations

  • ttc status - show status of a board (including reservation)
  • ttc reserve - reserve a target
  • ttc release - release a target

labgrid

  • Resource - Simple Resources like USB Serial Ports, Power Switch Ports (with availability annotations in the remote infrastructure)
  • Driver - Drivers which bind to a target and use resources, e.g. SerialDriver to use a SerialPort
  • Protocol - Abstract description of a driver interface, e.g. ConsoleProtocol for a driver which provides `read`, `write`, `sendline`, `sendcontrol` and `expect` functions
  • Place - Collection of Resources to describe a board in the remote infrastructure
  • Target - description of a device under test with resources and drivers
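To make the relationship between these concepts concrete, here is a minimal sketch of how a Resource, a Protocol, a Driver and a Target could fit together. It borrows the concept names from the list above, but the classes, fields and methods are hypothetical illustrations and not labgrid's actual API. The driver is a loopback that echoes written bytes instead of opening a real serial port.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class SerialPortResource:
    """Resource: describes a piece of hardware (fields are assumptions)."""
    device: str
    speed: int = 115200


class ConsoleProtocol(ABC):
    """Protocol: abstract interface that console drivers promise to provide."""

    @abstractmethod
    def write(self, data: bytes) -> None:
        ...

    @abstractmethod
    def read(self) -> bytes:
        ...

    def sendline(self, line: str) -> None:
        self.write(line.encode() + b"\n")


class LoopbackSerialDriver(ConsoleProtocol):
    """Driver: binds to a resource and implements a protocol.

    This one buffers written bytes and returns them on read.
    """

    def __init__(self, port: SerialPortResource) -> None:
        self.port = port
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer.extend(data)

    def read(self) -> bytes:
        data = bytes(self._buffer)
        self._buffer.clear()
        return data


@dataclass
class Target:
    """Target: a device under test with its resources and drivers."""
    name: str
    resources: List[object] = field(default_factory=list)
    drivers: List[object] = field(default_factory=list)

    def get_driver(self, protocol: type):
        """Return the first driver that implements the requested protocol."""
        for driver in self.drivers:
            if isinstance(driver, protocol):
                return driver
        raise LookupError(f"no driver implements {protocol.__name__}")


if __name__ == "__main__":
    port = SerialPortResource("/dev/ttyUSB0")
    target = Target("demo", [port], [LoopbackSerialDriver(port)])
    console = target.get_driver(ConsoleProtocol)
    console.sendline("uname -a")
    print(console.read())
```

The point of the split is that test code asks the target for a protocol and never needs to know which driver or resource sits behind it.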

Remote infrastructure verbs:

  • lock/acquire - acquire exclusive access
  • unlock/release - release exclusive access
  • add-match/del-match - add or remove regex matches to exported resources
  • create/delete - create or delete a place

r4d

See https://github.com/ci-rt/r4d for an overview.

R4d focuses on a rack-based lab, with power controllers and serial device servers that are kept in sync with each other.

"r4d means Remote For Device-under-test and is an infrastructure for power-control and console access for multiple Linux Boards that should be controlled by a test-infrastructure like jenkins."


Functions for lab management and rack/slot/controller assignment:

  • r4dcfg --add-rack <rack-name> <location> - add a rack to the lab
  • r4dcfg --add-power <rack> <model> <ip-addr> - add a power switch to the lab
  • r4dcfg --add-serial <rack> <model> <ip-addr> - add a serial device server to the lab
  • r4dcfg --add-board <rack> <port> <board-name> - add a board to the lab
    • the board should be connected to the same port (indicated) on the power controller and serial device server (e.g., if on port 5, the board must be configured on port 5 of the power controller and port 5 of the serial device server)
  • r4dcfg --move-board - move a board to a different rack and/or port
  • r4dcfg --delete-board - remove a board from the lab
  • r4dcfg --show-db - inspect lab configuration
  • r4dcfg --list-boards - show boards and board configuration
  • r4dcfg --poweron <board>
  • r4dcfg --poweroff <board>
  • r4dcfg --powercycle <board>
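The port-matching rule for `--add-board` (one port number shared by the power controller and the serial device server) can be modelled as a single slot per board. The sketch below is a hypothetical in-memory model of that rule, not r4d's implementation or database schema; the class and method names are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Slot:
    rack: str
    port: int  # same number on the power controller and the serial device server


class Lab:
    """Hypothetical in-memory model of rack/port assignment."""

    def __init__(self) -> None:
        self._boards: Dict[str, Slot] = {}

    def _check_free(self, slot: Slot, ignore: str = "") -> None:
        for name, used in self._boards.items():
            if name != ignore and used == slot:
                raise ValueError(
                    f"rack {slot.rack!r} port {slot.port} is used by {name!r}")

    def add_board(self, rack: str, port: int, name: str) -> None:
        if name in self._boards:
            raise ValueError(f"board {name!r} already exists")
        slot = Slot(rack, port)
        self._check_free(slot)
        self._boards[name] = slot

    def move_board(self, name: str, rack: str, port: int) -> None:
        if name not in self._boards:
            raise KeyError(name)
        slot = Slot(rack, port)
        self._check_free(slot, ignore=name)
        self._boards[name] = slot

    def power_outlet(self, name: str) -> Tuple[str, int]:
        slot = self._boards[name]
        return (slot.rack, slot.port)

    def serial_port(self, name: str) -> Tuple[str, int]:
        slot = self._boards[name]
        return (slot.rack, slot.port)  # by construction, same as the outlet


if __name__ == "__main__":
    lab = Lab()
    lab.add_board("rack1", 5, "board-a")  # names are made up
    print(lab.power_outlet("board-a"), lab.serial_port("board-a"))
```

Storing only one port number per board means the power outlet and the console can never drift apart in the model, which is the invariant the rule above asks the lab administrator to maintain physically.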

The following power control modules are supported:

  • net8x (Gude Expert Power Control NET 8x)
  • pc8210 (Gude Expert Power Control 8210 / 8211)

The following serial device servers are supported:

  • ps810 (Sena Pro Series PS810)

Main access to and control of the boards is provided by a libvirt API.

The virsh command set is documented here: https://libvirt.org/sources/virshcmdref/html-single/

LAVA

kernelCI (as a caller)

Fuego

These are considered the "transport" APIs:

  • cmd - execute a command on the device under test
  • report - execute a command, and log its output (used to execute the actual test program)
  • put - copy files and/or directories to the device under test
  • get - copy files and/or directories from the device under test

And a board management API:

  • reboot - reboot the device under test

details

The APIs provided by the plugin-class for this are:

  • ov_transport_connect - establish communication channel with a board
  • ov_transport_disconnect - disconnect communication channel with a board
  • ov_transport_get - get files from a board
  • ov_transport_put - put files to a board
  • ov_transport_cmd - execute a command on the board
  • ov_board_setup - provision and reserve a board or instantiate a vm
  • ov_board_teardown - destroy a vm instance, or release a board
  • ov_board_control_reboot - reboot a board

Currently in Fuego, the setup, teardown, connect and disconnect functions are often empty for a board, and provisioning is left as an exercise for a different element of the CI loop.
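As a rough illustration of this plugin shape, the sketch below renders the ov_* functions as methods of a Python base class whose setup, teardown, connect and disconnect hooks default to no-ops. This is a hypothetical rendering for discussion, not Fuego's code; the class names, the `run_job` helper and the choice to treat the local host as the "board" are all assumptions.

```python
import shutil
import subprocess


class BoardPlugin:
    """Hypothetical Python rendering of the ov_* functions listed above."""

    # These four hooks are no-ops by default, mirroring the "often empty" case.
    def board_setup(self) -> None:
        pass

    def board_teardown(self) -> None:
        pass

    def transport_connect(self) -> None:
        pass

    def transport_disconnect(self) -> None:
        pass

    def transport_cmd(self, command: str) -> str:
        raise NotImplementedError

    def transport_put(self, src: str, dst: str) -> None:
        raise NotImplementedError

    def transport_get(self, src: str, dst: str) -> None:
        raise NotImplementedError

    def board_control_reboot(self) -> None:
        raise NotImplementedError


class LocalBoard(BoardPlugin):
    """Treats the host itself as the board, so no real transport is needed."""

    def transport_cmd(self, command: str) -> str:
        result = subprocess.run(command, shell=True, check=True,
                                capture_output=True, text=True)
        return result.stdout

    def transport_put(self, src: str, dst: str) -> None:
        shutil.copy(src, dst)

    def transport_get(self, src: str, dst: str) -> None:
        shutil.copy(src, dst)


def run_job(board: BoardPlugin, command: str) -> str:
    """Call the hooks in order around a single command."""
    board.board_setup()
    board.transport_connect()
    try:
        return board.transport_cmd(command)
    finally:
        board.transport_disconnect()
        board.board_teardown()


if __name__ == "__main__":
    print(run_job(LocalBoard(), "echo hello"))
```

A board that does need provisioning would override `board_setup` and `board_teardown` without any change to the caller.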

syzbot

Uses these functions:

  • Copy - copy file from host into VM
  • Forward - sets up forwarding (communications channel) from VM to host
  • Run - execute a command in the VM
  • Diagnose - returns diagnostic or debugging information from the VM
  • Close - stops and destroys the VM

Here is the interface: https://github.com/google/syzkaller/blob/28ac6e6496673327d3319bab81c57a0f7366fb45/vm/vmimpl/vmimpl.go#L32-L57

Some comments:

  • when an Instance is created, it's supposed to be in a "good" state (e.g. rebooted)
  • Close ("destructor") should take care of doing tear down, returning back to pool, etc
  • for Copy operation we don't specify destination path, it's supposed to be chosen by the impl (different machines can have writable storage at different paths); this is fine for our use case of copying a few files into a single dir; a more flexible interface would allow choosing a suffix of the path on the target machine (impl will choose a path prefix, but you can still re-create a particular dir layout on the target)
  • port forwarding may look a bit specialized for our use case, but we want to connect back to the host over tcp (for a richer rpc protocol); maybe we could limit this to just 1 port which is specified during construction (not dynamically, e.g. qemu port forwarding can be set up only when you start the instance, not later)
  • the Run method is designed for long-running processes, so it streams output and can be aborted; also the console output is included in the command output, which may not be the best decision (it's always possible to merge it later, but not possible to unmerge)
  • Diagnose is a newer addition and can be used to provoke machine/OS-dependent diagnostics output on the console (because we do care a lot about kernel crashes/hangs and ability to understand what happened later based on console output)
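Two of these comments are easy to show in a small sketch: Copy returns a destination chosen by the implementation, and Run streams output and can be aborted. The code below is a loose Python analogue for illustration, not a translation of the Go interface linked above; the class name, the method signatures and the use of a scratch directory in place of a VM are assumptions.

```python
import os
import shutil
import subprocess
import sys
import tempfile
import threading
from typing import Iterator, List


class LocalInstance:
    """Hypothetical stand-in for a VM: a scratch directory on the host."""

    def __init__(self) -> None:
        self._workdir = tempfile.mkdtemp(prefix="instance-")

    def copy(self, host_src: str) -> str:
        """Copy a file in; the implementation picks the destination path."""
        dst = os.path.join(self._workdir, os.path.basename(host_src))
        shutil.copy(host_src, dst)
        return dst

    def run(self, argv: List[str], stop: threading.Event) -> Iterator[str]:
        """Stream output line by line until the process ends or stop is set.

        stderr is merged into stdout here, so the two cannot be separated
        later. The stop flag is only checked when a new line arrives.
        """
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        try:
            for line in proc.stdout:
                if stop.is_set():
                    break
                yield line
        finally:
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
            proc.stdout.close()

    def close(self) -> None:
        """Tear down the instance and discard its storage."""
        shutil.rmtree(self._workdir, ignore_errors=True)


if __name__ == "__main__":
    inst = LocalInstance()
    stop = threading.Event()
    for line in inst.run([sys.executable, "-c", "print('a'); print('b')"], stop):
        print(line, end="")
    inst.close()
```

Returning the destination from `copy` lets callers stay ignorant of where writable storage lives on a given machine, which is the trade-off described in the Copy comment.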

yocto project

See targetcontrol.py at: https://git.yoctoproject.org/cgit/cgit.cgi/poky/tree/meta/lib/oeqa/targetcontrol.py

functions in the BaseTarget class:

  • __init__ - for Qemu, set up instance (paths to images to use)
  • deploy - ???
  • start - launch emulator
  • stop
  • get_extra_files
  • match_image_fstype (internal routine?)
  • restart
  • run
  • copy_to
  • copy_from

QemuTarget has:

  • run_serial - appears to run a command on the serial console (for Qemu)

SimpleRemoteTarget only does ssh connection setup/teardown

The class relies on a 'runner' class and a 'connection' class.

The runner class (from QemuRunner) has:

  • launch
  • start
  • ip
  • server_ip
  • is_alive
  • stop
  • restart
  • run_serial

The connection class (from SSHConnection) has:

  • run
  • copy_to
  • copy_from
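The split between a runner (lifecycle) and a connection (command and file transfer) can be sketched as simple delegation. The code below is a hypothetical illustration of that structure using fake classes; it is not the code in targetcontrol.py, and the constructor arguments, return values and IP address are assumptions.

```python
from typing import List, Optional, Tuple


class FakeRunner:
    """Stand-in for a runner such as QemuRunner; nothing is launched."""

    def __init__(self) -> None:
        self.ip = "192.0.2.10"  # address from the documentation range
        self._alive = False

    def start(self) -> bool:
        self._alive = True
        return True

    def stop(self) -> None:
        self._alive = False

    def is_alive(self) -> bool:
        return self._alive


class FakeConnection:
    """Stand-in for SSHConnection; records calls instead of executing them."""

    def __init__(self, ip: str) -> None:
        self.ip = ip
        self.log: List[Tuple[str, ...]] = []

    def run(self, command: str) -> Tuple[int, str]:
        self.log.append(("run", command))
        return (0, "")  # (status, output) is an assumed return shape

    def copy_to(self, local: str, remote: str) -> None:
        self.log.append(("copy_to", local, remote))

    def copy_from(self, remote: str, local: str) -> None:
        self.log.append(("copy_from", remote, local))


class Target:
    """Lifecycle goes to the runner; run and copy go to the connection."""

    def __init__(self, runner: FakeRunner) -> None:
        self.runner = runner
        self.connection: Optional[FakeConnection] = None

    def start(self) -> None:
        if not self.runner.start():
            raise RuntimeError("runner failed to start")
        # The connection is created only once the runner knows the target IP.
        self.connection = FakeConnection(self.runner.ip)

    def stop(self) -> None:
        self.runner.stop()
        self.connection = None

    def restart(self) -> None:
        self.stop()
        self.start()

    def run(self, command: str) -> Tuple[int, str]:
        return self.connection.run(command)

    def copy_to(self, local: str, remote: str) -> None:
        self.connection.copy_to(local, remote)

    def copy_from(self, remote: str, local: str) -> None:
        self.connection.copy_from(remote, local)


if __name__ == "__main__":
    target = Target(FakeRunner())
    target.start()
    print(target.run("uname -a"))
    target.stop()
```

Under this split, a remote board that is already powered on would need only the connection half, which matches the note that SimpleRemoteTarget only does ssh connection setup/teardown.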