ATS2019-Minutes

These minutes are in text format.

Minutes from sessions at Automated Testing Summit 2019 (held in Lyon, France - October 31, 2019)

https://events19.linuxfoundation.org/events/ats-2019/program/schedule/

09:00 https://ats19.sched.com/event/Uvq8/keynote-welcome-opening-remarks-tim-bird-sr-staff-software-engineer-sony

09:10 https://ats19.sched.com/event/Uvpn/keynote-report-on-recent-testing-meetups-kevin-hilman-co-founder-sr-engineer-baylibre

* "The bugs are too fast - and why we can't catch them."
* Summary of recent test conferences
* Everybody is testing in their own corner - there is not much communication / upstreaming
* Test coverage stays on the beaten tracks
* Even with fragmentation: we still find lots of bugs. How can we fix all these bugs?
  * ~10% of kernel commits are bugfixes
  * syzbot finds ~3 bugs/day, but only has ~7% coverage
  * => We could find many more bugs - how can we deal with that?
* Can we use the structured information that comes from bots without squeezing it into free-form emails?
  * => Discussion is already going on.
* ==> Estimated 20k bugs/release (!)
  * "This is a lot of bugs. Can we dig out?"
* Current problem: fragmentation in CI/CD, test frameworks, test suites, result parsing, pass/fail criteria, log collection, results visualization, bug tracking, and the kernel developer process for fixes
  * And there are even more closed projects working on this
* Conclusions:
  * Fragmentation is bad
  * Collaboration is good
  * Work upstream
  * No upstream? Create one!

09:30 https://ats19.sched.com/event/WDIw/lkft-status-update-milosz-wasilewski-linaro

LKFT
* Covering multiple architectures: arm32, arm64, i386, x86_64
* Multiple hardware platforms of each type
* Also testing with QEMU
* Testing multiple LTS branches
* Testing latest stable, mainline, and next
* Running multiple test suites: LTP, perf, kselftest, ...
* ~25,000 tests per push
  * => 1M tests each week, 70M to date
* Lately also started testing Android with a downstream kernel
* Future plans:
  * Boot design:
    * LAVA
    * Choose rootfs, choose kernel
    * Fastboot is avoided where possible
    * Uses NFS-based rootfs where possible
    * LAVA job generation abstracted with their own tool => talk after lunch today
  * Test design:
    * kselftest built with the kernel and overlaid into the rootfs
    * dmesg logs: improved parsing for kernel warnings and errors
  * Reporting system:
    * Collecting a lot of data => but reporting is not easy
    * Looking for a reporting and analytics layer
    * Analyze across branches and across time
    * How to integrate with multiple data sources?
* Questions:
  * How many tests are failing? How many bugs are found?
    * About 100 bugs currently open that have not been fixed
    * But no detailed statistics
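The dmesg-parsing step mentioned above can be sketched as a small scanner that flags suspicious kernel log lines. This is only an illustration of the idea, not LKFT's actual implementation; the patterns and function names are made up:

```python
# Minimal sketch of dmesg parsing for kernel warnings/errors, in the
# spirit of LKFT's improved dmesg checks. Patterns are illustrative only.
import re

PATTERNS = [
    ("warning", re.compile(r"WARNING:|\bwarn_slowpath")),
    ("error",   re.compile(r"BUG:|Oops:|Kernel panic|kernel NULL pointer")),
]

def scan_dmesg(log):
    """Return a list of (severity, line) for suspicious dmesg lines."""
    hits = []
    for line in log.splitlines():
        for severity, pattern in PATTERNS:
            if pattern.search(line):
                hits.append((severity, line))
                break  # one severity per line is enough
    return hits

demo = ("[  1.0] usb 1-1: new device\n"
        "[  2.0] WARNING: CPU: 0 PID: 1 at foo.c:42\n"
        "[  3.0] BUG: unable to handle page fault")
print(scan_dmesg(demo))
```

A real implementation would need a much larger pattern set and would have to cope with multi-line backtraces, but the shape of the problem is the same.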

09:40 https://ats19.sched.com/event/WDJ3/fuego-status-update-tim-bird-sony

Status update: Fuego
* Introduction:
  * What is Fuego?
    * Debian-based Linux distro
    * Jenkins packaged inside
    * Test execution core inside
    * Collection of tests inside
    * All in a Docker container
  * It's for high-level integration testing
  * Fuego does embedded testing: always host and board
  * Fuego cross-builds tests and tools => architecturally neutral
  * Collection of tests:
    * Scripts for test execution
    * Tools for results parsing, analysis, visualization
  * Phases: pre_test, dependency check, build, deploy, run, post_test
  * Fuego has multiple transports between host and target
  * Command line tool if you want to ignore the GUI
  * Everything in Docker to make it reproducible
* Latest release 1.5:
  * Now: Jenkinsless install; can be plugged into other CI systems
  * Now installable without a container and can run natively
  * Fuego core needs only bash and python on the host
    * Individual tests might require other things
  * Fuego makes sure you only use a limited feature set on the target
    * Requires only a POSIX shell and ~20 common Linux commands (all available from busybox) on the target
    * Does not require awk or sed, but does require grep
  * Fuego does not support provisioning of a target!
    * Is an exercise for the user /o\
* Current prototype features:
  * Support tests from other frameworks: Functional.Linaro, Functional.ptest
  * Want to support running a test via LAVA, because so many people are running LAVA
  * Configurable back end for test results
  * Artifact server support: pull artifacts from other servers (like KernelCI)
* Roadmap, short term:
  * Board provisioning support (even if Tim wanted someone else to do it)
  * External monitors, e.g. power monitors => adds another dimension of data to each test
  * Continue integration with other systems:
    * Run a test on an external board manager like Beaker, labgrid, ...
* Roadmap, longer term:
  * Utilize an external artifact server for the SUT
    * Maybe use KernelCI triggers and builds?
  * Hardware testing:
    * How to test SPI, CAN bus, USB
    * Currently totally missing
  * fserver support:
    * Test object server
    * Storage of tests, build artifacts, test requests and results
    * Can be used to deliver requests from one host to another and return the results
    * Is WIP: https://fuegotest.org/wiki/Using_Fuego_with_fserver
* Fuego has some tests included
  * But there are thousands of tests in-house, and neither results nor tests are shared
* Groups at Fujitsu, Sony, Toshiba, Mitsubishi and Samsung are using it
  * A group at Renesas is using an older version of it (1.2?)
  * Fujitsu is contributing the most currently
* Fuego is focused on testing complete products or distributions
  * It is not used to test upstream / master stuff
  * In Tim's lab the newest kernel is 4.4-something
* Fuego does a lot of work in describing and understanding pass/fail criteria
  * Kevin Hilman: that is really valuable to the upstream-focused test projects
  * Pass/fail criteria are dependent on hardware (board, SD card, ...) => a lot of effort goes into creating these criteria based on them
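The phase sequence described above (pre_test, dependency check, build, deploy, run, post_test) can be sketched as a minimal driver that runs the phases in order and stops on the first failure. All names here are illustrative; this is not Fuego's actual API:

```python
# Minimal sketch of a phase-based test driver, in the spirit of
# Fuego's pre_test / build / deploy / run / post_test sequence.
# The class and method names are made up for illustration.

PHASES = ("pre_test", "dependency_check", "build", "deploy", "run", "post_test")

def run_test(test):
    """Run each phase in order; abort on the first failure."""
    results = {}
    for phase in PHASES:
        handler = getattr(test, phase, None)
        if handler is None:          # phase not defined => treat as skipped
            results[phase] = "skip"
            continue
        ok = handler()
        results[phase] = "pass" if ok else "fail"
        if not ok:                   # later phases depend on earlier ones
            break
    return results

class DemoTest:
    def build(self):  return True
    def deploy(self): return True
    def run(self):    return True

print(run_test(DemoTest()))
```

The point of the structure is that parsing and reporting (post_test) stay separate from execution (run), which is what makes results comparable across tests.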

09:50 https://ats19.sched.com/event/WDJB/kernelci-status-update-kevin-hilman-baylibre

State of KernelCI
* The goal always was: test all architectures and as many boards as possible
  * 35 SoC vendors
  * Over 250 unique boards
* Building multiple kernel trees with multiple compilers (and versions) and with multiple configurations
* Mostly boot testing => getting to a shell is a pass
* Some boards have more complex tests (DRM, v4l2-compliance, power (suspend / resume), USB smoke test)
* Not writing own test suites, but using those of others
* With KernelCI in the Linux Foundation:
  * Becomes more of a forum where people can share code and information
  * There is already a lot of press coverage; it's on kernelci.org
* Current problems:
  * Collecting so much data
  * Want to do some data analytics on this (maybe also within the LF project)
  * Currently no new data or graphs to show
* After Plumbers: started to massage test results of multiple projects into a common format; more: https://github.com/kernelci/kcidb
* With KernelCI cooperating with Google and Microsoft:
  * Much more compute power in the cloud(s); can add many more variants
* Guillaume Tucker from Collabora has been refactoring the legacy Jenkins jobs
  * Microsoft took some of those pipelines and ported them to Azure Pipelines in a few days
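The common-format idea behind kcidb is roughly: every CI system submits its checkouts, builds, and test results as structured records tagged with an origin. The sketch below only illustrates that shape; the real kcidb schema differs in detail, so treat the field names here as approximations:

```python
# Illustrative sketch of a common result-submission format, in the
# spirit of kcidb: checkouts, builds, and tests linked by IDs and
# tagged with the reporting CI's origin. Field names are approximate,
# not the actual kcidb schema.
import json

report = {
    "version": {"major": 4, "minor": 0},
    "checkouts": [{
        "id": "myci:checkout-1",
        "origin": "myci",                      # which CI system reported this
        "git_repository_url": "https://git.kernel.org/...",
    }],
    "builds": [{
        "id": "myci:build-1",
        "checkout_id": "myci:checkout-1",      # link back to the checkout
        "origin": "myci",
        "architecture": "arm64",
        "valid": True,
    }],
    "tests": [{
        "id": "myci:test-1",
        "build_id": "myci:build-1",            # link back to the build
        "origin": "myci",
        "path": "boot",                        # hierarchical test name
        "status": "PASS",
    }],
}

print(json.dumps(report, indent=2))
```

The linked-ID structure is what lets a single dashboard aggregate results from many independent labs without forcing them to share infrastructure.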

10:00 https://ats19.sched.com/event/WDJI/cki-status-update-veronika-kabatova-red-hat

CKI overview: https://cki-project.org/
* Finding bugs before they get into the kernel
* Tracking a few upstream trees:
  * Newest stable, stable-next, ARM-next, RDMA, RT-devel
  * SCSI, net-next, mainline (on x86_64 only)
* Not sharing the results publicly, just with developers (?)
  * Results for the first set of trees are sent to the appropriate mailing lists and developers
  * Results for the second group are internal-only, as we don't actively collaborate with maintainers of those trees, but there's no private info there
  * We should be able to push to kcidb once it's ready; the only reason for not sending these results is to not spam people
* Focus: mostly server systems and the hardware used in those systems (storage, NICs, InfiniBand)
* Using GitLab CI, not Jenkins
* Beaker as hardware-abstraction backend and provisioning system
* Compile results into a report and publish on the mailing list
* Also publish on kcidb
* KPET selects tests to run based on the files touched by the patch => try to avoid re-finding old bugs and find only the new ones (and get results quicker)
* kselftest integration is on the todo list
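The patch-driven test selection that KPET does can be sketched as a mapping from source-path patterns to test suites: given the files a patch touches, run only the suites whose patterns match. The patterns and suite names below are made up for illustration; KPET's real database and matching logic are more elaborate:

```python
# Minimal sketch of patch-driven test selection, in the spirit of KPET:
# map source paths touched by a patch to the test suites worth running.
# Patterns and suite names are invented for this example.
import fnmatch

TEST_MAP = {
    "drivers/net/*": ["net-selftests", "iperf-smoke"],
    "fs/*":          ["xfstests-quick"],
    "mm/*":          ["ltp-mm"],
}

def select_tests(touched_files):
    """Return the set of suites whose patterns match any touched file."""
    suites = set()
    for path in touched_files:
        for pattern, tests in TEST_MAP.items():
            # fnmatch's '*' matches across '/' too, so "drivers/net/*"
            # also covers files in deeper subdirectories
            if fnmatch.fnmatch(path, pattern):
                suites.update(tests)
    return suites

print(select_tests(["drivers/net/ethernet/foo.c", "mm/slab.c"]))
```

This is also where the "only find new bugs" effect comes from: suites unrelated to the patch never run, so pre-existing failures in them never show up in the report.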

10:10 https://ats19.sched.com/event/WIZr/slav-status-update-pawel-wieczorek-samsung

* Started 2017 as a proof of concept
* Automated testing with interactive sessions on the DUTs
* Were not able to sell the MuxPi hardware since they are R&D, but it is public
* Current state: project is on hold
  * Documenting design decisions
  * Trying to provide a MuxPi-less evaluation environment
  * Trying to minimize confusion: map repositories to the Test_Glossary from last ATS
* https://github.com/SamsungSLAV/muxpi
* https://3mdeb.com/products/open-source-hardware/muxpi/ (~250 EUR)
* Now the software is completely open source

10:25 Dmitry Vyukov (developer of syzbot and syzkaller), 5-minute announcement
* 3000 syzkaller reproducers are available
* Only crash / no-crash result
* Tests may corrupt your system
  * Side note: CKI has a flag in the test definition if a test is destructive (e.g. regarding filesystems)
* Some developers want patches to be reviewed but not tested, or the other way around

10:30 https://ats19.sched.com/event/VdBs/open-testing-philosophy-kevin-hilman-baylibre

* Slide from Guillaume Tucker: https://www.linuxplumbersconf.org/event/4/timetable/?view=nicecompact
* Reminder: define pipeline blocks for testing to enable us to share tests and results
* Find places in that diagram for collaboration

cfi: https://github.com/SmithChart/Designing-for-Automated-Testing https://designing-for-automated-testing.readthedocs.io/en/latest/

Comment: x86 on TAC needed for the Android test suite

10:40 coffee break

11:00 https://ats19.sched.com/event/UvpM/labgrid-real-world-examples-jan-lubbe-rouven-czerwinski-pengutronix-ek (delayed 15 minutes until the projector was fixed :/)

11:50 https://ats19.sched.com/event/Uvpk/new-ways-out-of-the-struggle-of-testing-embedded-devices-chris-fiege-pengutronix-ek

* New Ways Out of the Struggle of Testing Embedded Devices - Chris Fiege, Pengutronix e.K.
* Pengutronix lab, organized in racks:
  * USB
  * CAN
  * RS232
  * Ethernet
  * GPIO
  * Power supply
* In the future: more CI for customer projects, more remote access to the lab
  * Need to increase reliability
  * -> more, smaller test controllers/servers

12:30 lunch

14:00 https://ats19.sched.com/event/WR99/beaker-project-automated-testing-at-red-hat-tomas-klohna-red-hat

* Beaker:
  * Is a unified testing platform for quality engineers
  * Has support for alternative hardware architectures
  * Allows filtering on hardware
  * Stores logs and results, exposes this in a web UI
  * Supports multiple systems for one test (the scheduler is smart enough to wait for multiple resources)
  * Test farm is heavily loaded: 9k machines and 4k users
* Beaker is split up:
  * Inventory management
    * Knows machine details
    * Knows machine history
    * Access control and user database
    * Filter on hardware properties
  * Scheduling
    * A job contains recipe sets, recipe sets contain recipes, and recipes contain tasks
    * Scheduling is done on the ??? layer
    * Tasks defined in XML
    * Code can come from Git or RPM
  * Provisioning
    * Only Red Hat-like distros
  * Testing
    * Tests can be written in any language as long as they handle the Beaker API; it's the test harness that's written in C
    * Tasks can be destructive
    * Tasks have a timeout; there is a watchdog
    * Tasks have metadata
  * Result collection
  * Some web UI

14:20 https://ats19.sched.com/event/V959/slav-test-stack-abstraction-layers-pawel-wieczorek-samsung-rd-institute-poland

* SLAV: test stack abstraction layers
* Aims at testing everything embedded
* Motivation:
  * Parallel access for interactive hacking and automated testing
  * Both use cases are close to each other: both need a preparation step, interaction, and release of resources
* Abstractions:
  * DUT - device under test
  * TM / DUT-C
    * Test manager: network access
    * DUT control
  * Test scheduler / test manager
  * Non-monolithic approach
* Hardware layer:
  * All schematics published
  * SD-Mux: DUT control
  * SDWire: like our USB-SD-Mux
  * MuxPi: DUT control and target management; contains an SBC
* Software:
  * REST API
  * Test manager
  * Test scheduler
  * Target manager
* Capabilities:
  * Describe attributes of the DUT and connected devices
  * Can be used to get access to the right board
  * Abstraction: admin creates defined shell scripts at a defined location on the target manager
* What did we learn here?
  * Pro:
    * Requires only preparing a test plan
    * Test plans can be reused
  * Cons:
    * Keeping compliance with other formats
    * Other tools are gaining features really fast
    * Capabilities have to be defined when preparing the DUT for the lab

Jan-Simon Möller, AGL Release Manager 14:50 https://ats19.sched.com/event/UvpJ/how-agl-tests-its-distro-and-what-challenges-we-face-jan-simon-moller-the-linux-foundation

* AGL initially did infotainment, now also working on instrument cluster
* Tools:
  * Single sign-on via LF Identity for the tools:
    * Git with Gerrit code review
    * JIRA, Jenkins
* Challenges:
  * Multiple boards, multiple images -> large test matrix
  * Also need timely results (hours, not days) => fast turnaround time
  * Release builds need a full pass with license/CVE scanning
  * Automotive wants full test coverage
* Architecture:
  * Git <-> Gerrit => Jenkins / CI
  * LAVA: tests on hardware (QEMU or real hardware)
  * Multiple LAVA instances controlling multiple DUTs
  * Collecting results in a modified KernelCI database
  * Results via e-mail
  * Also: results in the KernelCI web UI, but it is not really helpful
  * Missing trend analysis => currently waiting for new capabilities in KernelCI
  * Feedback loop from CI to Gerrit: developer gets +1 for build and CT
  * Every single commit goes through the CI loop
* Lessons learned:
  * Build step:
    * Full builds take 5-6 hours on a 16-core cloud machine
    * IO is an issue in cloud environments
    * Run in QEMU first (touchstone build)
    * Build phase: 2.5h, but that seems too long
  * Test step:
    * Test centers are decentralized, the build is in the cloud
    * Transfer of artifacts (~800MB) takes time
    * Build jobs tend to finish in bursts => leads to multiple downloads in parallel
    * LAVA: needs better prioritizing or round-robin scheduling; currently one lab seems to be preferred
  * Reporting:
    * KernelCI UI is too slow with many tests

14:00 https://ats19.sched.com/event/UvpY/test-plan-templating-for-lava-milosz-wasilewski-linaro-ltd
* Started with 1 device
* Now 8 devices, 20+ tests
* Convoluted definitions
* Interconnections between deploy type and tests
* Deployments are all different
* Separation into layers:
  * base -> deployment -> type -> test
* More Android tests (CTS, VTS)
* Questions:
  * KernelCI is similar; is migration possible?
    * Possible to migrate KernelCI to this
    * Kevin Hilman: interested in migrating to lava-test-plans because it's already similar
  * https://github.com/andersson/bootrr to check the device tree and loaded kernel

14:50 https://ats19.sched.com/event/Uvpe/a-survey-of-open-source-test-definitions-tim-bird-sony

* Standards XKCD
* A test definition store for each framework
* What's inside:
  * How to run the test
  * Lots of metadata
  * Parsing
  * Analysis
  * Whatever ends up in the database
* Fuego:
  * yml and tar, variables and parsers
  * A lot of files
  * python, shell, yaml and json
* Linaro:
  * test.yaml
  * test.sh instructions
* Yocto:
  * Python file with classes
* 0-Day:
  * PKGBUILD, packed in bash
  * yaml for execution and dependencies
* CKI:
  * Index files for metadata and triggering, scheduling, control
  * Makefile with phases
  * metadata with metadata (created by the Makefile)
  * README.md metadata (markdown)
* Jenkins:
  * config.xml - metadata, including instructions
* SLAV:
  * yaml, metadata, instructions and execution
* Overview:
  * Everyone uses shell snippets for tests
  * YAML is often used for metadata
  * Overview of fields in the projects (not repeated here)
  * Overview of common intersections
* Harmonization issues:
  * Time/location for execution
  * What runs where (target, test server)
  * What is the required software?

15:20 https://sched.co/WIcw Guide to CIP testing

* Goal: have an environment to test the SLTS kernel, SLTS RT kernel, CIP Core (Deby & ISAR), SW update
* Currently ~30 kernel configs for SLTS v4.4 & v4.19
* Uses GitLab CI; GitLab runners in AWS with k8s, builders on-demand in AWS
* LAVA workers locally
* Currently testing: boot, Spectre/Meltdown checker from Linaro
  * LTP in progress
* Next steps:
  * Improve reporting
  * Improve coverage
    * kselftest, jitterdebugger, Linaro test defs, benchmarks, hardware testing (CAN/PCIe/USB etc.)
  * Add more boards
  * Collaboration with the automated testing community
* GitLab cloud CI: gitlab.com/cip-project/cip-testing/gitlab-cloud-ci
  * ISAR needs binfmt-misc (needs privileges because it is not namespace-aware)
    * Hard way: make binfmt_misc namespace-aware
    * Easy way: use privileged containers
    * Went both ways :)
  * Reuses kops as a thin wrapper
  * AWS or on-premise
  * Scaling supported
  * Currently:
    * 100% uptime over the last six months
    * Auto scaling a bit slow on AWS
    * m5d.4xlarge (16 vCPU, 64GB RAM, 2x NVMe SSD)
    * Dynamically 0-40 slave nodes
    * On-premise at Siemens
  * Contributions/bug reports welcome
  * Master always running on AWS (40€ per month)
* linux-cip-ci:
  * Two containers for building and testing Linux SLTS kernels via LAVA:
    * Creates the job definition and waits for the result
* Currently two physical labs

https://ats19.sched.com/event/Uvph/working-together-to-build-a-modular-ci-ecosystem-discussion-session-tim-bird-sony

Tim: Modular testing framework
* Many monolithic test frameworks
* Want to mix and match
  * -> reduce work
* We need to define APIs / boundaries between our modules / systems
* We're making progress on the run artifacts (kcidb)
* How can we integrate our tools so that it is still easy for our users to use our systems?
* Proposal:
  * git-style interface: toolname, verb, args
  * JSON as return value
  * Async via start/stop/collect
* How should we input data? JSON, ENV, parameters?
* CLI is OK since we are not time-critical
* How do we get there without breaking existing systems?
* Need to look at each other's systems
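The proposed interface (toolname + verb + args, JSON on stdout, async via start/stop/collect) could look roughly like the sketch below. Nothing here is a settled API; the verbs, payloads, and job handles are placeholders to make the discussion concrete:

```python
# Sketch of the proposed "git-style" tool interface for modular CI:
# invoked as "toolname verb args...", returns JSON on stdout, and
# models async operations via start/stop/collect. All verbs and
# payload fields are illustrative, not an agreed standard.
import json
import sys

def handle(verb, args):
    """Dispatch a verb to a result dict (would talk to a real backend)."""
    if verb == "start":            # kick off an async job, return a handle
        return {"job": "job-1", "state": "running"}
    if verb == "collect":          # fetch results for a previously started job
        return {"job": args[0], "state": "done", "result": "pass"}
    if verb == "stop":             # cancel a running job
        return {"job": args[0], "state": "stopped"}
    return {"error": "unknown verb: %s" % verb}

def main(argv):
    verb = argv[1] if len(argv) > 1 else "help"
    print(json.dumps(handle(verb, argv[2:])))

if __name__ == "__main__":
    main(sys.argv)
```

A wrapper like this could sit in front of an existing framework without changing it, which matches the "get there without breaking existing systems" concern raised in the session.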

https://ats19.sched.com/event/Uvri/defining-a-standard-board-management-api-jan-lubbe-pengutronix-ek-pawel-wieczorek-samsung-rd-institute-poland

TODO: Some Names missing here. See Names in [Brackets]

* Poll: are there people interested in running multiple test systems in one lab?
  * Linaro: interactive hacking is a use case
  * Fuego, LAVA, labgrid seem to need to coexist
* There should be a common coordinator for all the systems, and all systems must be aware of it
  * Pro: you can use any system for the use cases it is good at, and learn the other ones

* Jan proposes a central resource controller:
  * Make a reservation, use the board, give it back afterwards
* Chris suggests having a master scheduler and not using hacks on the single-master systems
  * Must make the single-master systems aware of the new master
* Jan:
  * Need a separate daemon talking to the systems
  * Schedule each system's time to do its jobs
  * Needs jobs to inspect a queue and the boards' state
* Remi: that is currently used in LAVA for development.

* Tim: how to handle the serial port? Fuego wants a local device
* Jan: LAVA can call a tool and use STDIN/STDOUT
* Jan: we already use that on top of labgrid
* Kevin: many labs use ser2net. In newer versions ser2net can multiplex multiple clients

* Tim: Fuego uses a different model: test via SSH and store results in files on the target. They don't really use the serial port for test results; it is only used for executing commands if needed.

* Jan: what about the different configurations in all the tools?
* Jan: for the beginning it could be OK. Later there could be a tool that converts the other versions
* Tim: could think about using other formats.

* Jan: progress: labgrid takes no new drivers for power switches; we tell people to add them to pdudaemon instead.
* Tim: new release of pdudaemon.
* Matt Hart:
  * There is now a daemonless mode
  * Outputs can now be driven by name if they are named in the config file
* Remi:
  * pdudaemon with named ports can now be used in LAVA with a small configuration change
  * There is no need for the commands in the config file any more

https://ats19.sched.com/event/UvyI/summit-wrap-up-tim-bird-sony

* Tim: action items:
  * Note from Chris: action items are also on https://elinux.org/Automated_Testing_Summit_2019#Presentations
  * Tim: everybody, upload your slides!
    * Agreed to best upload them to elinux.org: https://elinux.org/Automated_Testing_Summit_2019#Summit_Artifacts
  * ALL: send notes to the list
* Tim: key decisions:
  * Jan: upload test results to kcidb
    * Tim: all systems make a kcidb client to upload results
    * Kevin: extend the kcidb schema
    * Tim: this has priority over the test definition unification work
  * Tim: use the LTP metadata format as the initial standard
    * Add a metadata converter to kselftest (Tim will take care of it)
    * Also plan to add a converter to this format for Fuego (Tim again)
    * This should give us some ideas of how to use it, and tweak it going forward
  * Jan: will build a prototype to move boards between LAVA and labgrid.
    * Jan: calls on other systems to adapt to the prototype once it's there.
  * Tim: Chris should keep working on "Hardware Design for Testing"
    * Try to promote it via people.kernel.org, corporate blog posts, LWN, ...
  * Jan: suggests people add more info on interesting (or not interesting) hardware to the Board_Farm page: https://elinux.org/Board_Farm
* Jan/Tim: do we do this again?
  * Tim: there was some work to do beforehand: logo, sponsor stuff. But that is done now.
  * Tim: this year there was a lot of testing discussion at other events, which made preparation weird.
  * Tim: there will be a testing track at Plumbers
  * Jan: there will be an embedded track at FOSDEM
  * Kevin: testing at Plumbers will be a full day
  * Tim/Kevin: there is only little overlap of people.
  * Tim: there is a lot of VM testing.
  * Sasha Levin: please submit your topics for the micro-conference!
  * Tim adds an action item to make a decision in the future.
  * Tomas Klohna: https://opentestcon.org/ is open for all topics
  * Jan: would like to have more hackfests or workshops
  * Tim: wants to plan for a Plumbers hackfest in August 2020
    * https://www.linuxplumbersconf.org/
  * Tim: sounds like a decision to focus on that next year, instead of a repeat of ATS in 2020 (October, co-located with ELCE)
    * Some people won't be at Plumbers, so they would miss out
  * Jan suggests continuing to use the mailing list
* Jan makes an advertisement for the USB-SD-Mux