DhruvaGole/ProjectReport

From eLinux.org
Revision as of 05:13, 27 November 2021 by DhruvaG2000 (talk | contribs) (remove proposal from heading)


Internship Report- Bela support for BBAI

Youtube Video
About
Student: Dhruva Gole
Mentors: Giulio Moro, Stephen Arnold and Robert Manzke
Code: My Fork of Bela and Official Bela Code Repository
Wiki: https://forum.beagleboard.org/t/bela-support-for-bbai-later-ti-chips/29257/7
GSoC: Google Summer of Code 2021 entry

Aim

This project restructures and improves the existing Bela software so that it is compatible with, and easier to port to, newer Texas Instruments Sitara processors (like the AM5729 in the BeagleBone AI).

About Student

College ID: 181030017
Github: https://github.com/DhruvaG2000
School: Veermata Jijabai Technological Inst.
Country: India
Primary languages: English, Marathi, Hindi
Typical work hours: 10AM - 7PM Indian Standard Time

About the project

Project name: Bela support for the BeagleBone AI
My Blog: My blog related to this project and my findings.
Logs : I maintain weekly progress updates here: https://dhruvag2000.github.io/Blog-GSoC21/logs/

Description

What is Bela?
The BELA Cape

As given on the official website, Bela is a hardware and software system for creating beautiful interaction with sensors and sound. Bela consists of a Bela cape on top of a BeagleBone Black computer (until now). Bela has a lot of analog and digital inputs and outputs for hooking up sensors and controlling other devices, and most importantly Bela has stereo audio i/o allowing you to interact with the world of sound.
Every Bela system uses the same Bela software. It runs on a customized Debian distribution which, most notably, uses a Xenomai kernel instead of a stock kernel. Xenomai is a co-kernel for Linux that makes it possible to achieve hard real-time performance on Linux machines (http://xenomai.org/). Bela thus takes advantage of features of the BeagleBone computers and can achieve extremely fast audio and sensor processing times.
Although the project title mentions only the BeagleBone AI, the aim is a standardized setup that allows an easy jump across all TI chips.

Applications of Bela

Bela is ideal for creating anything interactive that uses sensors and sound. So far, Bela has been used to create:

  1. musical instruments and audio effects
  2. kinetic sculptures
  3. wearable devices
  4. interactive sound installations

and many more applications that are listed here

The BeagleBone Black
The BeagleBone Black board

BeagleBone Black is a low-cost, community-supported development platform for developers and hobbyists. Boot Linux in under 10 seconds and get started on development in less than 5 minutes with just a single USB cable. To learn more, visit https://beagleboard.org/black

The BeagleBone AI
The BeagleBone AI Board

This is the board for which I was able to introduce Bela compatibility.
Built on the proven BeagleBoard.org® open source Linux approach, BeagleBone® AI fills the gap between small SBCs and more powerful industrial computers. Based on the Texas Instruments AM5729, developers have access to the powerful SoC with the ease of BeagleBone® Black header and mechanical compatibility. BeagleBone® AI makes it easy to explore how artificial intelligence (AI) can be used in everyday life via the TI C66x digital-signal-processor (DSP) cores and embedded-vision-engine (EVE) cores supported through an optimized TIDL machine learning OpenCL API with pre-installed tools. Focused on everyday automation in industrial, commercial and home applications. To know more visit https://beagleboard.org/ai

Why add support for BBAI/newer TI chips?

The BeagleBone Black was launched in 2013, over 7 years ago, and newer and better TI Sitara processors have been launched since then. It would be better to have a more standardized setup that allows an easier jump across TI chips. Newer boards with different and more efficient chips, like the AM5X with the TI C66x digital-signal-processor (DSP) cores found in the BBAI, will need to be compatible with the Bela software and hardware.

Programming languages and tools to be used

C, C++, PRU, dtb, GNU Make, ARM Assembly

Implementation Details

The Bela cape is normally used in combination with the TI AM3358 processor present on the BeagleBone Black. Before this project, the hardware was partially working on the BBAI using only ALSA (Advanced Linux Sound Architecture) and the SPI driver [1]. However, the Bela real-time code on ARM and PRU was not yet running on the BBAI.

Brief Summary:

  1. Created a device tree overlay using the Cape Compatibility layer to port BB-BONE-AUDI, which worked but had a few frequency issues on the BBAI. The overlay I wrote has been accepted by BeagleBone maintainer Robert Nelson, and you can find it here: https://github.com/beagleboard/BeagleBoard-DeviceTrees/pull/36
  2. Created a device tree overlay for the BELA Cape to work on the BBAI using the Cape Compatibility layer. It has been tested and is available on GitHub: https://github.com/DhruvaG2000/BeagleBoard-DeviceTrees/blob/v4.19.x-ti-overlays/src/arm/overlays/BBAI-BELA-00A1.dts
  3. Adapted the Bela PRU and ARM code and workflow to drive the PRU through the Remote Processor Framework instead of the nearly obsolete UIO PRUSS.
  4. Updated the Bela code to use the McASP, GPIO and McSPI on the AM5729 SoC of the BBAI.
  5. Installed a Xenomai-patched kernel and ran the full Bela stack.
  6. Ported a debugger for the PRU called PRUDebug.

This project involved dealing with pinmuxing (using overlays), PRU assembly, and C and C++ for Linux user space applications. I also studied the Technical Reference Manuals for the Sitara family of SoCs (AM5729 and AM335x).


What is RProc?

The remoteproc framework allows different platforms/architectures to control (power on, load firmware, power off) their remote processors while abstracting the hardware differences, so the entire driver doesn't need to be duplicated. In addition, this framework also adds rpmsg virtio devices for remote processors that support this kind of communication. This way, platform-specific remoteproc drivers only need to provide a few low-level handlers. Reference: https://www.kernel.org/doc/Documentation/remoteproc.txt
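From user space, the kernel exposes each remote processor as a directory under /sys/class/remoteproc/ containing a `firmware` and a `state` attribute. The following is a minimal sketch of driving that interface from C++; it is not the code used in Bela, the helper names are hypothetical, and the directory index of a given PRU core is board-specific and has to be looked up on the target.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Write a single value to a sysfs-style attribute file.
// Returns false if the file cannot be opened or the write fails.
bool writeAttribute(const std::string& path, const std::string& value)
{
    std::ofstream file(path);
    if (!file.is_open())
        return false;
    file << value;
    file.flush();
    return file.good();
}

// Select a firmware image for a remote processor and start it.
// `rprocDir` is the remoteproc directory of the target core, for example
// "/sys/class/remoteproc/remoteprocN" (N is an assumption to be checked on
// the board). `firmwareName` is a file name relative to /lib/firmware.
bool startRemoteProcessor(const std::string& rprocDir, const std::string& firmwareName)
{
    assert(!rprocDir.empty());
    return writeAttribute(rprocDir + "/firmware", firmwareName)
        && writeAttribute(rprocDir + "/state", "start");
}

// Stop a running remote processor.
bool stopRemoteProcessor(const std::string& rprocDir)
{
    assert(!rprocDir.empty());
    return writeAttribute(rprocDir + "/state", "stop");
}
```

Writing to these attributes normally requires root privileges, and the firmware name has to be set while the core is stopped.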

What is a Device Tree Overlay?

Sometimes it is not convenient to describe an entire system with a single FDT (Flattened Device Tree). For example, processor modules that are plugged into one or more modules (a la the BeagleBone), or systems with an FPGA peripheral that is programmed after the system is booted. For these cases it is proposed to implement an overlay feature so that the initial device tree data can be modified by userspace at runtime by loading additional overlay FDTs that amend the original data. (ref.)
How is an overlay compiled?
dtc (Device Tree Compiler) - converts between the human editable device tree source "dts" format and the compact device tree blob "dtb" representation usable by the kernel or assembler source.
Once an overlay is compiled, it generates a .dtbo file which we can then use in the next stage.

How does one load a DT Overlay?
Edit the file /boot/uEnv.txt and set the following lines:

enable_uboot_overlays=1
uboot_overlay_addr4=/lib/firmware/BBAI-AUDI-02-00A0.dtbo

and on the next boot, this new overlay should be loaded automatically.

Syntax Analysis

The places within the Bela core code that required intervention are:

  1. In the Makefile, updated the workflow to build the PRU code for remoteproc. Also implemented auto-detection of the processor the code is being compiled on; the result is passed to the code as a compile-time flag.
  2. Created the PruManager code, which combines the RProc and UIO PRUSS (using the libprussdrv API) implementations under one roof.
  3. In pru/pru_rtaudio.p, the hard-coded McASP, SPI and GPIO constants were replaced with board-dependent ones.

All these changes were made so that the same code base can run on all supported boards (e.g.: BBAI, BBB) with compile-time checks.
In short, the common code base works as follows: headers such as bela_hw_settings.h set constants like pin numbers according to the board, so the rest of the code base needs very little hard-coding.
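The idea of board-dependent constants selected at compile time can be sketched as follows. This is an illustration, not the contents of bela_hw_settings.h: the flag name `BOARD_IS_BBAI` is hypothetical, and the AM5729 address shown (McASP1, as I read the AM572x memory map) is an assumption that should be checked against the TRM. The AM3358 value is the McASP0 base that appears in pru/pru_rtaudio.p.

```cpp
#include <cstdint>

// Hypothetical flag: in this sketch the build system is assumed to pass
// -DBOARD_IS_BBAI when it detects a BeagleBone AI, and nothing otherwise.
#ifdef BOARD_IS_BBAI
// AM5729 (BeagleBone AI). Assumed McASP1 base address; verify in the TRM.
constexpr std::uint32_t MCASP_BASE = 0x48460000;
#else
// AM3358 (BeagleBone Black). McASP0 base address from pru/pru_rtaudio.p.
constexpr std::uint32_t MCASP_BASE = 0x48038000;
#endif

// Board-independent code only refers to the symbolic base address.
constexpr std::uint32_t mcaspRegister(std::uint32_t offset)
{
    return MCASP_BASE + offset;
}
```

With this layout, code that needs a register address calls `mcaspRegister(offset)` and never mentions a particular SoC.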

PRU

A programmable real-time unit (PRU) is a fast (200-MHz, 32-bit) processor with single-cycle I/O access to a number of the pins and full access to the internal memory and peripherals on the Sitara processors on BeagleBones.
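To put the 200 MHz figure in perspective, the short calculation below works out the duration of one PRU cycle and how many cycles fit between two audio frames. The 44.1 kHz sample rate is an assumption made for the example and is not stated above.

```cpp
#include <cstdint>

constexpr std::uint64_t PRU_CLOCK_HZ = 200000000; // 200 MHz, as stated above

// Duration of one PRU clock cycle in nanoseconds.
constexpr double cycleTimeNs(std::uint64_t clockHz)
{
    return 1e9 / static_cast<double>(clockHz);
}

// Whole PRU cycles available between two consecutive audio frames.
constexpr std::uint64_t cyclesPerFrame(std::uint64_t clockHz, std::uint64_t sampleRateHz)
{
    return clockHz / sampleRateHz;
}

// 200 MHz gives 5 ns per cycle and, at an assumed 44.1 kHz, 4535 cycles per frame.
static_assert(cyclesPerFrame(PRU_CLOCK_HZ, 44100) == 4535, "cycle budget per frame");
```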

  1. The current Bela core code uses pasm to build the PRU assembly pru/pru_rtaudio.p and uses libprussdrv, which binds to the uio_pruss kernel driver to load the firmware to the PRU and handle access to the PRU RAM. Both pasm and libprussdrv are now deprecated, replaced by the clpru toolchain and remoteproc driver respectively.
  2. The PRU firmware contains hard-coded values for the address of the McASP, McSPI and GPIO peripherals. These addresses are different on the BBAI, so these constants were made conditional at compile time using #ifdef directives.
  3. As the Bela PRU firmware is written in assembly for pasm, instead of rewriting or updating it in such a way that it will stop working on the current Bela images, we have used the workflow detailed below.
Workflow for building the existing pasm PRU code with clpru

The workflow below works and has been tested on v4.19 BBAI and v4.14 BBB (+ BELA Cape).

  1. Built the .p file as is with pasm with -V2 -L -c -b flags. This generates a .bin file that contains the assembled program.
  2. Used the disassembler that Giulio Moro put together by hacking the one inside prudebug (find it here). Note: a disassembler is a computer program that translates machine language into assembly language.
  3. Process the bin through the disassembler and make it ready to be included inside an __asm__ directive (i.e.: add quotes and prepend a space at the beginning of each line):
   	prudis $< | sed 's/^\(.*\)$$/" \1\\n"/' > $(RPROC_INCLUDED_ASSEMBLY)

4. Have the following in a .c file:

void main()
{
     __asm__ __volatile__
     (
#include "included_assembly.h"
     );
}

5. Build this .c file with the regular clpru toolchain:

   	clpru -fe $(RPROC_TMP_FILE).o $(RPROC_TEMPLATE) -v3 --endian=little --include_path=$(RPROC_BUILD_DIR) --include_path=$(RPROC_INCLUDE) --include_path=/usr/lib/ti/pru-software-support-package/include

6. The rest of the build procedure is implemented in the Makefile.
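The sed expression in step 3 turns every disassembled line into a C string literal with a leading space and a trailing `\n` escape. For readers less familiar with sed, here is an equivalent sketch in C++. It is only an illustration of the transformation, not part of the Bela build, and the instruction text used in the usage note is made up; like the sed command, it assumes the input contains no double quotes.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Wrap one disassembled line so it can sit inside an __asm__ directive:
// opening quote, a space, the line, a literal backslash-n, closing quote.
std::string quoteAsmLine(const std::string& line)
{
    // Same limitation as the sed expression: quotes are not escaped.
    assert(line.find('"') == std::string::npos);
    return "\" " + line + "\\n\"";
}

// Apply the quoting to a whole listing, one output line per input line.
std::string quoteAsmListing(const std::string& listing)
{
    std::istringstream in(listing);
    std::string line;
    std::string out;
    while (std::getline(in, line))
        out += quoteAsmLine(line) + "\n";
    return out;
}
```

For a hypothetical input line `LDI r0, 0`, `quoteAsmLine` returns the text `" LDI r0, 0\n"` (including the quotes), which the C preprocessor then pastes into the `__asm__` block through the `#include`.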

I also changed these addresses in 'pru/pru_rtaudio.p' :

#define MCASP_SRCTL6
...
#define MCASP0_BASE 0x48038000
...
#define MCASP_XBUF10			0x228
...
#define MCASP_RBUF10			0x2A8

and a few others, such as the GPIO and SPI addresses.

Additionally, I compared the McASP, McSPI and GPIO sections of the AM5729 and AM3358 manuals to verify that all the registers of the McASP and McSPI peripherals kept the same meaning and offsets between the two chips.
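The two buffer offsets in the listing above follow a regular pattern, which makes this kind of cross-check between manuals easy to automate. The sketch below assumes one 32-bit register per serializer and serializer 0 at offsets 0x200 (transmit) and 0x280 (receive); both assumptions are inferred from the constants above and should be confirmed in the TRM.

```cpp
#include <cstdint>

constexpr std::uint32_t MCASP_XBUF0 = 0x200;    // assumed transmit buffer of serializer 0
constexpr std::uint32_t MCASP_RBUF0 = 0x280;    // assumed receive buffer of serializer 0
constexpr std::uint32_t MCASP_REG_STRIDE = 4;   // one 32-bit register per serializer

// Offset of the transmit buffer register of serializer n.
constexpr std::uint32_t mcaspXbuf(std::uint32_t n)
{
    return MCASP_XBUF0 + MCASP_REG_STRIDE * n;
}

// Offset of the receive buffer register of serializer n.
constexpr std::uint32_t mcaspRbuf(std::uint32_t n)
{
    return MCASP_RBUF0 + MCASP_REG_STRIDE * n;
}

// These must agree with MCASP_XBUF10 and MCASP_RBUF10 in pru/pru_rtaudio.p.
static_assert(mcaspXbuf(10) == 0x228, "XBUF10 offset");
static_assert(mcaspRbuf(10) == 0x2A8, "RBUF10 offset");
```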

PINMUXING

The following pin diagram (from this MathWorks forum) greatly helped in comparing the pins on the BeagleBone Black with those on the BeagleBone AI.

Beaglebone black pinmap

I have created a basic AUDIO overlay (named BBAI-AUDI-02-00A0.dts) that has been merged into the beagleboard / BeagleBoard-DeviceTrees repository.
I have also created and tested the BBAI-BELA-00A1.dts overlay for the BELA cape.

PRU->ARM INTERRUPTS

Bela can work with a PRU->ARM interrupt, which is the default these days but requires an RTDM driver, adding another layer of complexity. As an intermediate step, I managed to run Bela without the PRU->ARM interrupt by adding BELA_USE_DEFINE=BELA_USE_POLL to my make command line.
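Conceptually, polling means that the ARM side repeatedly reads a word that the PRU updates, instead of sleeping until an interrupt arrives. The following is a generic sketch of that idea, not Bela's actual polling loop; the function name, the flag layout and the use of std::this_thread for sleeping are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Wait until the word at `flag` differs from `lastSeen`, sleeping for
// `interval` between reads. Returns true when the value changed and
// false if `timeout` expired first.
bool pollForChange(const volatile std::uint32_t* flag, std::uint32_t lastSeen,
                   std::chrono::microseconds interval,
                   std::chrono::milliseconds timeout)
{
    assert(flag != nullptr);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (*flag == lastSeen) {
        if (std::chrono::steady_clock::now() >= deadline)
            return false;
        std::this_thread::sleep_for(interval);
    }
    return true;
}
```

The trade-off is the usual one: a shorter interval lowers latency but costs more CPU time, whereas an interrupt avoids the busy reads at the price of needing the extra driver.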

Transitioning from libprussdrv to rproc

I initially believed that I needed to change the initialization code in PRU.cpp, which relied on libprussdrv, and move it to rproc. I was not sure whether rproc provides a way to access the PRU's RAM the way prussdrv_map_prumem() did, which essentially gives access to a previously mmap'ed area of memory.
On the latest Bela code there is a Mmap class that makes this somewhat simpler (ref. here). I had to find the correct addresses in the AM5729 TRM to get memory mapping working with the Mmap class.
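For context, mapping a physical memory region into a user space process boils down to an mmap() call on a file such as /dev/mem. The sketch below shows the general technique with plain POSIX calls; it is not Bela's Mmap class, the function names are hypothetical, and the physical address of the PRU RAM is deliberately left as a parameter because it has to come from the TRM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Map `length` bytes of `path`, starting at the page-aligned `offset`.
// For hardware access `path` would be "/dev/mem" and `offset` the physical
// base address of the region. Returns nullptr on failure.
volatile std::uint32_t* mapRegion(const char* path, off_t offset, std::size_t length)
{
    assert(length > 0);
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return nullptr;
    void* ptr = mmap(nullptr, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    close(fd); // the mapping stays valid after the descriptor is closed
    if (ptr == MAP_FAILED)
        return nullptr;
    return static_cast<volatile std::uint32_t*>(ptr);
}

// Release a mapping obtained from mapRegion().
void unmapRegion(volatile std::uint32_t* ptr, std::size_t length)
{
    if (ptr != nullptr)
        munmap(const_cast<std::uint32_t*>(ptr), length);
}
```

Opening /dev/mem requires root privileges, and the pointer is declared volatile so the compiler does not cache or reorder accesses to memory that the PRU changes behind its back.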
As for the RProc implementation, I wrote PruManager.cpp and PruManager.h from scratch, taking advantage of the object-oriented features of C++.
The program is structured as follows:

  1. There is an abstract base class, PruManager.
  2. PruManagerRprocMmap and PruManagerUio are child classes of the above.
  3. As their names suggest, PruManagerRprocMmap uses RProc + Mmap to control the PRU and access its memory.
  4. PruManagerUio uses libprussdrv, preserving the approach that was used earlier. This class essentially provides backward compatibility.

Since both classes inherit from the same abstract base class, they share function names such as start(), stop(), getOwnMemory(), and getSharedMemory().
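The structure described above can be sketched as follows. Only the class name PruManager and the four method names come from the text; the signatures, the stand-in backend PruManagerFake and the helper startAndCheck() are assumptions made for illustration, and the real PruManager.h may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Abstract interface shared by all backends (signatures are assumed).
class PruManager {
public:
    virtual ~PruManager() = default;
    virtual int start() = 0;               // load firmware and run; 0 on success
    virtual void stop() = 0;               // halt the PRU
    virtual void* getOwnMemory() = 0;      // the PRU's own data RAM
    virtual void* getSharedMemory() = 0;   // RAM shared between PRU cores
};

// Hypothetical stand-in backend that keeps the "PRU memory" in ordinary
// buffers of arbitrary size, so the sketch runs without any hardware.
class PruManagerFake : public PruManager {
public:
    int start() override { running = true; return 0; }
    void stop() override { running = false; }
    void* getOwnMemory() override { return ownRam.data(); }
    void* getSharedMemory() override { return sharedRam.data(); }
    bool isRunning() const { return running; }
private:
    bool running = false;
    std::vector<std::uint8_t> ownRam = std::vector<std::uint8_t>(1024);
    std::vector<std::uint8_t> sharedRam = std::vector<std::uint8_t>(1024);
};

// Code written against the base class works with any backend.
bool startAndCheck(PruManager& pru)
{
    if (pru.start() != 0)
        return false;
    assert(pru.getOwnMemory() != nullptr);
    assert(pru.getSharedMemory() != nullptr);
    return true;
}
```

The benefit of this design is that callers such as startAndCheck() never need to know whether the PRU is driven through remoteproc or through libprussdrv.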

XENOMAI kernel

Xenomai is a Free Software project in which engineers from a wide background collaborate to build a robust and resource-efficient real-time core for Linux© following the dual kernel approach, for applications with stringent latency requirements.

A Xenomai kernel (v4.19.94-ti-xenomai-r64) has been built and tested for the BBAI. I installed it through the default procedure for updating the kernel and libraries, which I have documented here. I also built the entire Bela core code successfully, and ran and tested a few examples on the BBAI.

Hardware required

The hardware listed below was necessary for testing if my code implementation works correctly on the hardware.

  1. BeagleBone AI.
  2. Bela cape: The original Bela board.
  3. LA104 Logic Analyzer
  4. LEDs, jumper wires, Multimeter, etc.
  5. A Fan Cape

Timeline

Mar 29: Applications open. Students register with GSoC and work on their proposal with mentors.
Apr 13: Proposal complete. Submitted to https://summerofcode.withgoogle.com
May 17: Proposal accepted or rejected
  1. Proposal Accepted
  2. Community Bonding Period and discussion on the project and resources available.
  3. Learn about embedded Linux structure ✓
  4. Set up the BeagleBone AI, i.e. flash an up-to-date Linux image, connect to the local area network (LAN) via either Ethernet or WiFi, and run example code from this repository to test that the basics are working.
Jun 07: Pre-work complete, coding officially begins!
  1. All the boards and the BELA Cape will be available to me by this time, and I should have set up my BeagleBone Black and AI boards, i.e. flashed an up-to-date Linux image.
  2. Initial checks for hardware like audio ports and other peripheral devices will be completed.
  3. A detailed spreadsheet for cape pin mapping will be created, if not already available, for reference during BeagleBone AI software development. (one exists here)

(Some of the points above are taken from here.)

Jun 17: Milestone #1
  1. Introductory YouTube video ✓
  2. Setting pinmux values appropriately and fix the dtb to get the correct clock on the McASP MCLK pin. ✓
  3. Writing the BBAI-AUDI-02-00A0.dts overlay to port the old BB-BONE-AUDI overlay using the CCL to work on the AI. ✓
  4. Verify that it works running with ALSA at the correct frequency. ✓
June 24: Milestone #2
  1. Started writing the overlay for the Bela Cape.
  2. Study the remote processor framework and rpmsg, and how I can integrate it into the PRU.cpp code.
  3. Test and debug the workflow for building the existing pasm PRU code with clpru.
  4. Modify the PRU code with a McASP configuration that supports the AM572x.
  5. Ported prudebug to the AM572x.
June 30: Milestone #3
  1. Start writing code that uses rproc.
  2. Add the finalised PRU build workflow into the Makefile.
  3. Finalize the PruManager implementation and test it.
  4. The sinetone example test confirms that the BELA overlay and PRU code written so far work.
July 12, 18:00 UTC: Milestone #4
  1. Start working on SPI and GPIO by going through the AM572x TRM.
  2. Clean up the code written so far and create a proper structure for where all the files needed for rproc .out file generation are stored.
July 18: Milestone #5
  1. The Final milestone.
  2. Test the stock tutorials.
  3. Test GPIO, McASP, SPI support for BeagleBone AI using the examples available in Bela.

Experience

The following experience helped me get started more systematically with this project:

  1. I have used the C++, C and Python programming languages over the past 3 years in a variety of embedded systems projects using the ESP32, Arduino UNO and ESP8266, and I am also well-versed in FreeRTOS.
  2. I have an aptitude for writing good reports and blogs, and have written a small blog on how to use a debugger.
  3. I recently did a project using the ESP32, in which I used the DHT11 sensor to display humidity and temperature on a local HTML server. Other than that, I have worked on developing hardware and writing documentation for a 3 DOF arm based on a custom ESP32 board.
  4. I also interned at an embedded device startup where I
    1. Interfaced ADS1115 ADC with the ESP32 and used it to read battery voltage.
    2. Used UART for ESP32 and SIMCOM SIM 7600IE communication to gain LTE support.
    3. Published local sensor data to the cloud via LTE.
  5. I actively contribute to open source (most recently, I contributed to the ADS1115 library for ESP32 on the unclerus repo; the contribution can be seen here).
  6. Currently I am working on designing a development board for the Raspberry Pi Pico (RP2040) using KiCad.
  7. I also do a lot of mini projects throughout the year; you can find several more of my projects on my GitHub page.

Contingency

I believe that if I get stuck on my project and my mentor isn’t around, I will use the resources that are available to me. Some of those information portals are listed below.

  1. Derek Molloy's beagle bone guide provides all the information needed for getting up and running with my beagle.
  2. remoteproc
  3. BBAI vs BBB pin headers Google Sheet
  4. beaglebone pru gpio example
  5. official BELA website
  6. Ask on the BeagleBoard.org forum

Benefit

If successfully completed, this project will add support for the Bela cape + Xenomai + PRU on the BeagleBone AI, and also the code will be easier to port to other Texas Instruments systems-on-chip.

By going through the steps needed to have the Bela environment running on BBAI, we will go through refactoring and rationalization, using mainline drivers and APIs where possible. This will make Bela easier to maintain and to port to new platforms, benefiting the project's longevity and allowing it to expand its user base.
-Giulio Moro

Just ordered a Bela cape. The platform seems really cool and exactly what I was looking for :) Just thought I would add that, I would also be very interested in having a bit more processing power under the hood and the ai would definitely be enough for my purposes.
- A User on Bela Forum

Misc

Completed all the requirements listed on the ideas page.
The code for the cross-compilation task can be found here submitted through pull request #149.