BeagleBoard/GSoC/2022 Proposal/TaliaXu

=Proposal for BeagleWire PRU and Support=
About Student: Talia Xu
Mentors: Michael Welling, Omkar Bhilare
Proposal: Implementing PRU and Improving Standalone Cores on BeagleWire

=Proposal=
 * Completed all the requirements listed on the ideas page.
 * Pull request for the cross-compilation task: #162.

=Status=
This project is currently just a proposal.

About you
GitHub: taliaxu09
School: Technical University Delft
Country: The Netherlands
Primary language: English
Typical work hours: 12PM-8PM CET
Previous GSoC participation: This is my first time applying to participate in GSoC. I want to participate in GSoC with BeagleBoard because I think it is a great opportunity to gain deeper knowledge of the BeagleBoard code repository and to explore how to use it together with different peripherals. I also think the BeagleWire could be a promising candidate for some of the topics I wish to look into in my study on visible light communication (for example, an extension to the OpenVLC platform http://www.openvlc.org/instructions.html, which is based on the BBB).

About your project
Project name: RISC-V Based PRU on FPGA and BeagleWire Updates

Main Goals:
 * 1) Create a RISC-V PRU on the BeagleWire and optimize it for I/O latency
 * 2) Create examples in assembly for the PRU cores on the BeagleWire
 * 3) Improve the stability of the standalone cores and implement testbenches for their subsystems
 * 4) Improve the documentation

Introduction
The BeagleWire is an FPGA cape built around the Lattice iCE40HX that can be connected to and interfaced with the BeagleBoard. This project has two main goals. The first is to implement a programmable real-time unit (PRU) on the BeagleWire to allow low-latency I/O control between the main CPU and peripherals. The second is to revisit and improve the software support for the standalone cores (SDRAM, UART, SPI, PWM).

RISC-V Cores on FPGA
Several open-source RISC-V cores can serve as the basis for a PRU on the BeagleWire.

For iCE40:

 * 1) https://github.com/olofk/serv
 * 2) https://github.com/stnolting/neorv32
 * 3) https://github.com/sylefeb/Silice/tree/draft/projects/ice-v

The above cores can be used as a starting point and reference to quickly implement a working RISC-V core on the current BeagleWire cape. Once a core runs on the BeagleWire, the work will focus on the following improvements:
 * 1) Interface between the BeagleWire PRU cores and the BBB: create an interface that lets the BBB manage both its own PRU and the PRU on the BeagleWire simultaneously
 * 2) Measure and improve the I/O latency of the PRU cores: identify any bottlenecks in how the PRU cores communicate with peripherals and look into ways to remove them. The I/O latencies of the PRU cores on the BBB will be used as a reference.
 * 3) Fit as many PRU cores as possible on the iCE40: run multiple cores simultaneously on the BeagleWire with shared access to SDRAM. I also plan to identify components that are not necessary for the BeagleWire and remove them to maximize the number of PRU cores that fit

Another option, suggested by @abishek, will also be examined and discussed during the community bonding period:
 * 1) https://github.com/GlasgowEmbedded/glasgow

For PolarFire:

 * 1) https://www.microsemi.com/product-directory/soc-fpgas/5498-polarfire-soc-fpga

The approach for supporting a PRU on PolarFire is less well defined, but the plan is to start from the official RISC-V support on PolarFire.

Getting a working RISC-V core running on the iCE40 or PolarFire is the first step of the project, but the focus of the project is to improve the latency of I/O access for peripherals.

Soft RISC-V-based CPU core for low-latency I/O on BeagleWire
In the above RISC-V implementations, the I/Os tend to be memory-mapped. Memory reads and writes are multi-cycle operations and can cause delays that violate the timing requirements of certain peripherals. For this reason, in this project I will connect one of the 31 general-purpose registers directly to the I/O pins.
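
The idea can be illustrated with a toy behavioural model (plain Python, not HDL). The choice of x31 as the I/O register and the cycle costs below are assumptions made only for illustration; they are not measurements of any of the cores listed above.

```python
# Toy behavioural model: register-mapped output versus memory-mapped output.
# GPIO_REG, STORE_CYCLES and ALU_CYCLES are hypothetical values.

GPIO_REG = 31       # assumption: x31 is wired straight to the output pins
STORE_CYCLES = 3    # assumption: cost of a memory-mapped store (e.g. `sw`)
ALU_CYCLES = 1      # assumption: cost of a register write (e.g. `addi`)


class ToyCore:
    def __init__(self):
        self.regs = [0] * 32   # x0..x31; x0 is hard-wired to zero
        self.mmio = 0          # memory-mapped output latch
        self.cycles = 0        # running cycle counter

    def write_reg(self, index, value):
        """Write a general-purpose register and charge one ALU op."""
        if index != 0:         # writes to x0 are ignored in RISC-V
            self.regs[index] = value & 0xFFFFFFFF
        self.cycles += ALU_CYCLES

    def store_mmio(self, value):
        """Write the memory-mapped latch and charge a multi-cycle store."""
        self.mmio = value & 0xFFFFFFFF
        self.cycles += STORE_CYCLES

    @property
    def pins(self):
        """Pin state as driven directly by the dedicated register."""
        return self.regs[GPIO_REG]


def toggle_cost(use_register, toggles=4):
    """Cycles spent toggling one output `toggles` times."""
    core = ToyCore()
    level = 0
    for _ in range(toggles):
        level ^= 1
        if use_register:
            core.write_reg(GPIO_REG, level)
        else:
            core.store_mmio(level)
    return core.cycles


print("register-mapped cycles:", toggle_cost(True))
print("memory-mapped cycles:", toggle_cost(False))
```

With these made-up costs, four toggles take 4 cycles through the register path and 12 through the memory-mapped path. The real ratio depends on the chosen core and its bus.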

The following is roughly my plan for reaching this goal. If time permits, I will try to make progress before the submission deadline:
 * 1) Run the PRU example https://www.glennklockwood.com/embedded/beaglebone-pru.html and characterize the latency with a ring buffer (https://pub.pages.cba.mit.edu/ring/)
 * 2) Implement a RISC-V core on the BeagleWire or a PolarFire dev board and characterize the same latency (latency can be measured with a digital probe and/or derived from Fmax and the number of cycles)
 * 3) Go through the HDL for the RISC-V core to verify whether the register block has been instantiated with flip-flops (FFs) or a RAM block; if a RAM block is used, re-implement it with FFs
 * 4) Modify the pin-outs for one of the registers such that the I/Os have direct access to it; modify the logic to write any peripheral state/data to this register
 * 5) Characterize the latency again; if it looks like Fmax can be improved further, that would be the next thing to look at
 * 6) Write the peripheral code in assembly to make sure the compiler doesn't touch the I/O register
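
For the "Fmax and number of cycles" estimate in step 2, the arithmetic is simply latency = cycles / Fmax. A minimal sketch follows; the function name `latency_ns` and the soft-core numbers are hypothetical. The only hardware figure used is the 200 MHz clock of the PRU on the AM335x, which gives 5 ns per cycle.

```python
def latency_ns(cycles, fmax_mhz):
    """Latency in nanoseconds of `cycles` clock cycles at `fmax_mhz` MHz."""
    if fmax_mhz <= 0:
        raise ValueError("fmax_mhz must be positive")
    # One cycle at f MHz lasts 1000 / f nanoseconds.
    return cycles * 1000.0 / fmax_mhz


# Reference point: one cycle of the 200 MHz PRU on the BBB.
print(latency_ns(cycles=1, fmax_mhz=200.0))   # 5.0

# Hypothetical soft core: 4 cycles per I/O access at an assumed 50 MHz Fmax.
print(latency_ns(cycles=4, fmax_mhz=50.0))    # 80.0
```

An estimate like this only bounds the core-side delay; a digital probe measurement would also capture pad and routing delays.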

Supporting multiple peripherals
When multiple peripherals are connected to the PRU and a single register is insufficient, a logic block for multiplexing the I/O pins will be needed. Current thoughts:
 * 1) This part is perhaps not as critical in terms of timing, so I might get away with implementing it in memory
 * 2) An alternative is to reserve 2 bits of the register for peripheral selection (up to 4 peripherals)
 * 3) It is also possible to implement multiple cores on the FPGA, each supporting one peripheral; the Silice project can be used as a starting point for this direction
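
Option 2 can be sketched as a bit-field layout. Placing the selector in the top two bits of a 32-bit register is an assumption made for illustration, and the helper names `pack` and `unpack` are hypothetical.

```python
REG_WIDTH = 32                       # width of a RISC-V (RV32) register
SEL_BITS = 2                         # two bits select one of up to 4 peripherals
SEL_SHIFT = REG_WIDTH - SEL_BITS     # assumption: selector sits in the top bits
SEL_MASK = (1 << SEL_BITS) - 1       # 0b11
DATA_MASK = (1 << SEL_SHIFT) - 1     # remaining 30 bits carry the I/O data


def pack(peripheral, data):
    """Build the register value for `peripheral` (0..3) and its `data` bits."""
    if not 0 <= peripheral <= SEL_MASK:
        raise ValueError("peripheral index out of range")
    return (peripheral << SEL_SHIFT) | (data & DATA_MASK)


def unpack(word):
    """Split a register value into (peripheral, data)."""
    return (word >> SEL_SHIFT) & SEL_MASK, word & DATA_MASK


word = pack(3, 0x1234)
print(hex(word))       # 0xc0001234
print(unpack(word))    # (3, 4660)
```

The cost of this layout is that only 30 bits remain for data, so a peripheral needing the full register width would have to use one of the other options.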

Improving the stand-alone cores on BeagleWire
The previous issues on several subsystems of BeagleWire have been addressed with LiteDRAM, but the issues with the standalone cores are left unresolved. As part of my project, I would like to take a further look into the standalone subsystems.

The tasks I would like to achieve in this project are:
 * 1) Implement an automatic test script for each of the subsystems (SDRAM, SPI, UART)
 * 2) Read the RTL to identify any timing violations, as well as other issues that could have caused the unstable behavior
 * 3) Improve the reliability of the standalone cores by solving any issues identified
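
As one possible shape for the automatic test scripts, the sketch below runs a walking-ones data-bus check, a common memory test pattern. `FakeMemory` is a hypothetical stand-in for the real SDRAM access path, and the 16-bit width is an assumption; a real script would replace it with whatever read/write interface the BeagleWire software exposes.

```python
class FakeMemory:
    """Hypothetical device under test; can simulate one data bit stuck at 0."""

    def __init__(self, width=16, stuck_low_bit=None):
        self.width = width
        self.stuck_low_bit = stuck_low_bit
        self.cells = {}

    def write(self, addr, value):
        if self.stuck_low_bit is not None:
            value &= ~(1 << self.stuck_low_bit)   # inject the fault
        self.cells[addr] = value & ((1 << self.width) - 1)

    def read(self, addr):
        return self.cells.get(addr, 0)


def walking_ones(mem, addr=0, width=16):
    """Write a single 1 to each data bit in turn; return bits that fail."""
    failed = []
    for bit in range(width):
        pattern = 1 << bit
        mem.write(addr, pattern)
        if mem.read(addr) != pattern:
            failed.append(bit)
    return failed


print(walking_ones(FakeMemory()))                  # []
print(walking_ones(FakeMemory(stuck_low_bit=5)))   # [5]
```

The same structure (a small device abstraction plus a pattern generator that reports failures) could be reused for SPI and UART loopback checks.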

Experience and approach

 * I am currently pursuing a PhD in embedded systems and am familiar with the concepts and implementations of low-power processors.
 * I have working experience with Verilog and embedded systems from previous internships and research projects. I also have some experience working with low-latency I/Os on FPGA boards. I completed an undergraduate capstone project and an internship using FPGAs as accelerators for machine learning algorithms.
 * I have experience working with existing code repositories, and I think I can navigate a large code base quite well thanks to previous work and internship experience at large software companies.
 * I have working experience designing PCBs of up to 8 layers in Altium and KiCad.

Contingency
If I get stuck on my project and my mentor isn’t around, I will use the following resources:
 * 1) Getting Started Guide for BeagleBone by Derek Molloy: http://derekmolloy.ie/beaglebone
 * 2) PRU Cookbook:


 * BeagleWire Repo:
 * 1) https://github.com/BeagleWire


 * Documentation and Repo on RISC-V cores:
 * 1) https://github.com/olofk/serv
 * 2) https://github.com/stnolting/neorv32
 * 3) https://github.com/sylefeb/Silice/tree/draft/projects/ice-v

There is a good amount of resources on everything that I plan to implement. If all of the above fail, I will post on the BBB Slack/IRC channel.

Benefit
(I am reusing last year's benefit statement because the goal of the project, and thus its benefit, is essentially the same.) ''The completed project will provide the BeagleBoard.org community with easy to implement and powerful tools for the realization of projects based on Programmable Logic Device(FPGA), which will surely increase the number of applications based on it. The developed software will be easy and, at the same time, efficient tool for communication with FPGA. At this point, FPGA will be able to meet the requirements of even more advanced applications. The BeagleWire creates a powerful and versatile digital cape for users to create their imaginative digital designs.''

''It is largely about advancing RISC-V and learning about a key architecture benefit seen in earlier BeagleBone systems. Note how TI has removed PRU documentation from TDA4VM, despite it being a key value to BeagleBone users? This is about developing a PRU-like CPU that is open source on an open ISA, not about the ideal use of FPGA fabric. But, it could also be a handy way to configure FPGA fabric in a way that users don't need to understand how to generate FPGA code itself, if they can just program the RISC-V core, but that is secondary to the development and analysis of the core itself.''