ECE497 BeagleBone PRU
Embedded Linux Class by Mark A. Yoder
- 1 Grading
- 2 Executive Summary
- 3 Installation Instructions
- 4 User Instructions
- 5 Finding Where to Access Things
- 6 Building and Running the GPIO_PWM_PRU Example
- 7 How the Assembly Code Works
- 8 How the C Code Works
- 9 Sending an array to the PRU
- 10 Building and Running the Sin_Approximation Example
- 11 Highlights
- 12 Theory of Operation
- 13 Work Breakdown
- 14 Future Work
- 15 Conclusions
Grading
I'm using the following template to grade. Each slot is 10 points. 0 = Missing, 5=OK, 10=Wow!
09 Executive Summary (looks good)
05 Installation Instructions (Can't find uio_pruss)
05 User Instructions (Segmentation fault)
10 Highlights (Nice clean tone in 2nd video)
07 Theory of Operation (Good overview. I'd like to see more comment in your PRU code)
05 Work Breakdown (Need to include who did what)
10 Future Work
10 Conclusions
00 Demo (I want to hear the clean sine before filling this in)
10 Late
Comments:
Score: 00/100
Executive Summary
For this project, the objective is to explore the PRU (Programmable Realtime Unit) of the BeagleBone, looking at both its limitations and how to implement tasks such as pulse width modulation on it. The PRU is a part of the processor that runs at 200 MHz (5 ns per instruction) and is separate from the operating system altogether, making it more efficient at accessing I/O pins. The project is more research intensive than implementation intensive, and serves to bring together the sources found on the BeagleBone's PRU into one abbreviated document with examples of how to use it and ideas for further projects. The ultimate goal is to walk step by step toward representing a sinusoidal wave using pulse width modulation driven from the PRU, and to play the resulting wave through a speaker.
As of now we have gathered information about the PRU, found memory locations that can be edited on the PRU and in C so that we can interact with functions outside of the PRU's capabilities, and implemented code on the PRU that simulates a pulse width modulation on a GPIO pin. We were also able to produce an approximated sinusoidal output on the GPIO pin using pulse width modulation at a specific frequency. For each of these there is an example to follow describing how each part works, and listing any resources to look at to find out more.
We were hoping to look into reading an analog input and reproducing it as an approximated output using the pulse width modulator, but this ended up being too optimistic a goal. It would be an interesting thing to explore for a project that expands upon this one.
Installation Instructions
The GitHub repository is at the following link:
- Hardware: Some LEDs for messing with the GPIO pins and a speaker for listening to PWM approximated sine wave.
Note: When implementing the pulse width modulation, you may want to bias the wave around 0 V instead of 1.65 V. In that case you can use a summing circuit, which requires an op-amp, a 2 kΩ resistor, a 1 kΩ resistor, and two resistors of equal value (higher values are preferable for lower power consumption), connected as shown, where V1 is your PWM voltage, V2 is your -1.65 V bias, and V3 is unused:
Unless you want a louder output than the I/O pins can drive on their own, no additional hardware is needed.
User Instructions
Always run the following before doing anything with the PRU:
beagle$ modprobe uio_pruss
This can be run from any directory on the BeagleBone. It loads the PRU module into the kernel so that the PRU's memory and all of its components are accessible.
Note: modprobe uio_pruss is BeagleBone specific and will not be found on a host computer. uio_pruss is a kernel module rather than a file at a specific location, so the command works from anywhere on the BeagleBone. If you skip this step before following the rest of the instructions, you may run into segmentation faults when trying to initialize the PRU.
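Since a missing module shows up later only as a segmentation fault, it can help to check for it up front. The following is a minimal sketch, not part of the project code: it assumes the module is built as a loadable module and therefore appears in /proc/modules once loaded. The names module_listed and uio_pruss_loaded are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` appears as a module name in text formatted like
 * /proc/modules (one module per line, module name is the first field). */
int module_listed(const char *modules_text, const char *name)
{
    size_t name_len;
    const char *line = modules_text;

    assert(modules_text != NULL && name != NULL);
    name_len = strlen(name);

    while (line != NULL && *line != '\0') {
        /* The name must match exactly and end at a field separator. */
        if (strncmp(line, name, name_len) == 0 &&
            (line[name_len] == ' ' || line[name_len] == '\n' ||
             line[name_len] == '\0'))
            return 1;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;                 /* step past the newline */
    }
    return 0;
}

/* Read /proc/modules and report whether uio_pruss is listed.
 * Returns 1 if listed, 0 if not, -1 if the file cannot be read.
 * Assumption: the module list fits in the buffer below. */
int uio_pruss_loaded(void)
{
    char buf[16384];
    size_t n;
    FILE *f = fopen("/proc/modules", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return module_listed(buf, "uio_pruss");
}
```

A host program could call uio_pruss_loaded() before any prussdrv call and print a reminder to run modprobe when it returns 0.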
Finding Where to Access Things
There are many locations in memory that are needed to access specific I/O pins on the BeagleBone. Some of these I/O pins can be found here:
The following are not found in the file, but are good addresses to know when accessing MUXs:
memory location: gpmc_a2.gpio1_18 (0x44e10848/0x848 = 0x0027), b NA, t NA
mode: OMAP_PIN_INPUT_PULLDOWN | OMAP_MUX_MODE7
signals: gpmc_a2 | gmii2_txd3 | rgmii2_td3 | mmc2_dat1 | gpmc_a18 | pr1_mii1_txd2 | ehrpwm1A | gpio1_18

memory location: gpmc_a3.gpio1_19 (0x44e1084c/0x84c = 0x0027), b NA, t NA
mode: OMAP_PIN_INPUT_PULLDOWN | OMAP_MUX_MODE7
signals: gpmc_a3 | gmii2_txd2 | rgmii2_td2 | mmc2_dat2 | gpmc_a19 | pr1_mii1_txd1 | ehrpwm1B | gpio1_19

memory location: gpmc_ad8.gpio0_22 (0x44e10820/0x820 = 0x0027), b NA, t NA
mode: OMAP_PIN_INPUT_PULLDOWN | OMAP_MUX_MODE7
signals: gpmc_ad8 | lcd_data23 | mmc1_dat0 | mmc2_dat4 | ehrpwm2A | pr1_mii_mt0_clk | NA | gpio0_22

memory location: gpmc_ad9.gpio0_23 (0x44e10824/0x824 = 0x0027), b NA, t NA
mode: OMAP_PIN_INPUT_PULLDOWN | OMAP_MUX_MODE7
signals: gpmc_ad9 | lcd_data22 | mmc1_dat1 | mmc2_dat5 | ehrpwm2B | pr1_mii0_col | NA | gpio0_23
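The value 0x0027 in each entry packs the mux mode and pad options into one register. As a rough sketch, assuming the AM335x pad control layout (mode in bits 2:0, pull disable in bit 3, pull type in bit 4, receiver enable in bit 5), it can be decoded as follows. The struct and function names are made up for this illustration.

```c
/* Fields of an AM335x pad control register value such as 0x0027.
 * Bit positions are assumed from the AM335x reference manual. */
struct pad_config {
    unsigned mode;       /* bits 2:0, mux mode 0..7 */
    int pull_enabled;    /* bit 3 is 0 when the pull resistor is enabled */
    int pull_up;         /* bit 4: 1 = pull-up, 0 = pull-down */
    int input_enabled;   /* bit 5: 1 = input receiver active */
};

struct pad_config decode_pad(unsigned value)
{
    struct pad_config c;

    c.mode = value & 0x7u;
    c.pull_enabled = ((value >> 3) & 1u) == 0;
    c.pull_up = (int)((value >> 4) & 1u);
    c.input_enabled = (int)((value >> 5) & 1u);
    return c;
}
```

Decoding 0x27 this way gives mode 7 with the input enabled and a pull-down active, which agrees with the OMAP_PIN_INPUT_PULLDOWN | OMAP_MUX_MODE7 text in the entries above.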
Building and Running the GPIO_PWM_PRU Example
This example is located in the GPIO_PWM_PRU directory in the AM335x_PRU_BeagleBone git repository, and can be pulled with the following:
beagle$ git clone git://github.com/millerap/AM335x_PRU_BeagleBone
This example uses the gpio and delay loops to approximate a PWM using the user LEDs on the BeagleBone. It is based on an example provided by Lyren Brown and documented by boxysean at
In GPIO_PWM_PRU all of the complicated Makefiles and directories used to make a multitude of examples at once have been stripped away to allow the user to compile one individual program that will run on the PRU.
The readme.txt file in the GPIO_PWM_PRU directory provides a walkthrough for compiling and running blinker on the BeagleBone.
The first step to compiling a program for the PRU is to make sure prussdrv.c is built and up to date. This is the file provided by TI that contains all of the C functions that allow for communication with the PRU. To do this, do the following:
beagle$ cd <directory>/AM335x_PRU_BeagleBone/GPIO_PWM_PRU/interface
beagle$ export CROSS_COMPILE=""
beagle$ make
CROSS_COMPILE is set to "" because this build runs on the BeagleBone itself, while the Makefile is set up by default to cross compile the code from another Linux machine.
Once this is completed, pasm (the PRU assembler in utils/pasm_source) must be built for the BeagleBone's Linux operating system:
beagle$ cd ../utils/pasm_source
beagle$ ./linuxbuild
Note: The above instructions need to be repeated every time the BeagleBone boots up, and these directories should be included with any code that you write for the PRU.
Now, the BeagleBone is ready to compile the example code. Navigate to the example's root directory again:
beagle$ cd ../../
beagle$ make CROSS_COMPILE=""
This will compile the blinker.c file and output it to the bin folder. After this point, the assembly file needs to be compiled into a .bin file. This is done in the bin folder.
beagle$ cd bin
beagle$ make
Now there should be a blinker.bin file in the folder. Running the blinker executable will load the blinker.bin file onto the PRU and start it running. Use the following:
I get a Segmentation fault, but I think that's because I can't do the modprobe
How the Assembly Code Works
In the assembly file bin/blinker.p:
Registers r5 and r6 are the duty_cycle and period respectively. The duty_cycle is a number smaller than the period that the accumulator r4 counts up to before the output is set to zero. When r4 equals the period, r4 resets and the output is set to 1. Each PRU instruction takes 5 ns, which gives the following for OnCycles and OffCycles.
SecondsPerCycle = 5*10^-9
OnCycles = 2 + (duty_cycle)*3 + 2
OffCycles = 2 + (period - duty_cycle)*3 - 1 + 2
TotalCycles = 7 + (period)*3
These equations can be used to create a very exact PWM output by setting duty_cycle and period to the values you wish to use. The code that was compiled and run above has a period of about a second and a duty cycle of about 50%.
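To make the equations concrete, here is a small sketch that evaluates them in C and inverts TotalCycles to find the period value for a desired PWM period in nanoseconds. The function names are illustrative only and do not appear in blinker.p or blinker.c.

```c
#include <assert.h>
#include <stdint.h>

#define PRU_NS_PER_CYCLE 5u   /* 200 MHz PRU clock, per the text above */

/* OnCycles = 2 + duty_cycle*3 + 2 */
uint64_t pwm_on_cycles(uint32_t duty_cycle)
{
    return 2u + (uint64_t)duty_cycle * 3u + 2u;
}

/* OffCycles = 2 + (period - duty_cycle)*3 - 1 + 2 */
uint64_t pwm_off_cycles(uint32_t period, uint32_t duty_cycle)
{
    assert(duty_cycle <= period);
    return 2u + (uint64_t)(period - duty_cycle) * 3u - 1u + 2u;
}

/* TotalCycles = 7 + period*3 */
uint64_t pwm_total_cycles(uint32_t period)
{
    return 7u + (uint64_t)period * 3u;
}

/* Largest period value whose TotalCycles fits in total_ns nanoseconds.
 * Returns 0 when total_ns is too short for even one pass of the loop. */
uint32_t pwm_period_for_ns(uint64_t total_ns)
{
    uint64_t cycles = total_ns / PRU_NS_PER_CYCLE;

    if (cycles < 7u)
        return 0;
    return (uint32_t)((cycles - 7u) / 3u);
}
```

For a one second period, pwm_period_for_ns(1000000000) gives 66666664, and half of that as duty_cycle gives roughly a 50% duty cycle.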
There are a few macros defined at the beginning of the program. These macros are the location of GPIO1's memory space, the location of its set registers and the location of its clear registers. The BeagleBone's GPIO pins must be turned off and on using these two different memory locations. Setting the set register to 0 does not turn off its respective GPIO pin.
r2 stores the value that is going to be written to either the set or the clear GPIO register. r3 stores the address that r2 will be written to. Within the first 3 lines of PWM_ON these values are set such that r2 will turn on the user LEDs. The instruction that actually turns them on is SBBO, which takes the value of r2 and writes it to the memory address held in r3 with an offset of 0.
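The set and clear behaviour can be modelled in a few lines of C. This is only a software model for illustration and does not touch hardware. The register offsets and the user LED bit positions (GPIO1_21 to GPIO1_24) are assumptions taken from the AM335x reference manual and the BeagleBone pinout, not from blinker.p itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offsets within a GPIO bank (GPIO1 is assumed to start at 0x4804C000). */
#define GPIO_CLEARDATAOUT 0x190u
#define GPIO_SETDATAOUT   0x194u

/* Assumed user LED bits: GPIO1_21 through GPIO1_24. */
#define USER_LEDS (0xFu << 21)

/* Software model of one GPIO bank's output latch. */
struct gpio_bank {
    uint32_t dataout;
};

/* Model a 32-bit write to the set or clear register of a bank. */
void gpio_write(struct gpio_bank *bank, uint32_t offset, uint32_t value)
{
    assert(bank != NULL);
    if (offset == GPIO_SETDATAOUT)
        bank->dataout |= value;     /* 1 bits drive pins high, 0 bits are ignored */
    else if (offset == GPIO_CLEARDATAOUT)
        bank->dataout &= ~value;    /* 1 bits drive pins low, 0 bits are ignored */
}
```

In this model, writing 0 to the set register leaves the LEDs on. Only a write of the same bits to the clear register turns them off, which is why the assembly needs both addresses.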
Here is a complete guide to the PRU's Assembly Instructions from TI
How the C Code Works
The following information can be found on TI's PRU Linux Application Loader API Guide wiki:
This lays out every function that can be used in the C code as well as an explanation of its functionality.
The blinker.c file is a direct port of the PRU initialization code from TI. Putting the two side by side, the only difference between the two is the name of the bin file that is used for the exec function.
The code first initializes the PRUSS (Programmable Realtime Unit Subsystem: the entire system of two PRUs, an interrupt controller (INTC), and associated memory) by allocating memory for it using the prussdrv_init() function. It then initializes memory mapping for the PRU using the prussdrv_open() function. All of the intc functions are used for interrupt communication between the ARM and the PRU. This code is not utilized by the examples on this page.
Similar to the exec function in C, the prussdrv_exec_program() function overlays the IRAM (Instruction RAM) portion of the PRUSS with the bin file that was created from blinker.p. The first argument of prussdrv_exec_program is a PRU number, which is either 0 or 1 depending on which PRU core is being used. In this case, PRU0 executes blinker.bin. The second argument is the path to the bin file that will be put into the PRU's IRAM.
The next section waits on event 0 from the PRU to signal the C program that it has completed its execution. This, again, was not implemented, but writing the appropriate bit to the r31 register would cause the C program to continue. As it is, the program stalls at this point until SIGINT is received.
If the correct event were received, the next function would halt the PRU's execution, and the program would then release the PRUSS clocks and disable the prussdrv module.
Sending an array to the PRU
The initialization code provided by TI has a handy function for passing an array to the PRU. Each PRU core has 8 KB of data RAM associated with it, and that data space can be populated from an external C program. The next example makes use of this function to pass different PWM duty cycles to the PRU. It is largely based around the following function:
int prussdrv_pru_write_memory (unsigned int pru_ram_id, unsigned int wordoffset, unsigned int *memarea, unsigned int bytelength);
pru_ram_id can take one of four values:
PRUSS0_PRU0_DATARAM
PRUSS0_PRU1_DATARAM
PRUSS0_PRU0_IRAM
PRUSS0_PRU1_IRAM
Each PRU has both an instruction RAM (IRAM) and a data RAM (DATARAM) section. DATARAM for PRU0 is found at memory locations 0x0 - 0x2000, and DATARAM for PRU1 at memory locations 0x2000 - 0x4000.
wordoffset is an offset in words (4 bytes) from the base memory location, pru_ram_id.
memarea is a pointer to an array of unsigned ints (also 4 bytes) that will be passed onto the PRU.
bytelength is the number of bytes to write to the PRU.
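Because wordoffset counts 4-byte words while bytelength counts bytes, it is easy to mix the two up. The helpers below are a hedged sketch of how a caller might compute the arguments and check that a table stays inside the 8 KB data RAM. Both names are made up here, and the commented call at the end only shows where the values would go.

```c
#define PRU_DATARAM_BYTES 8192u   /* 8 KB per PRU core, per the text above */
#define PRU_WORD_BYTES    4u      /* one unsigned int as seen by the PRU */

/* bytelength argument for a table of `count` 32-bit values. */
unsigned int table_bytelength(unsigned int count)
{
    return count * PRU_WORD_BYTES;
}

/* Return 1 if `count` words written at `wordoffset` stay inside data RAM. */
int table_fits(unsigned int wordoffset, unsigned int count)
{
    unsigned int max_words = PRU_DATARAM_BYTES / PRU_WORD_BYTES;  /* 2048 */

    return wordoffset <= max_words && count <= max_words - wordoffset;
}

/* Intended use (not compiled here, needs TI's prussdrv.h):
 *
 *   if (table_fits(0, count))
 *       prussdrv_pru_write_memory(PRUSS0_PRU0_DATARAM, 0, table,
 *                                 table_bytelength(count));
 */
```

With these limits, a table of at most 2048 values fits when written at word offset 0.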
For more information on using C to initialize the PRU, visit TI's PRU Linux Application Loader API Guide.
Building and Running the Sin_Approximation Example
This example uses a modified version of the GPIO_PWM_PRU example to change the duty cycle every period such that the average voltage approximates a sine wave. Navigate to the pwm_sin directory and take a look at the C code. It is nearly identical to the previous code except for a few small differences. The first difference is that it opens and edits two files to export GPIO0_7 and turn it into an output.
The next change is that the prussdrv_pru_write_memory command discussed above is used to push an array of duty cycles into the data RAM for PRU0. The duty cycles are computed using the sin function from the math.h header file. Here is why:
(VCC*(on_time) + 0*(off_time)) / period = VCC*duty_percent
VCC*duty_percent = Va
Va = VCC*sin(2*PI*f*t/fs)
VCC*duty_percent = VCC*sin(2*PI*f*t/fs)
duty_percent = sin(2*PI*f*t/fs)
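One practical wrinkle is that sin() swings between -1 and 1, while a duty cycle can only lie between 0 and 1. The sketch below assumes the wave is centered at half scale (matching the 1.65 V bias mentioned in the hardware note), so that duty = 0.5 + 0.5*sample. This is an assumption for illustration and may not be exactly how pwm_sin scales its table. The function name is made up.

```c
#include <assert.h>

/* Map a normalized sample in [-1.0, 1.0] to an on-count in [0, period].
 * The caller would produce each sample with sin() from math.h, for example
 * sample = sin(2*PI*f*n/fs) for sample index n. */
unsigned int duty_from_sample(double sample, unsigned int period)
{
    double duty;

    assert(period > 0);
    if (sample > 1.0)
        sample = 1.0;               /* clamp out-of-range input */
    if (sample < -1.0)
        sample = -1.0;

    duty = 0.5 + 0.5 * sample;      /* shift [-1, 1] into [0, 1] */
    return (unsigned int)(duty * period + 0.5);   /* round to nearest count */
}
```

A sample of 0 then maps to a 50% duty cycle, and the extremes map to fully off and fully on.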
Looking at the assembly code, we see a similar PWM control as before, but this time it reads the duty cycles from memory. The code is a little more complicated because the sampling frequency must be timed precisely, but because it runs on PRU0, the data RAM starts at 0x0. If you need to change the sampling frequency, here is how to calculate the total number of instructions you need to delay.
sample_period = 1/sample_frequency
total_number_instructions_to_delay_per_period = sample_period/(number_instructions_delay_loop*5ns)
Then count the number of instructions before or after the loop and knock that many off the delay. This will of course need to be accounted for in the duty cycle, and some sampling frequencies may not offer all duty percentages.
duty_percent = number_on_instruction_delay/total_number_instructions_to_delay_per_period
number_on_instruction_delay + number_off_instruction_delay = total_number_instructions_to_delay_per_period
-> number_on_instruction_delay = duty_percent*total_number_instructions_to_delay_per_period
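As a worked sketch of this calculation, the functions below take the duty cycle as the on fraction of the whole sample period, as in the average-voltage relation earlier. The loop length and overhead counts used in the usage note are hypothetical placeholders. The real values have to be counted from the assembly file.

```c
#include <stdint.h>

#define PRU_CLOCK_HZ 200000000u   /* 5 ns per instruction */

/* PRU cycles available in one sample period (rounded down). */
uint32_t cycles_per_sample(uint32_t sample_hz)
{
    return sample_hz == 0 ? 0 : PRU_CLOCK_HZ / sample_hz;
}

/* Delay-loop iterations per sample period.
 * loop_instr:     instructions in one pass of the delay loop
 * overhead_instr: instructions executed outside the loop each period */
uint32_t delay_iterations(uint32_t sample_hz, uint32_t loop_instr,
                          uint32_t overhead_instr)
{
    uint32_t cycles = cycles_per_sample(sample_hz);

    if (loop_instr == 0 || cycles <= overhead_instr)
        return 0;
    return (cycles - overhead_instr) / loop_instr;
}

/* Iterations spent with the pin on, for a duty cycle given in percent. */
uint32_t on_iterations(uint32_t total_iterations, uint32_t duty_percent)
{
    if (duty_percent > 100u)
        duty_percent = 100u;
    return (uint32_t)(((uint64_t)total_iterations * duty_percent) / 100u);
}
```

At 133 kHz there are 1503 cycles per sample. With a hypothetical 3-instruction loop and 9 instructions of overhead that leaves 498 iterations, of which 124 are on at a 25% duty cycle. This also shows why some sampling frequencies cannot offer every duty percentage: the on time can only change in whole iterations.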
Of course a few other numbers will have to be adjusted, such as the number of samples to read from memory. Because it has to read in 4 bytes of data, this will end up being:
This way the number will reset to 0 as soon as it goes over the limit of memory to be read.
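The wraparound described here can be sketched as below. This is a hypothetical illustration of the idea only and is not the expression used in the example's assembly: it steps a byte offset through the table 4 bytes at a time and returns to 0 once the end of the stored samples is reached.

```c
#include <stdint.h>

#define SAMPLE_BYTES 4u   /* each stored duty cycle is one 4-byte word */

/* Advance a byte offset into the sample table, wrapping to 0 at the end. */
uint32_t next_sample_offset(uint32_t offset, uint32_t num_samples)
{
    uint32_t limit = num_samples * SAMPLE_BYTES;   /* first byte past the table */

    offset += SAMPLE_BYTES;
    if (offset >= limit)
        offset = 0;
    return offset;
}
```

For a table of three samples the offsets cycle through 0, 4, 8 and then back to 0.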
To build this example, follow the same procedure as before, and wire GPIO0_7 (pin 42 on the P9 header) through a speaker to ground. The output starts as a 367 Hz approximation and can be changed by typing a number from 67 to 66000 into the terminal and pressing enter. (NOTE: Non-numeric input may break it, so enter digits only.) The 133 kHz sampling is handled automatically for any frequency in that range.
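Since stray text on the terminal can upset the program, one way to guard the input is to validate each line before using it. The following is a minimal sketch using strtol from the C standard library. The function name and the -1 error convention are choices made for this illustration, and the limits are the 67 to 66000 range stated above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define FREQ_MIN_HZ 67L
#define FREQ_MAX_HZ 66000L

/* Parse a frequency in Hz from one line of terminal input.
 * Returns the frequency, or -1 if the line is not a plain number
 * inside the supported range. */
long parse_frequency(const char *line)
{
    char *end;
    long hz;

    assert(line != NULL);
    errno = 0;
    hz = strtol(line, &end, 10);
    if (end == line || errno == ERANGE)
        return -1;                  /* no digits at all, or overflow */

    /* Allow only trailing whitespace, such as the newline from fgets. */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return -1;                  /* trailing text such as "440hz" */

    if (hz < FREQ_MIN_HZ || hz > FREQ_MAX_HZ)
        return -1;
    return hz;
}
```

A caller could read a line with fgets, pass it to parse_frequency, and simply ignore the line when the result is -1.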
Highlights
During the project we were able to play an approximated 880 Hz sine wave by varying a pulse width modulation duty cycle to approximate a DC voltage output, as you would with an LED dimmer. You can see this in action, along with a helpful tip, in the YouTube video:
The first video sounds clipped, but the second one sounds very clean! Good job.
Theory of Operation
In the first examples, you can see that the GPIO can be toggled on and off simply by editing locations in memory from the PRU. You simply set how long you want the LED to be on and how long you want the LED to be off, and delay the time between on and off to create the desired latency.
In the following example, the premise changes slightly. The operation of our code is simple: the PRU offers a precise delay of 5 ns per instruction. With this we were able to build a delay of instructions that sets the sampling frequency, and within each sample period we were able to have a set number of instructions for which the GPIO was on and a set number for which it was off. Also, memory could be set from a C program and then read by the PRU. This came in handy when approximating the sine wave, because C offers the math.h header file, which includes the sin function and can produce approximate values to send to the PRU. The on and off delays could be set dynamically every time the loop began by reading the next word of data stored in memory, thus creating an average voltage that approximated a playable sine wave.
Refer to How the C Code Works, How the Assembly Code Works, and Building and Running Sin_Approximation for more details.
Work Breakdown
Also include who did what.
10/22: We should have all research done. Update documentation with every Milestone.
10/26: We should be able to show something, an example or simple implementation.
10/29: Ability to send different lengths to turn on an LED.
10/31: Ability to send different lengths to multiple LEDs.
11/2: We should be able to demo our overall work, possibly have some things to fix before presentation.
11/4: Finalize presentation
Most of our research has come from internet resources listed below:
- TI PRU Resources
- Example for Running Code on the PRU
- PRU Assembly Instructions
- Initializing PRU in C
- AM335X Datasheet
Future Work
For future work there are a few interesting features that we were not able to get to because of limited time and the amount of research needed just to get started. First, we had difficulty accessing things such as the PWM and analog in ports. These could be explored further given the documents that we have dug up and some exploration on Google. Second, we wanted to read audio from the analog input and adjust the duty_cycle of the PWM accordingly to produce an approximate audio output, which is the next step from what we have done here. Third, we wanted to explore interrupts on the PRU, but were unable to find enough documentation to get an example working. (The 'C' code has this comment: /* Wait for event completion from PRU */. Does it wait for the PRU?) So PWM, analog in, audio capabilities, and interrupts are the possible things to look into. Any other time critical operations can also be explored further with the BeagleBone PRU, because it has a delay of exactly 5 ns for every instruction.
Conclusions
So, if you need precise timing, or more rapid access to a certain GPIO pin, this is a route you might want to look into. There are a few suggestions listed above that might be interesting to see come out of using the PRU. However, if you do not require precisely timed events or faster access to GPIO pins, you might want to consider just using C on the main processor. Much of the information needed to access certain parts of the PRU and the hardware from the PRU is either very vague, or very difficult to dig up, and because the PRU is not widely used, it is difficult to find people that can offer information on the topic.