Difference between revisions of "EBC Exercise 18 Using the DSP for Audio Processing"

From eLinux.org

Revision as of 08:11, 25 August 2011


In the previous exercise you saw how to bring audio into the Beagle and send it out again. You also did some processing on the audio. All this was done on the ARM processor. The DM3730 on the BeagleBoard has both an ARM processor and a C64x fixed-point DSP. This exercise shows you how to use the DSP via C6Run. C6Run is a set of tools which will take in C files and generate either an ARM executable, or an ARM library which will leverage the DSP to execute the C code.

There are two uses of C6Run, exposed through two different front-end scripts. They are called C6RunLib and C6RunApp. We focus on C6RunLib here.

Examine the Files

  • Copy the AudioThru files from here to your Beagle.
  • Change directories to AudioThru/lab06d_audio_c6run.
# cd lab06d_audio_c6run
# ls

These are the same files as used in the previous audio thru lab. The files audio_process.c and audio_process.h are new. audio_thread.c has been changed slightly. Makefile is completely different.

Edit audio_thread.c and search for audio_process. There are two occurrences. I'll talk about the first one later; go to the second.

$ gedit audio_thread.c

	audio_process((short *)outputBuffer, (short *)inputBuffer, blksize/2);

I've replaced the call to memcpy() with the call to audio_process(). Pointers to the input and output buffers are passed along with the number of samples in the buffers. Note that blksize is the number of 8-bit chars in the buffers. We're working with 16-bit samples, so the pointers are cast to short * and the count passed is blksize/2.

Look at audio_process.c. Presently all it does is a memcpy() when called.
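A minimal sketch of what such a pass-through function might look like is shown here. The parameter names, the const qualifier, and the int return type are assumptions made for illustration; check audio_process.c and audio_process.h for the actual declaration.

```c
#include <string.h>

/* Hypothetical pass-through version of audio_process().
 * The signature is an assumption, not copied from the lab files. */
int audio_process(short *outputBuffer, const short *inputBuffer, int samples)
{
    /* samples counts 16-bit values, so convert back to bytes for memcpy(). */
    memcpy(outputBuffer, inputBuffer, (size_t)samples * sizeof(short));
    return 0;
}
```

Because the count is given in samples, a call such as audio_process(out, in, blksize/2) copies exactly blksize bytes.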

Make

Before the first make the paths need to be set up. Do this:

$ source ~/c6run_build/environment.sh
$ source ~/c6run_build/loadmodules.sh

We are now dealing with two C compilers: one for the ARM and the other for the DSP. The first source above sets PATHs for each of the compilers. Take a look at environment.sh to see the details. The most interesting part is PLATFORM_CFLAGS, which we'll discuss later. The second source (loadmodules.sh) loads some kernel modules that are needed to support the DSP. Take a look at it; we'll explain some of it later.

Be sure to source the environment.sh file every time you start a new terminal. Source the loadmodules.sh file every time you reboot your Beagle.

Run make, but do it like this so you can see what it creates:

$ make clean
$ ls
$ time make
$ ls -sh

My make takes about 20 seconds to compile everything. What new things do you see?

There are 3 groups of files here:

  1. The source code (*.c and *.h),
  2. object files for running on the ARM (gpp) only (audioThru_arm, gpp, gpp_lib) and
  3. object files for running on the DSP (audioThru_dsp, dsp, dsp_lib).

What's significant is that the same source code produced both sets of objects (dsp, gpp). How the source is compiled determines where it runs. If you run ./audioThru_arm the code runs only on the ARM. Try it. It should run just like before.

Running the DSP

Now try running ./audioThru_dsp. This code runs on the ARM, mostly. The function(s) in audio_process.c run on the DSP. Here's what's happening.

  1. When you run ./audioThru_dsp, main.c and audio_thread.c run on the ARM as before.
  2. C6Run has inserted a stub for audio_process() so that when it is called the first time on the ARM, it checks to see if the DSP has been initialized. If not, the ARM loads the code for audio_process() on the DSP and then passes the parameters to the DSP and tells it to run the code.
  3. Notice the line: Starting DSP...1.5XXX s. This is printed after the first call to audio_process(). Starting the DSP takes about 1.5s or so. Fortunately it only has to start once.
  4. The ARM waits for the code to complete on the DSP (i.e., the calling thread blocks, so other processes on the ARM can run).
  5. Once the DSP completes, the ARM continues running.
  6. Subsequent calls to audio_process() need only pass the parameters and tell the DSP to go.
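
The control flow in the steps above can be sketched in plain C. This is only an illustration of the "start once, then just call" pattern and runs entirely on one processor. Every name in it (dsp_started, fake_dsp_start, fake_dsp_run, audio_process_stub) is hypothetical and is not the code that C6Run generates.

```c
#include <stdio.h>
#include <string.h>

/* All names below are hypothetical; they only mimic the control flow. */
static int dsp_started = 0;     /* has the one-time start-up happened? */
static int dsp_start_count = 0; /* how many times start-up has run */

/* Stand-in for loading the DSP code and starting the DSP (the slow step). */
static void fake_dsp_start(void)
{
    dsp_start_count++;
    dsp_started = 1;
    printf("Starting DSP...\n");
}

/* Stand-in for the work the DSP does on the shared buffers. */
static void fake_dsp_run(short *out, const short *in, int samples)
{
    memcpy(out, in, (size_t)samples * sizeof(short));
}

/* Stub with the same shape as audio_process(): initialize on first use,
 * then pass the parameters along and wait for the result. */
void audio_process_stub(short *out, const short *in, int samples)
{
    if (!dsp_started) {
        fake_dsp_start();           /* only happens on the first call */
    }
    fake_dsp_run(out, in, samples); /* every call, including the first */
}
```

Calling audio_process_stub() repeatedly prints the start-up message only once, which mirrors why the 1.5 s start-up cost is paid only on the first call.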

Details on Using C6Run

So we have a working example of how to use the DSP, but a lot of details have been skipped. Here are some of them.

Inside Makefile

So how did the Beagle know what to run on the ARM and what to run on the DSP? The answer is in the Makefile. Take a look at it. The first section sets up the PATHs and FLAGS for the ARM compiler.

#   ----------------------------------------------------------------------------
#   Name of the ARM GCC compiler & archiver
#   ----------------------------------------------------------------------------

The next section does the same for the DSP compiler. One interesting flag is --C6Run:replace_malloc. We'll discuss it in the next section.

#   ----------------------------------------------------------------------------
#   Name of the DSP C6RUN compiler & archiver
#   ----------------------------------------------------------------------------

The next section is the important one.

#   ----------------------------------------------------------------------------
#   List of source files
#   ----------------------------------------------------------------------------
# List the files to run on the ARM here
EXEC_SRCS := main.c audio_input_output.c audio_thread.c 
EXEC_ARM_OBJS := $(EXEC_SRCS:%.c=gpp/%.o)
EXEC_DSP_OBJS := $(EXEC_SRCS:%.c=dsp/%.o)

# List the files to run on the DSP here
LIB_SRCS := audio_process.c
LIB_ARM_OBJS := $(LIB_SRCS:%.c=gpp_lib/%.o)
LIB_DSP_OBJS := $(LIB_SRCS:%.c=dsp_lib/%.o)

Here is where you tell which files run on the ARM and which on the DSP. EXEC_SRCS is a list of the .c files that run on the ARM. LIB_SRCS is the list for the DSP. It's that easy.

Further down you see the rules for building the ARM only code. Look them over until you understand what they are doing.

#   ----------------------------------------------------------------------------
#   Rules for build and ARM (gpp) only target 
#   ----------------------------------------------------------------------------

The next section is for the DSP.

#   ----------------------------------------------------------------------------
#   Rules for build and ARM/DSP (dsp) target 
#   ----------------------------------------------------------------------------

Notice the ARM_CC is used for the files listed in EXEC_DSP_OBJS and the C6RUN_CC is used for those in LIB_DSP_OBJS.

The last thing to note is at the very end. If the variable DUMP is defined, the values of many of the Makefile variables are displayed.

Sharing Memory

The ARM and the DSP share memory, so when we called audio_process() all we had to do was pass pointers to the buffers we wanted to process. There was no need to copy from one processor to another, so there is very little overhead. However, there are some details that were handled for you that you need to know.

The ARM uses a memory management unit (MMU) that maps virtual addresses to physical addresses. The DSP doesn't have an MMU. That means the pointers on the ARM (outputBuffer, inputBuffer) point to a virtual address and the pointers on the DSP (outputBuffer, inputBuffer) point to physical addresses, which probably aren't the same. C6Run automatically provided the code needed to map from the virtual address space to the physical.

But there is a bigger problem. outputBuffer and inputBuffer were allocated at run time using the standard C routine malloc. malloc allocates contiguous memory of the desired size; however, that memory is contiguous only in the virtual address space and is probably not contiguous in the physical address space. This causes problems for the DSP, which works with physical addresses.
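
To see why contiguity matters, it helps to count how many pages a buffer touches, since the MMU can place each virtual page in an unrelated physical location. The sketch below assumes a 4096-byte page size (a common value on ARM Linux, not stated in this exercise); pages_spanned is a hypothetical helper, not part of C6Run.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; typical for ARM Linux but not confirmed by this exercise. */
#define PAGE_SIZE 4096u

/* Return how many pages are touched by a buffer that starts at virtual
 * address addr and is len bytes long. Each of these pages may be mapped
 * to a different, non-adjacent physical page. */
size_t pages_spanned(uintptr_t addr, size_t len)
{
    uintptr_t first;
    uintptr_t last;

    if (len == 0) {
        return 0;
    }
    first = addr / PAGE_SIZE;             /* page holding the first byte */
    last = (addr + len - 1) / PAGE_SIZE;  /* page holding the last byte */
    return (size_t)(last - first + 1);
}
```

Any buffer for which this returns more than 1 can be split across separate physical pages, so a single physical start address is not enough to describe it to the DSP.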

Explore the Object Files

Explore More Examples

Assignment - Experiment with the code

Now that you have something working, play around a bit. git is installed so you can preserve the present contents of the files with:

# git add Makefile audio_input_output.c audio_process.c audio_thread.c
# git commit -m "Initial commit"

If needed you can use git to retrieve the original version of the files.

Things to try:

  • There are places in the code where timing can be displayed. Remove the comments and display the times. How often is the main loop executed? How long does the DSP take? What's the overhead for the DSP?
  • Try making the DSP do more than pass through. Zero out the left channel to be sure it is working.
  • Try changing the sampling rate and buffer sizes. What settings cause the buffers to overflow or underflow?
  • Implement your own processing on the DSP. Do a simple FIR lowpass filter, etc.
  • Switch the input to the microphones on the web cam and listen to your voice.
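
As a starting point for the "zero out the left channel" experiment, here is a minimal sketch. It assumes the buffers hold interleaved 16-bit stereo with the left sample first in each pair; that layout, the function name audio_process_mute_left, and its signature are assumptions, so verify them against the audio setup code before relying on it.

```c
/* Hypothetical variant of the processing function.
 * Assumes interleaved stereo: index 0 = left, 1 = right, 2 = left, ... */
int audio_process_mute_left(short *outputBuffer, const short *inputBuffer,
                            int samples)
{
    int i;

    /* Step through the buffer one left/right pair at a time. */
    for (i = 0; i + 1 < samples; i += 2) {
        outputBuffer[i] = 0;                      /* silence the left channel */
        outputBuffer[i + 1] = inputBuffer[i + 1]; /* pass the right channel through */
    }
    return 0;
}
```

If the silence shows up in the other ear, the channel order is the reverse of what is assumed here.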