ECE597 WooDoo VoiceBox
Embedded Linux Class by Mark A. Yoder
- 1 Grading Template
- 2 Executive Summary
- 3 Installation Instructions
- 3.1 List of required hardware
- 3.2 Setting up Network on RPi
- 3.3 Installing Network on Beaglebone Black
- 3.4 Setting up Network on Beaglebone
- 3.5 Installing Jasper on raspberry Pi
- 3.6 Setting up wireless light control with MRF24J40MA
- 3.7 Setting up wireless light control with Xbee
- 3.8 Advanced Configuration
- 4 User Instructions
- 5 Highlights
- 6 Theory of Operation
- 7 Completed Tasks
- 8 Future Work
- 9 Conclusions
Grading Template
I'm using the following template to grade. Each slot is 10 points. 0 = Missing, 5 = OK, 10 = Wow!
09 Executive Summary - Good concise review of the project's path
10 Installation Instructions
09 User Instructions
05 Highlights
09 Theory of Operation
10 Work Breakdown
09 Future Work
05 Conclusions
10 Demo
10 Not Late
Comments: See red comments below.
Score: 86/100
Executive Summary
Our project has branched out along several paths since its inception. The goal was to create a personal assistant that could listen to commands, talk back, and carry them out. For modularity, our original plan was to use IBM's Node Red to accomplish this. However, Node Red is not designed for the complicated workflows that we needed, and extensions were not scalable. In this document we describe three things:
- A simple Node Red speech recognition and synthesis system
- Using the open source Jasper project on Raspberry Pi
- Using the open source Jasper project on Beaglebone Black
The simple Node Red speech system will essentially listen to the user and then repeat back what is said. Currently the speech synthesis is working, and we are trying to integrate PocketSphinx for the speech recognition.
Jasper was developed with the Raspberry Pi, and as a result has a very simple setup procedure. Our goal is to use this platform to begin developing modules. We currently have working versions of Jasper installed and have 2 modules developed.
There are no reliable installation instructions for setting up Jasper on the Beaglebone Black. We have done the research, and are hoping to be able to provide concise installation instructions and port over our work.
Installation Instructions
List of required hardware
|Raspberry Pi|Beaglebone Black|
|USB mic, Edimax USB wi-fi|USB audiocard, USB/Audio-In mic|
|Network, power supply, speakers (both boards)|
- Note: Internet connectivity is an important part of the project as we are using the Google Speech API for voice recognition in the current revision.
- Note 2: Additional hardware is required for some custom modules to function.
Setting up Network on RPi
There are two options for internet access: a wired connection and a wi-fi connection. Both are easy to manage with the wicd connection manager and its wicd-curses user interface.
To install wicd, plug in the Ethernet cable (with a fresh Raspbian installation the wired connection should work right out of the box) and then run:
sudo apt-get update
sudo apt-get install wicd
sudo apt-get install wicd-curses
Alternatively, edit the /etc/network/interfaces file so that it includes all of the following:
auto lo
iface lo inet loopback
iface eth0 inet dhcp
allow-hotplug wlan0
auto wlan0
iface wlan0 inet dhcp
wpa-ssid "ssid"
wpa-psk "password"
This also makes the board connect automatically to the specified wireless network at startup.
Installing Network on Beaglebone Black
Setting up Network on Beaglebone
Complete the following instructions, which will guide you through the manual steps for installing Jasper:
After you have completed and tested the installation, run the following commands to install pip and the Adafruit GPIO library:
sudo apt-get update
sudo apt-get install build-essential python-dev python-setuptools python-pip python-smbus -y
sudo pip install Adafruit_BBIO
Installing our custom Modules to demonstrate Jasper
git clone https://github.com/dmitryvv/woodoo
cd woodoo
chmod +x install_modules_BBB.sh
./install_modules_BBB.sh
Installing Jasper on raspberry Pi
Follow the instructions provided on the documentation page to install Jasper.
The modules we have implemented use the Python RPi.GPIO library. Run the following commands to install it:
sudo apt-get update
sudo apt-get install python-dev
sudo apt-get install python-rpi.gpio
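Once the library is installed, a quick sanity check is to drive a single pin. The sketch below is not taken from the woodoo modules: the helper name set_pin and the FakeGPIO stand-in are made up for illustration, and the stand-in only exists so that the same function can be exercised on a machine without GPIO hardware.

```python
class FakeGPIO(object):
    """Stand-in that mimics the few RPi.GPIO names used below (dry runs only)."""
    BCM, OUT, HIGH, LOW = "BCM", "OUT", 1, 0

    def __init__(self):
        self.mode = None
        self.state = {}              # pin number -> last level written

    def setmode(self, mode):
        self.mode = mode

    def setup(self, pin, direction):
        self.state[pin] = self.LOW

    def output(self, pin, level):
        self.state[pin] = level


def set_pin(gpio, pin, on):
    """Configure `pin` as an output and drive it high (on) or low (off)."""
    gpio.setmode(gpio.BCM)           # Broadcom pin numbering
    gpio.setup(pin, gpio.OUT)
    gpio.output(pin, gpio.HIGH if on else gpio.LOW)
```

On the Raspberry Pi itself, the real library can be passed in place of the stand-in: import RPi.GPIO as GPIO, then call set_pin(GPIO, 18, True). Pin 18 is an arbitrary example, and the script normally has to be run with sudo.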
Installing our custom Modules to demonstrate Jasper
git clone https://github.com/dmitryvv/woodoo
cd woodoo
chmod +x install_modules_RPi.sh
./install_modules_RPi.sh
Setting up wireless light control with MRF24J40MA
Section needs work, skip for now.
One of the custom modules is wireless light control based on the IEEE 802.15.4 standard, which is commonly used for home automation.
For the module to function properly, both a transmitter and a receiver must be assembled. The hardware and software instructions follow.
The Rx Unit is currently based on an Arduino Duemilanove, an MRF24J40MA wireless module, a relay, a power regulator, and a number of resistors and capacitors. See figure 1 for the schematic. To load the right firmware onto the Arduino:
- Install Arduino IDE
- Open the Arduino IDE and add the mrf24j library using Sketch -> Import Library. The library itself lives in Wireless_Light/arduino as a zip file.
- Open the arduino file from Wireless_Light/arduino/mrf-rx-tx and upload it to the board.
- Receiver should be ready to go.
Tx Unit is the same MRF24J40MA transceiver but on the RPi side. Follow the steps below:
- Configure RPi to see SPI plus python I2C extension
- Run Wireless_Light/rpi/mrf24j_radio.py from the repo to test everything.
- Enable the module.
Setting up wireless light control with Xbee
This is a tested module that works correctly.
Use an FTDI cable to connect to the XBee:
- Download XCTU utility
- ZigBee needs to be burned with latest RS 232 ADAPTER firmware for the right board family
- Make sure that API Mode is enabled in the register list in XCTU
- Flow control is optional
This Rx Unit is based on an Arduino Duemilanove, an XBee X24 wireless module, an RGB LED, and a number of resistors and capacitors. To load the right firmware onto the Arduino:
- Install Arduino IDE
- Open the Arduino IDE and add the xbee library using Sketch -> Import Library. The library itself lives in Wireless_Light/arduino as a zip file.
- Open the arduino file from Wireless_Light/arduino/xbee and upload it to the board.
- Receiver should be ready to go.
Then follow the schematic diagram on the right to wire the XBee module and the RGB LED (assuming common ground). The pin assignments are listed in the Arduino sketch.
Tx Unit is the same XBee transceiver but on the RPi side. Follow the steps below:
- install_modules_RPi.sh will do everything for you.
The corresponding schematic is shown on the right.
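For readers wondering what enabling API Mode changes on the wire: in that mode every message exchanged with the XBee over the serial port is wrapped in a frame consisting of a 0x7E start byte, a two-byte length, the frame data, and a one-byte checksum. The sketch below builds such a frame by hand. It is only an illustration of the framing, not the code used by the module; the function names are made up, and the 16-bit-address TX request (frame type 0x01) shown here applies to 802.15.4 firmware, while other firmware families use different frame types.

```python
def api_frame(frame_data):
    """Wrap frame data in the XBee API envelope: 0x7E, length, data, checksum."""
    data = bytearray(frame_data)
    length = len(data)
    # Checksum: 0xFF minus the low byte of the sum of all frame-data bytes.
    checksum = 0xFF - (sum(data) & 0xFF)
    header = bytearray([0x7E, (length >> 8) & 0xFF, length & 0xFF])
    return bytes(header + data + bytearray([checksum]))


def tx16_request(dest_addr, payload, frame_id=1):
    """Frame data for a TX request to a 16-bit address (frame type 0x01)."""
    body = bytearray([
        0x01,                        # frame type: TX request, 16-bit address
        frame_id,                    # non-zero asks the radio for a status reply
        (dest_addr >> 8) & 0xFF,     # destination address, high byte
        dest_addr & 0xFF,            # destination address, low byte
        0x00,                        # options: none
    ])
    return body + bytearray(payload)
```

A call such as api_frame(tx16_request(0x5001, b"Hello")) yields the bytes that would be written to the serial port. The destination address and payload are placeholders; the actual payload format expected by the receiver is defined in the Arduino sketch.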
Advanced Configuration
By default Jasper uses PocketSphinx for voice recognition. We found its recognition quality to be low, so we decided to move to Google Speech.
- Follow the directions here to get a public API key for Jasper.
- From the jasper directory, navigate to the clients directory and run python populate.py
- Provide the requested information. When prompted for the STT engine, enter 'google', then enter your API key when asked.
- Check that everything is good to go by running Jasper. It is common to see a few errors during the first startup, but don't panic: there is a very helpful Google Group and FAQ maintained by the Jasper authors.
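To tell which recognizer a given installation will use, one option is to inspect the profile that populate.py writes. The sketch below assumes the profile has been loaded into a dictionary and that the choice is stored under an stt_engine entry, with the Google key under keys and GOOGLE_SPEECH. Those key names, and the helper name stt_summary, are assumptions and should be checked against the generated profile file.

```python
def stt_summary(profile):
    """Report which speech-to-text engine a profile dictionary selects.

    The key names used here (stt_engine, keys, GOOGLE_SPEECH) are assumptions
    about the profile layout, not confirmed by this page.
    """
    # Assumed default when nothing is configured: PocketSphinx.
    engine = profile.get("stt_engine", "sphinx")
    if engine == "sphinx":
        return ("sphinx", "offline, limited to the configured vocabulary")
    if engine == "google":
        api_key = (profile.get("keys") or {}).get("GOOGLE_SPEECH")
        if not api_key:
            return ("google", "selected, but no API key is stored")
        return ("google", "online, needs a network connection")
    return (engine, "unknown engine")
```

Switching back to PocketSphinx would then amount to re-running populate.py and choosing the default engine, or editing that one entry in the profile.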
User Instructions
To start Jasper, navigate to the jasper directory and run the following command
Jasper will initialize everything and, if everything is connected properly, prompt you for a command. The typical use case is to say "Jasper" to make it begin listening, wait for a high-pitched beep, say your command, and then wait for a lower-pitched beep as confirmation.
You: Jasper
Jasper: *BEEP*
You: What time is it.
Jasper: *BEEP*
Jasper: It's currently now.
Highlights
Could you add some text highlight too. How well does it work? (X)
Here is a little YouTube demo video of how the setup recognizes commands. The video shows us using the command we developed for changing the LED color over wireless communication. In the video you can see that the command worked properly both times; however, on the second attempt it took two tries to get the system to listen.
When will this arrive?
Jerry Talyor of Mansfield, IN is going to make a fantastic wooden laser cut and laser engraved box for us.
Supported Voice Commands
- <color> ONE and <color> ZERO where color = [RED, GREEN, BLUE]
This command works well most of the time. It has some difficulty recognizing ONE, often mistaking it for ON, but it usually still turns the LED on and off as commanded.
- ROSE SCHEDULE will tell you the next class
This command also works very well. Because it consists of two fairly long, distinctive words, it is recognized correctly almost every time.
- TOGGLE ON <pin number> and TOGGLE OFF <pin number>
This last command has a lot of trouble being recognized because of the 'ON' and 'OFF' keywords. They are often mistaken for other words like 'OF', 'NO', and 'ONE'.
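The command grammar above can be sketched as a pair of regular expressions. This is an illustration rather than the code in the woodoo modules: the function name parse_command is made up, accepting ON as a stand-in for ONE is just one possible workaround for the mishearing described above, and the sketch assumes the recognizer returns the pin number as digits rather than as a spelled-out word.

```python
import re

# "<color> ONE" / "<color> ZERO"; ON is tolerated as a mishearing of ONE.
COLOR_RE = re.compile(r"\b(RED|GREEN|BLUE)\s+(ONE|ON|ZERO)\b", re.IGNORECASE)
# "TOGGLE ON <pin number>" / "TOGGLE OFF <pin number>"
TOGGLE_RE = re.compile(r"\bTOGGLE\s+(ON|OFF)\s+(\d+)\b", re.IGNORECASE)


def parse_command(text):
    """Return ('color', name, on) or ('toggle', pin, on); None if no match."""
    match = COLOR_RE.search(text)
    if match:
        color = match.group(1).upper()
        return ("color", color, match.group(2).upper() != "ZERO")
    match = TOGGLE_RE.search(text)
    if match:
        pin = int(match.group(2))
        return ("toggle", pin, match.group(1).upper() == "ON")
    return None
```

A module's isValid function could then simply report whether parse_command returns something other than None, and its handle function could act on the returned tuple.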
Currently, if the NodeRed module is enabled, all recognized speech is fed to NodeRed. For the moment the text is stored in a temp file; that file can be watched with the tail node in NodeRed, and its output piped into the rest of the flow and deployed. This works consistently well, although it is not an ideal solution.
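A minimal sketch of such a module is shown below. It follows the usual shape of a Jasper module (a WORDS list plus isValid and handle functions), but it is not the module from the repository: the file location is hypothetical, and the extra path argument exists only so the function can be tried against a scratch file.

```python
import os
import tempfile

# Hypothetical location; the NodeRed tail node would watch the same path.
LOG_PATH = os.path.join(tempfile.gettempdir(), "jasper_heard.txt")

WORDS = []    # no fixed vocabulary: the module accepts any recognized text


def isValid(text):
    """Claim every non-empty transcription."""
    return bool(text.strip())


def handle(text, mic, profile, path=LOG_PATH):
    """Append one line per utterance so a tail-style reader picks it up."""
    with open(path, "a") as log:
        log.write(text.strip() + "\n")
```

Because isValid accepts everything, the position of this module in Jasper's search order decides whether other modules still get a turn. That is one reason a native interface between the two systems would be preferable to a shared file.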
I don't see anything in your wiki about setting up Node Red. How do I set it up?
Theory of Operation
Our project uses the open source Jasper project as its backbone. The system uses one of two voice recognition systems to analyze the input from the microphone. The first, the default for Jasper, is PocketSphinx, an open source voice recognition system. PocketSphinx is initialized with a library of words; it listens only for those specific words and attempts to find the best fit between a given sound and the words it knows. This provides fast, real-time responses that don't require an internet connection.
The second method, which requires only a small configuration change in Jasper, is to switch to the Google Speech API. When using Google Speech, the system takes the discrete sound clips and sends them to Google, which in turn returns an array of potential matches sorted by confidence. Jasper takes the most likely result and uses it to determine the command.
Jasper then takes the recognized text and searches through the installed modules. Each module has an isValid function that takes the text as an argument; if this function returns true, Jasper runs the handle method in the module and stops searching. Jasper searches through the modules first in order of priority, then by name.
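The selection and dispatch steps described above can be sketched in a few lines. This is a simplified model, not Jasper's own code: the Module class, best_transcript, and dispatch are illustrative names, real modules are Python files rather than objects, and treating a larger priority number as "searched first" is an assumption.

```python
class Module(object):
    """Minimal stand-in for a Jasper module."""

    def __init__(self, name, keywords, priority=0):
        self.name = name
        self.keywords = keywords
        self.priority = priority

    def isValid(self, text):
        # True when any keyword appears as a word in the recognized text.
        words = text.upper().split()
        return any(keyword in words for keyword in self.keywords)

    def handle(self, text):
        return "%s handled %r" % (self.name, text)


def best_transcript(hypotheses):
    """Pick the most likely text from (text, confidence) pairs."""
    return max(hypotheses, key=lambda pair: pair[1])[0]


def dispatch(text, modules):
    """Try modules by descending priority, then by name; first match wins."""
    ordered = sorted(modules, key=lambda m: (-m.priority, m.name))
    for module in ordered:
        if module.isValid(text):
            return module.handle(text)   # stop searching after the first hit
    return None                          # no module claimed the text
```

With PocketSphinx there is a single transcription, so only dispatch applies. With Google Speech, best_transcript models the step of keeping the highest-confidence match before the modules are searched.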
Completed Tasks
- Wi-fi dongle working - 10/22/2014
- Created Module for Jasper - 10/28/2014
- Created install script for installing modules on RPi - 11/4/2014
- Created simple GPIO Module - 11/8/2014
- Created Rose-Schedule Module - 11/11/2014
- Created Wireless Light Module - 11/16/2014
- Speech recognition software installed - 10/20/2014
- Sound recording/playback with arecord and aplay works - 10/22/2014
- Ported Jasper to work on the Google voice API - 11/12/2014
- Wireless Light control module transmitter and receiver modules - 11/16/2014
- Jasper Wireless Light Module debug - 11/16/2014
- Install script mods to set up RPi as wireless light server - 11/17/2014
- Hook wireless light to an actual light - Dmitry
- Develop Additional testing modules - Matt, Dmitry
- Get Jasper working on BBB - Matt, Dmitry
Future Work
The most immediate extension to this project would be to fully port all of the code and features over to the Beaglebone. Although we do provide instructions for installing Jasper on the Beaglebone, some of the scripts that we wrote are not compatible with it. This will mostly require a simple port of the code and replacing some of the libraries that are being used.
It would also be useful to find a way to integrate NodeRed and Jasper more directly. We created a module that simply writes everything that Jasper hears to a file. In NodeRed, that file can be tailed and its contents used in a flow. A native way to interface the two would be useful for decreasing latency and, hopefully, improving compatibility.
Sometimes you use the Google Speech API and sometimes Pocketsphinx. It's hard to tell which is used when. Could you look over the whole wiki and make it clear which is being used? Is there a configuration that selects which is used? How do I change it?
Conclusions
Our initial project was to set up Node Red for voice recognition in order to create a personal-assistant-like device. We thought that by creating a couple of nodes we would be able to rapidly prototype a large number of different functions and deploy them all to the Node Red server. What we found was that Node Red is poorly suited to this sort of processing, so we looked for an alternative method for creating our device. Our research led us to the open source project known as Jasper. However, Jasper proved to be very difficult to install on the Beaglebone Black, and we were hoping to find an easier way to set up and install our system.
The Raspberry Pi was found to be the easiest platform to set up Jasper on. The default installation of Jasper uses PocketSphinx for speech recognition, which works rather well on the small number of commands that we wanted to run. We noticed some scaling issues when increasing the number of commands: it couldn't do recognition on a very large number of words, which we ran into when trying to let it recognize any number. Switching the configuration to Google Speech gave much more accurate speech-to-text conversion, at the cost of speed and the requirement of a constant network connection. The Google Speech API only allows a limited number of conversions per month, but we don't foresee this being an issue unless the system is under heavy usage for most of the day.
We did end up implementing a workaround to still do speech recognition with Node Red, using Jasper. With the NodeRed module set up in Jasper, all of the recognized words are appended to a file. This file can be read in NodeRed by any number of flows, and the text can then be extracted. Each flow can check the text against some condition and, if the condition is satisfied, perform a function. We never fully investigated this process, but it could be used to create a speech recognition system that identifies a large number of commands.