https://elinux.org/api.php?action=feedcontributions&user=Liulei32&feedformat=atomeLinux.org - User contributions [en]2024-03-28T10:08:05ZUser contributionsMediaWiki 1.31.0https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48619ECE497 Project Voice Dialer2011-05-17T10:30:15Z<p>Liulei32: /* Conclusions */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model (HMM) for the voice recognition.<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number.<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the full software structure and shown that it works on the BeagleBoard.<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
A speaker-independent speech recognition algorithm recognizes spoken phone numbers, and the Google Voice dialer then places the call!<br />
<br />
== Theory of Operation ==<br />
<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number. <br />
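<br />
As a rough illustration of how the two parts could fit together, the sketch below turns a sequence of recognized digit words into the number string a dialer script would receive. The names <code>WORD_TO_DIGIT</code> and <code>words_to_number</code> are hypothetical and are not taken from the project's code. The actual hand-off between the C recognizer and the Python dialer, and the Google Voice call itself, are not shown.<br />

```python
# Hypothetical helper (not from the project's code): map recognized
# digit words to the digit string that a dialer script could be given.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def words_to_number(words):
    """Return the digit string for a sequence of digit words.

    Raises ValueError if any word is not one of the ten digit words.
    """
    digits = []
    for word in words:
        key = word.strip().lower()
        if key not in WORD_TO_DIGIT:
            raise ValueError("not a digit word: %r" % word)
        digits.append(WORD_TO_DIGIT[key])
    return "".join(digits)


if __name__ == "__main__":
    # Example: three recognized words become a three-digit string.
    print(words_to_number(["eight", "one", "two"]))
```

Rejecting unknown words here, rather than skipping them, keeps a misrecognized word from silently producing a shorter and therefore wrong phone number.<br />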
<br />
=== TIesr ===<br />
In our project, we used the TI Embedded Speech Recognizer (TIesr) for speaker-independent recognition. TIesr is targeted toward embedded platforms where computation and memory storage efficiency are important. It uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
For TIesr to perform well, the model must be built and trained before use. Building the HMM requires some additional software:<br />
* The Hidden Markov Model Toolkit (HTK), which may be obtained from http://htk.eng.cam.ac.uk/<br />
* The Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing the ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
For small vocabularies, pronunciation decision trees are not needed. The only file required is one that contains a sorted list of all characters making up the words in the dictionary. It must be named "cAttValue.txt"; we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
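<br />
The sorted character list for the ten digit words can be derived mechanically. The following is a minimal sketch, not the project's actual tooling: the function name <code>letter_list</code> is hypothetical, and the exact on-disk layout of "cAttValue.txt" is not described here, so the sketch only computes the list and checks the single-byte requirement.<br />

```python
# The ten vocabulary words from Step 1.
DIGIT_WORDS = ["eight", "five", "four", "nine", "one",
               "seven", "six", "three", "two", "zero"]


def letter_list(words):
    """Return the sorted list of distinct characters used in the words.

    Raises ValueError if a character does not fit in a single byte,
    since each character in the letter file should be one byte.
    """
    letters = sorted(set("".join(words)))
    for ch in letters:
        if len(ch.encode("utf-8")) != 1:
            raise ValueError("character is not a single byte: %r" % ch)
    return letters


if __name__ == "__main__":
    # Print the distinct letters as one string for a quick visual check.
    print("".join(letter_list(DIGIT_WORDS)))
```

For this vocabulary the list contains 15 distinct letters.<br />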
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into a binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We used the HTK HDMan tool to generate the binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
First, we recorded 50 speech clips, 5 for each digit. They are sampled at 8 kHz using 16-bit, LSB-first (little-endian) PCM coding.<br />
<br />
Then we used "sample_to_htk.pl", provided with TIesr, to convert the .raw audio files to .htk format, which can be used for building the HMM.<br />
<br />
After that, we carefully labelled the time segments of each audio file, marking where each word starts and ends.<br />
<br />
Next, we used the .htk files and the segment information to train the HMM for four iterations. The number of iterations can only be determined by experiment.<br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
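<br />
To make the recording format concrete, the sketch below decodes headerless 8 kHz, 16-bit, LSB-first PCM data and converts sample positions to times, which is the kind of arithmetic involved when marking where a word starts and ends. It is only an illustration: the function names are hypothetical, signed samples are assumed, and it does not reproduce what "sample_to_htk.pl" does.<br />

```python
import struct

SAMPLE_RATE_HZ = 8000   # 8 kHz sampling rate
BYTES_PER_SAMPLE = 2    # 16-bit samples


def decode_pcm16le(raw):
    """Decode headerless 16-bit little-endian PCM bytes into integers.

    Assumption: samples are signed, which is the usual convention
    for 16-bit PCM.
    """
    if len(raw) % BYTES_PER_SAMPLE != 0:
        raise ValueError("byte count is not a multiple of the sample size")
    count = len(raw) // BYTES_PER_SAMPLE
    # "<" selects little-endian (LSB first); "h" is a signed 16-bit value.
    return list(struct.unpack("<%dh" % count, raw))


def duration_seconds(raw):
    """Length of a raw clip in seconds."""
    return len(raw) / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)


def sample_to_seconds(index):
    """Time offset of a sample index, e.g. a word boundary."""
    return index / SAMPLE_RATE_HZ
```

At this rate one second of audio occupies 16,000 bytes, so short single-digit clips stay small.<br />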
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step, we use the results of the word models to determine a linear regression tree for the HMMs.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included with TIesr, is used to provide clustering information to TIesr.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by the TIesrSI module to recognize speech and make decisions.<br />
<br />
== Work Breakdown ==<br />
<br />
Main dialer program and cross-compilation by Dan Bennett and Will Gerth;<br />
<br />
Google Voice dialer script by David Bliss;<br />
<br />
TIesr module build and HMM training by Lei Liu.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Based on our progress so far, we conclude that our idea is almost fully implemented. We combined TIesr and Google Voice and got them working together on an embedded Linux system.<br />
<br />
As a suggestion for future work, the Python Google Voice code does not support talking on the phone, so this could definitely be improved. It would be very hard to do, however, because Google has never officially published the Google Voice code.<br />
<br />
There is also a long way to go on our TIesr model. A large amount of training data is required to improve its performance. To make things more interesting, name recognition could be added to our program. Going further, TIesr could be trained to recognize words in foreign languages.<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48607ECE497 Project Voice Dialer2011-05-17T10:28:51Z<p>Liulei32: /* Conclusions */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
The TIser is working which returns a voice recognition result from audio input. The Google Voice dialer is also completed so that it can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Generally our team has reached our goal of making a voice controlled dialer. Although the TIesr HHM model does not work perfectly due to small training data, we have finished building all software structure and proved it working on Beagleboard. (End with a two sentence conclusion.)<br />
<br />
== Instillation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Speaker-independent speech recognition algorithm recognizes phone numbers from people talking, and then give it a call from Google Voice dialer!<br />
<br />
== Theory of Operation ==<br />
<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number. <br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high performance speech recognizer, the model must be built and trained before using. During this, some softwares are needed to build the HMM. They are,<br />
The Hidden Markov Modeling Toolkit (HTK), which may be obtained from: <br />
http://htk.eng.cam.ac.uk/<br />
and Perl Modules Math::FFT and Algorithms::Cluster from the CPAN.<br />
<br />
Since our goal--to recognize ten digits--is a reletively simple task for TIesr, we do not utilize the pronunciation decision tree files. Below are steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into a binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use HTK HDMan tool to generate binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
Firstly we recorded 50 speech clips, 5 for each digit. They are sampled at 8KHz, using 16 bit LSB first PCM coding method.<br />
<br />
Then we use "sample_to_htk.pl" provided by TIesr to convert those .raw audio to .htk format file, which can be utilized for building HHM.<br />
<br />
After that, we carefully labelled out the time segment of each audio file, showing when a word starts and ends.<br />
<br />
Next we used the .htk files and segment information to train the HHM for four times. The number of iteration time can only determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we uses the results of word model to determine a linear regression tree for the HMM models.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by TIesrSI module to recognize speech and make decision.<br />
<br />
== Work Breakdown ==<br />
<br />
Main dialer program and cross-compilation by Dan Bennett and Will Gerth;<br />
<br />
Google Voice dialer script by David Bliss;<br />
<br />
TIesr module build and HMM training by Lei Liu.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
Based on the progress right now, we can make a conclusion that our thought is almost implemented. We combined TIesr and Google Voice together, making them working on embedded Linux system.<br />
<br />
For suggestion, the python coded Google Voice cannot support talking on the phone, so this can be improved definitely. But this would be very hard to do, because Google Voice never publish its code officially.<br />
<br />
What's more, there is also a long way to go on our TIesr model. A large amount of data is required for training to enhance the performance. To make things more interesting, name recognition can be added into our program. To make it more more interesting, the TIesr could be trained to be able to recognize words of foreign language.<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48583ECE497 Project Voice Dialer2011-05-17T10:15:42Z<p>Liulei32: /* Work Breakdown */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
The TIser is working which returns a voice recognition result from audio input. The Google Voice dialer is also completed so that it can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Generally our team has reached our goal of making a voice controlled dialer. Although the TIesr HHM model does not work perfectly due to small training data, we have finished building all software structure and proved it working on Beagleboard. (End with a two sentence conclusion.)<br />
<br />
== Instillation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Speaker-independent speech recognition algorithm recognizes phone numbers from people talking, and then give it a call from Google Voice dialer!<br />
<br />
== Theory of Operation ==<br />
<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number. <br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high performance speech recognizer, the model must be built and trained before using. During this, some softwares are needed to build the HMM. They are,<br />
The Hidden Markov Modeling Toolkit (HTK), which may be obtained from: <br />
http://htk.eng.cam.ac.uk/<br />
and Perl Modules Math::FFT and Algorithms::Cluster from the CPAN.<br />
<br />
Since our goal--to recognize ten digits--is a reletively simple task for TIesr, we do not utilize the pronunciation decision tree files. Below are steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into a binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use HTK HDMan tool to generate binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
Firstly we recorded 50 speech clips, 5 for each digit. They are sampled at 8KHz, using 16 bit LSB first PCM coding method.<br />
<br />
Then we use "sample_to_htk.pl" provided by TIesr to convert those .raw audio to .htk format file, which can be utilized for building HHM.<br />
<br />
After that, we carefully labelled out the time segment of each audio file, showing when a word starts and ends.<br />
<br />
Next we used the .htk files and segment information to train the HHM for four times. The number of iteration time can only determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we uses the results of word model to determine a linear regression tree for the HMM models.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by TIesrSI module to recognize speech and make decision.<br />
<br />
== Work Breakdown ==<br />
<br />
Main dialer program and cross-compilation by Dan Bennett and Will Gerth;<br />
<br />
Google Voice dialer script by David Bliss;<br />
<br />
TIesr module build and HMM training by Lei Liu.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48577ECE497 Project Voice Dialer2011-05-17T10:08:45Z<p>Liulei32: </p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
The TIser is working which returns a voice recognition result from audio input. The Google Voice dialer is also completed so that it can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Generally our team has reached our goal of making a voice controlled dialer. Although the TIesr HHM model does not work perfectly due to small training data, we have finished building all software structure and proved it working on Beagleboard. (End with a two sentence conclusion.)<br />
<br />
== Instillation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Speaker-independent speech recognition algorithm recognizes phone numbers from people talking, and then give it a call from Google Voice dialer!<br />
<br />
== Theory of Operation ==<br />
<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number. <br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high performance speech recognizer, the model must be built and trained before using. During this, some softwares are needed to build the HMM. They are,<br />
The Hidden Markov Modeling Toolkit (HTK), which may be obtained from: <br />
http://htk.eng.cam.ac.uk/<br />
and Perl Modules Math::FFT and Algorithms::Cluster from the CPAN.<br />
<br />
Since our goal--to recognize ten digits--is a reletively simple task for TIesr, we do not utilize the pronunciation decision tree files. Below are steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into a binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use HTK HDMan tool to generate binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
Firstly we recorded 50 speech clips, 5 for each digit. They are sampled at 8KHz, using 16 bit LSB first PCM coding method.<br />
<br />
Then we use "sample_to_htk.pl" provided by TIesr to convert those .raw audio to .htk format file, which can be utilized for building HHM.<br />
<br />
After that, we carefully labelled out the time segment of each audio file, showing when a word starts and ends.<br />
<br />
Next we used the .htk files and segment information to train the HHM for four times. The number of iteration time can only determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we uses the results of word model to determine a linear regression tree for the HMM models.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by TIesrSI module to recognize speech and make decision.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48571ECE497 Project Voice Dialer2011-05-17T10:08:17Z<p>Liulei32: /* Theory of Operation */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice dialed call from the beagleboard via a phone device of the users choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
The TIser is working which returns a voice recognition result from audio input. The Google Voice dialer is also completed so that it can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Generally our team has reached our goal of making a voice controlled dialer. Although the TIesr HHM model does not work perfectly due to small training data, we have finished building all software structure and proved it working on Beagleboard. (End with a two sentence conclusion.)<br />
<br />
== Instillation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Speaker-independent speech recognition algorithm recognizes phone numbers from people talking, and then give it a call from Google Voice dialer!<br />
<br />
== Theory of Operation ==<br />
<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number. <br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high performance speech recognizer, the model must be built and trained before using. During this, some softwares are needed to build the HMM. They are,<br />
The Hidden Markov Modeling Toolkit (HTK), which may be obtained from: <br />
http://htk.eng.cam.ac.uk/<br />
and Perl Modules Math::FFT and Algorithms::Cluster from the CPAN.<br />
<br />
Since our goal--to recognize ten digits--is a reletively simple task for TIesr, we do not utilize the pronunciation decision tree files. Below are steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into a binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use HTK HDMan tool to generate binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
Firstly we recorded 50 speech clips, 5 for each digit. They are sampled at 8KHz, using 16 bit LSB first PCM coding method.<br />
<br />
Then we use "sample_to_htk.pl" provided by TIesr to convert those .raw audio to .htk format file, which can be utilized for building HHM.<br />
<br />
After that, we carefully labelled the time segments of each audio file, marking where each word starts and ends.<br />
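<br />
HTK label files give start and end times in units of 100 ns, followed by the label name. The sketch below converts segment boundaries in seconds into such lines; the helper name and the "sil" label for silence are assumptions for illustration, and the exact labels we used are not recorded here.<br />

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are expressed in 100 ns units


def htk_label_lines(segments):
    """Convert (start_s, end_s, label) tuples into HTK-style label lines."""
    lines = []
    for start_s, end_s, label in segments:
        if end_s <= start_s:
            raise ValueError("segment must end after it starts: %r" % label)
        start = round(start_s * HTK_UNITS_PER_SECOND)
        end = round(end_s * HTK_UNITS_PER_SECOND)
        lines.append("%d %d %s" % (start, end, label))
    return lines


if __name__ == "__main__":
    # Hypothetical one-second clip: silence, the word "five", silence
    clip = [(0.0, 0.25, "sil"), (0.25, 0.8, "five"), (0.8, 1.0, "sil")]
    print("\n".join(htk_label_lines(clip)))
```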
<br />
Next, we used the .htk files and the segment information to train the HMM for four iterations. The right number of iterations can only be determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we use the results of the word models to determine a linear regression tree for the HMMs.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by the TIesrSI module to recognize speech and make decisions.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48565ECE497 Project Voice Dialer2011-05-17T10:06:22Z<p>Liulei32: /* Highlights */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
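<br />
The hand-off between the two parts can be pictured as turning the recognized digit words into a number string for the dialer. The sketch below is hypothetical glue code, not the project's actual script: the function name is invented, and it deliberately stops before any Google Voice call.<br />

```python
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def words_to_number(words):
    """Map recognized digit words (e.g. from the recognizer) to a dialable string."""
    digits = []
    for word in words:
        key = word.lower()
        if key not in DIGITS:
            raise ValueError("not a digit word: %r" % word)
        digits.append(DIGITS[key])
    return "".join(digits)


if __name__ == "__main__":
    print(words_to_number("five five five one two one two".split()))
```

Rejecting anything outside the ten-word vocabulary keeps a misrecognized word from silently producing a wrong number.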
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
A speaker-independent speech recognition algorithm recognizes spoken phone numbers, which are then called through the Google Voice dialer!<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high-performance speech recognizer, the model must be built and trained before use. Building the HMM requires some additional software:<br />
the Hidden Markov Model Toolkit (HTK), which may be obtained from <br />
http://htk.eng.cam.ac.uk/<br />
and the Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use the HTK HDMan tool to generate the binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
First, we recorded 50 speech clips, 5 for each digit. They are sampled at 8 kHz and stored as 16-bit, LSB-first (little-endian) PCM.<br />
<br />
Then we used "sample_to_htk.pl", provided with TIesr, to convert the .raw audio files to .htk format, which can be used for building the HMM.<br />
<br />
After that, we carefully labelled the time segments of each audio file, marking where each word starts and ends.<br />
<br />
Next, we used the .htk files and the segment information to train the HMM for four iterations. The right number of iterations can only be determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we use the results of the word models to determine a linear regression tree for the HMMs.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by the TIesrSI module to recognize speech and make decisions.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48559ECE497 Project Voice Dialer2011-05-17T10:01:02Z<p>Liulei32: /* TIesr */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high-performance speech recognizer, the model must be built and trained before use. Building the HMM requires some additional software:<br />
the Hidden Markov Model Toolkit (HTK), which may be obtained from <br />
http://htk.eng.cam.ac.uk/<br />
and the Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
<br />
=====Step 1: Data Preparation=====<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
=====Step 2: Making the Letter File=====<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
=====Step 3: Building the Compressed Binary Dictionary Files=====<br />
<br />
The dictionary file must be converted into binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use the HTK HDMan tool to generate the binary file "dict.bin" from "phone.lis".<br />
<br />
=====Step 4: Building the Acoustic Model Data Files=====<br />
<br />
First, we recorded 50 speech clips, 5 for each digit. They are sampled at 8 kHz and stored as 16-bit, LSB-first (little-endian) PCM.<br />
<br />
Then we used "sample_to_htk.pl", provided with TIesr, to convert the .raw audio files to .htk format, which can be used for building the HMM.<br />
<br />
After that, we carefully labelled the time segments of each audio file, marking where each word starts and ends.<br />
<br />
Next, we used the .htk files and the segment information to train the HMM for four iterations. The right number of iterations can only be determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
=====Step 5: Creating the Hierarchical Linear Regression cluster tree file=====<br />
<br />
In this step we use the results of the word models to determine a linear regression tree for the HMMs.<br />
<br />
=====Step 6: Creating the Gaussian cluster files=====<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
=====Step 7: Testing the data files=====<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by the TIesrSI module to recognize speech and make decisions.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48553ECE497 Project Voice Dialer2011-05-17T09:59:20Z<p>Liulei32: /* TIesr */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high-performance speech recognizer, the model must be built and trained before use. Building the HMM requires some additional software:<br />
the Hidden Markov Model Toolkit (HTK), which may be obtained from <br />
http://htk.eng.cam.ac.uk/<br />
and the Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
<br />
Step 1: Data Preparation<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
Step 2: Making the Letter File<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named "cAttValue.txt", and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
Step 3: Building the Compressed Binary Dictionary Files<br />
<br />
The dictionary file must be converted into binary form for subsequent processing steps, since the TIesr tools use a binary dictionary. We use the HTK HDMan tool to generate the binary file "dict.bin" from "phone.lis".<br />
<br />
Step 4: Building the Acoustic Model Data Files<br />
<br />
First, we recorded 50 speech clips, 5 for each digit. They are sampled at 8 kHz and stored as 16-bit, LSB-first (little-endian) PCM.<br />
<br />
Then we used "sample_to_htk.pl", provided with TIesr, to convert the .raw audio files to .htk format, which can be used for building the HMM.<br />
<br />
After that, we carefully labelled the time segments of each audio file, marking where each word starts and ends.<br />
<br />
Next, we used the .htk files and the segment information to train the HMM for four iterations. The right number of iterations can only be determined by experiment. <br />
<br />
Finally the trained HTK data is converted to TIesr-compatible acoustic data files.<br />
<br />
Step 5: Creating the Hierarchical Linear Regression cluster tree file<br />
<br />
In this step we use the results of the word models to determine a linear regression tree for the HMMs.<br />
<br />
Step 6: Creating the Gaussian cluster files<br />
<br />
Gaussiancluster, which is also included in TIesr files, is used to provide TIesr clustering information.<br />
<br />
Step 7: Testing the data files<br />
<br />
In this step, we use testtiesrflex to generate all the final data needed by the TIesrSI module to recognize speech and make decisions.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48547ECE497 Project Voice Dialer2011-05-17T09:37:07Z<p>Liulei32: /* TIesr */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high-performance speech recognizer, the model must be built and trained before use. Building the HMM requires some additional software:<br />
the Hidden Markov Model Toolkit (HTK), which may be obtained from <br />
http://htk.eng.cam.ac.uk/<br />
and the Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
<br />
Step 1: Data Preparation<br />
<br />
Prepare text files for ten digits in alphabetical order.<br />
<br />
eight,<br />
<br />
five,<br />
<br />
four,<br />
<br />
nine,<br />
<br />
one,<br />
<br />
seven,<br />
<br />
six,<br />
<br />
three,<br />
<br />
two,<br />
<br />
zero.<br />
<br />
Step 2: Making the Letter File<br />
<br />
Instead of creating pronunciation decision trees, for small vocabularies the only file necessary is one that contains a sorted list of all characters making up words in the dictionary. This must be put in a file named cAttValue.txt, and we put it in Data/Lang/cAttValue.txt. Each character should be a single byte.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48535ECE497 Project Voice Dialer2011-05-17T09:34:31Z<p>Liulei32: /* TIesr */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used TI Embedded Speech Recognizer (TIESR) for Speaker-Independent recognition. The TIESR speech recognizer is targeted toward embedded platforms where computation and memory storage efficiency are important. TIESR uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
To make TIesr a high-performance speech recognizer, the model must be built and trained before use. Building the HMM requires some additional software:<br />
the Hidden Markov Model Toolkit (HTK), which may be obtained from <br />
http://htk.eng.cam.ac.uk/<br />
and the Perl modules Math::FFT and Algorithm::Cluster from CPAN.<br />
<br />
Since our goal, recognizing ten digits, is a relatively simple task for TIesr, we do not use the pronunciation decision tree files. Below are the steps we used to train the TIesr model.<br />
Step 1: Data Preparation<br />
Prepare text files for ten digits in alphabetical order.<br />
eight,<br />
five,<br />
four,<br />
nine,<br />
one,<br />
seven,<br />
six,<br />
three,<br />
two,<br />
zero.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48529ECE497 Project Voice Dialer2011-05-17T09:26:53Z<p>Liulei32: /* TIesr */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model for the voice recognition. (Give two sentence intro to the project.)<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can be used to make a call from a Google Voice account to any valid phone number. (Give two sentences telling what works.)<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of making a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished building the entire software structure and shown that it works on the BeagleBoard. (End with a two sentence conclusion.)<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
In our project, we used the TI Embedded Speech Recognizer (TIesr) for speaker-independent recognition. TIesr is targeted toward embedded platforms where computation and memory efficiency are important. It uses Hidden Markov Model (HMM) technology to model the acoustic signals found in speech.<br />
<br />
For TIesr to recognize speech well, its models must be built and trained before use. Building the HMMs requires the following software:<br />
* The Hidden Markov Model Toolkit (HTK), which may be obtained from http://htk.eng.cam.ac.uk/<br />
* The Perl modules Math::FFT and Algorithm::Cluster, both available from CPAN<br />
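<br />
Before starting a model build, it can help to confirm that these prerequisites are present. The following is a rough sketch, not part of TIesr or HTK: the function names are hypothetical, the choice of <code>HCopy</code> and <code>HERest</code> as representative HTK tools is an assumption about which tools the build uses, and the second Perl module is assumed to be published on CPAN as <code>Algorithm::Cluster</code>.<br />

```python
import shutil
import subprocess

# HTK command-line tools assumed to be needed; adjust to the tools actually used.
HTK_TOOLS = ["HCopy", "HERest"]
# Perl modules named in the text (CPAN spelling assumed for the second one).
PERL_MODULES = ["Math::FFT", "Algorithm::Cluster"]


def perl_module_installed(module, run=subprocess.run):
    """Return True if `perl -M<module> -e 1` exits successfully."""
    try:
        result = run(
            ["perl", "-M" + module, "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:  # perl itself is not installed
        return False
    return result.returncode == 0


def missing_prerequisites(which=shutil.which, run=subprocess.run):
    """List the HTK tools and Perl modules that cannot be found."""
    missing = [tool for tool in HTK_TOOLS if which(tool) is None]
    missing += [m for m in PERL_MODULES if not perl_module_installed(m, run)]
    return missing
```

Calling <code>missing_prerequisites()</code> returns an empty list when everything is found, so a build script could print the returned names and stop early otherwise. The <code>which</code> and <code>run</code> parameters exist only so the check can be exercised without the real tools installed.<br />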
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48517ECE497 Project Voice Dialer2011-05-17T09:13:55Z<p>Liulei32: </p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model (HMM) for the voice recognition.<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can place a call from a Google Voice account to any valid phone number.<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of building a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished the complete software structure and shown it working on the BeagleBoard.<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
=== TIesr ===<br />
<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48493ECE497 Project Voice Dialer2011-05-17T08:34:16Z<p>Liulei32: /* Executive Summary */</p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model (HMM) for the voice recognition.<br />
<br />
TIesr is working and returns a voice recognition result from audio input. The Google Voice dialer is also complete and can place a call from a Google Voice account to any valid phone number.<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
Overall, our team has reached its goal of building a voice-controlled dialer. Although the TIesr HMM does not work perfectly because of the small amount of training data, we have finished the complete software structure and shown it working on the BeagleBoard.<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48487ECE497 Project Voice Dialer2011-05-17T08:18:52Z<p>Liulei32: </p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
The voice dialer project aims to complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing. TIesr is used to build a Hidden Markov Model (HMM) for the voice recognition.<br />
<br />
Give two sentences telling what works.<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
End with a two sentence conclusion.<br />
<br />
The sentence count is approximate and only to give an idea of the expected length.<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=48481ECE497 Project Voice Dialer2011-05-17T08:13:00Z<p>Liulei32: </p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
<br />
==Architecture==<br />
This project is divided into two parts, the dialer and the recognizer. The recognizer is written in C, and acts as the main driver for the application. The dialer is a utility script written in Python that dials a phone number.<br />
<br />
== Executive Summary ==<br />
<br />
Give two sentence intro to the project.<br />
<br />
Give two sentences telling what works.<br />
<br />
Give two sentences telling what isn't working.<br />
<br />
End with a two sentence conclusion.<br />
<br />
The sentence count is approximate and only to give an idea of the expected length.<br />
<br />
== Installation Instructions ==<br />
<br />
Give step by step instructions on how to install your project on the SPEd2 image. <br />
<br />
* Include your [https://github.com/ github] path as a link like this: [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn]. <br />
* Include any additional packages installed via '''opkg'''.<br />
* Include kernel mods.<br />
* If there is extra hardware needed, include links to where it can be obtained.<br />
<br />
== User Instructions ==<br />
<br />
Once everything is installed, how do you use the program? Give details here, so if you have a long user manual, link to it here.<br />
<br />
== Highlights ==<br />
<br />
Here is where you brag about what your project can do.<br />
<br />
Consider including a [http://www.youtube.com/ YouTube] demo.<br />
<br />
== Theory of Operation ==<br />
<br />
Give a high level overview of the structure of your software. Are you using GStreamer? Show a diagram of the pipeline. Are you running multiple tasks? Show what they do and how they interact.<br />
<br />
== Work Breakdown ==<br />
<br />
List the major tasks in your project and who did what.<br />
<br />
Also list here what doesn't work yet and when you think it will be finished and who is finishing it.<br />
<br />
== Conclusions ==<br />
<br />
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=ECE497_Project_Voice_Dialer&diff=45799ECE497 Project Voice Dialer2011-05-03T11:57:08Z<p>Liulei32: </p>
<hr />
<div>Members: Dan Bennett, David Bliss, Will Gerth, and Lei Liu! <br />
<br />
Concept: Google Voice based voice dialer using TI embedded speech recognition.<br />
<br />
Timeline: TBD<br />
<br />
Goal: To complete and connect a voice-dialed call from the BeagleBoard via a phone device of the user's choosing.<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=EBC_Project_Ideas&diff=38641EBC Project Ideas2011-03-22T03:49:42Z<p>Liulei32: </p>
<hr />
<div>[[Category:ECE497]]<br />
[[Category:BeagleBoard]]<br />
<br />
We have both mini projects and projects in ECE497. <br />
<br />
'''Mini projects''' involve finding something interesting in the Beagle world, installing it on your Beagle, and demoing it to the class. You also create a wiki page documenting what you did to get it installed. Often you will find multiple efforts to do the same thing; for example, there are a few efforts to port Android to the Beagle. Your task is to figure out which one should be used. Generally mini projects won't require you to write new code; however, they are the background work that may lead to a full project. You should do a couple of mini projects for the class. Generally they are done alone, but working in pairs is OK. These will be about 1/3 of your grade and should be done in the first 5 weeks or so.<br />
<br />
Only one '''full project''' is done for the class, and it's done with a team of 3 or 4. These projects can take a mini project (or a whole new idea) and add to it. The goal is to have your work contribute to the open source world. Any code that is generated will be kept on [https://github.com/ github], and a [http://bitbake.berlios.de/manual/ bitbake recipe] will be created to automatically download the code and build the object files.<br />
<br />
What follows are<br />
<br />
; Places to look for project ideas: Feel free to add your own suggestions.<br />
; Mini Project ideas: Add your own suggestions, and do some of them. Mark the ones you've done.<br />
; Full Project ideas: ditto.<br />
<br />
== Sources for Project Ideas ==<br />
<br />
Here are some links where you'll find ideas for your project.<br />
* [http://wiki.omap.com/index.php/ETechDays_Community_Lightning_Talks ETechDays Community Lightning Talks], this is a one-day web-based conference where many project ideas are presented. One of our 2009-2010 senior design projects was found here.<br />
* [http://beagleboard.org/project Official list of Beagle Projects], there are many Beagle specific projects listed here. Many are inactive. ''List your project here once it is running.''<br />
* [http://www.youtube.com/watch?v=Mk1xjbA-ISE Augmented Reality Project], here's an idea that I think we can do on the Beagle. Rather than using augmented reality glasses, I'd suggest we use a [http://focus.ti.com/dlpdmd/docs/dlpdiscovery.tsp?sectionId=60&tabId=2235 TI DLP pico projector]. [http://www.hitlabnz.org/wiki/EmbeddedAR Here's] AR running on the Beagle. <br />
* [http://code.google.com/p/0xdroid/ Android], this is one of a couple of efforts to port [http://source.android.com/ Google's Android OS] to the Beagle.<br />
* [[BeagleBoard/Ideas-2009]] Google summer code ideas 2009.<br />
<br />
== Mini Project Ideas ==<br />
<br />
{| border="1" cellspacing="0" cellpadding="5"<br />
! Suggestor<br />
! Implementor<br />
! Description<br />
! Link<br />
|-<br />
| Mark A. Yoder<br />
| <br />
| Get TI's embedded speech recognizer installed and demo the examples.<br />
| [https://gforge.ti.com/gf/project/tiesr TI Embedded Speech Recognizer]<br />
|-<br />
| Mark A. Yoder<br />
| <br />
| Demo last year's TI speech project. I have a microphone amplifier and mike you can use.<br />
| [[ECE597 Project pyWikiReader]]<br />
|-<br />
| Mark A. Yoder<br />
| Stephen Mayhew<br />
| Find who is doing what with Kinect on the Beagle and install and run it.<br />
| [http://www.google.com/webhp?rlz=1C1GPCK_enUS392US392&sourceid=chrome-instant&ie=UTF-8&ion=1#hl=en&sugexp=ldymls&xhr=t&q=beagleboard+kinect&cp=0&qe=YmVhZ2xlYm9hcmQga2lu&qesig=9qrD0rFfjWfujRRGmkB_Bw&pkc=AFgZ2tn-cylx0f71PasgBKOazjBQY3VK712RWQ7DueEjQNAdbOHr6BCgUd9xdyXyPe8TWErkesrQ246vygwImnAS5mIzCG2-5g&pf=p&sclient=psy&rlz=1C1GPCK_enUS392US392&site=webhp&source=hp&aq=0&aqi=&aql=&oq=beagleboard+kin&pbx=1&bav=on.2,or.&fp=3e817b7ec5d13467&ion=1]<br />
|-<br />
| Mark A. Yoder<br />
| <br />
| I have several [http://en.wikipedia.org/wiki/PlayStation_Eye Sony PlayStation Eye web cams] and I have examples of how to pull video from them via V4L2 ([[ECE497 DaVinci Workshop Labs]]). The Eye also has a 4 microphone array. I don't know how to get audio from it. Figure out how. This may expand to a full project if there is no solution out there.<br />
| [http://www.google.com/webhp?rlz=1C1GPCK_enUS392US392&sourceid=chrome-instant&ie=UTF-8&ion=1#hl=en&sugexp=ldymls&xhr=t&q=beagleboard+playstation+eye+microphone+array&cp=0&qe=YmVhZ2xlYm9hcmQgcGxheXN0YXRpb24gZXllIG1pY3JvcGhvbmUgYXJyYXk&qesig=Sdh5Ru_jodwYydoeTls1GA&pkc=AFgZ2tmwB41tQwF7XwrJPqFnf0NRO911bMCrbnU1HR9Vm6-Pg0sH8LvbJZsKwjKRUpoin4cZlwLIngZw8OC7dyanjcJCG4N_kg&pf=p&sclient=psy&rlz=1C1GPCK_enUS392US392&site=webhp&aq=f&aqi=&aql=&oq=beagleboard+playstation+eye+microphone+array&pbx=1&bav=on.2,or.&fp=3e817b7ec5d13467&ion=1]<br />
|-<br />
| Mark A. Yoder<br />
| <br />
| Find some examples of how to use '''cmem'''. CMEM is an API and library for managing one or more blocks of physically contiguous memory. It also provides address translation services (e.g. virtual to physical translation) and user-mode cache management APIs. It's used for managing the shared memory between the ARM and the DSP on the processor. I've been unable to find examples of how to use it.<br />
| [http://processors.wiki.ti.com/index.php/CMEM_Overview CMEM Overview]<br />
|-<br />
| Mike Lester<br />
| <br />
| Connect to your beagleboard using ethernet over USB. This allows your beagleboard to share the host computer's internet connection and allow you to connect via VNC/ssh without the need for an external router/switch. This should make development much easier. <br />
| [http://elinux.org/BeagleBoardBeginners#Connect_with_your_beagleboard_using_VNC_and_ethernet_over_USB]<br />
|-<br />
| Brian Hulette<br />
| <br />
| Experiment with audio synthesis and/or sampling/processing. You could either synthesize and play a few tones to generate a song, or have the Beagle sample an audio signal then process and output it to create a sort of effects pedal. <br />
| <br />
|-<br />
| David McGinnis<br />
| David McGinnis<br />
| Look into connecting the BeagleBoard to a phone or headphones using Bluetooth. This could involve sending audio to and taking audio from a Bluetooth headset, giving you audio I/O with the BeagleBoard, or it could involve connecting with phones automatically as they come into range of the BeagleBoard, allowing for an automatic attendance registration system, among other things.<br />
| <br />
|-<br />
| David Bliss<br />
| David Bliss<br />
| Get a video stream from a PS Eye, and identify the relevant device files.<br />
| http://en.wikipedia.org/wiki/PlayStation_Eye#cite_note-Linux_support-32<br />
|<br />
|-<br />
| William Gerth<br />
| William Gerth<br />
| Explore the possibility of implementing OpenAOS on the Beagle, to make a portable media player and etc.<br />
| http://www.openaos.org/<br />
|-<br />
| Joel Carlson<br />
|<br />
| Lacking a serial port and don't have a USB-serial converter? Why not find a way to make the BeagleBoard boot over a USB console connection?<br />
| [http://itgen.blogspot.com/2011/03/beagleboard-xm-u-boot-without-serial.html BeagleBoard XM U-boot without Serial]<br />
|-<br />
| Lei Liu<br />
| Lei Liu<br />
| Build communication with FPGA via USB port.<br />
| <br />
|}<br />
<br />
== Full Projects ==<br />
<br />
=== 2011 ===<br />
Edit this page to add projects you would like to do. If you aren't in the class, add ideas you would like to see done by class members.<br />
<br />
{| border="1" cellspacing="0" cellpadding="5"<br />
! Team&nbsp;Members<br />
! Project Title<br />
! Description <br />
|-<br />
| Mark A. Yoder<br />
| [https://gforge.ti.com/gf/project/tiesr TI Embedded Speech Recognizer]<br />
| Port TI's fixed-point speech recognizer to the DSP. It currently runs on the ARM.<br />
|-<br />
| Mark A. Yoder<br />
| Kinect<br />
| [http://hackaday.com/2010/11/15/rendering-a-3d-environment-from-kinect-video/ Here] and [http://gamerfront.net/2010/12/with-a-second-kinect-you-can-map-out-your-bedroom-in-3d/4644 here] are some interesting things people are doing with Kinects. Maybe we could port it to the Beagle.<br />
|-<br />
| Mark A. Yoder<br />
| Google PowerMeter<br />
| Google has a [http://www.google.com/powermeter project] to view and manage home electricity usage. This project would involve designing the hardware to measure the power usage and the Beagle software to interface with it. The Beagle would talk to the local home network via a wireless link and the home owner would configure the Beagle via a web page served on the Beagle.<br />
|}<br />
<br />
=== 2010 ===<br />
<br />
{| border="1" cellspacing="0" cellpadding="5"<br />
! Team&nbsp;Members<br />
! Project Title<br />
! Description <br />
|-<br />
| Yannick Polius<br />
| [[ECE597 Project pyWikiReader | pyWikiReader]]<br />
| This project is mostly software, with the hardware element being the use of the dsp. The idea is to tie together three technologies: speech recognition, speech synthesis, and internet access in order to create an interface capable of orating information to the user based on a vocal command. The implementation I have in mind is to use the Pocket Sphinx speech recognition engine to first understand what the user wants through speech, such as "Rose-Hulman". Once the speech is translated, the software can execute a Wikipedia search to pull said item's page. Most of the important info is contained within the introductory paragraph, so the software will take only that chunk and feed it into the Flite speech synthesis engine. The end result is a simple machine with "mother box" like usability, that is, no interaction besides what is natural to the user (speaking) should be necessary to retrieve the information.<br />
|-<br />
| Paul Morrison <br> Steven Stark<br />
| [[ECE597 3D Chess | 3D Chess with Networking]]<br />
| This project would simulate a hand-held chess game, and the game would allow two player games using two beagleboards over a network connection. The graphics would use the beagle's PowerVR SGX for hardware accelerated graphics by using OpenGL. In addition to 3D graphics and networking, a third portion of the project would be to optimize the boot time because a chess computer should start up quickly.<br />
|-<br />
| Tom Most <br> David Baty <br> Mark Jacobson<br />
| [[ECE597: Sumo Robot|Sumo Robot]]<br />
| The goal of this project is to create a robot capable of competing in the 3.0 kg weight class of a sumo competition ([http://www.youtube.com/watch?v=V3OR_sHrOJM an example]). This would have minor hardware and electronics elements, but would focus on communication with sensors using the BeagleBoard and the Linux kernel. At minimum, this involves sensors to detect the edge of the ring and the opposing robot. This would likely be implemented using Sharp IR rangefinders, ultrasonic rangefinders, and ideally a camera. [http://circ.mtco.com/competitions/2010/rules/sumo Sumo rules].<br />
|-<br />
|Brian Embry <br> Jessica Lipscomb <br> Paul Banister<br />
| [[ECE597 Network based MP3 player]]<br />
| Network based mp3 player. The Beagle will be programmed to use a custom protocol for transferring files from a network based server (x86 PC) to the Beagle. Speakers will be attached to the Beagle, where the file will be played back. Possible extensions are an LCD for displaying ID3 tag information, and buttons for user interaction (next track, previous track, etc.) on the GPIO interface.<br />
|-<br />
|[[user:routhcr | Chris Routh]] <br> [[user:collinjc | J. Cody Collins]] <br> [[user:jacksogc | Greg Jackson]] [[user:Xinkeqiong | Keqiong Xin]]<br />
| [[ECE597: Auto HUD]]<br />
| Use the BeagleBoard to run image recognition on a camera feed from inside a car, then point out objects of interest to the driver via a pico projector.<br />
|-<br />
| Adam Jesionowski<br>Qiang Jiang<br />
| [[ECE597_Adding_Sense_to_Beagle|Adding Sense to Beagle]] (See [[BeagleBoard/GSoC/Ideas]])<br />
| Sensory aware applications are becoming more mainstream with the release of the Apple iPhone. This project would combine both HW and SW to add sensory awareness to the Beagle. First, additional modules such as GPS, 3-axis accelerometers, gyroscopes, temperature sensors, humidity sensors, pressure sensors, etc., would be added to the Beagle to complement the microphone input and allow sensing of the real world environment. Then SW APIs would need to be layered on top to give applications easy access to the sensory data. <br />
|-<br />
| Mitch Garvin <br> Matt Luke <br> Elliot Simon <br> Jian Li<br />
| [[ECE597 Interactive Pong|Interactive Pong]]<br />
| Run classic pong, projecting the screen and using a camera to track user's hands for input.<br />
|}</div>Liulei32https://elinux.org/index.php?title=User:Liulei32&diff=38239User:Liulei322011-03-17T02:27:21Z<p>Liulei32: </p>
<hr />
<div>Lei Liu<br />
<br />
<br />
Rose-Hulman Institute of Technology<br />
<br />
Master of Science in Electrical Engineering<br />
<br />
<br />
Research Interests: <br />
<br />
Communication System<br />
<br />
Signal Processing<br />
<br />
<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=User:Liulei32&diff=38233User:Liulei322011-03-17T02:20:37Z<p>Liulei32: Created page with "Lei Liu Rose-Hulman Institute of Technology Master of Science in Electrical Engineering Research Interests: Communication System Signal Processing"</p>
<hr />
<div>Lei Liu<br />
<br />
Rose-Hulman Institute of Technology<br />
Master of Science in Electrical Engineering<br />
<br />
Research Interests: <br />
Communication System<br />
Signal Processing</div>Liulei32https://elinux.org/index.php?title=EBC_Editing_a_Wiki&diff=37249EBC Editing a Wiki2011-03-11T01:34:31Z<p>Liulei32: </p>
<hr />
<div>Here is a wiki you can practice editing. Before you can edit it you will have to create a login. Pick something that will make it easy for me to identify you as part of my class. Then just add your name and date on the end of the table.<br />
<br />
You can get help here: [[Help:Contents]].<br />
<br />
If you need help with syntax, check out the [[Editing Quickstart Guide|eLinux guide]] or the [http://en.wikipedia.org/wiki/Wikipedia:Cheatsheet Wikipedia Cheatsheet].<br />
<br />
{|<br />
! Name<br />
! Date<br />
|-<br />
| [[user:Yoder | Mark A. Yoder]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Selbydw | Douglas W. Selby]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:allensj | Samuel J. Allen]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:greenkt | Kyle T. Green]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:blissdw | David Bliss]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:carlsojs | Joel Carlson]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:billinrm | Randy Billingsley]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Aaron.bamberger | Aaron Bamberger]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:hulettbh | Brian Hulette]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Fusonmb | Michael Fuson]]<br />
| 3-Mar-2011<br />
|-<br />
| [[user:mayhewsw | Stephen Mayhew]]<br />
| 4-Mar-2011<br />
|-<br />
| [[user:strayeta | Ty Strayer]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:dialj| Jay Dial]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:j.ametsitsi| Julian Ametsitsi]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:bennetdj| Dan Bennett]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:shevicna| Nathan Shevick]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:w.gerth| William Gerth]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:mcginnda| David McGinnis]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:liulei32| Lei Liu]]<br />
| 10-Mar-2011<br />
| <br />
|}<br />
<br />
[[Category:ECE497]]</div>Liulei32https://elinux.org/index.php?title=EBC_Editing_a_Wiki&diff=36835EBC Editing a Wiki2011-03-10T03:06:03Z<p>Liulei32: </p>
<hr />
<div>Here is a wiki you can practice editing. Before you can edit it you will have to create a login. Pick something that will make it easy for me to identify you as part of my class. Then just add your name and date on the end of the table.<br />
<br />
You can get help here: [[Help:Contents]].<br />
<br />
If you need help with syntax, check out the [[Editing Quickstart Guide|eLinux guide]] or the [http://en.wikipedia.org/wiki/Wikipedia:Cheatsheet Wikipedia Cheatsheet].<br />
<br />
{|<br />
! Name<br />
! Date<br />
|-<br />
| [[user:Yoder | Mark A. Yoder]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Selbydw | Douglas W. Selby]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:allensj | Samuel J. Allen]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:greenkt | Kyle T. Green]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:blissdw | David Bliss]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:carlsojs | Joel Carlson]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:billinrm | Randy Billingsley]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Aaron.bamberger | Aaron Bamberger]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:hulettbh | Brian Hulette]]<br />
| 2-Mar-2011<br />
|-<br />
| [[user:Fusonmb | Michael Fuson]]<br />
| 3-Mar-2011<br />
|-<br />
| [[user:mayhewsw | Stephen Mayhew]]<br />
| 4-Mar-2011<br />
|-<br />
| [[user:strayeta | Ty Strayer]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:dialj| Jay Dial]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:j.ametsitsi| Julian Ametsitsi]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:bennetdj| Dan Bennett]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:shevicna| Nathan Shevick]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:w.gerth| William Gerth]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:mcginnda| David McGinnis]]<br />
| 7-Mar-2011<br />
|-<br />
| [[user:liulei32| Lei Liu]]<br />
| 7-Mar-2011<br />
| <br />
|}<br />
<br />
[[Category:ECE497]]</div>Liulei32