Difference between revisions of "ECE497 Project - Object Detection w/ DNN"

From eLinux.org
 
(38 intermediate revisions by 2 users not shown)
Line 30: Line 30:
 
== Executive Summary ==
 
== Executive Summary ==
  
Picture that summarizes the project.
+
[[File:WorkingDemo.jpg|400px|thumb|none|left|Picture of fully functional in-class demo]]
  
 +
We use TensorFlow and OpenCV to detect objects in the frame of a web camera. The camera is mounted on a tilt pan kit so that it can also track an object in the frame. Because object detection is computationally intensive, a local computation server processes each image and finds the objects within it. The computation server returns a processed image and an error vector, which the Pi converts to a control vector. The Pi can then display the processed image and adjust the camera angle to keep the tracked object in the middle of the frame. To dramatically decrease the complexity of the project, we would have liked to perform all the processing on the Pi as well; however, we were unable to get a reasonable response time with either the Pi or the BeagleBone. The Raspberry Pi takes at least 3 seconds to process each image and the BeagleBone Black at least 5 seconds.
  
We are using TensorFlow and OpenCV to detect items in the frame of a web camera. The camera is mounted onto a tilt pan kit to allow us to track the objects in frame as well. Due to the intensive nature of the object detection, we are using a local web server to process the image and find the objects within it.  The web server returns an error vector which the '''Pi''' converts to a control vector. It can then adjust its angle to keep the tracked object in the middle of the frame.  In order to dramatically decrease the complexity of the project, we would have liked to perform all the processing on the '''Pi''' as well; however, we were unable to get a reasonable response time with either the Pi or the BeagleBone.
+
== Packaging ==
 +
In the spirit of "small build, big execution," we created an enclosure for our project out of MDF. We CNC-cut two pieces that were then glued together, sanded, and painted. A notch was cut into the back so that the cables can reach the Raspberry Pi, which is mounted on the underside.  The Pi was mounted using 4x 3M plastic standoffs and some 3M screws. The tilt pan kit was mounted on the top piece, and a hole was drilled through it to give the servo motor cables access to the Pi.
 +
 
 +
Below you can see pictures of the assembly:
  
Give two sentences telling what works.
+
<gallery>
Give two sentences telling what isn't working:
+
Side.jpg|Side of system
End with a two sentence conclusion.
+
Front.jpg|Front of system
 +
Sauron_Underside.jpg|Underside of system
 +
</gallery>
  
The sentence count is approximate and only to give an idea of the expected length.
+
== Installation/User Instructions ==
  
== Packaging ==
+
These are step by step instructions on how to install and run this project.
If you have hardware, consider [http://cpprojects.blogspot.com/2013/07/small-build-big-execuition.html Small Build, Big Execuition] for ideas on the final packaging.
 
  
== Installation Instructions ==
+
* You can find our GitHub page at the following link: [https://github.com/LeelaPakanati/ECE434_Sauron.git https://github.com/LeelaPakanati/ECE434_Sauron.git].
  
Give step by step instructions on how to install your project. 
+
=== Install Requirements ===
  
* Include your [https://github.com/ github] path as a link like this to the read-only git site:  [https://github.com/MarkAYoder/gitLearn https://github.com/MarkAYoder/gitLearn].
+
* To install all the requirements automatically, run install_host.sh on the host machine and install_pi.sh on the Pi. A full list of the installed packages can be found in README.md.
* Be sure your README.md includes an up-to-date and clear description of your project so that someone who comes across your git repository can quickly learn what you did and how they can reproduce it.
 
* Include a Makefile for your code if using C.
 
* Include any additional packages installed via '''apt'''.  Include '''install.sh''' and '''setup.sh''' files.
 
* Include kernel mods.
 
* If there is extra hardware needed, include links to where it can be obtained.
 
  
== User Instructions ==
+
=== Setup ===
  
Once everything is installed, how do you use the program?  Give details here, so if you have a long user manual, link to it here.
+
==== IP Address Setup ====
  
== Highlights ==
+
*First, get the IP addresses of both the compute server and the SBC client. Note that if these devices are not on the same local area network, the server's port must be forwarded so that it is reachable over the network.
 +
*Then enter the server's IP address in eye.py as the value of server_ip. Similarly, enter the SBC client's IP address in tower.py as the value of client_ip.
  
Here is where you brag about what your project can do.
+
==== Running ====
  
Include a [http://www.youtube.com/ YouTube] demo the audio description.
+
* Run ./tower.py <object to track> on the host
 +
* Run ./eye.py on the Pi
  
 
== Theory of Operation ==
 
== Theory of Operation ==
  
[[File:High level diagram.png|frame|center|High-level Hardware Overview]]
 
  
 
=== Hardware ===
 
=== Hardware ===
  
[[File:Fritzing Diagram|frame|center|Hardware Schematic]]
+
[[File:Fritzing Diagram.png|400px|thumb|none|left|Schematic]]
  
 
=== Software ===
 
=== Software ===
  
Give a high level overview of the structure of your software.  Are you using GStreamer?  Show a diagram of the pipeline.  Are you running multiple tasks?  Show what they do and how they interact.
+
[[File:High level diagram.png|400px|thumb|none|left|High-level Hardware Overview]]
  
  
== Work Breakdown ==
+
The camera sends the image to the Raspberry Pi over USB.  The Pi then sends the image to the web server.  The web server processes the image, finds the nearest person for which it has the highest confidence, and returns to the Pi an error vector giving the offset between the identified object and the center of the frame.  Using a PID control loop, the Pi converts this error vector into a control vector.  Finally, the control vector is turned into PWM signals that are sent to each servo.
  
Milestones:
+
This whole process takes anywhere from 100 to 130 ms.  Our greatest bottleneck is the time it takes to transfer the image to and from the web server.  Despite the delays inherent in transferring files over the network, this is still significantly faster than doing all the processing on the Pi or the BeagleBone.  Because of the hardware limitations of the two boards, processing a single image took nearly 3 seconds on the Pi and 6 seconds on the BeagleBone.
  
Getting OpenCV on Pi/Beaglebone (10/28)
+
In an effort to further optimize our system, we used both cores on the Pi to parallelize some of the tasks.  Currently, the image transfer and display are handled on one core, while the control loop runs on the other.  This helps preserve the timing of the control loop and takes some load off the core that handles the image.
  
Testing Pi vs Beaglebone operation (10/29)
+
== Highlights: ==
  
Image sending and receiving (11/5)
+
*The project uses OpenCV and TensorFlow models for object detection.
 +
*The servos accurately track a person even if the detector temporarily loses track of them, thanks to the integral term of the PID loop.
 +
*The display classifies all objects, not only the one it is tracking, so you can see all of its detections.
 +
*The project uses a rudimentary form of cloud computing for the neural network detection.
 +
*The project uses a TCP socket for reliable communication.
  
Web server configuration (11/5)
+
== Work Breakdown: ==
 
 
Servo and tilt pan kit assembly (11/10)
 
 
 
Enclosure design and construction (11/16)
 
 
 
Documentation (11/19)
 
  
 +
*Getting OpenCV on Pi/BeagleBone (10/28) - Leela
 +
*Testing Pi vs BeagleBone operation (10/29) - Both
 +
*Image sending and receiving (11/5) - Leela
 +
*Web server configuration (11/5) - Leela
 +
*Servo and tilt pan kit assembly (11/10) - Paul
 +
*Control loop and tuning for servos (11/14) - Paul
 +
*Enclosure design and construction (11/16) - Paul
 +
*Documentation (11/19) - Paul
  
 
== Future Work ==
 
== Future Work ==
  
Creating our own libraries to train our model on would be a very interesting addition to this project.  This would allow us to detect and recognize individuals and only track certain people.   
+
*Creating our own image libraries to train our model on would be a very interesting addition to this project.  This would allow us to recognize individuals and track only certain people.
 
 
Making the tilt pan kit more robust would allow us to mount a nicer camera to the system and would significantly improve the image quality as well as the recognition accuracy.
 
  
 +
*Making the tilt pan kit more robust would allow us to mount a nicer camera to the system and would significantly improve the image quality as well as the recognition accuracy.
  
 +
*Since we are mostly tracking only a single object, it would be interesting to look into other object detection algorithms.  About 40 ms of the delay we are experiencing comes from processing the image.  If, instead of analyzing the whole image to find all the objects, we searched only blobs of the image, we could significantly decrease the amount of computation needed.  This could perhaps even allow us to run all the computation on the Pi or BeagleBone.
  
 
== Conclusions ==
 
== Conclusions ==
  
Give some concluding thoughts about the project. Suggest some future additions that could make it even more interesting.
+
This was a very interesting project overall that introduced a lot of new concepts with which neither of us had any experience. We ran into some difficulties and roadblocks, largely due to the hardware limitations of the Pi, but this was a good starting point and it leaves us plenty to build on if we decide to continue improving the system.

Latest revision as of 15:14, 20 November 2019


Team members: [Paul Wilda, Leela Pakanati]

Grading Template

I'm using the following template to grade. Each slot is 10 points. 0 = Missing, 5=OK, 10=Wow!!

00 Executive Summary
00 Installation Instructions 
00 User Instructions
00 Highlights
00 Theory of Operation
00 Work Breakdown
00 Future Work
00 Conclusions
00 Demo
00 Late
Comments: I'm looking forward to seeing this.

Score:  10/100

(Inline Comment)

Executive Summary

Picture of fully functional in-class demo

We use TensorFlow and OpenCV to detect objects in the frame of a web camera. The camera is mounted on a tilt pan kit so that it can also track an object in the frame. Because object detection is computationally intensive, a local computation server processes each image and finds the objects within it. The computation server returns a processed image and an error vector, which the Pi converts to a control vector. The Pi can then display the processed image and adjust the camera angle to keep the tracked object in the middle of the frame. To dramatically decrease the complexity of the project, we would have liked to perform all the processing on the Pi as well; however, we were unable to get a reasonable response time with either the Pi or the BeagleBone. The Raspberry Pi takes at least 3 seconds to process each image and the BeagleBone Black at least 5 seconds.

Packaging

In the spirit of "small build, big execution," we created an enclosure for our project out of MDF. We CNC-cut two pieces that were then glued together, sanded, and painted. A notch was cut into the back so that the cables can reach the Raspberry Pi, which is mounted on the underside. The Pi was mounted using 4x 3M plastic standoffs and some 3M screws. The tilt pan kit was mounted on the top piece, and a hole was drilled through it to give the servo motor cables access to the Pi.

Below you can see pictures of the assembly:

Installation/User Instructions

These are step-by-step instructions on how to install and run this project.

Install Requirements

  • To install all the requirements automatically, run install_host.sh on the host machine and install_pi.sh on the Pi. A full list of the installed packages can be found in README.md.

Setup

IP Address Setup

  • First, get the IP addresses of both the compute server and the SBC client. Note that if these devices are not on the same local area network, the server's port must be forwarded so that it is reachable over the network.
  • Then enter the server's IP address in eye.py as the value of server_ip. Similarly, enter the SBC client's IP address in tower.py as the value of client_ip.
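
Before editing the two scripts, the addresses can be sanity-checked with a few lines of Python. This is only a sketch: server_ip and client_ip are the variable names mentioned above, but the addresses shown are placeholders, the helper names are made up for illustration, and the /24 prefix is an assumption about a typical home or lab network.

```python
import ipaddress


def check_ip(value):
    """Return the normalised address, or raise ValueError if it is malformed."""
    return str(ipaddress.ip_address(value))


def same_subnet(a, b, prefix=24):
    """True if both addresses fall in the same /prefix network (assumed /24)."""
    network = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in network


server_ip = check_ip("192.168.1.10")   # placeholder: compute server address
client_ip = check_ip("192.168.1.20")   # placeholder: SBC client address

if not same_subnet(server_ip, client_ip):
    print("Different subnets: the server's port must be forwarded.")
```

If the two addresses do not share a subnet, the port-forwarding note in the first step applies.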

Running

  • Run ./tower.py <object to track> on the host
  • Run ./eye.py on the Pi

Theory of Operation

Hardware

Schematic

Software

High-level Hardware Overview


The camera sends the image to the Raspberry Pi over USB. The Pi then sends the image to the web server. The web server processes the image, finds the nearest person for which it has the highest confidence, and returns to the Pi an error vector giving the offset between the identified object and the center of the frame. Using a PID control loop, the Pi converts this error vector into a control vector. Finally, the control vector is turned into PWM signals that are sent to each servo.
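
The control path described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual code: it assumes the server reports a bounding box in pixels, that one PID controller runs per axis, and that the servos accept 1000 to 2000 microsecond pulses over a 0 to 180 degree range. The names error_vector, PID, and angle_to_pulse_us, as well as any gains, are hypothetical.

```python
from dataclasses import dataclass


def error_vector(box, frame_w, frame_h):
    """Offset in pixels of the box centre from the frame centre."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2 - frame_w / 2,
            (y_min + y_max) / 2 - frame_h / 2)


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error, dt):
        """Return the control output for one axis after a step of dt seconds."""
        self.integral += error * dt                  # accumulated error
        derivative = (error - self.prev_error) / dt  # rate of change of error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def angle_to_pulse_us(angle_deg):
    """Map 0..180 degrees to a 1000..2000 us pulse (assumed servo range)."""
    angle = max(0.0, min(180.0, angle_deg))
    return 1000.0 + angle / 180.0 * 1000.0
```

For a 640x480 frame and a box spanning (400, 200) to (500, 300), error_vector returns (130.0, 10.0). Feeding each component to its own PID instance gives a per-axis correction that can be added to the current servo angle before it is converted to a pulse width.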

This whole process takes anywhere from 100 to 130 ms. Our greatest bottleneck is the time it takes to transfer the image to and from the web server. Despite the delays inherent in transferring files over the network, this is still significantly faster than doing all the processing on the Pi or the BeagleBone. Because of the hardware limitations of the two boards, processing a single image took nearly 3 seconds on the Pi and 6 seconds on the BeagleBone.
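
To put these figures in perspective, the loop time can be converted to an update rate. The short Python calculation below uses only the timings quoted in this section; the function name is illustrative.

```python
def updates_per_second(loop_ms):
    """Convert the time for one full loop (in milliseconds) to updates per second."""
    return 1000.0 / loop_ms


timings_ms = [
    ("offloaded, best case", 100),
    ("offloaded, worst case", 130),
    ("on the Pi", 3000),
    ("on the BeagleBone", 6000),
]

for label, ms in timings_ms:
    print(f"{label}: {updates_per_second(ms):.1f} updates/s")
```

Offloading gives roughly 8 to 10 corrections per second, compared with about one correction every 3 to 6 seconds when processing on the boards themselves.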

In an effort to further optimize our system, we used both cores on the Pi to parallelize some of the tasks. Currently, the image transfer and display are handled on one core, while the control loop runs on the other. This helps preserve the timing of the control loop and takes some load off the core that handles the image.
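
One way to split the work in this manner is with Python's multiprocessing module, sketched below. This is an assumption about the approach rather than a copy of the project's code: the worker bodies are placeholders, the function names are hypothetical, and a queue with a None sentinel stands in for however the real processes exchange error vectors.

```python
import multiprocessing as mp


def image_worker(errors, frames):
    """Stand-in for capture, transfer and display: publishes one error per frame."""
    for i in range(frames):
        errors.put((float(i), 0.0))   # placeholder for the server's error vector
    errors.put(None)                  # sentinel: no more frames


def control_worker(errors, results):
    """Stand-in for the control loop: consumes errors until the sentinel arrives."""
    handled = 0
    while True:
        item = errors.get()
        if item is None:
            break
        handled += 1                  # a real loop would run the PID step here
    results.put(handled)


def run_both(frames):
    """Run the two workers as separate processes and return the handled count."""
    errors, results = mp.Queue(), mp.Queue()
    workers = [mp.Process(target=image_worker, args=(errors, frames)),
               mp.Process(target=control_worker, args=(errors, results))]
    for w in workers:
        w.start()
    handled = results.get()           # read the result before joining
    for w in workers:
        w.join()
    return handled
```

Calling run_both(5) from under an `if __name__ == "__main__":` guard starts both processes, and the operating system is then free to schedule them on separate cores.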

Highlights:

  • The project uses OpenCV and TensorFlow models for object detection.
  • The servos accurately track a person even if the detector temporarily loses track of them, thanks to the integral term of the PID loop.
  • The display classifies all objects, not only the one it is tracking, so you can see all of its detections.
  • The project uses a rudimentary form of cloud computing for the neural network detection.
  • The project uses a TCP socket for reliable communication.
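
Because TCP delivers a byte stream rather than discrete messages, sending images over a socket usually needs some framing. The Python sketch below shows one common approach, a 4-byte length prefix. It is an assumption for illustration only: the project's actual wire format is not described here, and send_msg, recv_exact, and recv_msg are hypothetical names.

```python
import socket
import struct

HEADER = struct.Struct("!I")   # 4-byte big-endian payload length


def send_msg(sock, payload):
    """Send one length-prefixed message (payload is a bytes object)."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed message and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

An encoded frame (for example JPEG bytes) would be passed to send_msg on one side and recovered whole by recv_msg on the other, regardless of how TCP splits the data in transit.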

Work Breakdown:

  • Getting OpenCV on Pi/BeagleBone (10/28) - Leela
  • Testing Pi vs BeagleBone operation (10/29) - Both
  • Image sending and receiving (11/5) - Leela
  • Web server configuration (11/5) - Leela
  • Servo and tilt pan kit assembly (11/10) - Paul
  • Control loop and tuning for servos (11/14) - Paul
  • Enclosure design and construction (11/16) - Paul
  • Documentation (11/19) - Paul

Future Work

  • Creating our own image libraries to train our model on would be a very interesting addition to this project. This would allow us to recognize individuals and track only certain people.
  • Making the tilt pan kit more robust would allow us to mount a nicer camera to the system and would significantly improve the image quality as well as the recognition accuracy.
  • Since we are mostly tracking only a single object, it would be interesting to look into other object detection algorithms. About 40 ms of the delay we are experiencing comes from processing the image. If, instead of analyzing the whole image to find all the objects, we searched only blobs of the image, we could significantly decrease the amount of computation needed. This could perhaps even allow us to run all the computation on the Pi or BeagleBone.
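
One possible reading of the last idea is to search only a window around the previous detection instead of the full frame. The Python sketch below illustrates that interpretation, which is a simple region-of-interest approach and not necessarily the blob search the authors have in mind; the function names and the 50% margin are assumptions.

```python
def search_window(last_box, frame_w, frame_h, margin=0.5):
    """Grow the last box by `margin` of its size per side, clamped to the frame."""
    x_min, y_min, x_max, y_max = last_box
    pad_x = (x_max - x_min) * margin
    pad_y = (y_max - y_min) * margin
    return (max(0, int(x_min - pad_x)),
            max(0, int(y_min - pad_y)),
            min(frame_w, int(x_max + pad_x)),
            min(frame_h, int(y_max + pad_y)))


def area_fraction(window, frame_w, frame_h):
    """Fraction of the full frame covered by the window."""
    x_min, y_min, x_max, y_max = window
    return (x_max - x_min) * (y_max - y_min) / (frame_w * frame_h)
```

For a 640x480 frame and a previous box of (270, 140, 370, 340), search_window returns (220, 40, 420, 440), which covers about 26% of the frame. How much detection time this would actually save depends on the detector and is not measured here.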

Conclusions

This was a very interesting project overall that introduced a lot of new concepts with which neither of us had any experience. We ran into some difficulties and roadblocks, largely due to the hardware limitations of the Pi, but this was a good starting point and it leaves us plenty to build on if we decide to continue improving the system.