BeagleBoard/GSoC/2021 Proposal/YOLO models on the X15/AI
Student: Jakub Duchniewicz
Mentors: Hunyue Yau
Code: not yet created!
GSoC: YOLO Models on the X15/AI
Currently discussing tentative ideas with Hunyue Yau and others on the #beagle-gsoc IRC channel.
Please complete the requirements listed on the ideas page and fill out this template.
School: University of Turku/KTH Royal Institute of Technology
Primary language: Polish
Typical work hours: 8AM-5PM CET
Previous GSoC participation: Participating in GSoC, especially with BeagleBoard, would further develop my software and hardware (BB X15/AI architecture) skills and let me apply my current knowledge to the benefit of both myself and the open source community. I aim to deliver a component that stays usable with upcoming releases of the YOLO model and, hopefully, with other models.
About your project
Project name: YOLO models on the X15/AI (with an extensible interface for other models)
The main idea of the project is to accelerate Deep Learning models using the hardware resources available on the BB X15 and BB AI platforms. Current inference times range from 15 to 35 seconds per frame, which is far too slow for any real-time application, or even for a laggy but bearable one. This project aims to bring those times down and to enable efficient deployment of other models once the Texas Instruments Deep Learning (TIDL) library supports them (support for RNNs, LSTMs and GRUs is planned).
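To put the figures above in perspective, 15 to 35 seconds per frame corresponds to roughly 0.07 to 0.03 frames per second. The short sketch below computes the speed-up needed to reach a given frame rate. The function names are hypothetical, and any target frame rate passed in (for example 1 FPS) is an illustrative assumption, not a committed project goal.

```cpp
#include <cassert>

// Frames per second achieved at a given per-frame latency in seconds.
inline double fps_from_latency(double seconds_per_frame) {
    assert(seconds_per_frame > 0.0);  // a latency of zero or less is meaningless
    return 1.0 / seconds_per_frame;
}

// Factor by which inference must get faster to reach target_fps,
// starting from the measured per-frame latency.
inline double required_speedup(double seconds_per_frame, double target_fps) {
    assert(seconds_per_frame > 0.0 && target_fps > 0.0);
    return seconds_per_frame * target_fps;
}
```

Under these assumptions, reaching even 1 FPS from the reported range would require a speed-up of 15x to 35x.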
As more and more developers recognize the benefits of DL and of specialized hardware for accelerating it, such support is vital for the BeagleBoard community. Additionally, adding such a component may encourage new developers interested in DL solutions for embedded systems to join the effort and grow the BB developer community.
The main focus of this project is to accelerate the YOLOv3 model with the TIDL library, using C++ and possibly some intrinsics. In the past, some YOLOv3 layers were not supported by TIDL; this should be fixed in the current release (there is no confirmation on the forums for the AM5729, but the release notes of PSDK 6.03 mention support for using EVEs and DSPs simultaneously).
The picture above visualizes how this component will fit into the TIDL ecosystem and make deploying DL models on BB easier.
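If some YOLOv3 layers turn out to remain unsupported by TIDL, one possible approach is to split the network into consecutive groups of layers and run each group either on the accelerator or on the ARM CPU. The following is a minimal sketch of that partitioning step only. The names `Device`, `LayerGroup` and `partition_layers` are hypothetical, the layer type strings are placeholders, and the set of accelerated types is an input supplied by the caller, not a statement about what TIDL actually supports.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Where a group of layers would run (hypothetical two-way split).
enum class Device { Accelerator, Cpu };

// A run of consecutive layers assigned to the same device.
struct LayerGroup {
    Device device;
    std::vector<std::string> layers;
};

// Walk the network in order and start a new group every time the
// target device changes, so each group is a contiguous slice.
std::vector<LayerGroup> partition_layers(
        const std::vector<std::string>& layer_types,
        const std::set<std::string>& accelerated_types) {
    std::vector<LayerGroup> groups;
    for (const auto& type : layer_types) {
        const Device device = accelerated_types.count(type) != 0
                                  ? Device::Accelerator
                                  : Device::Cpu;
        if (groups.empty() || groups.back().device != device) {
            groups.push_back({device, {}});
        }
        groups.back().layers.push_back(type);
    }
    // An empty network yields no groups; a non-empty one yields at least one.
    assert(groups.empty() == layer_types.empty());
    return groups;
}
```

Fewer, larger groups are generally preferable, since every device change implies moving intermediate tensors between memory spaces; whether that cost is acceptable on the AM5729 would have to be measured.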
/Begin unstructured
CURRENT DOUBTS:
Can all layers be accelerated with TIDL? If not, maybe accelerate some steps with ARM NEON?
Is building an extensible interface in scope for the solution?
Maybe try some additional acceleration for non-AI platforms.
Probably a library of various supported networks? -> wrapping the execution model somehow so the user can choose: "I want the YOLO model on this particular HW".
What about utilizing the DSPs?
Double buffering also?
/End unstructured
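On the double buffering question above, the idea is to fill one buffer with the next frame while the previous frame is still being processed. The sketch below illustrates the pattern with two buffers and `std::async` from the C++ standard library. It assumes capture and inference can run concurrently; `std::async` merely stands in for whatever asynchronous mechanism the accelerator API provides, and `Frame`, `PingPongBuffers` and `run_pipeline` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical frame type: raw pixel bytes.
using Frame = std::vector<unsigned char>;

// Two buffers: one being filled (back), one being processed (front).
class PingPongBuffers {
public:
    Frame& back() { return buffers_[back_index_]; }
    const Frame& front() const { return buffers_[1 - back_index_]; }
    void swap() { back_index_ = 1 - back_index_; }

private:
    std::array<Frame, 2> buffers_;
    std::size_t back_index_ = 0;
};

// capture(i, frame) fills `frame` with frame number i;
// infer(frame) processes one frame. Returns the number of frames processed.
template <typename Capture, typename Infer>
std::size_t run_pipeline(std::size_t frame_count, Capture capture, Infer infer) {
    if (frame_count == 0) return 0;
    PingPongBuffers buffers;
    capture(0, buffers.back());  // prime the pipeline with the first frame
    buffers.swap();
    std::size_t processed = 0;
    for (std::size_t i = 0; i < frame_count; ++i) {
        std::future<void> next;
        if (i + 1 < frame_count) {
            // Fill the back buffer in the background...
            next = std::async(std::launch::async,
                              [&, i] { capture(i + 1, buffers.back()); });
        }
        infer(buffers.front());  // ...while the front buffer is processed.
        ++processed;
        if (next.valid()) {
            next.get();      // wait until the next frame is complete
            buffers.swap();  // only swap once both sides are idle
        }
    }
    assert(processed == frame_count);
    return processed;
}
```

The swap happens only after both the capture and the inference of the current iteration have finished, so the two sides never touch the same buffer at the same time. The benefit is bounded by the slower of the two stages.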
In 10-20 sentences, what are you making, for whom, why and with what technologies (programming languages, etc.)? (We are looking for open source SOFTWARE submissions.)
Using YOLOv4 instead of YOLOv3: it runs 1 FPS faster than v3 on a Jetson Nano. Using TFLite instead of TIDL: TI's plans for the TIDL library do not look promising, so there will probably be encouragement to use TFLite and AWS SageMaker. However, this removes some fine-grained control over the model inference pipeline.
The timeline needs more detailed steps; there is still not enough data. It will certainly include setting up the environment and deploying sample programs on the board, followed by either creating the scaffolding first and the YOLO model bindings afterwards, or doing a very quick deployment of YOLO first and creating proper abstractions afterwards.
Provide a development timeline with a milestone each of the 11 weeks and any pre-work. (A realistic timeline is critical to our selection process.)
|Mar 29||Applications open, Students register with GSoC, work on proposal with mentors|
|Apr 13||Proposal complete, Submitted to https://summerofcode.withgoogle.com|
|May 17||Proposal accepted or rejected|
|Jun 07||Pre-work complete, Coding officially begins!|
|Jun 17||Milestone #1, Introductory YouTube video|
|Jun 24||Milestone #2|
|Jun 30||Milestone #3|
|Jul 12 18:00 UTC||Milestone #4, Mentors and students can begin submitting Phase 1 evaluations|
|Jul 16 18:00 UTC||Phase 1 Evaluation deadline|
|Jul 23||Milestone #5|
|Jul 30||Milestone #6|
|Aug 06||Milestone #7|
|Aug 10||Milestone #8, Completion YouTube video|
|Aug 16 - 26 18:00 UTC||Final week: Students submit their final work product and their final mentor evaluation|
|Aug 23 - 30 18:00 UTC||Mentors submit final student evaluations|
Experience and approach
I have a strong programming background in embedded Linux and operating systems, gained as a Junior Software Engineer at Samsung Electronics from December 2017 to March 2020. During that time I also developed a game engine (PolyEngine) in C++ and gave talks on modern C++ as Vice-President of the Game Development Student Group "Polygon".
Apart from that, I completed my Bachelor's degree at Warsaw University of Technology, successfully defending my thesis titled "FPGA Based Hardware Accelerator for Musical Synthesis for Linux System". For it, I wrote a polyphonic musical synthesizer in Verilog, capable of producing various waveforms, and deployed it on a DE0-Nano-SoC FPGA board. I also wrote two kernel drivers, one of which implemented an ALSA sound device and was responsible for proper synchronization of DMA transfers.
In my professional work I often had to complete tasks under time pressure and scope them properly. Based on this experience, I believe this project is deliverable in the given time frame.
In 5-15 sentences, convince us you will be able to successfully complete your project in the timeline you have described.
Since I am used to tackling seemingly insurmountable challenges, I will first of all keep calm and try to come up with an alternative approach if I get stuck along the way. The internet is a vast ocean of knowledge, and time and again I have received help from benevolent strangers on Reddit and other forums. Since I believe humans solve problems best collaboratively, I will contact #beagle, #beagle-gsoc and relevant subreddits (I have received tremendous help on /r/FPGA, /r/embedded and /r/askelectronics in the past).
If all else fails, I may be forced to change my approach and backtrack, but this will not be a big problem: the knowledge gained will not be lost and will only improve my later attempts. Alternatively, I can focus on documenting my progress in blog posts and videos while waiting for my mentor to come back to cyberspace.
The BB X15 and BB AI will be able to perform inference with YOLOv3 models in near real time (perhaps even making these boards usable for complex computer vision tasks). Additionally, the BeagleBoard software codebase will gain a good interface for deploying other models, one that abstracts the low-level details of interacting with TIDL (or whatever library future boards may use). The software will be ready for the rollout of newer and more advanced models.
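One way the model-deployment interface mentioned above might look is an abstract detector class plus a registry keyed by model and hardware name, so that a user can ask for "this model on this hardware" without touching the underlying library. This is only a sketch under assumptions: `Detection`, `Detector`, `NullDetector` and `DetectorRegistry` are hypothetical names, the key strings are placeholders, and no real TIDL calls are shown.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// One detected object: class, score and bounding box (hypothetical layout).
struct Detection {
    int class_id;
    float confidence;
    float x, y, width, height;
};

// Backend-independent interface; a TIDL-based backend would implement it.
class Detector {
public:
    virtual ~Detector() = default;
    virtual std::vector<Detection> detect(
        const std::vector<unsigned char>& rgb_frame) = 0;
};

// Placeholder backend that reports nothing; useful for wiring and tests.
class NullDetector : public Detector {
public:
    std::vector<Detection> detect(const std::vector<unsigned char>&) override {
        return {};
    }
};

// Maps a (model, hardware) pair to a factory for the matching backend.
class DetectorRegistry {
public:
    using Factory = std::function<std::unique_ptr<Detector>()>;

    void add(const std::string& model, const std::string& hardware,
             Factory factory) {
        assert(factory && "factory must be callable");
        factories_[std::make_pair(model, hardware)] = std::move(factory);
    }

    // Throws std::invalid_argument if no backend was registered for the pair.
    std::unique_ptr<Detector> create(const std::string& model,
                                     const std::string& hardware) const {
        const auto it = factories_.find(std::make_pair(model, hardware));
        if (it == factories_.end()) {
            throw std::invalid_argument("no backend for " + model + " on " +
                                        hardware);
        }
        return it->second();
    }

private:
    std::map<std::pair<std::string, std::string>, Factory> factories_;
};
```

Keeping the registry separate from the backends means a new model or a new accelerator library can be added by registering another factory, without changing code that only uses `Detector`.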
Quotes?? If successfully completed, what will its impact be on the BeagleBoard.org community? Include quotes from BeagleBoard.org community members who can be found on http://beagleboard.org/discuss and http://bbb.io/gsocchat.
The PR is available here.
Is there anything else we should have asked you?