BeagleBoard/beaglecv stereo

=beaglecv_stereo=

Stereo vision support for the BeagleBone Black/Blue using BeagleCV, a minimized fork of libcvd accelerated with custom OpenGLES2 shaders running on the on-board SGX530 3D accelerator.

Student: Kumar Lekkala
Mentors: Michael Welling, ds2, Jason Kridner
Code: https://github.com/kiran4399/beaglecv
Wiki: http://elinux.org/BeagleBoard/beaglecv_stereo
GSoC: GSoC entry

=Status=
This project is currently just a proposal.

=Proposal=
Please complete the requirements listed on the ideas page and fill out this template.

==About you==
IRC: kiran4399
Github: https://github.com/kiran4399
School: Indian Institute of Information Technology, SriCity
Country: India
Primary language (We have mentors who speak multiple languages): English
Typical work hours (We have mentors in various time zones): 8AM-5PM IST
Previous GSoC participation: https://summerofcode.withgoogle.com/archive/2016/projects/6295262146330624/

==About your project==
Project name: Stereo Vision support for BeagleBone using BeagleCV

The aim of the project is to add stereo vision support to the BeagleBone Black/Blue using BeagleCV. This consists of developing the BeagleCV library and creating custom OpenGLES2 shaders that use the on-board SGX530 3D accelerator for faster computation. The APIs developed around these shaders will allow other users to write their own computer vision algorithms with the same acceleration. I will use these APIs to implement a stereo vision algorithm. Finally, if time permits, this project will be integrated into the BeagleBone Blue APIs.

==Description==
The kernel I will be using is 4.4.56-bone17. This version has the prebuilt kernel modules omaplfb, tilcdc and pvrsrvkm, which are essential for running the SGX530. The following is the complete set of project goals, and their challenges, which I plan to deliver by the end of the tenure:

Creating shaders for utilizing the SGX530: The BeagleBone Blue/Black has an inbuilt PowerVR SGX530 3D accelerator which is capable of performing image processing. By the end of this project, I will create basic shaders using the OpenGLES2 graphics library and the GLSL 1.0 shading language. Fragment shaders (FS) will be used to modify pixel values, and vertex shaders (VS) will be used for transformation and computation of the indices of the image (stored as a sampler2D). Some of the challenges in implementing image processing algorithms this way are as follows:

Texture compression: By default I will use only greyscale images for image processing. The algorithms perform best when the size of the images is reduced, so it is apt to convert them to greyscale in the PowerVR Texture Compression (PVRTC) format, with a compression ratio of 8:1, for better performance.

Floating point precision control: Use of a floating point unit has a large impact on performance. For example, a 5x5 Gaussian filter on an image took 3074.8 ms on a CPU, 207.1 ms on a CPU with a fixed-point implementation, but only 48.90 ms in a parallel implementation on the SGX530. The OpenGLES Shading Language has 3 precision modifiers:
highp: single-precision, 32-bit floating point value
mediump: half-precision, 16-bit floating point value
lowp: 10-bit fixed-point format with a precision of 1/256
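To make the precision trade-off concrete, here is a small Python sketch (illustrative only, not project code) that emulates the storage step sizes of the lowp and mediump qualifiers on values in the 0.0 to 1.0 range. The step sizes are assumptions based on the minimum precisions listed above.

```python
# Emulate OpenGL ES SL precision qualifiers for values in [0.0, 1.0].
# lowp is modelled as fixed point with a precision of 1/256; mediump is a
# rough stand-in for a half-float's ~10-bit mantissa near 1.0.

def quantize_lowp(x):
    """Snap x to the nearest multiple of 1/256 (lowp fixed point)."""
    return round(x * 256) / 256

def quantize_mediump(x):
    """Approximate half-precision storage near 1.0 (1/1024 steps)."""
    return round(x * 1024) / 1024

color = 0.7357  # a greyscale intensity in [0, 1]
print(quantize_lowp(color))     # 0.734375, the nearest multiple of 1/256
print(quantize_mediump(color))  # a finer approximation
# lowp error is bounded by half a step (1/512):
print(abs(color - quantize_lowp(color)) <= 1 / 512)  # True
```

This makes visible why lowp is adequate for colors (the error is below what an 8-bit display can show) while being risky for texture coordinates or accumulated sums.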

Choosing a lower precision increases performance but may introduce artifacts. I will use lowp precision to represent colors in the range 0.0 to 1.0 to enhance the performance of the GPU.

Load sharing between VS and FS: In image processing algorithms, the number of vertices processed is much lower than the total number of fragments, which run into the millions. Operations per vertex are therefore significantly cheaper than operations per fragment, so it is generally recommended to perform calculations per vertex where possible. For example, in filtering, a straightforward optimization is to precompute the neighbouring texture coordinates in the vertex shader. By moving these calculations to the vertex shader and using the precomputed texture coordinates directly, the fragment shader avoids dependent texture reads.

Branching and loop unrolling: Each iteration of a loop needs extra instructions for the increment and compare operations. I will try to eliminate loops, either through optimized unrolling or by using vector operations in the shader, to achieve higher performance. Where a loop cannot be unrolled, a constant loop count will be maintained so that dynamic branching is reduced. Similarly, for branching I will try to branch only on constant, known values, since branching on a value computed inside the shader results in significantly lower performance.
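The unrolling idea can be sketched outside GLSL. In a real shader the unrolled form would be nine literal texture reads and multiply-adds, but the structure is the same; this is a plain-Python illustration, not project code.

```python
# Loop unrolling for a 3x3 convolution at one pixel. The looped form pays
# per-iteration increment/compare costs (and in a shader may force dynamic
# branching); the unrolled form is straight-line code with no bookkeeping.

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
k = [[0,  1, 0],
     [1, -4, 1],
     [0,  1, 0]]  # a Laplacian kernel, as an example

def conv_looped(img, k):
    acc = 0.0
    for dy in range(3):          # loop overhead per tap
        for dx in range(3):
            acc += img[dy][dx] * k[dy][dx]
    return acc

def conv_unrolled(img, k):
    # fully unrolled: nine explicit multiply-adds
    return (img[0][0]*k[0][0] + img[0][1]*k[0][1] + img[0][2]*k[0][2]
          + img[1][0]*k[1][0] + img[1][1]*k[1][1] + img[1][2]*k[1][2]
          + img[2][0]*k[2][0] + img[2][1]*k[2][1] + img[2][2]*k[2][2])

print(conv_looped(img, k) == conv_unrolled(img, k))  # True
```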

Porting libcvd as BeagleCV: One of the most important deliverables of this project is the BeagleCV library. BeagleCV is a minimized fork of libcvd (a real-time vision library). Sequential execution in the library will be replaced with the optimized shaders, which will be accessible through APIs developed alongside them. This will allow other users to write their own CV algorithms with the same acceleration. I will implement stereo matching and a generic feature extractor by the end of this project using these APIs.

Stereo vision implementation: Implementing a stereo matching algorithm to compute a disparity map from stereo images. Block Matching (BM) is the most widely used stereo matching algorithm in the embedded community because of its favourable computational characteristics for parallel implementation. The cost function in this local block matching is NCC (Normalized Cross-Correlation). To improve the accuracy of the disparity map, an LR-RL consistency check can be included, which will be implemented in a separate shader.
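The pipeline above can be sketched in a few lines of Python, reduced to a single scanline for brevity. The shader version would evaluate the same NCC cost per fragment over 2D support windows; the names here (ncc, match) are illustrative and not the BeagleCV API.

```python
# Minimal local block matching with an NCC cost and an LR-RL consistency
# check, on one scanline. A tiny synthetic pair stands in for real images.
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length windows."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) *
                    sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def match(src, dst, x, max_disp, w=1, sign=-1):
    """Best disparity for pixel x of `src` against `dst`.
    sign=-1 searches leftwards (L->R matching); sign=+1 the reverse."""
    win = src[x - w:x + w + 1]
    best, best_d = -2.0, 0
    for d in range(max_disp + 1):
        xr = x + sign * d
        if xr - w < 0 or xr + w + 1 > len(dst):
            break
        score = ncc(win, dst[xr - w:xr + w + 1])
        if score > best:
            best, best_d = score, d
    return best_d

# Synthetic pair: the right view is the left view shifted by 2 pixels.
left  = [10, 30, 90, 40, 20, 70, 50, 60, 80, 35]
right = left[2:] + [0, 0]

x = 5
d_lr = match(left, right, x, max_disp=4, sign=-1)          # L -> R
d_rl = match(right, left, x - d_lr, max_disp=4, sign=+1)   # R -> L check
consistent = (d_lr == d_rl)   # LR-RL check: keep the pixel only if equal
print(d_lr, consistent)  # 2 True
```

Pixels that fail the consistency check are typically marked invalid (occlusions or mismatches) and later filled, which is where the planned gap-filling shader comes in.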

Documentation and examples: I will provide extensive and accurate documentation for whatever I build in this project. Functional documentation for BeagleCV will be generated with Doxygen; code documentation will be written as comments in the source files. I will also create appropriate documentation on how to use the SGX530 on the BeagleBone Blue/Black.

(Future work) Adding BeagleCV support to the BeagleBone Blue APIs: Once BeagleCV is implemented and v2 is released, I will add support for it to the BeagleBone Blue API repository. This would enable users to implement sensor fusion algorithms that help in robotic localization, tracking, detection and navigation.

==Timeline==
Google Summer of Code stretches over a period of 12 weeks, with the Phase-1, Phase-2 and final evaluations in the 4th, 8th and 12th weeks respectively. The following are the timelines and milestones which I intend to follow strictly throughout the project tenure:

May 30 - June 13 (Week-1,2)
Aim: Implementing OpenGLES2 shaders for the Block Matching (BM) algorithm
Description: Utility functions for a test shader will be implemented and the APIs will be framed. Part 1 of the shader implementations; the following shader functions will be written:
Affine operations such as scaling, translation, rotation, shearing etc.
Image rectification
2x2, 3x3 and 4x4 convolution (users can create their own kernel)
LR-RL consistency checking

June 14 - June 27 (Week-3,4)
Aim: Cleaning, minimizing and testing bare BeagleCV
Description: Shaders (Part 2). The following shaders will be implemented, and APIs will be developed enabling users to utilize these shader functions:
Block operations (with block size 4x4)
Correlation computation (NCC with a 4x4 support window)
Matrix operations such as transpose, inverse etc.
Gap filling

June 28 - July 11 (Week-5,6)
Aim: Implementing BM using OpenGLES2 shaders
Description: The algorithm will be implemented using the shaders created in the previous weeks. Based on the performance of the algorithm, vertex shaders will be added to minimize the cost of computation. The implementation details can be found here.

July 12 - July 18 (Week-7)
Aim: Testing the stereo matching algorithm and checking performance
Description: Generating the final disparity map given 2 stereo images. Evaluating the algorithm using the popular Tsukuba stereo dataset. Checking the accuracy of the algorithm, fixing bugs and documenting the approach.
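A standard way to score a disparity map against ground truth on Tsukuba-style data is the bad-pixel rate: the percentage of pixels whose disparity differs from ground truth by more than a threshold. This is a hedged sketch of that metric, with tiny hand-made maps standing in for real output; the project's actual evaluation code may differ.

```python
# Bad-pixel rate for disparity-map evaluation. delta = 1 is the usual
# threshold on Tsukuba-style benchmarks.

def bad_pixel_rate(disp, gt, delta=1):
    """Percentage of pixels with |disp - gt| > delta."""
    flat = [(d, g) for row_d, row_g in zip(disp, gt)
                   for d, g in zip(row_d, row_g)]
    bad = sum(1 for d, g in flat if abs(d - g) > delta)
    return 100.0 * bad / len(flat)

computed = [[2, 2, 3, 5],
            [2, 1, 3, 3]]
truth    = [[2, 2, 2, 2],
            [2, 2, 3, 3]]
print(bad_pixel_rate(computed, truth))  # 12.5 (1 bad pixel out of 8)
```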

July 19 - August 1 (Week-8,9)
Aim: Implementing basic OpenGLES2 shaders
Description: Part 3 of the shader implementations. The following fragment shaders will be coded:
Gradient computation (in both the x and y directions)
Image integration
Image thresholding (the user specifies higher and lower thresholds)
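Two of these planned shaders can be sketched in plain Python for reference (illustrative only, not project code): gradient computation via central differences, and double thresholding with user-specified bounds. A real fragment shader would emit one output per fragment; here a loop over interior pixels plays that role.

```python
# Central-difference gradients and double thresholding on a toy image.

img = [[0, 0, 0, 0],
       [0, 9, 9, 0],
       [0, 9, 9, 0],
       [0, 0, 0, 0]]

def gradients(img):
    """Per-pixel central differences in x and y over the interior."""
    h, w = len(img), len(img[0])
    gx = [[(img[y][x + 1] - img[y][x - 1]) / 2 for x in range(1, w - 1)]
          for y in range(1, h - 1)]
    gy = [[(img[y + 1][x] - img[y - 1][x]) / 2 for x in range(1, w - 1)]
          for y in range(1, h - 1)]
    return gx, gy

def threshold(img, lo, hi):
    """Keep pixels with lo <= value <= hi, zero out the rest."""
    return [[v if lo <= v <= hi else 0 for v in row] for row in img]

gx, gy = gradients(img)
print(gx[0])                      # [4.5, -4.5]: rising then falling edge in x
print(threshold(img, 5, 10)[1])   # [0, 9, 9, 0]
```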

August 2 - August 15 (Week-10,11)
Aim: Implementing an example algorithm using OpenGLES2 shaders
Description: A feature extractor (SURF if time permits, otherwise Harris corners) will be implemented as an example program to make use of the APIs coded earlier. Unit tests will be run and any bugs pertaining to this source code will be fixed. The implementation details can be found in this paper.

August 16 - August 22 (Week-12)
Aim: Performance evaluation and documentation of the SURF algorithm
Description: Reserve week for testing the SURF algorithm. I will also check the performance of the examples with respect to CPU-only, GPU-only and CPU+GPU execution. All the functionality will be properly documented.

August 23 - August 29 (Week-13)
Aim: Final evaluation
Description: Checking and fixing bugs. Refining the previous documentation so that it is easier to understand. Checking the final implementation and doing the run-through again. Final commit to the beaglecv repository and release of v2.

August 29 onwards: Future work
Aim: Adding BeagleCV to the BeagleBone Blue APIs
Description: Once BeagleCV is stable, it will be added to the BeagleBone Blue APIs to provide vision support alongside the other sensors. I will also maintain the BeagleCV library by adding more algorithms and fixing any bugs pertaining to the BeagleBone.

==Experience and approach==
I am a fourth-year undergraduate student studying in India. Besides having a keen interest in robotics, computer vision and machine learning, I also like hacking on embedded boards, especially to make agile robots. I would like to work on an open-source project this summer because contributing to such a project is fun and exciting. I have not worked much on open source before, but I have some idea of how the open-source community works, and I find it very fascinating.

Object segmentation and tracking in RGB-D images: Developed a robust segmentation method using deep learning which accurately extracts an object from an RGB-D image and subsequently tracks it in the RGB-D stream. This is an ongoing project.

Accurate and Augmented Localization and Mapping for Indoor Quadcopters: In this project, a state-estimation system for Quadcopters operating in indoor environment is developed that enables the quadcopter to localize itself on a globally scaled map reconstructed by the system. To estimate the pose and the global map, we use ORB-SLAM, fused with onboard metric sensors along with a 2D LIDAR mounted on the Quadcopter which helps in robust tracking and scale estimation.

Enhancing Visual SLAM using IMU and Sonar: Increased the accuracy and robustness of ORB-SLAM by integrating Extended Kalman Filter (EKF) by fusing the IMU and sonar measurements. The scale of the map is estimated by a closed form Maximum Likelihood approach.

Semi-Autonomous Quadcopter for Person Following: Developed an IBVS based robotic system, implemented on Parrot AR Drone, which is capable of following a person or any moving object and simultaneously measuring the localized coordinates of the quadcopter, on a scaled map.

API Support for Beaglebone Blue: Created easy-to-use APIs for Beaglebone Blue. With these APIs, applications can be directly ported onto the board. This project was a collaboration of Beagleboard.org with the University of California, San Diego as part of Google Summer of Code 2016.

Intelligent Parking system: This module is a part of an ADS (Autonomous Driving System) used for accurate autonomous parking. The BeagleBone Black in the robot finds the set point by matching features using SURF descriptors on the template image, and directs the output to the actuators (motors) connected to the PRU (Programmable Real-time Unit).

==Contingency==
If I get stuck on my project and cannot reach my mentor, I will google the error and research it myself; in my experience, almost every problem has already been discussed somewhere on the internet. I will also take help from the other developers present on IRC.

==Benefit==
kiran4399: As a robotics researcher, I personally feel that there is only so much one can do using the data from low-level sensors alone. By developing BeagleCV, its functionalities and its applications, students will benefit greatly and will be able to apply many high-level concepts like visual tracking, localization, detection, pose estimation etc. By adding BeagleCV to the BeagleBone APIs, it would become very easy to implement sensor-fusion algorithms.

==Suggestions==
I plan all my work properly and sketch out a routine so that the planned work gets completed within the given time. I always sketch out priorities and keep priority management above time management. My policy is: "Hard work beats talent when talent doesn't work hard!" I strongly feel that striving to know something is the best way to learn it. I can assure you that I will work around 50-55 hours a week without any other commitments. I also hope for a lot of learning throughout the program, and to come closer to the open-source world.