EBC Exercise 39 Setting Up tidl on X15

Here are instructions on how to run TI's Deep Learning (tidl) examples on a BeagleBoard-X15.

Install
Get Robert's tidl repo

x15$ git clone https://github.com/rcn-ee/tidl-api

Now follow the instructions in the readme.md file.

x15$ sudo apt update
x15$ sudo apt install ti-opencl libboost-dev libopencv-core-dev libopencv-imgproc-dev libopencv-highgui-dev libjson-c-dev

Most were already installed and up to date. Install time 38s.

Check out the most current branch and compile. Use -j2 since we have 2 cores.

x15$ cd tidl-api/
x15$ git checkout origin/v01.02.02-bb.org -b v01.02.02-bb.org
x15$ make -j2 build-api     # 1m31s

The next build puts things in /usr/share/ti/tidl, so create it and give user 1000 (which should be debian) permission to read and write it.

x15$ sudo mkdir -p /usr/share/ti/tidl
x15$ sudo chown -R 1000:1000 /usr/share/ti/tidl/
x15$ make -j2 build-examples  # 4m33s
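The chown step assumes that debian has user ID 1000. Here is a minimal sketch of how that assumption could be checked first. The helper names uid_of and has_uid are hypothetical; id -u is the standard utility that prints a user's numeric ID.

```shell
# Print the numeric user ID for a login name (hypothetical helper).
uid_of() {
    id -u "$1"
}

# Succeed (return 0) only when user $1 has the numeric ID $2.
has_uid() {
    [ "$(uid_of "$1")" = "$2" ]
}
```

For example, has_uid debian 1000 && echo "debian is user 1000" confirms the assumption before running chown.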

Extras to install
Here are a few other handy extras to install.

If you get a cmemk error:

x15$ cd /opt/scripts/tools/ ; git pull ; sudo ./update_kernel.sh ; sudo apt upgrade

Fix a path error with

x15$ cd /usr/share/ti/tidl
x15$ sudo ln -s /tidl-api/examples .

The x15 runs a bit hot. A fan is suggested. You can check the CPU temp with

x15$ cat /sys/class/thermal/*/temp
36600
36200
35800
35400
36200
25625

The units are millidegrees C. A fan will drop the temperature by some 20 degrees C.
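Since the readings are in millidegrees, dividing by 1000 gives degrees C. The following is a small sketch of that conversion, assuming awk is installed. The function names millideg_to_c and show_temps are hypothetical.

```shell
# Convert one millidegree reading to degrees C with one decimal place.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print every readable thermal sensor in degrees C.
show_temps() {
    for f in /sys/class/thermal/*/temp; do
        [ -r "$f" ] || continue
        printf '%s: %s C\n' "$f" "$(millideg_to_c "$(cat "$f")")"
    done
}
```

For example, millideg_to_c 36600 prints 36.6, and show_temps lists each sensor found on the board.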

If you get Gtk-Message: Failed to load module "canberra-gtk-module", run

x15$ sudo apt install libcanberra-gtk-module libcanberra-gtk3-module

Install the image viewer "eye of gnome" for viewing images on the x15.

x15$ sudo apt install eog

Run Examples
Here's how to run some of the examples. From the host computer you need to ssh with the -XC flags so the x15 can access the host's X-windows to display things. You need to ssh as root for the X-Windows authentication to work. Here are instructions for setting a root password, etc.

host$ ssh -XC root@x15

classification
The imagenet demo is looking for one object out of a list of 1000 things. The classification demo is looking for one object (or two if you set TWO_ROIs) out of a small list of 12 or so things. You need to log in to the x15 as root for the X-Windows authentication to work.

root@x15$ cd classification
root@x15$ ls
avg_fps_window.h imagenet1001.txt  Makefile                        stream_config_mobilenet.txt
classlist.txt    imagenet.txt      readme.md                       tidl_classification
clips            images            stream_config_inceptionnet.txt  tidl-sw-stack-small.png
findclasses.cpp  main.cpp          stream_config_j11_v2.txt

seems to have a file missing.

runs but gets the error "Corrupt JPEG data: 2 extraneous bytes before marker 0xd4". So I send stderr to /dev/null

runs but it looks like the color channels are switched

The following takes live video from a camera (/dev/video0) and displays it on the host. It also displays a list of objects it is looking for and highlights the last object it found. See readme.md for more details.

root@x15$ ./tidl_classification -g 1 -d 2 -e 2 -l ./imagenet.txt -s ./classlist.txt -i 0 -c ./stream_config_j11_v2.txt 2> /dev/null

This will play a video and classify it. Note: The readme.md referenced test50.mp4, but I couldn't find it so I'm using test10.mp4.

If you get the following error on the Bone AI

TIOCL FATAL: Internal Error: Number of message queues (0) does not match number of compute units (2)

switch to the 4.14 kernel.

root@x15$ ls clips
test10.mp4 test1.mp4  test2.mp4
root@x15$ ./tidl_classification -g 1 -d 2 -e 2 -l ./imagenet.txt -s ./classlist.txt -i ./clips/test10.mp4 -c ./stream_config_j11_v2.txt

See readme.md for more examples.

To have two Regions of Interest, uncomment line 55 of main.cpp.

Look in imagenet.txt to see what can be recognized and add them to classlist.txt.

imagenet
Run the imagenet demo to recognize any of the 1000 objects.

root@x15$ cd tidl-api/examples/imagenet
root@x15$ ls
imagenet imagenet_objects.json  main.cpp  Makefile

Processing live video from /dev/video0.

./imagenet -i camera0 2> /dev/null # Redirect the errors to ignore a message

Processing a still image.

./imagenet -d 2 -e2 -i IMG_3806.jpg

segmentation
The segmentation example takes an image as input and performs pixel-level classification according to pre-trained categories.

root@x15$ cd /tidl-api/examples/segmentation
root@x15$ ./segmentation -d 2 -e 2 -i camera0 -w 1200 2> /dev/null

ssd_multibox
SSD is the abbreviation for Single Shot multi-box Detector. The ssd_multibox example takes an image as input and detects multiple objects with bounding boxes according to pre-trained categories.

root@x15$ cd /tidl-api/examples/ssd_multibox
root@x15$ ./ssd_multibox -d 2 -e 2 -i camera0 -w 1200 2> /dev/null

Others
layer_output and mcbench look like handy tools.

Auto starting
Here are some notes that I hope will lead up to the examples auto starting.

First allow user debian to run sudo without a password. Do this by adding a line to the /etc/sudoers file.

x15$ sudo visudo

Then add the following to the end.

debian ALL=(ALL) NOPASSWD: ALL

Now debian doesn't need to enter a password when using sudo. Use with care!

Now create an auto start file.

x15$ mkdir -p ~/.config/autostart
x15$ vi ~/.config/autostart/tidl.desktop

Put the following in the file:

[Desktop Entry]
Type=Application
Exec=sudo bash -c "cd /home/debian/exercises/x15/tidl/tidl-api/examples/classification ; gedit & ./tidl_classification -g 1 -d 2 -e 2 -l ./imagenet.txt -s ./classlist.txt -i 0 -c ./stream_config_j11_v2.txt"
Hidden=false
NoDisplay=false
X-GNOME-Autostart-enabled=true
Name=TIDL Example
Comment=Just playing
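The entry can also be written from a script instead of an editor. This is only a sketch: the function name write_autostart is hypothetical, and the entry text is copied from the file contents given here.

```shell
# Write the autostart entry to the file named in $1 (hypothetical helper).
# The quoted 'EOF' stops the shell from expanding anything inside the entry.
write_autostart() {
    mkdir -p "$(dirname "$1")" || return 1
    cat > "$1" <<'EOF'
[Desktop Entry]
Type=Application
Exec=sudo bash -c "cd /home/debian/exercises/x15/tidl/tidl-api/examples/classification ; gedit & ./tidl_classification -g 1 -d 2 -e 2 -l ./imagenet.txt -s ./classlist.txt -i 0 -c ./stream_config_j11_v2.txt"
Hidden=false
NoDisplay=false
X-GNOME-Autostart-enabled=true
Name=TIDL Example
Comment=Just playing
EOF
}
```

Calling write_autostart ~/.config/autostart/tidl.desktop creates the directory if needed and writes the same eight lines.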

The examples that use the GUI have an error unless you run  first. I hope this can be fixed.

Training on new images
Here are instructions for training the network.

Some links I'm using


 * https://github.com/tidsp/caffe-jacinto-models
 * https://github.com/tidsp/caffe-jacinto-models/blob/caffe-0.17/docs/Imagenet_Classification_README.md
 * https://github.com/amd/OpenCL-caffe/wiki/Instructions-to-create-ImageNet-2012-data This tells how to download the images.

Downloading the images
Instructions for downloading the various image data sets are here: https://github.com/amd/OpenCL-caffe/wiki/Instructions-to-create-ImageNet-2012-data

But there are a couple of things you have to do to make it work.

Download
 * Go to http://www.image-net.org/signup and request an account. Remember your username and password.
 * Follow the instructions at https://github.com/amd/OpenCL-caffe/wiki/Instructions-to-create-ImageNet-2012-data
 * The path names given there for downloading the test and validation images are slightly wrong, and wget needs your username and password.
 * Run:

time wget --user=USERNAME --ask-password -c http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar

 * Note: Replace USERNAME with the username of your account and enter its password when prompted. Note also that nonpub has changed to nnoupb.
 * It took some 1585 minutes (26.4 hours) to download the training images.
 * Now download the validation images. This took some 2.5 hours for me.

time wget --user=USERNAME --ask-password -c http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar

For good measure get the test files too. This took about 4 hours. Replace USERNAME with the username of your account.

wget --user=USERNAME --ask-password -c http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_test.tar

Extract
 * To extract training data:

mkdir train
mv ILSVRC2012_img_train.tar train
cd train
tar -xvf ILSVRC2012_img_train.tar   # This took 48 minutes
rm -f ILSVRC2012_img_train.tar
find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; echo ${NAME} ; tar -xf "${NAME}" -C "${NAME%.tar}"; done

The find command took some 2 hours and 20 minutes. There will be 1000 folders, one for each object. Each folder will have some 1200 images in it. Make sure to check that the decompression is complete: you should have 1,281,167 images in the train folder. Check it with

sum=0
for dir in `find . -type f`
do
  sum=$((sum+1))
done
echo $sum
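The count can also be taken by piping find into wc -l, which avoids expanding more than a million file names into a single for loop. This is a sketch rather than the original method. The function names count_files and check_image_count are hypothetical.

```shell
# Count regular files under directory $1 (default: current directory).
count_files() {
    find "${1:-.}" -type f | wc -l | tr -d ' '
}

# Compare the file count under $2 (default: .) with the expected number $1.
check_image_count() {
    expected=$1
    actual=$(count_files "${2:-.}")
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $actual files"
    else
        echo "Mismatch: found $actual files, expected $expected" >&2
        return 1
    fi
}
```

From inside the train folder, check_image_count 1281167 reports whether the extraction is complete.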

 * To extract validation data (this took nearly 4 minutes for me):

cd ../
mkdir val
mv ILSVRC2012_img_val.tar val
cd val
tar -xvf ILSVRC2012_img_val.tar


 * Extracting test data took 10 minutes.

Installing caffe-jacinto
While you are waiting for all the images to download, you can start installing caffe-jacinto.

cd ImageNet
git clone https://github.com/tidsp/caffe-jacinto.git
git clone https://github.com/tidsp/caffe-jacinto-models.git # Took about 40 seconds

These are updated instructions from https://github.com/tidsp/caffe-jacinto/blob/caffe-0.17/INSTALL.md

Go to https://www.anaconda.com/distribution/ to download and install Anaconda Python 2.7.

wget https://repo.anaconda.com/archive/Anaconda2-2018.12-Linux-x86_64.sh
chmod +x Anaconda2-2018.12-Linux-x86_64.sh
./Anaconda2-2018.12-Linux-x86_64.sh # Took about 10 minutes

Change directory to the folder where caffe source code is placed. From: https://github.com/tidsp/caffe-jacinto/blob/caffe-0.17/INSTALL.md

sudo apt install caffe-cuda
sudo apt install libgflags-dev libgoogle-glog-dev liblmdb-dev
sudo apt install libjpeg-turbo8-dev libjpeg8-dev libturbojpeg0-dev
sudo apt install libopenblas-dev

May not be needed.

sudo apt install libturbojpeg
sudo ln -s /usr/lib/x86_64-linux-gnu/libturbojpeg.so.0 /usr/lib/x86_64-linux-gnu/libturbojpeg.so

The rest shouldn't be needed.

sudo apt install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt install --no-install-recommends libboost-all-dev

I'm assuming Cuda is already installed. Check the version.

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148

Install CUDNN (https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html). The libcudnn-dev developer deb package can be downloaded from the NVIDIA website (https://developer.nvidia.com/rdp/cudnn-download). Pick the version that matches the compiler and then install it using dpkg -i path-to-deb.

Prepare the images for training
Download and install the following to get the scripts to prepare the images.

cd ImageNet
git clone https://github.com/tidsp/caffe-jacinto.git
git clone https://github.com/tidsp/caffe-jacinto-models.git # Took about 40 seconds
cd caffe-jacinto
cd data/ilsvrc2012
./get_ilsvrc.sh  # Takes about 3 seconds
# N.B. This does not download the ilsvrc12 data set, as it is gargantuan and we've already downloaded it.
# This script downloads the imagenet example auxiliary files including:
# - the ilsvrc12 image mean, binaryproto
# - synset ids and words
# - Python pickle-format data of ImageNet graph structure and relative infogain
# - the training splits with labels

cd ../../   # This puts you in caffe-jacinto
vi examples/imagenet/create_imagenet.sh

Modify the following variables to point to your ImageNet data dir

TRAIN_DATA_ROOT=/work/yoder/ImageNet/train/
VAL_DATA_ROOT=/work/yoder/ImageNet/val/

Then set data resize bool to true:

RESIZE=true
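These edits can also be scripted. The sketch below assumes each variable is assigned at the start of a line in the form NAME=value, as in the lines shown here; the real create_imagenet.sh may lay them out differently. The function name set_var is hypothetical. Values containing | or & would need escaping because they are passed to sed.

```shell
# Replace the line "NAME=..." in file $1 with "NAME=VALUE" (hypothetical helper).
# The result is copied back with cat so the file keeps its permissions.
set_var() {
    file=$1
    name=$2
    value=$3
    tmp=$(mktemp) || return 1
    sed "s|^${name}=.*|${name}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example, set_var examples/imagenet/create_imagenet.sh RESIZE true would change the resize setting without opening an editor.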

Next follow the directions at https://github.com/tidsp/caffe-jacinto/blob/caffe-0.17/INSTALL.md (summarized here) to configure the Makefile.

 * Copy Makefile.config.example into Makefile.config.
 * In Makefile.config, uncomment the line that says WITH_PYTHON_LAYER.
 * Uncomment the line that says USE_CUDNN.
 * If more than one GPU is available, uncomment USE_NCCL to enable multi-GPU training.
 * Check your OpenCV version and, if you have version 3, uncomment OPENCV_VERSION := 3.
 * I also had to add a line to PYTHON_INCLUDE:

PYTHON_INCLUDE := /usr/include/python2.7 \
    /usr/lib/python2.7/dist-packages/numpy/core/include \
    /home/yoder/anaconda2/lib/python2.7/site-packages/numpy/core/include

Install python packages for Anaconda Python (takes 60 some minutes):

for req in $(cat python/requirements.txt); do conda install $req; done

Save Makefile.config, then run make. The -j option sets how many compiles run in parallel. I'm on a 32-core machine and it took 4 min and 15 s.

make -j32
make pycaffe # This took just 40 seconds
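Rather than hard-coding the -j value, the core count can be looked up at build time. This is a sketch that assumes either nproc (GNU coreutils) or getconf is available. The function name parallel_jobs is hypothetical.

```shell
# Print the number of online CPU cores, falling back to 1 if it cannot be found.
parallel_jobs() {
    if command -v nproc > /dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN 2> /dev/null || echo 1
    fi
}
```

The build then becomes make -j"$(parallel_jobs)", which would give -j32 on the machine used here and -j2 on the X15.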

To test your install try (from: http://caffe.berkeleyvision.org/installation.html#compilation):

make test  # Takes a couple of minutes to compile
make runtest  # Runs lots of tests. Takes 30 mins.

I got:

[ PASSED  ] 2096 tests.
[ FAILED  ] 5 tests, listed below:
[ FAILED  ] LayerFactoryTest/0.TestCreateLayer, where TypeParam = caffe::CPUDevice
[ FAILED  ] LayerFactoryTest/1.TestCreateLayer, where TypeParam = caffe::CPUDevice
[ FAILED  ] LayerFactoryTest/2.TestCreateLayer, where TypeParam = caffe::GPUDevice
[ FAILED  ] LayerFactoryTest/3.TestCreateLayer, where TypeParam = caffe::GPUDevice
[ FAILED  ] CommonTest.TestRandSeedGPU

Then you are ready to create the lmdb format of ImageNet data, as needed by the training!

./examples/imagenet/create_imagenet.sh

This took about 9 hours.

Training
This is from: https://github.com/tidsp/caffe-jacinto-models

 * Make sure that the pycaffe folder (for example: /user/tomato/work/caffe-jacinto/python) is in your PYTHONPATH environment variable (you can add this to .bashrc if you are using the bash shell).
 * Also make sure that PYTHONPATH starts with .: so that imports of local folders work. Example:

export PYTHONPATH=.:/user/tomato/work/caffe-jacinto/python:$PYTHONPATH

 * Set your CAFFE_ROOT environment variable to the caffe-jacinto path (for example: /user/tomato/work/caffe-jacinto). You can set this in .bashrc if you are using the bash shell. Example:

export CAFFE_ROOT=/user/tomato/work/caffe-jacinto
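Before starting a long training run, the three conditions on PYTHONPATH and CAFFE_ROOT can be checked up front. This is a minimal sketch. The function name check_caffe_env is hypothetical, and it assumes the pycaffe folder is the python directory directly under CAFFE_ROOT, as in the example paths.

```shell
# Return 0 only if PYTHONPATH starts with ".:", CAFFE_ROOT is set,
# and PYTHONPATH contains $CAFFE_ROOT/python (hypothetical helper).
check_caffe_env() {
    case "${PYTHONPATH:-}" in
        .:*) ;;
        *) echo "PYTHONPATH should start with '.:'" >&2; return 1 ;;
    esac
    if [ -z "${CAFFE_ROOT:-}" ]; then
        echo "CAFFE_ROOT is not set" >&2
        return 1
    fi
    case ":${PYTHONPATH}:" in
        *":${CAFFE_ROOT}/python:"*) ;;
        *) echo "PYTHONPATH should contain ${CAFFE_ROOT}/python" >&2; return 1 ;;
    esac
}
```

Running check_caffe_env && echo "environment looks right" in the shell used for training gives a quick confirmation.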

Go to the models directory.

cd .../caffe-jacinto-models/scripts

To Do
Need to position the windows so one isn't on top of the other. Try: wmctrl