TensorRT/YoloV3

From eLinux.org

This page answers frequently asked questions about running inference for the YoloV3 model with TensorRT. It may help if you run into similar problems.

FAQ

1. How to run YoloV3 with TRT/ONNX?

Using the sample shipped with the TRT (5.1.5.0) release (path: TRT_PATH/samples/python/yolov3_onnx/), YoloV3 inference can be run with the following steps:

1. Run TRT_PATH/samples/python/yolov3_onnx/yolov3_to_onnx.py to convert yolov3.cfg and yolov3.weights into an ONNX model, yolov3.onnx.
The script downloads yolov3.cfg and yolov3.weights automatically. You may need to install the wget and onnx (1.4.1) Python modules before running it.
$ pip install wget
$ pip install onnx==1.4.1
$ python yolov3_to_onnx.py
2. Run onnx_to_tensorrt.py to load yolov3.onnx and do the inference. The log looks like this:
    $ python onnx_to_tensorrt.py
    Downloading from https://github.com/pjreddie/darknet/raw/f86901f6177dfc6116360a13cc06ab680e0c86b0/data/dog.jpg, this may take a while...
    100% [............................................................................] 163759 / 163759
    Loading ONNX file from path yolov3.onnx...
    Beginning ONNX file parsing
    Completed parsing of ONNX file
    Building an engine from file yolov3.onnx; this may take a while...
    Completed creating Engine
    Running inference on image dog.jpg...
    [[135.04631186 219.14289907 184.31729756 324.86079515]
    [ 98.95619619 135.56527022 499.10088664 299.16208427]
    [477.88941676  81.22835286 210.86738172  86.96319933]] [0.99852329 0.99881124 0.93929232] [16  1  7]
    Saved image with bounding boxes of detected objects to dog_bboxes.png.
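
The three arrays printed in the log are the bounding boxes, the confidence scores and the class IDs of the detections. The following is a minimal Python sketch for pairing them up, not part of the sample itself. It assumes each box is [x, y, width, height] in pixels of the input image, and that class IDs index the 80-class COCO label list in the order used by Darknet's coco.names. Only the three labels needed here are listed, and the helper name describe_detections is hypothetical.

```python
# Values copied from the inference log above.
boxes = [
    [135.04631186, 219.14289907, 184.31729756, 324.86079515],
    [98.95619619, 135.56527022, 499.10088664, 299.16208427],
    [477.88941676, 81.22835286, 210.86738172, 86.96319933],
]
scores = [0.99852329, 0.99881124, 0.93929232]
class_ids = [16, 1, 7]

# Assumption: 0-based COCO label order as in Darknet's coco.names.
# Only the IDs that appear in the log are listed here.
COCO_LABELS = {1: "bicycle", 7: "truck", 16: "dog"}


def describe_detections(boxes, scores, class_ids):
    """Pair each box with its score and label (hypothetical helper).

    Assumption: each box is [x, y, width, height]; it is converted to
    rounded (left, top, right, bottom) pixel corners.
    """
    results = []
    for (x, y, w, h), score, cid in zip(boxes, scores, class_ids):
        label = COCO_LABELS.get(cid, "class_%d" % cid)
        corners = (round(x), round(y), round(x + w), round(y + h))
        results.append((label, round(score, 3), corners))
    return results


detections = describe_detections(boxes, scores, class_ids)
for label, score, (left, top, right, bottom) in detections:
    print("%-8s score=%.3f left=%d top=%d right=%d bottom=%d"
          % (label, score, left, top, right, bottom))
```

Under these assumptions, the three detections in dog.jpg are a dog, a bicycle and a truck.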


You can also use the TensorRT C++ API for inference instead of step 2 above:

  • TRT C++ API + TRT built-in ONNX parser: as in other TRT C++ samples (e.g. sampleFasterRCNN), parse yolov3.onnx with the TRT built-in ONNX parser, then use the TRT C++ API to build the engine and run inference.
Verify the ONNX file with trtexec before using the API:
$ ./trtexec  --onnx=yolov3.onnx
Build the ONNX converter from https://github.com/onnx/onnx-tensorrt.git, then convert the .onnx file to a TensorRT engine file:
$ onnx2trt yolov3.onnx -o yolov3.engine
Load the engine file and run inference with the TRT C++ API. Before that, you can verify the engine file with trtexec:
$ ./trtexec --engine=yolov3.engine --input=000_net --output=082_convolutional --output=094_convolutional --output=106_convolutional         

Tip: the “Upsample” layer is the only layer in YoloV3 that TRT does not support natively, but the ONNX parser has built-in support for it, so TRT can run YoloV3 directly from ONNX as shown above.
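
For reference, the upsample layers in yolov3.cfg use stride 2, which in Darknet means 2x nearest-neighbor upsampling. The sketch below shows that operation on a single 2D channel in plain Python. It only illustrates what the layer computes; it is not how TRT or the ONNX parser implements it, and the function name upsample_nearest is hypothetical.

```python
def upsample_nearest(channel, scale=2):
    """Nearest-neighbor upsampling of one 2D channel (list of rows).

    Every value is repeated `scale` times horizontally, and every
    widened row is repeated `scale` times vertically.
    """
    out = []
    for row in channel:
        wide = [value for value in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


feature = [[1, 2],
           [3, 4]]
upsampled = upsample_nearest(feature)
for row in upsampled:
    print(row)
```

A 2x2 input becomes a 4x4 output in which each original value fills a 2x2 block.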

2. YoloV3 performance with multiple batch sizes on P4, T4 and Xavier GPUs

Software: TensorRT 5.1.5.0, CUDA 10.1, cuDNN 7.5.1, Driver 418.67, JetPack 4.2.1 (for Xavier)
YoloV3 network: yolov3.onnx, generated by following step 1 of question 1
Input size : 3x608x608
T4 Freq: 1590/5001 MHz
P4 Freq: 1113/3003 MHz
Xavier Freq: 1377/2133 MHz
Test command: $ ./trtexec --onnx=yolov3.onnx --output=layer106-conv --int8 --batch=$BATCH --device=$DEVICE // BATCH = 1, 2, 6 … 32

Network: yolov3, precision: INT8

Batch Size  Xavier (ms)  P4 (ms)   T4 (ms)    Ratio (P4/Xavier)  Ratio (T4/P4)
1           29.288       16.311    7.29       1.80               2.24
2           56.47        31.03     13.767     1.82               2.25
6           163.871      87.389    39.801     1.88               2.20
8           218.239      116.203   53.387     1.88               2.18
16          433.347      229.869   107.302    1.89               2.14
24          647.903      343.656   160.98     1.89               2.13
32          863.98       458.602   218.4422   1.88               2.10
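
The two ratio columns can be read as speedups: the slower device's latency divided by the faster device's latency for the same batch. A short Python sketch that recomputes them from the measured latencies, as a sanity check on the table, might look like this. The tuple layout and the helper name speedup are illustrative choices, not part of trtexec.

```python
# One tuple per table row, in table order:
# (Xavier ms, P4 ms, T4 ms, reported P4/Xavier ratio, reported T4/P4 ratio)
ROWS = [
    (29.288, 16.311, 7.29, 1.80, 2.24),
    (56.47, 31.03, 13.767, 1.82, 2.25),
    (163.871, 87.389, 39.801, 1.88, 2.20),
    (218.239, 116.203, 53.387, 1.88, 2.18),
    (433.347, 229.869, 107.302, 1.89, 2.14),
    (647.903, 343.656, 160.98, 1.89, 2.13),
    (863.98, 458.602, 218.4422, 1.88, 2.10),
]


def speedup(slow_ms, fast_ms):
    """Speedup of the faster device: slower latency / faster latency."""
    return slow_ms / fast_ms


checks = []
for xavier_ms, p4_ms, t4_ms, reported_p4, reported_t4 in ROWS:
    p4_vs_xavier = speedup(xavier_ms, p4_ms)
    t4_vs_p4 = speedup(p4_ms, t4_ms)
    # The table rounds to two decimals, so allow a small tolerance.
    ok = (abs(p4_vs_xavier - reported_p4) < 0.006
          and abs(t4_vs_p4 - reported_t4) < 0.006)
    checks.append(ok)
    print("P4/Xavier=%.2f  T4/P4=%.2f  matches table: %s"
          % (p4_vs_xavier, t4_vs_p4, ok))
```

Across these batch sizes, P4 is roughly 1.8x to 1.9x faster than Xavier, and T4 is roughly 2.1x to 2.25x faster than P4.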