TensorRT/Int8CFAQ

From eLinux.org
Latest revision as of 01:48, 14 August 2019


How to do INT8 calibration without using BatchStream?

Using BatchStream to do calibration is too complicated for practical use.

Here we provide an assistant class BatchFactory which utilizes OpenCV for calibration data pre-processing and simplifies the calibration procedure.
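The pre-processing such an assistant class performs can be sketched as below. This is a minimal, hypothetical illustration (not the actual BatchFactory code) of per-channel mean subtraction and HWC-to-CHW reordering, which is the typical conversion applied to an OpenCV-loaded BGR image before it is fed to the calibrator:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: convert one HWC (interleaved, e.g. OpenCV BGR) image
// into CHW (planar) layout, subtracting a per-channel mean and applying a
// scale. This mirrors the kind of pre-processing a loadBatch-style API does.
std::vector<float> preprocessHWCtoCHW(const unsigned char* hwc,
                                      int height, int width, int channels,
                                      const float* mean, float scale)
{
    std::vector<float> chw(static_cast<size_t>(channels) * height * width);
    for (int c = 0; c < channels; ++c)
        for (int h = 0; h < height; ++h)
            for (int w = 0; w < width; ++w)
            {
                // Source index in interleaved layout, destination in planar layout
                size_t src = (static_cast<size_t>(h) * width + w) * channels + c;
                size_t dst = (static_cast<size_t>(c) * height + h) * width + w;
                chw[dst] = (static_cast<float>(hwc[src]) - mean[c]) * scale;
            }
    return chw;
}
```

A real implementation would fill the HWC buffer from `cv::imread` and copy the resulting planar floats into the calibration batch buffer.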

File:BatchFactory.zip


Then, when implementing the IInt8EntropyCalibrator, we can use the loadBatch API from the assistant class to load batch data directly:

         bool getBatch(void* bindings[], const char* names[], int nbBindings) override
         {
             float mean[3]{102.9801f, 115.9465f, 122.7717f}; // also in BGR order
             float* batchBuf = mBF.loadBatch(mean, 1.0f);

             // A null pointer indicates calibration data feeding is done
             if (!batchBuf)
                 return false;

             CHECK(cudaMemcpy(mDeviceInput, batchBuf, mInputCount * sizeof(float),
                              cudaMemcpyHostToDevice));

             assert(!strcmp(names[0], INPUT_BLOB_NAME0));
             bindings[0] = mDeviceInput;

             return true;
         }

Can INT8 calibration table be compatible among different TRT versions or HW platforms?

The INT8 calibration table is absolutely NOT compatible between different TRT versions, because the optimized network graph will likely differ from one TRT version to another. If you force TensorRT to use an incompatible table, it may fail to find the corresponding scaling factor for a given tensor.
As long as the installed TensorRT version is identical across HW platforms, the INT8 calibration table is compatible. That means you can perform INT8 calibration on a faster computation platform, like V100 or T4, and then deploy the calibration table to Tegra for INT8 inference, provided these platforms have the same TensorRT version installed (at least with the same major and minor version, like 5.1.5 and 5.1.6).
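Reusing a calibration table in this way is normally done through the calibrator's cache hooks (readCalibrationCache/writeCalibrationCache in the TensorRT API). The file I/O those overrides could delegate to might look like the sketch below; the helper and file names are illustrative, not part of TensorRT, and the cache contents are an opaque blob that TensorRT itself parses to recover per-tensor scaling factors:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Illustrative helper: persist the calibration cache blob produced by
// calibration so it can be shipped to another machine with the same
// TensorRT version.
void saveCalibrationCache(const std::string& path, const void* cache, size_t length)
{
    std::ofstream out(path, std::ios::binary);
    out.write(static_cast<const char*>(cache), length);
}

// Illustrative helper: read a previously saved calibration cache back into
// memory; an empty vector means no cache exists and calibration must run.
std::vector<char> loadCalibrationCache(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```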


How to do INT8 calibration for networks with multiple inputs?

TensorRT uses bindings to denote the input and output buffer pointers, and they are arranged in order. Hence, if your network has multiple input nodes/layers, you can pass the input buffer pointers into bindings (void**) separately, as below for a network that requires two inputs:

         bool getBatch(void* bindings[], const char* names[], int nbBindings) override 
         {                                                                             
             // Prepare the batch data (on GPU) for mDeviceInput and imInfoDev                                             
             ...
                                                                          
             assert(!strcmp(names[0], INPUT_BLOB_NAME0));                              
             bindings[0] = mDeviceInput;                                               
                                                                                       
             assert(!strcmp(names[1], INPUT_BLOB_NAME1));                              
             bindings[1] = imInfoDev;                                                  
                                                                                       
             return true;                                                              
         }     

NOTE: If your calibration batch size is 10, then for each calibration cycle you need to fill each of your input buffers with 10 images accordingly.
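The arithmetic behind that note is straightforward: the element count for each input binding is the calibration batch size times the volume of that input. A tiny helper (hypothetical name, not a TensorRT API) makes it explicit:

```cpp
#include <cstddef>

// Hypothetical helper: number of float elements one calibration input
// buffer must hold per calibration cycle. For a batch size of 10 and a
// 3x224x224 image input, each getBatch() call must supply
// 10 * 3 * 224 * 224 elements for that binding.
size_t calibrationInputCount(int batchSize, int channels, int height, int width)
{
    return static_cast<size_t>(batchSize) * channels * height * width;
}
```

For a multi-input network, this computation is repeated per binding with that input's own dimensions, and each resulting buffer is copied to the device before being assigned into bindings[].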