TensorRT/FP16 Accuracy


The following is the data range of FP32, FP16 and INT8:

         Dynamic Range                      Min Positive Value
  FP32   -3.4 x 10^38 ~ +3.4 x 10^38        1.4 x 10^-45
  FP16   -65504 ~ +65504                    5.96 x 10^-8
  INT8   -128 ~ +127                        1

Unlike INT8, we generally do not see overflow for FP16 computation (activation or weight values larger than 65504 or smaller than -65504), but underflow (values smaller than 5.96e-8) still appears compared with the FP32 values.
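
As a quick illustration of the underflow case, here is a minimal sketch, assuming a CUDA toolkit whose cuda_fp16.h provides host-callable __float2half/__half2float conversions (compile with nvcc):

  // A value representable in FP32 but below the FP16 minimum positive
  // subnormal (5.96e-8) flushes to zero after the round trip.
  #include <cuda_fp16.h>
  #include <cstdio>

  int main()
  {
      float small = 1.0e-9f;                      // fine in FP32
      __half h = __float2half(small);             // underflows in FP16
      printf("FP32: %g -> FP16 round trip: %g\n",
             small, __half2float(h));             // prints 0
      return 0;
  }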
To debug FP16 accuracy, we can dump the results of intermediate layers and check whether the FP16 activation values deviate significantly from the FP32 values (refer to the page on layer dumping and analyzing for how to do this).
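
As a rough sketch of how a middle layer can be exposed for dumping with the TensorRT C++ API (the "network" pointer, the layer index and the tensor name here are illustrative assumptions, not values from this page):

  // Mark an intermediate activation as a network output so its data shows up
  // in the output bindings at inference time and can be compared FP32 vs FP16.
  nvinfer1::ILayer* layer = network->getLayer(0);   // e.g. the first convolution
  nvinfer1::ITensor* out  = layer->getOutput(0);
  out->setName("conv0_out");                        // hypothetical tensor name
  network->markOutput(*out);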

According to our experience, batch normalization and activation (ReLU) can effectively reduce the information loss of FP16, as in the following statistics we collected from a UNet semantic segmentation network:

  Network   Layer   Count of |FP32 - FP16| / |FP32| > 10%   Total number            Deviation ratio (Diff_num / Total_num * 100%)
  UNet      Conv0   23773                                   2621440 (40*256*256)    0.9069%
  UNet      bn0     371                                     2621440 (40*256*256)    0.0142%
  UNet      relu0   196                                     2621440 (40*256*256)    0.0075%
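
The deviation ratio above is simply the percentage of elements whose relative FP16 error exceeds 10%. A minimal sketch of that statistic, assuming the FP32 and FP16 activations of a layer have already been dumped into two host float buffers of the same length:

  #include <cmath>
  #include <cstddef>

  // Count elements where |fp32 - fp16| / |fp32| > threshold and return the
  // percentage over the whole tensor (e.g. 40*256*256 elements above).
  double deviationRatio(const float* fp32, const float* fp16, std::size_t n,
                        double threshold = 0.10)
  {
      std::size_t diffNum = 0;
      for (std::size_t i = 0; i < n; ++i) {
          double ref = std::fabs(fp32[i]);
          if (ref > 0.0 && std::fabs(fp32[i] - fp16[i]) / ref > threshold)
              ++diffNum;
      }
      return 100.0 * static_cast<double>(diffNum) / static_cast<double>(n);
  }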

NOTE: If we want to dump the FP16 result of the first layer, we have to set it as an output layer. However, setting a certain layer as an output may cause the TensorRT builder to decide to run that layer in FP32 rather than FP16 (probably because its input and output are both FP32; running it in FP16 would require reformatting before and after the layer, and this reformat overhead might outweigh the benefit of FP16). In this case, we can use the following API to make the network run strictly in FP16 mode, without considering any performance optimization:

builder->setStrictTypeConstraints(true);
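
For context, here is a minimal sketch of the surrounding builder setup, assuming the pre-TensorRT 7 C++ API where setFp16Mode and setStrictTypeConstraints live directly on IBuilder, and assuming "builder" and "network" were created elsewhere:

  builder->setFp16Mode(true);               // allow FP16 kernels
  builder->setStrictTypeConstraints(true);  // forbid silent fallback to FP32,
                                            // even if the FP32 kernel is faster
  nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);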

Referring to the above results, we can see:

  • Convolution in FP16 does show about a 0.9% deviation compared with the FP32 result.
  • Batch normalization helps reduce the deviation significantly, from 0.9% to 0.014%.
  • Activation (ReLU) also helps, presumably because negative values are clipped to zero in both FP16 and FP32, so roughly half of the deviating elements disappear.