NVIDIA TensorRT™ is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40x faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and finally deploy to hyperscale data centers, embedded, or automotive product platforms.
<br>
== Introduction ==
[https://developer.nvidia.com/tensorrt TensorRT Download]<br>
[https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html TensorRT Developer Guide]
<br>
== FAQ ==
=== Official FAQ ===
[https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#troubleshooting TensorRT Developer Guide#FAQs]<br>

----

=== Common FAQ ===
You can find answers to some common questions about using TRT here.<br>
Refer to the page [https://elinux.org/TensorRT/CommonFAQ TensorRT/CommonFAQ]<br>

----

=== TRT Accuracy FAQ ===
If your FP16 or INT8 results are not as expected, the page below may help you fix the accuracy issues.<br>
Refer to the page [https://elinux.org/TensorRT/AccuracyIssues TensorRT/AccuracyIssues]<br>

----

=== TRT Performance FAQ ===
If inference performance with TRT is not as expected, the page below may help you optimize it.<br>
Refer to the page [https://elinux.org/TensorRT/PerfIssues TensorRT/PerfIssues]<br>

----

=== TRT Int8 Calibration FAQ ===
The page below presents some FAQs about TRT INT8 calibration.<br>
Refer to the page [https://elinux.org/TensorRT/Int8CFAQ TensorRT/Int8CFAQ]<br>
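To make the calibration terminology concrete, the sketch below shows the symmetric INT8 quantization arithmetic that a calibrated dynamic range (<code>amax</code>) implies. This is an illustration of the math only, not TRT's implementation: the <code>amax</code> value and function names here are made up for the example, whereas TRT derives the dynamic range from a calibrator (e.g. <code>IInt8EntropyCalibrator2</code>) run over representative input data.

```python
# Illustrative sketch of symmetric INT8 quantization given a calibrated
# dynamic range. In TRT, "amax" would come from INT8 calibration; here it
# is a hypothetical value chosen for demonstration.

def int8_quantize(values, amax):
    """Map float values to int8 codes using a symmetric scale = amax / 127."""
    scale = amax / 127.0
    out = []
    for v in values:
        q = round(v / scale)
        # Values beyond the calibrated range saturate to the int8 limits.
        q = max(-127, min(127, q))
        out.append(q)
    return out

def int8_dequantize(codes, amax):
    """Recover approximate float values from int8 codes."""
    scale = amax / 127.0
    return [q * scale for q in codes]

activations = [0.0, 0.5, -1.0, 2.0, 10.0]  # 10.0 is an outlier beyond amax
amax = 2.0                                 # hypothetical calibrated range
q = int8_quantize(activations, amax)
print(q)  # [0, 32, -64, 127, 127] -> the outlier saturates
print(int8_dequantize(q, amax))
```

Note how the outlier saturates: this is why calibration methods such as entropy calibration often pick an <code>amax</code> smaller than the absolute maximum seen, trading clipping of rare outliers for finer resolution of typical values.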

----

=== TRT Plugin FAQ ===
The page below presents some FAQs about TRT plugins.<br>
Refer to the page [https://elinux.org/TensorRT/PluginFAQ TensorRT/PluginFAQ]<br>

----

=== How to fix some Common Errors ===
If you encounter errors while using TRT, the page below may have the answer.<br>
Refer to the page [https://elinux.org/TensorRT/CommonErrorFix TensorRT/CommonErrorFix]<br>

----

=== How to debug or analyze ===
The page below describes several ways to debug and analyze your inference.<br>
Refer to the page [https://elinux.org/TensorRT/How2Debug TensorRT/How2Debug]<br>

----

=== TRT & YoloV3 FAQ ===
Refer to the page [https://elinux.org/TensorRT/YoloV3 TensorRT/YoloV3]<br>

----

=== TRT & YoloV4 FAQ ===
Refer to the page [https://elinux.org/TensorRT/YoloV4 TensorRT/YoloV4]<br>

----

=== TRT ONNXParser FAQ ===
If you have questions about ONNX dynamic shapes or ONNX parsing issues, this page might be helpful.<br>
Refer to the page [https://elinux.org/TensorRT/ONNX TensorRT/ONNX]<br>
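As a concrete illustration of the dynamic-shape case, a typical <code>trtexec</code> invocation for an ONNX model with a dynamic batch dimension is sketched below. The model file name and the input tensor name (<code>input</code>) are placeholders; substitute the names and shapes of your own network.

```shell
# Build a TensorRT engine from an ONNX model whose batch dimension is dynamic.
# min/opt/max shapes define the range the engine will accept; the optimizer
# tunes for the "opt" shape.
trtexec --onnx=model.onnx \
        --minShapes=input:1x3x224x224 \
        --optShapes=input:8x3x224x224 \
        --maxShapes=input:32x3x224x224 \
        --saveEngine=model.plan
```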

----

=== The Usage of Polygraphy ===
Polygraphy is a useful debugging toolkit for TensorRT.<br>
Refer to the page [https://elinux.org/TensorRT/Polygraphy_Usage TensorRT/Polygraphy_Usage]<br>
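For a first taste of the tool, two commonly used Polygraphy commands are sketched below; <code>model.onnx</code> is a placeholder for your own model file.

```shell
# Run the model under both TensorRT and ONNX-Runtime and compare the outputs,
# a common first step when chasing an accuracy issue.
polygraphy run model.onnx --trt --onnxrt

# Inspect the model structure (inputs, outputs, ops) without building an engine.
polygraphy inspect model model.onnx
```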

----
Latest revision as of 03:36, 29 February 2024