Jetson nsight system

From eLinux.org

This page describes how to use Nsight Systems on a Jetson L4T system.
 
=Installation=
Jetson NS (Nsight Systems) must be installed via SDKManager.

NSys User Guide: https://docs.nvidia.com/nsight-systems/UserGuide/index.html
 
==Install NS on x86 Linux Host==
'''1. Install Nsight Systems via SDKManager''' <br>
 
<gallery mode=nolines widths=260>
NS_Install1.PNG|Step#1: Select "Host Machine"
NS_Install2.PNG|Step#2: Install "NVIDIA Nsight Systems"
</gallery>
 
 
Just click '''Continue''' to install Nsight Systems on the x86 Linux host.
  
'''2. Verify Installation'''<br>
After installation is done, you can open the GUI with the "nsight-sys" command as below.
 
  
 
==Install NS on Jetson Device==
'''Note''': a freshly flashed Jetson system does not have Nsight Systems on it. <br>
'''1. Installation Steps''' <br>
 
 
# On x86, launch the Nsight Systems installed via SDKManager as described above
# In Nsight Systems, create a new project and connect to the Jetson device as below; this step installs the Nsight target binaries onto the Jetson device
 
<gallery mode=nolines>
Install_NS_libs_onto_Jetson_Device.png
</gallery>
'''2. Verify Installation''' <br>
After installation, nsys is located under /opt/nvidia/nsight_systems/. <br>
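The verification step above can be done from a terminal as well. A minimal sketch, assuming only the default SDKManager install path mentioned above; on a machine without Nsight Systems it reports that nsys is missing instead of failing:

```shell
#!/bin/sh
# Sanity-check the target-side install: look for the nsys binary in the
# default install path and, if present, print its version.
NSYS=/opt/nvidia/nsight_systems/nsys
if [ -x "$NSYS" ]; then
    "$NSYS" --version       # prints the installed Nsight Systems version
else
    echo "nsys not found at $NSYS"
fi
```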
  
 
=Profile=

==Remote Profile (UI)==
User can run Nsight Systems on the host and remotely profile the application running on Jetson. Several options can be selected to enable the corresponding profiling.
  
 
==Local Profile==

==== User can check the profile options ====
User can run the command below to list all supported profiling options:
 $ /opt/nvidia/nsight_systems/nsys profile --help
 
==== Profile Application ====

===== '''Profile''' =====
Run an application and capture its profile into a '''QDSTRM''' file, then view it in the Nsys GUI profiler.<br>
 $ sudo /opt/nvidia/nsight_systems/nsys profile -t cuda,nvtx,nvmedia,osrt --accelerator-trace=nvmedia --show-output=true --force-overwrite=true --delay=20 --duration=90 --output=%p '''$APP'''

One example to profile TRT inference:<br>
 $ sudo /opt/nvidia/nsight_systems/nsys profile -t cuda,nvtx,nvmedia,osrt --accelerator-trace=nvmedia --show-output=true --force-overwrite=true --delay=20 --duration=90 --output=%p /usr/src/tensorrt/bin/trtexec --loadEngine=yolo_dla_0_bs20.engine --useDLACore=0 --batch=20
Options:
* --accelerator-trace=nvmedia : enable DLA profiling
* --delay : number of seconds to wait before profiling starts (20 here)
* --duration : number of seconds to profile (90 here)
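The flags above can be collected into one reusable command line. A sketch, not from the original page; the ENGINE/APP variable names are hypothetical placeholders, and the command is printed rather than executed so it can be reviewed before running with sudo:

```shell
#!/bin/sh
# Assemble the nsys invocation from the options described above.
# ENGINE and APP are placeholders for the workload to profile.
ENGINE=yolo_dla_0_bs20.engine
APP="/usr/src/tensorrt/bin/trtexec --loadEngine=$ENGINE --useDLACore=0 --batch=20"
NSYS_OPTS="-t cuda,nvtx,nvmedia,osrt --accelerator-trace=nvmedia --show-output=true --force-overwrite=true --delay=20 --duration=90 --output=%p"
# Print the full command first so it can be inspected before running.
echo "sudo /opt/nvidia/nsight_systems/nsys profile $NSYS_OPTS $APP"
```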
===== '''View Log''' =====
Import the QDSTRM file into the Nsys GUI:<br>
'''1. TensorRT and DLA Inference Time'''
<gallery mode=nolines widths="1000px" heights="520px">
Inference_Time1.png
</gallery>
'''2. CUDA Kernel Run Time'''
<gallery mode=nolines widths="1500px" heights="500px">
CUDA_kernel_time_time.png
</gallery>
'''3. Enqueue Time'''
<gallery mode=packed widths="1500px" heights="700px">
TRT Enqueue.png
</gallery>
'''4. trtexec nsys log'''
 https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#command-line-programs
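Besides the GUI views above, recent Nsight Systems releases also ship a "nsys stats" subcommand that prints summary tables (CUDA kernel time, API call durations) directly on the command line. A hedged sketch, not from the original page: the report name is hypothetical, and "nsys stats" expects a converted report (.qdrep/.nsys-rep), which the GUI produces when it first opens a raw .qdstrm capture:

```shell
#!/bin/sh
# Summarize a converted Nsight Systems report on the CLI instead of the
# GUI. REPORT is a hypothetical file name for illustration.
REPORT=report1.qdrep
if command -v nsys >/dev/null 2>&1; then
    nsys stats "$REPORT"
else
    echo "nsys not on PATH; open $REPORT in the Nsight Systems GUI instead"
fi
```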

''Latest revision as of 05:47, 23 October 2021''