Jetson/Installing CUDA

From eLinux.org
Revision as of 07:01, 5 June 2014

You have two options for developing CUDA applications for Jetson TK1:

  • native compilation (compiling code onboard the Jetson TK1)
  • cross-compilation (compiling code on an x86 desktop in a special way so it can execute on the Jetson TK1 target device).

Native compilation is generally the easiest option, but takes longer to compile, whereas cross-compilation is typically more complex to configure and debug, but noticeably faster at compiling large projects. The CUDA Toolkit currently only supports cross-compilation from an Ubuntu 12.04 Linux desktop, whereas native compilation happens onboard the Jetson device, so it doesn't matter which OS your desktop runs.
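You can check which option applies to the machine you are on from the shell: the Jetson TK1 reports an ARM architecture, while a cross-compilation host reports x86_64. A minimal sketch (the `build_mode` helper is hypothetical, for illustration only):

```shell
# Hypothetical helper: map the machine architecture reported by `uname -m`
# to the matching development option described above.
build_mode() {
    case "$1" in
        armv7l) echo "native"  ;;  # running onboard the Jetson TK1
        x86_64) echo "cross"   ;;  # running on an x86 desktop host
        *)      echo "unknown" ;;
    esac
}

build_mode "$(uname -m)"
```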

Installing the CUDA Toolkit onto your device for native CUDA development

Download the .deb file for the CUDA Toolkit for L4T from https://developer.nvidia.com/jetson-tk1-support. (Make sure you download the Toolkit for L4T and not the Toolkit for Ubuntu, which is for cross-compilation rather than native compilation.) You will need to register and log in before downloading, so the easiest approach is to download the file on your PC. You can then copy it onto a USB flash stick and plug that into the device, or transfer it over your local network, for example by running this on a Linux PC:

scp ~/Downloads/cuda-repo-l4t-r19.2_6.0-42_armhf.deb ubuntu@tegra-ubuntu:Downloads/.

On the device, install the .deb file and the CUDA Toolkit. eg:

cd ~/Downloads
# Install the CUDA repo metadata that you downloaded manually for L4T
sudo dpkg -i cuda-repo-l4t-r19.2_6.0-42_armhf.deb
# Download & install the actual CUDA Toolkit including the OpenGL toolkit from NVIDIA. (It only downloads around 15MB)
sudo apt-get update
sudo apt-get install cuda-toolkit-6-0
# Add yourself to the "video" group to allow access to the GPU
sudo usermod -a -G video $USER
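The usermod change above only applies to new login sessions, so log out and back in (or reboot) before using the GPU. You can then confirm that "video" appears in your group list:

```shell
# List the current user's group names; "video" should appear after a
# fresh login following the usermod command above.
id -nG
```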

Add the 32-bit CUDA paths to your .bashrc login script, and start using it in your current console:

echo "# Add CUDA bin & library paths:" >> ~/.bashrc
echo 'export PATH=/usr/local/cuda-6.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-6.0/lib:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
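One quoting subtlety is worth knowing here: inside double quotes the shell expands $PATH immediately, freezing today's value into .bashrc, whereas single quotes write the literal text so it is re-expanded at every login. A minimal demonstration:

```shell
# Double quotes: $PATH is expanded by the current shell before echo runs,
# so the frozen result embeds whatever PATH happened to be right now.
frozen=$(echo "export PATH=/usr/local/cuda-6.0/bin:$PATH")

# Single quotes: the literal string "$PATH" is preserved, so it expands
# fresh each time .bashrc is sourced.
literal=$(echo 'export PATH=/usr/local/cuda-6.0/bin:$PATH')

echo "$literal"   # prints: export PATH=/usr/local/cuda-6.0/bin:$PATH
```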

Verify that the CUDA Toolkit is installed on your device:

nvcc -V
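Beyond printing the compiler version, you can confirm the toolchain works end to end by compiling a trivial CUDA program that asks the runtime how many GPUs it sees. A sketch (the file name is arbitrary; the nvcc step is skipped gracefully if nvcc is not on your PATH yet):

```shell
# Write a minimal CUDA program that queries the device count.
cat > /tmp/cuda_check.cu <<'EOF'
#include <stdio.h>

int main(void) {
    int count = 0;
    /* cudaGetDeviceCount reports the number of CUDA-capable devices. */
    cudaError_t err = cudaGetDeviceCount(&count);
    printf("CUDA devices: %d (%s)\n", count, cudaGetErrorString(err));
    return (err == cudaSuccess) ? 0 : 1;
}
EOF

# .cu files are compiled with nvcc; the CUDA runtime header is included
# automatically for .cu translation units.
if command -v nvcc >/dev/null; then
    nvcc /tmp/cuda_check.cu -o /tmp/cuda_check
    /tmp/cuda_check
fi
```

On a working Jetson TK1 installation this should report one device.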

Installing & running the CUDA samples (optional)

If you think you will write your own CUDA code or you want to see what CUDA can do, then follow this section to build & run some of the CUDA samples.

Install writeable copies of the CUDA samples to your device's home directory (it will create a "NVIDIA_CUDA-6.0_Samples" folder):

cuda-install-samples-6.0.sh /home/ubuntu

Build the CUDA samples (takes around 15 minutes on Jetson TK1):

cd ~/NVIDIA_CUDA-6.0_Samples
make

Run some CUDA samples:

1_Utilities/deviceQuery/deviceQuery
1_Utilities/bandwidthTest/bandwidthTest
cd 0_Simple/matrixMul
./matrixMulCUBLAS
cd ../..
cd 0_Simple/simpleTexture
./simpleTexture
cd ../..
cd 3_Imaging/convolutionSeparable
./convolutionSeparable
cd ../..
cd 3_Imaging/convolutionTexture
./convolutionTexture
cd ../..

Note: Many of the CUDA samples use OpenGL GLX and open graphical windows. If you are running these programs through an SSH remote terminal, you can still display the windows on the monitor attached to the Jetson by typing "export DISPLAY=:0" and then executing the program. (Alternatively, X forwarding with "ssh -X" can show windows on your own desktop if you run an X server, such as the free "Xming" for Windows, though GLX-based samples may not work over forwarding.) eg:

export DISPLAY=:0
cd ~/NVIDIA_CUDA-6.0_Samples/2_Graphics/simpleGL
./simpleGL
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/bicubicTexture
./bicubicTexture
cd ~/NVIDIA_CUDA-6.0_Samples/3_Imaging/bilateralFilter
./bilateralFilter

Note: the Optical Flow sample (HSOpticalFlow) and the 3D stereo sample (stereoDisparity) take roughly 1 minute each to execute, since they compare their results against CPU code.