Difference between revisions of "Jetson/Graphics Performance"

From eLinux.org

Revision as of 22:55, 21 June 2014

The Tegra K1 SOC

The Tegra K1 SOC in the Jetson TK1 is targeted at embedded GPGPU applications as well as general-purpose use in power-constrained devices such as super-phones, tablets, laptops, set-top boxes, and low-power desktop computers.

GPGPU Capabilities

The Tegra K1 SOC provides excellent GPGPU performance per watt. Nvidia claims the Tegra K1 can attain 326 GFLOPS, whereas its closest contemporary competitor, the Snapdragon 805, may achieve an estimated 200 GFLOPS. Imagination Technologies has announced a PowerVR GX6650 design which it claims can challenge the Tegra K1's performance. However, the design favors FP16 (16-bit floating point) operations, which may limit its usefulness for GPGPU tasks. As of June 20, 2014, public benchmarks are not available, and the GX6650 may not ship to consumers until 2015. By that time, the Nvidia Erista (the Maxwell successor to the Tegra K1) should be available.
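The 326 GFLOPS figure can be reproduced with the usual peak-throughput arithmetic. The sketch below is an illustration, not Nvidia's published derivation: the core count (192, one Kepler SMX), the clock (about 850 MHz), and the two floating-point operations per core per cycle (one fused multiply-add) are assumptions that do not appear in the text above.

```python
def peak_gflops(cuda_cores, clock_ghz, flops_per_core_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    Assumes each core retires one fused multiply-add (2 FLOPs) per cycle.
    """
    return cuda_cores * flops_per_core_per_cycle * clock_ghz


# Assumed figures for the Tegra K1 GPU (not taken from this page).
tk1 = peak_gflops(cuda_cores=192, clock_ghz=0.85)

# Estimated figure for the Snapdragon 805, as quoted in the text.
snapdragon_805 = 200

print(round(tk1))                      # claimed peak, rounded to whole GFLOPS
print(round(tk1 / snapdragon_805, 2))  # ratio of the two figures
```

This is a theoretical ceiling; sustained throughput in real GPGPU workloads will be lower.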

Nvidia leapt over the competition by using the same Kepler GPU architecture that it has used for years to power the world's fastest desktop GPUs and supercomputers. This decision allows it to offer existing, well-tested tools on the TK1 with minimal modification. Supported APIs include OpenGL ES 3.0 and OpenGL 4.4, DirectX 11, CUDA 6, and OpenCL 1.2.

Frames Per Second (FPS)

The graphics performance of the Jetson TK1 has been roughly comparable to Intel HD 4600 graphics, but with superior OpenGL and GPGPU support. We hope to add more benchmarks and comparisons to competing systems below. However, the unique combination of hardware and software on the board presents a challenge.

The Jetson TK1 CPU uses ARM A15r2 cores, the GPU is one Nvidia Kepler SMX modified for mobile, and the OS is Ubuntu Linux 14.04. While this is a great combination of technologies, it is also unique. As a result, it has proved difficult to compare graphics performance with traditional PC configurations. Common PC graphics benchmarks such as 3DMark and GFXBench are not available for ARM/Linux even though they are available for ARM/Android.[1] Compiling applications from source can also be a challenge, as many games and graphics utilities for Linux assume an x86 architecture. The x86 extensions they rely on, such as SSE, cannot be used on ARM and may not be easily replaced with a similar ARM extension such as NEON.

The Xonotic build tested below is a custom compile directly from source. The results are from "the big benchmark", which is provided with the source. This is apparently the same method used by Phoronix, so comparisons with results produced by the Phoronix Test Suite at the same resolution should be meaningful. Interestingly, lowering the resolution below 1080p had little effect on the frame rates. This implies that fillrate is not a limiting factor at 1080p and below.

Xonotic 0.7.0 @ 1920x1080

Effects Level   Low (FPS)   Average (FPS)   High (FPS)
Low             43          83              140
Medium          35          75              131
Normal          34          71              120
High            17          42              60
Ultra           6           29              47
Ultimate        4           19              32

Power Use - Overview

Graphics-intensive applications, including demanding OpenGL games, have shown surprisingly low power requirements, generally below 4.73W for the SOC and RAM. This may be due to the OpenGL interface defaulting to lower-power FP16 operations. GPGPU applications that harness the power of all CUDA cores, however, have required as much as 8.63W for the SOC and RAM, and 11.06W peak for the board.

Test System

  • Standard Jetson TK1 developer board
  • Audio out active
  • Attached GbE
  • One NFS mount to external NAS active
  • Four port USB3 hub attached
  • Logitech K310 USB Keyboard attached via USB hub
  • Logitech Marble Mouse attached via USB hub
  • Logitech C615 HD video cam attached via USB hub
  • HDMI out @1920x1080
  • Standard Cooling Fan
  • Installed 64GB SD card with one ext4 mount active
  • Kubuntu standard desktop, compositing disabled

Test Methodology

The Jetson TK1 was tested in response to a forum discussion. I measured current using a multimeter patched into the DC line between the AC power adapter and the board.

Observations

  • The power adapter was measured to provide a consistent 12.15 volts.
  • The fan's power draw (0.85W) was determined by unplugging it for a short time while the board was idle and noting the difference in power draw.
  • The Jetson TK1 board as configured has yet to exceed 12.0W total draw under any workload tested.
  • Nvidia's numbers found in their brief (page 13) appear accurate to conservative.
  • Nvidia's point about drawing comparisons to mobile appears valid. The board drives a number of ports that either have low-power alternatives or aren't normally available on mobile devices. Examples include GbE, desktop RAM, the SATA port, and mini-PCI.

Base Measurements

Measurement         Volts   Amps   Watts
Idle KDE Desktop    12.15   0.22   2.67
Less Fan            12.15   0.15   1.82
Less System         12.15   0.05   0.61
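
The Watts column follows from the measured voltage and current (P = V × I), and the fan's 0.85W draw is the difference between the idle readings with and without the fan. A minimal sketch of that arithmetic, using the readings from the table above (the function and variable names are illustrative only):

```python
SUPPLY_VOLTS = 12.15  # measured adapter output, per the observations


def watts(amps, volts=SUPPLY_VOLTS):
    """DC power in watts for a current reading in amps, to two decimals."""
    return round(volts * amps, 2)


idle_amps = 0.22         # idle KDE desktop
idle_no_fan_amps = 0.15  # same state with the fan unplugged

idle_watts = watts(idle_amps)
fan_watts = watts(idle_amps - idle_no_fan_amps)

print(idle_watts)  # whole-board idle draw
print(fan_watts)   # fan draw, obtained by difference
```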

Power Use - Graphics

glmark2 -s 1920x1080 --off-screen

Measurement            Volts   Amps   Watts
Minimum                12.15   0.22   2.67
Maximum                12.15   0.62   7.53
Average                12.15   0.35   4.25
Average less System    12.15   0.18   2.19
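
The "Average less System" row appears to subtract a fixed overhead (the fan plus the rest of the board) from the whole-board reading, leaving the draw of the SOC and RAM. That interpretation is an assumption inferred from the base measurements, where the idle reading (0.22A) and the "Less System" reading (0.05A) differ by 0.17A. A sketch under that assumption, with illustrative names:

```python
SUPPLY_VOLTS = 12.15

IDLE_AMPS = 0.22              # idle KDE desktop, whole board
IDLE_LESS_SYSTEM_AMPS = 0.05  # "Less System" row of the base measurements

# Assumed fixed overhead, inferred from the base table rather than stated there.
OVERHEAD_AMPS = round(IDLE_AMPS - IDLE_LESS_SYSTEM_AMPS, 2)


def less_system(total_amps):
    """Return (amps, watts) attributed to the SOC and RAM for a board reading."""
    amps = round(total_amps - OVERHEAD_AMPS, 2)
    return amps, round(SUPPLY_VOLTS * amps, 2)


glmark2_average = less_system(0.35)  # average board current during glmark2
print(glmark2_average)
```

Applied to the glmark2 average of 0.35A, this reproduces the 0.18A and 2.19W shown in the table.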

VLC streaming 720p video from NAS GbE

Measurement            Volts   Amps   Watts
Minimum                12.15   0.29   3.52
Maximum                12.15   0.41   4.98
Average                12.15   0.34   4.13
Average less System    12.15   0.17   2.01

Xonotic v0.7.0 normal @ 1920x1080

Measurement            Volts   Amps   Watts
Average                12.15   0.56   6.80
Average less System    12.15   0.39   4.74

Web Browsing, Chromium

Measurement            Volts   Amps   Watts
Average                12.15   0.35   4.25
Average less System    12.15   0.18   2.19

Power Use - GPGPU

CUDA Smoke particle demo

Measurement            Volts   Amps   Watts
Minimum                12.15   0.62   7.53
Maximum                12.15   0.91   11.06
Average                12.15   0.88   10.69
Average less System    12.15   0.71   8.63

  1. Nvidia has commercial licenses for graphics benchmarks, and has therefore been able to publish results for the Jetson TK1.