Difference between revisions of "Jetson/Performance"

From eLinux.org
m (Added newline)
(I added some useful information about which cpu frequency settings were applicable for which CPU clusters (on the Jetson-TK1).)

Revision as of 07:40, 5 November 2014

Introduction

This page discusses various Tegra CPU & GPU performance topics. The power draw of the Tegra processor, and of the embedded board as a whole, is tightly coupled to the processor's performance, so you will often want to choose carefully which parts should run at high performance settings (for max speed) and which should run at low performance settings (for low power draw & heat) or be disabled completely. Read the power draw page for more details on power-specific topics.

Controlling the CPU performance

Note: Debugfs and non-upstream sysfs nodes aren't guaranteed to remain unchanged in future L4T releases.

Tegra K1 is designed for mobile use-cases and thus contains a significant number of power reduction systems that control when parts of the hardware should run faster or slower or be turned off, based on runtime use. This suits most use cases, since the default settings give high performance under heavy load and low power draw under light load. But you might want to force some parts of the hardware to a lower or higher performance level, for example to benchmark peak performance or to enforce a lower power draw. The automatic turning on/off of the 4 main CPU cores & 5th companion core is mostly done in the L4T kernel using "cpuquiet", a mechanism for dynamically hot-plugging CPU cores based upon workload/policy. There are many ways to adjust the performance & power behavior, either at runtime or on every bootup; all of them require root privileges, as described below.


As a general guide, the different options for CPU perf/power (sorted from highest power draw to lowest power draw) are:

  • Force all 4 CPU cores to max performance by disabling the hot-plug scaling mechanism.
  • Just use everything with default settings (it will automatically switch on/off each of the 4 main CPU cores & the 5th low-power companion core, at runtime).
  • Limit the max clock-rate of the 4 main CPU cores to a low speed to reduce the power (it will still automatically switch them on/off and switch to the 5th low-power companion core when suitable).
  • Turn some CPU cores on and some off, based on what you know works best for a particular use-case.
  • Turn off all 4 main CPU cores to force all CPU code to run on the 5th "LP" low-power shadow companion core instead, for maximum power reduction.


How to run a command with root privileges temporarily or on every bootup

All the commands on this page must be executed as root. To execute some commands as root, simply run "sudo su" to log in as the root user, run the commands, then run "exit" to return to your regular user. eg:

sudo su
# (enter your password, it is "ubuntu" by default)
# Run a command that requires root privileges
echo Hello > /TEST.txt
exit

To execute commands automatically on every bootup, you can put your commands into the "/etc/rc.local" bootup script. For example, run this on your board:

sudo nano /etc/rc.local

Then add this near the bottom of the file but before the "exit" line:

# Disable USB auto-suspend, since it disconnects some devices such as webcams on Jetson TK1.
echo -1 > /sys/module/usbcore/parameters/autosuspend 

Save the file by hitting Ctrl+X then Y then Enter, then reboot the device:

sudo reboot

Viewing the current CPU status

You can check which of the 4 CPU cores are currently running (online) instead of sleeping (offline):

cat /sys/devices/system/cpu/online

For example, it might say "0-2" to say that cores 0, 1 & 2 are currently running. If you run it again, it might now say "0" to show that only core 0 is running, if there aren't many background tasks.

To see the current CPU clock frequencies (note that you will see errors for the cores that are currently sleeping):

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq
cat /sys/devices/system/cpu/cpu2/cpufreq/scaling_cur_freq
cat /sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq

You can also see if it is currently running on the main 4-core CPU cluster (named "G" for General cluster) or the 5th low-power shadow core (named "LP" for Low-Power cluster):

cat /sys/kernel/cluster/active
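
The per-core frequency checks above can be wrapped in one small helper. This is only a minimal sketch, not something shipped with L4T: the function name cpu_status and the CPU_ROOT override variable are hypothetical, and it assumes that reading scaling_cur_freq simply fails for a core that is currently sleeping.

```shell
#!/bin/sh
# CPU_ROOT is a hypothetical override for dry runs; on the board the default applies.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# cpu_status: print one line per main CPU core with its current clock in kHz.
cpu_status() {
    for n in 0 1 2 3; do
        f="$CPU_ROOT/cpu$n/cpufreq/scaling_cur_freq"
        if freq=$(cat "$f" 2>/dev/null); then
            echo "cpu$n: $freq kHz"
        else
            # Assumption: a failed read means the core is sleeping (offline).
            echo "cpu$n: offline"
        fi
    done
}

# Example: cpu_status
```

Running cpu_status repeatedly shows cores appearing and disappearing as the kernel hot-plugs them.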

Maximizing CPU performance

To obtain full performance on the CPU (eg: for performance measurements or benchmarking or when you don't care about power draw), you can disable CPU scaling and force the 4 main CPU cores to always run at max performance until reboot:

echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Manually setting CPU frequency

CPU frequency can be set by first switching to the "userspace" governor:

echo "userspace" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

and then writing one of the frequencies listed below to scaling_setspeed:

echo <frequency> > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed

Note that if you are running on the LP (Low-Power) cluster, the following frequencies are available:

    51000 102000 204000 312000 564000 696000 828000 960000 1092000

And if you are running on the HP (High Performance) cluster, i.e. the main "G" cluster, the following frequencies are available:

    204000 312000 564000 696000 828000 960000 1092000 1122000 1224000 1326000 1428000 1530000 1632000 1734000 1836000 1938000 2014500 2116500 2218500 2320500

(The frequency information was extracted manually by trying each frequency on the HP and LP clusters and checking /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq.)
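
Since the two clusters accept different frequencies, it can help to check a value against the lists above before writing it. The following is a rough sketch under stated assumptions: the function name set_cpu_freq and the CPU_ROOT override variable are hypothetical, the frequency lists are copied from this page rather than read from the kernel, and the caller has to say which cluster is active.

```shell
#!/bin/sh
# CPU_ROOT is a hypothetical override for dry runs; on the board the default applies.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# Frequencies in kHz, copied from the lists on this page.
LP_FREQS="51000 102000 204000 312000 564000 696000 828000 960000 1092000"
G_FREQS="204000 312000 564000 696000 828000 960000 1092000 1122000 1224000 1326000 1428000 1530000 1632000 1734000 1836000 1938000 2014500 2116500 2218500 2320500"

# set_cpu_freq CLUSTER FREQ_KHZ   (CLUSTER is "LP" or "G")
set_cpu_freq() {
    case "$1" in
        LP) allowed="$LP_FREQS" ;;
        G)  allowed="$G_FREQS" ;;
        *)  echo "unknown cluster: $1" >&2; return 2 ;;
    esac
    # Accept the frequency only if it appears as a whole word in the list.
    case " $allowed " in
        *" $2 "*) ;;
        *) echo "$2 kHz is not available on the $1 cluster" >&2; return 1 ;;
    esac
    echo userspace > "$CPU_ROOT/cpu0/cpufreq/scaling_governor" &&
    echo "$2" > "$CPU_ROOT/cpu0/cpufreq/scaling_setspeed"
}

# Example (as root): set_cpu_freq G 1092000
```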

Limiting the main CPU cores to run at low speed

You can limit the max clock-rate of the 4 main cores (the "G" General cluster) in the CPU to a low speed to reduce the power (it will still automatically switch to the 5th "LP" (low-power) shadow companion core when suitable).

To see the available CPU frequencies (measured in kHz, so "51000" refers to 51MHz):

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
    51000 102000 204000 312000 564000 696000 828000 960000 1092000 1122000 1224000 1326000 1428000 1530000 1632000 1734000 1836000 1938000 2014500 2116500 2218500 2320500

Then you can set the min & max frequencies of each CPU core.

For example, to limit the speed of the 4 main CPU cores to just 564MHz, run this as root (as shown above):

echo 564000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
echo 564000 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq
echo 564000 > /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq
echo 564000 > /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq

Note that it will only work for the cores that weren't sleeping at the time, so you might want to turn on all CPU cores (as mentioned below) before setting the frequencies and then turn cores 1-3 off when you are finished setting the frequencies.
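
The sequence described in the note above (wake the cores, set the limit, put cores 1-3 back to sleep) could be scripted roughly as follows. This is a sketch only: the function name limit_cpu_max and the CPU_ROOT override variable are hypothetical, and on a real board cpuquiet may change the core states again unless it has been disabled.

```shell
#!/bin/sh
# CPU_ROOT is a hypothetical override for dry runs; on the board the default applies.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

# limit_cpu_max FREQ_KHZ: cap all 4 main cores at the given frequency.
limit_cpu_max() {
    # Bring cores 1-3 online so that their cpufreq nodes can be written.
    for n in 1 2 3; do
        echo 1 > "$CPU_ROOT/cpu$n/online"
    done
    for n in 0 1 2 3; do
        echo "$1" > "$CPU_ROOT/cpu$n/cpufreq/scaling_max_freq"
    done
    # Turn cores 1-3 off again now that the limit is set.
    for n in 1 2 3; do
        echo 0 > "$CPU_ROOT/cpu$n/online"
    done
}

# Example (as root): limit_cpu_max 564000
```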

Turning some CPU cores on and some off

You can turn each CPU core online (1) or offline (0) manually:

echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online

(Ignore any 'invalid argument' errors during this change) (Note: you might want to experiment with setting cpuquiet on/off, such as "echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable")

Restricting to low-power core only

Restricting the CPU to the low-power companion core can significantly reduce peak power (if running on a power-limited battery pack, for example). The 5th companion core in Tegra K1 is still a 1GHz Cortex-A15 core with NEON SIMD and 32KB L1 cache and 512KB L2 private cache, but obviously at lower performance than the 4 main cores. To use just the low-power core, run this as root:

echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu3/online
echo LP > /sys/kernel/cluster/active

(Ignore any 'invalid argument' errors during this change) (Use the "Viewing the current CPU status" section above to see which cluster & cores & frequencies are currently being used).
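
As an illustration, the commands above can be wrapped in a function that also reports which cluster ends up active. This is a sketch under assumptions: the function name use_lp_core_only and the CPU_ROOT / CLUSTER_ROOT override variables are hypothetical, and write errors are silenced only because the text above says they can be ignored.

```shell
#!/bin/sh
# Hypothetical overrides for dry runs; on the board the defaults apply.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"
CLUSTER_ROOT="${CLUSTER_ROOT:-/sys/kernel/cluster}"

# use_lp_core_only: switch to the low-power companion core and report the result.
use_lp_core_only() {
    # Stop cpuquiet from hot-plugging cores automatically.
    echo 0 > "$CPU_ROOT/cpuquiet/tegra_cpuquiet/enable"
    for n in 1 2 3; do
        # Errors such as 'invalid argument' are hidden here.
        echo 0 2>/dev/null > "$CPU_ROOT/cpu$n/online"
    done
    echo LP 2>/dev/null > "$CLUSTER_ROOT/active"
    # Read the node back rather than assuming the switch succeeded.
    echo "active cluster: $(cat "$CLUSTER_ROOT/active")"
}

# Example (as root): use_lp_core_only
```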


Controlling GPU performance

To manually control the clock frequencies of the GPU, first determine the rates supported (listed by debugfs in kHz):

 cat /sys/kernel/debug/clock/gbus/possible_rates
 72000 108000 180000 252000 324000 396000 468000 540000 612000 648000 684000 708000 756000 804000 852000 (kHz)

Then set a rate (eg. the maximum of 852000kHz), specified in Hz:

 echo 852000000 > /sys/kernel/debug/clock/override.gbus/rate
 echo 1 > /sys/kernel/debug/clock/override.gbus/state

Finally verify the rate:

 cat /sys/kernel/debug/clock/gbus/rate
 852000

The gbus debugfs nodes control the GPU's core clock. To control the GPU's memory clock, substitute emc for gbus.
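
Because the supported rates are listed in kHz while the override rate is written in Hz, a small wrapper can do the conversion and reject unsupported values. This is a minimal sketch, assuming the possible_rates format shown above; the function name set_gpu_rate_khz and the CLOCK_ROOT override variable are hypothetical.

```shell
#!/bin/sh
# CLOCK_ROOT is a hypothetical override for dry runs; on the board the default applies.
CLOCK_ROOT="${CLOCK_ROOT:-/sys/kernel/debug/clock}"

# set_gpu_rate_khz RATE_KHZ: override the GPU core clock with a supported rate.
set_gpu_rate_khz() {
    # Only plain decimal numbers are accepted.
    case "$1" in
        ''|*[!0-9]*) echo "rate must be a number in kHz" >&2; return 2 ;;
    esac
    # The rate must appear as a whole word in possible_rates.
    case " $(cat "$CLOCK_ROOT/gbus/possible_rates") " in
        *" $1 "*) ;;
        *) echo "$1 kHz is not a supported GPU rate" >&2; return 1 ;;
    esac
    # possible_rates is in kHz, but the override node takes Hz.
    echo $(( $1 * 1000 )) > "$CLOCK_ROOT/override.gbus/rate" &&
    echo 1 > "$CLOCK_ROOT/override.gbus/state"
}

# Example (as root): set_gpu_rate_khz 852000
```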