Tests:R-CAR-RAVB-RX-Checksum-Offload
Latest revision as of 04:32, 14 September 2017
Kernel Version Configuration
RX Checksum Offload support for RAVB is currently available in a topic branch:
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git topic/ravb-rx-checksum-offload
The ARM64 defconfig was used. The following option was disabled to produce a kernel image small enough to boot in the environment used for testing.
- CONFIG_SOUND
The following option was also disabled because the DRM sub-system appears to fail to build in the net-next revision on which the topic/ravb-rx-checksum-offload branch is based. That was the latest net-next revision at the time.
- CONFIG_DRM
User Space Configuration
The tests described below require netperf to be installed both on the board being tested and on the host specified by the -H option when netperf is invoked on the board. netserver, which is part of the netperf package, should be running on the host.
Perf is used to record CPU usage during the test. For this reason perf needs to be installed on the board being tested.
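As a quick pre-flight check, the presence of the required tools can be confirmed before starting a run. The following is a minimal sketch only: the function name missing_tools is hypothetical and not part of any package, and the tool names passed to it need to match what is actually installed (this page invokes perf as perf_3.16).

```shell
# Hypothetical helper: print each command name that cannot be found in PATH.
# Returns non-zero if at least one of the given commands is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

On the board this might be invoked as `missing_tools netperf perf_3.16 ethtool`, and on the host as `missing_tools netserver`. No output means everything was found.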
Hardware Environment
- Salvator-X/r8a7795 (Gen 3 R-Car H3 SoC) ES1.0
- Salvator-X/r8a7796 (Gen 3 R-Car M3-W SoC) ES1.0
The results shown below are from tests performed on the Salvator-X/r8a7796.
The Salvator-X/r8a7795 gives the same results.
Verify RAVB RX Checksum Offload Support
Verify Driver Initialisation
Initialisation of RAVB can be checked by inspection of the output of dmesg.
    # dmesg | grep ravb
    [ 1.291370] libphy: ravb_mii: probed
    [ 1.295837] ravb e6800000.ethernet eth0: Base address at 0xe6800000, 2e:09:0a:00:be:d8, IRQ 45.
    [ 5.025952] ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
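The link state can also be extracted from the kernel log by a script rather than by eye. The sketch below assumes the message format shown in the dmesg output above ("Link is Up - <speed>/<duplex>") and assumes that a later "Link is Down" message for the same interface means the link was lost. The function name ravb_link_speed is hypothetical.

```shell
# Hypothetical helper: read kernel log text on stdin and print the most
# recent "<speed>/<duplex>" reported for interface $1, or nothing if the
# last link message for that interface says the link is down.
ravb_link_speed() {
    awk -v dev="$1" '
        index($0, " " dev ": Link is Up - ") {
            s = $0
            sub(/.*Link is Up - /, "", s)   # keep the text after the marker
            split(s, f, " ")
            speed = f[1]                    # e.g. "1Gbps/Full"
        }
        index($0, " " dev ": Link is Down") { speed = "" }
        END { if (speed != "") print speed }'
}
```

Usage on the board would be `dmesg | ravb_link_speed eth0`. For the log shown above this gives 1Gbps/Full.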
Verify Configurability of RX Checksum Offload
    # ethtool -k eth0 | grep rx-checksum
    rx-checksumming: on
    # ethtool -K eth0 rx off
    # ethtool -k eth0 | grep rx-checksum
    rx-checksumming: off
    # ethtool -K eth0 rx on
    # ethtool -k eth0 | grep rx-checksum
    rx-checksumming: on
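When this check is scripted, it helps to reduce the ethtool -k output to a bare "on" or "off" that can be compared directly. The following is a minimal sketch that assumes the "name: state" line format shown above. The function name feature_state is hypothetical.

```shell
# Hypothetical helper: read `ethtool -k` output on stdin and print the
# state ("on" or "off") of the feature named in $1.
feature_state() {
    awk -v name="$1" '
        {
            sub(/^[ \t]+/, "")                    # some features are indented
            n = index($0, ": ")
            if (n && substr($0, 1, n - 1) == name) {
                split(substr($0, n + 2), f, " ")  # drop suffixes such as "[fixed]"
                print f[1]
                exit
            }
        }'
}
```

A toggle could then be verified with, for example, `ethtool -K eth0 rx off` followed by `[ "$(ethtool -k eth0 | feature_state rx-checksumming)" = off ]`.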
Run netperf TCP_MAERTS tests
When run on the board, this test exercises RX on the RAVB by receiving a stream of TCP packets from the host.
Note that perf record writes to a file in /run. This directory was chosen because it is mounted as a tmpfs filesystem backed by memory. Writing to a file on an NFS partition significantly reduces the meaningfulness of the results collected.
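Whether /run really is a tmpfs mount on a given board can be checked from the mount table before recording. The sketch below parses text in /proc/mounts format (device, mount point, filesystem type, options). It only matches an exact mount point, so a directory that is merely located inside some other mount is not detected. The function name mount_fstype is hypothetical.

```shell
# Hypothetical helper: read a mount table in /proc/mounts format on stdin
# and print the filesystem type of the mount point given in $1.
# If the mount point is listed more than once, the last entry wins,
# since later mounts shadow earlier ones.
mount_fstype() {
    awk -v mp="$1" '
        $2 == mp { t = $3 }
        END { if (t != "") print t }'
}
```

On the board, `mount_fstype /run < /proc/mounts` would be expected to print tmpfs. Any other result, nfs in particular, suggests choosing a different location for perf.data.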
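In terms of performance, throughput is close to gigabit line-rate both with and without RX checksum offload enabled (938.78 and 941.09 x 10^6 bits/sec respectively in the runs shown below). Perf output, however, appears to indicate that significantly less time is spent in do_csum() when RX checksum offload is enabled: do_csum accounts for 6.95% of the sampled cycles without offload and does not appear among the top entries with offload. This is the expected result.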
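The do_csum() comparison can be made mechanical by pulling the overhead of a single symbol out of the text form of the perf report. The following is a rough sketch that assumes the column layout shown in the reports on this page (percentage first, symbol name last). The function name symbol_overhead is hypothetical.

```shell
# Hypothetical helper: read `perf report` text output on stdin and print
# the summed overhead (in percent) of all rows whose symbol is $1.
# Prints 0.00 if the symbol does not appear in the input.
symbol_overhead() {
    awk -v sym="$1" '
        /^#/ { next }                 # skip header and comment lines
        $NF == sym {
            v = $1
            sub(/%$/, "", v)          # "6.95%" -> "6.95"
            total += v
        }
        END { printf "%.2f\n", total }'
}
```

For example, `perf_3.16 report -i /run/perf.data | symbol_overhead do_csum`. Note that the reports on this page are truncated with head -20, so a symbol that falls below the cut-off would be reported as 0.00. Feeding the full report avoids that.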
With RX Checksum Offload
    # ethtool -K eth0 rx on
    # ethtool -k eth0 | grep rx-checksum
    rx-checksumming: on
    # /usr/bin/perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
    MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
    enable_enobufs failed: getprotobyname
    Recv   Send   Send
    Socket Socket Message Elapsed
    Size   Size   Size    Time    Throughput
    bytes  bytes  bytes   secs.   10^6bits/sec
     87380  16384  16384   10.00   938.78
    [ perf record: Woken up 14 times to write data ]
    [ perf record: Captured and wrote 3.524 MB /run/perf.data (~153957 samples) ]
    # perf_3.16 report -i /run/perf.data | head -20
    # To display the perf.data header info, please use --header/--header-only options.
    #
    # Samples: 75K of event 'cycles'
    # Event count (approx.): 19704920110
    #
    # Overhead  Command          Shared Object      Symbol
    # ........  ...............  .................  ....................................
    #
        19.49%  ksoftirqd/0      [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
         9.88%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi_memcpy
         7.33%  ksoftirqd/0      [kernel.kallsyms]  [k] skb_put
         7.00%  ksoftirqd/0      [kernel.kallsyms]  [k] ravb_poll
         3.89%  ksoftirqd/0      [kernel.kallsyms]  [k] dev_gro_receive
         3.65%  netperf          [kernel.kallsyms]  [k] __arch_copy_to_user
         3.43%  swapper          [kernel.kallsyms]  [k] arch_cpu_idle
         2.77%  swapper          [kernel.kallsyms]  [k] tick_nohz_idle_enter
         1.85%  ksoftirqd/0      [kernel.kallsyms]  [k] __netdev_alloc_skb
         1.80%  swapper          [kernel.kallsyms]  [k] _raw_spin_unlock_irq
         1.64%  ksoftirqd/0      [kernel.kallsyms]  [k] __slab_alloc.isra.79
         1.62%  ksoftirqd/0      [kernel.kallsyms]  [k] __pi___inval_cache_range
Without RX Checksum Offload
    # ethtool -K eth0 rx off
    # ethtool -k eth0 | grep rx-checksum
    rx-checksumming: off
    # perf_3.16 record -o /run/perf.data -a netperf -t TCP_MAERTS -H 10.4.3.162
    MIGRATED TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.4.3.162 () port 0 AF_INET : demo
    enable_enobufs failed: getprotobyname
    Recv   Send   Send
    Socket Socket Message Elapsed
    Size   Size   Size    Time    Throughput
    bytes  bytes  bytes   secs.   10^6bits/sec
     87380  16384  16384   10.00   941.09
    [ perf record: Woken up 14 times to write data ]
    [ perf record: Captured and wrote 3.411 MB /run/perf.data (~149040 samples) ]
    # perf_3.16 report -i /run/perf.data | head -20
    # To display the perf.data header info, please use --header/--header-only options.
    #
    # Samples: 73K of event 'cycles'
    # Event count (approx.): 18682878466
    #
    # Overhead  Command        Shared Object      Symbol
    # ........  .............  .................  ....................................
    #
        17.50%  ksoftirqd/0    [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
        10.60%  ksoftirqd/0    [kernel.kallsyms]  [k] __pi_memcpy
         7.91%  ksoftirqd/0    [kernel.kallsyms]  [k] skb_put
         6.95%  ksoftirqd/0    [kernel.kallsyms]  [k] do_csum
         6.22%  ksoftirqd/0    [kernel.kallsyms]  [k] ravb_poll
         3.84%  ksoftirqd/0    [kernel.kallsyms]  [k] dev_gro_receive
         2.53%  netperf        [kernel.kallsyms]  [k] __arch_copy_to_user
         2.53%  swapper        [kernel.kallsyms]  [k] arch_cpu_idle
         2.27%  swapper        [kernel.kallsyms]  [k] tick_nohz_idle_enter
         1.90%  ksoftirqd/0    [kernel.kallsyms]  [k] __pi___inval_cache_range
         1.90%  ksoftirqd/0    [kernel.kallsyms]  [k] __netdev_alloc_skb
         1.52%  ksoftirqd/0    [kernel.kallsyms]  [k] __slab_alloc.isra.79
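To put the throughput figures from both runs in relation to the nominal link speed, the result row can be extracted from the netperf output and expressed as a percentage. This is a sketch that assumes the five-column result row shown above and a raw line rate of 1000 x 10^6 bits/sec for the 1Gbps link. Both function names are hypothetical.

```shell
# Hypothetical helper: read netperf TCP stream test output on stdin and
# print the throughput column (10^6 bits/sec) of the result row.
netperf_throughput() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+(\.[0-9]+)?$/ { t = $5 }
         END { if (t != "") print t }'
}

# Hypothetical helper: express throughput $1 as a percentage of the raw
# line rate $2, both in 10^6 bits/sec. The rate defaults to 1000.
percent_of_rate() {
    awk -v t="$1" -v rate="${2:-1000}" 'BEGIN { printf "%.1f\n", 100 * t / rate }'
}
```

With the figures above, `percent_of_rate 938.78` prints 93.9 and `percent_of_rate 941.09` prints 94.1. The raw line rate does not account for Ethernet, IP and TCP framing overhead, so TCP payload throughput cannot reach 100% of it.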