RPi Performance
CPU
Linpack
The ARM core was tested with the Linpack benchmark from [1], built with gcc at -O3 (optimisation level 3) and run with an array size of 200.
With software floating point
Source
Compile/Run
cc -O3 -o linpack linpack.c -lm
linpack.c: In function ‘main’:
linpack.c:69: warning: return type of ‘main’ is not ‘int’
./linpack
Enter array size (q to quit) [200]: 200
Results
Crippled
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)  DGEFA   DGESL   OVERHEAD  KFLOPS
   2    0.53   92.45%  1.89%   5.66%     5493.333
   4    1.07   92.52%  2.80%   4.67%     5385.621
   8    2.12   92.45%  2.36%   5.19%     5466.003
  16    4.24   92.45%  2.83%   4.72%     5438.944
  32    8.49   92.11%  2.71%   5.18%     5459.213
  64   16.98   92.05%  2.89%   5.06%     5452.440
Hardware floating point (-mfloat-abi=softfp)
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)  DGEFA   DGESL   OVERHEAD  KFLOPS
   8    0.51   90.20%  3.92%   5.88%     22888.889
  16    1.02   89.22%  4.90%   5.88%     22888.889
  32    2.05   90.24%  3.41%   6.34%     22888.889
  64    4.08   91.42%  2.94%   5.64%     22829.437
 128    8.16   91.54%  2.94%   5.51%     22799.827
 256   16.31   91.35%  2.76%   5.89%     22903.800
Full hardware floating point on Raspbian (-mfloat-abi=hard -mfpu=vfp) and arm_freq=700
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)  DGEFA   DGESL   OVERHEAD  KFLOPS
  16    0.58   89.66%  3.45%   6.90%     40691.358
  32    1.17   87.18%  4.27%   8.55%     41071.651
  64    2.32   88.36%  3.02%   8.62%     41459.119
 128    4.67   88.22%  3.43%   8.35%     41071.651
 256    9.33   88.85%  3.32%   7.82%     40880.620
 512   18.63   89.00%  2.95%   8.05%     41047.675
Full hardware floating point on Raspbian (-mfloat-abi=hard -mfpu=vfp) and arm_freq=900
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:

Reps  Time(s)  DGEFA   DGESL   OVERHEAD  KFLOPS
  32    0.97   88.66%  4.12%   7.22%     48829.630
  64    1.93   88.60%  2.59%   8.81%     49939.394
 128    3.90   88.46%  4.62%   6.92%     48426.079
 256    7.75   88.90%  3.23%   7.87%     49239.963
 512   15.49   89.15%  2.78%   8.07%     49378.277
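The four Linpack runs above are easier to compare once the KFLOPS column is averaged per configuration. The following is a minimal Python sketch that does this with the figures copied from the results; the configuration labels and the `summarize` helper are illustrative names, not part of the benchmark.

```python
from statistics import mean

# KFLOPS column from each Linpack run above (values copied from the results).
# The dictionary keys are illustrative labels, not names used by the benchmark.
KFLOPS = {
    "soft float": [5493.333, 5385.621, 5466.003, 5438.944, 5459.213, 5452.440],
    "softfp ABI": [22888.889, 22888.889, 22888.889, 22829.437, 22799.827, 22903.800],
    "hard float, 700 MHz": [40691.358, 41071.651, 41459.119, 41071.651, 40880.620, 41047.675],
    "hard float, 900 MHz": [48829.630, 49939.394, 48426.079, 49239.963, 49378.277],
}


def summarize(results):
    """Return {label: (mean MFLOPS, speedup relative to the first entry)}."""
    averages = {label: mean(values) / 1000.0 for label, values in results.items()}
    baseline = next(iter(averages.values()))
    return {label: (avg, avg / baseline) for label, avg in averages.items()}


if __name__ == "__main__":
    for label, (mflops, speedup) in summarize(KFLOPS).items():
        print(f"{label:22s} {mflops:6.2f} MFLOPS  {speedup:4.1f}x")
```

On these figures the softfp build comes out roughly 4 times faster than the software floating point build, and the two hard-float builds roughly 7.5 and 9 times faster.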
Whetstone/Dhrystone
All code compiled with gcc options -mfloat-abi=softfp -O3
Source
Code for these tests can be found at http://www.rowley.co.uk/arm/whet_dhry.zip. If that link returns a 404, similar code may be available at http://freespace.virgin.net/roy.longbottom/benchnt.zip
Compile/Run
?
Results
Dhrystone
Microseconds for one run through Dhrystone: 1.2
Dhrystones per Second: 809061.5
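Dhrystone scores are often normalised to DMIPS by dividing by 1757, the conventional VAX 11/780 reference score in Dhrystones per second. The sketch below applies that convention to the figure above and also recovers the time per run; the function names are illustrative.

```python
# Conventional VAX 11/780 reference score used to express Dhrystone as DMIPS.
VAX_DHRYSTONES_PER_SECOND = 1757.0


def dmips(dhrystones_per_second):
    """Convert a Dhrystones-per-second score to DMIPS."""
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND


def microseconds_per_run(dhrystones_per_second):
    """Time for one pass through Dhrystone, in microseconds."""
    return 1e6 / dhrystones_per_second


if __name__ == "__main__":
    score = 809061.5  # Dhrystones per second, from the result above
    print(f"{dmips(score):.1f} DMIPS")
    print(f"{microseconds_per_run(score):.3f} us per run")
```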
Whetstone Crippled
Loops: 1000, Iterations: 10, Duration: 24 sec.
C Converted Double Precision Whetstones: 41.7 MIPS
Rebuilding the Whetstone test code with 'gcc -mfpu -mfloat-abi=softfp' gives better results:
Loops: 1000, Iterations: 100, Duration: 106 sec.
C Converted Double Precision Whetstones: 94.3 MIPS
However, the majority of compute time is spent in the SQRT function, which for the above test was built without -mfpu=vfp. Using a library built with VFP gives the following much improved result:
Loops: 1000, Iterations: 100, Duration: 15 sec.
C Converted Double Precision Whetstones: 666.7 MIPS
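The three Whetstone figures above are consistent with the rating formula found in the classic C Whetstone source, KIPS = 100 x loops x iterations / duration. Whether the code used here computes it exactly this way is an assumption; the sketch below only shows that the formula reproduces the reported numbers.

```python
def whetstone_mips(loops, iterations, duration_s):
    """Whetstone rating in MIPS, assuming KIPS = 100 * loops * iterations / duration."""
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0


if __name__ == "__main__":
    # (loops, iterations, duration in seconds) for the three runs reported above
    runs = [(1000, 10, 24), (1000, 100, 106), (1000, 100, 15)]
    for loops, iterations, duration in runs:
        print(f"{whetstone_mips(loops, iterations, duration):.1f} MIPS")
```

The ratio between the last and first result, about 16, gives a rough idea of how much the VFP-enabled library matters for this benchmark.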
OpenSSL
Source
Compile/Run
openssl version; openssl speed;
Results
OpenSSL 0.9.8o 01 Jun 2010
built on: Thu Aug 26 18:56:26 UTC 2010
options:bn(64,32) md2(int) rc4(ptr,int) des(idx,risc1,4,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -Wa,--noexecstack -g -Wall
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type                 16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                   148.81k      372.18k      624.81k      769.95k      832.90k
mdc2                     0.00         0.00         0.00         0.00         0.00
md4                   615.30k     2468.76k     7612.19k    16707.01k    28104.86k
md5                   380.13k     1501.12k     4800.77k    11312.81k    21682.77k
hmac(md5)            1022.28k     3480.23k     9587.80k    17492.25k    25441.78k
sha1                  303.72k     1092.39k     3106.50k     6302.57k     9852.39k
rmd160                244.29k      849.04k     2414.53k     4747.26k     7513.00k
rc4                 14658.70k    16836.49k    17462.03k    17628.21k    17522.08k
des cbc              2913.17k     3221.30k     3289.77k     3360.09k     3367.21k
des ede3             1149.87k     1188.59k     1198.46k     1206.00k     1208.25k
idea cbc                 0.00         0.00         0.00         0.00         0.00
seed cbc                 0.00         0.00         0.00         0.00         0.00
rc2 cbc              2812.71k     3012.02k     3054.19k     3077.82k     3076.12k
rc5-32/12 cbc            0.00         0.00         0.00         0.00         0.00
blowfish cbc         6091.32k     7007.89k     7250.62k     7288.21k     7163.88k
cast cbc             5068.25k     6020.03k     6345.71k     6367.64k     6260.44k
aes-128 cbc          3205.76k     3497.72k     3616.00k     3652.49k     3665.85k
aes-192 cbc          2730.65k     2981.88k     3073.20k     3102.38k     3111.86k
aes-256 cbc          2383.90k     2596.12k     2659.91k     2702.13k     2732.50k
camellia-128 cbc         0.00         0.00         0.00         0.00         0.00
camellia-192 cbc         0.00         0.00         0.00         0.00         0.00
camellia-256 cbc         0.00         0.00         0.00         0.00         0.00
sha256                679.98k     1629.47k     2905.43k     3708.32k     4175.45k
sha512                 41.02k      163.83k      232.63k      318.20k      353.81k
aes-128 ige          3089.03k     3579.08k     3698.68k     3689.14k     3578.18k
aes-192 ige          2641.68k     3019.45k     3111.38k     3144.95k     3035.70k
aes-256 ige          2334.50k     2632.35k     2705.04k     2735.69k     2687.74k
                  sign    verify    sign/s verify/s
rsa  512 bits 0.013747s 0.001193s     72.7    838.4
rsa 1024 bits 0.063481s 0.002742s     15.8    364.7
rsa 2048 bits 0.321250s 0.007378s      3.1    135.5
rsa 4096 bits 1.805000s 0.022528s      0.6     44.4
                  sign    verify    sign/s verify/s
dsa  512 bits 0.011690s 0.013597s     85.5     73.5
dsa 1024 bits 0.027233s 0.031683s     36.7     31.6
dsa 2048 bits 0.073897s 0.087304s     13.5     11.5
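In the RSA and DSA tables above, the sign/s and verify/s columns are approximately the reciprocals of the per-operation times. The sketch below checks this for the RSA signing times and shows how the cost grows with key size; the dictionary and function names are illustrative.

```python
# RSA signing time in seconds per key size in bits, copied from the table above.
RSA_SIGN_SECONDS = {512: 0.013747, 1024: 0.063481, 2048: 0.321250, 4096: 1.805000}


def ops_per_second(seconds_per_op):
    """Operations per second for a given time per operation."""
    return 1.0 / seconds_per_op


def slowdown_per_doubling(times):
    """Ratio of each signing time to that of the next smaller key size."""
    sizes = sorted(times)
    return {big: times[big] / times[small] for small, big in zip(sizes, sizes[1:])}


if __name__ == "__main__":
    for bits, seconds in RSA_SIGN_SECONDS.items():
        print(f"rsa {bits:4d} bits: {ops_per_second(seconds):6.1f} sign/s")
    for bits, ratio in slowdown_per_doubling(RSA_SIGN_SECONDS).items():
        print(f"{bits:4d} bits is {ratio:.1f}x slower than the previous size")
```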
GPU
The Raspberry Pi appears to handle H.264 1080p video played from USB to HDMI at bitrates of at least 4MB/s.
The admin "JamesH" said it would handle "basically 1080p30, high profile, >40Mb/s." (5MB/s) in H.264,
and about WVGA (480p30) or 720p20 in VP8/WebM.
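The figures above mix megabytes per second (MB/s) and megabits per second (Mb/s). A small sketch of the conversion, with illustrative function names, makes the two directly comparable.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert a bitrate in Mb/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8.0


def megabytes_to_megabits(mbyte_per_s):
    """Convert a rate in MB/s to Mb/s (8 bits per byte)."""
    return mbyte_per_s * 8.0


if __name__ == "__main__":
    print(megabits_to_megabytes(40))  # the quoted >40Mb/s figure, in MB/s
    print(megabytes_to_megabits(4))   # the observed 4MB/s playback, in Mb/s
```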
3DMarkMobile ES 2.0
Source
?
Compile/Run
?
Results
?
ioquake3
Source
https://github.com/raspberrypi/quake3
Compile/Run
- Download source, compile as delivered
- Start game
- Runs at display's native res, in my case 1280x1024
- Bitdepth stuck at 16bpp, not sure how to change, values in q3config.cfg seem to be ignored
- In-game console commands:
\timedemo 1
\demo four
Results
armel "driver info": http://i.imgur.com/wtYhB.jpg
armel timedemo score: http://i.imgur.com/i2TkN.jpg 20.2fps
armhf "driver info": http://i.imgur.com/8nqa1.jpg
armhf timedemo score: http://i.imgur.com/dUu0g.jpg 28.5fps
IO
USB bus
- All IO shares the same bus, so the combined IO throughput cannot exceed the bus speed, an as yet hypothetical 60MB/s
- A test with a fast USB stick showed that the Raspberry Pi can achieve about 30 MB/s:
root@raspberrypi:~# dd if=/dev/sda of=/dev/null bs=32M count=10 iflag=direct
10+0 records in
10+0 records out
335544320 bytes (336 MB) copied, 10.6428 s, 31.5 MB/s
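The rate printed by dd can be recomputed from the byte count and elapsed time, and then compared with the 60MB/s bus figure mentioned above. The sketch below does this; the function names are illustrative, and it relies on dd reporting decimal megabytes (1 MB = 10^6 bytes).

```python
def dd_throughput_mb_s(bytes_copied, seconds):
    """Throughput in decimal megabytes per second, as printed by dd."""
    return bytes_copied / seconds / 1e6


def fraction_of_bus(throughput_mb_s, bus_mb_s=60.0):
    """Share of the hypothetical 60MB/s bus speed that was achieved."""
    return throughput_mb_s / bus_mb_s


if __name__ == "__main__":
    copied = 10 * 32 * 1024 * 1024  # count=10 blocks of bs=32M
    rate = dd_throughput_mb_s(copied, 10.6428)
    print(f"{rate:.1f} MB/s, {fraction_of_bus(rate):.0%} of the bus figure")
```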
SD card
- TODO test
Compile/Run
# write
dd if=/dev/zero of=~/test.tmp bs=500K count=1024
# read
dd if=~/test.tmp of=/dev/null bs=500K count=1024
# cleanup
rm ~/test.tmp
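The same sequential write and read test can be sketched in Python for cases where the timing needs to be captured programmatically. This is not the method used for the results in this section: the function names are illustrative, the default block size and count simply mirror the dd commands above, and the write is followed by an fsync so that the measured time includes flushing to the card. A read straight after a write may be served from the page cache rather than the card.

```python
import os
import time


def write_throughput(path, block_size=500 * 1024, count=1024):
    """Write `count` zero-filled blocks to `path`; return MB/s (1 MB = 10**6 bytes)."""
    block = bytes(block_size)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # include the time needed to reach the storage device
    elapsed = time.perf_counter() - start
    return block_size * count / elapsed / 1e6


def read_throughput(path, block_size=500 * 1024):
    """Read `path` sequentially in blocks; return MB/s (1 MB = 10**6 bytes)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e6


if __name__ == "__main__":
    test_file = os.path.expanduser("~/test.tmp")  # same file name as the dd test
    print(f"write: {write_throughput(test_file):.1f} MB/s")
    print(f"read:  {read_throughput(test_file):.1f} MB/s")
    os.remove(test_file)  # cleanup
```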
Results
- Depends on SD card used http://elinux.org/RaspberryPiBoardVerifiedPeripherals#SD_cards
?maybe 15MB/s?
SD Card | Read (MB/s) | Write (MB/s) | Distro | Kernel | Notes
---|---|---|---|---|---
Extrememory 16GB SDHC class 10 | 4.7 | 4.5 | Debian Wheezy "Raspbian" | Linux raspbian 3.1.9+ #52 |
Kingston SDHC 4GB class 4 | 4.5 | 4.1 | Debian Squeeze "debian6-19-04-2012" | Linux raspberrypi 3.1.9+ #52 Tue May 8 23:49:32 BST 2012 |
Kingston SDHC 4GB class 4 | 4.2 | 2.5 | archlinuxarm-19-04-2012 | Linux alarmpi 3.1.9-13+ #6 Thu May 10 00:48:37 UTC 2012 | Identical card to one above. One to look into, as I was expecting Arch to be faster...
Kingston uSDHC 4GB class 4 | 4.0 | 3.8 | Debian Squeeze | Linux 3.1.9+ #90 |
Kingston uSDHC 8GB class 4 (SDC4/8GB) | 4.7 | 3.7 | archlinuxarm-29-04-2012 | Linux alarmpi 3.1.9+ #66 Thu May 17 16:56:20 BST 2012 | CrystalDiskMark results (FAT32)
Kingston SDHC 8GB class 4 (SD4/8GB) | 4.6 | 3.0 | archlinuxarm-29-04-2012 | Linux alarmpi 3.1.9+ #66 Thu May 17 16:56:20 BST 2012 | CrystalDiskMark results (FAT32)
Kingston SDHC 32GB class 10 | 10.8 | 8.1 | Fedora 17 ARM snapshot 07 May 2012 | Linux fedora-arm 3.1.9 #1 | mmc0: note - long write sync 1453000ns - 14608 its. - kernel/module problems?
Kingston SDHC 32GB class 10 | 4.6 | 3.5 | Debian Squeeze "debian6-19-04-2012" | Linux raspberrypi 3.1.9+ #90 |
Panasonic SDHC 8GB class 6 | 4.8 | 4.4 | | |
Samsung SDHC 16GB Class 10 (MB-SPAGA) | 10.7 | 8.8 | Fedora 17 ARM snapshot 07 May 2012 - GUI release | Linux fedora-arm 3.1.9 #1 | Had "long write sync" errors, slow boot times and then system instability using USB port on Macbook, switched to iPhone charger (5V 1A) and warning disappeared
"SanDisk" uSD 2GB class ? | 4.7 | 4.2 | archlinuxarm-29-04-2012 | Linux alarmpi 3.1.9+ #66 Thu May 17 16:56:20 BST 2012 | Card has no serial/is likely a fake. CrystalDiskMark results (FAT32)
SanDisk SDHC 8GB class 4 | 4.7 | 3.2 | Debian Squeeze | |
SanDisk SDHC 32GB class 6 | 4.6 | 4.8 | | |
SanDisk uSDXC 64GB class 6 | 4.9 | 3.8 | archlinuxarm-29-04-2012 | Linux alarmpi 3.1.9+ #66 Thu May 17 16:56:20 BST 2012 | CrystalDiskMark results (FAT32)
Transcend SDHC 8GB class 6 | 5.8 | 5.8 | | |
NIC
11.2MB/s with wget -O /dev/null [SOURCE]
- TODO test with wget, curl, etc
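To put the wget figure in context, it can be expressed as a share of the link rate. The sketch below assumes a 100 Mbit/s Ethernet link, which is not stated in this section, and the function name is illustrative.

```python
def link_utilisation(measured_mbyte_s, link_mbit_s=100.0):
    """Fraction of the link rate used; the 100 Mbit/s default is an assumption."""
    return measured_mbyte_s * 8.0 / link_mbit_s


if __name__ == "__main__":
    # 11.2 MB/s is the wget result quoted above
    print(f"{link_utilisation(11.2):.1%} of an assumed 100 Mbit/s link")
```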
Power
(5V)
- 1080p playback: 0.75A (about 3h on 4 AA batteries)
- text editing: unknown
- idling: unknown
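The playback figure above translates into power and battery runtime with simple arithmetic. In the sketch below the battery parameters (four 1.2 V cells of 2.5 Ah each, no conversion losses) are hypothetical values chosen only for illustration; with them the estimate lands close to the "about 3h" quoted above.

```python
def power_watts(volts, amps):
    """Electrical power drawn at the given voltage and current."""
    return volts * amps


def runtime_hours(cells, cell_volts, cell_amp_hours, load_watts, efficiency=1.0):
    """Estimated runtime from a battery pack; all pack parameters are assumptions."""
    energy_wh = cells * cell_volts * cell_amp_hours * efficiency
    return energy_wh / load_watts


if __name__ == "__main__":
    load = power_watts(5.0, 0.75)  # 1080p playback at 5V, 0.75A
    print(f"load: {load:.2f} W")
    # Hypothetical pack: 4 AA cells, 1.2 V and 2.5 Ah each, lossless regulator
    print(f"runtime: {runtime_hours(4, 1.2, 2.5, load):.1f} h")
```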