Linpack
The Arm has been tested using the LINPACK benchmark from [1], built with gcc at -O3 (optimisation level 3) and run with an array size of 200.
With software floating point
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:
Reps Time(s) DGEFA DGESL OVERHEAD KFLOPS
2 0.53 92.45% 1.89% 5.66% 5493.333
4 1.07 92.52% 2.80% 4.67% 5385.621
8 2.12 92.45% 2.36% 5.19% 5466.003
16 4.24 92.45% 2.83% 4.72% 5438.944
32 8.49 92.11% 2.71% 5.18% 5459.213
64 16.98 92.05% 2.89% 5.06% 5452.440
Hardware floating point (-mfloat-abi=softfp)
Memory required: 315K.
LINPACK benchmark, Double precision.
Machine precision: 15 digits.
Array size 200 X 200.
Average rolled and unrolled performance:
Reps Time(s) DGEFA DGESL OVERHEAD KFLOPS
8 0.51 90.20% 3.92% 5.88% 22888.889
16 1.02 89.22% 4.90% 5.88% 22888.889
32 2.05 90.24% 3.41% 6.34% 22888.889
64 4.08 91.42% 2.94% 5.64% 22829.437
128 8.16 91.54% 2.94% 5.51% 22799.827
256 16.31 91.35% 2.76% 5.89% 22903.800
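As a rough way to compare the two LINPACK runs, the sketch below averages the KFLOPS column of each table and takes the ratio. The variable and function names are illustrative only; the figures are copied from the tables above.

```python
from statistics import mean

# KFLOPS column copied from the two LINPACK runs above.
soft_float_kflops = [5493.333, 5385.621, 5466.003, 5438.944, 5459.213, 5452.440]
hard_float_kflops = [22888.889, 22888.889, 22888.889, 22829.437, 22799.827, 22903.800]


def speedup(baseline, improved):
    """Ratio of mean throughput: improved run divided by baseline run."""
    return mean(improved) / mean(baseline)


ratio = speedup(soft_float_kflops, hard_float_kflops)

print(f"software floating point: {mean(soft_float_kflops) / 1000:.2f} MFLOPS")
print(f"hardware floating point: {mean(hard_float_kflops) / 1000:.2f} MFLOPS")
print(f"speedup: {ratio:.1f}x")
```

On these figures the hardware floating point build comes out a little over four times faster than the software floating point build.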
Whetstone/Dhrystone
Code for these tests can be found at http://www.rowley.co.uk/arm/whet_dhry.zip.
All code was compiled with the gcc options -mfloat-abi=softfp -O3.
Whetstone
Loops: 1000, Iterations: 10, Duration: 24 sec.
C Converted Double Precision Whetstones: 41.7 MIPS
Dhrystone
Microseconds for one run through Dhrystone: 1.2
Dhrystones per Second: 809061.5
Rebuilding the Whetstone test code with 'gcc -mfpu=vfp -mfloat-abi=softfp' gives better results:
Loops: 1000, Iterations: 100, Duration: 106 sec.
C Converted Double Precision Whetstones: 94.3 MIPS
However, the majority of compute time is spent in the SQRT function, which for the above test was built without -mfpu=vfp. Using a library built with VFP gives the following much improved result:
Loops: 1000, Iterations: 100, Duration: 15 sec.
C Converted Double Precision Whetstones: 666.7 MIPS
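The three Whetstone figures above are consistent with a simple relation between loops, iterations and duration. The sketch below assumes the benchmark reports KIPS as 100 × loops × iterations / seconds and then divides by 1000 for MIPS; that scaling is inferred from the numbers on this page, not taken from the benchmark source, and the function name is illustrative.

```python
def whetstone_mips(loops, iterations, duration_s):
    """MIPS figure, ASSUMING KIPS = 100 * loops * iterations / seconds."""
    kips = 100.0 * loops * iterations / duration_s
    return kips / 1000.0


# (description, loops, iterations, duration in seconds, MIPS reported above)
runs = [
    ("initial build", 1000, 10, 24, 41.7),
    ("test code rebuilt with -mfpu", 1000, 100, 106, 94.3),
    ("library built with VFP", 1000, 100, 15, 666.7),
]

for label, loops, iterations, seconds, reported in runs:
    computed = whetstone_mips(loops, iterations, seconds)
    print(f"{label}: computed {computed:.1f} MIPS, reported {reported} MIPS")
```

Because the iteration count differs between the first run and the other two, the durations alone are not comparable; the MIPS figure normalises for that.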
OpenSSL
Results of running openssl speed
OpenSSL 0.9.8o 01 Jun 2010
built on: Thu Aug 26 18:56:26 UTC 2010
options:bn(64,32) md2(int) rc4(ptr,int) des(idx,risc1,4,long) aes(partial) blowfish(idx)
compiler: gcc -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -DTERMIO -O2 -Wa,--noexecstack -g -Wall
available timing options: TIMES TIMEB HZ=100 [sysconf value]
timing function used: times
The 'numbers' are in 1000s of bytes per second processed.
type                 16 bytes    64 bytes   256 bytes  1024 bytes  8192 bytes
md2                   148.81k     372.18k     624.81k     769.95k     832.90k
mdc2                     0.00        0.00        0.00        0.00        0.00
md4                   615.30k    2468.76k    7612.19k   16707.01k   28104.86k
md5                   380.13k    1501.12k    4800.77k   11312.81k   21682.77k
hmac(md5)            1022.28k    3480.23k    9587.80k   17492.25k   25441.78k
sha1                  303.72k    1092.39k    3106.50k    6302.57k    9852.39k
rmd160                244.29k     849.04k    2414.53k    4747.26k    7513.00k
rc4                 14658.70k   16836.49k   17462.03k   17628.21k   17522.08k
des cbc              2913.17k    3221.30k    3289.77k    3360.09k    3367.21k
des ede3             1149.87k    1188.59k    1198.46k    1206.00k    1208.25k
idea cbc                 0.00        0.00        0.00        0.00        0.00
seed cbc                 0.00        0.00        0.00        0.00        0.00
rc2 cbc              2812.71k    3012.02k    3054.19k    3077.82k    3076.12k
rc5-32/12 cbc            0.00        0.00        0.00        0.00        0.00
blowfish cbc         6091.32k    7007.89k    7250.62k    7288.21k    7163.88k
cast cbc             5068.25k    6020.03k    6345.71k    6367.64k    6260.44k
aes-128 cbc          3205.76k    3497.72k    3616.00k    3652.49k    3665.85k
aes-192 cbc          2730.65k    2981.88k    3073.20k    3102.38k    3111.86k
aes-256 cbc          2383.90k    2596.12k    2659.91k    2702.13k    2732.50k
camellia-128 cbc         0.00        0.00        0.00        0.00        0.00
camellia-192 cbc         0.00        0.00        0.00        0.00        0.00
camellia-256 cbc         0.00        0.00        0.00        0.00        0.00
sha256                679.98k    1629.47k    2905.43k    3708.32k    4175.45k
sha512                 41.02k     163.83k     232.63k     318.20k     353.81k
aes-128 ige          3089.03k    3579.08k    3698.68k    3689.14k    3578.18k
aes-192 ige          2641.68k    3019.45k    3111.38k    3144.95k    3035.70k
aes-256 ige          2334.50k    2632.35k    2705.04k    2735.69k    2687.74k
                  sign    verify    sign/s verify/s
rsa  512 bits 0.013747s 0.001193s     72.7    838.4
rsa 1024 bits 0.063481s 0.002742s     15.8    364.7
rsa 2048 bits 0.321250s 0.007378s      3.1    135.5
rsa 4096 bits 1.805000s 0.022528s      0.6     44.4
                  sign    verify    sign/s verify/s
dsa  512 bits 0.011690s 0.013597s     85.5     73.5
dsa 1024 bits 0.027233s 0.031683s     36.7     31.6
dsa 2048 bits 0.073897s 0.087304s     13.5     11.5
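In the RSA and DSA tables, the sign/s and verify/s columns are the reciprocals of the per-operation times in the sign and verify columns. A minimal sketch of that conversion follows; the dictionary layout and function name are illustrative, and the times are copied from the tables above.

```python
# Seconds per operation (sign, verify), copied from the tables above.
timings = {
    "rsa 512": (0.013747, 0.001193),
    "rsa 1024": (0.063481, 0.002742),
    "rsa 2048": (0.321250, 0.007378),
    "rsa 4096": (1.805000, 0.022528),
    "dsa 512": (0.011690, 0.013597),
    "dsa 1024": (0.027233, 0.031683),
    "dsa 2048": (0.073897, 0.087304),
}


def per_second(seconds_per_op):
    """Convert a time per operation into operations per second."""
    return 1.0 / seconds_per_op


for name, (sign_s, verify_s) in timings.items():
    print(f"{name:9s} {per_second(sign_s):8.1f} sign/s "
          f"{per_second(verify_s):8.1f} verify/s")
```

The computed rates agree closely with the printed columns; small differences come from the rounding of the printed times. The conversion also makes the asymmetry easy to see: RSA verification is far faster than RSA signing, while DSA verification is slightly slower than DSA signing.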