Janne Grunau
|
474010a91d
|
arm: NEON optimisations for gf_w16
Optimisations for the 4,16 split table region multiplications.
Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
Region Best (MB/s): 532.14 W-Method: 16 -m SPLIT 16 4 -r SIMD -
Region Best (MB/s): 212.34 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
Region Best (MB/s): 801.36 W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
Region Best (MB/s): 93.20 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
Region Best (MB/s): 273.99 W-Method: 16 -m SPLIT 16 8 -
Region Best (MB/s): 270.81 W-Method: 16 -m SPLIT 8 8 -
Region Best (MB/s): 70.42 W-Method: 16 -m COMPOSITE 2 - -
Region Best (MB/s): 393.54 W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
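For readers unfamiliar with the "SPLIT 16 4" method benchmarked above, the following is a minimal scalar sketch of a 4,16 split table region multiplication. It is not the NEON code from this commit. The function names are hypothetical, and the polynomial 0x1100b is an assumption about the default field for w=16. The idea is that the product of a constant with a 16-bit word is the XOR of four lookups, one per 4-bit nibble, into 16-entry tables built for that constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: primitive polynomial x^16 + x^12 + x^3 + x + 1 for GF(2^16). */
#define GF16_POLY 0x1100bu

/* Reference shift-and-xor multiply in GF(2^16) (hypothetical helper). */
static uint16_t gf16_mul(uint16_t a, uint16_t b)
{
    uint32_t acc = 0;
    uint32_t aa = a;                 /* holds a * x^i, reduced mod GF16_POLY */
    for (int i = 0; i < 16; i++) {
        if ((b >> i) & 1u)
            acc ^= aa;
        aa <<= 1;
        if (aa & 0x10000u)
            aa ^= GF16_POLY;
    }
    return (uint16_t)acc;
}

/* Build four 16-entry tables: tbl[i][j] = val * (j << (4 * i)). */
static void gf16_split4_tables(uint16_t val, uint16_t tbl[4][16])
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 16; j++)
            tbl[i][j] = gf16_mul(val, (uint16_t)(j << (4 * i)));
}

/* Multiply n 16-bit words in src by val; optionally XOR into dst. */
static void gf16_split4_region_mul(const uint16_t *src, uint16_t *dst,
                                   size_t n, uint16_t val, int xor_into)
{
    uint16_t tbl[4][16];

    assert(src != NULL && dst != NULL);
    gf16_split4_tables(val, tbl);

    for (size_t k = 0; k < n; k++) {
        uint16_t x = src[k];
        /* Multiplication is linear over XOR, so the nibble products add up. */
        uint16_t p = (uint16_t)(tbl[0][x & 0xf] ^
                                tbl[1][(x >> 4) & 0xf] ^
                                tbl[2][(x >> 8) & 0xf] ^
                                tbl[3][x >> 12]);
        dst[k] = xor_into ? (uint16_t)(dst[k] ^ p) : p;
    }
}
```

Because each table has only 16 entries, a SIMD version can presumably hold the tables in vector registers and do the lookups with byte table instructions, which is what makes this split attractive for region operations.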
|
2014-10-24 14:53:57 +02:00 |