Commit Graph

16 Commits (4339569f14c95a8895a347845f8ed6e18b345ace)

Author SHA1 Message Date
Bassam Tabbara 4339569f14 Support for runtime SIMD detection
This commit adds support for runtime detection of SIMD instructions. The idea is that you would build once with all supported SIMD functions and the same binaries could run on different machines with varying support for SIMD. At runtime gf-complete will select the right functions based on the processor.

gf_cpu.c has the logic to detect SIMD instructions. On Intel processors this is done through cpuid. For ARM on Linux we use getauxval.

The logic in gf_w*.c has been changed to check for runtime SIMD support and fall back to generic code.

Also a new test has been added. It compares the functions selected by gf_init when we enable/disable SIMD support through build flags, with runtime enabling/disabling. The test checks if the results are identical.
2016-09-13 12:24:25 -07:00
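
As an illustration of the runtime detection described in 4339569f14, here is a minimal sketch for x86 with GCC or Clang. It is not the code in gf_cpu.c: `struct simd_caps`, `detect_simd`, `enum impl` and `pick_impl` are hypothetical names made up for this example. It relies only on `__get_cpuid` from the compiler's `<cpuid.h>` and on the CPUID leaf 1 feature bits. On other architectures it reports no SIMD support, which stands in for the generic fallback (an ARM Linux build would query `getauxval(AT_HWCAP)` instead).

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang wrapper around the cpuid instruction */
#endif

/* Hypothetical feature record, not gf-complete's actual layout. */
struct simd_caps {
  int sse2, ssse3, sse41, pclmul;
};

/* Hypothetical implementation choices used only in this sketch. */
enum impl { IMPL_GENERIC, IMPL_SSE2, IMPL_SSSE3 };

/* Query the processor once; every field is 0 or 1. */
static struct simd_caps detect_simd(void)
{
  struct simd_caps c = {0, 0, 0, 0};
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
    c.sse2   = (edx >> 26) & 1;  /* CPUID.1:EDX bit 26 */
    c.pclmul = (ecx >> 1)  & 1;  /* CPUID.1:ECX bit 1  */
    c.ssse3  = (ecx >> 9)  & 1;  /* CPUID.1:ECX bit 9  */
    c.sse41  = (ecx >> 19) & 1;  /* CPUID.1:ECX bit 19 */
  }
#endif
  return c;
}

/* Prefer the widest supported variant, otherwise use generic code. */
static enum impl pick_impl(const struct simd_caps *c)
{
  if (c->ssse3) return IMPL_SSSE3;
  if (c->sse2)  return IMPL_SSE2;
  return IMPL_GENERIC;
}
```

A caller would run `detect_simd()` once during initialization and pass the result to `pick_impl()`, so a single binary built with all SIMD variants still runs on processors that lack them.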
Bassam Tabbara 87f0d4395d Add support for printing functions selected in gf_init
There is currently no way to figure out which functions were selected
during gf_init and as a result of SIMD options. This is not even possible
in gdb since most functions are static.

This commit adds a new macro SET_FUNCTION that records the name of the
function selected during init inside the gf_internal structure. This macro
only works when DEBUG_FUNCTIONS is defined during compile. Otherwise the
code works exactly as it did before this change.

The names of selected functions will be used during testing of SIMD
runtime detection.

All calls such as:

gf->multiply.w32 = gf_w16_shift_multiply;

need to be replaced with the following:

SET_FUNCTION(gf,multiply,w32,gf_w16_shift_multiply)

Also added a new flag to tools/gf_methods that will print the names of
functions selected during gf_init.
2016-09-13 12:24:25 -07:00
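
One way a macro like the SET_FUNCTION of 87f0d4395d could be written is sketched below. This is an assumption for illustration, not the macro from the repository: `struct demo_gf`, its `names` member, `demo_shift_multiply`, `demo_init` and `demo_uses` are all hypothetical, and DEBUG_FUNCTIONS is defined inline here although it would normally come from the build flags.

```c
#include <string.h>

#define DEBUG_FUNCTIONS 1   /* normally passed as -DDEBUG_FUNCTIONS */

typedef unsigned int (*mult_fn)(unsigned int a, unsigned int b);

/* Hypothetical stand-in for gf_t plus its internal name record. */
struct demo_gf {
  struct { mult_fn w32; } multiply;
  struct { const char *multiply; } names;
};

#ifdef DEBUG_FUNCTIONS
/* Assign the function pointer and remember the function's name. */
#define SET_FUNCTION(gf, method, size, func) \
  do { (gf)->method.size = (func); (gf)->names.method = #func; } while (0)
#else
/* Without DEBUG_FUNCTIONS this is a plain assignment, as before. */
#define SET_FUNCTION(gf, method, size, func) \
  do { (gf)->method.size = (func); } while (0)
#endif

/* Shift-and-add multiply in GF(2^4), assuming the polynomial x^4 + x + 1. */
static unsigned int demo_shift_multiply(unsigned int a, unsigned int b)
{
  unsigned int p = 0;
  a &= 0xF;
  b &= 0xF;
  while (b) {
    if (b & 1) p ^= a;        /* add a for each set bit of b */
    b >>= 1;
    a <<= 1;
    if (a & 0x10) a ^= 0x13;  /* reduce when the degree reaches 4 */
  }
  return p;
}

static void demo_init(struct demo_gf *gf)
{
  gf->names.multiply = "";
  SET_FUNCTION(gf, multiply, w32, demo_shift_multiply);
}

/* Returns 1 if the recorded name matches, which is what a test would check. */
static int demo_uses(const struct demo_gf *gf, const char *name)
{
  return strcmp(gf->names.multiply, name) == 0;
}
```

The `do { ... } while (0)` wrapper is a design choice of this sketch. It requires a trailing semicolon at the call site, unlike the call shown in the commit message.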
Janne Grunau 1311a44f7a arm: NEON optimisations for gf_w4
Optimisations for the single table region multiplication and carry-less
multiplication using NEON's polynomial multiplication of 8-bit values.

The single polynomial multiplication is not that useful, but the vector
version is useful for region multiplication.

Selected time_tool.sh results for a 1.7GHz cortex-a9:
Region Best (MB/s):   672.72   W-Method: 4 -m CARRY_FREE -
Region Best (MB/s):   265.84   W-Method: 4 -m BYTWO_p -
Region Best (MB/s):   329.41   W-Method: 4 -m TABLE -r DOUBLE -
Region Best (MB/s):   278.63   W-Method: 4 -m TABLE -r QUAD -
Region Best (MB/s):   329.81   W-Method: 4 -m TABLE -r QUAD -r LAZY -
Region Best (MB/s):  1318.03   W-Method: 4 -m TABLE -r SIMD -
Region Best (MB/s):   165.15   W-Method: 4 -m TABLE -r NOSIMD -
Region Best (MB/s):    99.73   W-Method: 4 -m LOG -
2014-10-24 14:53:12 +02:00
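
To make the carry-less multiplication mentioned in 1311a44f7a concrete, here is a portable scalar sketch. It is not the NEON code from the commit: `clmul8` computes in plain C the 8-bit by 8-bit polynomial product that one lane of a NEON polynomial multiply produces, and `gf4_clm_multiply` reduces that product to a 4-bit field element. Both function names are hypothetical, and the reduction polynomial x^4 + x + 1 (0x13) is an assumption of this example.

```c
#include <stdint.h>

/* Carry-less (polynomial) multiply of two 8-bit values over GF(2).
 * Partial products are combined with XOR, so no carries propagate. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
  uint16_t p = 0;
  int i;
  for (i = 0; i < 8; i++) {
    if (b & (1u << i)) {
      p ^= (uint16_t)((uint16_t)a << i);
    }
  }
  return p;
}

/* Multiply in GF(2^4): carry-less product, then reduction.
 * Assumes the polynomial x^4 + x + 1, written 0x13. */
static uint8_t gf4_clm_multiply(uint8_t a, uint8_t b)
{
  uint16_t p = clmul8(a & 0xF, b & 0xF);  /* product has degree <= 6 */
  int i;
  for (i = 6; i >= 4; i--) {
    if (p & (1u << i)) {
      p ^= (uint16_t)(0x13u << (i - 4));  /* cancel bit i */
    }
  }
  return (uint8_t)p;
}
```

A vector version applies the same multiply and reduce steps to many bytes at once, which is where the region multiplication gains in the results above come from.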
Janne Grunau 568df90edc simd: rename the region flags from SSE to SIMD
SSE is not the only supported SIMD instruction set. Keep the old names
for backward compatibility.
2014-10-09 23:22:32 +02:00
Danny Al-Gaaf df2c84d232 gf_w4.c: remove some dead code
Fix for coverity issue from Ceph project:

CID 1193093 (#1 of 1): Structurally dead code (UNREACHABLE)
 unreachable: This code cannot be reached: "return gf_w4_double_table_i...".

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-04-22 20:08:27 +02:00
Danny Al-Gaaf 3b6364e5f2 gf_w4.c: add missing breaks
Since no comment indicates that the fallthrough is intentional, add a
break to switch cases 5 and 6.

Fix for coverity issue from Ceph project:

CID 1193082 (#1 of 1): Missing break in switch (MISSING_BREAK)
 unterminated_case: This case (value 5) is not terminated by a 'break'
 statement.

CID 1193083 (#1 of 1): Missing break in switch (MISSING_BREAK)
 unterminated_case: This case (value 6) is not terminated by a 'break'
 statement.

Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
2014-04-22 20:08:26 +02:00
Loic Dachary b5ac2580c2 TODO reminder for KMG/JSP about hardcoded constant
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-04-10 16:59:38 +02:00
Loic Dachary cfcc1881ea remove unused argument from SSE_AB2
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-06 17:48:38 +01:00
Loic Dachary 191b86b5d2 remove unused variables from #if SSE blocks
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-06 17:48:38 +01:00
Loic Dachary 29899ad443 move #if to avoid unused warning
Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-06 17:32:30 +01:00
Loic Dachary f043479e3c remove unused variables
In some places, move variables into the scope of the CPP define where
they are used.

Signed-off-by: Loic Dachary <loic@dachary.org>
2014-03-06 15:15:22 +01:00
Jim Plank fb0bbdcf62 Fixed the problem with PCLMUL and gf_complete.h. Removed
ARCH_64 from everything but 128/GROUP/SSE.  Fortunately, no
one ever uses that.
2013-12-31 20:08:18 -05:00
Kevin Greenan 5687b9c2cc Third.1 time's a charm (autoconf non-sense for PCLMUL).
2013-12-30 22:50:04 -08:00
Kevin Greenan 137b7ccd75 Revert "Third time's a charm (autoconf non-sense for PCLMUL)."
The commit was not successfully pushed (not sure what happened).

This reverts commit 762926920a.
2013-12-30 22:40:18 -08:00
Kevin Greenan 762926920a Third time's a charm (autoconf non-sense for PCLMUL).
2013-12-30 21:26:47 -08:00
Kevin Greenan 153dd20988 Setting up autoconf/automake for GF-Complete
Also re-organized the directory structure.

Signed-off-by: Kevin Greenan <kmgreen2@gmail.com>
2013-12-04 21:24:29 -08:00