reedsolomon-go

Commit Graph

Author	SHA1	Message	Date
Zhang Boyang	195d6fc1ad	Fix build tags for gccgo (#163)	2021-03-18 13:39:19 +01:00
Klaus Post	f338110979	Make sure assembler is formatted (#145)	2020-05-14 12:04:55 +02:00
Frank Wessels	27f8a7b6bf	Small optimization to parallel82 for AVX512 by reducing the number of VSHUFI64X2 instructions in the core loop (#143)	2020-05-14 10:19:23 +02:00
Frank Wessels	d6d9fba4f9	Take vshufi64x2 out of main loop and initialize upfront (for parallel 81 only) (#139)	2020-05-13 10:59:26 +02:00
Klaus Post	3067f8aed5	asmfmt	2020-05-06 12:36:43 +02:00
Frank Wessels	1b9e129671	Avx512 parallel81 (#131) * AVX512 routine for 8x1 parallel processing (WIP) * Testing and integration of Parallel81 assembly routine	2020-05-06 12:32:31 +02:00
Frank Wessels	0b98f5350a	Refactor AVX512 code to use Go assembly instructions. (#121) Additionally there is a small performance improvement using VPTERNLOGD (instead of two VPXORD instructions).	2020-05-03 13:43:52 +02:00
Frank Wessels	79aee05119	AVX512 accelerated version resulting in a 4x speed improvement over AVX2 (#91) The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis of up to 4x compared to AVX2 as can be seen in the following table: ``` $ benchcmp avx2.txt avx512.txt benchmark AVX2 MB/s AVX512 MB/s speedup BenchmarkEncode8x8x1M-72 1681.35 4125.64 2.45x BenchmarkEncode8x4x8M-72 1529.36 5507.97 3.60x BenchmarkEncode8x8x8M-72 791.16 2952.29 3.73x BenchmarkEncode8x8x32M-72 573.26 2168.61 3.78x BenchmarkEncode12x4x12M-72 1234.41 4912.37 3.98x BenchmarkEncode16x4x16M-72 1189.59 5138.01 4.32x BenchmarkEncode24x8x24M-72 690.68 2583.70 3.74x BenchmarkEncode24x8x48M-72 674.20 2643.31 3.92x ```	2019-02-10 11:17:23 +01:00
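
Commit 0b98f5350a mentions replacing two VPXORD instructions with a single VPTERNLOGD. The sketch below is not the repository's assembly. It is a plain-Go emulation, using a hypothetical helper named `ternlog`, of why one ternary-logic operation can stand in for a chained three-way XOR. It assumes the immediate 0x96, which is the truth table for a XOR b XOR c; the commit message does not state the immediate used.

```go
package main

import "fmt"

// ternlog is a hypothetical helper (not part of the library) that emulates a
// bitwise ternary-logic operation on one 32-bit lane: at each bit position,
// the bits of a, b and c form a 3-bit index that selects one bit of imm.
func ternlog(a, b, c uint32, imm uint8) uint32 {
	var out uint32
	for i := uint(0); i < 32; i++ {
		idx := ((a>>i)&1)<<2 | ((b>>i)&1)<<1 | (c>>i)&1
		out |= uint32((imm>>idx)&1) << i
	}
	return out
}

func main() {
	a, b, c := uint32(0xDEADBEEF), uint32(0x12345678), uint32(0x0F0F0F0F)

	// Two chained XORs, as two VPXORD instructions would compute per lane.
	twoXors := (a ^ b) ^ c

	// One ternary-logic step. 0x96 has bits 1, 2, 4 and 7 set, i.e. exactly
	// the index values with an odd number of 1 bits, which is three-way XOR.
	oneTernlog := ternlog(a, b, c, 0x96)

	fmt.Println(twoXors == oneTernlog)
}
```

The emulation loops over bits for clarity only; the hardware instruction applies the same table to every bit of the vector register at once, which is where the saved instruction comes from.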

8 Commits (jerasure-matrix)