reedsolomon-go

Commit Graph

Author

SHA1

Message

Date

Klaus Post

a9588190c0

Optimize pure Go version. (#96)

* Optimize pure Go version.
* Update docs. Add Go 1.12 CI.
* Avoid dst bounds check when using noasm (~40-50% faster).
* Convert multiply table to a slice whenever used.
* Split on 32-byte boundaries instead of 16-byte.
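
The bounds-check and multiply-table items above can be illustrated with a minimal sketch. This is not the code from the commit: the function names, the table layout, and the field polynomial (0x11d) are assumptions chosen for illustration. The idea is to take the 256-entry table row as a slice once, and to reslice `out` to `len(in)` before the loop so the compiler has a chance to drop the per-iteration bounds check on `out[i]`.

```go
package main

import "fmt"

// galMultiply multiplies a and b in GF(2^8) by shift-and-add.
// Assumption: the reducing polynomial is x^8+x^4+x^3+x^2+1 (0x11d).
func galMultiply(a, b byte) byte {
	var p byte
	for b > 0 {
		if b&1 == 1 {
			p ^= a
		}
		carry := a & 0x80
		a <<= 1
		if carry != 0 {
			a ^= 0x1d // reduce modulo the field polynomial
		}
		b >>= 1
	}
	return p
}

// mulTable[c][x] holds c*x in GF(2^8) (hypothetical layout).
var mulTable [256][256]byte

func init() {
	for c := 0; c < 256; c++ {
		for x := 0; x < 256; x++ {
			mulTable[c][x] = galMultiply(byte(c), byte(x))
		}
	}
}

// galMulSlice writes c*in[i] to out[i] for every i.
// It panics if out has less capacity than len(in).
func galMulSlice(c byte, in, out []byte) {
	// Convert the table row to a slice once, outside the loop.
	mt := mulTable[c][:256]
	// Reslice so that len(out) == len(in); indexing out[i] inside
	// the range loop is then provably in range.
	out = out[:len(in)]
	for i, v := range in {
		out[i] = mt[v]
	}
}

func main() {
	in := []byte{0x01, 0x02, 0x80}
	out := make([]byte, len(in))
	galMulSlice(2, in, out)
	fmt.Printf("%x\n", out)
}
```

Whether the check is actually eliminated depends on the compiler version; building with `go build -gcflags=-d=ssa/check_bce/debug=1` reports the bounds checks that remain.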

2019-03-08 10:49:27 +01:00

Frank Wessels

79aee05119

AVX512 accelerated version resulting in a 4x speed improvement over AVX2 (#91)

The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis of up to 4x compared to AVX2 as can be seen in the following table:

```
$ benchcmp avx2.txt avx512.txt
benchmark                      AVX2 MB/s    AVX512 MB/s   speedup
BenchmarkEncode8x8x1M-72       1681.35      4125.64       2.45x
BenchmarkEncode8x4x8M-72       1529.36      5507.97       3.60x
BenchmarkEncode8x8x8M-72        791.16      2952.29       3.73x
BenchmarkEncode8x8x32M-72       573.26      2168.61       3.78x
BenchmarkEncode12x4x12M-72     1234.41      4912.37       3.98x
BenchmarkEncode16x4x16M-72     1189.59      5138.01       4.32x
BenchmarkEncode24x8x24M-72      690.68      2583.70       3.74x
BenchmarkEncode24x8x48M-72      674.20      2643.31       3.92x
```

2019-02-10 11:17:23 +01:00

Frank Wessels

8885f3a1c7

Feature/ppc support (#88)

Add accelerated PPC support.

2018-12-18 20:39:59 +01:00

3 Commits (70d62797617ea2be6dc26d343d2bce4e2c4e8e34)