This commit limits the capacity (additionally
to the length) of each shard to the shard size.
Before this change the following code behaves in
an unexpected way:
```
shards := encoder.Split(buffer)
// ...
shards[0] = shards[0][:cap(shards[0])
```
Instead of restoring the length of `shards[0]` to
the shard size, it assigns the entire memory of `buffer`
to `shards[0]`.
* Optimize pure Go version.
* Update docs. Add Go 1.12 CI
* Avoid dst bounds check when using noasm ~ 40-50% faster.
* Convert multiply table to a slice whenever used.
* Split on 32 byte boundaries instead of 16 byte.
The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis of up to 4x compared to AVX2 as can be seen in the following table:
```
$ benchcmp avx2.txt avx512.txt
benchmark AVX2 MB/s AVX512 MB/s speedup
BenchmarkEncode8x8x1M-72 1681.35 4125.64 2.45x
BenchmarkEncode8x4x8M-72 1529.36 5507.97 3.60x
BenchmarkEncode8x8x8M-72 791.16 2952.29 3.73x
BenchmarkEncode8x8x32M-72 573.26 2168.61 3.78x
BenchmarkEncode12x4x12M-72 1234.41 4912.37 3.98x
BenchmarkEncode16x4x16M-72 1189.59 5138.01 4.32x
BenchmarkEncode24x8x24M-72 690.68 2583.70 3.74x
BenchmarkEncode24x8x48M-72 674.20 2643.31 3.92x
```
* Experimental Cauchy Matrix
Experimental support for Cauchy style matrix
http://web.eecs.utk.edu/~plank/plank/papers/CS-05-569.pdf
All matrices appear reversible.
* Remove Go 1.5 and 1.6 from CI tests.
* Fix comment.
* Increase max number of goroutines+docs.
This changes the interface of Reconstruct and ReconstructData to accept
slices of zero length but sufficient capacity for shards to reconstruct,
and reslices them instead of allocating new memory.
* Add support for arm64 using NEON instructions
Specifically using the PMULL/PMULL2 polynomial multiplication instructions followed by a reduction step (actually two steps).
* Add ARM performance numbers
* Formatting for performance table
* Refactoring of NEON version and 256-bit wide version
* Expand test slice beyond 32 (for AVX2 and NEON) and test galMulSliceXor explicitly.
* Fix ARM code with missing function.
* Fix missing newline
Split divided the data into `DataShards` blocks and allocates all parity blocks.
This change adds a check whether the capacity of data is large enough to hold all
data and parity blocks. It only allocates parity blocks if necessary.
* Add ReconstructData interface method to allow reconstruction of any missing data shards
* Add support for just reconstructing data shards only to SteamEncoder.Reconstruct()
PAR1 is a file format which uses a Reed-Solomon code similar
to the current one, except it uses a different (flawed) coding
matrix.
Add support for it via a WithPAR1Matrix option, so that this code
can be used to encode/decode PAR1 files. Also add the option to
existing tests, and add a test demonstrating the flaw in PAR1's
coding matrix.
Also fix an mistakenly inverted test in testOpts().
Incidentally, PAR1 is obsoleted by PAR2, which uses GF(2^16)
and tries to fix the flaw in the coding matrix; however, PAR2's
coding matrix is still flawed! The real solution is to build the
coding matrix like in this repository.
PAR1 spec:
http://parchive.sourceforge.net/docs/specifications/parity-volume-spec-1.0/article-spec.html
Paper describing the (flawed) Reed-Solomon code used by PAR1:
http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.html
* Add options
Make constants changeable as options.
The API remains backwards compatible.
* Update documentation.
* Fix line endings
* fmt
* fmt
* Use functions for parameters.
Much neater.