235 Commits (7bd22796ec03e9a4bd80544495c988db527e5f62)

Author SHA1 Message Date
Klaus Post 70d6279761 Update travis script 2019-09-27 16:33:57 -07:00
Christian Muehlhaeuser 4681100338 Removed unused struct members (#106)
creads & cwrites both seem to be unused.
2019-09-27 16:31:11 -07:00
Christian Muehlhaeuser 993c27a5ba Avoid unnecessary conversion (#107)
No need to convert to byte here.
2019-09-27 16:30:54 -07:00
Andreas Auernhammer 1f1369aa84 limit capacity of shards to shard size (#109)
This commit limits the capacity (additionally
to the length) of each shard to the shard size.

Before this change the following code behaves in
an unexpected way:
```
shards := encoder.Split(buffer)
// ...
shards[0] = shards[0][:cap(shards[0])]
```

Instead of restoring the length of `shards[0]` to
the shard size, it assigns the entire memory of `buffer`
to `shards[0]`.
2019-09-27 16:30:26 -07:00
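The capacity limit described in commit 1f1369aa84 can be illustrated with Go's full slice expression. This is a minimal sketch, not the library's code: `splitCapped` is a hypothetical stand-in for `Split`, and it assumes the buffer length is an exact multiple of the shard count.

```go
package main

import "fmt"

// splitCapped is a hypothetical stand-in for the library's Split.
// The full slice expression buf[lo:hi:hi] sets each shard's capacity
// equal to its length, so a shard cannot be resliced into its neighbours.
func splitCapped(buf []byte, n int) [][]byte {
	size := len(buf) / n // assumes len(buf) is a multiple of n
	shards := make([][]byte, n)
	for i := range shards {
		lo, hi := i*size, (i+1)*size
		shards[i] = buf[lo:hi:hi]
	}
	return shards
}

func main() {
	shards := splitCapped(make([]byte, 12), 3)
	shards[0] = shards[0][:0]              // shrink the shard
	shards[0] = shards[0][:cap(shards[0])] // restore it to the shard size
	fmt.Println(len(shards[0]), cap(shards[0])) // prints "4 4"
}
```

With a plain `buf[lo:hi]` the restored shard would instead span all 12 bytes of the buffer, which is the surprise the commit message describes.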
dssysolyatin 7890684129 Improve quick check for case when dataOnly is true (#105) 2019-06-25 16:30:44 +02:00
dssysolyatin ec2eb9fb8c Split: Reduce memory allocation (#103)
* [Split] Reduce memory allocation in Split function
2019-06-25 16:28:24 +02:00
Klaus Post 0883d2f011 Only enable AVX512 on AMD64
Fixes #102
2019-05-26 12:12:55 +02:00
Lennart Oldenburg a373324398 Fixed upper bound check for data shard cli argument in example encoders and file permission issue. (#98) 2019-04-07 17:36:31 +02:00
Klaus Post a9588190c0 Optimize pure Go version. (#96)
* Optimize pure Go version.
* Update docs. Add Go 1.12 CI

* Avoid dst bounds check when using noasm ~ 40-50% faster.
* Convert multiply table to a slice whenever used.
* Split on 32 byte boundaries instead of 16 byte.
2019-03-08 10:49:27 +01:00
Klaus Post 09979cdf93 Start documentation with method name.
Replaces #92
2019-02-15 15:31:43 +01:00
Klaus Post 2b210cf086 Update README.md
Remove dead link
2019-02-10 22:49:25 +01:00
Frank Wessels 79aee05119 AVX512 accelerated version resulting in a 4x speed improvement over AVX2 (#91)
The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis of up to 4x compared to AVX2 as can be seen in the following table:

```
$ benchcmp avx2.txt avx512.txt
benchmark                      AVX2 MB/s    AVX512 MB/s   speedup
BenchmarkEncode8x8x1M-72       1681.35      4125.64       2.45x
BenchmarkEncode8x4x8M-72       1529.36      5507.97       3.60x
BenchmarkEncode8x8x8M-72        791.16      2952.29       3.73x
BenchmarkEncode8x8x32M-72       573.26      2168.61       3.78x
BenchmarkEncode12x4x12M-72     1234.41      4912.37       3.98x
BenchmarkEncode16x4x16M-72     1189.59      5138.01       4.32x
BenchmarkEncode24x8x24M-72      690.68      2583.70       3.74x
BenchmarkEncode24x8x48M-72      674.20      2643.31       3.92x
```
2019-02-10 11:17:23 +01:00
Frank Wessels 8885f3a1c7 Feature/ppc support (#88)
Add accelerated PPC support.
2018-12-18 20:39:59 +01:00
Klaus Post 278ba25f43 Pre-slice input. 2018-11-16 00:23:56 +01:00
Klaus Post 454fd91890 Maintenance updates. (#86)
* Add gcc go build tags.
* Update Travis.
* Fix typo
2018-11-12 13:25:55 +01:00
Felix Yan 925cb01d65 Fix several typos in matrix_test.go (#80) 2018-07-04 19:30:09 +02:00
Darren 3133c51b91 Added link to ocaml-reed-solomon-erasure to README.md (#79) 2018-06-30 10:15:29 +02:00
Klaus Post 7d9453e171 Update README.md 2018-05-04 15:02:00 +02:00
Klaus Post 0b30fa71cc Merge pull request #75 from ernado/patch-1
Fix typo in README.md
2017-12-19 14:34:37 +01:00
Aleksandr Razumov 19a926a71b Fix typo in README.md 2017-12-15 19:01:33 +03:00
Klaus Post 6db5e38e85 Merge pull request #74 from klauspost/align-blocks
Split blocks into size divisible by 16, add WithAutoGoroutines
2017-12-10 21:00:05 +01:00
Klaus Post f5e73dcfe2 Split blocks into size divisible by 16
Older systems (typically without AVX2) are more sensitive to misaligned load+stores.

Add parameter to automatically set the number of goroutines.

name                  old time/op    new time/op    delta
Encode10x2x10000-8      18.4µs ± 1%    16.1µs ± 1%  -12.43%    (p=0.000 n=9+9)
Encode100x20x10000-8     692µs ± 1%     608µs ± 1%  -12.10%  (p=0.000 n=10+10)
Encode17x3x1M-8         1.78ms ± 5%    1.49ms ± 1%  -16.63%  (p=0.000 n=10+10)
Encode10x4x16M-8        21.5ms ± 5%    19.6ms ± 4%   -8.74%   (p=0.000 n=10+9)
Encode5x2x1M-8           343µs ± 2%     267µs ± 2%  -22.22%   (p=0.000 n=9+10)
Encode10x2x1M-8          858µs ± 5%     701µs ± 5%  -18.34%  (p=0.000 n=10+10)
Encode10x4x1M-8         1.34ms ± 1%    1.16ms ± 1%  -13.19%    (p=0.000 n=9+9)
Encode50x20x1M-8        30.3ms ± 4%    25.0ms ± 2%  -17.51%   (p=0.000 n=10+8)
Encode17x3x16M-8        26.9ms ± 1%    24.5ms ± 4%   -9.13%   (p=0.000 n=8+10)

name                  old speed      new speed      delta
Encode10x2x10000-8    5.45GB/s ± 1%  6.22GB/s ± 1%  +14.20%    (p=0.000 n=9+9)
Encode100x20x10000-8  1.44GB/s ± 1%  1.64GB/s ± 1%  +13.77%  (p=0.000 n=10+10)
Encode17x3x1M-8       10.0GB/s ± 5%  12.0GB/s ± 1%  +19.88%  (p=0.000 n=10+10)
Encode10x4x16M-8      7.81GB/s ± 5%  8.56GB/s ± 5%   +9.58%   (p=0.000 n=10+9)
Encode5x2x1M-8        15.3GB/s ± 2%  19.6GB/s ± 2%  +28.57%   (p=0.000 n=9+10)
Encode10x2x1M-8       12.2GB/s ± 5%  15.0GB/s ± 5%  +22.45%  (p=0.000 n=10+10)
Encode10x4x1M-8       7.84GB/s ± 1%  9.03GB/s ± 1%  +15.19%    (p=0.000 n=9+9)
Encode50x20x1M-8      1.73GB/s ± 4%  2.09GB/s ± 4%  +20.59%   (p=0.000 n=10+9)
Encode17x3x16M-8      10.6GB/s ± 1%  11.7GB/s ± 4%  +10.12%   (p=0.000 n=8+10)
2017-11-18 22:00:55 +01:00
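The block alignment from commit f5e73dcfe2 amounts to making each per-goroutine chunk a multiple of 16 bytes. The sketch below is an assumption about how that could look, not the library's implementation: `alignedChunk` is a hypothetical helper, and rounding up (rather than down) is an illustrative choice.

```go
package main

import "fmt"

// alignedChunk is a hypothetical helper: it divides total bytes among
// workers and rounds each chunk up to a multiple of 16 bytes, so every
// chunk except possibly the last starts and ends on a 16-byte boundary.
func alignedChunk(total, workers int) int {
	per := (total + workers - 1) / workers // ceiling division
	return (per + 15) &^ 15                // clear the low four bits
}

func main() {
	fmt.Println(alignedChunk(1000, 3)) // prints 336
}
```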
Nick Heindl e52c150f96 Fix some small typos in README (#71) 2017-11-18 16:17:31 +01:00
Frank Wessels 3610933d2f Use AVX2 SIMD assembly instructions in favor of BYTE sequences. (#73)
* Use AVX2 SIMD assembly instructions in favor of BYTE sequences.
2017-11-18 16:17:10 +01:00
Klaus Post 6bb6130ff6 Add latest new feature to doc. 2017-10-01 14:06:06 +02:00
Klaus Post 61c22eab55 Cauchy Matrix option (#70)
* Experimental Cauchy Matrix

Experimental support for Cauchy style matrix

http://web.eecs.utk.edu/~plank/plank/papers/CS-05-569.pdf

All matrices appear reversible.

* Remove Go 1.5 and 1.6 from CI tests.

* Fix comment.

* Increase max number of goroutines+docs.
2017-10-01 14:02:11 +02:00
David Reiss ddcafc661e Allow reconstructing into pre-allocated memory. (#66)
This changes the interface of Reconstruct and ReconstructData to accept
slices of zero length but sufficient capacity for shards to reconstruct,
and reslices them instead of allocating new memory.
2017-09-20 21:08:24 +02:00
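The reslicing behaviour described in commit ddcafc661e can be sketched in isolation. `shardBuffer` is a hypothetical helper, not part of the library; it only shows the general idea of reusing a zero-length slice whose capacity is large enough, and allocating otherwise.

```go
package main

import "fmt"

// shardBuffer is a hypothetical helper: it reslices s when its capacity
// is sufficient and allocates new memory only when it is not.
func shardBuffer(s []byte, size int) []byte {
	if cap(s) >= size {
		return s[:size] // reuse the caller's memory
	}
	return make([]byte, size)
}

func main() {
	backing := make([]byte, 0, 8) // zero length, capacity for one shard
	out := shardBuffer(backing, 8)
	out[0] = 42
	// The write is visible through the original slice: no new allocation.
	fmt.Println(len(out), backing[:1][0]) // prints "8 42"
}
```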
Klaus Post 87ba8262ab Add Go 1.9 to Travis. 2017-08-26 11:54:10 +02:00
Klaus Post 985e396eec Asmfmt. 2017-08-26 11:51:49 +02:00
Klaus Post c71640765a Update docs before release, when #62 is ready. (#63)
* Update docs before release, when #62 is ready.

* Update README.md
2017-08-26 11:48:42 +02:00
Frank Wessels 7b88f42e61 Add NEON support for ARM64 (#62)
* Add support for arm64 using NEON instructions

Specifically using the PMULL/PMULL2 polynomial multiplication instructions followed by a reduction step (actually two steps).

* Add ARM performance numbers

* Formatting for performance table

* Refactoring of NEON version and 256-bit wide version

* Expand test slice beyond 32 (for AVX2 and NEON) and test galMulSliceXor explicitly.

* Fix ARM code with missing function.

* Fix missing newline
2017-08-26 11:47:42 +02:00
chenzhongtao d78bf472d8 add Update parity function (#60)
Add Update parity function
2017-08-20 11:42:39 +02:00
Klaus Post dc6af2dce5 Minor cleanup (#61)
* Remove some benchmarks
* Format tables a bit.
* Doc cleanup
2017-08-13 22:38:27 +02:00
Andreas Auernhammer 48a4fd05f1 fix unnecessary memory alloc in Split (#59)
Split divided the data into `DataShards` blocks and always allocated all parity blocks.

This change adds a check whether the capacity of data is large enough to hold all
data and parity blocks. It only allocates parity blocks if necessary.
2017-07-22 16:16:58 +02:00
Klaus Post 82ee2d9869 Update README.md 2017-07-20 12:24:02 +02:00
Frank Wessels 0de37d7697 Add ReconstructData interface method (#57)
* Add ReconstructData interface method to allow reconstruction of any missing data shards
* Add support for reconstructing only data shards to StreamEncoder.Reconstruct()
2017-07-20 12:15:46 +02:00
Klaus Post 0dd0a0e50c Fix error grammar
Fixes #56
2017-07-16 17:00:58 +02:00
Fred Akalin 18d548df63 Add support for PAR1 (#55)
PAR1 is a file format which uses a Reed-Solomon code similar
to the current one, except it uses a different (flawed) coding
matrix.

Add support for it via a WithPAR1Matrix option, so that this code
can be used to encode/decode PAR1 files. Also add the option to
existing tests, and add a test demonstrating the flaw in PAR1's
coding matrix.

Also fix a mistakenly inverted test in testOpts().

Incidentally, PAR1 is obsoleted by PAR2, which uses GF(2^16)
and tries to fix the flaw in the coding matrix; however, PAR2's
coding matrix is still flawed! The real solution is to build the
coding matrix like in this repository.

PAR1 spec:
http://parchive.sourceforge.net/docs/specifications/parity-volume-spec-1.0/article-spec.html

Paper describing the (flawed) Reed-Solomon code used by PAR1:
http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.html
2017-06-20 20:24:57 +02:00
Fred Akalin 87c4e5ae75 Allow 256 total shards (#54)
* Allow 256 total shards
2017-06-19 11:26:52 +02:00
timchenxiaoyu d4658f22be fix example error (#53) 2017-06-06 22:26:01 +02:00
Klaus Post dde6ad55c5 Set correct field in WithMinSplitSize
Fixes #51
2017-05-28 12:38:06 +02:00
Klaus Post 5abf0ee302 Add options (#46)
* Add options

Make constants changeable as options.

The API remains backwards compatible.

* Update documentation.

* Fix line endings

* fmt

* fmt

* Use functions for parameters.

Much neater.
2017-02-19 11:13:22 +01:00
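"Use functions for parameters" in commit 5abf0ee302 reads like the functional options pattern. The following is a minimal sketch of that pattern under stated assumptions: the type, the lowercase option names, and the default values are all hypothetical and chosen for illustration only.

```go
package main

import "fmt"

// options holds tunables that were previously constants.
// Field names and defaults here are illustrative assumptions.
type options struct {
	maxGoroutines int
	minSplitSize  int
}

// option is a function that adjusts one setting.
type option func(*options)

func withMaxGoroutines(n int) option { return func(o *options) { o.maxGoroutines = n } }
func withMinSplitSize(n int) option  { return func(o *options) { o.minSplitSize = n } }

// newOptions starts from defaults and applies each option in order.
// Calling it with no arguments yields the defaults, which is what keeps
// a variadic constructor backwards compatible with existing callers.
func newOptions(opts ...option) options {
	o := options{maxGoroutines: 8, minSplitSize: 1024}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	def := newOptions()
	custom := newOptions(withMinSplitSize(512))
	fmt.Println(def.maxGoroutines, def.minSplitSize)       // prints "8 1024"
	fmt.Println(custom.maxGoroutines, custom.minSplitSize) // prints "8 512"
}
```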
Klaus Post c056598956 Merge pull request #39 from jesselucas/patch-1
Update README.md to fix small typos.
2017-01-05 16:18:16 +01:00
Jesse Lucas ff2f89b6ca Update README.md to fix small typos. 2017-01-05 00:16:24 -05:00
Klaus Post d0a56f72c0 Update README.md 2016-10-28 09:13:20 +02:00
Klaus Post 9998b4cb21 Update README.md 2016-10-28 09:00:26 +02:00
Peter C c54154da9e Add Inverse Matrix caching in a Thread-Safe Lookup Tree (#36)
* Add matrix inversion caching
* Benchmark and Parallel Benchmark tests for Reconstruct
2016-09-12 21:31:07 +02:00
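The caching idea in commit c54154da9e can be sketched with a simpler structure than the lookup tree the commit names. The example below deliberately swaps in a map guarded by a `sync.RWMutex`; `inverseCache`, `cacheKey`, and the key format are hypothetical, and the sketch assumes the list of missing shard indices is always passed in sorted order.

```go
package main

import (
	"fmt"
	"sync"
)

// inverseCache maps a set of missing shard indices to a previously
// computed inverse matrix. A map stands in for the commit's lookup tree.
type inverseCache struct {
	mu sync.RWMutex
	m  map[string][][]byte
}

func newInverseCache() *inverseCache {
	return &inverseCache{m: make(map[string][][]byte)}
}

// cacheKey assumes missing is sorted, so equal sets yield equal keys.
func cacheKey(missing []int) string { return fmt.Sprint(missing) }

// get takes a read lock so concurrent lookups do not block each other.
func (c *inverseCache) get(missing []int) ([][]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	inv, ok := c.m[cacheKey(missing)]
	return inv, ok
}

// put takes the write lock while storing a newly computed inverse.
func (c *inverseCache) put(missing []int, inv [][]byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[cacheKey(missing)] = inv
}

func main() {
	c := newInverseCache()
	c.put([]int{1, 3}, [][]byte{{1, 0}, {0, 1}})
	_, hit := c.get([]int{1, 3})
	_, miss := c.get([]int{2})
	fmt.Println(hit, miss) // prints "true false"
}
```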
Klaus Post fac1884d47 Merge pull request #34 from muesli/master
Make Join return an error if a reconstruction is required first
2016-08-22 14:13:59 +02:00
Christian Muehlhaeuser b1c8b4b073 Make Join return an error if a reconstruction is required first
If one or more required data shards are still nil and we can't correctly join
them before a reconstruction, return ErrReconstructRequired.
2016-08-05 19:23:08 +02:00
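The check added in commit b1c8b4b073 can be sketched as follows. This is not the library's `Join`: the function, the `sink` writer, and the lowercase `errReconstructRequired` are hypothetical stand-ins, and the sketch simplifies by rejecting any nil data shard rather than only those needed to reach the requested output size.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errReconstructRequired is a local stand-in for the error named in the commit.
var errReconstructRequired = errors.New("reconstruction required: a data shard is nil")

// sink is a tiny in-memory io.Writer used only for this example.
type sink struct{ data []byte }

func (s *sink) Write(p []byte) (int, error) {
	s.data = append(s.data, p...)
	return len(p), nil
}

// join writes the first outSize bytes of the data shards to dst.
// It refuses to write anything if a data shard is still missing.
func join(dst io.Writer, shards [][]byte, dataShards, outSize int) error {
	if len(shards) < dataShards {
		return errors.New("too few shards")
	}
	for _, s := range shards[:dataShards] {
		if s == nil {
			return errReconstructRequired
		}
	}
	remaining := outSize
	for _, s := range shards[:dataShards] {
		if remaining <= 0 {
			break
		}
		if len(s) > remaining {
			s = s[:remaining] // trim padding in the final shard
		}
		if _, err := dst.Write(s); err != nil {
			return err
		}
		remaining -= len(s)
	}
	return nil
}

func main() {
	out := &sink{}
	err := join(out, [][]byte{[]byte("ab"), nil, []byte("ef")}, 3, 6)
	fmt.Println(errors.Is(err, errReconstructRequired)) // prints "true"
}
```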
Klaus Post 9b772b54b3 Merge pull request #30 from hackintoshrao/matrix-test
Tests: Coverage and enhancement for matrix_test.go
2016-08-01 19:20:09 +02:00