Klaus Post
cf8495259a
Add pure XOR for 1 parity ( #138 )
...
WithFastOneParityMatrix switches the matrix to a simple XOR when there is only one parity shard.
The PAR1 matrix already has this property, so the option has little effect there.
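With a single parity shard, the coding row can be all ones, so encoding degenerates to a byte-wise XOR of the data shards. A minimal self-contained sketch of that idea (illustrative, not the library's internals):

```go
package main

import "fmt"

// xorParity computes the single parity shard as the byte-wise XOR of all
// data shards. With exactly one parity shard, Reed-Solomon encoding with
// an all-ones coding row reduces to exactly this.
func xorParity(data [][]byte) []byte {
	parity := make([]byte, len(data[0]))
	for _, shard := range data {
		for i, b := range shard {
			parity[i] ^= b
		}
	}
	return parity
}

func main() {
	shards := [][]byte{{1, 2, 3}, {4, 5, 6}}
	fmt.Println(xorParity(shards)) // prints [5 7 5]
}
```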
2020-05-13 11:10:58 +02:00
Klaus Post
151d8c7a05
Tweak concurrency ( #132 )
2020-05-06 15:42:30 +02:00
Klaus Post
cb7a0b5aef
Do fast by one multiplication ( #130 )
...
When multiplying by one we can use faster math.
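The faster math is that multiplying by one in GF(2^8) is the identity, so the per-byte table lookups can be replaced by a plain copy. A hedged sketch; `mulSlice` and the bit-by-bit `gfMul` below are illustrative stand-ins (assuming the common 0x1D reduction), not the library's table- and assembly-driven code:

```go
package main

import "fmt"

// mulSlice multiplies each byte of in by the constant c in GF(2^8) and
// writes the result to out. When c == 1 the product equals the input,
// so a plain copy is far cheaper than any per-byte multiply.
func mulSlice(c byte, in, out []byte) {
	if c == 1 {
		copy(out, in) // fast path: x * 1 == x in GF(2^8)
		return
	}
	for i, b := range in {
		out[i] = gfMul(c, b)
	}
}

// gfMul is a slow bit-by-bit GF(2^8) multiply, reducing by 0x1D
// (i.e. the polynomial 0x11D), kept minimal for illustration.
func gfMul(a, b byte) byte {
	var p byte
	for b > 0 {
		if b&1 != 0 {
			p ^= a
		}
		carry := a & 0x80
		a <<= 1
		if carry != 0 {
			a ^= 0x1D
		}
		b >>= 1
	}
	return p
}

func main() {
	in := []byte{10, 20, 30}
	out := make([]byte, 3)
	mulSlice(1, in, out)
	fmt.Println(out) // prints [10 20 30]
}
```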
2020-05-06 11:14:25 +02:00
Klaus Post
0e9e10435f
avx2: Add 64 bytes per loop processing ( #128 )
...
* avx2: Add 64 bytes per loop processing
Not super clean benchmark run, but `BenchmarkGalois` is consistently faster.
```
benchmark old ns/op new ns/op delta
BenchmarkGalois128K-32 2551 2261 -11.37%
BenchmarkGalois1M-32 22492 21107 -6.16%
BenchmarkGaloisXor128K-32 2972 2808 -5.52%
BenchmarkGaloisXor1M-32 25181 23951 -4.88%
BenchmarkEncode10x2x10000-32 5081 4722 -7.07%
BenchmarkEncode100x20x10000-32 383800 346655 -9.68%
BenchmarkEncode17x3x1M-32 264806 263191 -0.61%
BenchmarkEncode10x4x16M-32 8337857 8376910 +0.47%
BenchmarkEncode5x2x1M-32 77119 73598 -4.57%
BenchmarkEncode10x2x1M-32 108424 102423 -5.53%
BenchmarkEncode10x4x1M-32 194427 184301 -5.21%
BenchmarkEncode50x20x1M-32 3870301 3747639 -3.17%
BenchmarkEncode17x3x16M-32 10617586 10602449 -0.14%
BenchmarkEncode_8x4x8M-32 3227254 3229451 +0.07%
BenchmarkEncode_12x4x12M-32 6841898 6847261 +0.08%
BenchmarkEncode_16x4x16M-32 11153469 11048738 -0.94%
BenchmarkEncode_16x4x32M-32 21947506 21826647 -0.55%
BenchmarkEncode_16x4x64M-32 43163608 42971338 -0.45%
BenchmarkEncode_8x5x8M-32 3856675 3780730 -1.97%
BenchmarkEncode_8x6x8M-32 4322023 4437109 +2.66%
BenchmarkEncode_8x7x8M-32 5011434 4959623 -1.03%
BenchmarkEncode_8x9x8M-32 6243694 6098824 -2.32%
BenchmarkEncode_8x10x8M-32 6724456 6657099 -1.00%
BenchmarkEncode_8x11x8M-32 7207693 7340332 +1.84%
BenchmarkEncode_8x8x05M-32 176877 172183 -2.65%
BenchmarkEncode_8x8x1M-32 309716 301743 -2.57%
BenchmarkEncode_8x8x8M-32 5498952 5489078 -0.18%
BenchmarkEncode_8x8x32M-32 22630195 22557074 -0.32%
BenchmarkEncode_24x8x24M-32 28488886 28220702 -0.94%
BenchmarkEncode_24x8x48M-32 56124735 54862495 -2.25%
BenchmarkVerify10x2x10000-32 9874 9356 -5.25%
BenchmarkVerify50x5x50000-32 175610 159735 -9.04%
BenchmarkVerify10x2x1M-32 331276 311726 -5.90%
BenchmarkVerify5x2x1M-32 265466 248075 -6.55%
BenchmarkVerify10x4x1M-32 701627 606420 -13.57%
BenchmarkVerify50x20x1M-32 4338171 4245635 -2.13%
BenchmarkVerify10x4x16M-32 12312830 11932698 -3.09%
BenchmarkReconstruct10x2x10000-32 1594 1504 -5.65%
BenchmarkReconstruct50x5x50000-32 95101 79558 -16.34%
BenchmarkReconstruct10x2x1M-32 38479 37225 -3.26%
BenchmarkReconstruct5x2x1M-32 30968 30013 -3.08%
BenchmarkReconstruct10x4x1M-32 81630 75350 -7.69%
BenchmarkReconstruct50x20x1M-32 1136952 1040156 -8.51%
BenchmarkReconstruct10x4x16M-32 685408 656484 -4.22%
BenchmarkReconstructData10x2x10000-32 1609 1486 -7.64%
BenchmarkReconstructData50x5x50000-32 87090 71512 -17.89%
BenchmarkReconstructData10x2x1M-32 31497 30347 -3.65%
BenchmarkReconstructData5x2x1M-32 23379 22611 -3.28%
BenchmarkReconstructData10x4x1M-32 63853 61035 -4.41%
BenchmarkReconstructData50x20x1M-32 1048807 966201 -7.88%
BenchmarkReconstructData10x4x16M-32 866658 892252 +2.95%
BenchmarkReconstructP10x2x10000-32 544 540 -0.74%
BenchmarkReconstructP10x5x20000-32 1242 1206 -2.90%
BenchmarkSplit10x4x160M-32 2735508 2743214 +0.28%
BenchmarkSplit5x2x5M-32 276232 288523 +4.45%
BenchmarkSplit10x2x1M-32 44389 45517 +2.54%
BenchmarkSplit10x4x10M-32 477282 460888 -3.43%
BenchmarkSplit50x20x50M-32 1608821 1602105 -0.42%
BenchmarkSplit17x3x272M-32 2035932 2034705 -0.06%
BenchmarkParallel_8x8x05M-32 346733 351837 +1.47%
BenchmarkParallel_20x10x05M-32 577127 586232 +1.58%
BenchmarkParallel_8x8x1M-32 722453 729294 +0.95%
BenchmarkParallel_8x8x8M-32 5717650 5817130 +1.74%
BenchmarkParallel_8x8x32M-32 22914260 24132696 +5.32%
BenchmarkStreamEncode10x2x10000-32 6703131 7141021 +6.53%
BenchmarkStreamEncode100x20x10000-32 38175873 39767386 +4.17%
BenchmarkStreamEncode17x3x1M-32 8920549 9218973 +3.35%
BenchmarkStreamEncode10x4x16M-32 21841702 21784898 -0.26%
BenchmarkStreamEncode5x2x1M-32 4088001 3247404 -20.56%
BenchmarkStreamEncode10x2x1M-32 5860652 5932381 +1.22%
BenchmarkStreamEncode10x4x1M-32 7555172 7589960 +0.46%
BenchmarkStreamEncode50x20x1M-32 30006814 30250054 +0.81%
BenchmarkStreamEncode17x3x16M-32 32757489 32818254 +0.19%
BenchmarkStreamVerify10x2x10000-32 6714996 6831093 +1.73%
BenchmarkStreamVerify50x5x50000-32 18525904 18761767 +1.27%
BenchmarkStreamVerify10x2x1M-32 5232278 5444148 +4.05%
BenchmarkStreamVerify5x2x1M-32 3673843 3755283 +2.22%
BenchmarkStreamVerify10x4x1M-32 7184419 7185293 +0.01%
BenchmarkStreamVerify50x20x1M-32 28441187 28574766 +0.47%
BenchmarkStreamVerify10x4x16M-32 8538440 8668614 +1.52%
benchmark old MB/s new MB/s speedup
BenchmarkGalois128K-32 51374.59 57976.36 1.13x
BenchmarkGalois1M-32 46620.03 49679.10 1.07x
BenchmarkGaloisXor128K-32 44106.22 46671.56 1.06x
BenchmarkGaloisXor1M-32 41641.82 43779.89 1.05x
BenchmarkEncode10x2x10000-32 19682.61 21176.81 1.08x
BenchmarkEncode100x20x10000-32 2605.52 2884.71 1.11x
BenchmarkEncode17x3x1M-32 67316.54 67729.50 1.01x
BenchmarkEncode10x4x16M-32 20121.74 20027.93 1.00x
BenchmarkEncode5x2x1M-32 67984.17 71236.47 1.05x
BenchmarkEncode10x2x1M-32 96710.29 102377.00 1.06x
BenchmarkEncode10x4x1M-32 53931.74 56894.82 1.05x
BenchmarkEncode50x20x1M-32 13546.44 13989.82 1.03x
BenchmarkEncode17x3x16M-32 26862.29 26900.64 1.00x
BenchmarkEncode_8x4x8M-32 20794.42 20780.27 1.00x
BenchmarkEncode_12x4x12M-32 22069.16 22051.88 1.00x
BenchmarkEncode_16x4x16M-32 24067.44 24295.58 1.01x
BenchmarkEncode_16x4x32M-32 24461.59 24597.04 1.01x
BenchmarkEncode_16x4x64M-32 24876.09 24987.40 1.00x
BenchmarkEncode_8x5x8M-32 17400.71 17750.24 1.02x
BenchmarkEncode_8x6x8M-32 15527.19 15124.46 0.97x
BenchmarkEncode_8x7x8M-32 13391.15 13531.04 1.01x
BenchmarkEncode_8x9x8M-32 10748.26 11003.58 1.02x
BenchmarkEncode_8x10x8M-32 9979.82 10080.80 1.01x
BenchmarkEncode_8x11x8M-32 9310.73 9142.48 0.98x
BenchmarkEncode_8x8x05M-32 23713.12 24359.50 1.03x
BenchmarkEncode_8x8x1M-32 27084.87 27800.50 1.03x
BenchmarkEncode_8x8x8M-32 12203.94 12225.89 1.00x
BenchmarkEncode_8x8x32M-32 11861.83 11900.28 1.00x
BenchmarkEncode_24x8x24M-32 21200.54 21402.01 1.01x
BenchmarkEncode_24x8x48M-32 21522.77 22017.95 1.02x
BenchmarkVerify10x2x10000-32 10127.24 10688.01 1.06x
BenchmarkVerify50x5x50000-32 28472.25 31301.75 1.10x
BenchmarkVerify10x2x1M-32 31652.63 33637.74 1.06x
BenchmarkVerify5x2x1M-32 19749.74 21134.27 1.07x
BenchmarkVerify10x4x1M-32 14944.92 17291.25 1.16x
BenchmarkVerify50x20x1M-32 12085.46 12348.87 1.02x
BenchmarkVerify10x4x16M-32 13625.80 14059.87 1.03x
BenchmarkReconstruct10x2x10000-32 62723.68 66470.81 1.06x
BenchmarkReconstruct50x5x50000-32 52575.87 62847.32 1.20x
BenchmarkReconstruct10x2x1M-32 272507.04 281685.84 1.03x
BenchmarkReconstruct5x2x1M-32 169299.03 174685.39 1.03x
BenchmarkReconstruct10x4x1M-32 128455.17 139161.42 1.08x
BenchmarkReconstruct50x20x1M-32 46113.48 50404.73 1.09x
BenchmarkReconstruct10x4x16M-32 244777.11 255561.72 1.04x
BenchmarkReconstructData10x2x10000-32 62160.46 67305.98 1.08x
BenchmarkReconstructData50x5x50000-32 57411.81 69917.97 1.22x
BenchmarkReconstructData10x2x1M-32 332909.82 345526.29 1.04x
BenchmarkReconstructData5x2x1M-32 224254.60 231868.74 1.03x
BenchmarkReconstructData10x4x1M-32 164216.61 171799.68 1.05x
BenchmarkReconstructData50x20x1M-32 49988.98 54262.82 1.09x
BenchmarkReconstructData10x4x16M-32 193585.15 188032.29 0.97x
BenchmarkReconstructP10x2x10000-32 183806.57 185284.57 1.01x
BenchmarkReconstructP10x5x20000-32 160985.46 165852.51 1.03x
BenchmarkParallel_8x8x05M-32 12096.63 11921.17 0.99x
BenchmarkParallel_20x10x05M-32 18168.91 17886.72 0.98x
BenchmarkParallel_8x8x1M-32 11611.28 11502.36 0.99x
BenchmarkParallel_8x8x8M-32 11737.14 11536.42 0.98x
BenchmarkParallel_8x8x32M-32 11714.78 11123.31 0.95x
BenchmarkStreamEncode10x2x10000-32 14.92 14.00 0.94x
BenchmarkStreamEncode100x20x10000-32 26.19 25.15 0.96x
BenchmarkStreamEncode17x3x1M-32 1998.28 1933.60 0.97x
BenchmarkStreamEncode10x4x16M-32 7681.28 7701.31 1.00x
BenchmarkStreamEncode5x2x1M-32 1282.50 1614.48 1.26x
BenchmarkStreamEncode10x2x1M-32 1789.18 1767.55 0.99x
BenchmarkStreamEncode10x4x1M-32 1387.89 1381.53 1.00x
BenchmarkStreamEncode50x20x1M-32 1747.23 1733.18 0.99x
BenchmarkStreamEncode17x3x16M-32 8706.79 8690.67 1.00x
BenchmarkStreamVerify10x2x10000-32 14.89 14.64 0.98x
BenchmarkStreamVerify50x5x50000-32 269.89 266.50 0.99x
BenchmarkStreamVerify10x2x1M-32 2004.05 1926.06 0.96x
BenchmarkStreamVerify5x2x1M-32 1427.08 1396.13 0.98x
BenchmarkStreamVerify10x4x1M-32 1459.51 1459.34 1.00x
BenchmarkStreamVerify50x20x1M-32 1843.41 1834.79 1.00x
BenchmarkStreamVerify10x4x16M-32 19649.04 19353.98 0.98x
```
2020-05-05 16:36:01 +02:00
Klaus Post
de70cc155f
AVX512 parallel processing ( #120 )
...
Do concurrent processing in AVX512 mode and split jobs by cache size.
2020-05-04 09:17:40 +02:00
Klaus Post
65df535980
Make single goroutine encodes more efficient ( #122 )
...
Calculate the optimal per round size to keep data in cache when not using WithAutoGoroutines.
```
λ benchcmp before.txt after.txt
benchmark old ns/op new ns/op delta
BenchmarkParallel_8x8x05M-16 675225 321053 -52.45%
BenchmarkParallel_20x10x05M-16 3471988 600740 -82.70%
BenchmarkParallel_8x8x1M-16 3948606 728093 -81.56%
BenchmarkParallel_8x8x8M-16 47361588 5976467 -87.38%
BenchmarkParallel_8x8x32M-16 195044200 24365474 -87.51%
benchmark old MB/s new MB/s speedup
BenchmarkParallel_8x8x05M-16 6211.71 13064.22 2.10x
BenchmarkParallel_20x10x05M-16 3020.10 17454.73 5.78x
BenchmarkParallel_8x8x1M-16 2124.45 11521.34 5.42x
BenchmarkParallel_8x8x8M-16 1416.95 11228.85 7.92x
BenchmarkParallel_8x8x32M-16 1376.28 11017.04 8.00x
```
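The per-round sizing amounts to dividing the target cache size by the number of shards touched in one round. A sketch under stated assumptions; `perRound`, the cache size, and the 64-byte alignment are illustrative, not the library's actual code:

```go
package main

import "fmt"

// perRound returns how many bytes of each shard to process per round so
// that one round's working set (all data and parity shards together)
// stays inside the target cache.
func perRound(dataShards, parityShards, cacheSize int) int {
	shards := dataShards + parityShards
	per := cacheSize / shards
	// Round down to a 64-byte multiple so SIMD loads stay aligned.
	per = (per / 64) * 64
	if per < 64 {
		per = 64
	}
	return per
}

func main() {
	// e.g. 10+4 shards targeting a 256 KiB cache slice
	fmt.Println(perRound(10, 4, 256<<10)) // prints 18688
}
```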
2020-05-03 19:37:22 +02:00
Klaus Post
c3634dce94
Use CPU cache to set minSplitSize ( #117 )
...
Use L1 cache size to set default split size.
2020-04-22 16:12:18 +02:00
Andreas Auernhammer
1f1369aa84
limit capacity of shards to shard size ( #109 )
...
This commit limits the capacity (in addition
to the length) of each shard to the shard size.
Before this change, the following code behaved in
an unexpected way:
```
shards := encoder.Split(buffer)
// ...
shards[0] = shards[0][:cap(shards[0])]
```
Instead of restoring the length of `shards[0]` to
the shard size, it assigns the entire memory of `buffer`
to `shards[0]`.
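The fix relies on Go's full slice expression `s[low:high:max]`, which caps capacity as well as length, so the reslice above can no longer reach past the shard into the rest of `buffer`:

```go
package main

import "fmt"

func main() {
	buffer := make([]byte, 100)
	shardSize := 25

	// The full slice expression limits capacity to shardSize, so a
	// later reslice to cap() stays within this shard.
	shard := buffer[0:shardSize:shardSize]
	fmt.Println(len(shard), cap(shard)) // prints 25 25

	grown := shard[:cap(shard)]
	fmt.Println(len(grown)) // prints 25, not 100
}
```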
2019-09-27 16:30:26 -07:00
dssysolyatin
7890684129
Improve quick check for case when dataOnly is true ( #105 )
2019-06-25 16:30:44 +02:00
dssysolyatin
ec2eb9fb8c
Split: Reduce memory allocation ( #103 )
...
* [Split] Reduce memory allocation in Split function
2019-06-25 16:28:24 +02:00
Klaus Post
a9588190c0
Optimize pure Go version. ( #96 )
...
* Optimize pure Go version.
* Update docs. Add Go 1.12 CI
* Avoid dst bounds check when using noasm ~ 40-50% faster.
* Convert multiply table to a slice whenever used.
* Split on 32 byte boundaries instead of 16 byte.
2019-03-08 10:49:27 +01:00
Klaus Post
09979cdf93
Start documentation with method name.
...
Replaces #92
2019-02-15 15:31:43 +01:00
Frank Wessels
79aee05119
AVX512 accelerated version resulting in a 4x speed improvement over AVX2 ( #91 )
...
The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis of up to 4x compared to AVX2 as can be seen in the following table:
```
$ benchcmp avx2.txt avx512.txt
benchmark AVX2 MB/s AVX512 MB/s speedup
BenchmarkEncode8x8x1M-72 1681.35 4125.64 2.45x
BenchmarkEncode8x4x8M-72 1529.36 5507.97 3.60x
BenchmarkEncode8x8x8M-72 791.16 2952.29 3.73x
BenchmarkEncode8x8x32M-72 573.26 2168.61 3.78x
BenchmarkEncode12x4x12M-72 1234.41 4912.37 3.98x
BenchmarkEncode16x4x16M-72 1189.59 5138.01 4.32x
BenchmarkEncode24x8x24M-72 690.68 2583.70 3.74x
BenchmarkEncode24x8x48M-72 674.20 2643.31 3.92x
```
2019-02-10 11:17:23 +01:00
Klaus Post
278ba25f43
Pre-slice input.
2018-11-16 00:23:56 +01:00
Klaus Post
f5e73dcfe2
Split blocks into size divisible by 16
...
Older systems (typically without AVX2) are more sensitive to misaligned loads and stores.
Add parameter to automatically set the number of goroutines.
name old time/op new time/op delta
Encode10x2x10000-8 18.4µs ± 1% 16.1µs ± 1% -12.43% (p=0.000 n=9+9)
Encode100x20x10000-8 692µs ± 1% 608µs ± 1% -12.10% (p=0.000 n=10+10)
Encode17x3x1M-8 1.78ms ± 5% 1.49ms ± 1% -16.63% (p=0.000 n=10+10)
Encode10x4x16M-8 21.5ms ± 5% 19.6ms ± 4% -8.74% (p=0.000 n=10+9)
Encode5x2x1M-8 343µs ± 2% 267µs ± 2% -22.22% (p=0.000 n=9+10)
Encode10x2x1M-8 858µs ± 5% 701µs ± 5% -18.34% (p=0.000 n=10+10)
Encode10x4x1M-8 1.34ms ± 1% 1.16ms ± 1% -13.19% (p=0.000 n=9+9)
Encode50x20x1M-8 30.3ms ± 4% 25.0ms ± 2% -17.51% (p=0.000 n=10+8)
Encode17x3x16M-8 26.9ms ± 1% 24.5ms ± 4% -9.13% (p=0.000 n=8+10)
name old speed new speed delta
Encode10x2x10000-8 5.45GB/s ± 1% 6.22GB/s ± 1% +14.20% (p=0.000 n=9+9)
Encode100x20x10000-8 1.44GB/s ± 1% 1.64GB/s ± 1% +13.77% (p=0.000 n=10+10)
Encode17x3x1M-8 10.0GB/s ± 5% 12.0GB/s ± 1% +19.88% (p=0.000 n=10+10)
Encode10x4x16M-8 7.81GB/s ± 5% 8.56GB/s ± 5% +9.58% (p=0.000 n=10+9)
Encode5x2x1M-8 15.3GB/s ± 2% 19.6GB/s ± 2% +28.57% (p=0.000 n=9+10)
Encode10x2x1M-8 12.2GB/s ± 5% 15.0GB/s ± 5% +22.45% (p=0.000 n=10+10)
Encode10x4x1M-8 7.84GB/s ± 1% 9.03GB/s ± 1% +15.19% (p=0.000 n=9+9)
Encode50x20x1M-8 1.73GB/s ± 4% 2.09GB/s ± 4% +20.59% (p=0.000 n=10+9)
Encode17x3x16M-8 10.6GB/s ± 1% 11.7GB/s ± 4% +10.12% (p=0.000 n=8+10)
2017-11-18 22:00:55 +01:00
Klaus Post
61c22eab55
Cauchy Matrix option ( #70 )
...
* Experimental Cauchy Matrix
Experimental support for Cauchy-style matrices:
http://web.eecs.utk.edu/~plank/plank/papers/CS-05-569.pdf
All matrices appear to be invertible.
* Remove Go 1.5 and 1.6 from CI tests.
* Fix comment.
* Increase max number of goroutines+docs.
2017-10-01 14:02:11 +02:00
David Reiss
ddcafc661e
Allow reconstructing into pre-allocated memory. ( #66 )
...
This changes the interface of Reconstruct and ReconstructData to accept
slices of zero length but sufficient capacity for shards to reconstruct,
and reslices them instead of allocating new memory.
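Callers signal "reconstruct into my memory" by passing a zero-length slice with enough capacity, which the encoder can then reslice to full length instead of allocating a fresh buffer:

```go
package main

import "fmt"

func main() {
	const shardSize = 32

	// A zero-length slice with sufficient capacity marks a shard as
	// missing while still providing memory to reconstruct into.
	missing := make([]byte, 0, shardSize)
	fmt.Println(len(missing), cap(missing)) // prints 0 32

	// What Reconstruct can then do internally: reslice, not allocate.
	missing = missing[:shardSize]
	fmt.Println(len(missing)) // prints 32
}
```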
2017-09-20 21:08:24 +02:00
chenzhongtao
d78bf472d8
add Update parity function ( #60 )
...
Add Update parity function
2017-08-20 11:42:39 +02:00
Andreas Auernhammer
48a4fd05f1
fix unnecessary memory alloc in Split ( #59 )
...
Split divides the data into `DataShards` blocks and allocates all parity blocks.
This change adds a check whether the capacity of the data is large enough to hold
all data and parity blocks, and only allocates parity blocks when necessary.
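The check can be sketched like this; `splitBuffers` and its layout are illustrative assumptions, not the library's actual Split:

```go
package main

import "fmt"

// splitBuffers returns shard backing slices for a Split-style operation.
// If the input's capacity already covers data+parity shards, it reslices
// into the existing allocation; otherwise it allocates a new buffer.
func splitBuffers(data []byte, dataShards, parityShards, shardSize int) [][]byte {
	needed := (dataShards + parityShards) * shardSize
	var backing []byte
	if cap(data) >= needed {
		backing = data[:needed] // reuse existing memory, no allocation
	} else {
		backing = make([]byte, needed)
		copy(backing, data)
	}
	shards := make([][]byte, dataShards+parityShards)
	for i := range shards {
		// Capacity-limited so each shard cannot reach past its end.
		shards[i] = backing[i*shardSize : (i+1)*shardSize : (i+1)*shardSize]
	}
	return shards
}

func main() {
	data := make([]byte, 40, 60) // capacity covers 4+2 shards of 10 bytes
	shards := splitBuffers(data, 4, 2, 10)
	fmt.Println(len(shards), &shards[0][0] == &data[0]) // prints 6 true
}
```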
2017-07-22 16:16:58 +02:00
Frank Wessels
0de37d7697
Add ReconstructData interface method ( #57 )
...
* Add ReconstructData interface method to allow reconstruction of any missing data shards
* Add support for reconstructing only the data shards to StreamEncoder.Reconstruct()
2017-07-20 12:15:46 +02:00
Klaus Post
0dd0a0e50c
Fix error grammar
...
Fixes #56
2017-07-16 17:00:58 +02:00
Fred Akalin
18d548df63
Add support for PAR1 ( #55 )
...
PAR1 is a file format which uses a Reed-Solomon code similar
to the current one, except it uses a different (flawed) coding
matrix.
Add support for it via a WithPAR1Matrix option, so that this code
can be used to encode/decode PAR1 files. Also add the option to
existing tests, and add a test demonstrating the flaw in PAR1's
coding matrix.
Also fix a mistakenly inverted test in testOpts().
Incidentally, PAR1 is obsoleted by PAR2, which uses GF(2^16)
and tries to fix the flaw in the coding matrix; however, PAR2's
coding matrix is still flawed! The real solution is to build the
coding matrix like in this repository.
PAR1 spec:
http://parchive.sourceforge.net/docs/specifications/parity-volume-spec-1.0/article-spec.html
Paper describing the (flawed) Reed-Solomon code used by PAR1:
http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.html
2017-06-20 20:24:57 +02:00
Fred Akalin
87c4e5ae75
Allow 256 total shards ( #54 )
...
* Allow 256 total shards
2017-06-19 11:26:52 +02:00
Klaus Post
5abf0ee302
Add options ( #46 )
...
* Add options
Make constants changeable as options.
The API remains backwards compatible.
* Update documentation.
* Fix line endings
* fmt
* fmt
* Use functions for parameters.
Much neater.
2017-02-19 11:13:22 +01:00
Peter C
c54154da9e
Add Inverse Matrix caching in a Thread-Safe Lookup Tree ( #36 )
...
* Add matrix inversion caching
* Benchmark and Parallel Benchmark tests for Reconstruct
2016-09-12 21:31:07 +02:00
Christian Muehlhaeuser
b1c8b4b073
Make Join return an error if a reconstruction is required first
...
If one or more required data shards are still nil and we can't correctly join
them before a reconstruction, return ErrReconstructRequired.
2016-08-05 19:23:08 +02:00
Harshavardhana
ba30981088
Add checks for data and parity to not exceed 255 shards in total.
...
Fixes #16
2016-06-03 01:31:01 -07:00
Klaus Post
4fadad8564
Update reedsolomon.go
...
Fix comment
2016-05-01 12:00:51 +02:00
Klaus Post
ed06f926b9
Merge pull request #20 from harshavardhana/fix
...
ErrShortData shouldn't be returned for data less than dataShards.
2016-05-01 11:58:47 +02:00
Harshavardhana
df175d2921
ErrShortData shouldn't be returned for data less than dataShards.
...
The reasoning behind this: say we have 10 data blocks
and 10 parity blocks. Rejecting input files of
size < 10 bytes does not seem like the right
approach.
Most erasure subsystems use static data and parity block
counts; erroring out in that case is not correct, since
reedsolomon itself does not impose this limitation (please
correct me here if I am wrong :-)).
So removing the check itself is not a problem, since most
of the data after the split would be padded with zeros,
which is okay and should be left as an application-level
optimization for those who wish to pack small files in this range.
ErrShortData will still be returned if the data
is empty, or, in the streaming case, if the size == 0.
2016-04-29 20:38:45 -07:00
Harshavardhana
0b630aea27
use bytes.Equal rather than bytes.Compare
2016-04-29 14:12:03 -07:00
xiaost
4048a541c8
Optimized encoding & decoding goroutines number
...
hardware: E5-2630 v2 (Intel x86-64 with ssse3)
software: linux, go1.6, GOMAXPROCS=2
Performance (MB/s) before and after the change:
BenchmarkEncode10x2x10000-2 2884.95 MB/s 2837.93 MB/s 0.98x
BenchmarkEncode100x20x10000-2 593.93 MB/s 577.17 MB/s 0.97x
BenchmarkEncode17x3x1M-2 2903.74 MB/s 5197.99 MB/s 1.80x
BenchmarkEncode10x4x16M-2 1992.13 MB/s 3689.69 MB/s 1.85x
BenchmarkEncode5x2x1M-2 2883.78 MB/s 7506.19 MB/s 2.60x
BenchmarkEncode10x2x1M-2 3205.63 MB/s 7848.12 MB/s 2.45x
BenchmarkEncode10x4x1M-2 2218.35 MB/s 3998.35 MB/s 1.80x
BenchmarkEncode50x20x1M-2 579.24 MB/s 641.08 MB/s 1.11x
BenchmarkEncode17x3x16M-2 2652.36 MB/s 4775.41 MB/s 1.80x
BenchmarkVerify10x2x10000-2 1327.27 MB/s 1837.41 MB/s 1.38x
BenchmarkVerify50x5x50000-2 1481.89 MB/s 2684.57 MB/s 1.81x
BenchmarkVerify10x2x1M-2 1553.91 MB/s 5704.71 MB/s 3.67x
BenchmarkVerify5x2x1M-2 939.90 MB/s 4949.30 MB/s 5.26x
BenchmarkVerify10x4x1M-2 956.89 MB/s 3191.01 MB/s 3.33x
BenchmarkVerify50x20x1M-2 490.49 MB/s 823.46 MB/s 1.68x
BenchmarkVerify10x4x16M-2 1078.03 MB/s 3196.97 MB/s 2.97x
BenchmarkStreamEncode10x2x10000-2 2.40 MB/s 12.10 MB/s 5.04x
BenchmarkStreamEncode100x20x10000-2 6.72 MB/s 10.72 MB/s 1.60x
BenchmarkStreamEncode17x3x1M-2 390.75 MB/s 845.08 MB/s 2.16x
BenchmarkStreamEncode10x4x16M-2 1175.93 MB/s 1803.71 MB/s 1.53x
BenchmarkStreamEncode5x2x1M-2 207.85 MB/s 790.02 MB/s 3.80x
BenchmarkStreamEncode10x2x1M-2 296.77 MB/s 872.41 MB/s 2.94x
BenchmarkStreamEncode10x4x1M-2 264.43 MB/s 699.25 MB/s 2.64x
BenchmarkStreamEncode50x20x1M-2 284.93 MB/s 414.65 MB/s 1.46x
BenchmarkStreamEncode17x3x16M-2 1439.13 MB/s 1933.42 MB/s 1.34x
BenchmarkStreamVerify10x2x10000-2 2.33 MB/s 12.07 MB/s 5.18x
BenchmarkStreamVerify50x5x50000-2 86.53 MB/s 136.02 MB/s 1.57x
BenchmarkStreamVerify10x2x1M-2 315.65 MB/s 909.44 MB/s 2.88x
BenchmarkStreamVerify5x2x1M-2 180.45 MB/s 772.42 MB/s 4.28x
BenchmarkStreamVerify10x4x1M-2 310.35 MB/s 779.26 MB/s 2.51x
BenchmarkStreamVerify50x20x1M-2 547.23 MB/s 773.74 MB/s 1.41x
BenchmarkStreamVerify10x4x16M-2 4128.01 MB/s 6606.43 MB/s 1.60x
2016-04-12 15:41:22 +08:00
klauspost
180472d98f
Make documentation conform to go vet.
2015-11-03 12:09:36 +01:00
lukechampine
295bf27a3d
fix Split panic
2015-08-08 16:38:55 -04:00
lukechampine
0bd572bc5b
tweak Split/Join functions
2015-08-08 13:51:12 -04:00
lukechampine
64b705bbf6
fully test Reconstruct function
...
Well, I can't figure out how to trigger the Invert error.
It may not be possible; need more domain knowledge to be sure.
2015-08-08 13:50:18 -04:00
lukechampine
cf985d4451
remove unreachable checkShards case
...
this case would be caught by shardSize anyway
2015-08-08 13:50:18 -04:00
lukechampine
5784cfa7ff
remove impossible errors
2015-08-06 22:46:27 -04:00
klauspost
8ebf356efb
The number of data shards must be below 257. Check that and update documentation.
2015-06-23 13:39:57 +02:00
klauspost
5c2ef3ae72
Always check/return errors.
2015-06-23 12:16:26 +02:00
klauspost
7381e0b7b5
- Only run multiple goroutines if size is bigger than splitsize.
...
- Update docs
2015-06-23 11:18:29 +02:00
klauspost
83703c37ac
Add package documentation and clarify interface docs.
2015-06-22 15:12:05 +02:00
Klaus Post
5aa37c3492
Add AMD64 SSE3 Galois multiplication. Approximately 5-10x faster.
...
benchmark old MB/s new MB/s speedup
BenchmarkEncode10x2x10000 333.31 5827.17 17.48x
BenchmarkEncode10x2x10000-2 431.20 2802.53 6.50x
BenchmarkEncode10x2x10000-4 553.98 2432.95 4.39x
BenchmarkEncode10x2x10000-8 585.79 3469.61 5.92x
BenchmarkEncode100x20x10000 32.59 583.40 17.90x
BenchmarkEncode100x20x10000-2 59.52 726.70 12.21x
BenchmarkEncode100x20x10000-4 108.04 1363.25 12.62x
BenchmarkEncode100x20x10000-8 113.76 1274.62 11.20x
BenchmarkEncode17x3x1M 215.28 3141.85 14.59x
BenchmarkEncode17x3x1M-2 398.76 3650.12 9.15x
BenchmarkEncode17x3x1M-4 655.32 6071.11 9.26x
BenchmarkEncode17x3x1M-8 832.16 6616.47 7.95x
BenchmarkEncode10x4x16M 154.48 1357.30 8.79x
BenchmarkEncode10x4x16M-2 295.62 2377.92 8.04x
BenchmarkEncode10x4x16M-4 529.89 3519.49 6.64x
BenchmarkEncode10x4x16M-8 632.11 4521.90 7.15x
BenchmarkEncode5x2x1M 327.87 4879.09 14.88x
BenchmarkEncode5x2x1M-2 576.11 2599.20 4.51x
BenchmarkEncode5x2x1M-4 1043.65 3559.12 3.41x
BenchmarkEncode5x2x1M-8 1227.77 4255.34 3.47x
BenchmarkEncode10x2x1M 321.24 4574.68 14.24x
BenchmarkEncode10x2x1M-2 587.73 3100.28 5.28x
BenchmarkEncode10x2x1M-4 1101.96 4770.32 4.33x
BenchmarkEncode10x2x1M-8 1217.08 5812.17 4.78x
BenchmarkEncode10x4x1M 155.34 2037.27 13.11x
BenchmarkEncode10x4x1M-2 298.38 2470.97 8.28x
BenchmarkEncode10x4x1M-4 548.67 3603.15 6.57x
BenchmarkEncode10x4x1M-8 625.23 4827.42 7.72x
BenchmarkEncode50x20x1M 31.37 347.65 11.08x
BenchmarkEncode50x20x1M-2 59.81 713.28 11.93x
BenchmarkEncode50x20x1M-4 105.34 1175.47 11.16x
BenchmarkEncode50x20x1M-8 123.84 1491.91 12.05x
BenchmarkEncode17x3x16M 209.55 1861.59 8.88x
BenchmarkEncode17x3x16M-2 394.19 3331.73 8.45x
BenchmarkEncode17x3x16M-4 643.30 4942.74 7.68x
BenchmarkEncode17x3x16M-8 839.64 6213.43 7.40x
2015-06-21 21:23:22 +02:00
Klaus Post
17e9fa30f0
Add Join function for joining data shards.
2015-06-21 13:25:12 +02:00
Klaus Post
437e364842
Adjust splitsize:
...
benchmark old ns/op new ns/op delta
BenchmarkEncode10x2x10000-2 243613 229413 -5.83%
BenchmarkEncode100x20x10000-2 23041318 19311104 -16.19%
BenchmarkEncode17x3x1M-2 54469780 49602836 -8.94%
BenchmarkEncode10x4x16M-2 674538600 647037000 -4.08%
Bigger sizes (1024) yield less speedup.
2015-06-20 20:32:52 +02:00
Klaus Post
9f6744582c
Also refactor Verify as well as multithreaded options.
...
benchmark old MB/s new MB/s speedup
BenchmarkEncode10x2x10000 182.29 308.59 1.69x
BenchmarkEncode100x20x10000 14.41 30.29 2.10x
BenchmarkEncode17x3x1M 38.52 196.43 5.10x
BenchmarkEncode10x4x16M 23.78 148.58 6.25x
2015-06-20 20:00:25 +02:00
Klaus Post
50a83296f4
Restructure to make one of the galois multiplication parts constant for the main loop.
2015-06-20 18:46:06 +02:00
Klaus Post
921adcb5d5
Use range to avoid one bounds check per galMultiply:
...
benchmark old MB/s new MB/s speedup
BenchmarkVerify10x2x10000-2 235.24 253.36 1.08x
BenchmarkVerify50x5x50000-2 76.78 94.87 1.24x
BenchmarkVerify10x2x1M-2 180.90 209.73 1.16x
BenchmarkVerify5x2x1M-2 173.22 202.89 1.17x
BenchmarkVerify10x4x1M-2 71.51 118.20 1.65x
BenchmarkVerify50x20x1M-2 11.27 12.84 1.14x
BenchmarkVerify10x4x16M-2 44.74 50.07 1.12x
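The pattern is to range over the input slice and pre-slice the output, so the compiler can prove accesses in-bounds instead of checking every iteration; `galMulSliceRange` below is an illustrative stand-in for the library's inner loop, not its actual code:

```go
package main

import "fmt"

// galMulSliceRange applies a 256-entry multiply-table row mt to in,
// writing into out. Ranging over in removes the per-byte bounds check
// on in; slicing out to len(in) up front replaces the per-iteration
// check on out[i] with a single one; and indexing mt with a byte can
// never be out of bounds for a [256]byte array.
func galMulSliceRange(mt *[256]byte, in, out []byte) {
	out = out[:len(in)] // one bounds check here instead of one per byte
	for i, b := range in {
		out[i] = mt[b]
	}
}

func main() {
	var mt [256]byte
	for i := range mt {
		mt[i] = byte(i) // identity table, for illustration only
	}
	in := []byte{9, 8, 7}
	out := make([]byte, 3)
	galMulSliceRange(&mt, in, out)
	fmt.Println(out) // prints [9 8 7]
}
```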
2015-06-20 14:51:11 +02:00
Klaus Post
419c6cc9e9
Add Splitter to help split data into shards.
2015-06-20 11:27:03 +02:00
Klaus Post
c5de03551c
Minor adjustments for golint.
2015-06-20 10:11:33 +02:00