reedsolomon-go

Commit Graph

Author	SHA1	Message	Date
Shawn Zivontsis	0e7f9a6a6f	Allow zero parity shards (#161)	2021-03-08 16:13:24 +01:00
Klaus Post	ab26eb4126	Add WithInversionCache and use pointer methods (#160) There appear to be writes to value receivers. Add `WithInversionCache(bool)` to disable cache. Fixes #159	2021-01-13 10:21:28 +01:00
Klaus Post	7c8682430c	tests: Set full data size as number of bytes (#157) * Clean up deps. * tests: Set full data size as number of bytes Use total data size (data+parity) as benchmark sizes for more consistent benchmarks.	2020-12-18 09:09:17 +01:00
Klaus Post	519603f6e1	Update packages (#154) * Update packages Update cpuid and clean up generated.	2020-12-09 22:56:01 +01:00
Klaus Post	653e76aa26	Faster AVX2 encoding (#153) * Remove 50% of bounds checks when copying. * Use RIP only addressing, free one register.	2020-11-10 14:39:23 +01:00
```
benchmark  old MB/s  new MB/s  speedup
BenchmarkGalois128K-32  57663.49  58005.87  1.01x
BenchmarkGalois1M-32  49479.31  49848.29  1.01x
BenchmarkGaloisXor128K-32  46310.69  46501.88  1.00x
BenchmarkGaloisXor1M-32  43804.86  43984.39  1.00x
BenchmarkEncode10x2x10000-32  25926.93  27457.75  1.06x
BenchmarkEncode100x20x10000-32  2635.82  2818.95  1.07x
BenchmarkEncode17x3x1M-32  63215.11  61576.76  0.97x
BenchmarkEncode10x4x16M-32  19551.54  19505.07  1.00x
BenchmarkEncode5x2x1M-32  79612.06  81985.14  1.03x
BenchmarkEncode10x2x1M-32  121478.29  127739.41  1.05x
BenchmarkEncode10x4x1M-32  70757.61  74423.67  1.05x
BenchmarkEncode50x20x1M-32  19811.96  20103.32  1.01x
BenchmarkEncode17x3x16M-32  27202.10  27825.34  1.02x
BenchmarkEncode_8x4x8M-32  19029.04  19701.31  1.04x
BenchmarkEncode_12x4x12M-32  22449.87  22480.51  1.00x
BenchmarkEncode_16x4x16M-32  24536.74  24672.24  1.01x
BenchmarkEncode_16x4x32M-32  24381.34  24981.99  1.02x
BenchmarkEncode_16x4x64M-32  24717.69  25086.94  1.01x
BenchmarkEncode_8x5x8M-32  16763.51  17154.04  1.02x
BenchmarkEncode_8x6x8M-32  15067.22  15205.87  1.01x
BenchmarkEncode_8x7x8M-32  13156.38  13589.40  1.03x
BenchmarkEncode_8x9x8M-32  11363.74  11523.70  1.01x
BenchmarkEncode_8x10x8M-32  10359.37  10474.91  1.01x
BenchmarkEncode_8x11x8M-32  9627.07  9463.24  0.98x
BenchmarkEncode_8x8x05M-32  30104.80  32634.89  1.08x
BenchmarkEncode_8x8x1M-32  36497.28  36425.88  1.00x
BenchmarkEncode_8x8x8M-32  12186.19  11602.41  0.95x
BenchmarkEncode_8x8x32M-32  11670.72  11413.71  0.98x
BenchmarkEncode_24x8x24M-32  21709.83  21652.50  1.00x
BenchmarkEncode_24x8x48M-32  22494.40  22280.59  0.99x
BenchmarkVerify10x2x10000-32  10567.56  10483.91  0.99x
BenchmarkVerify50x5x50000-32  28102.84  27923.63  0.99x
BenchmarkVerify10x2x1M-32  30298.33  30106.18  0.99x
BenchmarkVerify5x2x1M-32  16115.91  15847.03  0.98x
BenchmarkVerify10x4x1M-32  15382.13  14852.68  0.97x
BenchmarkVerify50x20x1M-32  8476.02  8466.24  1.00x
BenchmarkVerify10x4x16M-32  15101.03  15434.71  1.02x
BenchmarkReconstruct10x2x10000-32  26228.18  26960.19  1.03x
BenchmarkReconstruct50x5x50000-32  31091.42  30975.82  1.00x
BenchmarkReconstruct10x2x1M-32  58548.87  60281.92  1.03x
BenchmarkReconstruct5x2x1M-32  39499.23  41791.80  1.06x
BenchmarkReconstruct10x4x1M-32  41448.60  43053.15  1.04x
BenchmarkReconstruct50x20x1M-32  17185.99  17354.67  1.01x
BenchmarkReconstruct10x4x16M-32  18798.60  18847.43  1.00x
BenchmarkReconstructData10x2x10000-32  27208.48  27538.38  1.01x
BenchmarkReconstructData50x5x50000-32  32135.65  32078.91  1.00x
BenchmarkReconstructData10x2x1M-32  63180.19  67332.17  1.07x
BenchmarkReconstructData5x2x1M-32  47532.85  49932.17  1.05x
BenchmarkReconstructData10x4x1M-32  50059.14  52323.15  1.05x
BenchmarkReconstructData50x20x1M-32  26679.75  26714.11  1.00x
BenchmarkReconstructData10x4x16M-32  24854.99  24527.23  0.99x
BenchmarkReconstructP10x2x10000-32  115089.87  113229.75  0.98x
BenchmarkReconstructP10x5x20000-32  129838.75  132871.10  1.02x
BenchmarkParallel_8x8x64K-32  69951.43  69980.44  1.00x
BenchmarkParallel_8x8x05M-32  11752.94  11724.35  1.00x
BenchmarkParallel_20x10x05M-32  18553.93  18613.33  1.00x
BenchmarkParallel_8x8x1M-32  11639.19  11746.86  1.01x
BenchmarkParallel_8x8x8M-32  11799.36  11685.63  0.99x
BenchmarkParallel_8x8x32M-32  11510.94  11791.72  1.02x
BenchmarkParallel_8x3x1M-32  20268.92  20678.21  1.02x
BenchmarkParallel_8x4x1M-32  17616.05  17856.17  1.01x
BenchmarkParallel_8x5x1M-32  15590.87  15872.42  1.02x
BenchmarkStreamEncode10x2x10000-32  14917.08  15408.39  1.03x
BenchmarkStreamEncode100x20x10000-32  2014.81  2077.31  1.03x
BenchmarkStreamEncode17x3x1M-32  11839.37  12434.80  1.05x
BenchmarkStreamEncode10x4x16M-32  9151.14  9206.98  1.01x
BenchmarkStreamEncode5x2x1M-32  13598.55  13663.56  1.00x
BenchmarkStreamEncode10x2x1M-32  13192.91  13453.41  1.02x
BenchmarkStreamEncode10x4x1M-32  12109.90  12050.68  1.00x
BenchmarkStreamEncode50x20x1M-32  8640.73  8370.10  0.97x
BenchmarkStreamEncode17x3x16M-32  10473.17  10527.04  1.01x
BenchmarkStreamVerify10x2x10000-32  7032.23  7128.82  1.01x
BenchmarkStreamVerify50x5x50000-32  13023.46  13109.31  1.01x
BenchmarkStreamVerify10x2x1M-32  11941.63  11949.91  1.00x
BenchmarkStreamVerify5x2x1M-32  8029.93  8263.39  1.03x
BenchmarkStreamVerify10x4x1M-32  8137.82  8271.11  1.02x
BenchmarkStreamVerify50x20x1M-32  7378.87  7708.81  1.04x
BenchmarkStreamVerify10x4x16M-32  8973.18  8955.29  1.00x
```
Klaus Post	7daa20bf74	Generate AVX2 code (#141) Replaces AVX2 up to 10x8 configurations with specific generated functions. If code size is a concern `-tags=nogen` can be used. Biggest speedup when not memory constrained.	2020-05-20 12:48:34 +02:00
```
benchmark  old MB/s  new MB/s  speedup
BenchmarkEncode_8x5x8M  5895.75  9648.18  1.64x
BenchmarkEncode_8x5x8M-4  16773.41  17220.67  1.03x
BenchmarkEncode_8x5x8M-16  18263.12  17176.28  0.94x
BenchmarkEncode_8x6x8M  5075.89  8548.39  1.68x
BenchmarkEncode_8x6x8M-4  14559.83  15370.95  1.06x
BenchmarkEncode_8x6x8M-16  16183.37  15291.98  0.94x
BenchmarkEncode_8x7x8M  4481.18  7015.60  1.57x
BenchmarkEncode_8x7x8M-4  12835.35  13695.90  1.07x
BenchmarkEncode_8x7x8M-16  14246.94  13737.36  0.96x
BenchmarkEncode_8x8x05M  5569.95  7947.70  1.43x
BenchmarkEncode_8x8x05M-4  17334.91  25271.37  1.46x
BenchmarkEncode_8x8x05M-16  29349.42  35043.36  1.19x
BenchmarkEncode_8x8x1M  4830.58  7891.32  1.63x
BenchmarkEncode_8x8x1M-4  17531.36  27371.42  1.56x
BenchmarkEncode_8x8x1M-16  29593.98  39241.09  1.33x
BenchmarkEncode_8x8x8M  3953.66  6584.26  1.67x
BenchmarkEncode_8x8x8M-4  11527.34  12331.23  1.07x
BenchmarkEncode_8x8x8M-16  12718.89  12173.08  0.96x
BenchmarkEncode_8x8x32M  3927.51  6195.91  1.58x
BenchmarkEncode_8x8x32M-4  11490.85  11424.39  0.99x
BenchmarkEncode_8x8x32M-16  12506.09  11888.55  0.95x

benchmark  old MB/s  new MB/s  speedup
BenchmarkParallel_8x8x64K  5490.24  6959.57  1.27x
BenchmarkParallel_8x8x64K-4  21078.94  29557.51  1.40x
BenchmarkParallel_8x8x64K-16  57508.45  73672.54  1.28x
BenchmarkParallel_8x8x1M  4755.49  7667.84  1.61x
BenchmarkParallel_8x8x1M-4  11818.66  12013.49  1.02x
BenchmarkParallel_8x8x1M-16  12923.12  12109.42  0.94x
BenchmarkParallel_8x8x8M  3973.94  6525.85  1.64x
BenchmarkParallel_8x8x8M-4  11725.68  11312.46  0.96x
BenchmarkParallel_8x8x8M-16  12608.20  11484.98  0.91x
BenchmarkParallel_8x3x1M  14139.71  17993.04  1.27x
BenchmarkParallel_8x3x1M-4  21805.97  23053.92  1.06x
BenchmarkParallel_8x3x1M-16  24673.05  23596.71  0.96x
BenchmarkParallel_8x4x1M  10617.88  14474.54  1.36x
BenchmarkParallel_8x4x1M-4  18635.82  18965.65  1.02x
BenchmarkParallel_8x4x1M-16  21518.12  20171.47  0.94x
BenchmarkParallel_8x5x1M  8669.88  11833.96  1.36x
BenchmarkParallel_8x5x1M-4  16321.00  17500.30  1.07x
BenchmarkParallel_8x5x1M-16  17267.16  17191.04  1.00x
```
Klaus Post	cf8495259a	Add pure XOR for 1 parity (#138) WithFastOneParityMatrix will switch the matrix to a simple xor if there is only one parity shard. The PAR1 matrix already has this property so it has little effect there.	2020-05-13 11:10:58 +02:00
Klaus Post	2df03bd4d1	Ci test more archs (#135) * ci: test more architectures	2020-05-09 10:35:17 +02:00
Klaus Post	696c4018f8	bench: Fix reconstruct benchmarks (#133) Always corrupt at least one shard and don't shuffle shards.	2020-05-06 15:42:49 +02:00
Frank Wessels	1b9e129671	Avx512 parallel81 (#131) * AVX512 routine for 8x1 parallel processing (WIP) * Testing and integration of Parallel81 assembly routine	2020-05-06 12:32:31 +02:00
Klaus Post	cb7a0b5aef	Do fast by one multiplication (#130) When multiplying by one we can use faster math.	2020-05-06 11:14:25 +02:00
Klaus Post	65df535980	Make single goroutine encodes more efficient (#122) Calculate the optimal per round size to keep data in cache when not using WithAutoGoroutines.	2020-05-03 19:37:22 +02:00
```
λ benchcmp before.txt after.txt
benchmark  old ns/op  new ns/op  delta
BenchmarkParallel_8x8x05M-16  675225  321053  -52.45%
BenchmarkParallel_20x10x05M-16  3471988  600740  -82.70%
BenchmarkParallel_8x8x1M-16  3948606  728093  -81.56%
BenchmarkParallel_8x8x8M-16  47361588  5976467  -87.38%
BenchmarkParallel_8x8x32M-16  195044200  24365474  -87.51%

benchmark  old MB/s  new MB/s  speedup
BenchmarkParallel_8x8x05M-16  6211.71  13064.22  2.10x
BenchmarkParallel_20x10x05M-16  3020.10  17454.73  5.78x
BenchmarkParallel_8x8x1M-16  2124.45  11521.34  5.42x
BenchmarkParallel_8x8x8M-16  1416.95  11228.85  7.92x
BenchmarkParallel_8x8x32M-16  1376.28  11017.04  8.00x
```
Klaus Post	d2cfcb8065	Add commandline arg to disable asm for tests. (#116) * Add commandline test args	2020-04-22 15:38:21 +02:00
Klaus Post	0abe9de20c	Update tests (#115) Don't create new slices.	2020-02-21 11:30:44 -08:00
dssysolyatin	ec2eb9fb8c	Split: Reduce memory allocation (#103) * [Split] Reduce memory allocation in Split function	2019-06-25 16:28:24 +02:00
Klaus Post	f5e73dcfe2	Split blocks into size divisible by 16. Older systems (typically without AVX2) are more sensitive to misaligned load+stores. Add parameter to automatically set the number of goroutines.	2017-11-18 22:00:55 +01:00
    name  old time/op  new time/op  delta
    Encode10x2x10000-8  18.4µs ± 1%  16.1µs ± 1%  -12.43%  (p=0.000 n=9+9)
    Encode100x20x10000-8  692µs ± 1%  608µs ± 1%  -12.10%  (p=0.000 n=10+10)
    Encode17x3x1M-8  1.78ms ± 5%  1.49ms ± 1%  -16.63%  (p=0.000 n=10+10)
    Encode10x4x16M-8  21.5ms ± 5%  19.6ms ± 4%  -8.74%  (p=0.000 n=10+9)
    Encode5x2x1M-8  343µs ± 2%  267µs ± 2%  -22.22%  (p=0.000 n=9+10)
    Encode10x2x1M-8  858µs ± 5%  701µs ± 5%  -18.34%  (p=0.000 n=10+10)
    Encode10x4x1M-8  1.34ms ± 1%  1.16ms ± 1%  -13.19%  (p=0.000 n=9+9)
    Encode50x20x1M-8  30.3ms ± 4%  25.0ms ± 2%  -17.51%  (p=0.000 n=10+8)
    Encode17x3x16M-8  26.9ms ± 1%  24.5ms ± 4%  -9.13%  (p=0.000 n=8+10)

    name  old speed  new speed  delta
    Encode10x2x10000-8  5.45GB/s ± 1%  6.22GB/s ± 1%  +14.20%  (p=0.000 n=9+9)
    Encode100x20x10000-8  1.44GB/s ± 1%  1.64GB/s ± 1%  +13.77%  (p=0.000 n=10+10)
    Encode17x3x1M-8  10.0GB/s ± 5%  12.0GB/s ± 1%  +19.88%  (p=0.000 n=10+10)
    Encode10x4x16M-8  7.81GB/s ± 5%  8.56GB/s ± 5%  +9.58%  (p=0.000 n=10+9)
    Encode5x2x1M-8  15.3GB/s ± 2%  19.6GB/s ± 2%  +28.57%  (p=0.000 n=9+10)
    Encode10x2x1M-8  12.2GB/s ± 5%  15.0GB/s ± 5%  +22.45%  (p=0.000 n=10+10)
    Encode10x4x1M-8  7.84GB/s ± 1%  9.03GB/s ± 1%  +15.19%  (p=0.000 n=9+9)
    Encode50x20x1M-8  1.73GB/s ± 4%  2.09GB/s ± 4%  +20.59%  (p=0.000 n=10+9)
    Encode17x3x16M-8  10.6GB/s ± 1%  11.7GB/s ± 4%  +10.12%  (p=0.000 n=8+10)
Klaus Post	61c22eab55	Cauchy Matrix option (#70) * Experimental Cauchy Matrix Experimental support for Cauchy style matrix http://web.eecs.utk.edu/~plank/plank/papers/CS-05-569.pdf All matrices appear reversible. * Remove Go 1.5 and 1.6 from CI tests. * Fix comment. * Increase max number of goroutines+docs.	2017-10-01 14:02:11 +02:00
David Reiss	ddcafc661e	Allow reconstructing into pre-allocated memory. (#66) This changes the interface of Reconstruct and ReconstructData to accept slices of zero length but sufficient capacity for shards to reconstruct, and reslices them instead of allocating new memory.	2017-09-20 21:08:24 +02:00
chenzhongtao	d78bf472d8	add Update parity function (#60) Add Update parity function	2017-08-20 11:42:39 +02:00
Klaus Post	dc6af2dce5	Minor cleanup (#61) * Remove some benchmarks * Format tables a bit. * Doc cleanup	2017-08-13 22:38:27 +02:00
Frank Wessels	0de37d7697	Add ReconstructData interface method (#57) * Add ReconstructData interface method to allow reconstruction of any missing data shards * Add support for just reconstructing data shards only to StreamEncoder.Reconstruct()	2017-07-20 12:15:46 +02:00
Fred Akalin	18d548df63	Add support for PAR1 (#55) PAR1 is a file format which uses a Reed-Solomon code similar to the current one, except it uses a different (flawed) coding matrix. Add support for it via a WithPAR1Matrix option, so that this code can be used to encode/decode PAR1 files. Also add the option to existing tests, and add a test demonstrating the flaw in PAR1's coding matrix. Also fix a mistakenly inverted test in testOpts(). Incidentally, PAR1 is obsoleted by PAR2, which uses GF(2^16) and tries to fix the flaw in the coding matrix; however, PAR2's coding matrix is still flawed! The real solution is to build the coding matrix like in this repository. PAR1 spec: http://parchive.sourceforge.net/docs/specifications/parity-volume-spec-1.0/article-spec.html Paper describing the (flawed) Reed-Solomon code used by PAR1: http://web.eecs.utk.edu/~plank/plank/papers/CS-96-332.html	2017-06-20 20:24:57 +02:00
Fred Akalin	87c4e5ae75	Allow 256 total shards (#54) * Allow 256 total shards	2017-06-19 11:26:52 +02:00
Klaus Post	5abf0ee302	Add options (#46) * Add options Make constants changeable as options. The API remains backwards compatible. * Update documentation. * Fix line endings * fmt * fmt * Use functions for parameters. Much neater.	2017-02-19 11:13:22 +01:00
Peter C	c54154da9e	Add Inverse Matrix caching in a Thread-Safe Lookup Tree (#36) * Add matrix inversion caching * Benchmark and Parallel Benchmark tests for Reconstruct	2016-09-12 21:31:07 +02:00
Christian Muehlhaeuser	b1c8b4b073	Make Join return an error if a reconstruction is required first If one or more required data shards are still nil and we can't correctly join them before a reconstruction, return ErrReconstructRequired.	2016-08-05 19:23:08 +02:00
Harshavardhana	ba30981088	Add checks for data and parity to not exceed 255 shards in total. Fixes #16	2016-06-03 01:31:01 -07:00
xiaost	9f0bea8a29	Tests: backport go1.6 rand.Read for speedup tests	2016-04-07 18:34:47 +08:00
klauspost	976a24f33b	Move examples to separate file/package This makes the reedsolomon package prefix show up in the documentation examples. + StreamEncoder example.	2015-11-03 12:12:42 +01:00
lukechampine	86bd0f239b	seed RNG in TestSplitJoin	2015-08-08 18:20:40 -04:00
lukechampine	458f451fc2	add codeSomeShardsP test	2015-08-08 13:52:00 -04:00
lukechampine	bb7bd0036a	fully test Split/Join functions	2015-08-08 13:51:11 -04:00
lukechampine	64b705bbf6	fully test Reconstruct function Well, I can't figure out how to trigger the Invert error. It may not be possible; need more domain knowledge to be sure.	2015-08-08 13:50:18 -04:00
lukechampine	f81ea8daaf	fully test Verify function	2015-08-08 13:50:18 -04:00
lukechampine	0238782585	fully test Encode function	2015-08-08 13:50:18 -04:00
lukechampine	10fbe96890	use slice literal	2015-08-06 22:56:32 -04:00
lukechampine	640ab74d9d	fully test the New function	2015-08-06 22:47:11 -04:00
klauspost	d31049df42	Add another example that shows that sets can be xor'ed and still remain valid.	2015-06-23 14:35:16 +02:00
klauspost	8ebf356efb	The number of data shards must be below 257. Check that and update documentation.	2015-06-23 13:39:57 +02:00
klauspost	6861078d3b	Add more information to example.	2015-06-22 15:52:10 +02:00
klauspost	0cb21eccc5	Rename example function.	2015-06-22 15:48:52 +02:00
klauspost	7794948a5b	Add split/merge example.	2015-06-22 15:44:22 +02:00
Klaus Post	619e2b7d65	Add benchmark with 17 data shards and 3 parity shards with 16MB each, and correct comments.	2015-06-21 17:07:17 +02:00
Klaus Post	ab50161bb9	Update benchmarks.	2015-06-20 20:51:26 +02:00
Klaus Post	36a0e57744	Begin docs.	2015-06-20 13:10:51 +02:00
Klaus Post	d54843ee41	Add Encoder example (and test)	2015-06-20 11:29:26 +02:00
Klaus Post	c5de03551c	Minor adjustments for golint.	2015-06-20 10:11:33 +02:00
Klaus Post	cf70107291	Add verification test that also tests failure.	2015-06-19 19:20:44 +02:00
Klaus Post	e3aca6cd9d	Shorten the variable names and make an encoder interface, so it isn't possible to create it without calling New.	2015-06-19 18:54:58 +02:00
Klaus Post	67f8d8b8c7	Add another benchmark.	2015-06-19 18:25:48 +02:00


51 Commits (0e7f9a6a6f2503191dce1116fb4a45c3c43a1d9c)