reedsolomon-go/options.go

package reedsolomon

import (
	"runtime"

	"github.com/klauspost/cpuid"
)

// Option allows overriding processing parameters.
type Option func(*options)

type options struct {
	maxGoroutines                         int
	minSplitSize                          int
	useAVX512, useAVX2, useSSSE3, useSSE2 bool
	usePAR1Matrix                         bool
	useCauchy                             bool
	shardSize                             int
}

var defaultOptions = options{
	maxGoroutines: 384,
	minSplitSize:  -1,

	// Detect CPU capabilities.
	useSSSE3:  cpuid.CPU.SSSE3(),
	useSSE2:   cpuid.CPU.SSE2(),
	useAVX2:   cpuid.CPU.AVX2(),
	useAVX512: cpuid.CPU.AVX512F() && cpuid.CPU.AVX512BW(),
}

func init() {
	if runtime.GOMAXPROCS(0) <= 1 {
		defaultOptions.maxGoroutines = 1
	}
}

// WithMaxGoroutines sets the maximum number of goroutines used for encoding & decoding.
// Jobs will be split into this many parts, unless each goroutine would have to process
// less than minSplitSize bytes (set with WithMinSplitSize).
// For the best speed, keep this well above the GOMAXPROCS number for more fine-grained
// scheduling.
// If n <= 0, it is ignored.
func WithMaxGoroutines(n int) Option {
	return func(o *options) {
		if n > 0 {
			o.maxGoroutines = n
		}
	}
}

// WithAutoGoroutines will adjust the number of goroutines for optimal speed with a
// specific shard size.
// Pass the shard size you expect to use. Other shard sizes will work, but may not
// run at the optimal speed.
// Overrides WithMaxGoroutines.
// If shardSize <= 0, it is ignored.
func WithAutoGoroutines(shardSize int) Option {
	return func(o *options) {
		o.shardSize = shardSize
	}
}

// WithMinSplitSize sets the minimum encoding size in bytes per goroutine.
// By default this parameter is determined by CPU cache characteristics.
// See WithMaxGoroutines on how jobs are split.
// If n <= 0, it is ignored.
func WithMinSplitSize(n int) Option {
	return func(o *options) {
		if n > 0 {
			o.minSplitSize = n
		}
	}
}

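// withSSSE3 sets whether SSSE3 code paths may be used.
func withSSSE3(enabled bool) Option {
	return func(o *options) {
		o.useSSSE3 = enabled
	}
}

// withAVX2 sets whether AVX2 code paths may be used.
func withAVX2(enabled bool) Option {
	return func(o *options) {
		o.useAVX2 = enabled
	}
}

// withSSE2 sets whether SSE2 code paths may be used.
func withSSE2(enabled bool) Option {
	return func(o *options) {
		o.useSSE2 = enabled
	}
}

// withAVX512 sets whether AVX512 code paths may be used.
func withAVX512(enabled bool) Option {
	return func(o *options) {
		o.useAVX512 = enabled
	}
}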

// WithPAR1Matrix causes the encoder to build the matrix the way PARv1
// does. Note that the method PARv1 uses is buggy, and may lead to cases
// where recovery is impossible, even if there are enough parity
// shards.
func WithPAR1Matrix() Option {
	return func(o *options) {
		o.usePAR1Matrix = true
		o.useCauchy = false
	}
}

// WithCauchyMatrix will make the encoder build a Cauchy style matrix.
// The output of this is not compatible with the standard output.
// A Cauchy matrix is faster to generate. This does not affect data throughput,
// but will result in slightly faster start-up time.
func WithCauchyMatrix() Option {
	return func(o *options) {
		o.useCauchy = true
		o.usePAR1Matrix = false
	}
}
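
The option functions above follow the functional-options pattern: each returns a closure that mutates a private settings struct. The following is a minimal, self-contained sketch of how such options might be applied. It assumes a hypothetical `config` struct and `newConfig` constructor that stand in for the unexported `options` type and for whatever constructor the package actually uses; only the defaults `384` and `-1` and the option bodies are taken from the file above.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the unexported options struct.
type config struct {
	maxGoroutines int
	minSplitSize  int
	usePAR1Matrix bool
	useCauchy     bool
}

// Option mutates a config, mirroring the Option type in options.go.
type Option func(*config)

// WithMaxGoroutines ignores non-positive values, as in the original.
func WithMaxGoroutines(n int) Option {
	return func(c *config) {
		if n > 0 {
			c.maxGoroutines = n
		}
	}
}

// WithPAR1Matrix selects the PAR1 matrix and clears the Cauchy flag.
func WithPAR1Matrix() Option {
	return func(c *config) {
		c.usePAR1Matrix = true
		c.useCauchy = false
	}
}

// WithCauchyMatrix selects the Cauchy matrix and clears the PAR1 flag.
func WithCauchyMatrix() Option {
	return func(c *config) {
		c.useCauchy = true
		c.usePAR1Matrix = false
	}
}

// newConfig is a hypothetical constructor: it copies the defaults and
// applies the options in the order given, so later options win.
func newConfig(opts ...Option) config {
	c := config{maxGoroutines: 384, minSplitSize: -1}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(
		WithMaxGoroutines(0),  // ignored: n <= 0
		WithMaxGoroutines(16), // applied
		WithPAR1Matrix(),
		WithCauchyMatrix(), // last matrix option wins
	)
	fmt.Println(c.maxGoroutines, c.usePAR1Matrix, c.useCauchy)
}
```

Because each matrix option clears the other's flag, the two are mutually exclusive and the last one passed takes effect.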