Big checkin after I've lost the others. Ha ha.

git-svn-id: svn://mamba.eecs.utk.edu/home/plank/svn/Galois-Library@78 36f187d4-5712-4624-889c-152d48957efa
master
plank 2012-12-08 15:28:43 +00:00
commit 70b6d55aee
62 changed files with 21842 additions and 0 deletions

51
GNUmakefile Normal file

@ -0,0 +1,51 @@
#
# GNUmakefile for Galois field library
#
#
SRCS = gf_w4.c gf_w8.c gf_w16.c gf_w32.c gf_w64.c gf_w128.c gf_wgen.c gf.c gf_unit.c gf_time.c gf_mult.c gf_method.c gf_54.c gf_methods.c gf_div.c gf_rand.c gf_general.c
HDRS = gf.h gf_int.h
EXECUTABLES = gf_mult gf_div gf_unit gf_time gf_54 gf_methods
CFLAGS = -O3 -msse4 -DINTEL_SSE4
# CFLAGS = -g
LDFLAGS = -O3 -msse4
RM = /bin/rm -f
OBJS = $(addsuffix .o, $(basename $(SRCS)))
DEFAULT = $(EXECUTABLES)
default: $(DEFAULT)
all: $(OBJS)
gf_methods: gf_methods.o gf.o gf_method.o gf_wgen.o gf_w4.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o
gf_time: gf_time.o gf.o gf_method.o gf_wgen.o gf_w4.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o gf_rand.o gf_general.o
gf_unit: gf_unit.o gf.o gf_method.o gf_wgen.o gf_w4.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o gf_rand.o gf_general.o
gf_mult: gf_mult.o gf.o gf_wgen.o gf_w4.o gf_method.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o
gf_div: gf_div.o gf.o gf_wgen.o gf_w4.o gf_method.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o
gf_54: gf_54.o gf.o gf_wgen.o gf_w4.o gf_w8.o gf_w16.o gf_w32.o gf_w64.o gf_w128.o
clean:
	$(RM) $(OBJS) gf_div.c
spotless: clean
	$(RM) *~ $(EXECUTABLES)
gf_div.o: gf.h gf_method.h
gf_methods.o: gf.h gf_method.h
gf_time.o: gf.h gf_method.h gf_rand.h gf_general.h
gf_wgen.o: gf_int.h gf.h
gf_w4.o: gf_int.h gf.h
gf_w8.o: gf_int.h gf.h
gf_w16.o: gf_int.h gf.h
gf_w32.o: gf_int.h gf.h
gf_w64.o: gf_int.h gf.h
gf_54.o: gf.h
gf_unit.o: gf.h gf_method.h gf_rand.h gf_general.h
gf_general.o: gf.h gf_int.h gf_general.h gf_rand.h
gf_mult.o: gf.h gf_method.h
gf_method.o: gf.h
gf_div.c: gf_mult.c
	sed 's/multiply/divide/g' gf_mult.c > gf_div.c

BIN
Log-Zero-for-w=8.odg Normal file

Binary file not shown.

1
README Normal file

@ -0,0 +1 @@
This is a README file.

777
explanation.html Normal file

@ -0,0 +1,777 @@
<h3>Code structure as of 7/20/2012</h3>
written by Jim.
<p>
Ok -- once again, I have messed with the structure. My goals are flexibility and efficiency.
It's similar to the stuff before, but better because it makes things like Euclid's
method much cleaner.
<p>
I think we're ready to hack.
<p>
<p>
<hr>
<h3>Files</h3>
<UL>
<LI> <a href=GNUmakefile><b>GNUmakefile</b></a>: Makefile
<LI> <a href=README><b>README</b></a>: Empty readme
<LI> <a href=explanation.html><b>explanation.html</b></a>: This file.
<LI> <a href=gf.c><b>gf.c</b></a>: Main gf routines
<LI> <a href=gf.h><b>gf.h</b></a>: Main gf prototypes and typedefs
<LI> <a href=gf_int.h><b>gf_int.h</b></a>: Prototypes and typedefs for common routines for the
internal gf implementations.
<LI> <a href=gf_method.c><b>gf_method.c</b></a>: Code to help parse argc/argv to define the method.
This way, various programs can be consistent with how they handle the command line.
<LI> <a href=gf_method.h><b>gf_method.h</b></a>: Prototypes for <b>gf_method.c</b>.
<LI> <a href=gf_methods.c><b>gf_methods.c</b></a>: This program prints out how to define
the various methods on the command line. My idea is to beef this up so that you can
give it a method spec on the command line, and it will tell you whether it's valid, or
why it's invalid. I haven't written that part yet.
<LI> <a href=gf_mult.c><b>gf_mult.c</b></a>: Program to do single multiplication.
<LI> <a href=gf_div.c><b>gf_div.c</b></a>: Program to do single divisions -- it's created
in the makefile with a sed script on gf_mult.c.
<LI> <a href=gf_time.c><b>gf_time.c</b></a>: Time tester
<LI> <a href=gf_unit.c><b>gf_unit.c</b></a>: Unit tester
<LI> <a href=gf_54.c><b>gf_54.c</b></a>: A simple example program that multiplies
5 and 4 in GF(2^4).
<LI> <a href=gf_w4.c><b>gf_w4.c</b></a>: Implementation of code for <i>w</i> = 4.
(For now, only SHIFT and LOG, plus EUCLID & MATRIX).
<LI> <a href=gf_w8.c><b>gf_w8.c</b></a>: Implementation of code for <i>w</i> = 8.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w16.c><b>gf_w16.c</b></a>: Implementation of code for <i>w</i> = 16.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w32.c><b>gf_w32.c</b></a>: Implementation of code for <i>w</i> = 32.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w64.c><b>gf_w64.c</b></a>: Implementation of code for <i>w</i> = 64.
(For now, only SHIFT and EUCLID).
<LI> I don't have gf_w128.c or gf_gen.c yet.
</UL>
<hr>
<h3>Prototypes and typedefs in gf.h</h3>
The main structure that users will see is in <b>gf.h</b>, and it is of type
<b>gf_t</b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef struct gf {
gf_func_a_b multiply;
gf_func_a_b divide;
gf_func_a inverse;
gf_region multiply_region;
void *scratch;
} gf_t;
</pre></td></table></center><p>
We can beef it up later with buf-buf or buf-acc. The problem is that the paper is
already bloated, so right now, I want to keep it lean.
<p>
The types of the procedures are big unions, so that they work with the following
types of arguments:
<p><center><table border=3 cellpadding=3><td><pre>
typedef uint8_t gf_val_4_t;
typedef uint8_t gf_val_8_t;
typedef uint16_t gf_val_16_t;
typedef uint32_t gf_val_32_t;
typedef uint64_t gf_val_64_t;
typedef uint64_t *gf_val_128_t;
typedef uint32_t gf_val_gen_t; /* The intent here is for general values <= 32 */
</pre></td></table></center><p>
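To make the "big unions" idea concrete, here is a minimal, self-contained sketch of one function-pointer slot that is viewed through a different union field for each word size. Everything in it (<b>mini_gf_t</b>, <b>mini_func_a_b</b>, <b>mini_init()</b>) is hypothetical and is not the declaration in <b>gf.h</b>. The stand-in operation is field addition (XOR) rather than multiplication, just to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the gf_t idea; none of these names come from gf.h. */
struct mini_gf;

/* One slot, several views: the caller picks the field that matches w. */
typedef union {
    uint8_t  (*w4)(struct mini_gf *gf, uint8_t a, uint8_t b);
    uint16_t (*w16)(struct mini_gf *gf, uint16_t a, uint16_t b);
} mini_func_a_b;

typedef struct mini_gf {
    mini_func_a_b add;
    int w;
} mini_gf_t;

/* Addition in GF(2^w) is XOR; the w=4 version keeps only the low 4 bits. */
static uint8_t mini_w4_add(struct mini_gf *gf, uint8_t a, uint8_t b)
{
    (void) gf;
    return (uint8_t) ((a ^ b) & 0xf);
}

static uint16_t mini_w16_add(struct mini_gf *gf, uint16_t a, uint16_t b)
{
    (void) gf;
    return (uint16_t) (a ^ b);
}

/* Install the function pointer through the union field for this w.
   Returns 1 on success and 0 for an unsupported w. */
int mini_init(mini_gf_t *gf, int w)
{
    assert(gf != NULL);
    gf->w = w;
    switch (w) {
    case 4:  gf->add.w4  = mini_w4_add;  return 1;
    case 16: gf->add.w16 = mini_w16_add; return 1;
    default: return 0;
    }
}
```

With this layout, a caller that initialized with <i>w</i>=4 calls <b>g.add.w4()</b>, and a <i>w</i>=16 caller goes through <b>g.add.w16()</b> on the same struct type.
<p>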
To use one of these, you need to create one with <b>gf_init_easy()</b> or
<b>gf_init_hard()</b>. Let's concentrate on the former:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_init_easy(gf_t *gf, int w, int mult_type);
</pre></td></table></center><p>
You pass it memory for a <b>gf_t</b>, a value of <b>w</b> and
a variable that says how to do multiplication. The valid values of <b>mult_type</b>
are enumerated in <b>gf.h</b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef enum {GF_MULT_DEFAULT,
GF_MULT_SHIFT,
GF_MULT_GROUP,
GF_MULT_BYTWO_p,
GF_MULT_BYTWO_b,
GF_MULT_TABLE,
GF_MULT_LOG_TABLE,
GF_MULT_SPLIT_TABLE,
GF_MULT_COMPOSITE } gf_mult_type_t;
</pre></td></table></center><p>
After creating the <b>gf_t</b>, you use its <b>multiply</b> method
to multiply, using the union's fields to work with the various types.
It looks easier than my explanation. For example, suppose you wanted to multiply 5 and 4 in <i>GF(2<sup>4</sup>)</i>.
You can do it as in
<b><a href=gf_54.c>gf_54.c</a></b>
<p><center><table border=3 cellpadding=3><td><pre>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "gf.h"
int main()
{
gf_t gf;
gf_init_easy(&gf, 4, GF_MULT_DEFAULT);
printf("%d\n", gf.multiply.w4(&gf, 5, 4));
exit(0);
}
</pre></td></table></center><p>
If you wanted to multiply in <i>GF(2<sup>8</sup>)</i>, then you'd have to use 8 as a parameter
to <b>gf_init_easy</b>, and call the multiplier as <b>gf.multiply.w8()</b>.
<p>
When you're done with your <b>gf_t</b>, you should call <b>gf_free()</b> on it so
that it can free memory that it has allocated. We'll talk more about memory later, but if you
create your <b>gf_t</b> with <b>gf_init_easy</b>, then it calls <b>malloc()</b>, and
if you care about freeing memory, you'll have to call <b>gf_free()</b>.
<p>
<hr>
<h3>Memory allocation</h3>
Each implementation of a multiplication technique keeps around its
own data. For example, <b>GF_MULT_TABLE</b> keeps around
multiplication and division tables, and <b>GF_MULT_LOG_TABLE</b> maintains log and
antilog tables. This data is stored in the pointer <b>scratch</b>. My intent
is that the memory that is there is all that's required. In other
words, the <b>multiply()</b>, <b>divide()</b>, <b>inverse()</b> and
<b>multiply_region()</b> calls don't do any memory allocation.
Moreover, <b>gf_init_easy()</b> only allocates one chunk of memory --
the one in <b>scratch</b>.
<p>
If you don't want to have the initialization call allocate memory, you can use <b>gf_init_hard()</b>:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_init_hard(gf_t *gf,
int w,
int mult_type,
int region_type,
int divide_type,
uint64_t prim_poly,
int arg1,
int arg2,
gf_t *base_gf,
void *scratch_memory);
</pre></td></table></center><p>
The first three parameters are the same as <b>gf_init_easy()</b>.
You can add additional arguments for performing <b>multiply_region</b>, and
for performing division in the <b>region_type</b> and <b>divide_type</b>
arguments. Their values are also defined in <b>gf.h</b>. You can
mix the <b>region_type</b> values (e.g. "DOUBLE" and "SSE"):
<p><center><table border=3 cellpadding=3><td><pre>
#define GF_REGION_DEFAULT (0x0)
#define GF_REGION_SINGLE_TABLE (0x1)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_QUAD_TABLE (0x4)
#define GF_REGION_LAZY (0x8)
#define GF_REGION_SSE (0x10)
#define GF_REGION_NOSSE (0x20)
#define GF_REGION_STDMAP (0x40)
#define GF_REGION_ALTMAP (0x80)
#define GF_REGION_CAUCHY (0x100)
typedef uint32_t gf_region_type_t;
typedef enum { GF_DIVIDE_DEFAULT,
GF_DIVIDE_MATRIX,
GF_DIVIDE_EUCLID } gf_division_type_t;
</pre></td></table></center><p>
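Since each region value above is a distinct bit, mixing them is presumably a bitwise OR. The sketch below copies a few of the constants and adds hypothetical helpers (<b>region_has()</b>, <b>region_is_consistent()</b>, <b>region_double_sse()</b>) that are not part of <b>gf.h</b>. Treating SSE together with NOSSE as contradictory is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gf.h excerpt above. */
#define GF_REGION_DEFAULT      (0x0)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_LAZY         (0x8)
#define GF_REGION_SSE          (0x10)
#define GF_REGION_NOSSE        (0x20)

typedef uint32_t gf_region_type_t;

/* Hypothetical helper: nonzero when every bit of a nonzero flag is set in rt. */
int region_has(gf_region_type_t rt, gf_region_type_t flag)
{
    return flag != 0 && (rt & flag) == flag;
}

/* Hypothetical sanity check (an assumption, not library behavior):
   asking for SSE and NOSSE at the same time makes no sense. */
int region_is_consistent(gf_region_type_t rt)
{
    return !(region_has(rt, GF_REGION_SSE) && region_has(rt, GF_REGION_NOSSE));
}

/* Example: the combination written as "DOUBLE,SSE" on the command line. */
gf_region_type_t region_double_sse(void)
{
    gf_region_type_t rt = GF_REGION_DOUBLE_TABLE | GF_REGION_SSE;
    assert(region_has(rt, GF_REGION_DOUBLE_TABLE));
    return rt;
}
```
<p>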
You can change
the primitive polynomial with <b>prim_poly</b>, give additional arguments with
<b>arg1</b> and <b>arg2</b> and give a base Galois Field for composite fields.
Finally, you can pass it a pointer to memory in <b>scratch_memory</b>. That
way, you can avoid having <b>gf_init_hard()</b> call <b>malloc()</b>.
<p>
There is a procedure called <b>gf_scratch_size()</b> that lets you know the minimum
size for <b>scratch_memory</b>, depending on <i>w</i>, the multiplication type
and the arguments:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_scratch_size(int w,
int mult_type,
int region_type,
int divide_type,
int arg1,
int arg2);
</pre></td></table></center><p>
You can specify default arguments in <b>gf_init_hard()</b>:
<UL>
<LI> <b>region_type</b> = <b>GF_REGION_DEFAULT</b>
<LI> <b>divide_type</b> = <b>GF_DIVIDE_DEFAULT</b>
<LI> <b>prim_poly</b> = 0
<LI> <b>arg1</b> = 0
<LI> <b>arg2</b> = 0
<LI> <b>base_gf</b> = <b>NULL</b>
<LI> <b>scratch_memory</b> = <b>NULL</b>
</UL>
If any argument is equal to its default, then default actions are taken (e.g. a
standard primitive polynomial is used, or memory is allocated for <b>scratch_memory</b>).
In fact, <b>gf_init_easy()</b> simply calls <b>gf_init_hard()</b> with the default
parameters.
<p>
<b>gf_free()</b> frees memory that was allocated with <b>gf_init_easy()</b>
or <b>gf_init_hard()</b>. The <b>recursive</b> parameter is in case you
use composite fields, and want to recursively free the base fields.
If you pass <b>scratch_memory</b> to <b>gf_init_hard()</b>, then you typically
don't need to call <b>gf_free()</b>. It won't hurt to call it, though.
<hr>
<h3>gf_mult and gf_div</h3>
For the moment, I have few things completely implemented, but that's because I want
to be able to explain the structure, and how to specify methods. In particular, for
<i>w=4</i>, I have implemented <b>SHIFT</b> and <b>LOG</b>. For <i>w=8, 16, 32, 64</i>
I have implemented <b>SHIFT</b>. For all <i>w &le; 32</i>, I have implemented both
Euclid's algorithm for inversion, and the matrix method for inversion. For
<i>w=64</i>, it's just Euclid. You can
test these all with <b>gf_mult</b> and <b>gf_div</b>. Here are a few calls:
<pre>
UNIX> <font color=darkred><b>gf_mult 7 11 4</b></font> - Default
4
UNIX> <font color=darkred><b>gf_mult 7 11 4 SHIFT - -</b></font> - Use shift
4
UNIX> <font color=darkred><b>gf_mult 7 11 4 LOG - -</b></font> - Use logs
4
UNIX> <font color=darkred><b>gf_div 4 7 4</b></font> - Default
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - -</b></font> - Use logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - EUCLID</b></font> - Use Euclid instead of logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - MATRIX</b></font> - Use Matrix inversion instead of logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - -</b></font> - Default
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - EUCLID</b></font> - Use Euclid (which is the default)
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - MATRIX</b></font> - Use Matrix inversion instead of Euclid
11
UNIX> <font color=darkred><b>gf_mult 200 211 8</b></font> - The remainder are shift/Euclid
201
UNIX> <font color=darkred><b>gf_div 201 211 8</b></font>
200
UNIX> <font color=darkred><b>gf_mult 60000 65111 16</b></font>
63515
UNIX> <font color=darkred><b>gf_div 63515 65111 16</b></font>
60000
UNIX> <font color=darkred><b>gf_mult abcd0001 9afbf788 32h</b></font>
b0359681
UNIX> <font color=darkred><b>gf_div b0359681 9afbf788 32h</b></font>
abcd0001
UNIX> <font color=darkred><b>gf_mult abcd00018c8b8c8a 9afbf7887f6d8e5b 64h</b></font>
3a7def35185bd571
UNIX> <font color=darkred><b>gf_mult abcd00018c8b8c8a 9afbf7887f6d8e5b 64h</b></font>
3a7def35185bd571
UNIX> <font color=darkred><b>gf_div 3a7def35185bd571 9afbf7887f6d8e5b 64h</b></font>
abcd00018c8b8c8a
UNIX> <font color=darkred><b></b></font>
</pre>
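<p>
For readers who want to check the <i>w=4</i> results by hand, here is a standalone sketch of shift-style multiplication and of inversion with the extended Euclidean algorithm in <i>GF(2<sup>4</sup>)</i>. It assumes the primitive polynomial 0x13 (the <i>w=4</i> default in <b>gf_w4_init()</b>). The <b>sketch_</b> names are hypothetical, and the library's own routines may be organized differently.

```c
#include <assert.h>
#include <stdint.h>

#define W4_PRIM_POLY 0x13u   /* x^4 + x + 1 */

/* Shift-and-XOR multiply: XOR together shifted copies of a, then reduce. */
uint8_t sketch_w4_shift_multiply(uint8_t a, uint8_t b)
{
    uint32_t prod = 0;
    int i;

    for (i = 0; i < 4; i++) {
        if (b & (1u << i)) prod ^= ((uint32_t) a << i);
    }
    /* The product has degree at most 6; clear bits 6..4 with the polynomial. */
    for (i = 6; i >= 4; i--) {
        if (prod & (1u << i)) prod ^= (W4_PRIM_POLY << (i - 4));
    }
    return (uint8_t) prod;
}

/* Degree of a polynomial over GF(2) stored as a bit mask; -1 for zero. */
static int poly_degree(uint32_t p)
{
    int d = -1;
    while (p != 0) { p >>= 1; d++; }
    return d;
}

/* Extended Euclid on polynomials over GF(2).
   Invariants: a*g1 = u and a*g2 = v (mod the primitive polynomial). */
uint8_t sketch_w4_euclid_inverse(uint8_t a)
{
    uint32_t u = a, v = W4_PRIM_POLY, g1 = 1, g2 = 0, t;
    int j;

    assert(a != 0 && a < 16);   /* zero has no inverse */
    while (u != 1) {
        j = poly_degree(u) - poly_degree(v);
        if (j < 0) {
            t = u;  u = v;   v = t;
            t = g1; g1 = g2; g2 = t;
            j = -j;
        }
        u  ^= v  << j;   /* cancel the leading term of u */
        g1 ^= g2 << j;   /* keep the invariant in step */
    }
    return (uint8_t) g1;
}

/* Division from inverse: a / b = a * b^-1. */
uint8_t sketch_w4_divide(uint8_t a, uint8_t b)
{
    return sketch_w4_shift_multiply(a, sketch_w4_euclid_inverse(b));
}
```

Under those assumptions, <b>sketch_w4_shift_multiply(7, 11)</b> gives 4 and <b>sketch_w4_divide(4, 7)</b> gives 11, matching the <b>gf_mult</b> and <b>gf_div</b> calls above.
<p>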
You can see all the methods with <b>gf_methods</b>. We have a lot of implementing to do:
<pre>
UNIX> <font color=darkred><b>gf_methods</b></font>
To specify the methods, do one of the following:
- leave empty to use defaults
- use a single dash to use defaults
- specify MULTIPLY REGION DIVIDE
Legal values of MULTIPLY:
SHIFT: shift
GROUP g_mult g_reduce: the Group technique - see the paper
BYTWO_p: BYTWO doubling the product.
BYTWO_b: BYTWO doubling b (more efficient thatn BYTWO_p)
TABLE: Full multiplication table
LOG: Discrete logs
LOG_ZERO: Discrete logs with a large table for zeros
SPLIT g_a g_b: Split tables defined by g_a and g_b
COMPOSITE k l [METHOD]: Composite field, recursively specify the
method of the base field in GF(2^l)
Legal values of REGION: Specify multiples with commas e.g. 'DOUBLE,LAZY'
-: Use defaults
SINGLE/DOUBLE/QUAD: Expand tables
LAZY: Lazily create table (only applies to TABLE and SPLIT)
SSE/NOSSE: Use 128-bit SSE instructions if you can
CAUCHY/ALTMAP/STDMAP: Use different memory mappings
Legal values of DIVIDE:
-: Use defaults
MATRIX: Use matrix inversion
EUCLID: Use the extended Euclidian algorithm.
See the user's manual for more information.
There are many restrictions, so it is better to simply use defaults in most cases.
UNIX> <font color=darkred><b></b></font>
</pre>
<hr>
<h3>gf_unit and gf_time</h3>
<b><a href=gf_unit.c>gf_unit.c</a></b> is a unit tester, and
<b><a href=gf_time.c>gf_time.c</a></b> is a time tester.
They are called as follows:
<p><center><table border=3 cellpadding=3><td><pre>
UNIX> <font color=darkred><b>gf_unit w tests seed [METHOD] </b></font>
UNIX> <font color=darkred><b>gf_time w tests seed size(bytes) iterations [METHOD] </b></font>
</pre></td></table></center><p>
The <b>tests</b> parameter is one or more of the following characters:
<UL>
<LI> A: Do all tests
<LI> S: Test only single operations (multiplication/division)
<LI> R: Test only region operations
<LI> V: Verbose Output
</UL>
<b>seed</b> is a seed for <b>srand48()</b> -- using -1 defaults to the current time.
<p>
For example, testing <b>LOG</b> and <b>SHIFT</b> with w=4:
<pre>
UNIX> <font color=darkred><b>gf_unit 4 AV 1 LOG - -</b></font>
Seed: 1
Testing single multiplications/divisions.
Testing Inversions.
Testing buffer-constant, src != dest, xor = 0
Testing buffer-constant, src != dest, xor = 1
Testing buffer-constant, src == dest, xor = 0
Testing buffer-constant, src == dest, xor = 1
UNIX> <font color=darkred><b>gf_unit 4 AV 1 SHIFT - -</b></font>
Seed: 1
Testing single multiplications/divisions.
Testing Inversions.
No multiply_region.
UNIX> <font color=darkred><b></b></font>
</pre>
There is no <b>multiply_region()</b> method defined for <b>SHIFT</b>.
Thus, the procedures are <b>NULL</b> and the unit tester ignores them.
<p>
At the moment, I only have the unit tester working for w=4.
<p>
<b>gf_time</b> takes the size of an array (in bytes) and a number of iterations, and
tests the speed of both single and region operations. The tests are:
<UL>
<LI> A: All
<LI> S: All Single Operations
<LI> R: All Region Operations
<LI> M: Single: Multiplications
<LI> D: Single: Divisions
<LI> I: Single: Inverses
<LI> B: Region: Multiply_Region
</UL>
Here are some examples with <b>SHIFT</b> and <b>LOG</b> on my mac.
<pre>
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - -</b></font>
Seed: 1
Multiply: 0.538126 s 185.830 Mega-ops/s
Divide: 0.520825 s 192.003 Mega-ops/s
Inverse: 0.631198 s 158.429 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.478395 s 209.032 MB/s
Buffer-Const,s!=d,xor=1: 0.524245 s 190.751 MB/s
Buffer-Const,s==d,xor=0: 0.471851 s 211.931 MB/s
Buffer-Const,s==d,xor=1: 0.528275 s 189.295 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - EUCLID</b></font>
Seed: 1
Multiply: 0.555512 s 180.014 Mega-ops/s
Divide: 5.359434 s 18.659 Mega-ops/s
Inverse: 4.911719 s 20.359 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.496097 s 201.573 MB/s
Buffer-Const,s!=d,xor=1: 0.538536 s 185.689 MB/s
Buffer-Const,s==d,xor=0: 0.485564 s 205.946 MB/s
Buffer-Const,s==d,xor=1: 0.540227 s 185.107 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - MATRIX</b></font>
Seed: 1
Multiply: 0.544005 s 183.822 Mega-ops/s
Divide: 7.602822 s 13.153 Mega-ops/s
Inverse: 7.000564 s 14.285 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.474868 s 210.585 MB/s
Buffer-Const,s!=d,xor=1: 0.527588 s 189.542 MB/s
Buffer-Const,s==d,xor=0: 0.473130 s 211.358 MB/s
Buffer-Const,s==d,xor=1: 0.529877 s 188.723 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 SHIFT - -</b></font>
Seed: 1
Multiply: 2.708842 s 36.916 Mega-ops/s
Divide: 8.756882 s 11.420 Mega-ops/s
Inverse: 5.695511 s 17.558 Mega-ops/s
UNIX> <font color=darkred><b></b></font>
</pre>
At the moment, I only have the timer working for w=4.
<hr>
<h3>Walking you through <b>LOG</b></h3>
To see how <b>scratch</b> is used to store data, let's look at what happens when
you call <b>gf_init_easy(&gf, 4, GF_MULT_LOG_TABLE);</b>.
First, <b>gf_init_easy()</b> calls <b>gf_init_hard()</b> with default parameters.
This is in <b><a href=gf.c>gf.c</a></b>.
<p>
<b>gf_init_hard()</b>'s first job is to set up the scratch.
The scratch's type is <b>gf_internal_t</b>, defined in
<b><a href=gf_int.h>gf_int.h</a></b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef struct {
int mult_type;
int region_type;
int divide_type;
int w;
uint64_t prim_poly;
int free_me;
int arg1;
int arg2;
gf_t *base_gf;
void *private;
} gf_internal_t;
</pre></td></table></center><p>
All the fields are straightforward, with the exception of <b>private</b>. That is
a <b>(void *)</b> which points to the implementation's private data.
<p>
Here's the code for
<b>gf_init_hard()</b>:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_init_hard(gf_t *gf, int w, int mult_type,
int region_type,
int divide_type,
uint64_t prim_poly,
int arg1, int arg2,
gf_t *base_gf,
void *scratch_memory)
{
int sz;
gf_internal_t *h;
if (scratch_memory == NULL) {
sz = gf_scratch_size(w, mult_type, region_type, divide_type, arg1, arg2);
if (sz &lt;= 0) return 0;
h = (gf_internal_t *) malloc(sz);
h-&gt;free_me = 1;
} else {
h = scratch_memory;
h-&gt;free_me = 0;
}
gf-&gt;scratch = (void *) h;
h-&gt;mult_type = mult_type;
h-&gt;region_type = region_type;
h-&gt;divide_type = divide_type;
h-&gt;w = w;
h-&gt;prim_poly = prim_poly;
h-&gt;arg1 = arg1;
h-&gt;arg2 = arg2;
h-&gt;base_gf = base_gf;
h-&gt;private = (void *) gf-&gt;scratch;
h-&gt;private += (sizeof(gf_internal_t));
switch(w) {
case 4: return gf_w4_init(gf);
case 8: return gf_w8_init(gf);
case 16: return gf_w16_init(gf);
case 32: return gf_w32_init(gf);
case 64: return gf_w64_init(gf);
case 128: return gf_dummy_init(gf);
default: return 0;
}
}
</pre></td></table></center><p>
The first thing it does is determine if it has to allocate space for <b>scratch</b>.
If it must, it uses <b>gf_scratch_size()</b> to figure out how big the space must be.
It then sets <b>gf->scratch</b> to this space, and sets all of the fields of the
scratch to the arguments in <b>gf_init_hard()</b>. The <b>private</b> pointer is
set to the space just after the <b>gf_internal_t</b> header at the start of <b>gf->scratch</b>. Again, it is up to
<b>gf_scratch_size()</b> to make sure there is enough space for the scratch, and
for all of the private data needed by the implementation.
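<p>
The header-plus-private-data layout can be sketched on its own as follows. The struct and function names are hypothetical stand-ins for <b>gf_internal_t</b> and the real routines, and the sketch checks the result of <b>malloc()</b>, which the listing above does not. It also uses <b>char *</b> arithmetic, since arithmetic on a <b>void *</b> is a GNU extension rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for gf_internal_t. */
typedef struct {
    int   w;
    int   free_me;        /* 1 if sketch_setup() called malloc() */
    void *private_data;   /* points just past this header */
} sketch_header_t;

/* Hypothetical stand-in for one implementation's private tables. */
typedef struct {
    uint8_t log_tbl[16];
} sketch_tables_t;

/* One chunk must hold the header and the private data behind it. */
size_t sketch_scratch_size(void)
{
    return sizeof(sketch_header_t) + sizeof(sketch_tables_t);
}

/* Returns the header, or NULL if malloc() fails. Caller-provided memory must
   be suitably aligned and at least sketch_scratch_size() bytes long. */
sketch_header_t *sketch_setup(void *scratch_memory)
{
    sketch_header_t *h;

    if (scratch_memory == NULL) {
        h = malloc(sketch_scratch_size());
        if (h == NULL) return NULL;
        h->free_me = 1;
    } else {
        h = scratch_memory;
        h->free_me = 0;
    }
    h->w = 4;
    h->private_data = (char *) h + sizeof(sketch_header_t);
    return h;
}

/* Frees only what sketch_setup() allocated itself. */
void sketch_teardown(sketch_header_t *h)
{
    if (h != NULL && h->free_me) free(h);
}
```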
<p>
Once the scratch is set up, <b>gf_init_hard()</b> calls <b>gf_w4_init()</b>. This is
in <b><a href=gf_w4.c>gf_w4.c</a></b>, and it is a
simple dispatcher to the various initialization routines, plus it
sets <b>EUCLID</b> and <b>MATRIX</b> if need be:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_w4_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf-&gt;scratch;
if (h-&gt;prim_poly == 0) h-&gt;prim_poly = 0x13;
gf-&gt;multiply.w4 = NULL;
gf-&gt;divide.w4 = NULL;
gf-&gt;inverse.w4 = NULL;
gf-&gt;multiply_region.w4 = NULL;
switch(h-&gt;mult_type) {
case GF_MULT_SHIFT: if (gf_w4_shift_init(gf) == 0) return 0; break;
case GF_MULT_LOG_TABLE: if (gf_w4_log_init(gf) == 0) return 0; break;
case GF_MULT_DEFAULT: if (gf_w4_log_init(gf) == 0) return 0; break;
default: return 0;
}
if (h-&gt;divide_type == GF_DIVIDE_EUCLID) {
gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
gf-&gt;inverse.w4 = gf_w4_euclid;
} else if (h-&gt;divide_type == GF_DIVIDE_MATRIX) {
gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
gf-&gt;inverse.w4 = gf_w4_matrix;
}
if (gf-&gt;inverse.w4 != NULL && gf-&gt;divide.w4 == NULL) {
gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
}
if (gf-&gt;inverse.w4 == NULL && gf-&gt;divide.w4 != NULL) {
gf-&gt;inverse.w4 = gf_w4_inverse_from_divide;
}
return 1;
}
</pre></td></table></center><p>
The code in <b>gf_w4_log_init()</b> sets up the log and antilog tables, and sets
the <b>multiply.w4</b>, <b>divide.w4</b> etc routines to be the ones for logs. The
tables are put into <b>gf->scratch->private</b>, which is typecast to a <b>struct
gf_logtable_data *</b>:
<p><center><table border=3 cellpadding=3><td><pre>
struct gf_logtable_data {
gf_val_4_t log_tbl[GF_FIELD_SIZE];
gf_val_4_t antilog_tbl[GF_FIELD_SIZE * 2];
gf_val_4_t *antilog_tbl_div;
};
.......
static
int gf_w4_log_init(gf_t *gf)
{
gf_internal_t *h;
struct gf_logtable_data *ltd;
int i, b;
h = (gf_internal_t *) gf-&gt;scratch;
ltd = h-&gt;private;
ltd-&gt;log_tbl[0] = 0;
ltd-&gt;antilog_tbl_div = ltd-&gt;antilog_tbl + (GF_FIELD_SIZE-1);
b = 1;
for (i = 0; i &lt; GF_FIELD_SIZE-1; i++) {
ltd-&gt;log_tbl[b] = (gf_val_8_t)i;
ltd-&gt;antilog_tbl[i] = (gf_val_8_t)b;
ltd-&gt;antilog_tbl[i+GF_FIELD_SIZE-1] = (gf_val_8_t)b;
b &lt;&lt;= 1;
if (b & GF_FIELD_SIZE) {
b = b ^ h-&gt;prim_poly;
}
}
gf-&gt;inverse.w4 = gf_w4_inverse_from_divide;
gf-&gt;divide.w4 = gf_w4_log_divide;
gf-&gt;multiply.w4 = gf_w4_log_multiply;
gf-&gt;multiply_region.w4 = gf_w4_log_multiply_region;
return 1;
}
</pre></td></table></center><p>
And of course the individual routines use <b>h->private</b> to access the tables:
<p><center><table border=3 cellpadding=3><td><pre>
static
inline
gf_val_8_t gf_w4_log_multiply (gf_t *gf, gf_val_8_t a, gf_val_8_t b)
{
struct gf_logtable_data *ltd;
ltd = (struct gf_logtable_data *) ((gf_internal_t *) (gf-&gt;scratch))-&gt;private;
return (a == 0 || b == 0) ? 0 : ltd-&gt;antilog_tbl[(unsigned)(ltd-&gt;log_tbl[a] + ltd-&gt;log_tbl[b])];
}
</pre></td></table></center><p>
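To try the log technique without building the library, the table construction and lookups above can be reproduced in a standalone sketch. The <b>sk_</b> names are hypothetical, the primitive polynomial 0x13 is assumed, and the division here adds an offset to the index instead of keeping a second pointer like <b>antilog_tbl_div</b>.

```c
#include <assert.h>
#include <stdint.h>

#define SK_FIELD_SIZE 16     /* 2^4 elements */
#define SK_PRIM_POLY  0x13   /* x^4 + x + 1 */

typedef struct {
    uint8_t log_tbl[SK_FIELD_SIZE];
    uint8_t antilog_tbl[SK_FIELD_SIZE * 2];   /* doubled so index sums need no modulo */
} sk_logtable_t;

/* Walk the powers of the generator 2 and record logs and antilogs. */
void sk_log_init(sk_logtable_t *t)
{
    int i, b = 1;

    t->log_tbl[0] = 0;   /* log(0) is undefined; callers special-case zero */
    for (i = 0; i < SK_FIELD_SIZE - 1; i++) {
        t->log_tbl[b] = (uint8_t) i;
        t->antilog_tbl[i] = (uint8_t) b;
        t->antilog_tbl[i + SK_FIELD_SIZE - 1] = (uint8_t) b;
        b <<= 1;
        if (b & SK_FIELD_SIZE) b ^= SK_PRIM_POLY;
    }
    /* The last two entries are never indexed; zero them for tidiness. */
    t->antilog_tbl[2 * SK_FIELD_SIZE - 2] = 0;
    t->antilog_tbl[2 * SK_FIELD_SIZE - 1] = 0;
}

/* a * b = antilog(log a + log b); the sum is at most 28. */
uint8_t sk_log_multiply(const sk_logtable_t *t, uint8_t a, uint8_t b)
{
    if (a == 0 || b == 0) return 0;
    return t->antilog_tbl[t->log_tbl[a] + t->log_tbl[b]];
}

/* a / b = antilog(log a - log b); adding 15 keeps the index in 1..29. */
uint8_t sk_log_divide(const sk_logtable_t *t, uint8_t a, uint8_t b)
{
    assert(b != 0);
    if (a == 0) return 0;
    return t->antilog_tbl[t->log_tbl[a] - t->log_tbl[b] + (SK_FIELD_SIZE - 1)];
}
```

Doubling the antilog table trades 16 extra bytes for lookups that avoid a modulo by 15 on every multiplication.
<p>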
Finally, it's important that the proper sizes are put into
<b>gf_w4_scratch_size()</b> for each implementation:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_w4_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2)
{
int region_tbl_size;
switch(mult_type)
{
case GF_MULT_DEFAULT:
case GF_MULT_LOG_TABLE:
return sizeof(gf_internal_t) + sizeof(struct gf_logtable_data) + 64;
break;
case GF_MULT_SHIFT:
return sizeof(gf_internal_t);
break;
default:
return -1;
}
}
</pre></td></table></center><p>
I hope that's enough explanation for y'all to start implementing. Let me know if you have
problems -- thanks -- Jim
<hr>
The initial structure has been set for w=4, 8, 16, 32 and 64, with implementations of SHIFT and EUCLID, and for w <= 32, MATRIX. There are some weird caveats:
<UL>
<LI> For w=32 and w=64, the primitive polynomial does not have the leading one.
<LI> I'd like for naming to be:
<p>
<UL>
<b>gf_w</b><i>w</i><b>_</b><i>technique</i><b>_</b><i>functionality</i><b>()</b>.
</UL>
<p>
For example, the log techniques for w=4 are:
<pre>
gf_w4_log_multiply()
gf_w4_log_divide()
gf_w4_log_multiply_region()
gf_w4_log_init()
</pre>
<p>
<LI> I'd also like a header block on implementations that says who wrote it.
</UL>
<hr>
<h3>Things we need to Implement: <i>w=4</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Single TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Double TABLE, SSE </td> <td> Done - Jim </td> </tr>
<tr> <td> Quad TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> Lazy Quad TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=8</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim </td> </tr>
<tr> <td> Single TABLE </td> <td> Done - Kevin </td> </tr>
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Lazy Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Split 2 1 (Half) SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Composite, k=2 </td> <td> Done - Kevin (alt mapping not passing unit test) </td> </tr>
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
<tr> <td> LOG ZERO</td> <td> Done - Jim</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=16</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Lazy TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 No-SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 SSE, lazy, alternate mapping </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 8 16, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
<tr> <td> LOG ZERO</td> <td> Done - Kevin </td> </tr>
<tr> <td> Group 4 4 </td> <td>Done - Jim: I don't see a reason to implement others, although 4-8 will be faster, and 8 8 will have faster region ops. They'll never beat SPLIT.</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=32</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 2 32,lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 2 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, SSE,ALTMAP lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 8 8 </td> <td>Done - Jim </td> </tr>
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=64</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
<tr> <td> Split 16 1 SSE, maybe lazy </td> <td> - </td> </tr>
<tr> <td> Split 8 1 lazy </td> <td> - </td> </tr>
<tr> <td> Split 8 8 </td> <td> - </td> </tr>
<tr> <td> Split 8 8 lazy </td> <td> - </td> </tr>
<tr> <td> Group </td> <td> - </td> </tr>
<tr> <td> Composite, k=2, alternate mapping </td> <td> - </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=128</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Will </td> </tr>
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
<tr> <td> Split 32 1 SSE, maybe lazy </td> <td> - </td> </tr>
<tr> <td> Split 16 1 lazy </td> <td> - </td> </tr>
<tr> <td> Split 16 16 - Maybe that's insanity</td> <td> - </td> </tr>
<tr> <td> Split 16 16 lazy </td> <td> - </td> </tr>
<tr> <td> Group (SSE) </td> <td> - </td> </tr>
<tr> <td> Composite, k=?, alternate mapping </td> <td> - </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=general between 1 & 32</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> CAUCHY Region (SSE XOR)</td> <td> Done - Jim </td> </tr>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
<tr> <td> Split - do we need it?</td> <td>Done - Jim</td></tr>
<tr> <td> Composite - do we need it?</td> <td> - </td></tr>
<tr> <td> Split - do we need it?</td> <td> - </td></tr>
<tr> <td> Logzero?</td> <td> - </td></tr>
</table><p>

478
gf.c Normal file

@ -0,0 +1,478 @@
/*
* gf.c
*
* Generic routines for Galois fields
*/
#include "gf_int.h"
#include <stdio.h>
#include <stdlib.h>
int gf_scratch_size(int w,
int mult_type,
int region_type,
int divide_type,
int arg1,
int arg2)
{
switch(w) {
case 4: return gf_w4_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
case 8: return gf_w8_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
case 16: return gf_w16_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
case 32: return gf_w32_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
case 64: return gf_w64_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
case 128: return gf_w128_scratch_size(mult_type, region_type, divide_type, arg1, arg2);
default: return gf_wgen_scratch_size(w, mult_type, region_type, divide_type, arg1, arg2);
}
}
int gf_dummy_init(gf_t *gf)
{
return 0;
}
int gf_init_easy(gf_t *gf, int w, int mult_type)
{
return gf_init_hard(gf, w, mult_type, GF_REGION_DEFAULT, GF_DIVIDE_DEFAULT, 0, 0, 0, NULL, NULL);
}
int gf_init_hard(gf_t *gf, int w, int mult_type,
int region_type,
int divide_type,
uint64_t prim_poly,
int arg1, int arg2,
gf_t *base_gf,
void *scratch_memory)
{
int sz;
gf_internal_t *h;
sz = gf_scratch_size(w, mult_type, region_type, divide_type, arg1, arg2);
if (sz <= 0) return 0;
if (scratch_memory == NULL) {
h = (gf_internal_t *) malloc(sz);
h->free_me = 1;
} else {
h = scratch_memory;
h->free_me = 0;
}
gf->scratch = (void *) h;
h->mult_type = mult_type;
h->region_type = region_type;
h->divide_type = divide_type;
h->w = w;
h->prim_poly = prim_poly;
h->arg1 = arg1;
h->arg2 = arg2;
h->base_gf = base_gf;
/* Private data lives directly after the header; avoid arithmetic on void *. */
h->private = (void *) ((uint8_t *) gf->scratch + sizeof(gf_internal_t));
gf->extract_word.w32 = NULL;
//printf("Created w=%d, with mult_type=%d and region_type=%d\n", w, mult_type, region_type);
switch(w) {
case 4: return gf_w4_init(gf);
case 8: return gf_w8_init(gf);
case 16: return gf_w16_init(gf);
case 32: return gf_w32_init(gf);
case 64: return gf_w64_init(gf);
case 128: return gf_w128_init(gf);
default: return gf_wgen_init(gf);
}
}
int gf_free(gf_t *gf, int recursive)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (recursive && h->base_gf != NULL) {
gf_free(h->base_gf, 1);
free(h->base_gf);
}
if (h->free_me) free(h);
return 0;
}
void gf_alignment_error(char *s, int a)
{
fprintf(stderr, "Alignment error in %s:\n", s);
fprintf(stderr, " The source and destination buffers must be aligned to each other,\n");
fprintf(stderr, " and they must be aligned to a %d-byte address.\n", a);
exit(1);
}
/* Lifted this code from Jens Gregor -- thanks, Jens */
int gf_is_sse2()
{
unsigned int cpeinfo;
unsigned int cpsse;
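/* cpuid overwrites eax, ebx, ecx and edx, so tell the compiler about it. */
asm volatile ( "mov $0x1, %%eax\n\t"
"cpuid\n\t"
"mov %%edx, %0\n\t"
"mov %%ecx, %1\n" : "=m" (cpeinfo), "=m" (cpsse)
: /* no inputs */
: "eax", "ebx", "ecx", "edx");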
if ((cpeinfo >> 26) & 0x1 ) return 1;
return 0;
}
static
void gf_invert_binary_matrix(uint32_t *mat, uint32_t *inv, int rows) {
int cols, i, j;
uint32_t tmp;
cols = rows;
/* Each row is a bit mask; start inv as the identity matrix. */
for (i = 0; i < rows; i++) inv[i] = (1U << i);
/* First -- convert into upper triangular */
for (i = 0; i < cols; i++) {
/* Swap rows if we have a zero i,i element. If we can't swap, then the
matrix was not invertible */
if ((mat[i] & (1U << i)) == 0) {
for (j = i+1; j < rows && (mat[j] & (1U << i)) == 0; j++) ;
if (j == rows) {
fprintf(stderr, "gf_invert_binary_matrix: Matrix not invertible!!\n");
exit(1);
}
tmp = mat[i]; mat[i] = mat[j]; mat[j] = tmp;
tmp = inv[i]; inv[i] = inv[j]; inv[j] = tmp;
}
/* Now for each j>i, add A_ji*Ai to Aj */
for (j = i+1; j != rows; j++) {
if ((mat[j] & (1U << i)) != 0) {
mat[j] ^= mat[i];
inv[j] ^= inv[i];
}
}
}
/* Now the matrix is upper triangular. Start at the top and multiply down */
for (i = rows-1; i >= 0; i--) {
for (j = 0; j < i; j++) {
if (mat[j] & (1U << i)) {
/* mat[j] ^= mat[i]; */
inv[j] ^= inv[i];
}
}
}
}
uint32_t gf_bitmatrix_inverse(uint32_t y, int w, uint32_t pp)
{
uint32_t mat[32], inv[32], mask;
int i;
mask = (w == 32) ? 0xffffffff : (1U << w) - 1;
for (i = 0; i < w; i++) {
mat[i] = y;
if (y & (1U << (w-1))) {
y = y << 1;
y = ((y ^ pp) & mask);
} else {
y = y << 1;
}
}
gf_invert_binary_matrix(mat, inv, w);
return inv[0];
}
/*
void gf_two_byte_region_table_multiply(gf_region_data *rd, uint16_t *base)
{
uint64_t p, ta, shift, tb;
uint64_t *s64, *d64
s64 = rd->s_start;
d64 = rd->d_start;
while (s64 < (uint64_t *) rd->s_top) {
p = (rd->xor) ? *d64 : 0;
ta = *s64;
shift = 0;
while (ta != 0) {
tb = base[ta&0xffff];
p ^= (tb << shift);
ta >>= 16;
shift += 16;
}
*d64 = p;
d64++;
s64++;
}
}
*/
void gf_two_byte_region_table_multiply(gf_region_data *rd, uint16_t *base)
{
uint64_t a, prod;
int j, xor;
uint64_t *s64, *d64, *top;
s64 = rd->s_start;
d64 = rd->d_start;
top = rd->d_top;
xor = rd->xor;
if (xor) {
while (d64 != top) {
a = *s64;
prod = base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
prod ^= *d64;
*d64 = prod;
s64++;
d64++;
}
} else {
while (d64 != top) {
a = *s64;
prod = base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
a <<= 16;
prod <<= 16;
prod ^= base[a >> 48];
*d64 = prod;
s64++;
d64++;
}
}
}
static void gf_slow_multiply_region(gf_region_data *rd, void *src, void *dest, void *s_top)
{
uint8_t *s8, *d8;
uint16_t *s16, *d16;
uint32_t *s32, *d32;
gf_internal_t *h;
int wb;
uint32_t p, a;
h = rd->gf->scratch;
wb = (h->w)/8;
if (wb == 0) wb = 1;
while (src < s_top) {
switch (h->w) {
case 8:
s8 = (uint8_t *) src;
d8 = (uint8_t *) dest;
*d8 = (rd->xor) ? (*d8 ^ rd->gf->multiply.w32(rd->gf, rd->val, *s8)) :
rd->gf->multiply.w32(rd->gf, rd->val, *s8);
break;
case 4:
s8 = (uint8_t *) src;
d8 = (uint8_t *) dest;
a = *s8;
p = rd->gf->multiply.w32(rd->gf, rd->val, a&0xf);
p |= (rd->gf->multiply.w32(rd->gf, rd->val, a >> 4) << 4);
if (rd->xor) p ^= *d8;
*d8 = p;
break;
case 16:
s16 = (uint16_t *) src;
d16 = (uint16_t *) dest;
*d16 = (rd->xor) ? (*d16 ^ rd->gf->multiply.w32(rd->gf, rd->val, *s16)) :
rd->gf->multiply.w32(rd->gf, rd->val, *s16);
break;
case 32:
s32 = (uint32_t *) src;
d32 = (uint32_t *) dest;
*d32 = (rd->xor) ? (*d32 ^ rd->gf->multiply.w32(rd->gf, rd->val, *s32)) :
rd->gf->multiply.w32(rd->gf, rd->val, *s32);
break;
default:
fprintf(stderr, "Error: gf_slow_multiply_region: w=%d not implemented.\n", h->w);
exit(1);
}
src = (uint8_t *) src + wb;
dest = (uint8_t *) dest + wb;
}
}
/* If align > 16, the pointers are aligned to 16 bytes, but the size of the
aligned region is made a multiple of align.
If align = -1, then this is cauchy. You need to make sure that bytes is a multiple of w. */
void gf_set_region_data(gf_region_data *rd,
gf_t *gf,
void *src,
void *dest,
int bytes,
uint32_t val,
int xor,
int align)
{
uint8_t *s8, *d8;
gf_internal_t *h;
int wb;
uint32_t a;
unsigned long uls, uld;
h = gf->scratch;
wb = (h->w)/8;
if (wb == 0) wb = 1;
rd->gf = gf;
rd->src = src;
rd->dest = dest;
rd->bytes = bytes;
rd->val = val;
rd->xor = xor;
rd->align = align;
uls = (unsigned long) src;
uld = (unsigned long) dest;
a = (align <= 16) ? align : 16;
if (align == -1) { /* This is cauchy. Error check bytes, then set up the pointers
so that there are no alignment regions. */
if (bytes % h->w != 0) {
fprintf(stderr, "Error in region multiply operation.\n");
fprintf(stderr, "The size must be a multiple of %d bytes.\n", h->w);
exit(1);
}
rd->s_start = src;
rd->d_start = dest;
rd->s_top = (uint8_t *) src + bytes;
rd->d_top = (uint8_t *) dest + bytes;
return;
}
if (uls % a != uld % a) {
fprintf(stderr, "Error in region multiply operation.\n");
fprintf(stderr, "The source & destination pointers must be aligned with respect\n");
fprintf(stderr, "to each other along a %d byte boundary.\n", a);
fprintf(stderr, "Src = 0x%lx. Dest = 0x%lx\n", (unsigned long) src,
(unsigned long) dest);
exit(1);
}
if (uls % wb != 0) {
fprintf(stderr, "Error in region multiply operation.\n");
fprintf(stderr, "The pointers must be aligned along a %d byte boundary.\n", wb);
fprintf(stderr, "Src = 0x%lx. Dest = 0x%lx\n", (unsigned long) src,
(unsigned long) dest);
exit(1);
}
if (bytes % wb != 0) {
fprintf(stderr, "Error in region multiply operation.\n");
fprintf(stderr, "The size must be a multiple of %d bytes.\n", wb);
exit(1);
}
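uls %= a;
/* Skip forward to the next a-byte boundary (a is at most 16). */
if (uls != 0) uls = (a-uls);
rd->s_start = (uint8_t *) rd->src + uls;
rd->d_start = (uint8_t *) rd->dest + uls;
bytes -= uls;
bytes -= (bytes % align);
rd->s_top = (uint8_t *) rd->s_start + bytes;
rd->d_top = (uint8_t *) rd->d_start + bytes;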
}
void gf_do_initial_region_alignment(gf_region_data *rd)
{
gf_slow_multiply_region(rd, rd->src, rd->dest, rd->s_start);
}
void gf_do_final_region_alignment(gf_region_data *rd)
{
gf_slow_multiply_region(rd, rd->s_top, rd->d_top, (uint8_t *) rd->src + rd->bytes);
}
void gf_multby_zero(void *dest, int bytes, int xor)
{
if (xor) return;
memset(dest, 0, bytes);
return;
}
void gf_multby_one(gf_t *gf, void *src, void *dest, int bytes, int xor)
{
#ifdef INTEL_SSE4
__m128i ms, md;
#endif
uint8_t *s8, *d8, *dtop8;
uint64_t *s64, *d64, *dtop64;
int abytes;
gf_region_data rd;
if (!xor) {
memcpy(dest, src, bytes);
return;
}
#ifdef INTEL_SSE4
s8 = (uint8_t *) src;
d8 = (uint8_t *) dest;
abytes = bytes & 0xfffffff0;
while (d8 < (uint8_t *) dest + abytes) {
ms = _mm_loadu_si128 ((__m128i *)(s8));
md = _mm_loadu_si128 ((__m128i *)(d8));
md = _mm_xor_si128(md, ms);
_mm_storeu_si128((__m128i *)(d8), md);
s8 += 16;
d8 += 16;
}
while (d8 != (uint8_t *) dest+bytes) {
*d8 ^= *s8;
d8++;
s8++;
}
return;
#endif
/* If you don't have SSE, you'd better be aligned..... */
gf_set_region_data(&rd, gf, src, dest, bytes, 1, xor, 8);
s8 = (uint8_t *) src;
d8 = (uint8_t *) dest;
while (d8 != rd.d_start) {
*d8 ^= *s8;
d8++;
s8++;
}
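/* XOR the aligned middle of the region eight bytes at a time. */
dtop64 = (uint64_t *) rd.d_top;
d64 = (uint64_t *) rd.d_start;
s64 = (uint64_t *) rd.s_start;
while (d64 < dtop64) {
*d64 ^= *s64;
d64++;
s64++;
}
/* Finish the unaligned tail one byte at a time. */
s8 = (uint8_t *) rd.s_top;
d8 = (uint8_t *) rd.d_top;
while (d8 != (uint8_t *) dest+bytes) {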
*d8 ^= *s8;
d8++;
s8++;
}
return;
}

123
gf.h Normal file

@ -0,0 +1,123 @@
/* gf.h
* External include file for Galois field arithmetic. */
#pragma once
#include <stdint.h>
#ifdef INTEL_SSE4
#include <nmmintrin.h>
#include <emmintrin.h>
#endif
#define GF_W128_IS_ZERO(val) ((val)[0] == 0 && (val)[1] == 0)
#define GF_W128_EQUAL(val1, val2) (((val1)[0] == (val2)[0]) && ((val1)[1] == (val2)[1]))
/* These are the different ways to perform multiplication.
Not all are implemented for all values of w.
See the paper for an explanation of how they work. */
typedef enum {GF_MULT_DEFAULT,
GF_MULT_SHIFT,
GF_MULT_GROUP,
GF_MULT_BYTWO_p,
GF_MULT_BYTWO_b,
GF_MULT_TABLE,
GF_MULT_LOG_TABLE,
GF_MULT_SPLIT_TABLE,
GF_MULT_COMPOSITE } gf_mult_type_t;
/* These are the different ways to optimize region
operations. They are bits because you can compose them:
You can mix SINGLE/DOUBLE/QUAD, LAZY, SSE/NOSSE, STDMAP/ALTMAP/CAUCHY.
Certain optimizations only apply to certain gf_mult_type_t's.
Again, please see documentation for how to use these */
#define GF_REGION_DEFAULT (0x0)
#define GF_REGION_SINGLE_TABLE (0x1)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_QUAD_TABLE (0x4)
#define GF_REGION_LAZY (0x8)
#define GF_REGION_SSE (0x10)
#define GF_REGION_NOSSE (0x20)
#define GF_REGION_STDMAP (0x40)
#define GF_REGION_ALTMAP (0x80)
#define GF_REGION_CAUCHY (0x100)
typedef uint32_t gf_region_type_t;
/* These are different ways to implement division.
Once again, it's best to use "DEFAULT". However,
there are times when you may want to experiment
with the others. */
typedef enum { GF_DIVIDE_DEFAULT,
GF_DIVIDE_MATRIX,
GF_DIVIDE_EUCLID } gf_division_type_t;
/* We support w=4,8,16,32,64 and 128 with their own data types and
operations for multiplication, division, etc. We also support
a "gen" type so that you can do general gf arithmetic for any
value of w from 1 to 32. You can perform a "region" operation
on these if you use "CAUCHY" as the mapping.
*/
typedef uint32_t gf_val_32_t;
typedef uint64_t gf_val_64_t;
typedef uint64_t *gf_val_128_t;
typedef struct gf *GFP;
typedef union gf_func_a_b {
gf_val_32_t (*w32) (GFP gf, gf_val_32_t a, gf_val_32_t b);
gf_val_64_t (*w64) (GFP gf, gf_val_64_t a, gf_val_64_t b);
void (*w128)(GFP gf, gf_val_128_t a, gf_val_128_t b, gf_val_128_t c);
} gf_func_a_b;
typedef union {
gf_val_32_t (*w32) (GFP gf, gf_val_32_t a);
gf_val_64_t (*w64) (GFP gf, gf_val_64_t a);
void (*w128)(GFP gf, gf_val_128_t a, gf_val_128_t b);
} gf_func_a;
typedef union {
void (*w32) (GFP gf, void *src, void *dest, gf_val_32_t val, int bytes, int add);
void (*w64) (GFP gf, void *src, void *dest, gf_val_64_t val, int bytes, int add);
void (*w128)(GFP gf, void *src, void *dest, gf_val_128_t val, int bytes, int add);
} gf_region;
typedef union {
gf_val_32_t (*w32) (GFP gf, void *start, int bytes, int index);
gf_val_64_t (*w64) (GFP gf, void *start, int bytes, int index);
void (*w128)(GFP gf, void *start, int bytes, int index, gf_val_128_t rv);
} gf_extract;
typedef struct gf {
gf_func_a_b multiply;
gf_func_a_b divide;
gf_func_a inverse;
gf_region multiply_region;
gf_extract extract_word;
void *scratch;
} gf_t;
extern int gf_init_easy(gf_t *gf, int w, int mult_type);
extern int gf_init_hard(gf_t *gf,
int w,
int mult_type,
int region_type,
int divide_type,
uint64_t prim_poly,
int arg1,
int arg2,
gf_t *base_gf,
void *scratch_memory);
extern int gf_scratch_size(int w,
int mult_type,
int region_type,
int divide_type,
int arg1,
int arg2);
extern int gf_free(gf_t *gf, int recursive);
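/* Usage sketch (illustrative only, not part of the API). It assumes that
   gf_init_easy() returns 0 on failure, which is how gf_method.c treats it,
   and that the default method for w=8 provides both multiply and divide:

     gf_t gf;
     uint32_t c;
     if (gf_init_easy(&gf, 8, GF_MULT_DEFAULT) == 0) exit(1);
     c = gf.multiply.w32(&gf, 3, 7);
     if (gf.divide.w32(&gf, c, 7) != 3) exit(1);   (dividing undoes the multiply)
     gf_free(&gf, 0);
*/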

BIN
gf_54 Executable file

Binary file not shown.

18
gf_54.c Normal file

@ -0,0 +1,18 @@
/*
* Multiplies four and five in GF(2^4).
*/
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include "gf.h"
int main(void)
{
gf_t gf;
gf_init_easy(&gf, 4, GF_MULT_DEFAULT);
printf("%u\n", gf.multiply.w32(&gf, 5, 4));
exit(0);
}

BIN
gf_div Executable file

Binary file not shown.

116
gf_div.c Normal file

@ -0,0 +1,116 @@
/*
* gf_div.c
*
* Divides two numbers in gf_2^w
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include "gf.h"
#include "gf_method.h"
void usage(char *s)
{
fprintf(stderr, "usage: gf_div a b w [method] - does division of a by b in GF(2^w)\n");
fprintf(stderr, " If w has an h on the end, treat a, b and the quotient as hexadecimal (no 0x)\n");
fprintf(stderr, "\n");
fprintf(stderr, " legal w are: 1-32, 64 and 128\n");
fprintf(stderr, " 128 is hex only (i.e. '128' will be an error - do '128h')\n");
fprintf(stderr, "\n");
fprintf(stderr, " For method specification, type gf_methods\n");
if (s != NULL) fprintf(stderr, "%s", s);
exit(1);
}
int read_128(char *s, uint64_t *v)
{
int l, t;
char save;
l = strlen(s);
if (l > 32) return 0;
if (l > 16) {
if (sscanf(s + (l-16), "%llx", (long long unsigned int *) &(v[1])) == 0) return 0;
save = s[l-16];
s[l-16] = '\0';
t = sscanf(s, "%llx", (long long unsigned int *) &(v[0]));
s[l-16] = save;
return t;
} else {
v[0] = 0;
return sscanf(s, "%llx", (long long unsigned int *)&(v[1]));
}
return 1;
}
void print_128(uint64_t *v)
{
if (v[0] > 0) {
printf("%llx", (long long unsigned int) v[0]);
printf("%016llx", (long long unsigned int) v[1]);
} else {
printf("%llx", (long long unsigned int) v[1]);
}
printf("\n");
}
int main(int argc, char **argv)
{
int hex, al, bl, w;
uint32_t a, b, c, top;
uint64_t a64, b64, c64;
uint64_t a128[2], b128[2], c128[2];
char *format;
gf_t gf;
if (argc < 4) usage(NULL);
if (sscanf(argv[3], "%d", &w) == 0) usage("Bad w\n");
if (w <= 0 || (w > 32 && w != 64 && w != 128)) usage("Bad w");
hex = (strchr(argv[3], 'h') != NULL);
if (create_gf_from_argv(&gf, w, argc, argv, 4) == 0) usage("\nBad Method\n");
if (!hex && w == 128) usage(NULL);
if (w <= 32) {
format = (hex) ? "%x" : "%u";
if (sscanf(argv[1], format, &a) == 0) usage("Bad a\n");
if (sscanf(argv[2], format, &b) == 0) usage("Bad b\n");
if (w < 32) {
top = (w == 31) ? 0x80000000 : (1 << w);
if (w != 32 && a >= top) usage("a is too large\n");
if (w != 32 && b >= top) usage("b is too large\n");
}
c = gf.divide.w32(&gf, a, b);
printf(format, c);
printf("\n");
} else if (w == 64) {
format = (hex) ? "%llx" : "%llu";
if (sscanf(argv[1], format, &a64) == 0) usage("Bad a\n");
if (sscanf(argv[2], format, &b64) == 0) usage("Bad b\n");
c64 = gf.divide.w64(&gf, a64, b64);
printf(format, c64);
printf("\n");
} else if (w == 128) {
if (read_128(argv[1], a128) == 0) usage("Bad a\n");
if (read_128(argv[2], b128) == 0) usage("Bad b\n");
gf.divide.w128(&gf, a128, b128, c128);
print_128(c128);
}
exit(0);
}

421
gf_general.c Normal file

@ -0,0 +1,421 @@
/*
* gf_general.c
*
* This file has helper routines for doing basic GF operations with any
* legal value of w. The problem is that w <= 32, w=64 and w=128 all have
* different data types, which is a pain. The procedures in this file try
* to alleviate that pain. They are used in gf_unit and gf_time.
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include "gf.h"
#include "gf_int.h"
#include "gf_method.h"
#include "gf_rand.h"
#include "gf_general.h"
void gf_general_set_zero(gf_general_t *v, int w)
{
if (w <= 32) {
v->w32 = 0;
} else if (w <= 64) {
v->w64 = 0;
} else {
v->w128[0] = 0;
v->w128[1] = 0;
}
}
void gf_general_set_one(gf_general_t *v, int w)
{
if (w <= 32) {
v->w32 = 1;
} else if (w <= 64) {
v->w64 = 1;
} else {
v->w128[0] = 0;
v->w128[1] = 1;
}
}
void gf_general_set_two(gf_general_t *v, int w)
{
if (w <= 32) {
v->w32 = 2;
} else if (w <= 64) {
v->w64 = 2;
} else {
v->w128[0] = 0;
v->w128[1] = 2;
}
}
int gf_general_is_zero(gf_general_t *v, int w)
{
if (w <= 32) {
return (v->w32 == 0);
} else if (w <= 64) {
return (v->w64 == 0);
} else {
return (v->w128[0] == 0 && v->w128[1] == 0);
}
}
int gf_general_is_one(gf_general_t *v, int w)
{
if (w <= 32) {
return (v->w32 == 1);
} else if (w <= 64) {
return (v->w64 == 1);
} else {
return (v->w128[0] == 0 && v->w128[1] == 1);
}
}
void gf_general_set_random(gf_general_t *v, int w, int zero_ok)
{
if (w <= 32) {
v->w32 = MOA_Random_W(w, zero_ok);
} else if (w <= 64) {
while (1) {
v->w64 = MOA_Random_64();
if (v->w64 != 0 || zero_ok) return;
}
} else {
while (1) {
MOA_Random_128(v->w128);
if (v->w128[0] != 0 || v->w128[1] != 0 || zero_ok) return;
}
}
}
void gf_general_val_to_s(gf_general_t *v, int w, char *s)
{
if (w <= 32) {
sprintf(s, "%x", v->w32);
} else if (w <= 64) {
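sprintf(s, "%llx", (long long unsigned int) v->w64);
} else {
if (v->w128[0] == 0) {
sprintf(s, "%llx", (long long unsigned int) v->w128[1]);
} else {
sprintf(s, "%llx%016llx", (long long unsigned int) v->w128[0],
(long long unsigned int) v->w128[1]);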
}
}
}
void gf_general_multiply(gf_t *gf, gf_general_t *a, gf_general_t *b, gf_general_t *c)
{
gf_internal_t *h;
int w;
h = (gf_internal_t *) gf->scratch;
w = h->w;
if (w <= 32) {
c->w32 = gf->multiply.w32(gf, a->w32, b->w32);
} else if (w <= 64) {
c->w64 = gf->multiply.w64(gf, a->w64, b->w64);
} else {
gf->multiply.w128(gf, a->w128, b->w128, c->w128);
}
}
void gf_general_divide(gf_t *gf, gf_general_t *a, gf_general_t *b, gf_general_t *c)
{
gf_internal_t *h;
int w;
h = (gf_internal_t *) gf->scratch;
w = h->w;
if (w <= 32) {
c->w32 = gf->divide.w32(gf, a->w32, b->w32);
} else if (w <= 64) {
c->w64 = gf->divide.w64(gf, a->w64, b->w64);
} else {
gf->divide.w128(gf, a->w128, b->w128, c->w128);
}
}
void gf_general_inverse(gf_t *gf, gf_general_t *a, gf_general_t *b)
{
gf_internal_t *h;
int w;
h = (gf_internal_t *) gf->scratch;
w = h->w;
if (w <= 32) {
b->w32 = gf->inverse.w32(gf, a->w32);
} else if (w <= 64) {
b->w64 = gf->inverse.w64(gf, a->w64);
} else {
gf->inverse.w128(gf, a->w128, b->w128);
}
}
int gf_general_are_equal(gf_general_t *v1, gf_general_t *v2, int w)
{
if (w <= 32) {
return (v1->w32 == v2->w32);
} else if (w <= 64) {
return (v1->w64 == v2->w64);
} else {
return (v1->w128[0] == v2->w128[0] &&
v1->w128[1] == v2->w128[1]);
}
}
void gf_general_do_region_multiply(gf_t *gf, gf_general_t *a, void *ra, void *rb, int bytes, int xor)
{
gf_internal_t *h;
int w;
h = (gf_internal_t *) gf->scratch;
w = h->w;
if (w <= 32) {
gf->multiply_region.w32(gf, ra, rb, a->w32, bytes, xor);
} else if (w <= 64) {
gf->multiply_region.w64(gf, ra, rb, a->w64, bytes, xor);
} else {
gf->multiply_region.w128(gf, ra, rb, a->w128, bytes, xor);
}
}
void gf_general_do_region_check(gf_t *gf, gf_general_t *a, void *orig_a, void *orig_target, void *final_target, int bytes, int xor)
{
gf_internal_t *h;
int w, words, i;
gf_general_t oa, ot, ft, sb;
char sa[50], soa[50], sot[50], sft[50], ssb[50];
uint8_t *p;
h = (gf_internal_t *) gf->scratch;
w = h->w;
words = (bytes * 8) / w;
for (i = 0; i < words; i++) {
if (w <= 32) {
oa.w32 = gf->extract_word.w32(gf, orig_a, bytes, i);
ot.w32 = gf->extract_word.w32(gf, orig_target, bytes, i);
ft.w32 = gf->extract_word.w32(gf, final_target, bytes, i);
sb.w32 = gf->multiply.w32(gf, a->w32, oa.w32);
if (xor) sb.w32 ^= ot.w32;
} else if (w <= 64) {
oa.w64 = gf->extract_word.w64(gf, orig_a, bytes, i);
ot.w64 = gf->extract_word.w64(gf, orig_target, bytes, i);
ft.w64 = gf->extract_word.w64(gf, final_target, bytes, i);
sb.w64 = gf->multiply.w64(gf, a->w64, oa.w64);
if (xor) sb.w64 ^= ot.w64;
} else {
gf->extract_word.w128(gf, orig_a, bytes, i, oa.w128);
gf->extract_word.w128(gf, orig_target, bytes, i, ot.w128);
gf->extract_word.w128(gf, final_target, bytes, i, ft.w128);
gf->multiply.w128(gf, a->w128, oa.w128, sb.w128);
if (xor) {
sb.w128[0] ^= ot.w128[0];
sb.w128[1] ^= ot.w128[1];
}
}
if (!gf_general_are_equal(&ft, &sb, w)) {
printf("Problem with region multiply (all values in hex):\n");
printf(" Target address base: 0x%lx. Word 0x%x of 0x%x. Xor: %d\n",
(unsigned long) final_target, i, words, xor);
gf_general_val_to_s(a, w, sa);
gf_general_val_to_s(&oa, w, soa);
gf_general_val_to_s(&ot, w, sot);
gf_general_val_to_s(&ft, w, sft);
gf_general_val_to_s(&sb, w, ssb);
printf(" Value: %s\n", sa);
printf(" Original source word: %s\n", soa);
if (xor) printf(" XOR with target word: %s\n", sot);
printf(" Product word: %s\n", sft);
printf(" It should be: %s\n", ssb);
exit(1);
}
}
}
void gf_general_set_up_single_timing_test(int w, void *ra, void *rb, int size)
{
uint32_t *r32;
int i;
/* If w is 8, 16, 32, 64 or 128, this is easy --
just fill the regions with random bytes.
Otherwise, treat every four bytes as a uint32_t
and fill it with a random value mod (1 << w).
*/
if (w == 8 || w == 16 || w == 32 || w == 64 || w == 128) {
MOA_Fill_Random_Region (ra, size);
MOA_Fill_Random_Region (rb, size);
} else {
r32 = (uint32_t *) ra;
for (i = 0; i < size/4; i++) r32[i] = MOA_Random_W(w, 1);
r32 = (uint32_t *) rb;
for (i = 0; i < size/4; i++) r32[i] = MOA_Random_W(w, 0);
}
}
/* This sucks, but in order to time, you really need to avoid putting ifs in
the inner loops. So, I'm doing a separate timing test for each w:
8, 16, 32, 64, 128 and everything else. Fortunately, the "everything else"
tests can be equivalent to w=32.
I'm also putting the results back into ra, because otherwise, the optimizer might
figure out that we're not really doing anything in the inner loops and it
will chuck that. */
int gf_general_do_single_timing_test(gf_t *gf, void *ra, void *rb, int size, char test)
{
gf_internal_t *h;
void *top;
uint8_t *r8a, *r8b, *top8;
uint16_t *r16a, *r16b, *top16;
uint32_t *r32a, *r32b, *top32;
uint64_t *r64a, *r64b, *top64, *r64c;
int w, rv;
h = (gf_internal_t *) gf->scratch;
w = h->w;
top = (uint8_t *) ra + size;
if (w == 8) {
r8a = (uint8_t *) ra;
r8b = (uint8_t *) rb;
top8 = (uint8_t *) top;
if (test == 'M') {
while (r8a < top8) {
*r8a = gf->multiply.w32(gf, *r8a, *r8b);
r8a++;
r8b++;
}
} else if (test == 'D') {
while (r8a < top8) {
*r8a = gf->divide.w32(gf, *r8a, *r8b);
r8a++;
r8b++;
}
} else if (test == 'I') {
while (r8a < top8) {
*r8a = gf->inverse.w32(gf, *r8a);
r8a++;
}
}
return (top8 - (uint8_t *) ra);
}
if (w == 16) {
r16a = (uint16_t *) ra;
r16b = (uint16_t *) rb;
top16 = (uint16_t *) top;
if (test == 'M') {
while (r16a < top16) {
*r16a = gf->multiply.w32(gf, *r16a, *r16b);
r16a++;
r16b++;
}
} else if (test == 'D') {
while (r16a < top16) {
*r16a = gf->divide.w32(gf, *r16a, *r16b);
r16a++;
r16b++;
}
} else if (test == 'I') {
while (r16a < top16) {
*r16a = gf->inverse.w32(gf, *r16a);
r16a++;
}
}
return (top16 - (uint16_t *) ra);
}
if (w <= 32) {
r32a = (uint32_t *) ra;
r32b = (uint32_t *) rb;
top32 = (uint32_t *) ra + (size/4); /* This is for the "everything elses" */
if (test == 'M') {
while (r32a < top32) {
*r32a = gf->multiply.w32(gf, *r32a, *r32b);
r32a++;
r32b++;
}
} else if (test == 'D') {
while (r32a < top32) {
*r32a = gf->divide.w32(gf, *r32a, *r32b);
r32a++;
r32b++;
}
} else if (test == 'I') {
while (r32a < top32) {
*r32a = gf->inverse.w32(gf, *r32a);
r32a++;
}
}
return (top32 - (uint32_t *) ra);
}
if (w == 64) {
r64a = (uint64_t *) ra;
r64b = (uint64_t *) rb;
top64 = (uint64_t *) top;
if (test == 'M') {
while (r64a < top64) {
*r64a = gf->multiply.w64(gf, *r64a, *r64b);
r64a++;
r64b++;
}
} else if (test == 'D') {
while (r64a < top64) {
*r64a = gf->divide.w64(gf, *r64a, *r64b);
r64a++;
r64b++;
}
} else if (test == 'I') {
while (r64a < top64) {
*r64a = gf->inverse.w64(gf, *r64a);
r64a++;
}
}
return (top64 - (uint64_t *) ra);
}
if (w == 128) {
r64a = (uint64_t *) ra;
r64c = r64a;
r64a += 2;
r64b = (uint64_t *) rb;
top64 = (uint64_t *) top;
rv = (top64 - r64a)/2;
if (test == 'M') {
while (r64a < top64) {
gf->multiply.w128(gf, r64a, r64b, r64c);
r64a += 2;
r64b += 2;
}
} else if (test == 'D') {
while (r64a < top64) {
gf->divide.w128(gf, r64a, r64b, r64c);
r64a += 2;
r64b += 2;
}
} else if (test == 'I') {
while (r64a < top64) {
gf->inverse.w128(gf, r64a, r64c);
r64a += 2;
}
}
return rv;
}
return 0;
}

55
gf_general.h Normal file

@ -0,0 +1,55 @@
/*
* gf_general.h
*
* This file has helper routines for doing basic GF operations with any
* legal value of w. The problem is that w <= 32, w=64 and w=128 all have
* different data types, which is a pain. The procedures in this file try
* to alleviate that pain. They are used in gf_unit and gf_time.
*/
#pragma once
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include "gf.h"
typedef union {
uint32_t w32;
uint64_t w64;
uint64_t w128[2];
} gf_general_t;
void gf_general_set_zero(gf_general_t *v, int w);
void gf_general_set_one(gf_general_t *v, int w);
void gf_general_set_two(gf_general_t *v, int w);
int gf_general_is_zero(gf_general_t *v, int w);
int gf_general_is_one(gf_general_t *v, int w);
int gf_general_are_equal(gf_general_t *v1, gf_general_t *v2, int w);
void gf_general_val_to_s(gf_general_t *v, int w, char *s);
void gf_general_set_random(gf_general_t *v, int w, int zero_ok);
void gf_general_multiply(gf_t *gf, gf_general_t *a, gf_general_t *b, gf_general_t *c);
void gf_general_divide(gf_t *gf, gf_general_t *a, gf_general_t *b, gf_general_t *c);
void gf_general_inverse(gf_t *gf, gf_general_t *a, gf_general_t *b);
void gf_general_do_region_multiply(gf_t *gf, gf_general_t *a,
void *ra, void *rb,
int bytes, int xor);
void gf_general_do_region_check(gf_t *gf, gf_general_t *a,
void *orig_a, void *orig_target, void *final_target,
int bytes, int xor);
void gf_general_set_up_single_timing_test(int w, void *ra, void *rb, int size);
/* "which" is 'M', 'D' or 'I' for multiply, divide or inverse. */
int gf_general_do_single_timing_test(gf_t *gf, void *ra, void *rb, int size, char which);

101
gf_int.h Normal file

@ -0,0 +1,101 @@
/*
* gf_int.h
*
* Internal code for Galois field routines.
*/
#pragma once
#include "gf.h"
#include <string.h>
extern void timer_start (double *t);
extern double timer_split (const double *t);
extern void galois_fill_random (void *buf, int len, unsigned int seed);
extern int galois_is_sse();
typedef struct {
int mult_type;
int region_type;
int divide_type;
int w;
uint64_t prim_poly;
int free_me;
int arg1;
int arg2;
gf_t *base_gf;
void *private;
} gf_internal_t;
extern int gf_w4_init (gf_t *gf);
extern int gf_w4_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_w8_init (gf_t *gf);
extern int gf_w8_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_w16_init (gf_t *gf);
extern int gf_w16_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_w32_init (gf_t *gf);
extern int gf_w32_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_w64_init (gf_t *gf);
extern int gf_w64_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_w128_init (gf_t *gf);
extern int gf_w128_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2);
extern int gf_wgen_init (gf_t *gf);
extern int gf_wgen_scratch_size(int w, int mult_type, int region_type, int divide_type, int arg1, int arg2);
void gf_wgen_cauchy_region(gf_t *gf, void *src, void *dest, gf_val_32_t val, int bytes, int xor);
gf_val_32_t gf_wgen_extract_word(gf_t *gf, void *start, int bytes, int index);
extern void gf_alignment_error(char *s, int a);
extern uint32_t gf_bitmatrix_inverse(uint32_t y, int w, uint32_t pp);
/* This structure lets you define a region multiply. It helps because you can handle
unaligned portions of the data with the procedures below, which really cleans
up the code. */
typedef struct {
gf_t *gf;
void *src;
void *dest;
int bytes;
uint32_t val;
int xor;
int align; /* The number of bytes to which to align. */
void *s_start; /* The start and the top of the aligned region. */
void *d_start;
void *s_top;
void *d_top;
} gf_region_data;
/* This lets you set up one of these in one call. It also sets the start/top pointers. */
void gf_set_region_data(gf_region_data *rd,
gf_t *gf,
void *src,
void *dest,
int bytes,
uint32_t val,
int xor,
int align);
/* This performs gf->multiply.w32() on all of the unaligned bytes in the beginning of the region */
extern void gf_do_initial_region_alignment(gf_region_data *rd);
/* This performs gf->multiply.w32() on all of the unaligned bytes in the end of the region */
extern void gf_do_final_region_alignment(gf_region_data *rd);
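/* A sketch of how the pieces above are meant to fit together, following the
   comments in this file. The 16-byte alignment and the contents of the fast
   loop are assumptions for illustration only:

     gf_region_data rd;
     gf_set_region_data(&rd, gf, src, dest, bytes, val, xor, 16);
     gf_do_initial_region_alignment(&rd);
     (fast loop over rd.s_start .. rd.s_top, writing rd.d_start .. rd.d_top)
     gf_do_final_region_alignment(&rd);
*/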
extern void gf_two_byte_region_table_multiply(gf_region_data *rd, uint16_t *base);
extern void gf_multby_zero(void *dest, int bytes, int xor);
extern void gf_multby_one(gf_t *gf, void *src, void *dest, int bytes, int xor);

185
gf_method.c Normal file

@ -0,0 +1,185 @@
/*
* gf_method.c
*
* Parses argv to figure out the mult_type and arguments. Returns the gf.
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include "gf.h"
#include "gf_method.h"
void methods_to_stderr()
{
fprintf(stderr, "To specify the methods, do one of the following: \n");
fprintf(stderr, " - leave empty to use defaults\n");
fprintf(stderr, " - use a single dash to use defaults\n");
fprintf(stderr, " - specify MULTIPLY REGION DIVIDE\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal values of MULTIPLY:\n");
fprintf(stderr, " SHIFT: shift\n");
fprintf(stderr, " GROUP g_mult g_reduce: the Group technique - see the paper\n");
fprintf(stderr, " BYTWO_p: BYTWO doubling the product.\n");
fprintf(stderr, " BYTWO_b: BYTWO doubling b (more efficient than BYTWO_p)\n");
fprintf(stderr, " TABLE: Full multiplication table\n");
fprintf(stderr, " LOG: Discrete logs\n");
fprintf(stderr, " LOG_ZERO: Discrete logs with a large table for zeros\n");
fprintf(stderr, " SPLIT g_a g_b: Split tables defined by g_a and g_b\n");
fprintf(stderr, " COMPOSITE k rec METHOD: Composite field. GF((2^l)^k), l=w/k.\n");
fprintf(stderr, " rec = 0 means inline single multiplication\n");
fprintf(stderr, " rec = 1 means recursive single multiplication\n");
fprintf(stderr, " METHOD is the method of the base field in GF(2^l)\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal values of REGION: Specify multiples with commas e.g. 'DOUBLE,LAZY'\n");
fprintf(stderr, " -: Use defaults\n");
fprintf(stderr, " SINGLE/DOUBLE/QUAD: Expand tables\n");
fprintf(stderr, " LAZY: Lazily create table (only applies to TABLE and SPLIT)\n");
fprintf(stderr, " SSE/NOSSE: Use 128-bit SSE instructions if you can\n");
fprintf(stderr, " CAUCHY/ALTMAP/STDMAP: Use different memory mappings\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal values of DIVIDE:\n");
fprintf(stderr, " -: Use defaults\n");
fprintf(stderr, " MATRIX: Use matrix inversion\n");
fprintf(stderr, " EUCLID: Use the extended Euclidean algorithm.\n");
fprintf(stderr, "\n");
fprintf(stderr, "See the user's manual for more information.\n");
fprintf(stderr, "There are many restrictions, so it is better to simply use defaults in most cases.\n");
}
int create_gf_from_argv(gf_t *gf, int w, int argc, char **argv, int starting)
{
int mult_type, divide_type, region_type;
uint32_t prim_poly = 0;
int arg1, arg2, subrg_size;
gf_t *base;
char *crt, *x, *y;
if (argc <= starting || strcmp(argv[starting], "-") == 0) {
mult_type = GF_MULT_DEFAULT;
if (!gf_init_easy(gf, w, mult_type)) return 0;
return (argc <= starting) ? starting : starting+1;
}
region_type = GF_REGION_DEFAULT;
divide_type = GF_DIVIDE_DEFAULT;
arg1 = 0;
arg2 = 0;
prim_poly = 0;
base = NULL;
subrg_size = 0;
if (argc < starting+3) return 0;
if (strcmp(argv[starting], "SHIFT") == 0) {
mult_type = GF_MULT_SHIFT;
starting++;
} else if (strcmp(argv[starting], "GROUP") == 0) {
mult_type = GF_MULT_GROUP;
if (argc < starting+5) return 0;
if (sscanf(argv[starting+1], "%d", &arg1) == 0 ||
sscanf(argv[starting+2], "%d", &arg2) == 0 ||
arg1 <= 0 || arg2 <= 0 || arg1 >= w || arg2 >= w) return 0;
starting += 3;
} else if (strcmp(argv[starting], "BYTWO_p") == 0) {
mult_type = GF_MULT_BYTWO_p;
starting++;
} else if (strcmp(argv[starting], "BYTWO_b") == 0) {
mult_type = GF_MULT_BYTWO_b;
starting++;
} else if (strcmp(argv[starting], "TABLE") == 0) {
mult_type = GF_MULT_TABLE;
starting++;
} else if (strcmp(argv[starting], "LOG") == 0) {
mult_type = GF_MULT_LOG_TABLE;
starting++;
} else if (strcmp(argv[starting], "LOG_ZERO") == 0) {
mult_type = GF_MULT_LOG_TABLE;
arg1 = 1;
starting++;
} else if (strcmp(argv[starting], "SPLIT") == 0) {
mult_type = GF_MULT_SPLIT_TABLE;
if (argc < starting+5) return 0;
if (sscanf(argv[starting+1], "%d", &arg1) == 0 ||
sscanf(argv[starting+2], "%d", &arg2) == 0 ||
arg1 <= 0 || arg2 <= 0 || w % arg1 != 0 || w % arg2 != 0) return 0;
starting += 3;
} else if (strcmp(argv[starting], "COMPOSITE") == 0) {
mult_type = GF_MULT_COMPOSITE;
if (argc < starting+6) return 0;
if (sscanf(argv[starting+1], "%d", &arg1) == 0 ||
sscanf(argv[starting+2], "%d", &arg2) == 0 ||
arg1 <= 1 || w %arg1 != 0 || ((arg2 | 1) != 1)) return 0;
base = (gf_t *) malloc(sizeof(gf_t));
starting = create_gf_from_argv(base, w/arg1, argc, argv, starting+3);
if (starting == 0) { free(base); return 0; }
} else {
return 0;
}
if (argc < starting+2) {
if (base != NULL) gf_free(base, 1);
return 0;
}
if (strcmp(argv[starting], "-") == 0) {
region_type = GF_REGION_DEFAULT;
} else {
crt = strdup(argv[starting]);
region_type = 0;
x = crt;
do {
y = strchr(x, ',');
if (y != NULL) *y = '\0';
if (strcmp(x, "DOUBLE") == 0) {
region_type |= GF_REGION_DOUBLE_TABLE;
} else if (strcmp(x, "QUAD") == 0) {
region_type |= GF_REGION_QUAD_TABLE;
} else if (strcmp(x, "SINGLE") == 0) {
region_type |= GF_REGION_SINGLE_TABLE;
} else if (strcmp(x, "LAZY") == 0) {
region_type |= GF_REGION_LAZY;
} else if (strcmp(x, "SSE") == 0) {
region_type |= GF_REGION_SSE;
} else if (strcmp(x, "NOSSE") == 0) {
region_type |= GF_REGION_NOSSE;
} else if (strcmp(x, "CAUCHY") == 0) {
region_type |= GF_REGION_CAUCHY;
} else if (strcmp(x, "ALTMAP") == 0) {
region_type |= GF_REGION_ALTMAP;
} else if (strcmp(x, "STDMAP") == 0) {
region_type |= GF_REGION_STDMAP;
} else {
if (base != NULL) gf_free(base, 1);
free(crt);
return 0;
}
if (y != NULL) x = y+1;
} while (y != NULL);
free(crt);
}
starting++;
if (strcmp(argv[starting], "-") == 0) {
divide_type = GF_DIVIDE_DEFAULT;
} else if (strcmp(argv[starting], "MATRIX") == 0) {
divide_type = GF_DIVIDE_MATRIX;
} else if (strcmp(argv[starting], "EUCLID") == 0) {
divide_type = GF_DIVIDE_EUCLID;
} else {
if (base != NULL) gf_free(base, 1);
return 0;
}
starting++;
if (!gf_init_hard(gf, w, mult_type, region_type, divide_type, prim_poly, arg1, arg2, base, NULL)) {
if (base != NULL) gf_free(base, 1);
return 0;
}
return starting;
}
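
The region argument parsed above is a comma-separated list of flag names that are ORed into one bitmask. A minimal sketch of that step, written without `strdup` so the input is never modified. The `DEMO_*` values and `demo_parse_region` are hypothetical names for illustration only; the real `GF_REGION_*` constants are defined in gf.h, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; the real GF_REGION_* constants live in gf.h. */
enum {
  DEMO_SINGLE = 1 << 0, DEMO_DOUBLE = 1 << 1, DEMO_QUAD   = 1 << 2,
  DEMO_LAZY   = 1 << 3, DEMO_SSE    = 1 << 4, DEMO_NOSSE  = 1 << 5,
  DEMO_STDMAP = 1 << 6, DEMO_ALTMAP = 1 << 7, DEMO_CAUCHY = 1 << 8
};

struct demo_flag { const char *name; int bit; };

static const struct demo_flag demo_flags[] = {
  { "SINGLE", DEMO_SINGLE }, { "DOUBLE", DEMO_DOUBLE }, { "QUAD",   DEMO_QUAD },
  { "LAZY",   DEMO_LAZY },   { "SSE",    DEMO_SSE },    { "NOSSE",  DEMO_NOSSE },
  { "STDMAP", DEMO_STDMAP }, { "ALTMAP", DEMO_ALTMAP }, { "CAUCHY", DEMO_CAUCHY }
};

/* Returns the OR of all flags named in the comma-separated list s,
   or -1 if any token (including an empty one) is not a known name. */
int demo_parse_region(const char *s)
{
  const size_t nflags = sizeof(demo_flags) / sizeof(demo_flags[0]);
  const char *end;
  size_t i, len;
  int flags = 0;

  assert(s != NULL);
  for (;;) {
    /* The current token runs from s up to the next comma or end of string. */
    end = strchr(s, ',');
    len = (end != NULL) ? (size_t) (end - s) : strlen(s);
    for (i = 0; i < nflags; i++) {
      if (strlen(demo_flags[i].name) == len &&
          strncmp(s, demo_flags[i].name, len) == 0) break;
    }
    if (i == nflags) return -1;
    flags |= demo_flags[i].bit;
    if (end == NULL) return flags;
    s = end + 1;
  }
}
```

Like the original loop, this accepts contradictory combinations such as SSE,NOSSE; rejecting those is left to the initialization code.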

15
gf_method.h Normal file

@ -0,0 +1,15 @@
/*
* gf_method.h
*
* Parses argv to figure out the flags and arguments. Creates the gf.
*/
#pragma once
#include "gf.h"
/* This prints out the error string defining the methods that you can put on argv */
extern void methods_to_stderr();
/* Parses argv starting at "starting" */
extern int create_gf_from_argv(gf_t *gf, int w, int argc, char **argv, int starting);

BIN
gf_methods Executable file

Binary file not shown.

141
gf_methods.c Normal file

@ -0,0 +1,141 @@
/*
* gf_methods.c
*
* Lists the method specifications (multiply, region, divide) that are implemented for each w
*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include "gf.h"
#include "gf_method.h"
#define NMULTS (14)
static char *mults[NMULTS] = { "SHIFT", "GROUP44", "GROUP48", "BYTWO_p", "BYTWO_b",
"TABLE", "LOG", "LOG_ZERO", "SPLIT2", "SPLIT4", "SPLIT8", "SPLIT88", "COMPOSITE-0", "COMPOSITE-1" };
#define NREGIONS (96)
static char *regions[NREGIONS] = { "-", "SINGLE", "DOUBLE", "QUAD",
"LAZY", "SINGLE,LAZY", "DOUBLE,LAZY", "QUAD,LAZY", "SSE",
"SINGLE,SSE", "DOUBLE,SSE", "QUAD,SSE", "LAZY,SSE",
"SINGLE,LAZY,SSE", "DOUBLE,LAZY,SSE", "QUAD,LAZY,SSE", "NOSSE",
"SINGLE,NOSSE", "DOUBLE,NOSSE", "QUAD,NOSSE", "LAZY,NOSSE",
"SINGLE,LAZY,NOSSE", "DOUBLE,LAZY,NOSSE", "QUAD,LAZY,NOSSE",
"STDMAP", "SINGLE,STDMAP", "DOUBLE,STDMAP", "QUAD,STDMAP",
"LAZY,STDMAP", "SINGLE,LAZY,STDMAP", "DOUBLE,LAZY,STDMAP",
"QUAD,LAZY,STDMAP", "SSE,STDMAP", "SINGLE,SSE,STDMAP",
"DOUBLE,SSE,STDMAP", "QUAD,SSE,STDMAP", "LAZY,SSE,STDMAP",
"SINGLE,LAZY,SSE,STDMAP", "DOUBLE,LAZY,SSE,STDMAP",
"QUAD,LAZY,SSE,STDMAP", "NOSSE,STDMAP", "SINGLE,NOSSE,STDMAP",
"DOUBLE,NOSSE,STDMAP", "QUAD,NOSSE,STDMAP", "LAZY,NOSSE,STDMAP",
"SINGLE,LAZY,NOSSE,STDMAP", "DOUBLE,LAZY,NOSSE,STDMAP",
"QUAD,LAZY,NOSSE,STDMAP", "ALTMAP", "SINGLE,ALTMAP", "DOUBLE,ALTMAP",
"QUAD,ALTMAP", "LAZY,ALTMAP", "SINGLE,LAZY,ALTMAP",
"DOUBLE,LAZY,ALTMAP", "QUAD,LAZY,ALTMAP", "SSE,ALTMAP",
"SINGLE,SSE,ALTMAP", "DOUBLE,SSE,ALTMAP", "QUAD,SSE,ALTMAP",
"LAZY,SSE,ALTMAP", "SINGLE,LAZY,SSE,ALTMAP",
"DOUBLE,LAZY,SSE,ALTMAP", "QUAD,LAZY,SSE,ALTMAP", "NOSSE,ALTMAP",
"SINGLE,NOSSE,ALTMAP", "DOUBLE,NOSSE,ALTMAP", "QUAD,NOSSE,ALTMAP",
"LAZY,NOSSE,ALTMAP", "SINGLE,LAZY,NOSSE,ALTMAP",
"DOUBLE,LAZY,NOSSE,ALTMAP", "QUAD,LAZY,NOSSE,ALTMAP", "CAUCHY",
"SINGLE,CAUCHY", "DOUBLE,CAUCHY", "QUAD,CAUCHY", "LAZY,CAUCHY",
"SINGLE,LAZY,CAUCHY", "DOUBLE,LAZY,CAUCHY", "QUAD,LAZY,CAUCHY",
"SSE,CAUCHY", "SINGLE,SSE,CAUCHY", "DOUBLE,SSE,CAUCHY",
"QUAD,SSE,CAUCHY", "LAZY,SSE,CAUCHY", "SINGLE,LAZY,SSE,CAUCHY",
"DOUBLE,LAZY,SSE,CAUCHY", "QUAD,LAZY,SSE,CAUCHY", "NOSSE,CAUCHY",
"SINGLE,NOSSE,CAUCHY", "DOUBLE,NOSSE,CAUCHY", "QUAD,NOSSE,CAUCHY",
"LAZY,NOSSE,CAUCHY", "SINGLE,LAZY,NOSSE,CAUCHY",
"DOUBLE,LAZY,NOSSE,CAUCHY", "QUAD,LAZY,NOSSE,CAUCHY" };
#define NDIVS (3)
static char *divides[NDIVS] = { "-", "MATRIX", "EUCLID" };
int main()
{
int m, r, d, w, i, sa, j;
char *argv[20];
gf_t gf;
char divs[200], ks[10], ls[10];
methods_to_stderr();
printf("\n");
printf("Implemented Methods: \n\n");
for (i = 2; i < 8; i++) {
w = (1 << i);
argv[0] = "-";
if (create_gf_from_argv(&gf, w, 1, argv, 0) > 0) {
printf("w=%d: -\n", w);
gf_free(&gf, 1);
}
for (m = 0; m < NMULTS; m++) {
sa = 0;
if (strcmp(mults[m], "GROUP44") == 0) {
argv[sa++] = "GROUP";
argv[sa++] = "4";
argv[sa++] = "4";
} else if (strcmp(mults[m], "GROUP48") == 0) {
argv[sa++] = "GROUP";
argv[sa++] = "4";
argv[sa++] = "8";
} else if (strcmp(mults[m], "SPLIT2") == 0) {
argv[sa++] = "SPLIT";
sprintf(ls, "%d", w);
argv[sa++] = ls;
argv[sa++] = "2";
} else if (strcmp(mults[m], "SPLIT4") == 0) {
argv[sa++] = "SPLIT";
sprintf(ls, "%d", w);
argv[sa++] = ls;
argv[sa++] = "4";
} else if (strcmp(mults[m], "SPLIT8") == 0) {
argv[sa++] = "SPLIT";
sprintf(ls, "%d", w);
argv[sa++] = ls;
argv[sa++] = "8";
} else if (strcmp(mults[m], "SPLIT88") == 0) {
argv[sa++] = "SPLIT";
argv[sa++] = "8";
argv[sa++] = "8";
} else if (strcmp(mults[m], "COMPOSITE-0") == 0) {
argv[sa++] = "COMPOSITE";
argv[sa++] = "2";
argv[sa++] = "0";
argv[sa++] = "-";
} else if (strcmp(mults[m], "COMPOSITE-1") == 0) {
argv[sa++] = "COMPOSITE";
argv[sa++] = "2";
argv[sa++] = "1";
argv[sa++] = "-";
} else {
argv[sa++] = mults[m];
}
for (r = 0; r < NREGIONS; r++) {
argv[sa++] = regions[r];
strcpy(divs, "");
for (d = 0; d < NDIVS; d++) {
argv[sa++] = divides[d];
/* printf("w=%d:", w);
for (j = 0; j < sa; j++) printf(" %s", argv[j]);
printf("\n"); */
if (create_gf_from_argv(&gf, w, sa, argv, 0) > 0) {
strcat(divs, "|");
strcat(divs, divides[d]);
gf_free(&gf, 1);
}
sa--;
}
if (strlen(divs) > 0) {
printf("w=%d:", w);
for (j = 0; j < sa; j++) printf(" %s", argv[j]);
printf(" %s\n", divs+1);
}
sa--;
}
sa--;
}
}
}
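
The 96 entries of regions[] above are the cross product of four option groups: table size (4 choices), LAZY (2), SSE/NOSSE (3) and memory mapping (4), so 4*2*3*4 = 96. A sketch that derives the n-th string from its index instead of listing every entry, under the assumption that the table order above is the intended one. `demo_region_name` and `DEMO_REGION_MAX` are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_REGION_MAX (32)   /* the longest name needs 25 bytes with the NUL */

/* Writes the n-th region string (0 <= n < 96) into buf, in the same order
   as the hand-written table. Returns 1 on success, 0 on a bad argument. */
int demo_region_name(int n, char *buf, size_t bufsize)
{
  static const char *table[] = { "", "SINGLE", "DOUBLE", "QUAD" };
  static const char *lazy[]  = { "", "LAZY" };
  static const char *sse[]   = { "", "SSE", "NOSSE" };
  static const char *map[]   = { "", "STDMAP", "ALTMAP", "CAUCHY" };
  const char *parts[4];
  int i;

  assert(buf != NULL);
  if (n < 0 || n >= 4 * 2 * 3 * 4 || bufsize < DEMO_REGION_MAX) return 0;

  /* Mixed-radix decomposition: table size varies fastest, mapping slowest. */
  parts[0] = table[n % 4];
  parts[1] = lazy[(n / 4) % 2];
  parts[2] = sse[(n / 8) % 3];
  parts[3] = map[n / 24];

  buf[0] = '\0';
  for (i = 0; i < 4; i++) {
    if (parts[i][0] == '\0') continue;
    if (buf[0] != '\0') strcat(buf, ",");
    strcat(buf, parts[i]);
  }
  if (buf[0] == '\0') strcpy(buf, "-");   /* no option selected means defaults */
  return 1;
}
```

Adding a new option then means extending one small array rather than rewriting the full list.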

BIN
gf_mult Executable file

Binary file not shown.

116
gf_mult.c Normal file

@ -0,0 +1,116 @@
/*
* gf_mult.c
*
* Multiplies two numbers in gf_2^w
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include "gf.h"
#include "gf_method.h"
void usage(char *s)
{
fprintf(stderr, "usage: gf_mult a b w [method] - does multiplication of a and b in GF(2^w)\n");
fprintf(stderr, " If w has an h on the end, treat a, b and the product as hexadecimal (no 0x)\n");
fprintf(stderr, "\n");
fprintf(stderr, " legal w are: 1-32, 64 and 128\n");
fprintf(stderr, " 128 is hex only (i.e. '128' will be an error - do '128h')\n");
fprintf(stderr, "\n");
fprintf(stderr, " For method specification, type gf_methods\n");
if (s != NULL) fprintf(stderr, "%s", s);
exit(1);
}
int read_128(char *s, uint64_t *v)
{
int l, t;
char save;
l = strlen(s);
if (l > 32) return 0;
if (l > 16) {
if (sscanf(s + (l-16), "%llx", (long long unsigned int *) &(v[1])) == 0) return 0;
save = s[l-16];
s[l-16] = '\0';
t = sscanf(s, "%llx", (long long unsigned int *) &(v[0]));
s[l-16] = save;
return t;
} else {
v[0] = 0;
return sscanf(s, "%llx", (long long unsigned int *)&(v[1]));
}
return 1;
}
void print_128(uint64_t *v)
{
if (v[0] > 0) {
printf("%llx", (long long unsigned int) v[0]);
printf("%016llx", (long long unsigned int) v[1]);
} else {
printf("%llx", (long long unsigned int) v[1]);
}
printf("\n");
}
int main(int argc, char **argv)
{
int hex, al, bl, w;
uint32_t a, b, c, top;
uint64_t a64, b64, c64;
uint64_t a128[2], b128[2], c128[2];
char *format;
gf_t gf;
if (argc < 4) usage(NULL);
if (sscanf(argv[3], "%d", &w) == 0) usage("Bad w\n");
if (w <= 0 || (w > 32 && w != 64 && w != 128)) usage("Bad w");
hex = (strchr(argv[3], 'h') != NULL);
if (create_gf_from_argv(&gf, w, argc, argv, 4) == 0) usage("\nBad Method\n");
if (!hex && w == 128) usage(NULL);
if (w <= 32) {
format = (hex) ? "%x" : "%u";
if (sscanf(argv[1], format, &a) == 0) usage("Bad a\n");
if (sscanf(argv[2], format, &b) == 0) usage("Bad b\n");
if (w < 32) {
top = (w == 31) ? 0x80000000 : (1 << w);
if (w != 32 && a >= top) usage("a is too large\n");
if (w != 32 && b >= top) usage("b is too large\n");
}
c = gf.multiply.w32(&gf, a, b);
printf(format, c);
printf("\n");
} else if (w == 64) {
format = (hex) ? "%llx" : "%llu";
if (sscanf(argv[1], format, (long long unsigned int *) &a64) == 0) usage("Bad a\n");
if (sscanf(argv[2], format, (long long unsigned int *) &b64) == 0) usage("Bad b\n");
c64 = gf.multiply.w64(&gf, a64, b64);
printf(format, (long long unsigned int) c64);
printf("\n");
} else if (w == 128) {
if (read_128(argv[1], a128) == 0) usage("Bad a\n");
if (read_128(argv[2], b128) == 0) usage("Bad b\n");
gf.multiply.w128(&gf, a128, b128, c128);
print_128(c128);
}
exit(0);
}
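
read_128() above splits a hex string of up to 32 digits into a high and a low 64-bit word, temporarily writing a NUL into the caller's string. A sketch of the same split that leaves the input untouched and rejects non-hex characters, using `strtoull`. `demo_read_128` is a hypothetical name; the word order (v[0] high, v[1] low) follows the original.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parses 1 to 32 hex digits (no 0x prefix) into v[0] (high 64 bits) and
   v[1] (low 64 bits). Returns 1 on success and 0 on malformed input. */
int demo_read_128(const char *s, uint64_t v[2])
{
  char hi[17], lo[17];
  size_t l, i, nhi;

  assert(s != NULL && v != NULL);
  l = strlen(s);
  if (l == 0 || l > 32) return 0;
  for (i = 0; i < l; i++) {
    if (!isxdigit((unsigned char) s[i])) return 0;
  }

  /* Everything beyond the last 16 digits belongs to the high word. */
  nhi = (l > 16) ? l - 16 : 0;
  memcpy(hi, s, nhi);
  hi[nhi] = '\0';
  memcpy(lo, s + nhi, l - nhi);
  lo[l - nhi] = '\0';

  v[0] = (nhi > 0) ? (uint64_t) strtoull(hi, NULL, 16) : 0;
  v[1] = (uint64_t) strtoull(lo, NULL, 16);
  return 1;
}
```

Copying into two small buffers costs a few bytes of stack but keeps argv read-only, which matters if the same argument is parsed again later.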

72
gf_rand.c Normal file

@ -0,0 +1,72 @@
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "gf_rand.h"
/* Lifted the "Mother of All" random number generator from http://www.agner.org/random/ */
static uint32_t MOA_X[5];
uint32_t MOA_Random_32() {
uint64_t sum;
sum = (uint64_t)2111111111UL * (uint64_t)MOA_X[3] +
(uint64_t)1492 * (uint64_t)(MOA_X[2]) +
(uint64_t)1776 * (uint64_t)(MOA_X[1]) +
(uint64_t)5115 * (uint64_t)(MOA_X[0]) +
(uint64_t)MOA_X[4];
MOA_X[3] = MOA_X[2]; MOA_X[2] = MOA_X[1]; MOA_X[1] = MOA_X[0];
MOA_X[4] = (uint32_t)(sum >> 32);
MOA_X[0] = (uint32_t)sum;
return MOA_X[0];
}
uint64_t MOA_Random_64() {
uint64_t sum;
sum = MOA_Random_32();
sum <<= 32;
sum |= MOA_Random_32();
return sum;
}
void MOA_Random_128(uint64_t *x) {
x[0] = MOA_Random_64();
x[1] = MOA_Random_64();
return;
}
uint32_t MOA_Random_W(int w, int zero_ok)
{
uint32_t b;
do {
b = MOA_Random_32();
if (w == 31) b &= 0x7fffffff;
if (w < 31) b %= (1 << w);
} while (!zero_ok && b == 0);
return b;
}
void MOA_Seed(uint32_t seed) {
int i;
uint32_t s = seed;
for (i = 0; i < 5; i++) {
s = s * 29943829 - 1;
MOA_X[i] = s;
}
for (i=0; i<19; i++) MOA_Random_32();
}
void MOA_Fill_Random_Region (void *reg, int size)
{
uint32_t *r32;
uint8_t *r8;
int i;
r32 = (uint32_t *) reg;
r8 = (uint8_t *) reg;
for (i = 0; i < size/4; i++) r32[i] = MOA_Random_32();
for (i *= 4; i < size; i++) r8[i] = MOA_Random_W(8, 1);
}
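
MOA_Random_W() above reduces a 32-bit value to w bits with `% (1 << w)` and special-cases w == 31, because `1 << 31` overflows a signed int. A sketch of an alternative that uses an unsigned mask and therefore handles every width from 1 to 32 the same way. `demo_low_bits` is a hypothetical helper, not part of gf_rand.h.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the low w bits of b, for 1 <= w <= 32. */
uint32_t demo_low_bits(uint32_t b, int w)
{
  assert(w >= 1 && w <= 32);
  if (w == 32) return b;                  /* shifting by 32 would be undefined */
  return b & ((UINT32_C(1) << w) - 1);    /* unsigned shift, safe for w == 31 */
}
```

For a power-of-two modulus the mask and the modulo give the same result, so this changes no output, only the handling of the edge case.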

18
gf_rand.h Normal file

@ -0,0 +1,18 @@
/* gf_rand.h
* External include file for random number generation. */
#pragma once
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
/* These are all pretty self-explanatory */
uint32_t MOA_Random_32();
uint64_t MOA_Random_64();
void MOA_Random_128(uint64_t *x);
uint32_t MOA_Random_W(int w, int zero_ok);
void MOA_Fill_Random_Region (void *reg, int size); /* reg should be aligned to 4 bytes, but
size can be anything. */
void MOA_Seed(uint32_t seed);

BIN
gf_time Executable file

Binary file not shown.

195
gf_time.c Normal file

@ -0,0 +1,195 @@
/*
* gf_time.c
*
* Performs timing tests for gf arithmetic
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include "gf.h"
#include "gf_method.h"
#include "gf_rand.h"
#include "gf_general.h"
#define REGION_SIZE (4096)
void
timer_start (double *t)
{
struct timeval tv;
gettimeofday (&tv, NULL);
*t = (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}
double
timer_split (const double *t)
{
struct timeval tv;
double cur_t;
gettimeofday (&tv, NULL);
cur_t = (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
return (cur_t - *t);
}
void problem(char *s)
{
fprintf(stderr, "Timing test failed.\n");
fprintf(stderr, "%s\n", s);
exit(1);
}
void usage(char *s)
{
fprintf(stderr, "usage: gf_time w tests seed size(bytes) iterations [method [params]] - does timing\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal w are: 1 - 32, 64 and 128\n");
fprintf(stderr, "\n");
fprintf(stderr, "Tests may be any combination of:\n");
fprintf(stderr, " A: All\n");
fprintf(stderr, " S: All Single Operations\n");
fprintf(stderr, " R: All Region Operations\n");
fprintf(stderr, " M: Single: Multiplications\n");
fprintf(stderr, " D: Single: Divisions\n");
fprintf(stderr, " I: Single: Inverses\n");
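fprintf(stderr, " G: Region: Multiply by a random constant\n");
fprintf(stderr, " 0: Region: Multiply by zero\n");
fprintf(stderr, " 1: Region: Multiply by one\n");
fprintf(stderr, " 2: Region: Multiply by two\n");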
fprintf(stderr, "\n");
fprintf(stderr, "Use -1 for time(0) as a seed.\n");
fprintf(stderr, "\n");
fprintf(stderr, "For method specification, type gf_methods\n");
fprintf(stderr, "\n");
if (s != NULL) fprintf(stderr, "%s\n", s);
exit(1);
}
int main(int argc, char **argv)
{
int w, it, i, size, iterations, xor;
char tests[100];
char test;
char *single_tests = "MDI";
char *region_tests = "G012";
char *tstrings[256];
void *tmethods[256];
gf_t gf;
double timer, elapsed, ds, di, dnum;
int num;
time_t t0;
uint8_t *ra, *rb;
gf_general_t a;
if (argc < 6) usage(NULL);
if (sscanf(argv[1], "%d", &w) == 0) usage("Bad w\n");
if (sscanf(argv[3], "%ld", &t0) == 0) usage("Bad seed\n");
if (sscanf(argv[4], "%d", &size) == 0) usage("Bad size\n");
if (sscanf(argv[5], "%d", &iterations) == 0) usage("Bad iterations\n");
if (t0 == -1) t0 = time(0);
MOA_Seed(t0);
ds = size;
di = iterations;
if ((w > 32 && w != 64 && w != 128) || w <= 0) usage("Bad w");
if ((size * 8) % w != 0) usage ("Bad size -- size*8 must be a multiple of w\n");
if (!create_gf_from_argv(&gf, w, argc, argv, 6)) usage("Bad Method");
strcpy(tests, "");
for (i = 0; argv[2][i] != '\0'; i++) {
switch(argv[2][i]) {
case 'A': strcat(tests, single_tests);
strcat(tests, region_tests);
break;
case 'S': strcat(tests, single_tests); break;
case 'R': strcat(tests, region_tests); break;
case 'G': strcat(tests, "G"); break;
case '0': strcat(tests, "0"); break;
case '1': strcat(tests, "1"); break;
case '2': strcat(tests, "2"); break;
case 'M': strcat(tests, "M"); break;
case 'D': strcat(tests, "D"); break;
case 'I': strcat(tests, "I"); break;
default: usage("Bad tests");
}
}
tstrings['M'] = "Multiply";
tstrings['D'] = "Divide";
tstrings['I'] = "Inverse";
tstrings['G'] = "Region-Random";
tstrings['0'] = "Region-By-Zero";
tstrings['1'] = "Region-By-One";
tstrings['2'] = "Region-By-Two";
tmethods['M'] = (void *) gf.multiply.w32;
tmethods['D'] = (void *) gf.divide.w32;
tmethods['I'] = (void *) gf.inverse.w32;
tmethods['G'] = (void *) gf.multiply_region.w32;
tmethods['0'] = (void *) gf.multiply_region.w32;
tmethods['1'] = (void *) gf.multiply_region.w32;
tmethods['2'] = (void *) gf.multiply_region.w32;
printf("Seed: %ld\n", t0);
ra = (uint8_t *) malloc(size);
rb = (uint8_t *) malloc(size);
if (ra == NULL || rb == NULL) { perror("malloc"); exit(1); }
for (i = 0; i < 3; i++) {
test = single_tests[i];
if (strchr(tests, test) != NULL) {
if (tmethods[test] == NULL) {
printf("No %s method.\n", tstrings[test]);
} else {
elapsed = 0;
dnum = 0;
for (it = 0; it < iterations; it++) {
gf_general_set_up_single_timing_test(w, ra, rb, size);
timer_start(&timer);
num = gf_general_do_single_timing_test(&gf, ra, rb, size, test);
dnum += num;
elapsed += timer_split(&timer);
}
printf("%14s: %10.6lf s Mops: %10.3lf %10.3lf Mega-ops/s\n",
tstrings[test], elapsed,
dnum/1024.0/1024.0, dnum/1024.0/1024.0/elapsed);
}
}
}
for (i = 0; i < 4; i++) {
test = region_tests[i];
if (strchr(tests, test) != NULL) {
if (tmethods[test] == NULL) {
printf("No %s method.\n", tstrings[test]);
} else {
elapsed = 0;
if (test == '0') gf_general_set_zero(&a, w);
if (test == '1') gf_general_set_one(&a, w);
if (test == '2') gf_general_set_two(&a, w);
for (xor = 0; xor < 2; xor++) {
elapsed = 0;
for (it = 0; it < iterations; it++) {
if (test == 'G') gf_general_set_random(&a, w, 1);
gf_general_set_up_single_timing_test(8, ra, rb, size);
timer_start(&timer);
gf_general_do_region_multiply(&gf, &a, ra, rb, size, xor);
elapsed += timer_split(&timer);
}
printf("%14s: XOR: %d %10.6lf s MB: %10.3lf %10.3lf MB/s\n",
tstrings[test], xor, elapsed,
ds*di/1024.0/1024.0, ds*di/1024.0/1024.0/elapsed);
}
}
}
}
}
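
The timing loops above start the clock only after each test has been set up, and accumulate the elapsed time across iterations. A sketch of that pattern in isolation, assuming a POSIX system where `gettimeofday` is declared in `<sys/time.h>`. `demo_now`, `demo_toggle` and `demo_time_iterations` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>   /* gettimeofday and struct timeval (POSIX) */

/* Wall-clock time in seconds. */
double demo_now(void)
{
  struct timeval tv;

  gettimeofday(&tv, NULL);
  return (double) tv.tv_sec + (double) tv.tv_usec * 1e-6;
}

/* Hypothetical unit of work for the demo: flips the low bit of one byte. */
void demo_toggle(void *arg)
{
  unsigned char *p = (unsigned char *) arg;

  p[0] ^= 1;
}

/* Calls work(arg) `iterations` times and returns the seconds spent inside
   work only. setup may be NULL; if given, it runs untimed before each call. */
double demo_time_iterations(void (*setup)(void *), void (*work)(void *),
                            void *arg, int iterations)
{
  double start, elapsed = 0;
  int it;

  assert(work != NULL);
  for (it = 0; it < iterations; it++) {
    if (setup != NULL) setup(arg);
    start = demo_now();
    work(arg);
    elapsed += demo_now() - start;
  }
  return elapsed;
}
```

Because `gettimeofday` reports wall-clock time, a clock adjustment during the run can distort a measurement; a monotonic clock would avoid that at the cost of portability.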

BIN
gf_unit Executable file

Binary file not shown.

222
gf_unit.c Normal file

@ -0,0 +1,222 @@
/*
* gf_unit.c
*
* Performs unit testing for gf arithmetic
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include "gf.h"
#include "gf_int.h"
#include "gf_method.h"
#include "gf_rand.h"
#include "gf_general.h"
#define REGION_SIZE (16384)
void problem(char *s)
{
fprintf(stderr, "Unit test failed.\n");
fprintf(stderr, "%s\n", s);
exit(1);
}
void usage(char *s)
{
fprintf(stderr, "usage: gf_unit w tests seed [method] - does unit testing in GF(2^w)\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal w are: 1 - 32, 64 and 128\n");
fprintf(stderr, "\n");
fprintf(stderr, "Tests may be any combination of:\n");
fprintf(stderr, " A: All\n");
fprintf(stderr, " S: Single operations (multiplication/division)\n");
fprintf(stderr, " R: Region operations\n");
fprintf(stderr, " V: Verbose Output\n");
fprintf(stderr, "\n");
fprintf(stderr, "Use -1 for time(0) as a seed.\n");
fprintf(stderr, "\n");
fprintf(stderr, "For method specification, type gf_methods\n");
fprintf(stderr, "\n");
if (s != NULL) fprintf(stderr, "%s\n", s);
exit(1);
}
int main(int argc, char **argv)
{
int w, i, verbose, single, region, tested, top;
int start, end, xor;
gf_t gf, gf_def;
time_t t0;
gf_internal_t *h;
gf_general_t *a, *b, *c, *d, *ai, *bi;
char as[50], bs[50], cs[50], ds[50], ais[50], bis[50];
uint32_t mask;
char *ra, *rb, *rc, *rd, *target;
int align;
if (argc < 4) usage(NULL);
if (sscanf(argv[1], "%d", &w) == 0) usage("Bad w\n");
if (sscanf(argv[3], "%ld", &t0) == 0) usage("Bad seed\n");
if (t0 == -1) t0 = time(0);
MOA_Seed(t0);
if (w <= 0 || (w > 32 && w != 64 && w != 128)) usage("Bad w");
if (create_gf_from_argv(&gf, w, argc, argv, 4) == 0) usage("Bad Method");
for (i = 0; i < strlen(argv[2]); i++) {
if (strchr("ASRV", argv[2][i]) == NULL) usage("Bad test\n");
}
h = (gf_internal_t *) gf.scratch;
a = (gf_general_t *) malloc(sizeof(gf_general_t));
b = (gf_general_t *) malloc(sizeof(gf_general_t));
c = (gf_general_t *) malloc(sizeof(gf_general_t));
d = (gf_general_t *) malloc(sizeof(gf_general_t));
ai = (gf_general_t *) malloc(sizeof(gf_general_t));
bi = (gf_general_t *) malloc(sizeof(gf_general_t));
ra = (char *) malloc(sizeof(char)*REGION_SIZE);
rb = (char *) malloc(sizeof(char)*REGION_SIZE);
rc = (char *) malloc(sizeof(char)*REGION_SIZE);
rd = (char *) malloc(sizeof(char)*REGION_SIZE);
if (w <= 32) {
mask = 0;
for (i = 0; i < w; i++) mask |= ((uint32_t) 1 << i);
}
verbose = (strchr(argv[2], 'V') != NULL);
single = (strchr(argv[2], 'S') != NULL || strchr(argv[2], 'A') != NULL);
region = (strchr(argv[2], 'R') != NULL || strchr(argv[2], 'A') != NULL);
if (!gf_init_easy(&gf_def, w, GF_MULT_DEFAULT)) problem("No default for this value of w");
if (verbose) printf("Seed: %ld\n", t0);
if (single) {
if (gf.multiply.w32 == NULL) problem("No multiplication operation defined.");
if (verbose) { printf("Testing single multiplications/divisions.\n"); fflush(stdout); }
if (w <= 10) {
top = (1 << w)*(1 << w);
} else {
top = 1024*1024;
}
for (i = 0; i < top; i++) {
if (w <= 10) {
a->w32 = i % (1 << w);
b->w32 = (i >> w);
} else if (i < 10) {
gf_general_set_zero(a, w);
gf_general_set_random(b, w, 1);
} else if (i < 20) {
gf_general_set_random(a, w, 1);
gf_general_set_zero(b, w);
} else if (i < 30) {
gf_general_set_one(a, w);
gf_general_set_random(b, w, 1);
} else if (i < 40) {
gf_general_set_random(a, w, 1);
gf_general_set_one(b, w);
} else {
gf_general_set_random(a, w, 1);
gf_general_set_random(b, w, 1);
}
tested = 0;
gf_general_multiply(&gf, a, b, c);
/* If this is not composite, then first test against the default: */
if (h->mult_type != GF_MULT_COMPOSITE) {
tested = 1;
gf_general_multiply(&gf_def, a, b, d);
if (!gf_general_are_equal(c, d, w)) {
gf_general_val_to_s(a, w, as);
gf_general_val_to_s(b, w, bs);
gf_general_val_to_s(c, w, cs);
gf_general_val_to_s(d, w, ds);
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply(gf, %s, %s) = %s\n", as, bs, cs);
printf(" The default gf multiplier returned %s\n", ds);
exit(1);
}
}
/* Now, we also need to double-check by other means, in case the default is wonky,
and when we're performing composite operations. Start with 0 and 1, where we know
what the result should be. */
if (gf_general_is_zero(a, w) || gf_general_is_zero(b, w) ||
gf_general_is_one(a, w) || gf_general_is_one(b, w)) {
tested = 1;
if (((gf_general_is_zero(a, w) || gf_general_is_zero(b, w)) && !gf_general_is_zero(c, w)) ||
(gf_general_is_one(a, w) && !gf_general_are_equal(b, c, w)) ||
(gf_general_is_one(b, w) && !gf_general_are_equal(a, c, w))) {
gf_general_val_to_s(a, w, as);
gf_general_val_to_s(b, w, bs);
gf_general_val_to_s(c, w, cs);
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply(gf, %s, %s) = %s, which is clearly wrong.\n", as, bs, cs);
exit(1);
}
}
/* Dumb check to make sure that it's not returning numbers that are too big: */
if (w < 32 && (c->w32 & mask) != c->w32) {
gf_general_val_to_s(a, w, as);
gf_general_val_to_s(b, w, bs);
gf_general_val_to_s(c, w, cs);
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32(gf, %s, %s) = %s, which is too big.\n", as, bs, cs);
exit(1);
}
}
}
if (region) {
if (verbose) { printf("Testing region multiplications\n"); fflush(stdout); }
for (i = 0; i < 1000; i++) {
if (i < 20) {
gf_general_set_zero(a, w);
} else if (i < 40) {
gf_general_set_one(a, w);
} else {
gf_general_set_random(a, w, 1);
}
MOA_Fill_Random_Region(ra, REGION_SIZE);
MOA_Fill_Random_Region(rb, REGION_SIZE);
xor = i%2;
align = w/8;
if (align == 0) align = 1;
if (align > 16) align = 16;
if ((h->region_type & GF_REGION_CAUCHY) || (w < 32 && w != 4 && w != 8 && w != 16)) {
start = MOA_Random_W(5, 1);
end = REGION_SIZE - MOA_Random_W(5, 1);
target = rb;
while ((end-start)%w != 0) end--;
} else {
start = MOA_Random_W(5, 1) * align;
end = REGION_SIZE - (MOA_Random_W(5, 1) * align);
if (h->mult_type == GF_MULT_COMPOSITE && (h->region_type & GF_REGION_ALTMAP)) {
target = rb ;
} else {
target = ((i%4)/2) ? rb : ra;
}
}
memcpy(rc, ra, REGION_SIZE);
memcpy(rd, target, REGION_SIZE);
gf_general_do_region_multiply(&gf, a, ra+start, target+start, end-start, xor);
gf_general_do_region_check(&gf, a, rc+start, rd+start, target+start, end-start, xor);
}
}
}
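
For w <= 10 the single-operation test above enumerates every pair (a, b) by splitting the loop counter into a = i % 2^w and b = i >> w, then checks results whose value is known in advance (anything times 0 is 0, anything times 1 is itself). A self-contained sketch of that idea against a simple bit-by-bit multiplier. `demo_gf_mult` and `demo_exhaustive_check` are hypothetical names, and the polynomial is passed in explicitly because the library's defaults are not shown in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-by-bit multiply in GF(2^w) for 1 <= w <= 16 and a, b < 2^w.
   poly is the full polynomial, including its x^w term. */
uint32_t demo_gf_mult(uint32_t a, uint32_t b, int w, uint32_t poly)
{
  uint32_t p = 0;
  int i;

  assert(w >= 1 && w <= 16);
  for (i = 0; i < w; i++) {
    if (b & (UINT32_C(1) << i)) p ^= a;     /* add a * x^i when bit i of b is set */
    a <<= 1;                                /* a becomes a * x ... */
    if (a & (UINT32_C(1) << w)) a ^= poly;  /* ... reduced modulo poly */
  }
  return p;
}

/* Checks every pair (a, b) in GF(2^w), w <= 10, and returns the number of
   violated properties: range, zero, identity and commutativity. */
int demo_exhaustive_check(int w, uint32_t poly)
{
  uint32_t n, i, a, b, c;
  int bad = 0;

  assert(w >= 1 && w <= 10);
  n = UINT32_C(1) << w;
  for (i = 0; i < n * n; i++) {
    a = i % n;
    b = i >> w;
    c = demo_gf_mult(a, b, w, poly);
    if (c >= n) bad++;                               /* result too big */
    if ((a == 0 || b == 0) && c != 0) bad++;         /* zero rule */
    if (a == 1 && c != b) bad++;                     /* identity on the left */
    if (b == 1 && c != a) bad++;                     /* identity on the right */
    if (c != demo_gf_mult(b, a, w, poly)) bad++;     /* commutativity */
  }
  return bad;
}
```

With w = 4 and the polynomial x^4 + x + 1 (0x13), x times x^3 reduces to x + 1, so `demo_gf_mult(2, 8, 4, 0x13)` gives 3. These property checks cannot confirm that a particular polynomial is in use, which is why the original also compares against the default implementation.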

496
gf_w128.c Normal file

@ -0,0 +1,496 @@
/*
* gf_w128.c
*
* Routines for 128-bit Galois fields
*/
#include "gf_int.h"
#include <stdio.h>
#include <stdlib.h>
#define GF_FIELD_WIDTH (128)
#define two_x(a) {\
a[0] <<= 1; \
if (a[1] & (uint64_t) 1 << 63) a[0] ^= 1; \
a[1] <<= 1; }
#define a_get_b(a, i, b, j) {\
a[i] = b[j]; \
a[i + 1] = b[j + 1];}
#define set_zero(a, i) {\
a[i] = 0; \
a[i + 1] = 0;}
typedef struct gf_group_tables_s {
gf_val_128_t m_table;
gf_val_128_t r_table;
} gf_group_tables_t;
static
void
gf_w128_multiply_region_from_single(gf_t *gf, void *src, void *dest, gf_val_128_t val, int bytes,
int xor)
{
int i;
gf_val_128_t s128;
gf_val_128_t d128;
uint64_t c128[2];
set_zero(c128, 0);
s128 = (gf_val_128_t) src;
d128 = (gf_val_128_t) dest;
if (xor) {
for (i = 0; i < bytes/sizeof(gf_val_64_t); i += 2) {
gf->multiply.w128(gf, &s128[i], val, c128);
d128[i] ^= c128[0];
d128[i+1] ^= c128[1];
}
} else {
for (i = 0; i < bytes/sizeof(gf_val_64_t); i += 2) {
gf->multiply.w128(gf, &s128[i], val, &d128[i]);
}
}
}
/*
* Some w128 notes:
* --Big Endian
* --return values allocated beforehand
*/
void
gf_w128_shift_multiply(gf_t *gf, gf_val_128_t a128, gf_val_128_t b128, gf_val_128_t c128)
{
/* ordered highest bit to lowest l[0] l[1] r[0] r[1] */
uint64_t pl[2], pr[2], ppl[2], ppr[2], i, a[2], bl[2], br[2], one, lbit;
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (GF_W128_IS_ZERO(a128) || GF_W128_IS_ZERO(b128)) {
set_zero(c128, 0);
return;
}
a_get_b(a, 0, a128, 0);
a_get_b(br, 0, b128, 0);
set_zero(bl, 0);
one = 1;
lbit = (one << 63);
set_zero(pl, 0);
set_zero(pr, 0);
for (i = 0; i < GF_FIELD_WIDTH/2; i++) {
if (a[1] & (one << i)) {
pl[1] ^= bl[1];
pr[0] ^= br[0];
pr[1] ^= br[1];
}
bl[1] <<= 1;
if (br[0] & lbit) bl[1] ^= 1;
br[0] <<= 1;
if (br[1] & lbit) br[0] ^= 1;
br[1] <<= 1;
}
for (i = 0; i < GF_FIELD_WIDTH/2; i++) {
if (a[0] & (one << i)) {
pl[0] ^= bl[0];
pl[1] ^= bl[1];
pr[0] ^= br[0];
}
bl[0] <<= 1;
if (bl[1] & lbit) bl[0] ^= 1;
bl[1] <<= 1;
if (br[0] & lbit) bl[1] ^= 1;
br[0] <<= 1;
}
one = lbit;
ppl[0] = lbit;
ppl[1] = h->prim_poly >> 1;
ppr[0] = lbit;
ppr[1] = 0;
while (one != 0) {
if (pl[0] & one) {
pl[0] ^= ppl[0];
pl[1] ^= ppl[1];
pr[0] ^= ppr[0];
pr[1] ^= ppr[1];
}
one >>= 1;
ppr[1] >>= 1;
if (ppr[0] & 1) ppr[1] ^= lbit;
ppr[0] >>= 1;
if (ppl[1] & 1) ppr[0] ^= lbit;
ppl[1] >>= 1;
if (ppl[0] & 1) ppl[1] ^= lbit;
ppl[0] >>= 1;
}
one = lbit;
while (one != 0) {
if (pl[1] & one) {
pl[1] ^= ppl[1];
pr[0] ^= ppr[0];
pr[1] ^= ppr[1];
}
one >>= 1;
ppr[1] >>= 1;
if (ppr[0] & 1) ppr[1] ^= lbit;
ppr[0] >>= 1;
if (ppl[1] & 1) ppr[0] ^= lbit;
ppl[1] >>= 1;
}
c128[0] = pr[0];
c128[1] = pr[1];
return;
}
static
void gf_w128_group_m_init(gf_t *gf, gf_val_128_t b128)
{
int i, j;
int g_m;
uint64_t prim_poly, lbit;
gf_internal_t *scratch;
gf_group_tables_t *gt;
uint64_t a128[2];
scratch = (gf_internal_t *) gf->scratch;
gt = scratch->private;
g_m = scratch->arg1;
prim_poly = scratch->prim_poly;
set_zero(gt->m_table, 0);
a_get_b(gt->m_table, 2, b128, 0);
lbit = 1;
lbit <<= 63;
for (i = 2; i < (1 << g_m); i <<= 1) {
a_get_b(a128, 0, gt->m_table, 2 * (i >> 1));
two_x(a128);
a_get_b(gt->m_table, 2 * i, a128, 0);
if (gt->m_table[2 * (i >> 1)] & lbit) gt->m_table[(2 * i) + 1] ^= prim_poly;
for (j = 0; j < i; j++) {
gt->m_table[(2 * i) + (2 * j)] = gt->m_table[(2 * i)] ^ gt->m_table[(2 * j)];
gt->m_table[(2 * i) + (2 * j) + 1] = gt->m_table[(2 * i) + 1] ^ gt->m_table[(2 * j) + 1];
}
}
return;
}
void
gf_w128_group_multiply(GFP gf, gf_val_128_t a128, gf_val_128_t b128, gf_val_128_t c128)
{
int i;
/* index_r, index_m, total_m (if g_r > g_m) */
int i_r, i_m, t_m;
int mask_m, mask_r;
int g_m, g_r;
uint64_t p_i[2], a[2];
gf_internal_t *scratch;
gf_group_tables_t *gt;
scratch = (gf_internal_t *) gf->scratch;
gt = scratch->private;
g_m = scratch->arg1;
g_r = scratch->arg2;
mask_m = (1 << g_m) - 1;
mask_r = (1 << g_r) - 1;
if (b128[0] != gt->m_table[2] || b128[1] != gt->m_table[3]) {
gf_w128_group_m_init(gf, b128);
}
p_i[0] = 0;
p_i[1] = 0;
a[0] = a128[0];
a[1] = a128[1];
t_m = 0;
i_r = 0;
/* Top 64 bits */
for (i = ((GF_FIELD_WIDTH / 2) / g_m) - 1; i >= 0; i--) {
i_m = (a[0] >> (i * g_m)) & mask_m;
i_r ^= (p_i[0] >> (64 - g_m)) & mask_r;
p_i[0] <<= g_m;
p_i[0] ^= (p_i[1] >> (64-g_m));
p_i[1] <<= g_m;
p_i[0] ^= gt->m_table[2 * i_m];
p_i[1] ^= gt->m_table[(2 * i_m) + 1];
t_m += g_m;
if (t_m == g_r) {
p_i[1] ^= gt->r_table[i_r];
t_m = 0;
i_r = 0;
} else {
i_r <<= g_m;
}
}
for (i = ((GF_FIELD_WIDTH / 2) / g_m) - 1; i >= 0; i--) {
i_m = (a[1] >> (i * g_m)) & mask_m;
i_r ^= (p_i[0] >> (64 - g_m)) & mask_r;
p_i[0] <<= g_m;
p_i[0] ^= (p_i[1] >> (64-g_m));
p_i[1] <<= g_m;
p_i[0] ^= gt->m_table[2 * i_m];
p_i[1] ^= gt->m_table[(2 * i_m) + 1];
t_m += g_m;
if (t_m == g_r) {
p_i[1] ^= gt->r_table[i_r];
t_m = 0;
i_r = 0;
} else {
i_r <<= g_m;
}
}
c128[0] = p_i[0];
c128[1] = p_i[1];
}
/* a^-1 -> b */
void
gf_w128_euclid(GFP gf, gf_val_128_t a128, gf_val_128_t b128)
{
uint64_t e_i[2], e_im1[2], e_ip1[2];
uint64_t d_i, d_im1, d_ip1;
uint64_t y_i[2], y_im1[2], y_ip1[2];
uint64_t c_i[2];
uint64_t *b;
uint64_t one = 1;
uint64_t buf, buf1;
/* This needs to return some sort of error (in b128?) */
if (a128[0] == 0 && a128[1] == 0) return;
e_im1[0] = 0;
e_im1[1] = ((gf_internal_t *) (gf->scratch))->prim_poly;
e_i[0] = a128[0];
e_i[1] = a128[1];
d_im1 = 128;
for (d_i = (d_im1-1) % 64; ((one << d_i) & e_i[0]) == 0 && d_i > 0; d_i--) ;
if (!((one << d_i) & e_i[0])) {
for (d_i = (d_im1-1) % 64; ((one << d_i) & e_i[1]) == 0; d_i--) ;
} else {
d_i += 64;
}
y_i[0] = 0;
y_i[1] = 1;
y_im1[0] = 0;
y_im1[1] = 0;
while (!(e_i[0] == 0 && e_i[1] == 1)) {
e_ip1[0] = e_im1[0];
e_ip1[1] = e_im1[1];
d_ip1 = d_im1;
c_i[0] = 0;
c_i[1] = 0;
while (d_ip1 >= d_i) {
if ((d_ip1 - d_i) >= 64) {
c_i[0] ^= (one << ((d_ip1 - d_i) - 64));
e_ip1[0] ^= (e_i[1] << ((d_ip1 - d_i) - 64));
} else {
c_i[1] ^= (one << (d_ip1 - d_i));
e_ip1[0] ^= (e_i[0] << (d_ip1 - d_i));
if (d_ip1 - d_i > 0) e_ip1[0] ^= (e_i[1] >> (64 - (d_ip1 - d_i)));
e_ip1[1] ^= (e_i[1] << (d_ip1 - d_i));
}
d_ip1--;
while (d_ip1 >= 64 && (e_ip1[0] & (one << (d_ip1 - 64))) == 0) d_ip1--;
while (d_ip1 < 64 && (e_ip1[1] & (one << d_ip1)) == 0) d_ip1--;
}
gf->multiply.w128(gf, c_i, y_i, y_ip1);
y_ip1[0] ^= y_im1[0];
y_ip1[1] ^= y_im1[1];
y_im1[0] = y_i[0];
y_im1[1] = y_i[1];
y_i[0] = y_ip1[0];
y_i[1] = y_ip1[1];
e_im1[0] = e_i[0];
e_im1[1] = e_i[1];
d_im1 = d_i;
e_i[0] = e_ip1[0];
e_i[1] = e_ip1[1];
d_i = d_ip1;
}
b = (uint64_t *) b128;
b[0] = y_i[0];
b[1] = y_i[1];
return;
}
void
gf_w128_divide_from_inverse(GFP gf, gf_val_128_t a128, gf_val_128_t b128, gf_val_128_t c128)
{
uint64_t d[2];
gf->inverse.w128(gf, b128, d);
gf->multiply.w128(gf, a128, d, c128);
return;
}
void
gf_w128_inverse_from_divide(GFP gf, gf_val_128_t a128, gf_val_128_t b128)
{
uint64_t one128[2];
one128[0] = 0;
one128[1] = 1;
gf->divide.w128(gf, one128, a128, b128);
return;
}
static
int gf_w128_shift_init(gf_t *gf)
{
gf->multiply.w128 = gf_w128_shift_multiply;
gf->inverse.w128 = gf_w128_euclid;
gf->multiply_region.w128 = gf_w128_multiply_region_from_single;
return 1;
}
/*
* Because the prim poly is only 8 bits and we are limiting g_r to 16, I do not need the high 64
* bits in all of these numbers.
*/
static
void gf_w128_group_r_init(gf_t *gf)
{
int i, j;
int g_r;
uint64_t pp;
gf_internal_t *scratch;
gf_group_tables_t *gt;
scratch = (gf_internal_t *) gf->scratch;
gt = scratch->private;
g_r = scratch->arg2;
pp = scratch->prim_poly;
gt->r_table[0] = 0;
for (i = 1; i < (1 << g_r); i++) {
gt->r_table[i] = 0;
for (j = 0; j < g_r; j++) {
if (i & (1 << j)) {
gt->r_table[i] ^= (pp << j);
}
}
}
return;
}
static
int gf_w128_group_init(gf_t *gf)
{
gf_internal_t *scratch;
gf_group_tables_t *gt;
int g_m, g_r, size_r;
scratch = (gf_internal_t *) gf->scratch;
gt = scratch->private;
g_m = scratch->arg1;
g_r = scratch->arg2;
size_r = (1 << g_r);
gt->r_table = scratch->private + (2 * sizeof(uint64_t *));
gt->m_table = gt->r_table + size_r;
gt->m_table[2] = 0;
gt->m_table[3] = 0;
gf_w128_group_r_init(gf);
gf->multiply.w128 = gf_w128_group_multiply;
gf->inverse.w128 = gf_w128_euclid;
gf->multiply_region.w128 = gf_w128_multiply_region_from_single; /* This needs to change */
return 1;
}
int gf_w128_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2)
{
int size_m, size_r;
int w = 128;
switch(mult_type)
{
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT:
if (arg1 != 0 || arg2 != 0 || region_type != 0) return -1;
return sizeof(gf_internal_t);
break;
case GF_MULT_GROUP:
/* arg1 == mult size, arg2 == reduce size */
/* Reject arg1 > 16 or arg2 > 16 */
if (region_type != 0) return -1;
if (arg1 <= 0 || arg2 <= 0 || arg1 > 16 || arg2 > 16) return -1;
if (GF_FIELD_WIDTH % arg1 != 0 || GF_FIELD_WIDTH % arg2 != 0) return -1;
/*
* Currently implementing code where g_m and g_r are the same or where g_r is larger, since
* it is more efficient to have g_r as large as possible (but still not > 16)
*/
if (arg1 > arg2) return -1;
/* size of each group, 128 bits */
size_m = (1 << arg1) * 2 * sizeof(uint64_t);
/* The PP is only 8 bits and we are limiting g_r to 16, so only uint64_t */
size_r = (1 << arg2) * sizeof(uint64_t);
/*
* two pointers prepend the table data for structure
* because the tables are of dynamic size
*/
return sizeof(gf_internal_t) + size_m + size_r + 2 * sizeof(uint64_t *);
default:
return -1;
}
}
int gf_w128_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (h->prim_poly == 0) h->prim_poly = 0x87; /* Omitting the leftmost 1 as in w=32 */
gf->multiply.w128 = NULL;
gf->divide.w128 = NULL;
gf->inverse.w128 = NULL;
gf->multiply_region.w128 = NULL;
switch(h->mult_type) {
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT: if (gf_w128_shift_init(gf) == 0) return 0; break;
case GF_MULT_GROUP: if (gf_w128_group_init(gf) == 0) return 0; break;
default: return 0;
}
if (h->divide_type == GF_DIVIDE_EUCLID) {
gf->divide.w128 = gf_w128_divide_from_inverse;
gf->inverse.w128 = gf_w128_euclid;
} /* } else if (h->divide_type == GF_DIVIDE_MATRIX) {
gf->divide.w128 = gf_w128_divide_from_inverse;
gf->inverse.w128 = gf_w128_matrix;
} */
if (gf->inverse.w128 != NULL && gf->divide.w128 == NULL) {
gf->divide.w128 = gf_w128_divide_from_inverse;
}
if (gf->inverse.w128 == NULL && gf->divide.w128 != NULL) {
gf->inverse.w128 = gf_w128_inverse_from_divide;
}
return 1;
}
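
The reduction table that gf_w128_group_r_init fills in can be looked at in isolation. The sketch below is a standalone approximation and not library code: `sketch_build_r_table` and `sketch_r_table_check` are hypothetical names, and it assumes the default w=128 polynomial 0x87 with g_r of at most 16, so that every entry fits in a uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not in the library): entry i is the carry-less
   product of i and pp, i.e. the XOR of (pp << j) for every bit j set in i. */
static void sketch_build_r_table(uint64_t *r_table, int g_r, uint64_t pp)
{
  int i, j;

  for (i = 0; i < (1 << g_r); i++) {
    r_table[i] = 0;
    for (j = 0; j < g_r; j++) {
      if (i & (1 << j)) r_table[i] ^= (pp << j);
    }
  }
}

/* Small self-check with g_r = 4 and pp = 0x87 (both assumed for illustration). */
static void sketch_r_table_check(void)
{
  uint64_t t[16];

  sketch_build_r_table(t, 4, 0x87);
  assert(t[0] == 0);
  assert(t[1] == 0x87);
  assert(t[3] == (0x87 ^ (0x87 << 1)));
}
```

Indexing the table with the g_r bits that overflow the field gives the value to XOR back in, which is why a larger g_r means fewer reduction steps per multiplication.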

1941
gf_w16.c Normal file

File diff suppressed because it is too large

2350
gf_w32.c Normal file

File diff suppressed because it is too large

2006
gf_w4.c Normal file

File diff suppressed because it is too large

206
gf_w64.c Normal file

@ -0,0 +1,206 @@
/*
* gf_w64.c
*
* Routines for 64-bit Galois fields
*/
#include "gf_int.h"
#include <stdio.h>
#include <stdlib.h>
#define GF_FIELD_WIDTH (64)
static
inline
gf_val_64_t gf_w64_inverse_from_divide (gf_t *gf, gf_val_64_t a)
{
return gf->divide.w64(gf, 1, a);
}
static
inline
gf_val_64_t gf_w64_divide_from_inverse (gf_t *gf, gf_val_64_t a, gf_val_64_t b)
{
b = gf->inverse.w64(gf, b);
return gf->multiply.w64(gf, a, b);
}
static
void
gf_w64_multiply_region_from_single(gf_t *gf, void *src, void *dest, gf_val_64_t val, int bytes, int xor)
{
int i;
gf_val_64_t *s64;
gf_val_64_t *d64;
s64 = (gf_val_64_t *) src;
d64 = (gf_val_64_t *) dest;
if (xor) {
for (i = 0; i < bytes/sizeof(gf_val_64_t); i++) {
d64[i] ^= gf->multiply.w64(gf, val, s64[i]);
}
} else {
for (i = 0; i < bytes/sizeof(gf_val_64_t); i++) {
d64[i] = gf->multiply.w64(gf, val, s64[i]);
}
}
}
static
inline
gf_val_64_t gf_w64_euclid (gf_t *gf, gf_val_64_t b)
{
gf_val_64_t e_i, e_im1, e_ip1;
gf_val_64_t d_i, d_im1, d_ip1;
gf_val_64_t y_i, y_im1, y_ip1;
gf_val_64_t c_i;
gf_val_64_t one = 1;
if (b == 0) return -1;
e_im1 = ((gf_internal_t *) (gf->scratch))->prim_poly;
e_i = b;
d_im1 = 64;
for (d_i = d_im1-1; ((one << d_i) & e_i) == 0; d_i--) ;
y_i = 1;
y_im1 = 0;
while (e_i != 1) {
e_ip1 = e_im1;
d_ip1 = d_im1;
c_i = 0;
while (d_ip1 >= d_i) {
c_i ^= (one << (d_ip1 - d_i));
e_ip1 ^= (e_i << (d_ip1 - d_i));
d_ip1--;
while ((e_ip1 & (one << d_ip1)) == 0) d_ip1--;
}
y_ip1 = y_im1 ^ gf->multiply.w64(gf, c_i, y_i);
y_im1 = y_i;
y_i = y_ip1;
e_im1 = e_i;
d_im1 = d_i;
e_i = e_ip1;
d_i = d_ip1;
}
return y_i;
}
/* JSP: GF_MULT_SHIFT: The world's dumbest multiplication algorithm. I only
include it for completeness. It does have the feature that it requires no
extra memory.
*/
static
inline
gf_val_64_t
gf_w64_shift_multiply (gf_t *gf, gf_val_64_t a64, gf_val_64_t b64)
{
uint64_t pl, pr, ppl, ppr, i, pp, a, bl, br, one, lbit;
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
ppr = h->prim_poly;
ppl = 1;
a = a64;
bl = 0;
br = b64;
one = 1;
lbit = (one << 63);
pl = 0;
pr = 0;
for (i = 0; i < GF_FIELD_WIDTH; i++) {
if (a & (one << i)) {
pl ^= bl;
pr ^= br;
}
/* printf("P: %016llx %016llx ", pl, pr); printf("B: %016llx %016llx\n", bl, br); */
bl <<= 1;
if (br & lbit) bl ^= 1;
br <<= 1;
}
one = lbit;
ppl = ((h->prim_poly >> 1) | lbit);
ppr = lbit;
while (one != 0) {
if (pl & one) {
pl ^= ppl;
pr ^= ppr;
}
one >>= 1;
ppr >>= 1;
if (ppl & 1) ppr ^= lbit;
ppl >>= 1;
}
return pr;
}
static
int gf_w64_shift_init(gf_t *gf)
{
gf->multiply.w64 = gf_w64_shift_multiply;
gf->inverse.w64 = gf_w64_euclid;
gf->multiply_region.w64 = gf_w64_multiply_region_from_single;
return 1;
}
int gf_w64_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2)
{
if (divide_type == GF_DIVIDE_MATRIX) return -1;
switch(mult_type)
{
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT:
if (arg1 != 0 || arg2 != 0 || region_type != 0) return -1;
return sizeof(gf_internal_t);
break;
default:
return -1;
}
}
int gf_w64_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (h->prim_poly == 0) h->prim_poly = 0x1b; /* Omitting the leftmost 1 as in w=32 */
gf->multiply.w64 = NULL;
gf->divide.w64 = NULL;
gf->inverse.w64 = NULL;
gf->multiply_region.w64 = NULL;
switch(h->mult_type) {
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT: if (gf_w64_shift_init(gf) == 0) return 0; break;
default: return 0;
}
if (h->divide_type == GF_DIVIDE_EUCLID) {
gf->divide.w64 = gf_w64_divide_from_inverse;
gf->inverse.w64 = gf_w64_euclid;
}
/* else if (h->divide_type == GF_DIVIDE_MATRIX) {
gf->divide.w64 = gf_w64_divide_from_inverse;
gf->inverse.w64 = gf_w64_matrix;
} */
if (gf->inverse.w64 != NULL && gf->divide.w64 == NULL) {
gf->divide.w64 = gf_w64_divide_from_inverse;
}
if (gf->inverse.w64 == NULL && gf->divide.w64 != NULL) {
gf->inverse.w64 = gf_w64_inverse_from_divide;
}
return 1;
}
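
The GF_MULT_SHIFT routine above works in two phases: a carry-less 64x64 multiplication into a 128-bit product, then a bit-by-bit reduction of the high half. The following is a minimal standalone sketch of that idea, not the library's code: the function names are hypothetical, the polynomial is passed in explicitly (0x1b is the default set in gf_w64_init, with the leading x^64 term omitted), and the reduction is written as "fold bit 64+i back in as pp * x^i" rather than by sliding the polynomial down as the original does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply a and b in GF(2^64) modulo x^64 + pp. */
static uint64_t gf64_shift_multiply_sketch(uint64_t a, uint64_t b, uint64_t pp)
{
  uint64_t pl = 0, pr = 0;   /* high and low halves of the 128-bit product */
  uint64_t bl = 0, br = b;   /* b shifted left by i bits, as a 128-bit value */
  int i;

  /* Phase 1: carry-less multiplication (XOR instead of addition). */
  for (i = 0; i < 64; i++) {
    if (a & ((uint64_t) 1 << i)) {
      pl ^= bl;
      pr ^= br;
    }
    bl = (bl << 1) | (br >> 63);
    br <<= 1;
  }

  /* Phase 2: reduction. Since x^64 = pp (mod the polynomial), bit 64+i of
     the product is replaced by pp * x^i, working from the top bit down. */
  for (i = 63; i >= 0; i--) {
    if (pl & ((uint64_t) 1 << i)) {
      pl ^= ((uint64_t) 1 << i);
      if (i > 0) pl ^= (pp >> (64 - i));   /* part of pp << i above bit 63 */
      pr ^= (pp << i);
    }
  }
  return pr;
}

/* Small self-check using identities that hold in any field GF(2^64). */
static void gf64_sketch_self_check(void)
{
  uint64_t a = 0x123456789abcdef0ULL, b = 0x0fedcba987654321ULL;

  assert(gf64_shift_multiply_sketch(a, 1, 0x1b) == a);
  assert(gf64_shift_multiply_sketch(a, b, 0x1b) ==
         gf64_shift_multiply_sketch(b, a, 0x1b));
}
```

Like the original, this needs no tables or extra memory, which is the one advantage the comment in the source claims for the method.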

1837
gf_w8.c Normal file

File diff suppressed because it is too large

945
gf_wgen.c Normal file

@ -0,0 +1,945 @@
/*
* gf_wgen.c
*
* Routines for Galois fields for general w <= 32. For specific w,
* like 4, 8, 16, 32, 64 and 128, see the other files.
*/
#include "gf_int.h"
#include <stdio.h>
#include <stdlib.h>
struct gf_wgen_table_w8_data {
uint8_t *mult;
uint8_t *div;
uint8_t base;
};
struct gf_wgen_table_w16_data {
uint16_t *mult;
uint16_t *div;
uint16_t base;
};
struct gf_wgen_log_w8_data {
uint8_t *log;
uint8_t *anti;
uint8_t *danti;
uint8_t base;
};
struct gf_wgen_log_w16_data {
uint16_t *log;
uint16_t *anti;
uint16_t *danti;
uint16_t base;
};
struct gf_wgen_log_w32_data {
uint32_t *log;
uint32_t *anti;
uint32_t *danti;
uint32_t base;
};
struct gf_wgen_group_data {
uint32_t *reduce;
uint32_t *shift;
uint32_t mask;
uint64_t rmask;
int tshift;
uint32_t memory;
};
static
inline
gf_val_32_t gf_wgen_inverse_from_divide (gf_t *gf, gf_val_32_t a)
{
return gf->divide.w32(gf, 1, a);
}
static
inline
gf_val_32_t gf_wgen_divide_from_inverse (gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
b = gf->inverse.w32(gf, b);
return gf->multiply.w32(gf, a, b);
}
static
inline
gf_val_32_t gf_wgen_euclid (gf_t *gf, gf_val_32_t b)
{
gf_val_32_t e_i, e_im1, e_ip1;
gf_val_32_t d_i, d_im1, d_ip1;
gf_val_32_t y_i, y_im1, y_ip1;
gf_val_32_t c_i;
if (b == 0) return -1;
e_im1 = ((gf_internal_t *) (gf->scratch))->prim_poly;
e_i = b;
d_im1 = ((gf_internal_t *) (gf->scratch))->w;
for (d_i = d_im1; ((1 << d_i) & e_i) == 0; d_i--) ;
y_i = 1;
y_im1 = 0;
while (e_i != 1) {
e_ip1 = e_im1;
d_ip1 = d_im1;
c_i = 0;
while (d_ip1 >= d_i) {
c_i ^= (1 << (d_ip1 - d_i));
e_ip1 ^= (e_i << (d_ip1 - d_i));
while ((e_ip1 & (1 << d_ip1)) == 0) d_ip1--;
}
y_ip1 = y_im1 ^ gf->multiply.w32(gf, c_i, y_i);
y_im1 = y_i;
y_i = y_ip1;
e_im1 = e_i;
d_im1 = d_i;
e_i = e_ip1;
d_i = d_ip1;
}
return y_i;
}
gf_val_32_t gf_wgen_extract_word(gf_t *gf, void *start, int bytes, int index)
{
uint8_t *ptr;
uint32_t rv;
int rs;
int byte, bit, i;
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
rs = bytes / h->w;
byte = index/8;
bit = index%8;
ptr = (uint8_t *) start;
ptr += bytes;
ptr -= rs;
ptr += byte;
rv = 0;
for (i = 0; i < h->w; i++) {
rv <<= 1;
if ((*ptr) & (1 << bit)) rv |= 1;
ptr -= rs;
}
return rv;
}
static
inline
gf_val_32_t gf_wgen_matrix (gf_t *gf, gf_val_32_t b)
{
return gf_bitmatrix_inverse(b, ((gf_internal_t *) (gf->scratch))->w,
((gf_internal_t *) (gf->scratch))->prim_poly);
}
static
inline
uint32_t
gf_wgen_shift_multiply (gf_t *gf, uint32_t a32, uint32_t b32)
{
uint64_t product, i, pp, a, b, one;
gf_internal_t *h;
a = a32;
b = b32;
h = (gf_internal_t *) gf->scratch;
one = 1;
pp = h->prim_poly | (one << h->w);
product = 0;
for (i = 0; i < h->w; i++) {
if (a & (one << i)) product ^= (b << i);
}
for (i = h->w*2-1; i >= h->w; i--) {
if (product & (one << i)) product ^= (pp << (i-h->w));
}
return product;
}
static
int gf_wgen_shift_init(gf_t *gf)
{
gf->multiply.w32 = gf_wgen_shift_multiply;
gf->inverse.w32 = gf_wgen_euclid;
return 1;
}
static
gf_val_32_t
gf_wgen_bytwo_b_multiply (gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
uint32_t prod, pp, bmask;
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
pp = h->prim_poly;
prod = 0;
bmask = (1 << (h->w-1));
while (1) {
if (a & 1) prod ^= b;
a >>= 1;
if (a == 0) return prod;
if (b & bmask) {
b = ((b << 1) ^ pp);
} else {
b <<= 1;
}
}
}
static
int gf_wgen_bytwo_b_init(gf_t *gf)
{
gf->multiply.w32 = gf_wgen_bytwo_b_multiply;
gf->inverse.w32 = gf_wgen_euclid;
return 1;
}
static
inline
gf_val_32_t
gf_wgen_bytwo_p_multiply (gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
uint32_t prod, pp, pmask, amask;
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
pp = h->prim_poly;
prod = 0;
pmask = (1 << (h->w-1));
amask = pmask;
while (amask != 0) {
if (prod & pmask) {
prod = ((prod << 1) ^ pp);
} else {
prod <<= 1;
}
if (a & amask) prod ^= b;
amask >>= 1;
}
return prod;
}
static
int gf_wgen_bytwo_p_init(gf_t *gf)
{
gf->multiply.w32 = gf_wgen_bytwo_p_multiply;
gf->inverse.w32 = gf_wgen_euclid;
return 1;
}
static
void
gf_wgen_group_set_shift_tables(uint32_t *shift, uint32_t val, gf_internal_t *h)
{
int i;
uint32_t j;
shift[0] = 0;
for (i = 1; i < (1 << h->arg1); i <<= 1) {
for (j = 0; j < i; j++) shift[i|j] = shift[j]^val;
if (val & (1 << (h->w-1))) {
val <<= 1;
val ^= h->prim_poly;
} else {
val <<= 1;
}
}
}
static
inline
gf_val_32_t
gf_wgen_group_s_equals_r_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
int i;
int leftover, rs;
uint32_t p, l, ind, r, a32;
int bits_left;
int g_s;
int w;
struct gf_wgen_group_data *gd;
gf_internal_t *h = (gf_internal_t *) gf->scratch;
g_s = h->arg1;
w = h->w;
gd = (struct gf_wgen_group_data *) h->private;
gf_wgen_group_set_shift_tables(gd->shift, b, h);
leftover = w % g_s;
if (leftover == 0) leftover = g_s;
rs = w - leftover;
a32 = a;
ind = a32 >> rs;
a32 <<= leftover;
a32 &= gd->mask;
p = gd->shift[ind];
bits_left = rs;
rs = w - g_s;
while (bits_left > 0) {
bits_left -= g_s;
ind = a32 >> rs;
a32 <<= g_s;
a32 &= gd->mask;
l = p >> rs;
p = (gd->shift[ind] ^ gd->reduce[l] ^ (p << g_s)) & gd->mask;
}
return p;
}
char *bits(uint32_t v)
{
char *rv;
int i, j;
rv = malloc(30);
j = 0;
for (i = 27; i >= 0; i--) {
rv[j] = '0' + ((v & (1 << i)) ? 1 : 0);
j++;
}
rv[j] = '\0';
return rv;
}
char *bits_56(uint64_t v)
{
char *rv;
int i, j;
uint64_t one;
one = 1;
rv = malloc(60);
j = 0;
for (i = 55; i >= 0; i--) {
rv[j] = '0' + ((v & (one << i)) ? 1 : 0);
j++;
}
rv[j] = '\0';
return rv;
}
static
inline
gf_val_32_t
gf_wgen_group_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
int i;
int leftover;
uint64_t p, l, r, mask;
uint32_t a32, ind;
int g_s, g_r;
struct gf_wgen_group_data *gd;
int w;
gf_internal_t *h = (gf_internal_t *) gf->scratch;
g_s = h->arg1;
g_r = h->arg2;
w = h->w;
gd = (struct gf_wgen_group_data *) h->private;
gf_wgen_group_set_shift_tables(gd->shift, b, h);
leftover = w % g_s;
if (leftover == 0) leftover = g_s;
a32 = a;
ind = a32 >> (w - leftover);
p = gd->shift[ind];
p <<= g_s;
a32 <<= leftover;
a32 &= gd->mask;
i = (w - leftover);
while (i > g_s) {
ind = a32 >> (w-g_s);
p ^= gd->shift[ind];
a32 <<= g_s;
a32 &= gd->mask;
p <<= g_s;
i -= g_s;
}
ind = a32 >> (h->w-g_s);
p ^= gd->shift[ind];
for (i = gd->tshift ; i >= 0; i -= g_r) {
l = p & (gd->rmask << i);
r = gd->reduce[l >> (i+w)];
r <<= (i);
p ^= r;
}
return p & gd->mask;
}
static
int gf_wgen_group_init(gf_t *gf)
{
uint32_t i, j, p, index;
struct gf_wgen_group_data *gd;
gf_internal_t *h = (gf_internal_t *) gf->scratch;
int g_s, g_r;
g_s = h->arg1;
g_r = h->arg2;
gd = (struct gf_wgen_group_data *) h->private;
gd->shift = &(gd->memory);
gd->reduce = gd->shift + (1 << g_s);
gd->mask = (h->w != 31) ? ((1 << h->w)-1) : 0x7fffffff;
gd->rmask = (1 << g_r) - 1;
gd->rmask <<= h->w;
gd->tshift = h->w % g_s;
if (gd->tshift == 0) gd->tshift = g_s;
gd->tshift = (h->w - gd->tshift);
gd->tshift = ((gd->tshift-1)/g_r) * g_r;
gd->reduce[0] = 0;
for (i = 0; i < (1 << g_r); i++) {
p = 0;
index = 0;
for (j = 0; j < g_r; j++) {
if (i & (1 << j)) {
p ^= (h->prim_poly << j);
index ^= (h->prim_poly >> (h->w-j));
}
}
gd->reduce[index] = (p & gd->mask);
}
if (g_s == g_r) {
gf->multiply.w32 = gf_wgen_group_s_equals_r_multiply;
} else {
gf->multiply.w32 = gf_wgen_group_multiply;
}
gf->divide.w32 = NULL;
gf->inverse.w32 = NULL;
return 1;
}
static
gf_val_32_t
gf_wgen_table_8_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_table_w8_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_table_w8_data *) h->private;
return (std->mult[(a<<h->w)+b]);
}
static
gf_val_32_t
gf_wgen_table_8_divide(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_table_w8_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_table_w8_data *) h->private;
return (std->div[(a<<h->w)+b]);
}
static
int gf_wgen_table_8_init(gf_t *gf)
{
gf_internal_t *h;
int w;
struct gf_wgen_table_w8_data *std;
uint32_t a, b, p, pp;
h = (gf_internal_t *) gf->scratch;
w = h->w;
std = (struct gf_wgen_table_w8_data *) h->private;
std->mult = &(std->base);
std->div = std->mult + ((1<<h->w)*(1<<h->w));
for (a = 0; a < (1 << w); a++) {
std->mult[a] = 0;
std->mult[a<<w] = 0;
std->div[a] = 0;
std->div[a<<w] = 0;
}
for (a = 1; a < (1 << w); a++) {
b = 1;
p = a;
do {
std->mult[(a<<w)|b] = p;
std->div[(p<<w)|b] = a;
b = (b & (1 << (w-1))) ? (b << 1) ^ h->prim_poly : (b << 1);
b &= ((1 << w)-1);
p = (p & (1 << (w-1))) ? (p << 1) ^ h->prim_poly : (p << 1);
p &= ((1 << w)-1);
} while (b != 1);
}
gf->multiply.w32 = gf_wgen_table_8_multiply;
gf->divide.w32 = gf_wgen_table_8_divide;
return 1;
}
static
gf_val_32_t
gf_wgen_table_16_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_table_w16_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_table_w16_data *) h->private;
return (std->mult[(a<<h->w)+b]);
}
static
gf_val_32_t
gf_wgen_table_16_divide(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_table_w16_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_table_w16_data *) h->private;
return (std->div[(a<<h->w)+b]);
}
static
int gf_wgen_table_16_init(gf_t *gf)
{
gf_internal_t *h;
int w;
struct gf_wgen_table_w16_data *std;
uint32_t a, b, p, pp;
h = (gf_internal_t *) gf->scratch;
w = h->w;
std = (struct gf_wgen_table_w16_data *) h->private;
std->mult = &(std->base);
std->div = std->mult + ((1<<h->w)*(1<<h->w));
for (a = 0; a < (1 << w); a++) {
std->mult[a] = 0;
std->mult[a<<w] = 0;
std->div[a] = 0;
std->div[a<<w] = 0;
}
for (a = 1; a < (1 << w); a++) {
b = 1;
p = a;
do {
std->mult[(a<<w)|b] = p;
std->div[(p<<w)|b] = a;
b = (b & (1 << (w-1))) ? (b << 1) ^ h->prim_poly : (b << 1);
b &= ((1 << w)-1);
p = (p & (1 << (w-1))) ? (p << 1) ^ h->prim_poly : (p << 1);
p &= ((1 << w)-1);
} while (b != 1);
}
gf->multiply.w32 = gf_wgen_table_16_multiply;
gf->divide.w32 = gf_wgen_table_16_divide;
return 1;
}
static
int gf_wgen_table_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (h->w <= 8) return gf_wgen_table_8_init(gf);
if (h->w <= 14) return gf_wgen_table_16_init(gf);
return 0; /* w is too large for a full multiplication table */
}
static
gf_val_32_t
gf_wgen_log_8_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w8_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w8_data *) h->private;
if (a == 0 || b == 0) return 0;
return (std->anti[std->log[a]+std->log[b]]);
}
static
gf_val_32_t
gf_wgen_log_8_divide(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w8_data *std;
int index;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w8_data *) h->private;
if (a == 0 || b == 0) return 0;
index = std->log[a];
index -= std->log[b];
return (std->danti[index]);
}
static
int gf_wgen_log_8_init(gf_t *gf)
{
gf_internal_t *h;
struct gf_wgen_log_w8_data *std;
int w;
uint32_t a, i;
h = (gf_internal_t *) gf->scratch;
w = h->w;
std = (struct gf_wgen_log_w8_data *) h->private;
std->log = &(std->base);
std->anti = std->log + (1<<h->w);
std->danti = std->anti + (1<<h->w)-1;
i = 0;
a = 1;
do {
std->log[a] = i;
std->anti[i] = a;
std->danti[i] = a;
i++;
a = (a & (1 << (w-1))) ? (a << 1) ^ h->prim_poly : (a << 1);
a &= ((1 << w)-1);
} while (a != 1);
gf->multiply.w32 = gf_wgen_log_8_multiply;
gf->divide.w32 = gf_wgen_log_8_divide;
return 1;
}
static
gf_val_32_t
gf_wgen_log_16_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w16_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w16_data *) h->private;
if (a == 0 || b == 0) return 0;
return (std->anti[std->log[a]+std->log[b]]);
}
static
gf_val_32_t
gf_wgen_log_16_divide(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w16_data *std;
int index;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w16_data *) h->private;
if (a == 0 || b == 0) return 0;
index = std->log[a];
index -= std->log[b];
return (std->danti[index]);
}
static
int gf_wgen_log_16_init(gf_t *gf)
{
gf_internal_t *h;
struct gf_wgen_log_w16_data *std;
int w;
uint32_t a, i;
h = (gf_internal_t *) gf->scratch;
w = h->w;
std = (struct gf_wgen_log_w16_data *) h->private;
std->log = &(std->base);
std->anti = std->log + (1<<h->w);
std->danti = std->anti + (1<<h->w)-1;
i = 0;
a = 1;
do {
std->log[a] = i;
std->anti[i] = a;
std->danti[i] = a;
i++;
a = (a & (1 << (w-1))) ? (a << 1) ^ h->prim_poly : (a << 1);
a &= ((1 << w)-1);
} while (a != 1);
gf->multiply.w32 = gf_wgen_log_16_multiply;
gf->divide.w32 = gf_wgen_log_16_divide;
return 1;
}
static
gf_val_32_t
gf_wgen_log_32_multiply(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w32_data *std;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w32_data *) h->private;
if (a == 0 || b == 0) return 0;
return (std->anti[std->log[a]+std->log[b]]);
}
static
gf_val_32_t
gf_wgen_log_32_divide(gf_t *gf, gf_val_32_t a, gf_val_32_t b)
{
gf_internal_t *h;
struct gf_wgen_log_w32_data *std;
int index;
h = (gf_internal_t *) gf->scratch;
std = (struct gf_wgen_log_w32_data *) h->private;
if (a == 0 || b == 0) return 0;
index = std->log[a];
index -= std->log[b];
return (std->danti[index]);
}
static
int gf_wgen_log_32_init(gf_t *gf)
{
gf_internal_t *h;
struct gf_wgen_log_w32_data *std;
int w;
uint32_t a, i;
h = (gf_internal_t *) gf->scratch;
w = h->w;
std = (struct gf_wgen_log_w32_data *) h->private;
std->log = &(std->base);
std->anti = std->log + (1<<h->w);
std->danti = std->anti + (1<<h->w)-1;
i = 0;
a = 1;
do {
std->log[a] = i;
std->anti[i] = a;
std->danti[i] = a;
i++;
a = (a & (1 << (w-1))) ? (a << 1) ^ h->prim_poly : (a << 1);
a &= ((1 << w)-1);
} while (a != 1);
gf->multiply.w32 = gf_wgen_log_32_multiply;
gf->divide.w32 = gf_wgen_log_32_divide;
return 1;
}
static
int gf_wgen_log_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (h->w <= 8) return gf_wgen_log_8_init(gf);
if (h->w <= 16) return gf_wgen_log_16_init(gf);
if (h->w <= 32) return gf_wgen_log_32_init(gf);
return 0; /* unsupported w */
}
int gf_wgen_scratch_size(int w, int mult_type, int region_type, int divide_type, int arg1, int arg2)
{
if (w > 32 || w < 0) return -1;
if ((region_type | GF_REGION_CAUCHY) != GF_REGION_CAUCHY) return -1;
switch(mult_type)
{
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT:
case GF_MULT_BYTWO_b:
case GF_MULT_BYTWO_p:
if (arg1 != 0 || arg2 != 0) return -1;
return sizeof(gf_internal_t);
break;
case GF_MULT_GROUP:
if (arg1 <= 0 || arg2 <= 0) return -1;
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_group_data) +
sizeof(uint32_t) * (1 << arg1) +
sizeof(uint32_t) * (1 << arg2) + 64;
break;
case GF_MULT_TABLE:
if (arg1 != 0 || arg2 != 0) return -1;
if (w <= 8) {
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_table_w8_data) +
sizeof(uint8_t)*(1 << w)*(1<<w)*2 + 64;
} else if (w < 15) {
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_table_w16_data) +
sizeof(uint16_t)*(1 << w)*(1<<w)*2 + 64;
} else return -1;
case GF_MULT_LOG_TABLE:
if (arg1 != 0 || arg2 != 0) return -1;
if (w <= 8) {
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_log_w8_data) +
sizeof(uint8_t)*(1 << w)*3;
} else if (w <= 16) {
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_log_w16_data) +
sizeof(uint16_t)*(1 << w)*3;
} else if (w <= 29) {
return sizeof(gf_internal_t) + sizeof(struct gf_wgen_log_w32_data) +
sizeof(uint32_t)*(1 << w)*3;
} else return -1;
default:
return -1;
}
}
void
gf_wgen_cauchy_region(gf_t *gf, void *src, void *dest, gf_val_32_t val, int bytes, int xor)
{
gf_internal_t *h;
gf_region_data rd;
int written;
int rs, i, j;
gf_set_region_data(&rd, gf, src, dest, bytes, val, xor, -1);
if (val == 0) { gf_multby_zero(dest, bytes, xor); return; }
if (val == 1) { gf_multby_one(gf, src, dest, bytes, xor); return; }
h = (gf_internal_t *) gf->scratch;
rs = bytes / (h->w);
written = (xor) ? 0xffffffff : 0;
for (i = 0; i < h->w; i++) {
for (j = 0; j < h->w; j++) {
if (val & (1 << j)) {
gf_multby_one(gf, src, dest + j*rs, rs, (written & (1 << j)));
written |= (1 << j);
}
}
src += rs;
val = gf->multiply.w32(gf, val, 2);
}
}
int gf_wgen_init(gf_t *gf)
{
gf_internal_t *h;
h = (gf_internal_t *) gf->scratch;
if (h->prim_poly == 0) {
switch (h->w) {
case 1: h->prim_poly = 1; break;
case 2: h->prim_poly = 7; break;
case 3: h->prim_poly = 013; break;
case 4: h->prim_poly = 023; break;
case 5: h->prim_poly = 045; break;
case 6: h->prim_poly = 0103; break;
case 7: h->prim_poly = 0211; break;
case 8: h->prim_poly = 0435; break;
case 9: h->prim_poly = 01021; break;
case 10: h->prim_poly = 02011; break;
case 11: h->prim_poly = 04005; break;
case 12: h->prim_poly = 010123; break;
case 13: h->prim_poly = 020033; break;
case 14: h->prim_poly = 042103; break;
case 15: h->prim_poly = 0100003; break;
case 16: h->prim_poly = 0210013; break;
case 17: h->prim_poly = 0400011; break;
case 18: h->prim_poly = 01000201; break;
case 19: h->prim_poly = 02000047; break;
case 20: h->prim_poly = 04000011; break;
case 21: h->prim_poly = 010000005; break;
case 22: h->prim_poly = 020000003; break;
case 23: h->prim_poly = 040000041; break;
case 24: h->prim_poly = 0100000207; break;
case 25: h->prim_poly = 0200000011; break;
case 26: h->prim_poly = 0400000107; break;
case 27: h->prim_poly = 01000000047; break;
case 28: h->prim_poly = 02000000011; break;
case 29: h->prim_poly = 04000000005; break;
case 30: h->prim_poly = 010040000007; break;
case 31: h->prim_poly = 020000000011; break;
case 32: h->prim_poly = 00020000007; break;
default: fprintf(stderr, "gf_wgen_init: w not defined yet\n"); exit(1);
}
}
gf->multiply.w32 = NULL;
gf->divide.w32 = NULL;
gf->inverse.w32 = NULL;
gf->multiply_region.w32 = gf_wgen_cauchy_region;
gf->extract_word.w32 = gf_wgen_extract_word;
switch(h->mult_type) {
case GF_MULT_DEFAULT:
case GF_MULT_SHIFT: if (gf_wgen_shift_init(gf) == 0) return 0; break;
case GF_MULT_BYTWO_b: if (gf_wgen_bytwo_b_init(gf) == 0) return 0; break;
case GF_MULT_BYTWO_p: if (gf_wgen_bytwo_p_init(gf) == 0) return 0; break;
case GF_MULT_GROUP: if (gf_wgen_group_init(gf) == 0) return 0; break;
case GF_MULT_TABLE: if (gf_wgen_table_init(gf) == 0) return 0; break;
case GF_MULT_LOG_TABLE: if (gf_wgen_log_init(gf) == 0) return 0; break;
default: return 0;
}
if (h->divide_type == GF_DIVIDE_EUCLID) {
gf->divide.w32 = gf_wgen_divide_from_inverse;
gf->inverse.w32 = gf_wgen_euclid;
} else if (h->divide_type == GF_DIVIDE_MATRIX) {
gf->divide.w32 = gf_wgen_divide_from_inverse;
gf->inverse.w32 = gf_wgen_matrix;
}
if (gf->inverse.w32 == NULL && gf->divide.w32 == NULL) gf->inverse.w32 = gf_wgen_euclid;
if (gf->inverse.w32 != NULL && gf->divide.w32 == NULL) {
gf->divide.w32 = gf_wgen_divide_from_inverse;
}
if (gf->inverse.w32 == NULL && gf->divide.w32 != NULL) {
gf->inverse.w32 = gf_wgen_inverse_from_divide;
}
return 1;
}
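
The log-table routines in gf_wgen.c multiply by adding discrete logarithms and looking the sum up in an antilog table. The sketch below shows the idea for w=8 only and is not the library's code: all `sk_` names are hypothetical, the polynomial 0435 (octal) is the w=8 default from gf_wgen_init, and a single doubled antilog array stands in for the library's anti/danti pointer pair.

```c
#include <assert.h>
#include <stdint.h>

#define SK_W    8
#define SK_SIZE (1 << SK_W)
#define SK_PP   0435               /* x^8 + x^4 + x^3 + x^2 + 1, in octal */

static uint8_t sk_log[SK_SIZE];
static uint8_t sk_anti[2 * SK_SIZE];  /* doubled so index sums need no modulo */

/* Walk the powers of 2 (the element x) and record log and antilog. */
static void sk_log_init(void)
{
  uint32_t a = 1, i = 0;

  do {
    sk_log[a] = (uint8_t) i;
    sk_anti[i] = (uint8_t) a;
    sk_anti[i + SK_SIZE - 1] = (uint8_t) a;   /* second copy, offset by 2^w - 1 */
    i++;
    a = (a & (1 << (SK_W - 1))) ? ((a << 1) ^ SK_PP) : (a << 1);
    a &= (SK_SIZE - 1);
  } while (a != 1);
}

static uint8_t sk_multiply(uint8_t a, uint8_t b)
{
  if (a == 0 || b == 0) return 0;
  return sk_anti[sk_log[a] + sk_log[b]];
}

static uint8_t sk_divide(uint8_t a, uint8_t b)
{
  if (a == 0 || b == 0) return 0;   /* returns 0 for b == 0, as the source does */
  /* Adding 2^w - 1 keeps the index non-negative. */
  return sk_anti[sk_log[a] - sk_log[b] + SK_SIZE - 1];
}
```

Call sk_log_init() once before using sk_multiply() or sk_divide(). The doubled table trades memory for speed in the same way the anti/danti layout does: no modular reduction of the exponent is needed on either path.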

BIN
junk Executable file

Binary file not shown.

BIN
junk-pick-best-output Executable file

Binary file not shown.

78
junk-pick-best-output.cpp Normal file

@ -0,0 +1,78 @@
#include <string>
#include <vector>
#include <list>
#include <algorithm>
#include <map>
#include <set>
#include <iostream>
#include <sstream>
#include <cstdio>
#include <cstdlib>
using namespace std;
#define VIT(i, v) for (i = 0; i < v.size(); i++)
#define IT(it, ds) for (it = ds.begin(); it != ds.end(); it++)
#define FUP(i, n) for (i = 0; i < n; i++)
typedef map<int, string> ISmap;
typedef map<int, int> IImap;
typedef map<string, double> SDmap;
typedef ISmap::iterator ISmit;
typedef IImap::iterator IImit;
typedef SDmap::iterator SDmit;
typedef vector <string> SVec;
void StoSVec(string &s, SVec &sv)
{
istringstream ss;
string s2;
ss.clear();
ss.str(s);
while (ss >> s2) sv.push_back(s2);
}
int main()
{
string s, k;
double d, b;
int i;
SVec sv;
SDmap bmap;
SDmit bmit;
while (getline(cin, s)) {
sv.clear();
StoSVec(s, sv);
if (!sv.empty() && sv[0] == "Seed:") {
b = 0;
for (i = 0; i < 2; i++) {
getline(cin, s);
sv.clear();
StoSVec(s, sv);
sscanf(sv[3].c_str(), "%lf", &d);
if (d > b) b = d;
}
getline(cin, s);
sv.clear();
StoSVec(s, sv);
k = sv[2];
k += " ";
k += sv[3];
for (i = 4; i < sv.size(); i++) {
if (sv[i] != "-") {
k += " ";
k += sv[i];
}
}
if (bmap[k] < b) bmap[k] = b;
}
}
IT(bmit, bmap) {
printf("%10.4lf %s\n", bmit->second, bmit->first.c_str());
}
}

11
junk-proc.awk Normal file

@ -0,0 +1,11 @@
($1 == "Seed:") { l = 0; n++; t=0 }
{ if (l >= 1 && l <= 4) {
t += $4
if (l == 4) avg = t/4.0
}
if (l == 5) {
printf("xaxis max %d hash_label at %d : %s\n", n+1, n, $0 )
printf("newcurve marktype xbar marksize 1 cfill 1 1 0 pts %d %.2lf\n", n, avg);
}
l++
}

658
junk-save.c Normal file

@ -0,0 +1,658 @@
/*
c = gf.multiply.w32(&gf, a, b);
tested = 0;
*/
/* If this is not composite, then first test against the default: */
/*
if (h->mult_type != GF_MULT_COMPOSITE) {
tested = 1;
d = gf_def.multiply.w32(&gf_def, a, b);
if (c != d) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The default returned %x\n", d);
exit(1);
}
}
*/
/* Now, we also need to double-check, in case the default is wanky, and when
we're performing composite operations. Start with 0 and 1: */
/*
if (a == 0 || b == 0 || a == 1 || b == 1) {
tested = 1;
if (((a == 0 || b == 0) && c != 0) ||
(a == 1 && c != b) ||
(b == 1 && c != a)) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x, which is clearly wrong.\n", a, b, c);
exit(1);
}
*/
/* If division or inverses are defined, let's test all combinations to make sure
that the operations are consistent with each other. */
/*
} else {
if ((c & mask) != c) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x, which is too big.\n", a, b, c);
exit(1);
}
}
if (gf.inverse.w32 != NULL && (a != 0 || b != 0)) {
tested = 1;
if (a != 0) {
ai = gf.inverse.w32(&gf, a);
if (gf.multiply.w32(&gf, c, ai) != b) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The inverse of %x is %x, and gf_multiply.w32() of %x and %x equals %x\n",
a, ai, c, ai, gf.multiply.w32(&gf, c, ai));
exit(1);
}
}
if (b != 0) {
bi = gf.inverse.w32(&gf, b);
if (gf.multiply.w32(&gf, c, bi) != a) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The inverse of %x is %x, and gf_multiply.w32() of %x and %x equals %x\n",
b, bi, c, bi, gf.multiply.w32(&gf, c, bi));
exit(1);
}
}
}
if (gf.divide.w32 != NULL && (a != 0 || b != 0)) {
tested = 1;
if (a != 0) {
ai = gf.divide.w32(&gf, c, a);
if (ai != b) {
printf("Error in single multiplication/division (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" gf.divide.w32() of %x and %x returned %x\n", c, a, ai);
exit(1);
}
}
if (b != 0) {
bi = gf.divide.w32(&gf, c, b);
if (bi != a) {
printf("Error in single multiplication/division (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" gf.divide.w32() of %x and %x returned %x\n", c, b, bi);
exit(1);
}
}
}
if (!tested) problem("There is no way to test multiplication.\n");
}
*/
/*
if (region) {
if (w == 4) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r8b = (uint8_t *) malloc(REGION_SIZE);
r8c = (uint8_t *) malloc(REGION_SIZE);
r8d = (uint8_t *) malloc(REGION_SIZE);
fill_random_region(r8b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 16; a++) {
fill_random_region(r8c, REGION_SIZE);
memcpy(r8d, r8c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8c+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = (r8b[i] >> 4);
c = (r8c[i] >> 4);
d = (r8d[i] >> 4);
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d * %d = %d, but should equal %d\n", a, b, c, gf.multiply.w32(&gf, a, b) );
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d %d %d %d\n", a, b, c, d);
printf(" %d %d %d %d\n", a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
b = (r8b[i] & 0xf);
c = (r8c[i] & 0xf);
d = (r8d[i] & 0xf);
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d * %d = %d, but should equal %d\n", a, b, c, gf.multiply.w32(&gf, a, b) );
printf("i=%d. 0x%x 0x%x 0x%x 0x%x\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" (%d * %d ^ %d) should equal %d - equals %d\n",
a, b, d, (gf.multiply.w32(&gf, a, b) ^ d), c);
printf(" %d %d %d %d\n", a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 16; a++) {
fill_random_region(r8b, REGION_SIZE);
memcpy(r8d, r8b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8b+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = (r8b[i] >> 4);
d = (r8d[i] >> 4);
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("i=%d. %d %d %d\n", i, a, b, d);
printf("i=%d. %d %d %d\n", i, a, r8b[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
b = (r8b[i] & 0xf);
d = (r8d[i] & 0xf);
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("%d %d %d\n", a, b, d);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r8b);
free(r8c);
free(r8d);
}
} else if (w == 8) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r8b = (uint8_t *) malloc(REGION_SIZE);
r8c = (uint8_t *) malloc(REGION_SIZE);
r8d = (uint8_t *) malloc(REGION_SIZE);
fill_random_region(r8b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 256; a++) {
fill_random_region(r8c, REGION_SIZE);
memcpy(r8d, r8c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE;
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8c+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w8(i, (uint8_t*)r8b, REGION_SIZE / 2);
c = get_alt_map_2w8(i, (uint8_t*)r8c, REGION_SIZE / 2);
d = get_alt_map_2w8(i, (uint8_t*)r8d, REGION_SIZE / 2);
} else {
b = r8b[i];
c = r8c[i];
d = r8d[i];
}
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. %d %d %d %d\n", i, a, b, c, d);
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
printf("0x%lx. Sindex: %d\n", (unsigned long) (r8b+i), sindex);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. %d %d %d %d\n", i, a, b, c, d);
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 256; a++) {
fill_random_region(r8b, REGION_SIZE);
memcpy(r8d, r8b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8b+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = r8b[i];
d = r8d[i];
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("i=%d. %d %d %d\n", i, a, b, d);
printf("i=%d. %d %d %d\n", i, a, r8b[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r8b);
free(r8c);
free(r8d);
}
} else if (w == 16) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r16b = (uint16_t *) malloc(REGION_SIZE);
r16c = (uint16_t *) malloc(REGION_SIZE);
r16d = (uint16_t *) malloc(REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
fill_random_region(r16b, REGION_SIZE);
a = MOA_Random_W(w, 0);
fill_random_region(r16c, REGION_SIZE);
memcpy(r16d, r16c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE / sizeof(uint16_t);
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint16_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint16_t);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16c+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r16c+sindex), (void *) (r16d+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r16c+sindex), (void *) (r16d+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r16d+sindex), (void *) (r16b+sindex), ai, size, xor);
}
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w16(i, (uint8_t*)r16b, size / 2);
c = get_alt_map_2w16(i, (uint8_t*)r16c, size / 2);
d = get_alt_map_2w16(i, (uint8_t*)r16d, size / 2);
} else {
b = r16b[i];
c = r16c[i];
d = r16d[i];
}
if (!xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We have %d * %d = %d, and %d * %d = %d.\n", a, b, c, c, ai, d);
printf("%d is the inverse of %d\n", ai, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && b != 0) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=c; c ^= ba; d ^= c; b ^= (a^-1)d;\n");
printf(" b should equal 0, but it doesn't. Probe into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_W(w, 0);
fill_random_region(r16b, REGION_SIZE);
memcpy(r16d, r16b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint16_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint16_t);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r16d+sindex), (void *) (r16b+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), ai, size, 0);
}
for (i = sindex; i < eindex; i++) {
b = r16b[i];
c = r16c[i];
d = r16d[i];
if (!xor && (d != b)) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=b; b = ba; b = b(a^-1).\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
printf("b = %d. d = %d. a = %d\n", b, d, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=b; b = b + ba; b += d; b = b(a^-1);\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r16b);
free(r16c);
free(r16d);
}
} else if (w == 32) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r32b = (uint32_t *) malloc(REGION_SIZE);
r32c = (uint32_t *) malloc(REGION_SIZE);
r32d = (uint32_t *) malloc(REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_32();
fill_random_region(r32b, REGION_SIZE);
fill_random_region(r32c, REGION_SIZE);
memcpy(r32d, r32c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE / sizeof(uint32_t);
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint32_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint32_t);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32c+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r32c+sindex), (void *) (r32d+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r32c+sindex), (void *) (r32d+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r32d+sindex), (void *) (r32b+sindex), ai, size, xor);
}
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w32(i, (uint8_t*)r32b, size / 2);
c = get_alt_map_2w32(i, (uint8_t*)r32c, size / 2);
d = get_alt_map_2w32(i, (uint8_t*)r32d, size / 2);
i++;
} else {
b = r32b[i];
c = r32c[i];
d = r32d[i];
}
if (!xor && d != b) {
printf("i=%d. Addresses: b: 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We have %d * %d = %d, and %d * %d = %d.\n", a, b, c, c, ai, d);
printf("%d is the inverse of %d\n", ai, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && b != 0) {
printf("i=%d. Addresses: b: 0x%lx c: 0x%lx d: 0x%lx\n", i,
(unsigned long) (r32b+i), (unsigned long) (r32c+i), (unsigned long) (r32d+i));
printf("We did d=c; c ^= ba; d ^= c; b ^= (a^-1)d;\n");
printf(" b should equal 0, but it doesn't. Probe into it.\n");
printf("a: %8x b: %8x c: %8x, d: %8x\n", a, b, c, d);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_32();
fill_random_region(r32b, REGION_SIZE);
memcpy(r32d, r32b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint32_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint32_t);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r32d+sindex), (void *) (r32b+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), ai, size, 0);
}
for (i = sindex; i < eindex; i++) {
b = r32b[i];
c = r32c[i];
d = r32d[i];
if (!xor && (d != b)) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We did d=b; b = ba; b = b(a^-1).\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
printf("b = %d. d = %d. a = %d\n", b, d, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We did d=b; b = b + ba; b += d; b = b(a^-1);\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r32b);
free(r32c);
free(r32d);
}
} else if (w == 64) {
if (gf.multiply_region.w64 == NULL) {
printf("No multiply_region.\n");
} else {
r64b = (uint64_t *) malloc(REGION_SIZE);
r64c = (uint64_t *) malloc(REGION_SIZE);
r64d = (uint64_t *) malloc(REGION_SIZE);
fill_random_region(r64b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a64 = MOA_Random_64();
fill_random_region(r64c, REGION_SIZE);
memcpy(r64d, r64c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint64_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t);
gf.multiply_region.w64(&gf, (void *) (r64b+sindex), (void *) (r64c+sindex), a64, size, xor);
for (i = sindex; i < eindex; i++) {
b64 = r64b[i];
c64 = r64c[i];
d64 = r64d[i];
if (!xor && gf.multiply.w64(&gf, a64, b64) != c64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx should be 0x%llx\n", i, a64, b64, c64,
gf.multiply.w64(&gf, a64, b64));
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64c[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w64(&gf, a64, b64) ^ d64) != c64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx 0x%llx\n", i, a64, b64, c64, d64);
printf("i=%d. 0x%llx 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64c[i], r64d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a64 = MOA_Random_64();
fill_random_region(r64b, REGION_SIZE);
memcpy(r64d, r64b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint64_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t);
gf.multiply_region.w64(&gf, (void *) (r64b+sindex), (void *) (r64b+sindex), a64, size, xor);
for (i = sindex; i < eindex; i++) {
b64 = r64b[i];
d64 = r64d[i];
if (!xor && gf.multiply.w64(&gf, a64, d64) != b64) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w64(&gf, a64, d64) ^ d64) != b64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, b64, d64);
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r64b);
free(r64c);
free(r64d);
}
} else if (w == 128) {
if (gf.multiply_region.w128 == NULL) {
printf("No multiply_region.\n");
} else {
r128b = (uint64_t *) malloc(REGION_SIZE);
r128c = (uint64_t *) malloc(REGION_SIZE);
r128d = (uint64_t *) malloc(REGION_SIZE);
fill_random_region(r128b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
MOA_Random_128(a128);
fill_random_region(r128c, REGION_SIZE);
memcpy(r128d, r128c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/(2*sizeof(uint64_t))-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t)*2;
gf.multiply_region.w128(&gf, (void *) (r128b+sindex*2), (void *) (r128c+sindex*2), a128, size, xor);
for (i = sindex; i < eindex; i++) {
b128[0] = r128b[2*i];
b128[1] = r128b[2*i+1];
c128[0] = r128c[2*i];
c128[1] = r128c[2*i+1];
d128[0] = r128d[2*i];
d128[1] = r128d[2*i+1];
gf.multiply.w128(&gf, a128, b128, e128);
if (xor) {
e128[0] ^= d128[0];
e128[1] ^= d128[1];
}
if (!xor && !GF_W128_EQUAL(c128, e128)) {
printf("i=%d. 0x%llx%llx 0x%llx%llx 0x%llx%llx should be 0x%llx%llx\n",
i, a128[0], a128[1], b128[0], b128[1], c128[0], c128[1], e128[0], e128[1]);
problem("Failed buffer-constant, xor=0");
}
if (xor && !GF_W128_EQUAL(e128, c128)) {
printf("i=%d. 0x%llx%llx 0x%llx%llx 0x%llx%llx 0x%llx%llx\n", i,
a128[0], a128[1], b128[0], b128[1], c128[0], c128[1], d128[0], d128[1]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
MOA_Random_128(a128);
fill_random_region(r128b, REGION_SIZE);
memcpy(r128d, r128b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
sindex = 0;
eindex = REGION_SIZE/(2*sizeof(uint64_t))-MOA_Random_W(3, 1);
eindex = REGION_SIZE/(2*sizeof(uint64_t));
size = (eindex-sindex)*sizeof(uint64_t)*2;
gf.multiply_region.w128(&gf, (void *) (r128b+sindex*2), (void *) (r128b+sindex*2), a128, size, xor);
for (i = sindex; i < eindex; i++) {
b128[0] = r128b[2*i];
b128[1] = r128b[2*i + 1];
d128[0] = r128d[2*i];
d128[1] = r128d[2*i + 1];
gf.multiply.w128(&gf, a128, d128, e128);
if (xor) {
e128[0] ^= d128[0];
e128[1] ^= d128[1];
}
if (!xor && !GF_W128_EQUAL(b128, e128)) problem("Failed buffer-constant, xor=0");
if (xor && !GF_W128_EQUAL(b128, e128)) {
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r128b);
free(r128c);
free(r128d);
}
}
}
exit(0);
*/
}
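
The w=16 and w=32 tests above check a region multiply by undoing it with the inverse constant rather than by comparing against a scalar multiply. The following is a minimal, self-contained sketch of that round-trip idea in GF(2^8). It does not use the library's API: the names `gf8_mult`, `gf8_inverse`, `gf8_mult_region`, and `gf8_check_region_roundtrip` are hypothetical, and the polynomial 0x11d is an assumption chosen for the sketch, not taken from this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift-and-add multiply in GF(2^8),
   reducing by the assumed polynomial x^8+x^4+x^3+x^2+1 (0x11d). */
uint8_t gf8_mult(uint8_t a, uint8_t b)
{
  uint16_t prod = 0, aa = a;

  while (b != 0) {
    if (b & 1) prod ^= aa;
    aa <<= 1;
    if (aa & 0x100) aa ^= 0x11d;   /* keep aa below 256 */
    b >>= 1;
  }
  return (uint8_t) prod;
}

/* Hypothetical helper: brute-force inverse; a must be nonzero. */
uint8_t gf8_inverse(uint8_t a)
{
  int x;

  assert(a != 0);
  for (x = 1; x < 256; x++) {
    if (gf8_mult(a, (uint8_t) x) == 1) return (uint8_t) x;
  }
  return 0;   /* not reached when the polynomial is irreducible */
}

/* dest[i] = a*src[i] when do_xor is 0, dest[i] ^= a*src[i] otherwise. */
void gf8_mult_region(const uint8_t *src, uint8_t *dest, uint8_t a,
                     size_t n, int do_xor)
{
  size_t i;
  uint8_t p;

  for (i = 0; i < n; i++) {
    p = gf8_mult(a, src[i]);
    dest[i] = do_xor ? (uint8_t) (dest[i] ^ p) : p;
  }
}

/* Round-trip check in the style of the tests above (src != dest).
   Returns the number of mismatching bytes; 0 means every check passed. */
int gf8_check_region_roundtrip(void)
{
  enum { N = 64 };
  uint8_t b[N], c[N], d[N], ai;
  int a, i, do_xor, failures = 0;

  for (a = 1; a < 256; a++) {
    ai = gf8_inverse((uint8_t) a);
    for (do_xor = 0; do_xor < 2; do_xor++) {
      for (i = 0; i < N; i++) {          /* deterministic fill, d = c */
        b[i] = (uint8_t) (i * 37 + a);
        c[i] = (uint8_t) (i * 101 + 5);
        d[i] = c[i];
      }
      gf8_mult_region(b, c, (uint8_t) a, N, do_xor);
      if (!do_xor) {
        gf8_mult_region(c, d, ai, N, 0);  /* d = (a^-1)c, so d == b */
        for (i = 0; i < N; i++) if (d[i] != b[i]) failures++;
      } else {
        gf8_mult_region(c, d, 1, N, 1);   /* d ^= c, so d == ab */
        gf8_mult_region(d, b, ai, N, 1);  /* b ^= (a^-1)d, so b == 0 */
        for (i = 0; i < N; i++) if (b[i] != 0) failures++;
      }
    }
  }
  return failures;
}
```

Calling `gf8_check_region_roundtrip()` from a small `main` and treating a nonzero return as failure mirrors how the tests above call `problem()`. The round-trip form needs the inverse of `a`, so the sketch skips a = 0.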

1585
junk-w16-backup.c Normal file

File diff suppressed because it is too large

12
junk-w16-timing-tests.sh Normal file

@ -0,0 +1,12 @@
sh tmp-time-test.sh 16 LOG - -
sh tmp-time-test.sh 16 LOG_ZERO - -
sh tmp-time-test.sh 16 TABLE - -
sh tmp-time-test.sh 16 TABLE LE,LAZY -
sh tmp-time-test.sh 16 SPLIT 16 4 ALTMAP,NOSSE -
sh tmp-time-test.sh 16 SPLIT 16 4 ALTMAP,LAZY,SSE -
sh tmp-time-test.sh 16 SPLIT 16 4 ALTMAP,LAZY,NOSSE -
sh tmp-time-test.sh 16 SPLIT 16 4 ALTMAP,SSE -
sh tmp-time-test.sh 16 SPLIT 16 4 NOSSE -
sh tmp-time-test.sh 16 SPLIT 16 4 LAZY,SSE -
sh tmp-time-test.sh 16 SPLIT 16 4 LAZY,NOSSE -
sh tmp-time-test.sh 16 SPLIT 16 4 SSE -

203
junk-w2.eps Normal file

@ -0,0 +1,203 @@
%!PS-Adobe-2.0 EPSF-1.2
%%Page: 1 1
%%BoundingBox: -40 -93 289 73
%%EndComments
1 setlinecap 1 setlinejoin
0.700 setlinewidth
0.00 setgray
/Jrnd { exch cvi exch cvi dup 3 1 roll idiv mul } def
/JDEdict 8 dict def
JDEdict /mtrx matrix put
/JDE {
JDEdict begin
/yrad exch def
/xrad exch def
/savematrix mtrx currentmatrix def
xrad yrad scale
0 0 1 0 360 arc
savematrix setmatrix
end
} def
/JSTR {
gsave 1 eq { gsave 1 setgray fill grestore } if
exch neg exch neg translate
clip
rotate
4 dict begin
pathbbox /&top exch def
/&right exch def
/&bottom exch def
&right sub /&width exch def
newpath
currentlinewidth mul round dup
&bottom exch Jrnd exch &top
4 -1 roll currentlinewidth mul setlinewidth
{ &right exch moveto &width 0 rlineto stroke } for
end
grestore
newpath
} bind def
gsave /Times-Roman findfont 9.000000 scalefont setfont
0.000000 0.000000 translate
0.700000 setlinewidth gsave newpath 0.000000 0.000000 moveto 288.000000 0.000000 lineto stroke
newpath 0.000000 0.000000 moveto 0.000000 -5.000000 lineto stroke
newpath 26.181818 0.000000 moveto 26.181818 -2.000000 lineto stroke
newpath 52.363636 0.000000 moveto 52.363636 -5.000000 lineto stroke
newpath 78.545456 0.000000 moveto 78.545456 -2.000000 lineto stroke
newpath 104.727272 0.000000 moveto 104.727272 -5.000000 lineto stroke
newpath 130.909088 0.000000 moveto 130.909088 -2.000000 lineto stroke
newpath 157.090912 0.000000 moveto 157.090912 -5.000000 lineto stroke
newpath 183.272720 0.000000 moveto 183.272720 -2.000000 lineto stroke
newpath 209.454544 0.000000 moveto 209.454544 -5.000000 lineto stroke
newpath 235.636368 0.000000 moveto 235.636368 -2.000000 lineto stroke
newpath 261.818176 0.000000 moveto 261.818176 -5.000000 lineto stroke
newpath 288.000000 0.000000 moveto 288.000000 -2.000000 lineto stroke
/Times-Roman findfont 11.000000 scalefont setfont
gsave 26.181818 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_p) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 52.363636 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_p SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 78.545456 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_b) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 104.727272 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_b SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 130.909088 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE SINGLE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 157.090912 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE DOUBLE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 183.272720 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE QUAD) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 209.454544 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE QUAD,LAZY) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 235.636368 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE SINGLE,SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 261.818176 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (LOG) dup stringwidth pop pop 0 0 moveto
show
grestore
grestore
0.700000 setlinewidth gsave newpath 0.000000 0.000000 moveto 0.000000 72.000000 lineto stroke
newpath 0.000000 0.000000 moveto -5.000000 0.000000 lineto stroke
newpath 0.000000 9.916304 moveto -2.000000 9.916304 lineto stroke
newpath 0.000000 19.832607 moveto -5.000000 19.832607 lineto stroke
newpath 0.000000 29.748911 moveto -2.000000 29.748911 lineto stroke
newpath 0.000000 39.665215 moveto -5.000000 39.665215 lineto stroke
newpath 0.000000 49.581520 moveto -2.000000 49.581520 lineto stroke
newpath 0.000000 59.497822 moveto -5.000000 59.497822 lineto stroke
newpath 0.000000 69.414124 moveto -2.000000 69.414124 lineto stroke
/Times-Roman findfont 9.000000 scalefont setfont
gsave -8.000000 0.000000 translate 0.000000 rotate
0 -2.700000 translate (0) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 19.832607 translate 0.000000 rotate
0 -2.700000 translate (2000) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 39.665215 translate 0.000000 rotate
0 -2.700000 translate (4000) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 59.497822 translate 0.000000 rotate
0 -2.700000 translate (6000) dup stringwidth pop neg 0 moveto
show
grestore
/Times-Bold findfont 10.000000 scalefont setfont
gsave -33.279999 36.000000 translate 90.000000 rotate
0 0.000000 translate (MB/s) dup stringwidth pop 2 div neg 0 moveto
show
grestore
grestore
gsave
gsave gsave 26.181818 9.564870 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -9.564870 lineto
13.090909 -9.564870 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 52.363636 15.887009 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -15.887009 lineto
13.090909 -15.887009 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 78.545456 20.109272 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -20.109272 lineto
13.090909 -20.109272 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 104.727272 26.881811 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -26.881811 lineto
13.090909 -26.881811 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 130.909088 4.538296 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -4.538296 lineto
13.090909 -4.538296 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 157.090912 8.978618 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -8.978618 lineto
13.090909 -8.978618 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 183.272720 13.178271 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -13.178271 lineto
13.090909 -13.178271 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 209.454544 11.003130 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -11.003130 lineto
13.090909 -11.003130 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 235.636368 72.000000 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -72.000000 lineto
13.090909 -72.000000 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
gsave gsave 261.818176 2.016877 translate 0.000000 rotate
newpath 13.090909 0.000000 moveto -13.090909 0.000000 lineto
-13.090909 -2.016877 lineto
13.090909 -2.016877 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
grestore
-0.000000 -0.000000 translate
grestore

1337
junk-w32-backup.c Normal file

File diff suppressed because it is too large

16
junk-w32-single-time.c Normal file

@ -0,0 +1,16 @@
echo "SHIFT" `gf_time 32 M 0 10240 10240 SHIFT - - | tail -n 1`
echo "GROUP 2 4" `gf_time 32 M 0 10240 10240 GROUP 2 4 - - | tail -n 1`
echo "GROUP 3 4" `gf_time 32 M 0 10240 10240 GROUP 3 4 - - | tail -n 1`
echo "GROUP 4 4" `gf_time 32 M 0 10240 10240 GROUP 4 4 - - | tail -n 1`
echo "GROUP 2 8" `gf_time 32 M 0 10240 10240 GROUP 2 8 - - | tail -n 1`
echo "GROUP 3 8" `gf_time 32 M 0 10240 10240 GROUP 3 8 - - | tail -n 1`
echo "GROUP 4 8" `gf_time 32 M 0 10240 10240 GROUP 4 8 - - | tail -n 1`
echo "GROUP 2 2" `gf_time 32 M 0 10240 10240 GROUP 2 2 - - | tail -n 1`
echo "GROUP 3 3" `gf_time 32 M 0 10240 10240 GROUP 3 3 - - | tail -n 1`
echo "BYTWO_p" `gf_time 32 M 0 10240 10240 BYTWO_p - - | tail -n 1`
echo "BYTWO_b" `gf_time 32 M 0 10240 10240 BYTWO_b - - | tail -n 1`
echo "SPLIT 32 2" `gf_time 32 M 0 10240 10240 SPLIT 32 2 - - | tail -n 1`
echo "SPLIT 32 4" `gf_time 32 M 0 10240 10240 SPLIT 32 4 - - | tail -n 1`
echo "SPLIT 32 8" `gf_time 32 M 0 10240 10240 SPLIT 32 8 - - | tail -n 1`
echo "SPLIT 8 8" `gf_time 32 M 0 10240 10240 SPLIT 8 8 - - | tail -n 1`
echo "COMPOSITE 2 16 -" `gf_time 32 M 0 10240 10240 COMPOSITE 2 16 - - - | tail -n 1`

60
junk-w4-out.txt Normal file

@ -0,0 +1,60 @@
Seed: 1345648646
Buffer-Const,s!=d,xor=0: 1.005451 s 971.268 MB/s
Buffer-Const,s!=d,xor=1: 1.029715 s 948.382 MB/s
Buffer-Const,s==d,xor=0: 0.989556 s 986.869 MB/s
Buffer-Const,s==d,xor=1: 1.026105 s 951.718 MB/s
BYTWO_p
Seed: 1345648655
Buffer-Const,s!=d,xor=0: 0.603574 s 1617.966 MB/s
Buffer-Const,s!=d,xor=1: 0.612757 s 1593.720 MB/s
Buffer-Const,s==d,xor=0: 0.599630 s 1628.609 MB/s
Buffer-Const,s==d,xor=1: 0.622749 s 1568.149 MB/s
BYTWO_p SSE
Seed: 1345648662
Buffer-Const,s!=d,xor=0: 0.487348 s 2003.831 MB/s
Buffer-Const,s!=d,xor=1: 0.488745 s 1998.100 MB/s
Buffer-Const,s==d,xor=0: 0.470528 s 2075.463 MB/s
Buffer-Const,s==d,xor=1: 0.480067 s 2034.223 MB/s
BYTWO_b
Seed: 1345648669
Buffer-Const,s!=d,xor=0: 0.359088 s 2719.564 MB/s
Buffer-Const,s!=d,xor=1: 0.365816 s 2669.543 MB/s
Buffer-Const,s==d,xor=0: 0.361701 s 2699.920 MB/s
Buffer-Const,s==d,xor=1: 0.354540 s 2754.449 MB/s
BYTWO_b SSE
Seed: 1345648689
Buffer-Const,s!=d,xor=0: 2.036338 s 479.568 MB/s
Buffer-Const,s!=d,xor=1: 2.237701 s 436.413 MB/s
Buffer-Const,s==d,xor=0: 2.048971 s 476.611 MB/s
Buffer-Const,s==d,xor=1: 2.229312 s 438.056 MB/s
TABLE SINGLE
Seed: 1345648703
Buffer-Const,s!=d,xor=0: 1.074082 s 909.207 MB/s
Buffer-Const,s!=d,xor=1: 1.083797 s 901.057 MB/s
Buffer-Const,s==d,xor=0: 1.077001 s 906.743 MB/s
Buffer-Const,s==d,xor=1: 1.079369 s 904.753 MB/s
TABLE DOUBLE
Seed: 1345648712
Buffer-Const,s!=d,xor=0: 0.743830 s 1312.884 MB/s
Buffer-Const,s!=d,xor=1: 0.760719 s 1283.736 MB/s
Buffer-Const,s==d,xor=0: 0.708908 s 1377.559 MB/s
Buffer-Const,s==d,xor=1: 0.727896 s 1341.624 MB/s
TABLE QUAD
Seed: 1345648720
Buffer-Const,s!=d,xor=0: 0.898810 s 1086.506 MB/s
Buffer-Const,s!=d,xor=1: 0.876269 s 1114.455 MB/s
Buffer-Const,s==d,xor=0: 0.872698 s 1119.015 MB/s
Buffer-Const,s==d,xor=1: 0.873175 s 1118.404 MB/s
TABLE QUAD,LAZY
Seed: 1345648729
Buffer-Const,s!=d,xor=0: 0.143798 s 6791.205 MB/s
Buffer-Const,s!=d,xor=1: 0.151166 s 6460.201 MB/s
Buffer-Const,s==d,xor=0: 0.123824 s 7886.721 MB/s
Buffer-Const,s==d,xor=1: 0.123538 s 7904.940 MB/s
TABLE SINGLE,SSE
Seed: 1345648748
Buffer-Const,s!=d,xor=0: 4.562493 s 214.041 MB/s
Buffer-Const,s!=d,xor=1: 5.116838 s 190.853 MB/s
Buffer-Const,s==d,xor=0: 4.533105 s 215.429 MB/s
Buffer-Const,s==d,xor=1: 5.053730 s 193.236 MB/s
LOG

792
junk-w4-timing-out.txt Normal file

@ -0,0 +1,792 @@
Seed: 1352748099
Buffer-Const,s!=d,xor=0: 0.608121 s 210.484 MB/s
Buffer-Const,s!=d,xor=1: 0.692329 s 184.883 MB/s
1024 131072 4 LOG - -
Seed: 1352748102
Buffer-Const,s!=d,xor=0: 0.699226 s 183.060 MB/s
Buffer-Const,s!=d,xor=1: 0.687310 s 186.233 MB/s
2048 65536 4 LOG - -
Seed: 1352748106
Buffer-Const,s!=d,xor=0: 0.604397 s 211.781 MB/s
Buffer-Const,s!=d,xor=1: 0.682591 s 187.521 MB/s
4096 32768 4 LOG - -
Seed: 1352748109
Buffer-Const,s!=d,xor=0: 0.602384 s 212.489 MB/s
Buffer-Const,s!=d,xor=1: 0.678849 s 188.555 MB/s
8192 16384 4 LOG - -
Seed: 1352748112
Buffer-Const,s!=d,xor=0: 0.602103 s 212.588 MB/s
Buffer-Const,s!=d,xor=1: 0.688450 s 185.925 MB/s
16384 8192 4 LOG - -
Seed: 1352748115
Buffer-Const,s!=d,xor=0: 0.598464 s 213.881 MB/s
Buffer-Const,s!=d,xor=1: 0.676076 s 189.328 MB/s
32768 4096 4 LOG - -
Seed: 1352748119
Buffer-Const,s!=d,xor=0: 0.611499 s 209.322 MB/s
Buffer-Const,s!=d,xor=1: 0.693351 s 184.611 MB/s
65536 2048 4 LOG - -
Seed: 1352748122
Buffer-Const,s!=d,xor=0: 0.609786 s 209.910 MB/s
Buffer-Const,s!=d,xor=1: 0.689794 s 185.563 MB/s
131072 1024 4 LOG - -
Seed: 1352748125
Buffer-Const,s!=d,xor=0: 0.619027 s 206.776 MB/s
Buffer-Const,s!=d,xor=1: 0.703627 s 181.915 MB/s
262144 512 4 LOG - -
Seed: 1352748129
Buffer-Const,s!=d,xor=0: 0.605785 s 211.296 MB/s
Buffer-Const,s!=d,xor=1: 0.696728 s 183.716 MB/s
524288 256 4 LOG - -
Seed: 1352748132
Buffer-Const,s!=d,xor=0: 0.591555 s 216.379 MB/s
Buffer-Const,s!=d,xor=1: 0.666735 s 191.980 MB/s
1048576 128 4 LOG - -
Seed: 1352748135
Buffer-Const,s!=d,xor=0: 0.623167 s 205.403 MB/s
Buffer-Const,s!=d,xor=1: 0.675010 s 189.627 MB/s
2097152 64 4 LOG - -
Seed: 1352748138
Buffer-Const,s!=d,xor=0: 0.572467 s 223.594 MB/s
Buffer-Const,s!=d,xor=1: 0.733714 s 174.455 MB/s
4194304 32 4 LOG - -
Seed: 1352748142
Buffer-Const,s!=d,xor=0: 0.617676 s 207.228 MB/s
Buffer-Const,s!=d,xor=1: 0.742744 s 172.334 MB/s
8388608 16 4 LOG - -
Seed: 1352748145
Buffer-Const,s!=d,xor=0: 0.579833 s 220.753 MB/s
Buffer-Const,s!=d,xor=1: 0.736355 s 173.829 MB/s
16777216 8 4 LOG - -
Seed: 1352748148
Buffer-Const,s!=d,xor=0: 0.682980 s 187.414 MB/s
Buffer-Const,s!=d,xor=1: 0.738846 s 173.243 MB/s
33554432 4 4 LOG - -
Seed: 1352748152
Buffer-Const,s!=d,xor=0: 0.692141 s 184.933 MB/s
Buffer-Const,s!=d,xor=1: 0.725968 s 176.316 MB/s
67108864 2 4 LOG - -
Seed: 1352748155
Buffer-Const,s!=d,xor=0: 0.737346 s 173.596 MB/s
Buffer-Const,s!=d,xor=1: 0.725769 s 176.365 MB/s
134217728 1 4 LOG - -
Seed: 1352748159
Buffer-Const,s!=d,xor=0: 0.252694 s 506.541 MB/s
Buffer-Const,s!=d,xor=1: 0.280102 s 456.976 MB/s
1024 131072 4 TABLE SINGLE -
Seed: 1352748160
Buffer-Const,s!=d,xor=0: 0.246866 s 518.501 MB/s
Buffer-Const,s!=d,xor=1: 0.276830 s 462.377 MB/s
2048 65536 4 TABLE SINGLE -
Seed: 1352748162
Buffer-Const,s!=d,xor=0: 0.246874 s 518.482 MB/s
Buffer-Const,s!=d,xor=1: 0.274016 s 467.125 MB/s
4096 32768 4 TABLE SINGLE -
Seed: 1352748164
Buffer-Const,s!=d,xor=0: 0.247869 s 516.402 MB/s
Buffer-Const,s!=d,xor=1: 0.271679 s 471.144 MB/s
8192 16384 4 TABLE SINGLE -
Seed: 1352748166
Buffer-Const,s!=d,xor=0: 0.244581 s 523.345 MB/s
Buffer-Const,s!=d,xor=1: 0.270779 s 472.710 MB/s
16384 8192 4 TABLE SINGLE -
Seed: 1352748167
Buffer-Const,s!=d,xor=0: 0.256167 s 499.675 MB/s
Buffer-Const,s!=d,xor=1: 0.278188 s 460.121 MB/s
32768 4096 4 TABLE SINGLE -
Seed: 1352748169
Buffer-Const,s!=d,xor=0: 0.248786 s 514.498 MB/s
Buffer-Const,s!=d,xor=1: 0.305109 s 419.522 MB/s
65536 2048 4 TABLE SINGLE -
Seed: 1352748171
Buffer-Const,s!=d,xor=0: 0.249003 s 514.050 MB/s
Buffer-Const,s!=d,xor=1: 0.276043 s 463.696 MB/s
131072 1024 4 TABLE SINGLE -
Seed: 1352748173
Buffer-Const,s!=d,xor=0: 0.249019 s 514.016 MB/s
Buffer-Const,s!=d,xor=1: 0.278464 s 459.665 MB/s
262144 512 4 TABLE SINGLE -
Seed: 1352748174
Buffer-Const,s!=d,xor=0: 0.257905 s 496.308 MB/s
Buffer-Const,s!=d,xor=1: 0.266241 s 480.767 MB/s
524288 256 4 TABLE SINGLE -
Seed: 1352748176
Buffer-Const,s!=d,xor=0: 0.254655 s 502.641 MB/s
Buffer-Const,s!=d,xor=1: 0.267730 s 478.093 MB/s
1048576 128 4 TABLE SINGLE -
Seed: 1352748178
Buffer-Const,s!=d,xor=0: 0.264532 s 483.874 MB/s
Buffer-Const,s!=d,xor=1: 0.270533 s 473.140 MB/s
2097152 64 4 TABLE SINGLE -
Seed: 1352748180
Buffer-Const,s!=d,xor=0: 0.249658 s 512.702 MB/s
Buffer-Const,s!=d,xor=1: 0.265106 s 482.826 MB/s
4194304 32 4 TABLE SINGLE -
Seed: 1352748181
Buffer-Const,s!=d,xor=0: 0.244030 s 524.527 MB/s
Buffer-Const,s!=d,xor=1: 0.301052 s 425.176 MB/s
8388608 16 4 TABLE SINGLE -
Seed: 1352748183
Buffer-Const,s!=d,xor=0: 0.263009 s 486.676 MB/s
Buffer-Const,s!=d,xor=1: 0.270075 s 473.943 MB/s
16777216 8 4 TABLE SINGLE -
Seed: 1352748185
Buffer-Const,s!=d,xor=0: 0.318133 s 402.348 MB/s
Buffer-Const,s!=d,xor=1: 0.315726 s 405.415 MB/s
33554432 4 4 TABLE SINGLE -
Seed: 1352748187
Buffer-Const,s!=d,xor=0: 0.329082 s 388.961 MB/s
Buffer-Const,s!=d,xor=1: 0.303774 s 421.366 MB/s
67108864 2 4 TABLE SINGLE -
Seed: 1352748189
Buffer-Const,s!=d,xor=0: 0.373282 s 342.904 MB/s
Buffer-Const,s!=d,xor=1: 0.299255 s 427.729 MB/s
134217728 1 4 TABLE SINGLE -
Seed: 1352748191
Buffer-Const,s!=d,xor=0: 0.026432 s 4842.652 MB/s
Buffer-Const,s!=d,xor=1: 0.028027 s 4566.976 MB/s
1024 131072 4 TABLE SINGLE,SSE -
Seed: 1352748192
Buffer-Const,s!=d,xor=0: 0.020923 s 6117.629 MB/s
Buffer-Const,s!=d,xor=1: 0.021753 s 5884.226 MB/s
2048 65536 4 TABLE SINGLE,SSE -
Seed: 1352748193
Buffer-Const,s!=d,xor=0: 0.017533 s 7300.592 MB/s
Buffer-Const,s!=d,xor=1: 0.018308 s 6991.599 MB/s
4096 32768 4 TABLE SINGLE,SSE -
Seed: 1352748193
Buffer-Const,s!=d,xor=0: 0.016224 s 7889.591 MB/s
Buffer-Const,s!=d,xor=1: 0.016537 s 7740.353 MB/s
8192 16384 4 TABLE SINGLE,SSE -
Seed: 1352748194
Buffer-Const,s!=d,xor=0: 0.015627 s 8191.000 MB/s
Buffer-Const,s!=d,xor=1: 0.016160 s 7921.020 MB/s
16384 8192 4 TABLE SINGLE,SSE -
Seed: 1352748195
Buffer-Const,s!=d,xor=0: 0.015679 s 8163.599 MB/s
Buffer-Const,s!=d,xor=1: 0.016548 s 7735.000 MB/s
32768 4096 4 TABLE SINGLE,SSE -
Seed: 1352748196
Buffer-Const,s!=d,xor=0: 0.016351 s 7828.046 MB/s
Buffer-Const,s!=d,xor=1: 0.017147 s 7464.939 MB/s
65536 2048 4 TABLE SINGLE,SSE -
Seed: 1352748196
Buffer-Const,s!=d,xor=0: 0.015204 s 8418.863 MB/s
Buffer-Const,s!=d,xor=1: 0.016621 s 7701.049 MB/s
131072 1024 4 TABLE SINGLE,SSE -
Seed: 1352748197
Buffer-Const,s!=d,xor=0: 0.019366 s 6609.594 MB/s
Buffer-Const,s!=d,xor=1: 0.020611 s 6210.405 MB/s
262144 512 4 TABLE SINGLE,SSE -
Seed: 1352748198
Buffer-Const,s!=d,xor=0: 0.019287 s 6636.721 MB/s
Buffer-Const,s!=d,xor=1: 0.020470 s 6253.155 MB/s
524288 256 4 TABLE SINGLE,SSE -
Seed: 1352748199
Buffer-Const,s!=d,xor=0: 0.019210 s 6663.244 MB/s
Buffer-Const,s!=d,xor=1: 0.021175 s 6044.754 MB/s
1048576 128 4 TABLE SINGLE,SSE -
Seed: 1352748199
Buffer-Const,s!=d,xor=0: 0.035533 s 3602.314 MB/s
Buffer-Const,s!=d,xor=1: 0.032351 s 3956.628 MB/s
2097152 64 4 TABLE SINGLE,SSE -
Seed: 1352748200
Buffer-Const,s!=d,xor=0: 0.048733 s 2626.557 MB/s
Buffer-Const,s!=d,xor=1: 0.044163 s 2898.370 MB/s
4194304 32 4 TABLE SINGLE,SSE -
Seed: 1352748201
Buffer-Const,s!=d,xor=0: 0.051737 s 2474.071 MB/s
Buffer-Const,s!=d,xor=1: 0.048826 s 2621.555 MB/s
8388608 16 4 TABLE SINGLE,SSE -
Seed: 1352748202
Buffer-Const,s!=d,xor=0: 0.056330 s 2272.306 MB/s
Buffer-Const,s!=d,xor=1: 0.029557 s 4330.617 MB/s
16777216 8 4 TABLE SINGLE,SSE -
Seed: 1352748203
Buffer-Const,s!=d,xor=0: 0.066551 s 1923.338 MB/s
Buffer-Const,s!=d,xor=1: 0.037378 s 3424.489 MB/s
33554432 4 4 TABLE SINGLE,SSE -
Seed: 1352748203
Buffer-Const,s!=d,xor=0: 0.082171 s 1557.728 MB/s
Buffer-Const,s!=d,xor=1: 0.048228 s 2654.058 MB/s
67108864 2 4 TABLE SINGLE,SSE -
Seed: 1352748204
Buffer-Const,s!=d,xor=0: 0.125187 s 1022.469 MB/s
Buffer-Const,s!=d,xor=1: 0.047497 s 2694.905 MB/s
134217728 1 4 TABLE SINGLE,SSE -
Seed: 1352748205
Buffer-Const,s!=d,xor=0: 0.151542 s 844.651 MB/s
Buffer-Const,s!=d,xor=1: 0.153138 s 835.847 MB/s
1024 131072 4 TABLE DOUBLE -
Seed: 1352748207
Buffer-Const,s!=d,xor=0: 0.146267 s 875.111 MB/s
Buffer-Const,s!=d,xor=1: 0.150025 s 853.189 MB/s
2048 65536 4 TABLE DOUBLE -
Seed: 1352748208
Buffer-Const,s!=d,xor=0: 0.145038 s 882.529 MB/s
Buffer-Const,s!=d,xor=1: 0.146365 s 874.525 MB/s
4096 32768 4 TABLE DOUBLE -
Seed: 1352748209
Buffer-Const,s!=d,xor=0: 0.142601 s 897.608 MB/s
Buffer-Const,s!=d,xor=1: 0.144650 s 884.893 MB/s
8192 16384 4 TABLE DOUBLE -
Seed: 1352748211
Buffer-Const,s!=d,xor=0: 0.141861 s 902.293 MB/s
Buffer-Const,s!=d,xor=1: 0.142722 s 896.848 MB/s
16384 8192 4 TABLE DOUBLE -
Seed: 1352748212
Buffer-Const,s!=d,xor=0: 0.140131 s 913.433 MB/s
Buffer-Const,s!=d,xor=1: 0.143035 s 894.888 MB/s
32768 4096 4 TABLE DOUBLE -
Seed: 1352748213
Buffer-Const,s!=d,xor=0: 0.141368 s 905.436 MB/s
Buffer-Const,s!=d,xor=1: 0.142083 s 900.879 MB/s
65536 2048 4 TABLE DOUBLE -
Seed: 1352748214
Buffer-Const,s!=d,xor=0: 0.144412 s 886.351 MB/s
Buffer-Const,s!=d,xor=1: 0.145837 s 877.693 MB/s
131072 1024 4 TABLE DOUBLE -
Seed: 1352748216
Buffer-Const,s!=d,xor=0: 0.141466 s 904.810 MB/s
Buffer-Const,s!=d,xor=1: 0.146338 s 874.686 MB/s
262144 512 4 TABLE DOUBLE -
Seed: 1352748217
Buffer-Const,s!=d,xor=0: 0.141775 s 902.837 MB/s
Buffer-Const,s!=d,xor=1: 0.143733 s 890.543 MB/s
524288 256 4 TABLE DOUBLE -
Seed: 1352748218
Buffer-Const,s!=d,xor=0: 0.144309 s 886.984 MB/s
Buffer-Const,s!=d,xor=1: 0.145978 s 876.843 MB/s
1048576 128 4 TABLE DOUBLE -
Seed: 1352748219
Buffer-Const,s!=d,xor=0: 0.145523 s 879.584 MB/s
Buffer-Const,s!=d,xor=1: 0.152104 s 841.530 MB/s
2097152 64 4 TABLE DOUBLE -
Seed: 1352748221
Buffer-Const,s!=d,xor=0: 0.150421 s 850.944 MB/s
Buffer-Const,s!=d,xor=1: 0.154586 s 828.018 MB/s
4194304 32 4 TABLE DOUBLE -
Seed: 1352748222
Buffer-Const,s!=d,xor=0: 0.151304 s 845.978 MB/s
Buffer-Const,s!=d,xor=1: 0.151530 s 844.720 MB/s
8388608 16 4 TABLE DOUBLE -
Seed: 1352748223
Buffer-Const,s!=d,xor=0: 0.160126 s 799.369 MB/s
Buffer-Const,s!=d,xor=1: 0.151316 s 845.910 MB/s
16777216 8 4 TABLE DOUBLE -
Seed: 1352748224
Buffer-Const,s!=d,xor=0: 0.167688 s 763.323 MB/s
Buffer-Const,s!=d,xor=1: 0.152321 s 840.331 MB/s
33554432 4 4 TABLE DOUBLE -
Seed: 1352748226
Buffer-Const,s!=d,xor=0: 0.194515 s 658.047 MB/s
Buffer-Const,s!=d,xor=1: 0.149023 s 858.929 MB/s
67108864 2 4 TABLE DOUBLE -
Seed: 1352748227
Buffer-Const,s!=d,xor=0: 0.237898 s 538.046 MB/s
Buffer-Const,s!=d,xor=1: 0.148526 s 861.802 MB/s
134217728 1 4 TABLE DOUBLE -
Seed: 1352748229
Buffer-Const,s!=d,xor=0: 0.151483 s 844.979 MB/s
Buffer-Const,s!=d,xor=1: 0.153012 s 836.535 MB/s
1024 131072 4 TABLE DOUBLE -
Seed: 1352748230
Buffer-Const,s!=d,xor=0: 0.146577 s 873.259 MB/s
Buffer-Const,s!=d,xor=1: 0.146274 s 875.070 MB/s
2048 65536 4 TABLE DOUBLE -
Seed: 1352748231
Buffer-Const,s!=d,xor=0: 0.145069 s 882.341 MB/s
Buffer-Const,s!=d,xor=1: 0.143911 s 889.436 MB/s
4096 32768 4 TABLE DOUBLE -
Seed: 1352748233
Buffer-Const,s!=d,xor=0: 0.143011 s 895.035 MB/s
Buffer-Const,s!=d,xor=1: 0.142096 s 900.798 MB/s
8192 16384 4 TABLE DOUBLE -
Seed: 1352748234
Buffer-Const,s!=d,xor=0: 0.142743 s 896.719 MB/s
Buffer-Const,s!=d,xor=1: 0.142004 s 901.383 MB/s
16384 8192 4 TABLE DOUBLE -
Seed: 1352748235
Buffer-Const,s!=d,xor=0: 0.141290 s 905.940 MB/s
Buffer-Const,s!=d,xor=1: 0.142891 s 895.785 MB/s
32768 4096 4 TABLE DOUBLE -
Seed: 1352748236
Buffer-Const,s!=d,xor=0: 0.141509 s 904.534 MB/s
Buffer-Const,s!=d,xor=1: 0.142357 s 899.150 MB/s
65536 2048 4 TABLE DOUBLE -
Seed: 1352748237
Buffer-Const,s!=d,xor=0: 0.141353 s 905.532 MB/s
Buffer-Const,s!=d,xor=1: 0.147224 s 869.422 MB/s
131072 1024 4 TABLE DOUBLE -
Seed: 1352748239
Buffer-Const,s!=d,xor=0: 0.142758 s 896.623 MB/s
Buffer-Const,s!=d,xor=1: 0.144537 s 885.585 MB/s
262144 512 4 TABLE DOUBLE -
Seed: 1352748240
Buffer-Const,s!=d,xor=0: 0.141772 s 902.858 MB/s
Buffer-Const,s!=d,xor=1: 0.145832 s 877.723 MB/s
524288 256 4 TABLE DOUBLE -
Seed: 1352748241
Buffer-Const,s!=d,xor=0: 0.142111 s 900.705 MB/s
Buffer-Const,s!=d,xor=1: 0.143957 s 889.155 MB/s
1048576 128 4 TABLE DOUBLE -
Seed: 1352748242
Buffer-Const,s!=d,xor=0: 0.144863 s 883.596 MB/s
Buffer-Const,s!=d,xor=1: 0.148948 s 859.359 MB/s
2097152 64 4 TABLE DOUBLE -
Seed: 1352748244
Buffer-Const,s!=d,xor=0: 0.150453 s 850.766 MB/s
Buffer-Const,s!=d,xor=1: 0.151897 s 842.677 MB/s
4194304 32 4 TABLE DOUBLE -
Seed: 1352748245
Buffer-Const,s!=d,xor=0: 0.152495 s 839.371 MB/s
Buffer-Const,s!=d,xor=1: 0.153424 s 834.289 MB/s
8388608 16 4 TABLE DOUBLE -
Seed: 1352748246
Buffer-Const,s!=d,xor=0: 0.159227 s 803.886 MB/s
Buffer-Const,s!=d,xor=1: 0.151101 s 847.118 MB/s
16777216 8 4 TABLE DOUBLE -
Seed: 1352748248
Buffer-Const,s!=d,xor=0: 0.167903 s 762.344 MB/s
Buffer-Const,s!=d,xor=1: 0.152000 s 842.106 MB/s
33554432 4 4 TABLE DOUBLE -
Seed: 1352748249
Buffer-Const,s!=d,xor=0: 0.193370 s 661.943 MB/s
Buffer-Const,s!=d,xor=1: 0.153193 s 835.547 MB/s
67108864 2 4 TABLE DOUBLE -
Seed: 1352748250
Buffer-Const,s!=d,xor=0: 0.241834 s 529.288 MB/s
Buffer-Const,s!=d,xor=1: 0.150811 s 848.745 MB/s
134217728 1 4 TABLE DOUBLE -
Seed: 1352748252
Buffer-Const,s!=d,xor=0: 0.158047 s 809.887 MB/s
Buffer-Const,s!=d,xor=1: 0.156660 s 817.057 MB/s
1024 131072 4 TABLE QUAD -
Seed: 1352748253
Buffer-Const,s!=d,xor=0: 0.141239 s 906.264 MB/s
Buffer-Const,s!=d,xor=1: 0.146382 s 874.422 MB/s
2048 65536 4 TABLE QUAD -
Seed: 1352748254
Buffer-Const,s!=d,xor=0: 0.134986 s 948.245 MB/s
Buffer-Const,s!=d,xor=1: 0.140656 s 910.023 MB/s
4096 32768 4 TABLE QUAD -
Seed: 1352748256
Buffer-Const,s!=d,xor=0: 0.153383 s 834.514 MB/s
Buffer-Const,s!=d,xor=1: 0.128968 s 992.498 MB/s
8192 16384 4 TABLE QUAD -
Seed: 1352748257
Buffer-Const,s!=d,xor=0: 0.120985 s 1057.984 MB/s
Buffer-Const,s!=d,xor=1: 0.121486 s 1053.618 MB/s
16384 8192 4 TABLE QUAD -
Seed: 1352748258
Buffer-Const,s!=d,xor=0: 0.113212 s 1130.626 MB/s
Buffer-Const,s!=d,xor=1: 0.116994 s 1094.076 MB/s
32768 4096 4 TABLE QUAD -
Seed: 1352748259
Buffer-Const,s!=d,xor=0: 0.106910 s 1197.266 MB/s
Buffer-Const,s!=d,xor=1: 0.109951 s 1164.152 MB/s
65536 2048 4 TABLE QUAD -
Seed: 1352748260
Buffer-Const,s!=d,xor=0: 0.106585 s 1200.916 MB/s
Buffer-Const,s!=d,xor=1: 0.119656 s 1069.735 MB/s
131072 1024 4 TABLE QUAD -
Seed: 1352748261
Buffer-Const,s!=d,xor=0: 0.108813 s 1176.332 MB/s
Buffer-Const,s!=d,xor=1: 0.109021 s 1174.081 MB/s
262144 512 4 TABLE QUAD -
Seed: 1352748263
Buffer-Const,s!=d,xor=0: 0.103341 s 1238.614 MB/s
Buffer-Const,s!=d,xor=1: 0.108952 s 1174.826 MB/s
524288 256 4 TABLE QUAD -
Seed: 1352748264
Buffer-Const,s!=d,xor=0: 0.105469 s 1213.627 MB/s
Buffer-Const,s!=d,xor=1: 0.110848 s 1154.735 MB/s
1048576 128 4 TABLE QUAD -
Seed: 1352748265
Buffer-Const,s!=d,xor=0: 0.105542 s 1212.785 MB/s
Buffer-Const,s!=d,xor=1: 0.108646 s 1178.134 MB/s
2097152 64 4 TABLE QUAD -
Seed: 1352748266
Buffer-Const,s!=d,xor=0: 0.106677 s 1199.889 MB/s
Buffer-Const,s!=d,xor=1: 0.112022 s 1142.631 MB/s
4194304 32 4 TABLE QUAD -
Seed: 1352748267
Buffer-Const,s!=d,xor=0: 0.110966 s 1153.507 MB/s
Buffer-Const,s!=d,xor=1: 0.100766 s 1270.264 MB/s
8388608 16 4 TABLE QUAD -
Seed: 1352748268
Buffer-Const,s!=d,xor=0: 0.108207 s 1182.915 MB/s
Buffer-Const,s!=d,xor=1: 0.113488 s 1127.871 MB/s
16777216 8 4 TABLE QUAD -
Seed: 1352748269
Buffer-Const,s!=d,xor=0: 0.129142 s 991.157 MB/s
Buffer-Const,s!=d,xor=1: 0.110923 s 1153.953 MB/s
33554432 4 4 TABLE QUAD -
Seed: 1352748270
Buffer-Const,s!=d,xor=0: 0.156426 s 818.279 MB/s
Buffer-Const,s!=d,xor=1: 0.110093 s 1162.652 MB/s
67108864 2 4 TABLE QUAD -
Seed: 1352748272
Buffer-Const,s!=d,xor=0: 0.203508 s 628.967 MB/s
Buffer-Const,s!=d,xor=1: 0.111907 s 1143.807 MB/s
134217728 1 4 TABLE QUAD -
Seed: 1352748273
Buffer-Const,s!=d,xor=0: 8.741033 s 14.644 MB/s
Buffer-Const,s!=d,xor=1: 8.972750 s 14.265 MB/s
1024 131072 4 TABLE QUAD,LAZY -
Seed: 1352748309
Buffer-Const,s!=d,xor=0: 4.387740 s 29.172 MB/s
Buffer-Const,s!=d,xor=1: 4.401799 s 29.079 MB/s
2048 65536 4 TABLE QUAD,LAZY -
Seed: 1352748327
Buffer-Const,s!=d,xor=0: 2.255454 s 56.751 MB/s
Buffer-Const,s!=d,xor=1: 2.243299 s 57.059 MB/s
4096 32768 4 TABLE QUAD,LAZY -
Seed: 1352748337
Buffer-Const,s!=d,xor=0: 1.166870 s 109.695 MB/s
Buffer-Const,s!=d,xor=1: 1.180004 s 108.474 MB/s
8192 16384 4 TABLE QUAD,LAZY -
Seed: 1352748342
Buffer-Const,s!=d,xor=0: 0.661613 s 193.467 MB/s
Buffer-Const,s!=d,xor=1: 0.629827 s 203.230 MB/s
16384 8192 4 TABLE QUAD,LAZY -
Seed: 1352748345
Buffer-Const,s!=d,xor=0: 0.364647 s 351.024 MB/s
Buffer-Const,s!=d,xor=1: 0.376395 s 340.069 MB/s
32768 4096 4 TABLE QUAD,LAZY -
Seed: 1352748348
Buffer-Const,s!=d,xor=0: 0.226271 s 565.694 MB/s
Buffer-Const,s!=d,xor=1: 0.234560 s 545.704 MB/s
65536 2048 4 TABLE QUAD,LAZY -
Seed: 1352748349
Buffer-Const,s!=d,xor=0: 0.160475 s 797.630 MB/s
Buffer-Const,s!=d,xor=1: 0.166329 s 769.561 MB/s
131072 1024 4 TABLE QUAD,LAZY -
Seed: 1352748351
Buffer-Const,s!=d,xor=0: 0.130999 s 977.110 MB/s
Buffer-Const,s!=d,xor=1: 0.134676 s 950.431 MB/s
262144 512 4 TABLE QUAD,LAZY -
Seed: 1352748352
Buffer-Const,s!=d,xor=0: 0.110626 s 1157.057 MB/s
Buffer-Const,s!=d,xor=1: 0.118067 s 1084.134 MB/s
524288 256 4 TABLE QUAD,LAZY -
Seed: 1352748353
Buffer-Const,s!=d,xor=0: 0.105213 s 1216.581 MB/s
Buffer-Const,s!=d,xor=1: 0.109697 s 1166.854 MB/s
1048576 128 4 TABLE QUAD,LAZY -
Seed: 1352748354
Buffer-Const,s!=d,xor=0: 0.107641 s 1189.138 MB/s
Buffer-Const,s!=d,xor=1: 0.108062 s 1184.502 MB/s
2097152 64 4 TABLE QUAD,LAZY -
Seed: 1352748355
Buffer-Const,s!=d,xor=0: 0.103473 s 1237.035 MB/s
Buffer-Const,s!=d,xor=1: 0.098362 s 1301.310 MB/s
4194304 32 4 TABLE QUAD,LAZY -
Seed: 1352748356
Buffer-Const,s!=d,xor=0: 0.107058 s 1195.616 MB/s
Buffer-Const,s!=d,xor=1: 0.097883 s 1307.687 MB/s
8388608 16 4 TABLE QUAD,LAZY -
Seed: 1352748357
Buffer-Const,s!=d,xor=0: 0.116388 s 1099.769 MB/s
Buffer-Const,s!=d,xor=1: 0.098690 s 1296.990 MB/s
16777216 8 4 TABLE QUAD,LAZY -
Seed: 1352748358
Buffer-Const,s!=d,xor=0: 0.129120 s 991.325 MB/s
Buffer-Const,s!=d,xor=1: 0.109833 s 1165.403 MB/s
33554432 4 4 TABLE QUAD,LAZY -
Seed: 1352748360
Buffer-Const,s!=d,xor=0: 0.157534 s 812.524 MB/s
Buffer-Const,s!=d,xor=1: 0.114721 s 1115.750 MB/s
67108864 2 4 TABLE QUAD,LAZY -
Seed: 1352748361
Buffer-Const,s!=d,xor=0: 0.205053 s 624.229 MB/s
Buffer-Const,s!=d,xor=1: 0.110099 s 1162.589 MB/s
134217728 1 4 TABLE QUAD,LAZY -
Seed: 1352748362
Buffer-Const,s!=d,xor=0: 0.142388 s 898.955 MB/s
Buffer-Const,s!=d,xor=1: 0.146045 s 876.440 MB/s
1024 131072 4 BYTWO_p - -
Seed: 1352748363
Buffer-Const,s!=d,xor=0: 0.135040 s 947.867 MB/s
Buffer-Const,s!=d,xor=1: 0.140142 s 913.360 MB/s
2048 65536 4 BYTWO_p - -
Seed: 1352748365
Buffer-Const,s!=d,xor=0: 0.131358 s 974.437 MB/s
Buffer-Const,s!=d,xor=1: 0.137115 s 933.525 MB/s
4096 32768 4 BYTWO_p - -
Seed: 1352748366
Buffer-Const,s!=d,xor=0: 0.129772 s 986.347 MB/s
Buffer-Const,s!=d,xor=1: 0.135098 s 947.462 MB/s
8192 16384 4 BYTWO_p - -
Seed: 1352748367
Buffer-Const,s!=d,xor=0: 0.128670 s 994.795 MB/s
Buffer-Const,s!=d,xor=1: 0.133591 s 958.145 MB/s
16384 8192 4 BYTWO_p - -
Seed: 1352748368
Buffer-Const,s!=d,xor=0: 0.130064 s 984.129 MB/s
Buffer-Const,s!=d,xor=1: 0.135170 s 946.959 MB/s
32768 4096 4 BYTWO_p - -
Seed: 1352748369
Buffer-Const,s!=d,xor=0: 0.129942 s 985.052 MB/s
Buffer-Const,s!=d,xor=1: 0.134780 s 949.695 MB/s
65536 2048 4 BYTWO_p - -
Seed: 1352748371
Buffer-Const,s!=d,xor=0: 0.130649 s 979.725 MB/s
Buffer-Const,s!=d,xor=1: 0.134556 s 951.280 MB/s
131072 1024 4 BYTWO_p - -
Seed: 1352748372
Buffer-Const,s!=d,xor=0: 0.129390 s 989.255 MB/s
Buffer-Const,s!=d,xor=1: 0.134418 s 952.257 MB/s
262144 512 4 BYTWO_p - -
Seed: 1352748373
Buffer-Const,s!=d,xor=0: 0.130153 s 983.455 MB/s
Buffer-Const,s!=d,xor=1: 0.137027 s 934.126 MB/s
524288 256 4 BYTWO_p - -
Seed: 1352748374
Buffer-Const,s!=d,xor=0: 0.128065 s 999.493 MB/s
Buffer-Const,s!=d,xor=1: 0.136548 s 937.402 MB/s
1048576 128 4 BYTWO_p - -
Seed: 1352748375
Buffer-Const,s!=d,xor=0: 0.137841 s 928.608 MB/s
Buffer-Const,s!=d,xor=1: 0.149983 s 853.428 MB/s
2097152 64 4 BYTWO_p - -
Seed: 1352748377
Buffer-Const,s!=d,xor=0: 0.143009 s 895.049 MB/s
Buffer-Const,s!=d,xor=1: 0.151799 s 843.218 MB/s
4194304 32 4 BYTWO_p - -
Seed: 1352748378
Buffer-Const,s!=d,xor=0: 0.148001 s 864.859 MB/s
Buffer-Const,s!=d,xor=1: 0.150979 s 847.802 MB/s
8388608 16 4 BYTWO_p - -
Seed: 1352748379
Buffer-Const,s!=d,xor=0: 0.153637 s 833.133 MB/s
Buffer-Const,s!=d,xor=1: 0.133152 s 961.307 MB/s
16777216 8 4 BYTWO_p - -
Seed: 1352748380
Buffer-Const,s!=d,xor=0: 0.164125 s 779.894 MB/s
Buffer-Const,s!=d,xor=1: 0.150620 s 849.821 MB/s
33554432 4 4 BYTWO_p - -
Seed: 1352748382
Buffer-Const,s!=d,xor=0: 0.188526 s 678.952 MB/s
Buffer-Const,s!=d,xor=1: 0.153114 s 835.979 MB/s
67108864 2 4 BYTWO_p - -
Seed: 1352748383
Buffer-Const,s!=d,xor=0: 0.235626 s 543.234 MB/s
Buffer-Const,s!=d,xor=1: 0.158839 s 805.847 MB/s
134217728 1 4 BYTWO_p - -
Seed: 1352748385
Buffer-Const,s!=d,xor=0: 0.076323 s 1677.087 MB/s
Buffer-Const,s!=d,xor=1: 0.077654 s 1648.345 MB/s
1024 131072 4 BYTWO_b - -
Seed: 1352748386
Buffer-Const,s!=d,xor=0: 0.068027 s 1881.605 MB/s
Buffer-Const,s!=d,xor=1: 0.070778 s 1808.462 MB/s
2048 65536 4 BYTWO_b - -
Seed: 1352748387
Buffer-Const,s!=d,xor=0: 0.065722 s 1947.591 MB/s
Buffer-Const,s!=d,xor=1: 0.068535 s 1867.669 MB/s
4096 32768 4 BYTWO_b - -
Seed: 1352748388
Buffer-Const,s!=d,xor=0: 0.063732 s 2008.398 MB/s
Buffer-Const,s!=d,xor=1: 0.066054 s 1937.805 MB/s
8192 16384 4 BYTWO_b - -
Seed: 1352748389
Buffer-Const,s!=d,xor=0: 0.062660 s 2042.779 MB/s
Buffer-Const,s!=d,xor=1: 0.065213 s 1962.793 MB/s
16384 8192 4 BYTWO_b - -
Seed: 1352748390
Buffer-Const,s!=d,xor=0: 0.062758 s 2039.566 MB/s
Buffer-Const,s!=d,xor=1: 0.066957 s 1911.668 MB/s
32768 4096 4 BYTWO_b - -
Seed: 1352748390
Buffer-Const,s!=d,xor=0: 0.063058 s 2029.865 MB/s
Buffer-Const,s!=d,xor=1: 0.065829 s 1944.424 MB/s
65536 2048 4 BYTWO_b - -
Seed: 1352748391
Buffer-Const,s!=d,xor=0: 0.065844 s 1943.994 MB/s
Buffer-Const,s!=d,xor=1: 0.065374 s 1957.968 MB/s
131072 1024 4 BYTWO_b - -
Seed: 1352748392
Buffer-Const,s!=d,xor=0: 0.062168 s 2058.949 MB/s
Buffer-Const,s!=d,xor=1: 0.068710 s 1862.906 MB/s
262144 512 4 BYTWO_b - -
Seed: 1352748393
Buffer-Const,s!=d,xor=0: 0.062623 s 2043.984 MB/s
Buffer-Const,s!=d,xor=1: 0.066550 s 1923.379 MB/s
524288 256 4 BYTWO_b - -
Seed: 1352748394
Buffer-Const,s!=d,xor=0: 0.064571 s 1982.317 MB/s
Buffer-Const,s!=d,xor=1: 0.061325 s 2087.246 MB/s
1048576 128 4 BYTWO_b - -
Seed: 1352748395
Buffer-Const,s!=d,xor=0: 0.070771 s 1808.657 MB/s
Buffer-Const,s!=d,xor=1: 0.072981 s 1753.878 MB/s
2097152 64 4 BYTWO_b - -
Seed: 1352748396
Buffer-Const,s!=d,xor=0: 0.078018 s 1640.643 MB/s
Buffer-Const,s!=d,xor=1: 0.072307 s 1770.227 MB/s
4194304 32 4 BYTWO_b - -
Seed: 1352748397
Buffer-Const,s!=d,xor=0: 0.079478 s 1610.508 MB/s
Buffer-Const,s!=d,xor=1: 0.073757 s 1735.424 MB/s
8388608 16 4 BYTWO_b - -
Seed: 1352748398
Buffer-Const,s!=d,xor=0: 0.085826 s 1491.383 MB/s
Buffer-Const,s!=d,xor=1: 0.087615 s 1460.945 MB/s
16777216 8 4 BYTWO_b - -
Seed: 1352748399
Buffer-Const,s!=d,xor=0: 0.081822 s 1564.373 MB/s
Buffer-Const,s!=d,xor=1: 0.083410 s 1534.583 MB/s
33554432 4 4 BYTWO_b - -
Seed: 1352748400
Buffer-Const,s!=d,xor=0: 0.101873 s 1256.467 MB/s
Buffer-Const,s!=d,xor=1: 0.074412 s 1720.150 MB/s
67108864 2 4 BYTWO_b - -
Seed: 1352748401
Buffer-Const,s!=d,xor=0: 0.188405 s 679.387 MB/s
Buffer-Const,s!=d,xor=1: 0.053904 s 2374.589 MB/s
134217728 1 4 BYTWO_b - -
Seed: 1352748403
Buffer-Const,s!=d,xor=0: 0.092518 s 1383.520 MB/s
Buffer-Const,s!=d,xor=1: 0.097347 s 1314.877 MB/s
1024 131072 4 BYTWO_p SSE -
Seed: 1352748404
Buffer-Const,s!=d,xor=0: 0.086226 s 1484.463 MB/s
Buffer-Const,s!=d,xor=1: 0.092092 s 1389.910 MB/s
2048 65536 4 BYTWO_p SSE -
Seed: 1352748405
Buffer-Const,s!=d,xor=0: 0.082721 s 1547.370 MB/s
Buffer-Const,s!=d,xor=1: 0.088092 s 1453.025 MB/s
4096 32768 4 BYTWO_p SSE -
Seed: 1352748406
Buffer-Const,s!=d,xor=0: 0.081612 s 1568.395 MB/s
Buffer-Const,s!=d,xor=1: 0.086144 s 1485.885 MB/s
8192 16384 4 BYTWO_p SSE -
Seed: 1352748407
Buffer-Const,s!=d,xor=0: 0.080819 s 1583.783 MB/s
Buffer-Const,s!=d,xor=1: 0.085448 s 1497.982 MB/s
16384 8192 4 BYTWO_p SSE -
Seed: 1352748408
Buffer-Const,s!=d,xor=0: 0.080971 s 1580.804 MB/s
Buffer-Const,s!=d,xor=1: 0.086504 s 1479.709 MB/s
32768 4096 4 BYTWO_p SSE -
Seed: 1352748409
Buffer-Const,s!=d,xor=0: 0.080746 s 1585.214 MB/s
Buffer-Const,s!=d,xor=1: 0.085679 s 1493.943 MB/s
65536 2048 4 BYTWO_p SSE -
Seed: 1352748410
Buffer-Const,s!=d,xor=0: 0.081038 s 1579.511 MB/s
Buffer-Const,s!=d,xor=1: 0.086381 s 1481.804 MB/s
131072 1024 4 BYTWO_p SSE -
Seed: 1352748411
Buffer-Const,s!=d,xor=0: 0.079807 s 1603.873 MB/s
Buffer-Const,s!=d,xor=1: 0.085420 s 1498.484 MB/s
262144 512 4 BYTWO_p SSE -
Seed: 1352748412
Buffer-Const,s!=d,xor=0: 0.080044 s 1599.115 MB/s
Buffer-Const,s!=d,xor=1: 0.083843 s 1526.654 MB/s
524288 256 4 BYTWO_p SSE -
Seed: 1352748413
Buffer-Const,s!=d,xor=0: 0.082954 s 1543.016 MB/s
Buffer-Const,s!=d,xor=1: 0.086807 s 1474.535 MB/s
1048576 128 4 BYTWO_p SSE -
Seed: 1352748414
Buffer-Const,s!=d,xor=0: 0.090553 s 1413.536 MB/s
Buffer-Const,s!=d,xor=1: 0.092115 s 1389.565 MB/s
2097152 64 4 BYTWO_p SSE -
Seed: 1352748415
Buffer-Const,s!=d,xor=0: 0.087072 s 1470.054 MB/s
Buffer-Const,s!=d,xor=1: 0.093465 s 1369.492 MB/s
4194304 32 4 BYTWO_p SSE -
Seed: 1352748416
Buffer-Const,s!=d,xor=0: 0.097724 s 1309.812 MB/s
Buffer-Const,s!=d,xor=1: 0.090922 s 1407.795 MB/s
8388608 16 4 BYTWO_p SSE -
Seed: 1352748417
Buffer-Const,s!=d,xor=0: 0.104649 s 1223.136 MB/s
Buffer-Const,s!=d,xor=1: 0.084963 s 1506.532 MB/s
16777216 8 4 BYTWO_p SSE -
Seed: 1352748418
Buffer-Const,s!=d,xor=0: 0.112079 s 1142.050 MB/s
Buffer-Const,s!=d,xor=1: 0.096727 s 1323.313 MB/s
33554432 4 4 BYTWO_p SSE -
Seed: 1352748419
Buffer-Const,s!=d,xor=0: 0.136256 s 939.408 MB/s
Buffer-Const,s!=d,xor=1: 0.103244 s 1239.781 MB/s
67108864 2 4 BYTWO_p SSE -
Seed: 1352748420
Buffer-Const,s!=d,xor=0: 0.181231 s 706.281 MB/s
Buffer-Const,s!=d,xor=1: 0.092887 s 1378.016 MB/s
134217728 1 4 BYTWO_p SSE -
Seed: 1352748422
Buffer-Const,s!=d,xor=0: 0.107760 s 1187.825 MB/s
Buffer-Const,s!=d,xor=1: 0.065748 s 1946.828 MB/s
1024 131072 4 BYTWO_b SSE -
Seed: 1352748423
Buffer-Const,s!=d,xor=0: 0.104705 s 1222.484 MB/s
Buffer-Const,s!=d,xor=1: 0.058541 s 2186.508 MB/s
2048 65536 4 BYTWO_b SSE -
Seed: 1352748424
Buffer-Const,s!=d,xor=0: 0.098082 s 1305.026 MB/s
Buffer-Const,s!=d,xor=1: 0.053539 s 2390.768 MB/s
4096 32768 4 BYTWO_b SSE -
Seed: 1352748425
Buffer-Const,s!=d,xor=0: 0.094147 s 1359.576 MB/s
Buffer-Const,s!=d,xor=1: 0.051867 s 2467.839 MB/s
8192 16384 4 BYTWO_b SSE -
Seed: 1352748426
Buffer-Const,s!=d,xor=0: 0.092755 s 1379.975 MB/s
Buffer-Const,s!=d,xor=1: 0.049600 s 2580.651 MB/s
16384 8192 4 BYTWO_b SSE -
Seed: 1352748427
Buffer-Const,s!=d,xor=0: 0.093161 s 1373.971 MB/s
Buffer-Const,s!=d,xor=1: 0.048734 s 2626.480 MB/s
32768 4096 4 BYTWO_b SSE -
Seed: 1352748428
Buffer-Const,s!=d,xor=0: 0.092071 s 1390.227 MB/s
Buffer-Const,s!=d,xor=1: 0.048645 s 2631.282 MB/s
65536 2048 4 BYTWO_b SSE -
Seed: 1352748429
Buffer-Const,s!=d,xor=0: 0.093282 s 1372.191 MB/s
Buffer-Const,s!=d,xor=1: 0.047374 s 2701.903 MB/s
131072 1024 4 BYTWO_b SSE -
Seed: 1352748430
Buffer-Const,s!=d,xor=0: 0.094085 s 1360.479 MB/s
Buffer-Const,s!=d,xor=1: 0.050752 s 2522.072 MB/s
262144 512 4 BYTWO_b SSE -
Seed: 1352748431
Buffer-Const,s!=d,xor=0: 0.099099 s 1291.639 MB/s
Buffer-Const,s!=d,xor=1: 0.046550 s 2749.729 MB/s
524288 256 4 BYTWO_b SSE -
Seed: 1352748431
Buffer-Const,s!=d,xor=0: 0.093943 s 1362.530 MB/s
Buffer-Const,s!=d,xor=1: 0.050178 s 2550.940 MB/s
1048576 128 4 BYTWO_b SSE -
Seed: 1352748432
Buffer-Const,s!=d,xor=0: 0.121096 s 1057.011 MB/s
Buffer-Const,s!=d,xor=1: 0.055513 s 2305.770 MB/s
2097152 64 4 BYTWO_b SSE -
Seed: 1352748433
Buffer-Const,s!=d,xor=0: 0.109734 s 1166.456 MB/s
Buffer-Const,s!=d,xor=1: 0.057743 s 2216.716 MB/s
4194304 32 4 BYTWO_b SSE -
Seed: 1352748434
Buffer-Const,s!=d,xor=0: 0.117161 s 1092.513 MB/s
Buffer-Const,s!=d,xor=1: 0.057568 s 2223.464 MB/s
8388608 16 4 BYTWO_b SSE -
Seed: 1352748436
Buffer-Const,s!=d,xor=0: 0.102332 s 1250.832 MB/s
Buffer-Const,s!=d,xor=1: 0.061185 s 2092.004 MB/s
16777216 8 4 BYTWO_b SSE -
Seed: 1352748437
Buffer-Const,s!=d,xor=0: 0.173641 s 737.153 MB/s
Buffer-Const,s!=d,xor=1: 0.054822 s 2334.830 MB/s
33554432 4 4 BYTWO_b SSE -
Seed: 1352748438
Buffer-Const,s!=d,xor=0: 0.130181 s 983.246 MB/s
Buffer-Const,s!=d,xor=1: 0.051398 s 2490.367 MB/s
67108864 2 4 BYTWO_b SSE -
Seed: 1352748439
Buffer-Const,s!=d,xor=0: 0.150805 s 848.778 MB/s
Buffer-Const,s!=d,xor=1: 0.000005 s 2330.524 MB/s
134217728 1 4 BYTWO_b SSE -

11
junk-w4-timing-tests.sh Normal file

@ -0,0 +1,11 @@
sh tmp-time-test.sh 4 LOG - -
sh tmp-time-test.sh 4 TABLE SINGLE -
sh tmp-time-test.sh 4 TABLE SINGLE,SSE -
sh tmp-time-test.sh 4 TABLE DOUBLE -
sh tmp-time-test.sh 4 TABLE DOUBLE -
sh tmp-time-test.sh 4 TABLE QUAD -
sh tmp-time-test.sh 4 TABLE QUAD,LAZY -
sh tmp-time-test.sh 4 BYTWO_p - -
sh tmp-time-test.sh 4 BYTWO_b - -
sh tmp-time-test.sh 4 BYTWO_p SSE -
sh tmp-time-test.sh 4 BYTWO_b SSE -

11
junk-w4-timing.jgr Normal file

@ -0,0 +1,11 @@
newgraph
xaxis size 4 min 0 no_auto_hash_labels
hash_labels hjl vjc rotate -90 fontsize 11
shell : junk-pick-best-output < junk-w4-timing-out.txt | sort -nr | sed 's/.............//' | awk '{ print "hash_label at ", ++l, ":", $0 }'
yaxis size 1 min 0 label : MB/s
newcurve marktype xbar cfill 1 1 0 marksize 1 pts
shell : junk-pick-best-output < junk-w4-timing-out.txt | sort -nr | awk '{ print $1 }' | cat -n

6
junk-w4.jgr Normal file

@ -0,0 +1,6 @@
newgraph
xaxis size 4 min 0 no_auto_hash_labels
hash_labels hjl vjc rotate -90 fontsize 11
yaxis size 1 min 0 label : MB/s
shell : awk -f junk-proc.awk < junk-w4-out.txt

936
junk-w8-timing-out.txt Normal file

@ -0,0 +1,936 @@
Seed: 1352746852
Buffer-Const,s!=d,xor=0: 0.205907 s 621.640 MB/s
Buffer-Const,s!=d,xor=1: 0.252565 s 506.800 MB/s
1024 131072 8 LOG - -
Seed: 1352746854
Buffer-Const,s!=d,xor=0: 0.206410 s 620.126 MB/s
Buffer-Const,s!=d,xor=1: 0.251469 s 509.008 MB/s
2048 65536 8 LOG - -
Seed: 1352746856
Buffer-Const,s!=d,xor=0: 0.209941 s 609.695 MB/s
Buffer-Const,s!=d,xor=1: 0.255838 s 500.316 MB/s
4096 32768 8 LOG - -
Seed: 1352746857
Buffer-Const,s!=d,xor=0: 0.206109 s 621.030 MB/s
Buffer-Const,s!=d,xor=1: 0.262056 s 488.445 MB/s
8192 16384 8 LOG - -
Seed: 1352746859
Buffer-Const,s!=d,xor=0: 0.201892 s 634.001 MB/s
Buffer-Const,s!=d,xor=1: 0.250816 s 510.335 MB/s
16384 8192 8 LOG - -
Seed: 1352746860
Buffer-Const,s!=d,xor=0: 0.201995 s 633.679 MB/s
Buffer-Const,s!=d,xor=1: 0.254832 s 502.292 MB/s
32768 4096 8 LOG - -
Seed: 1352746862
Buffer-Const,s!=d,xor=0: 0.203099 s 630.236 MB/s
Buffer-Const,s!=d,xor=1: 0.255779 s 500.431 MB/s
65536 2048 8 LOG - -
Seed: 1352746864
Buffer-Const,s!=d,xor=0: 0.200691 s 637.796 MB/s
Buffer-Const,s!=d,xor=1: 0.256675 s 498.685 MB/s
131072 1024 8 LOG - -
Seed: 1352746865
Buffer-Const,s!=d,xor=0: 0.201240 s 636.057 MB/s
Buffer-Const,s!=d,xor=1: 0.255231 s 501.506 MB/s
262144 512 8 LOG - -
Seed: 1352746867
Buffer-Const,s!=d,xor=0: 0.202006 s 633.645 MB/s
Buffer-Const,s!=d,xor=1: 0.251845 s 508.250 MB/s
524288 256 8 LOG - -
Seed: 1352746868
Buffer-Const,s!=d,xor=0: 0.203552 s 628.830 MB/s
Buffer-Const,s!=d,xor=1: 0.255775 s 500.440 MB/s
1048576 128 8 LOG - -
Seed: 1352746870
Buffer-Const,s!=d,xor=0: 0.206480 s 619.915 MB/s
Buffer-Const,s!=d,xor=1: 0.256771 s 498.498 MB/s
2097152 64 8 LOG - -
Seed: 1352746872
Buffer-Const,s!=d,xor=0: 0.210690 s 607.528 MB/s
Buffer-Const,s!=d,xor=1: 0.260851 s 490.701 MB/s
4194304 32 8 LOG - -
Seed: 1352746873
Buffer-Const,s!=d,xor=0: 0.212292 s 602.944 MB/s
Buffer-Const,s!=d,xor=1: 0.263464 s 485.834 MB/s
8388608 16 8 LOG - -
Seed: 1352746875
Buffer-Const,s!=d,xor=0: 0.217703 s 587.957 MB/s
Buffer-Const,s!=d,xor=1: 0.260255 s 491.826 MB/s
16777216 8 8 LOG - -
Seed: 1352746876
Buffer-Const,s!=d,xor=0: 0.229996 s 556.531 MB/s
Buffer-Const,s!=d,xor=1: 0.268077 s 477.475 MB/s
33554432 4 8 LOG - -
Seed: 1352746878
Buffer-Const,s!=d,xor=0: 0.255076 s 501.811 MB/s
Buffer-Const,s!=d,xor=1: 0.268757 s 476.266 MB/s
67108864 2 8 LOG - -
Seed: 1352746880
Buffer-Const,s!=d,xor=0: 0.299095 s 427.958 MB/s
Buffer-Const,s!=d,xor=1: 0.271954 s 470.668 MB/s
134217728 1 8 LOG - -
Seed: 1352746882
Buffer-Const,s!=d,xor=0: 0.198089 s 646.175 MB/s
Buffer-Const,s!=d,xor=1: 0.199934 s 640.212 MB/s
1024 131072 8 LOG_ZERO - -
Seed: 1352746883
Buffer-Const,s!=d,xor=0: 0.191693 s 667.733 MB/s
Buffer-Const,s!=d,xor=1: 0.195976 s 653.142 MB/s
2048 65536 8 LOG_ZERO - -
Seed: 1352746885
Buffer-Const,s!=d,xor=0: 0.190896 s 670.524 MB/s
Buffer-Const,s!=d,xor=1: 0.194985 s 656.459 MB/s
4096 32768 8 LOG_ZERO - -
Seed: 1352746886
Buffer-Const,s!=d,xor=0: 0.190779 s 670.933 MB/s
Buffer-Const,s!=d,xor=1: 0.195833 s 653.617 MB/s
8192 16384 8 LOG_ZERO - -
Seed: 1352746887
Buffer-Const,s!=d,xor=0: 0.188468 s 679.159 MB/s
Buffer-Const,s!=d,xor=1: 0.192885 s 663.608 MB/s
16384 8192 8 LOG_ZERO - -
Seed: 1352746889
Buffer-Const,s!=d,xor=0: 0.187547 s 682.497 MB/s
Buffer-Const,s!=d,xor=1: 0.193131 s 662.763 MB/s
32768 4096 8 LOG_ZERO - -
Seed: 1352746890
Buffer-Const,s!=d,xor=0: 0.185810 s 688.875 MB/s
Buffer-Const,s!=d,xor=1: 0.192531 s 664.829 MB/s
65536 2048 8 LOG_ZERO - -
Seed: 1352746892
Buffer-Const,s!=d,xor=0: 0.186486 s 686.379 MB/s
Buffer-Const,s!=d,xor=1: 0.192416 s 665.226 MB/s
131072 1024 8 LOG_ZERO - -
Seed: 1352746893
Buffer-Const,s!=d,xor=0: 0.187854 s 681.379 MB/s
Buffer-Const,s!=d,xor=1: 0.193211 s 662.488 MB/s
262144 512 8 LOG_ZERO - -
Seed: 1352746895
Buffer-Const,s!=d,xor=0: 0.186622 s 685.880 MB/s
Buffer-Const,s!=d,xor=1: 0.193951 s 659.961 MB/s
524288 256 8 LOG_ZERO - -
Seed: 1352746896
Buffer-Const,s!=d,xor=0: 0.193502 s 661.492 MB/s
Buffer-Const,s!=d,xor=1: 0.194600 s 657.760 MB/s
1048576 128 8 LOG_ZERO - -
Seed: 1352746897
Buffer-Const,s!=d,xor=0: 0.191789 s 667.400 MB/s
Buffer-Const,s!=d,xor=1: 0.206557 s 619.683 MB/s
2097152 64 8 LOG_ZERO - -
Seed: 1352746899
Buffer-Const,s!=d,xor=0: 0.216762 s 590.509 MB/s
Buffer-Const,s!=d,xor=1: 0.220943 s 579.334 MB/s
4194304 32 8 LOG_ZERO - -
Seed: 1352746901
Buffer-Const,s!=d,xor=0: 0.212998 s 600.944 MB/s
Buffer-Const,s!=d,xor=1: 0.229660 s 557.346 MB/s
8388608 16 8 LOG_ZERO - -
Seed: 1352746902
Buffer-Const,s!=d,xor=0: 0.225217 s 568.340 MB/s
Buffer-Const,s!=d,xor=1: 0.208174 s 614.871 MB/s
16777216 8 8 LOG_ZERO - -
Seed: 1352746904
Buffer-Const,s!=d,xor=0: 0.215686 s 593.456 MB/s
Buffer-Const,s!=d,xor=1: 0.204155 s 626.975 MB/s
33554432 4 8 LOG_ZERO - -
Seed: 1352746905
Buffer-Const,s!=d,xor=0: 0.250863 s 510.239 MB/s
Buffer-Const,s!=d,xor=1: 0.200680 s 637.832 MB/s
67108864 2 8 LOG_ZERO - -
Seed: 1352746907
Buffer-Const,s!=d,xor=0: 0.285895 s 447.717 MB/s
Buffer-Const,s!=d,xor=1: 0.201105 s 636.484 MB/s
134217728 1 8 LOG_ZERO - -
Seed: 1352746909
Buffer-Const,s!=d,xor=0: 0.154129 s 830.473 MB/s
Buffer-Const,s!=d,xor=1: 0.200737 s 637.650 MB/s
1024 131072 8 TABLE - -
Seed: 1352746910
Buffer-Const,s!=d,xor=0: 0.150785 s 848.888 MB/s
Buffer-Const,s!=d,xor=1: 0.199187 s 642.614 MB/s
2048 65536 8 TABLE - -
Seed: 1352746911
Buffer-Const,s!=d,xor=0: 0.149158 s 858.153 MB/s
Buffer-Const,s!=d,xor=1: 0.196224 s 652.316 MB/s
4096 32768 8 TABLE - -
Seed: 1352746913
Buffer-Const,s!=d,xor=0: 0.147988 s 864.936 MB/s
Buffer-Const,s!=d,xor=1: 0.195025 s 656.325 MB/s
8192 16384 8 TABLE - -
Seed: 1352746914
Buffer-Const,s!=d,xor=0: 0.146994 s 870.786 MB/s
Buffer-Const,s!=d,xor=1: 0.193489 s 661.536 MB/s
16384 8192 8 TABLE - -
Seed: 1352746915
Buffer-Const,s!=d,xor=0: 0.151192 s 846.606 MB/s
Buffer-Const,s!=d,xor=1: 0.196197 s 652.405 MB/s
32768 4096 8 TABLE - -
Seed: 1352746917
Buffer-Const,s!=d,xor=0: 0.149436 s 856.553 MB/s
Buffer-Const,s!=d,xor=1: 0.194907 s 656.724 MB/s
65536 2048 8 TABLE - -
Seed: 1352746918
Buffer-Const,s!=d,xor=0: 0.150252 s 851.900 MB/s
Buffer-Const,s!=d,xor=1: 0.196657 s 650.878 MB/s
131072 1024 8 TABLE - -
Seed: 1352746920
Buffer-Const,s!=d,xor=0: 0.152423 s 839.767 MB/s
Buffer-Const,s!=d,xor=1: 0.196896 s 650.090 MB/s
262144 512 8 TABLE - -
Seed: 1352746921
Buffer-Const,s!=d,xor=0: 0.149577 s 855.748 MB/s
Buffer-Const,s!=d,xor=1: 0.196668 s 650.843 MB/s
524288 256 8 TABLE - -
Seed: 1352746922
Buffer-Const,s!=d,xor=0: 0.151604 s 844.307 MB/s
Buffer-Const,s!=d,xor=1: 0.198012 s 646.425 MB/s
1048576 128 8 TABLE - -
Seed: 1352746924
Buffer-Const,s!=d,xor=0: 0.155570 s 822.779 MB/s
Buffer-Const,s!=d,xor=1: 0.195111 s 656.036 MB/s
2097152 64 8 TABLE - -
Seed: 1352746925
Buffer-Const,s!=d,xor=0: 0.159052 s 804.766 MB/s
Buffer-Const,s!=d,xor=1: 0.204684 s 625.353 MB/s
4194304 32 8 TABLE - -
Seed: 1352746926
Buffer-Const,s!=d,xor=0: 0.163852 s 781.193 MB/s
Buffer-Const,s!=d,xor=1: 0.204403 s 626.215 MB/s
8388608 16 8 TABLE - -
Seed: 1352746928
Buffer-Const,s!=d,xor=0: 0.174190 s 734.832 MB/s
Buffer-Const,s!=d,xor=1: 0.202681 s 631.535 MB/s
16777216 8 8 TABLE - -
Seed: 1352746929
Buffer-Const,s!=d,xor=0: 0.184380 s 694.218 MB/s
Buffer-Const,s!=d,xor=1: 0.204282 s 626.585 MB/s
33554432 4 8 TABLE - -
Seed: 1352746931
Buffer-Const,s!=d,xor=0: 0.204508 s 625.892 MB/s
Buffer-Const,s!=d,xor=1: 0.207667 s 616.371 MB/s
67108864 2 8 TABLE - -
Seed: 1352746932
Buffer-Const,s!=d,xor=0: 0.252662 s 506.606 MB/s
Buffer-Const,s!=d,xor=1: 0.208596 s 613.626 MB/s
134217728 1 8 TABLE - -
Seed: 1352746934
Buffer-Const,s!=d,xor=0: 0.870799 s 146.991 MB/s
Buffer-Const,s!=d,xor=1: 0.888333 s 144.090 MB/s
1024 131072 8 TABLE DOUBLE -
Seed: 1352746938
Buffer-Const,s!=d,xor=0: 0.808797 s 158.260 MB/s
Buffer-Const,s!=d,xor=1: 0.812444 s 157.549 MB/s
2048 65536 8 TABLE DOUBLE -
Seed: 1352746942
Buffer-Const,s!=d,xor=0: 0.724551 s 176.661 MB/s
Buffer-Const,s!=d,xor=1: 0.733140 s 174.591 MB/s
4096 32768 8 TABLE DOUBLE -
Seed: 1352746946
Buffer-Const,s!=d,xor=0: 0.622008 s 205.785 MB/s
Buffer-Const,s!=d,xor=1: 0.636914 s 200.969 MB/s
8192 16384 8 TABLE DOUBLE -
Seed: 1352746949
Buffer-Const,s!=d,xor=0: 0.454528 s 281.611 MB/s
Buffer-Const,s!=d,xor=1: 0.467266 s 273.934 MB/s
16384 8192 8 TABLE DOUBLE -
Seed: 1352746952
Buffer-Const,s!=d,xor=0: 0.285370 s 448.541 MB/s
Buffer-Const,s!=d,xor=1: 0.292051 s 438.279 MB/s
32768 4096 8 TABLE DOUBLE -
Seed: 1352746954
Buffer-Const,s!=d,xor=0: 0.193707 s 660.791 MB/s
Buffer-Const,s!=d,xor=1: 0.202114 s 633.307 MB/s
65536 2048 8 TABLE DOUBLE -
Seed: 1352746955
Buffer-Const,s!=d,xor=0: 0.147023 s 870.614 MB/s
Buffer-Const,s!=d,xor=1: 0.151774 s 843.360 MB/s
131072 1024 8 TABLE DOUBLE -
Seed: 1352746957
Buffer-Const,s!=d,xor=0: 0.127245 s 1005.930 MB/s
Buffer-Const,s!=d,xor=1: 0.130981 s 977.243 MB/s
262144 512 8 TABLE DOUBLE -
Seed: 1352746958
Buffer-Const,s!=d,xor=0: 0.112772 s 1135.034 MB/s
Buffer-Const,s!=d,xor=1: 0.117758 s 1086.972 MB/s
524288 256 8 TABLE DOUBLE -
Seed: 1352746959
Buffer-Const,s!=d,xor=0: 0.106724 s 1199.355 MB/s
Buffer-Const,s!=d,xor=1: 0.110677 s 1156.521 MB/s
1048576 128 8 TABLE DOUBLE -
Seed: 1352746960
Buffer-Const,s!=d,xor=0: 0.109126 s 1172.960 MB/s
Buffer-Const,s!=d,xor=1: 0.115353 s 1109.641 MB/s
2097152 64 8 TABLE DOUBLE -
Seed: 1352746962
Buffer-Const,s!=d,xor=0: 0.111492 s 1148.063 MB/s
Buffer-Const,s!=d,xor=1: 0.114936 s 1113.660 MB/s
4194304 32 8 TABLE DOUBLE -
Seed: 1352746963
Buffer-Const,s!=d,xor=0: 0.114727 s 1115.694 MB/s
Buffer-Const,s!=d,xor=1: 0.112702 s 1135.740 MB/s
8388608 16 8 TABLE DOUBLE -
Seed: 1352746964
Buffer-Const,s!=d,xor=0: 0.122290 s 1046.691 MB/s
Buffer-Const,s!=d,xor=1: 0.112557 s 1137.205 MB/s
16777216 8 8 TABLE DOUBLE -
Seed: 1352746965
Buffer-Const,s!=d,xor=0: 0.130774 s 978.789 MB/s
Buffer-Const,s!=d,xor=1: 0.115443 s 1108.772 MB/s
33554432 4 8 TABLE DOUBLE -
Seed: 1352746966
Buffer-Const,s!=d,xor=0: 0.152678 s 838.367 MB/s
Buffer-Const,s!=d,xor=1: 0.112051 s 1142.337 MB/s
67108864 2 8 TABLE DOUBLE -
Seed: 1352746968
Buffer-Const,s!=d,xor=0: 0.199972 s 640.090 MB/s
Buffer-Const,s!=d,xor=1: 0.111309 s 1149.951 MB/s
134217728 1 8 TABLE DOUBLE -
Seed: 1352746969
Buffer-Const,s!=d,xor=0: 12.353054 s 10.362 MB/s
Buffer-Const,s!=d,xor=1: 12.311798 s 10.397 MB/s
1024 131072 8 TABLE DOUBLE,LAZY -
Seed: 1352747019
Buffer-Const,s!=d,xor=0: 6.245450 s 20.495 MB/s
Buffer-Const,s!=d,xor=1: 6.251623 s 20.475 MB/s
2048 65536 8 TABLE DOUBLE,LAZY -
Seed: 1352747045
Buffer-Const,s!=d,xor=0: 3.157618 s 40.537 MB/s
Buffer-Const,s!=d,xor=1: 3.147050 s 40.673 MB/s
4096 32768 8 TABLE DOUBLE,LAZY -
Seed: 1352747058
Buffer-Const,s!=d,xor=0: 1.631175 s 78.471 MB/s
Buffer-Const,s!=d,xor=1: 1.657020 s 77.247 MB/s
8192 16384 8 TABLE DOUBLE,LAZY -
Seed: 1352747065
Buffer-Const,s!=d,xor=0: 0.860207 s 148.801 MB/s
Buffer-Const,s!=d,xor=1: 0.874988 s 146.288 MB/s
16384 8192 8 TABLE DOUBLE,LAZY -
Seed: 1352747069
Buffer-Const,s!=d,xor=0: 0.478988 s 267.230 MB/s
Buffer-Const,s!=d,xor=1: 0.485077 s 263.876 MB/s
32768 4096 8 TABLE DOUBLE,LAZY -
Seed: 1352747072
Buffer-Const,s!=d,xor=0: 0.291041 s 439.800 MB/s
Buffer-Const,s!=d,xor=1: 0.294611 s 434.472 MB/s
65536 2048 8 TABLE DOUBLE,LAZY -
Seed: 1352747074
Buffer-Const,s!=d,xor=0: 0.195826 s 653.643 MB/s
Buffer-Const,s!=d,xor=1: 0.201743 s 634.472 MB/s
131072 1024 8 TABLE DOUBLE,LAZY -
Seed: 1352747075
Buffer-Const,s!=d,xor=0: 0.148775 s 860.359 MB/s
Buffer-Const,s!=d,xor=1: 0.153898 s 831.717 MB/s
262144 512 8 TABLE DOUBLE,LAZY -
Seed: 1352747077
Buffer-Const,s!=d,xor=0: 0.128037 s 999.707 MB/s
Buffer-Const,s!=d,xor=1: 0.130179 s 983.260 MB/s
524288 256 8 TABLE DOUBLE,LAZY -
Seed: 1352747078
Buffer-Const,s!=d,xor=0: 0.112728 s 1135.473 MB/s
Buffer-Const,s!=d,xor=1: 0.119275 s 1073.152 MB/s
1048576 128 8 TABLE DOUBLE,LAZY -
Seed: 1352747079
Buffer-Const,s!=d,xor=0: 0.113098 s 1131.763 MB/s
Buffer-Const,s!=d,xor=1: 0.117425 s 1090.056 MB/s
2097152 64 8 TABLE DOUBLE,LAZY -
Seed: 1352747080
Buffer-Const,s!=d,xor=0: 0.113271 s 1130.033 MB/s
Buffer-Const,s!=d,xor=1: 0.116355 s 1100.082 MB/s
4194304 32 8 TABLE DOUBLE,LAZY -
Seed: 1352747081
Buffer-Const,s!=d,xor=0: 0.109173 s 1172.448 MB/s
Buffer-Const,s!=d,xor=1: 0.114466 s 1118.239 MB/s
8388608 16 8 TABLE DOUBLE,LAZY -
Seed: 1352747082
Buffer-Const,s!=d,xor=0: 0.120238 s 1064.555 MB/s
Buffer-Const,s!=d,xor=1: 0.113906 s 1123.737 MB/s
16777216 8 8 TABLE DOUBLE,LAZY -
Seed: 1352747084
Buffer-Const,s!=d,xor=0: 0.127838 s 1001.266 MB/s
Buffer-Const,s!=d,xor=1: 0.112099 s 1141.846 MB/s
33554432 4 8 TABLE DOUBLE,LAZY -
Seed: 1352747085
Buffer-Const,s!=d,xor=0: 0.154731 s 827.243 MB/s
Buffer-Const,s!=d,xor=1: 0.111025 s 1152.893 MB/s
67108864 2 8 TABLE DOUBLE,LAZY -
Seed: 1352747086
Buffer-Const,s!=d,xor=0: 0.202618 s 631.730 MB/s
Buffer-Const,s!=d,xor=1: 0.110840 s 1154.819 MB/s
134217728 1 8 TABLE DOUBLE,LAZY -
Seed: 1352747087
Buffer-Const,s!=d,xor=0: 0.400666 s 319.468 MB/s
Buffer-Const,s!=d,xor=1: 0.408545 s 313.307 MB/s
1024 131072 8 BYTWO_p - -
Seed: 1352747090
Buffer-Const,s!=d,xor=0: 0.393822 s 325.020 MB/s
Buffer-Const,s!=d,xor=1: 0.400213 s 319.829 MB/s
2048 65536 8 BYTWO_p - -
Seed: 1352747092
Buffer-Const,s!=d,xor=0: 0.388415 s 329.545 MB/s
Buffer-Const,s!=d,xor=1: 0.396545 s 322.788 MB/s
4096 32768 8 BYTWO_p - -
Seed: 1352747094
Buffer-Const,s!=d,xor=0: 0.389005 s 329.044 MB/s
Buffer-Const,s!=d,xor=1: 0.395450 s 323.682 MB/s
8192 16384 8 BYTWO_p - -
Seed: 1352747096
Buffer-Const,s!=d,xor=0: 0.385698 s 331.866 MB/s
Buffer-Const,s!=d,xor=1: 0.395319 s 323.789 MB/s
16384 8192 8 BYTWO_p - -
Seed: 1352747099
Buffer-Const,s!=d,xor=0: 0.385273 s 332.232 MB/s
Buffer-Const,s!=d,xor=1: 0.396203 s 323.067 MB/s
32768 4096 8 BYTWO_p - -
Seed: 1352747101
Buffer-Const,s!=d,xor=0: 0.387427 s 330.385 MB/s
Buffer-Const,s!=d,xor=1: 0.394610 s 324.371 MB/s
65536 2048 8 BYTWO_p - -
Seed: 1352747103
Buffer-Const,s!=d,xor=0: 0.389866 s 328.318 MB/s
Buffer-Const,s!=d,xor=1: 0.398012 s 321.598 MB/s
131072 1024 8 BYTWO_p - -
Seed: 1352747105
Buffer-Const,s!=d,xor=0: 0.389453 s 328.666 MB/s
Buffer-Const,s!=d,xor=1: 0.397982 s 321.622 MB/s
262144 512 8 BYTWO_p - -
Seed: 1352747108
Buffer-Const,s!=d,xor=0: 0.388304 s 329.638 MB/s
Buffer-Const,s!=d,xor=1: 0.399512 s 320.391 MB/s
524288 256 8 BYTWO_p - -
Seed: 1352747110
Buffer-Const,s!=d,xor=0: 0.390699 s 327.618 MB/s
Buffer-Const,s!=d,xor=1: 0.407622 s 314.016 MB/s
1048576 128 8 BYTWO_p - -
Seed: 1352747112
Buffer-Const,s!=d,xor=0: 0.398830 s 320.939 MB/s
Buffer-Const,s!=d,xor=1: 0.401909 s 318.480 MB/s
2097152 64 8 BYTWO_p - -
Seed: 1352747114
Buffer-Const,s!=d,xor=0: 0.402605 s 317.930 MB/s
Buffer-Const,s!=d,xor=1: 0.410941 s 311.480 MB/s
4194304 32 8 BYTWO_p - -
Seed: 1352747117
Buffer-Const,s!=d,xor=0: 0.404638 s 316.332 MB/s
Buffer-Const,s!=d,xor=1: 0.406369 s 314.984 MB/s
8388608 16 8 BYTWO_p - -
Seed: 1352747119
Buffer-Const,s!=d,xor=0: 0.412950 s 309.965 MB/s
Buffer-Const,s!=d,xor=1: 0.411819 s 310.816 MB/s
16777216 8 8 BYTWO_p - -
Seed: 1352747121
Buffer-Const,s!=d,xor=0: 0.417898 s 306.295 MB/s
Buffer-Const,s!=d,xor=1: 0.412159 s 310.560 MB/s
33554432 4 8 BYTWO_p - -
Seed: 1352747124
Buffer-Const,s!=d,xor=0: 0.444945 s 287.676 MB/s
Buffer-Const,s!=d,xor=1: 0.404381 s 316.533 MB/s
67108864 2 8 BYTWO_p - -
Seed: 1352747126
Buffer-Const,s!=d,xor=0: 0.494330 s 258.936 MB/s
Buffer-Const,s!=d,xor=1: 0.412325 s 310.435 MB/s
134217728 1 8 BYTWO_p - -
Seed: 1352747129
Buffer-Const,s!=d,xor=0: 0.306549 s 417.552 MB/s
Buffer-Const,s!=d,xor=1: 0.309033 s 414.195 MB/s
1024 131072 8 BYTWO_b - -
Seed: 1352747131
Buffer-Const,s!=d,xor=0: 0.297702 s 429.961 MB/s
Buffer-Const,s!=d,xor=1: 0.297253 s 430.609 MB/s
2048 65536 8 BYTWO_b - -
Seed: 1352747132
Buffer-Const,s!=d,xor=0: 0.293193 s 436.572 MB/s
Buffer-Const,s!=d,xor=1: 0.293018 s 436.833 MB/s
4096 32768 8 BYTWO_b - -
Seed: 1352747134
Buffer-Const,s!=d,xor=0: 0.294984 s 433.922 MB/s
Buffer-Const,s!=d,xor=1: 0.290863 s 440.070 MB/s
8192 16384 8 BYTWO_b - -
Seed: 1352747136
Buffer-Const,s!=d,xor=0: 0.288896 s 443.067 MB/s
Buffer-Const,s!=d,xor=1: 0.288462 s 443.732 MB/s
16384 8192 8 BYTWO_b - -
Seed: 1352747138
Buffer-Const,s!=d,xor=0: 0.290112 s 441.208 MB/s
Buffer-Const,s!=d,xor=1: 0.288533 s 443.623 MB/s
32768 4096 8 BYTWO_b - -
Seed: 1352747140
Buffer-Const,s!=d,xor=0: 0.288124 s 444.253 MB/s
Buffer-Const,s!=d,xor=1: 0.286360 s 446.989 MB/s
65536 2048 8 BYTWO_b - -
Seed: 1352747142
Buffer-Const,s!=d,xor=0: 0.292166 s 438.106 MB/s
Buffer-Const,s!=d,xor=1: 0.288037 s 444.388 MB/s
131072 1024 8 BYTWO_b - -
Seed: 1352747143
Buffer-Const,s!=d,xor=0: 0.295804 s 432.719 MB/s
Buffer-Const,s!=d,xor=1: 0.292226 s 438.017 MB/s
262144 512 8 BYTWO_b - -
Seed: 1352747145
Buffer-Const,s!=d,xor=0: 0.284928 s 449.236 MB/s
Buffer-Const,s!=d,xor=1: 0.286746 s 446.388 MB/s
524288 256 8 BYTWO_b - -
Seed: 1352747147
Buffer-Const,s!=d,xor=0: 0.295747 s 432.803 MB/s
Buffer-Const,s!=d,xor=1: 0.291578 s 438.990 MB/s
1048576 128 8 BYTWO_b - -
Seed: 1352747149
Buffer-Const,s!=d,xor=0: 0.300418 s 426.073 MB/s
Buffer-Const,s!=d,xor=1: 0.283470 s 451.547 MB/s
2097152 64 8 BYTWO_b - -
Seed: 1352747151
Buffer-Const,s!=d,xor=0: 0.310105 s 412.764 MB/s
Buffer-Const,s!=d,xor=1: 0.306506 s 417.610 MB/s
4194304 32 8 BYTWO_b - -
Seed: 1352747153
Buffer-Const,s!=d,xor=0: 0.303049 s 422.373 MB/s
Buffer-Const,s!=d,xor=1: 0.294477 s 434.669 MB/s
8388608 16 8 BYTWO_b - -
Seed: 1352747155
Buffer-Const,s!=d,xor=0: 0.318920 s 401.354 MB/s
Buffer-Const,s!=d,xor=1: 0.292649 s 437.384 MB/s
16777216 8 8 BYTWO_b - -
Seed: 1352747157
Buffer-Const,s!=d,xor=0: 0.369239 s 346.659 MB/s
Buffer-Const,s!=d,xor=1: 0.299009 s 428.081 MB/s
33554432 4 8 BYTWO_b - -
Seed: 1352747159
Buffer-Const,s!=d,xor=0: 0.370332 s 345.636 MB/s
Buffer-Const,s!=d,xor=1: 0.292907 s 436.999 MB/s
67108864 2 8 BYTWO_b - -
Seed: 1352747161
Buffer-Const,s!=d,xor=0: 0.437750 s 292.404 MB/s
Buffer-Const,s!=d,xor=1: 0.303224 s 422.130 MB/s
134217728 1 8 BYTWO_b - -
Seed: 1352747163
Buffer-Const,s!=d,xor=0: 0.199102 s 642.888 MB/s
Buffer-Const,s!=d,xor=1: 0.198709 s 644.159 MB/s
1024 131072 8 BYTWO_p SSE -
Seed: 1352747164
Buffer-Const,s!=d,xor=0: 0.188358 s 679.558 MB/s
Buffer-Const,s!=d,xor=1: 0.190699 s 671.215 MB/s
2048 65536 8 BYTWO_p SSE -
Seed: 1352747166
Buffer-Const,s!=d,xor=0: 0.184177 s 694.985 MB/s
Buffer-Const,s!=d,xor=1: 0.186848 s 685.049 MB/s
4096 32768 8 BYTWO_p SSE -
Seed: 1352747167
Buffer-Const,s!=d,xor=0: 0.189242 s 676.384 MB/s
Buffer-Const,s!=d,xor=1: 0.186107 s 687.776 MB/s
8192 16384 8 BYTWO_p SSE -
Seed: 1352747169
Buffer-Const,s!=d,xor=0: 0.179632 s 712.566 MB/s
Buffer-Const,s!=d,xor=1: 0.182739 s 700.454 MB/s
16384 8192 8 BYTWO_p SSE -
Seed: 1352747170
Buffer-Const,s!=d,xor=0: 0.199486 s 641.648 MB/s
Buffer-Const,s!=d,xor=1: 0.187585 s 682.357 MB/s
32768 4096 8 BYTWO_p SSE -
Seed: 1352747172
Buffer-Const,s!=d,xor=0: 0.181719 s 704.385 MB/s
Buffer-Const,s!=d,xor=1: 0.183744 s 696.620 MB/s
65536 2048 8 BYTWO_p SSE -
Seed: 1352747173
Buffer-Const,s!=d,xor=0: 0.179243 s 714.114 MB/s
Buffer-Const,s!=d,xor=1: 0.181455 s 705.409 MB/s
131072 1024 8 BYTWO_p SSE -
Seed: 1352747174
Buffer-Const,s!=d,xor=0: 0.178887 s 715.536 MB/s
Buffer-Const,s!=d,xor=1: 0.180799 s 707.969 MB/s
262144 512 8 BYTWO_p SSE -
Seed: 1352747176
Buffer-Const,s!=d,xor=0: 0.180232 s 710.196 MB/s
Buffer-Const,s!=d,xor=1: 0.180657 s 708.523 MB/s
524288 256 8 BYTWO_p SSE -
Seed: 1352747177
Buffer-Const,s!=d,xor=0: 0.180044 s 710.938 MB/s
Buffer-Const,s!=d,xor=1: 0.183542 s 697.386 MB/s
1048576 128 8 BYTWO_p SSE -
Seed: 1352747179
Buffer-Const,s!=d,xor=0: 0.188030 s 680.743 MB/s
Buffer-Const,s!=d,xor=1: 0.189776 s 674.480 MB/s
2097152 64 8 BYTWO_p SSE -
Seed: 1352747180
Buffer-Const,s!=d,xor=0: 0.188869 s 677.718 MB/s
Buffer-Const,s!=d,xor=1: 0.199248 s 642.415 MB/s
4194304 32 8 BYTWO_p SSE -
Seed: 1352747181
Buffer-Const,s!=d,xor=0: 0.191749 s 667.538 MB/s
Buffer-Const,s!=d,xor=1: 0.188193 s 680.153 MB/s
8388608 16 8 BYTWO_p SSE -
Seed: 1352747183
Buffer-Const,s!=d,xor=0: 0.200427 s 638.638 MB/s
Buffer-Const,s!=d,xor=1: 0.189489 s 675.501 MB/s
16777216 8 8 BYTWO_p SSE -
Seed: 1352747184
Buffer-Const,s!=d,xor=0: 0.206467 s 619.954 MB/s
Buffer-Const,s!=d,xor=1: 0.195798 s 653.735 MB/s
33554432 4 8 BYTWO_p SSE -
Seed: 1352747186
Buffer-Const,s!=d,xor=0: 0.226630 s 564.797 MB/s
Buffer-Const,s!=d,xor=1: 0.189382 s 675.883 MB/s
67108864 2 8 BYTWO_p SSE -
Seed: 1352747187
Buffer-Const,s!=d,xor=0: 0.279772 s 457.515 MB/s
Buffer-Const,s!=d,xor=1: 0.196061 s 652.858 MB/s
134217728 1 8 BYTWO_p SSE -
Seed: 1352747189
Buffer-Const,s!=d,xor=0: 0.148536 s 861.741 MB/s
Buffer-Const,s!=d,xor=1: 0.276922 s 462.224 MB/s
1024 131072 8 BYTWO_b SSE -
Seed: 1352747191
Buffer-Const,s!=d,xor=0: 0.137811 s 928.805 MB/s
Buffer-Const,s!=d,xor=1: 0.268928 s 475.964 MB/s
2048 65536 8 BYTWO_b SSE -
Seed: 1352747192
Buffer-Const,s!=d,xor=0: 0.132821 s 963.706 MB/s
Buffer-Const,s!=d,xor=1: 0.265851 s 481.474 MB/s
4096 32768 8 BYTWO_b SSE -
Seed: 1352747194
Buffer-Const,s!=d,xor=0: 0.131842 s 970.862 MB/s
Buffer-Const,s!=d,xor=1: 0.263387 s 485.977 MB/s
8192 16384 8 BYTWO_b SSE -
Seed: 1352747195
Buffer-Const,s!=d,xor=0: 0.131891 s 970.495 MB/s
Buffer-Const,s!=d,xor=1: 0.260863 s 490.680 MB/s
16384 8192 8 BYTWO_b SSE -
Seed: 1352747197
Buffer-Const,s!=d,xor=0: 0.128815 s 993.670 MB/s
Buffer-Const,s!=d,xor=1: 0.260589 s 491.196 MB/s
32768 4096 8 BYTWO_b SSE -
Seed: 1352747198
Buffer-Const,s!=d,xor=0: 0.127239 s 1005.979 MB/s
Buffer-Const,s!=d,xor=1: 0.261076 s 490.278 MB/s
65536 2048 8 BYTWO_b SSE -
Seed: 1352747200
Buffer-Const,s!=d,xor=0: 0.127946 s 1000.421 MB/s
Buffer-Const,s!=d,xor=1: 0.266347 s 480.576 MB/s
131072 1024 8 BYTWO_b SSE -
Seed: 1352747201
Buffer-Const,s!=d,xor=0: 0.129641 s 987.340 MB/s
Buffer-Const,s!=d,xor=1: 0.261065 s 490.299 MB/s
262144 512 8 BYTWO_b SSE -
Seed: 1352747202
Buffer-Const,s!=d,xor=0: 0.131109 s 976.285 MB/s
Buffer-Const,s!=d,xor=1: 0.259368 s 493.507 MB/s
524288 256 8 BYTWO_b SSE -
Seed: 1352747204
Buffer-Const,s!=d,xor=0: 0.130358 s 981.911 MB/s
Buffer-Const,s!=d,xor=1: 0.268218 s 477.224 MB/s
1048576 128 8 BYTWO_b SSE -
Seed: 1352747205
Buffer-Const,s!=d,xor=0: 0.135308 s 945.990 MB/s
Buffer-Const,s!=d,xor=1: 0.282554 s 453.011 MB/s
2097152 64 8 BYTWO_b SSE -
Seed: 1352747207
Buffer-Const,s!=d,xor=0: 0.141210 s 906.454 MB/s
Buffer-Const,s!=d,xor=1: 0.284272 s 450.272 MB/s
4194304 32 8 BYTWO_b SSE -
Seed: 1352747208
Buffer-Const,s!=d,xor=0: 0.150900 s 848.245 MB/s
Buffer-Const,s!=d,xor=1: 0.291628 s 438.916 MB/s
8388608 16 8 BYTWO_b SSE -
Seed: 1352747210
Buffer-Const,s!=d,xor=0: 0.147792 s 866.084 MB/s
Buffer-Const,s!=d,xor=1: 0.278963 s 458.842 MB/s
16777216 8 8 BYTWO_b SSE -
Seed: 1352747211
Buffer-Const,s!=d,xor=0: 0.154891 s 826.390 MB/s
Buffer-Const,s!=d,xor=1: 0.176620 s 724.721 MB/s
33554432 4 8 BYTWO_b SSE -
Seed: 1352747213
Buffer-Const,s!=d,xor=0: 0.193885 s 660.186 MB/s
Buffer-Const,s!=d,xor=1: 0.268795 s 476.199 MB/s
67108864 2 8 BYTWO_b SSE -
Seed: 1352747214
Buffer-Const,s!=d,xor=0: 0.204667 s 625.407 MB/s
Buffer-Const,s!=d,xor=1: 0.269170 s 475.536 MB/s
134217728 1 8 BYTWO_b SSE -
Seed: 1352747216
Buffer-Const,s!=d,xor=0: 1.940300 s 65.969 MB/s
Buffer-Const,s!=d,xor=1: 2.143284 s 59.721 MB/s
1024 131072 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747225
Buffer-Const,s!=d,xor=0: 1.923481 s 66.546 MB/s
Buffer-Const,s!=d,xor=1: 2.147470 s 59.605 MB/s
2048 65536 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747234
Buffer-Const,s!=d,xor=0: 1.916270 s 66.796 MB/s
Buffer-Const,s!=d,xor=1: 2.139770 s 59.820 MB/s
4096 32768 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747243
Buffer-Const,s!=d,xor=0: 1.938715 s 66.023 MB/s
Buffer-Const,s!=d,xor=1: 2.137380 s 59.886 MB/s
8192 16384 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747252
Buffer-Const,s!=d,xor=0: 1.922527 s 66.579 MB/s
Buffer-Const,s!=d,xor=1: 2.148529 s 59.576 MB/s
16384 8192 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747261
Buffer-Const,s!=d,xor=0: 1.929218 s 66.348 MB/s
Buffer-Const,s!=d,xor=1: 2.138858 s 59.845 MB/s
32768 4096 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747270
Buffer-Const,s!=d,xor=0: 1.921590 s 66.612 MB/s
Buffer-Const,s!=d,xor=1: 2.137566 s 59.881 MB/s
65536 2048 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747278
Buffer-Const,s!=d,xor=0: 1.932345 s 66.241 MB/s
Buffer-Const,s!=d,xor=1: 2.130586 s 60.077 MB/s
131072 1024 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747287
Buffer-Const,s!=d,xor=0: 1.944353 s 65.832 MB/s
Buffer-Const,s!=d,xor=1: 2.126287 s 60.199 MB/s
262144 512 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747296
Buffer-Const,s!=d,xor=0: 1.921692 s 66.608 MB/s
Buffer-Const,s!=d,xor=1: 2.128691 s 60.131 MB/s
524288 256 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747305
Buffer-Const,s!=d,xor=0: 1.883663 s 67.953 MB/s
Buffer-Const,s!=d,xor=1: 2.149924 s 59.537 MB/s
1048576 128 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747314
Buffer-Const,s!=d,xor=0: 1.957364 s 65.394 MB/s
Buffer-Const,s!=d,xor=1: 2.167789 s 59.046 MB/s
2097152 64 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747323
Buffer-Const,s!=d,xor=0: 1.958212 s 65.366 MB/s
Buffer-Const,s!=d,xor=1: 2.159558 s 59.271 MB/s
4194304 32 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747332
Buffer-Const,s!=d,xor=0: 1.958506 s 65.356 MB/s
Buffer-Const,s!=d,xor=1: 2.019473 s 63.383 MB/s
8388608 16 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747341
Buffer-Const,s!=d,xor=0: 1.949758 s 65.649 MB/s
Buffer-Const,s!=d,xor=1: 2.165875 s 59.099 MB/s
16777216 8 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747349
Buffer-Const,s!=d,xor=0: 1.964626 s 65.152 MB/s
Buffer-Const,s!=d,xor=1: 2.151822 s 59.484 MB/s
33554432 4 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747358
Buffer-Const,s!=d,xor=0: 2.045733 s 62.569 MB/s
Buffer-Const,s!=d,xor=1: 2.177383 s 58.786 MB/s
67108864 2 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747367
Buffer-Const,s!=d,xor=0: 2.055240 s 62.280 MB/s
Buffer-Const,s!=d,xor=1: 2.190975 s 58.421 MB/s
134217728 1 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
Seed: 1352747377
Buffer-Const,s!=d,xor=0: 0.080290 s 1594.215 MB/s
Buffer-Const,s!=d,xor=1: 0.082083 s 1559.402 MB/s
1024 131072 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747378
Buffer-Const,s!=d,xor=0: 0.059030 s 2168.378 MB/s
Buffer-Const,s!=d,xor=1: 0.064752 s 1976.763 MB/s
2048 65536 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747379
Buffer-Const,s!=d,xor=0: 0.050239 s 2547.829 MB/s
Buffer-Const,s!=d,xor=1: 0.050503 s 2534.526 MB/s
4096 32768 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747379
Buffer-Const,s!=d,xor=0: 0.044825 s 2855.560 MB/s
Buffer-Const,s!=d,xor=1: 0.045130 s 2836.220 MB/s
8192 16384 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747380
Buffer-Const,s!=d,xor=0: 0.042018 s 3046.301 MB/s
Buffer-Const,s!=d,xor=1: 0.042297 s 3026.210 MB/s
16384 8192 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747381
Buffer-Const,s!=d,xor=0: 0.040955 s 3125.413 MB/s
Buffer-Const,s!=d,xor=1: 0.041454 s 3087.754 MB/s
32768 4096 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747382
Buffer-Const,s!=d,xor=0: 0.040984 s 3123.195 MB/s
Buffer-Const,s!=d,xor=1: 0.041577 s 3078.635 MB/s
65536 2048 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747383
Buffer-Const,s!=d,xor=0: 0.041093 s 3114.859 MB/s
Buffer-Const,s!=d,xor=1: 0.042611 s 3003.911 MB/s
131072 1024 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747384
Buffer-Const,s!=d,xor=0: 0.047338 s 2703.972 MB/s
Buffer-Const,s!=d,xor=1: 0.049673 s 2576.836 MB/s
262144 512 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747385
Buffer-Const,s!=d,xor=0: 0.049656 s 2577.739 MB/s
Buffer-Const,s!=d,xor=1: 0.050634 s 2527.950 MB/s
524288 256 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747386
Buffer-Const,s!=d,xor=0: 0.049906 s 2564.833 MB/s
Buffer-Const,s!=d,xor=1: 0.051381 s 2491.188 MB/s
1048576 128 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747386
Buffer-Const,s!=d,xor=0: 0.075184 s 1702.487 MB/s
Buffer-Const,s!=d,xor=1: 0.070414 s 1817.825 MB/s
2097152 64 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747387
Buffer-Const,s!=d,xor=0: 0.108748 s 1177.034 MB/s
Buffer-Const,s!=d,xor=1: 0.111286 s 1150.190 MB/s
4194304 32 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747388
Buffer-Const,s!=d,xor=0: 0.117474 s 1089.600 MB/s
Buffer-Const,s!=d,xor=1: 0.114860 s 1114.400 MB/s
8388608 16 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747389
Buffer-Const,s!=d,xor=0: 0.126348 s 1013.075 MB/s
Buffer-Const,s!=d,xor=1: 0.109330 s 1170.768 MB/s
16777216 8 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747391
Buffer-Const,s!=d,xor=0: 0.123002 s 1040.635 MB/s
Buffer-Const,s!=d,xor=1: 0.110046 s 1163.148 MB/s
33554432 4 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747392
Buffer-Const,s!=d,xor=0: 0.159381 s 803.107 MB/s
Buffer-Const,s!=d,xor=1: 0.120685 s 1060.611 MB/s
67108864 2 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747393
Buffer-Const,s!=d,xor=0: 0.196446 s 651.578 MB/s
Buffer-Const,s!=d,xor=1: 0.121685 s 1051.896 MB/s
134217728 1 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -
Seed: 1352747520
Buffer-Const,s!=d,xor=0: 0.244784 s 522.910 MB/s
Buffer-Const,s!=d,xor=1: 0.259940 s 492.421 MB/s
1024 131072 8 SPLIT 8 4 NOSSE -
Seed: 1352747522
Buffer-Const,s!=d,xor=0: 0.243595 s 525.463 MB/s
Buffer-Const,s!=d,xor=1: 0.253145 s 505.640 MB/s
2048 65536 8 SPLIT 8 4 NOSSE -
Seed: 1352747523
Buffer-Const,s!=d,xor=0: 0.240463 s 532.307 MB/s
Buffer-Const,s!=d,xor=1: 0.251567 s 508.811 MB/s
4096 32768 8 SPLIT 8 4 NOSSE -
Seed: 1352747525
Buffer-Const,s!=d,xor=0: 0.240079 s 533.157 MB/s
Buffer-Const,s!=d,xor=1: 0.255671 s 500.643 MB/s
8192 16384 8 SPLIT 8 4 NOSSE -
Seed: 1352747527
Buffer-Const,s!=d,xor=0: 0.242857 s 527.059 MB/s
Buffer-Const,s!=d,xor=1: 0.251837 s 508.264 MB/s
16384 8192 8 SPLIT 8 4 NOSSE -
Seed: 1352747528
Buffer-Const,s!=d,xor=0: 0.240757 s 531.657 MB/s
Buffer-Const,s!=d,xor=1: 0.253888 s 504.160 MB/s
32768 4096 8 SPLIT 8 4 NOSSE -
Seed: 1352747530
Buffer-Const,s!=d,xor=0: 0.240586 s 532.034 MB/s
Buffer-Const,s!=d,xor=1: 0.256642 s 498.749 MB/s
65536 2048 8 SPLIT 8 4 NOSSE -
Seed: 1352747532
Buffer-Const,s!=d,xor=0: 0.238570 s 536.529 MB/s
Buffer-Const,s!=d,xor=1: 0.254111 s 503.717 MB/s
131072 1024 8 SPLIT 8 4 NOSSE -
Seed: 1352747533
Buffer-Const,s!=d,xor=0: 0.237666 s 538.572 MB/s
Buffer-Const,s!=d,xor=1: 0.254334 s 503.275 MB/s
262144 512 8 SPLIT 8 4 NOSSE -
Seed: 1352747535
Buffer-Const,s!=d,xor=0: 0.244512 s 523.491 MB/s
Buffer-Const,s!=d,xor=1: 0.255911 s 500.174 MB/s
524288 256 8 SPLIT 8 4 NOSSE -
Seed: 1352747537
Buffer-Const,s!=d,xor=0: 0.242439 s 527.968 MB/s
Buffer-Const,s!=d,xor=1: 0.255622 s 500.740 MB/s
1048576 128 8 SPLIT 8 4 NOSSE -
Seed: 1352747538
Buffer-Const,s!=d,xor=0: 0.248633 s 514.815 MB/s
Buffer-Const,s!=d,xor=1: 0.257451 s 497.181 MB/s
2097152 64 8 SPLIT 8 4 NOSSE -
Seed: 1352747540
Buffer-Const,s!=d,xor=0: 0.241531 s 529.952 MB/s
Buffer-Const,s!=d,xor=1: 0.264452 s 484.020 MB/s
4194304 32 8 SPLIT 8 4 NOSSE -
Seed: 1352747542
Buffer-Const,s!=d,xor=0: 0.255533 s 500.914 MB/s
Buffer-Const,s!=d,xor=1: 0.248849 s 514.368 MB/s
8388608 16 8 SPLIT 8 4 NOSSE -
Seed: 1352747543
Buffer-Const,s!=d,xor=0: 0.259687 s 492.902 MB/s
Buffer-Const,s!=d,xor=1: 0.264417 s 484.084 MB/s
16777216 8 8 SPLIT 8 4 NOSSE -
Seed: 1352747545
Buffer-Const,s!=d,xor=0: 0.267928 s 477.740 MB/s
Buffer-Const,s!=d,xor=1: 0.269417 s 475.100 MB/s
33554432 4 8 SPLIT 8 4 NOSSE -
Seed: 1352747547
Buffer-Const,s!=d,xor=0: 0.295526 s 433.126 MB/s
Buffer-Const,s!=d,xor=1: 0.270747 s 472.766 MB/s
67108864 2 8 SPLIT 8 4 NOSSE -
Seed: 1352747549
Buffer-Const,s!=d,xor=0: 0.342706 s 373.498 MB/s
Buffer-Const,s!=d,xor=1: 0.266642 s 480.045 MB/s
134217728 1 8 SPLIT 8 4 NOSSE -
Seed: 1352747551
Buffer-Const,s!=d,xor=0: 0.027748 s 4612.927 MB/s
Buffer-Const,s!=d,xor=1: 0.028090 s 4556.704 MB/s
1024 131072 8 SPLIT 8 4 SSE -
Seed: 1352747552
Buffer-Const,s!=d,xor=0: 0.023128 s 5534.409 MB/s
Buffer-Const,s!=d,xor=1: 0.023134 s 5533.040 MB/s
2048 65536 8 SPLIT 8 4 SSE -
Seed: 1352747552
Buffer-Const,s!=d,xor=0: 0.019114 s 6696.740 MB/s
Buffer-Const,s!=d,xor=1: 0.019763 s 6476.596 MB/s
4096 32768 8 SPLIT 8 4 SSE -
Seed: 1352747553
Buffer-Const,s!=d,xor=0: 0.017541 s 7297.119 MB/s
Buffer-Const,s!=d,xor=1: 0.018266 s 7007.661 MB/s
8192 16384 8 SPLIT 8 4 SSE -
Seed: 1352747554
Buffer-Const,s!=d,xor=0: 0.017010 s 7524.892 MB/s
Buffer-Const,s!=d,xor=1: 0.017399 s 7356.613 MB/s
16384 8192 8 SPLIT 8 4 SSE -
Seed: 1352747555
Buffer-Const,s!=d,xor=0: 0.016979 s 7538.522 MB/s
Buffer-Const,s!=d,xor=1: 0.017508 s 7311.130 MB/s
32768 4096 8 SPLIT 8 4 SSE -
Seed: 1352747555
Buffer-Const,s!=d,xor=0: 0.016780 s 7628.283 MB/s
Buffer-Const,s!=d,xor=1: 0.017439 s 7340.018 MB/s
65536 2048 8 SPLIT 8 4 SSE -
Seed: 1352747556
Buffer-Const,s!=d,xor=0: 0.017527 s 7302.876 MB/s
Buffer-Const,s!=d,xor=1: 0.018656 s 6861.145 MB/s
131072 1024 8 SPLIT 8 4 SSE -
Seed: 1352747557
Buffer-Const,s!=d,xor=0: 0.020679 s 6189.855 MB/s
Buffer-Const,s!=d,xor=1: 0.022183 s 5770.138 MB/s
262144 512 8 SPLIT 8 4 SSE -
Seed: 1352747558
Buffer-Const,s!=d,xor=0: 0.020437 s 6263.296 MB/s
Buffer-Const,s!=d,xor=1: 0.021715 s 5894.434 MB/s
524288 256 8 SPLIT 8 4 SSE -
Seed: 1352747558
Buffer-Const,s!=d,xor=0: 0.020800 s 6153.883 MB/s
Buffer-Const,s!=d,xor=1: 0.021934 s 5835.617 MB/s
1048576 128 8 SPLIT 8 4 SSE -
Seed: 1352747559
Buffer-Const,s!=d,xor=0: 0.035634 s 3592.095 MB/s
Buffer-Const,s!=d,xor=1: 0.036323 s 3523.977 MB/s
2097152 64 8 SPLIT 8 4 SSE -
Seed: 1352747560
Buffer-Const,s!=d,xor=0: 0.050565 s 2531.419 MB/s
Buffer-Const,s!=d,xor=1: 0.048358 s 2646.914 MB/s
4194304 32 8 SPLIT 8 4 SSE -
Seed: 1352747561
Buffer-Const,s!=d,xor=0: 0.053646 s 2386.008 MB/s
Buffer-Const,s!=d,xor=1: 0.047063 s 2719.766 MB/s
8388608 16 8 SPLIT 8 4 SSE -
Seed: 1352747562
Buffer-Const,s!=d,xor=0: 0.055658 s 2299.775 MB/s
Buffer-Const,s!=d,xor=1: 0.047532 s 2692.918 MB/s
16777216 8 8 SPLIT 8 4 SSE -
Seed: 1352747563
Buffer-Const,s!=d,xor=0: 0.064355 s 1988.963 MB/s
Buffer-Const,s!=d,xor=1: 0.047547 s 2692.067 MB/s
33554432 4 8 SPLIT 8 4 SSE -
Seed: 1352747563
Buffer-Const,s!=d,xor=0: 0.084876 s 1508.086 MB/s
Buffer-Const,s!=d,xor=1: 0.048017 s 2665.721 MB/s
67108864 2 8 SPLIT 8 4 SSE -
Seed: 1352747564
Buffer-Const,s!=d,xor=0: 0.121661 s 1052.104 MB/s
Buffer-Const,s!=d,xor=1: 0.047558 s 2691.447 MB/s
134217728 1 8 SPLIT 8 4 SSE -
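
The timing output above repeats one record shape: a `Seed:` line, two `Buffer-Const` result lines, then a label line giving the region size, the iteration count, w, and the method arguments. The following is a minimal sketch, assuming that layout, of how the best MB/s for one method could be extracted. The function names `parse_result_line` and `best_mbps_for_method` are hypothetical; this is not the `junk-pick-best-output` program used by the graph file, whose source is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse one result line such as
 *   "Buffer-Const,s!=d,xor=0: 0.146994 s 870.786 MB/s"
 * Returns 1 and fills *seconds and *mbps on success, 0 otherwise. */
int parse_result_line(const char *line, double *seconds, double *mbps)
{
    const char *colon = strchr(line, ':');

    if (colon == NULL || strncmp(line, "Buffer-", 7) != 0) return 0;
    return sscanf(colon + 1, "%lf s %lf MB/s", seconds, mbps) == 2;
}

/* Scan timing output and return the highest MB/s among the records whose
 * label (the text after the three leading numbers) equals `method`.
 * Assumption: the result lines of a record come BEFORE its label line.
 * Returns -1.0 if the method never appears. */
double best_mbps_for_method(FILE *in, const char *method)
{
    char line[256];
    double best = -1.0, pending = -1.0, secs, mbps;
    int off;

    assert(in != NULL && method != NULL);

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';      /* strip the line ending */

        if (parse_result_line(line, &secs, &mbps)) {
            if (mbps > pending) pending = mbps;  /* best of this record so far */
            continue;
        }

        /* A label line starts with three integers; %n records where the
         * method text begins and is only stored if all three matched. */
        off = -1;
        (void) sscanf(line, "%*ld %*ld %*d %n", &off);
        if (off >= 0) {
            if (strcmp(line + off, method) == 0 && pending > best) best = pending;
            pending = -1.0;                      /* start the next record */
        }
    }
    return best;
}
```

Calling `best_mbps_for_method(fp, "SPLIT 8 4 SSE -")` on an open file of such output would return the largest figure across all region sizes for that method. `Seed:` lines match neither pattern and are skipped.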

13
junk-w8-timing-tests.sh Normal file

@ -0,0 +1,13 @@
sh tmp-time-test.sh 8 LOG - -
sh tmp-time-test.sh 8 LOG_ZERO - -
sh tmp-time-test.sh 8 TABLE - -
sh tmp-time-test.sh 8 TABLE DOUBLE -
sh tmp-time-test.sh 8 TABLE DOUBLE,LAZY -
sh tmp-time-test.sh 8 BYTWO_p - -
sh tmp-time-test.sh 8 BYTWO_b - -
sh tmp-time-test.sh 8 BYTWO_p SSE -
sh tmp-time-test.sh 8 BYTWO_b SSE -
sh tmp-time-test.sh 8 SPLIT 8 4 NOSSE -
sh tmp-time-test.sh 8 SPLIT 8 4 SSE -
sh tmp-time-test.sh 8 COMPOSITE 2 4 TABLE SINGLE,SSE - - -
sh tmp-time-test.sh 8 COMPOSITE 2 4 TABLE SINGLE,SSE - ALTMAP -

11
junk-w8-timing.jgr Normal file

@ -0,0 +1,11 @@
newgraph
xaxis size 4 min 0 no_auto_hash_labels
hash_labels hjl vjc rotate -90 fontsize 11
shell : junk-pick-best-output < junk-w8-timing-out.txt | sort -nr | sed 's/.............//' | awk '{ print "hash_label at ", ++l, ":", $0 }'
yaxis size 1 min 0 label : MB/s
newcurve marktype xbar cfill 1 1 0 marksize 1 pts
shell : junk-pick-best-output < junk-w8-timing-out.txt | sort -nr | awk '{ print $1 }' | cat -n

17
junk.c Normal file

@ -0,0 +1,17 @@
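#include <stdio.h>

/* Scratch check of an MB/s calculation, computed two ways. */
int main(void)
{
    int size, iterations;
    double ds, di, elapsed;

    elapsed = 0.614553;
    size = 8192;
    iterations = 655360;
    ds = size;
    di = iterations;
    /* Convert to double before multiplying: size*iterations is
       5368709120, which does not fit in a 32-bit int. */
    printf("%10.3lf\n", ((double) size * (double) iterations) / (1024 * 1024 * elapsed));
    printf("%10.3lf\n", ds * di / 1024.0 / 1024.0 / elapsed);
    return 0;
}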
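
The MB/s figures in the timing output are consistent with total bytes divided by 2^20 and by the elapsed time: every label line has a region size and iteration count whose product is 134217728 bytes (128 MB), and 128 / 0.146994 s gives the 870.786 MB/s reported for one TABLE run. A minimal sketch of that formula with 64-bit operands follows; the function name `throughput_mb_per_s` is hypothetical and is not taken from the library.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in MB/s (1 MB = 1024 * 1024 bytes) for `iterations` passes
 * over a region of `region_bytes` bytes completed in `elapsed_s` seconds.
 * The operands are widened to double before multiplying, so a product
 * such as 8192 * 655360 = 5368709120 (larger than INT32_MAX) is safe. */
double throughput_mb_per_s(int64_t region_bytes, int64_t iterations, double elapsed_s)
{
    assert(elapsed_s > 0.0);
    return (double) region_bytes * (double) iterations
           / (1024.0 * 1024.0) / elapsed_s;
}
```

With the values hard-coded in junk.c (8192-byte regions, 655360 iterations, 0.614553 s) this evaluates to roughly 8331 MB/s.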

199
junk.ps Normal file

@ -0,0 +1,199 @@
%!PS-Adobe-2.0 EPSF-1.2
%%Page: 1 1
%%BoundingBox: -40 -93 292 73
%%EndComments
180.000000 406.000000 translate
1 setlinecap 1 setlinejoin
0.700 setlinewidth
0.00 setgray
/Jrnd { exch cvi exch cvi dup 3 1 roll idiv mul } def
/JDEdict 8 dict def
JDEdict /mtrx matrix put
/JDE {
JDEdict begin
/yrad exch def
/xrad exch def
/savematrix mtrx currentmatrix def
xrad yrad scale
0 0 1 0 360 arc
savematrix setmatrix
end
} def
/JSTR {
gsave 1 eq { gsave 1 setgray fill grestore } if
exch neg exch neg translate
clip
rotate
4 dict begin
pathbbox /&top exch def
/&right exch def
/&bottom exch def
&right sub /&width exch def
newpath
currentlinewidth mul round dup
&bottom exch Jrnd exch &top
4 -1 roll currentlinewidth mul setlinewidth
{ &right exch moveto &width 0 rlineto stroke } for
end
grestore
newpath
} bind def
gsave /Times-Roman findfont 9.000000 scalefont setfont
0.000000 0.000000 translate
0.700000 setlinewidth gsave newpath 0.000000 0.000000 moveto 288.000000 0.000000 lineto stroke
newpath 0.000000 0.000000 moveto 0.000000 -5.000000 lineto stroke
newpath 28.799999 0.000000 moveto 28.799999 -2.000000 lineto stroke
newpath 57.599998 0.000000 moveto 57.599998 -5.000000 lineto stroke
newpath 86.399994 0.000000 moveto 86.399994 -2.000000 lineto stroke
newpath 115.199997 0.000000 moveto 115.199997 -5.000000 lineto stroke
newpath 144.000000 0.000000 moveto 144.000000 -2.000000 lineto stroke
newpath 172.799988 0.000000 moveto 172.799988 -5.000000 lineto stroke
newpath 201.599991 0.000000 moveto 201.599991 -2.000000 lineto stroke
newpath 230.399994 0.000000 moveto 230.399994 -5.000000 lineto stroke
newpath 259.199982 0.000000 moveto 259.199982 -2.000000 lineto stroke
newpath 288.000000 0.000000 moveto 288.000000 -5.000000 lineto stroke
/Times-Roman findfont 11.000000 scalefont setfont
gsave 28.799999 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE SINGLE,SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 57.599998 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_b SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 86.399994 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_b) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 115.199997 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_p SSE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 144.000000 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE QUAD,LAZY) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 172.799988 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE QUAD) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 201.599991 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (BYTWO_p) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 230.399994 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE DOUBLE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 259.199982 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (TABLE SINGLE) dup stringwidth pop pop 0 0 moveto
show
grestore
gsave 288.000000 -8.000000 translate -90.000000 rotate
0 -3.300000 translate (LOG) dup stringwidth pop pop 0 0 moveto
show
grestore
grestore
0.700000 setlinewidth gsave newpath 0.000000 0.000000 moveto 0.000000 72.000000 lineto stroke
newpath 0.000000 0.000000 moveto -5.000000 0.000000 lineto stroke
newpath 0.000000 8.552223 moveto -2.000000 8.552223 lineto stroke
newpath 0.000000 17.104446 moveto -5.000000 17.104446 lineto stroke
newpath 0.000000 25.656670 moveto -2.000000 25.656670 lineto stroke
newpath 0.000000 34.208893 moveto -5.000000 34.208893 lineto stroke
newpath 0.000000 42.761116 moveto -2.000000 42.761116 lineto stroke
newpath 0.000000 51.313339 moveto -5.000000 51.313339 lineto stroke
newpath 0.000000 59.865562 moveto -2.000000 59.865562 lineto stroke
newpath 0.000000 68.417786 moveto -5.000000 68.417786 lineto stroke
/Times-Roman findfont 9.000000 scalefont setfont
gsave -8.000000 0.000000 translate 0.000000 rotate
0 -2.700000 translate (0) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 17.104446 translate 0.000000 rotate
0 -2.700000 translate (2000) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 34.208893 translate 0.000000 rotate
0 -2.700000 translate (4000) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 51.313339 translate 0.000000 rotate
0 -2.700000 translate (6000) dup stringwidth pop neg 0 moveto
show
grestore
gsave -8.000000 68.417786 translate 0.000000 rotate
0 -2.700000 translate (8000) dup stringwidth pop neg 0 moveto
show
grestore
/Times-Bold findfont 10.000000 scalefont setfont
gsave -33.279999 36.000000 translate 90.000000 rotate
0 0.000000 translate (MB/s) dup stringwidth pop 2 div neg 0 moveto
show
grestore
grestore
gsave
gsave gsave 28.799999 72.000000 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -72.000000 lineto
14.400000 -72.000000 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 57.599998 23.516296 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -23.516296 lineto
14.400000 -23.516296 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 86.399994 20.308016 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -20.308016 lineto
14.400000 -20.308016 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 115.199997 13.716681 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -13.716681 lineto
14.400000 -13.716681 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 144.000000 11.183632 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -11.183632 lineto
14.400000 -11.183632 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 172.799988 10.863582 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -10.863582 lineto
14.400000 -10.863582 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 201.599991 8.547887 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -8.547887 lineto
14.400000 -8.547887 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 230.399994 7.811883 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -7.811883 lineto
14.400000 -7.811883 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 259.199982 4.485872 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -4.485872 lineto
14.400000 -4.485872 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore gsave 288.000000 1.912226 translate 0.000000 rotate
newpath 14.400000 0.000000 moveto -14.400000 0.000000 lineto
-14.400000 -1.912226 lineto
14.400000 -1.912226 lineto
closepath gsave 1.000000 1.000000 0.000000 setrgbcolor fill grestore
stroke
grestore grestore
grestore
-0.000000 -0.000000 translate
grestore showpage

14
junk.sh Normal file

@ -0,0 +1,14 @@
gf_time 4 R -1 1024000 1000 - ; echo '-'
gf_time 4 R -1 1024000 1000 SHIFT - - ; echo 'SHIFT - -'
gf_time 4 R -1 1024000 1000 BYTWO_p - - ; echo 'BYTWO_p - -'
gf_time 4 R -1 1024000 1000 BYTWO_p SSE - ; echo 'BYTWO_p SSE -'
gf_time 4 R -1 1024000 1000 BYTWO_b - - ; echo 'BYTWO_b - -'
gf_time 4 R -1 1024000 1000 BYTWO_b SSE - ; echo 'BYTWO_b SSE -'
gf_time 4 R -1 1024000 1000 TABLE - - ; echo 'TABLE - -'
gf_time 4 R -1 1024000 1000 TABLE SINGLE - ; echo 'TABLE SINGLE -'
gf_time 4 R -1 1024000 1000 TABLE DOUBLE - ; echo 'TABLE DOUBLE -'
gf_time 4 R -1 1024000 1000 TABLE QUAD - ; echo 'TABLE QUAD -'
gf_time 4 R -1 1024000 1000 TABLE QUAD,LAZY - ; echo 'TABLE QUAD,LAZY -'
gf_time 4 R -1 1024000 1000 TABLE SINGLE,SSE - ; echo 'TABLE SINGLE,SSE -'
gf_time 4 R -1 1024000 1000 TABLE SINGLE,NOSSE - ; echo 'TABLE SINGLE,NOSSE -'
gf_time 4 R -1 1024000 1000 LOG - - ; echo 'LOG - -'
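
The bar chart committed alongside these runs labels its axis MB/s. Below is a minimal sketch of that arithmetic. It assumes that the arguments 1024000 and 1000 are the region size in bytes and the iteration count, and that a megabyte means 10^6 bytes; the script itself states neither. `throughput_mb_per_s` is a hypothetical helper, not part of gf_time.

```c
#include <assert.h>

/* Hypothetical helper (not from the library): megabytes processed per second
   when a region of region_bytes is multiplied `iterations` times in `seconds`.
   Assumes 1 MB = 10^6 bytes; swap the divisor if 2^20 is intended. */
static double throughput_mb_per_s(long region_bytes, long iterations, double seconds)
{
  assert(seconds > 0.0);
  return ((double) region_bytes * (double) iterations) / seconds / 1.0e6;
}
```

Under those assumptions, a run that takes one second for the arguments used above would come out at 1024 MB/s, which gives a rough scale for the 0 to 8000 axis in the figure.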

110
junk.txt Normal file

@ -0,0 +1,110 @@
static
void
gf_w16_bytwo_b_sse_region_2_noxor(gf_region_data *rd, struct gf_w16_bytwo_data *btd)
{
#ifdef INTEL_SSE4
uint8_t *d8, *s8;
__m128i pp, m1, m2, t1, t2, va;
s8 = (uint8_t *) rd->s_start;
d8 = (uint8_t *) rd->d_start;
pp = _mm_set1_epi16(btd->prim_poly&0xffff);
m1 = _mm_set1_epi16((btd->mask1)&0xffff);
m2 = _mm_set1_epi16((btd->mask2)&0xffff);
while (d8 < (uint8_t *) rd->d_top) {
va = _mm_load_si128 ((__m128i *)(s8));
SSE_AB2(pp, m1, m2, va, t1, t2);
_mm_store_si128((__m128i *)d8, va);
d8 += 16;
s8 += 16;
}
#endif
}
static
void
gf_w16_bytwo_b_sse_region_2_xor(gf_region_data *rd, struct gf_w16_bytwo_data *btd)
{
#ifdef INTEL_SSE4
uint8_t *d8, *s8;
__m128i pp, m1, m2, t1, t2, va, vb;
s8 = (uint8_t *) rd->s_start;
d8 = (uint8_t *) rd->d_start;
pp = _mm_set1_epi16(btd->prim_poly&0xffff);
m1 = _mm_set1_epi16((btd->mask1)&0xffff);
m2 = _mm_set1_epi16((btd->mask2)&0xffff);
while (d8 < (uint8_t *) rd->d_top) {
va = _mm_load_si128 ((__m128i *)(s8));
SSE_AB2(pp, m1, m2, va, t1, t2);
vb = _mm_load_si128 ((__m128i *)(d8));
vb = _mm_xor_si128(vb, va);
_mm_store_si128((__m128i *)d8, vb);
d8 += 16;
s8 += 16;
}
#endif
}
static
void
gf_w16_bytwo_b_sse_multiply_region(gf_t *gf, void *src, void *dest, gf_val_32_t val, int bytes, int xor)
{
#ifdef INTEL_SSE4
int itb;
uint8_t *d8, *s8;
__m128i pp, m1, m2, t1, t2, va, vb;
struct gf_w16_bytwo_data *btd;
gf_region_data rd;
if (val == 0) { gf_multby_zero(dest, bytes, xor); return; }
if (val == 1) { gf_multby_one(gf, src, dest, bytes, xor); return; }
gf_set_region_data(&rd, gf, src, dest, bytes, val, xor, 16);
gf_do_initial_region_alignment(&rd);
btd = (struct gf_w16_bytwo_data *) ((gf_internal_t *) (gf->scratch))->private;
if (val == 2) {
if (xor) {
gf_w16_bytwo_b_sse_region_2_xor(&rd, btd);
} else {
gf_w16_bytwo_b_sse_region_2_noxor(&rd, btd);
}
gf_do_final_region_alignment(&rd);
return;
}
s8 = (uint8_t *) rd.s_start;
d8 = (uint8_t *) rd.d_start;
pp = _mm_set1_epi16(btd->prim_poly&0xffff);
m1 = _mm_set1_epi16((btd->mask1)&0xffff);
m2 = _mm_set1_epi16((btd->mask2)&0xffff);
while (d8 < (uint8_t *) rd.d_top) {
va = _mm_load_si128 ((__m128i *)(s8));
vb = (!xor) ? _mm_setzero_si128() : _mm_load_si128 ((__m128i *)(d8));
itb = val;
while (1) {
if (itb & 1) vb = _mm_xor_si128(vb, va);
itb >>= 1;
if (itb == 0) break;
SSE_AB2(pp, m1, m2, va, t1, t2);
}
_mm_store_si128((__m128i *)d8, vb);
d8 += 16;
s8 += 16;
}
gf_do_final_region_alignment(&rd);
#endif
}
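
The inner loop of gf_w16_bytwo_b_sse_multiply_region is easier to follow one word at a time. What follows is a scalar sketch of the same shift-and-add idea, not the library's code: for each set bit of val, XOR the current multiple of the source word into the product, then double the source word in the field. The function names are hypothetical. The polynomial 0x1100b is an assumed value for w=16; the code above reads it from btd->prim_poly and never shows it.

```c
#include <assert.h>
#include <stdint.h>

/* Double a in GF(2^16): shift left by one and, if bit 15 was set, reduce by
   the primitive polynomial. This is the per-lane effect SSE_AB2 is used for. */
static uint16_t gf16_times_two(uint16_t a, uint32_t prim_poly)
{
  uint32_t t = (uint32_t) a << 1;
  if (t & 0x10000) t ^= prim_poly;
  return (uint16_t) (t & 0xffff);
}

/* Scalar BYTWO_b-style multiply: walk the bits of val from low to high,
   adding (XORing) a, 2a, 4a, ... for each bit that is set. */
static uint16_t gf16_mult_bytwo_b(uint16_t a, uint16_t val, uint32_t prim_poly)
{
  uint16_t prod = 0;
  assert((prim_poly >> 16) == 1);   /* polynomial must have degree 16 */
  while (val != 0) {
    if (val & 1) prod ^= a;
    val >>= 1;
    if (val != 0) a = gf16_times_two(a, prim_poly);
  }
  return prod;
}
```

With the assumed polynomial, gf16_mult_bytwo_b(0x8000, 2, 0x1100b) yields 0x100b: the shift overflows bit 15 and the reduction folds the polynomial back in. The SSE version does the same thing to eight 16-bit words per iteration.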

957
junk_gf_unit.c Normal file

@ -0,0 +1,957 @@
/*
* gf_unit.c
*
* Performs unit testing for gf arithmetic
*/
#include <stdio.h>
#include <getopt.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include "gf.h"
#include "gf_int.h"
#include "gf_method.h"
#include "gf_rand.h"
#define REGION_SIZE (65536)
static
uint8_t get_alt_map_2w8(int offset, uint8_t *buf, int region_size)
{
uint8_t symbol = 0;
int bit_off = offset % 2;
if (bit_off == 0) {
symbol = (buf[offset / 2] & 0x0f) | ((buf[(offset / 2)+region_size] & 0x0f) << 4);
} else {
symbol = ((buf[offset / 2] & 0xf0) >> 4) | (buf[(offset / 2)+region_size] & 0xf0);
}
return symbol;
}
static
uint16_t get_alt_map_2w16(int offset, uint8_t *buf, int region_size)
{
uint16_t symbol = 0;
symbol = buf[offset] | (buf[offset+region_size] << 8);
return symbol;
}
static
uint32_t get_alt_map_2w32(int offset, uint8_t *buf, int region_size)
{
uint32_t symbol = 0;
uint16_t buf_a = buf[offset] | (buf[offset + 1] << 8);
uint16_t buf_b = buf[offset + region_size] | (buf[offset + region_size + 1] << 8);
symbol = buf_a | (buf_b << 16);
return symbol;
}
static
void test_alt_map()
{
uint8_t* buf = (uint8_t*)malloc(sizeof(uint8_t)*REGION_SIZE);
int i=0;
uint8_t c=1, next_c;
for (i=0; i < REGION_SIZE/2;i++) {
if (c == 255) c = 1;
buf[i] = c;
buf[i+(REGION_SIZE/2)] = c;
c++;
}
c = 1;
for (i=0; i < REGION_SIZE;i++) {
uint8_t sym_w8 = get_alt_map_2w8(i, buf, REGION_SIZE/2);
uint8_t c_val = ((i % 2) == 0) ? (c & 0x0f) : ((c & 0xf0) >> 4);
uint8_t exp_sym_w8 = c_val | c_val << 4;
if (exp_sym_w8 != sym_w8) {
fprintf(stderr, "Alt mapping failure (w=8,c=%d,i=%d): %u != %u\n", c, i, exp_sym_w8, sym_w8);
exit(1);
}
if ((i % 2) == 1) {
c++;
}
if (c == 255) {
c = 1;
}
}
c = 1;
for (i=0; i < REGION_SIZE/2;i++) {
uint16_t sym_w16 = get_alt_map_2w16(i, buf, REGION_SIZE/2);
uint16_t exp_sym_w16 = c | c << 8;
if (exp_sym_w16 != sym_w16) {
fprintf(stderr, "Alt mapping failure (w=16,c=%d,i=%d): %u != %u\n", c, i, exp_sym_w16, sym_w16);
exit(1);
}
c++;
if (c == 255) {
c = 1;
}
}
c = 1;
next_c = 2;
for (i=0; i < REGION_SIZE/4;i++) {
uint32_t sym_w32 = get_alt_map_2w32(i, buf, REGION_SIZE/2);
uint32_t exp_sym_w32 = c | (next_c << 8) | c << 16 | (next_c << 24);
if (exp_sym_w32 != sym_w32) {
fprintf(stderr, "Alt mapping failure (w=32,c=%d,i=%d): %u != %u\n", c, i, exp_sym_w32, sym_w32);
exit(1);
}
c++;
next_c++;
if (c == 255) {
c = 1;
next_c = 2;
} else if (c == 254) {
next_c = 1;
}
}
free(buf);
}
void fill_random_region(void *reg, int size)
{
uint32_t *r;
int i;
r = (uint32_t *) reg;
for (i = 0; i < size/sizeof(uint32_t); i++) {
r[i] = MOA_Random_32();
}
}
void problem(char *s)
{
fprintf(stderr, "Unit test failed.\n");
fprintf(stderr, "%s\n", s);
exit(1);
}
void usage(char *s)
{
fprintf(stderr, "usage: gf_unit w tests seed [method] - does unit testing in GF(2^w)\n");
fprintf(stderr, "\n");
fprintf(stderr, "Legal w are: 4, 8, 16, 32, 64 and 128\n");
fprintf(stderr, "\n");
fprintf(stderr, "Tests may be any combination of:\n");
fprintf(stderr, " A: All\n");
fprintf(stderr, " S: Single operations (multiplication/division)\n");
fprintf(stderr, " R: Region operations\n");
fprintf(stderr, " V: Verbose Output\n");
fprintf(stderr, "\n");
fprintf(stderr, "Use -1 for time(0) as a seed.\n");
fprintf(stderr, "\n");
fprintf(stderr, "For method specification, type gf_methods\n");
fprintf(stderr, "\n");
if (s != NULL) fprintf(stderr, "%s\n", s);
exit(1);
}
int main(int argc, char **argv)
{
int w, i, j, verbose, single, region, xor, off, size, sindex, eindex, tested, top;
uint32_t a, b, c, d, ai, da, bi, mask;
uint64_t a64, b64, c64, d64;
uint64_t a128[2], b128[2], c128[2], d128[2], e128[2];
gf_t gf, gf_def;
uint8_t *r8b, *r8c, *r8d;
uint16_t *r16b, *r16c, *r16d;
uint32_t *r32b, *r32c, *r32d;
uint64_t *r64b, *r64c, *r64d;
uint64_t *r128b, *r128c, *r128d;
time_t t0;
gf_internal_t *h;
if (argc < 4) usage(NULL);
if (sscanf(argv[1], "%d", &w) == 0) usage("Bad w\n");
if (sscanf(argv[3], "%ld", &t0) == 0) usage("Bad seed\n");
if (t0 == -1) t0 = time(0);
MOA_Seed(t0);
if (w > 32 && w != 64 && w != 128) usage("Bad w");
if (create_gf_from_argv(&gf, w, argc, argv, 4) == 0) usage("Bad Method");
for (i = 0; i < strlen(argv[2]); i++) {
if (strchr("ASRV", argv[2][i]) == NULL) usage("Bad test\n");
}
h = (gf_internal_t *) gf.scratch;
if (w <= 32) {
mask = 0;
for (i = 0; i < w; i++) mask |= ((uint32_t) 1 << i);
}
verbose = (strchr(argv[2], 'V') != NULL);
single = (strchr(argv[2], 'S') != NULL || strchr(argv[2], 'A') != NULL);
region = (strchr(argv[2], 'R') != NULL || strchr(argv[2], 'A') != NULL);
if (((h->region_type & GF_REGION_ALTMAP) != 0) && (h->mult_type == GF_MULT_COMPOSITE)) {
test_alt_map();
}
if (!gf_init_easy(&gf_def, w, GF_MULT_DEFAULT)) problem("No default for this value of w");
if (verbose) printf("Seed: %ld\n", t0);
if (single) {
if (w <= 32) {
if (gf.multiply.w32 == NULL) problem("No multiplication operation defined.");
if (verbose) { printf("Testing single multiplications/divisions.\n"); fflush(stdout); }
if (w <= 10) {
top = (1 << w)*(1 << w);
} else {
top = 1000000;
}
for (i = 0; i < top; i++) {
if (w <= 10) {
a = i % (1 << w);
b = i >> w;
} else if (i < 10) {
a = 0;
b = MOA_Random_W(w, 1);
} else if (i < 20) {
b = 0;
a = MOA_Random_W(w, 1);
} else if (i < 30) {
a = 1;
b = MOA_Random_W(w, 1);
} else if (i < 40) {
b = 1;
a = MOA_Random_W(w, 1);
} else {
a = MOA_Random_W(w, 1);
b = MOA_Random_W(w, 1);
}
c = gf.multiply.w32(&gf, a, b);
tested = 0;
/* If this is not composite, then first test against the default: */
if (h->mult_type != GF_MULT_COMPOSITE) {
tested = 1;
d = gf_def.multiply.w32(&gf_def, a, b);
if (c != d) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The default returned %x\n", d);
exit(1);
}
}
/* Now, we also need to double-check, in case the default is wanky, and when
we're performing composite operations. Start with 0 and 1: */
if (a == 0 || b == 0 || a == 1 || b == 1) {
tested = 1;
if (((a == 0 || b == 0) && c != 0) ||
(a == 1 && c != b) ||
(b == 1 && c != a)) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x, which is clearly wrong.\n", a, b, c);
exit(1);
}
} else {
/* Otherwise, at least make sure that the product fits in w bits. */
if ((c & mask) != c) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x, which is too big.\n", a, b, c);
exit(1);
}
}
/* If division or inverses are defined, let's test all combinations to make sure
that the operations are consistent with each other. */
if (gf.inverse.w32 != NULL && (a != 0 || b != 0)) {
tested = 1;
if (a != 0) {
ai = gf.inverse.w32(&gf, a);
if (gf.multiply.w32(&gf, c, ai) != b) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The inverse of %x is %x, and gf_multiply.w32() of %x and %x equals %x\n",
a, ai, c, ai, gf.multiply.w32(&gf, c, ai));
exit(1);
}
}
if (b != 0) {
bi = gf.inverse.w32(&gf, b);
if (gf.multiply.w32(&gf, c, bi) != a) {
printf("Error in single multiplication (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" The inverse of %x is %x, and gf_multiply.w32() of %x and %x equals %x\n",
b, bi, c, bi, gf.multiply.w32(&gf, c, bi));
exit(1);
}
}
}
if (gf.divide.w32 != NULL && (a != 0 || b != 0)) {
tested = 1;
if (a != 0) {
ai = gf.divide.w32(&gf, c, a);
if (ai != b) {
printf("Error in single multiplication/division (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" gf.divide.w32() of %x and %x returned %x\n", c, a, ai);
exit(1);
}
}
if (b != 0) {
bi = gf.divide.w32(&gf, c, b);
if (bi != a) {
printf("Error in single multiplication/division (all numbers in hex):\n\n");
printf(" gf.multiply.w32() of %x and %x returned %x\n", a, b, c);
printf(" gf.divide.w32() of %x and %x returned %x\n", c, b, bi);
exit(1);
}
}
}
if (!tested) problem("There is no way to test multiplication.\n");
}
} else if (w == 64) {
if (verbose) { printf("Testing single multiplications/divisions.\n"); fflush(stdout); }
if (gf.multiply.w64 == NULL) problem("No multiplication operation defined.");
for (i = 0; i < 1000; i++) {
for (j = 0; j < 1000; j++) {
a64 = MOA_Random_64();
b64 = MOA_Random_64();
c64 = gf.multiply.w64(&gf, a64, b64);
if ((a64 == 0 || b64 == 0) && c64 != 0) problem("Single Multiplication by zero Failed");
if (a64 != 0 && b64 != 0) {
d64 = (gf.divide.w64 == NULL) ? gf_def.divide.w64(&gf_def, c64, b64) : gf.divide.w64(&gf, c64, b64);
if (d64 != a64) {
printf("0x%llx * 0x%llx =? 0x%llx (check-a: 0x%llx)\n", a64, b64, c64, d64);
problem("Single multiplication/division failed");
}
}
}
}
if (gf.inverse.w64 == NULL) {
printf("No inverse defined for this method.\n");
} else {
if (verbose) { printf("Testing Inversions.\n"); fflush(stdout); }
for (i = 0; i < 1000; i++) {
do { a64 = MOA_Random_64(); } while (a64 == 0);
b64 = gf.inverse.w64(&gf, a64);
if (gf.multiply.w64(&gf, a64, b64) != 1) problem("Inversion failed.\n");
}
}
} else if (w == 128) {
if (verbose) { printf("Testing single multiplications/divisions.\n"); fflush(stdout); }
if (gf.multiply.w128 == NULL) problem("No multiplication operation defined.");
for (i = 0; i < 500; i++) {
for (j = 0; j < 500; j++) {
MOA_Random_128(a128);
MOA_Random_128(b128);
gf.multiply.w128(&gf, a128, b128, c128);
if ((GF_W128_IS_ZERO(a128) || GF_W128_IS_ZERO(b128)) && !(GF_W128_IS_ZERO(c128))) problem("Single Multiplication by zero Failed");
if (!GF_W128_IS_ZERO(a128) && !GF_W128_IS_ZERO(b128)) {
if (gf.divide.w128 == NULL) {
gf_def.divide.w128(&gf_def, c128, b128, d128);
} else {
gf.divide.w128(&gf, c128, b128, d128);
}
if (!GF_W128_EQUAL(a128, d128)) {
printf("0x%llx 0x%llx * 0x%llx 0x%llx =? 0x%llx 0x%llx (check-a: 0x%llx 0x%llx)\n", a128[0], a128[1], b128[0], b128[1], c128[0], c128[1], d128[0], d128[1]);
problem("Single multiplication/division failed");
}
}
}
}
if (gf.inverse.w128 == NULL) {
printf("No inverse defined for this method.\n");
} else {
if (verbose) { printf("Testing Inversions.\n"); fflush(stdout); }
for (i = 0; i < 1000; i++) {
do { MOA_Random_128(a128); } while (GF_W128_IS_ZERO(a128));
gf.inverse.w128(&gf, a128, b128);
gf.multiply.w128(&gf, a128, b128, c128);
if (!(c128[0] == 0 && c128[1] == 1)) problem("Inversion failed.\n");
}
}
} else {
problem("Value of w not implemented yet");
}
}
if (region) {
if (w == 4) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r8b = (uint8_t *) malloc(REGION_SIZE);
r8c = (uint8_t *) malloc(REGION_SIZE);
r8d = (uint8_t *) malloc(REGION_SIZE);
fill_random_region(r8b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 16; a++) {
fill_random_region(r8c, REGION_SIZE);
memcpy(r8d, r8c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8c+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = (r8b[i] >> 4);
c = (r8c[i] >> 4);
d = (r8d[i] >> 4);
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d * %d = %d, but should equal %d\n", a, b, c, gf.multiply.w32(&gf, a, b) );
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d %d %d %d\n", a, b, c, d);
printf(" %d %d %d %d\n", a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
b = (r8b[i] & 0xf);
c = (r8c[i] & 0xf);
d = (r8d[i] & 0xf);
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" %d * %d = %d, but should equal %d\n", a, b, c, gf.multiply.w32(&gf, a, b) );
printf("i=%d. 0x%x 0x%x 0x%x 0x%x\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r8b+i));
printf(" (%d * %d ^ %d) should equal %d - equals %d\n",
a, b, d, (gf.multiply.w32(&gf, a, b) ^ d), c);
printf(" %d %d %d %d\n", a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 16; a++) {
fill_random_region(r8b, REGION_SIZE);
memcpy(r8d, r8b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8b+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = (r8b[i] >> 4);
d = (r8d[i] >> 4);
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("i=%d. %d %d %d\n", i, a, b, d);
printf("i=%d. %d %d %d\n", i, a, r8b[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
b = (r8b[i] & 0xf);
d = (r8d[i] & 0xf);
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("%d %d %d\n", a, b, d);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r8b);
free(r8c);
free(r8d);
}
} else if (w == 8) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r8b = (uint8_t *) malloc(REGION_SIZE);
r8c = (uint8_t *) malloc(REGION_SIZE);
r8d = (uint8_t *) malloc(REGION_SIZE);
fill_random_region(r8b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 256; a++) {
fill_random_region(r8c, REGION_SIZE);
memcpy(r8d, r8c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE;
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8c+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w8(i, (uint8_t*)r8b, REGION_SIZE / 2);
c = get_alt_map_2w8(i, (uint8_t*)r8c, REGION_SIZE / 2);
d = get_alt_map_2w8(i, (uint8_t*)r8d, REGION_SIZE / 2);
} else {
b = r8b[i];
c = r8c[i];
d = r8d[i];
}
if (!xor && gf.multiply.w32(&gf, a, b) != c) {
printf("i=%d. %d %d %d %d\n", i, a, b, c, d);
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
printf("Address 0x%lx. Sindex: %d\n", (unsigned long) (r8b+i), sindex);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w32(&gf, a, b) ^ d) != c) {
printf("i=%d. %d %d %d %d\n", i, a, b, c, d);
printf("i=%d. %d %d %d %d\n", i, a, r8b[i], r8c[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (a = 0; a < 256; a++) {
fill_random_region(r8b, REGION_SIZE);
memcpy(r8d, r8b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint8_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint8_t);
gf.multiply_region.w32(&gf, (void *) (r8b+sindex), (void *) (r8b+sindex), a, size, xor);
for (i = sindex; i < eindex; i++) {
b = r8b[i];
d = r8d[i];
if (!xor && gf.multiply.w32(&gf, a, d) != b) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w32(&gf, a, d) ^ d) != b) {
printf("i=%d. %d %d %d\n", i, a, b, d);
printf("i=%d. %d %d %d\n", i, a, r8b[i], r8d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r8b);
free(r8c);
free(r8d);
}
} else if (w == 16) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r16b = (uint16_t *) malloc(REGION_SIZE);
r16c = (uint16_t *) malloc(REGION_SIZE);
r16d = (uint16_t *) malloc(REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
fill_random_region(r16b, REGION_SIZE);
a = MOA_Random_W(w, 0);
fill_random_region(r16c, REGION_SIZE);
memcpy(r16d, r16c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE / sizeof(uint16_t);
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint16_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint16_t);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16c+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r16c+sindex), (void *) (r16d+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r16c+sindex), (void *) (r16d+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r16d+sindex), (void *) (r16b+sindex), ai, size, xor);
}
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w16(i, (uint8_t*)r16b, size / 2);
c = get_alt_map_2w16(i, (uint8_t*)r16c, size / 2);
d = get_alt_map_2w16(i, (uint8_t*)r16d, size / 2);
} else {
b = r16b[i];
c = r16c[i];
d = r16d[i];
}
if (!xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We have %d * %d = %d, and %d * %d = %d.\n", a, b, c, c, ai, d);
printf("%d is the inverse of %d\n", ai, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && b != 0) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=c; c ^= ba; d ^= c; b ^= (a^-1)d;\n");
printf(" b should equal 0, but it doesn't. Probe into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_W(w, 0);
fill_random_region(r16b, REGION_SIZE);
memcpy(r16d, r16b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint16_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint16_t);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r16d+sindex), (void *) (r16b+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r16b+sindex), (void *) (r16b+sindex), ai, size, 0);
}
for (i = sindex; i < eindex; i++) {
b = r16b[i];
c = r16c[i];
d = r16d[i];
if (!xor && (d != b)) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=b; b = ba; b = b(a^-1).\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
printf("b = %d. d = %d. a = %d\n", b, d, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r16b+i));
printf("We did d=b; b = b + ba; b += d; b = b(a^-1);\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r16b);
free(r16c);
free(r16d);
}
} else if (w == 32) {
if (gf.multiply_region.w32 == NULL) {
printf("No multiply_region.\n");
} else {
r32b = (uint32_t *) malloc(REGION_SIZE);
r32c = (uint32_t *) malloc(REGION_SIZE);
r32d = (uint32_t *) malloc(REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_32();
fill_random_region(r32b, REGION_SIZE);
fill_random_region(r32c, REGION_SIZE);
memcpy(r32d, r32c, REGION_SIZE);
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
sindex = 0;
eindex = REGION_SIZE / sizeof(uint32_t);
} else {
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint32_t)-MOA_Random_W(3, 1);
}
size = (eindex-sindex)*sizeof(uint32_t);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32c+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r32c+sindex), (void *) (r32d+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r32c+sindex), (void *) (r32d+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r32d+sindex), (void *) (r32b+sindex), ai, size, xor);
}
for (i = sindex; i < eindex; i++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
b = get_alt_map_2w32(i, (uint8_t*)r32b, size / 2);
c = get_alt_map_2w32(i, (uint8_t*)r32c, size / 2);
d = get_alt_map_2w32(i, (uint8_t*)r32d, size / 2);
i++;
} else {
b = r32b[i];
c = r32c[i];
d = r32d[i];
}
if (!xor && d != b) {
printf("i=%d. Addresses: b: 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We have %d * %d = %d, and %d * %d = %d.\n", a, b, c, c, ai, d);
printf("%d is the inverse of %d\n", ai, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && b != 0) {
printf("i=%d. Addresses: b: 0x%lx c: 0x%lx d: 0x%lx\n", i,
(unsigned long) (r32b+i), (unsigned long) (r32c+i), (unsigned long) (r32d+i));
printf("We did d=c; c ^= ba; d ^= c; b ^= (a^-1)d;\n");
printf(" b should equal 0, but it doesn't. Probe into it.\n");
printf("a: %8x b: %8x c: %8x, d: %8x\n", a, b, c, d);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if ((((gf_internal_t*)gf.scratch)->region_type & GF_REGION_ALTMAP) != 0 &&
(((gf_internal_t*)gf.scratch)->mult_type == GF_MULT_COMPOSITE)) {
continue;
}
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a = MOA_Random_32();
fill_random_region(r32b, REGION_SIZE);
memcpy(r32d, r32b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint32_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint32_t);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), a, size, xor);
ai = gf.inverse.w32(&gf, a);
if (!xor) {
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), ai, size, xor);
} else {
gf.multiply_region.w32(&gf, (void *) (r32d+sindex), (void *) (r32b+sindex), 1, size, xor);
gf.multiply_region.w32(&gf, (void *) (r32b+sindex), (void *) (r32b+sindex), ai, size, 0);
}
for (i = sindex; i < eindex; i++) {
b = r32b[i];
c = r32c[i];
d = r32d[i];
if (!xor && (d != b)) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We did d=b; b = ba; b = b(a^-1).\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
printf("b = %d. d = %d. a = %d\n", b, d, a);
problem("Failed buffer-constant, xor=0");
}
if (xor && d != b) {
printf("i=%d. Address 0x%lx\n", i, (unsigned long) (r32b+i));
printf("We did d=b; b = b + ba; b += d; b = b(a^-1);\n");
printf("So, b should equal d, but it doesn't. Look into it.\n");
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r32b);
free(r32c);
free(r32d);
}
} else if (w == 64) {
if (gf.multiply_region.w64 == NULL) {
printf("No multiply_region.\n");
} else {
r64b = (uint64_t *) malloc(REGION_SIZE);
r64c = (uint64_t *) malloc(REGION_SIZE);
r64d = (uint64_t *) malloc(REGION_SIZE);
fill_random_region(r64b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a64 = MOA_Random_64();
fill_random_region(r64c, REGION_SIZE);
memcpy(r64d, r64c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint64_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t);
gf.multiply_region.w64(&gf, (void *) (r64b+sindex), (void *) (r64c+sindex), a64, size, xor);
for (i = sindex; i < eindex; i++) {
b64 = r64b[i];
c64 = r64c[i];
d64 = r64d[i];
if (!xor && gf.multiply.w64(&gf, a64, b64) != c64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx should be 0x%llx\n", i, a64, b64, c64,
gf.multiply.w64(&gf, a64, b64));
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64c[i]);
problem("Failed buffer-constant, xor=0");
}
if (xor && (gf.multiply.w64(&gf, a64, b64) ^ d64) != c64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx 0x%llx\n", i, a64, b64, c64, d64);
printf("i=%d. 0x%llx 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64c[i], r64d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
a64 = MOA_Random_64();
fill_random_region(r64b, REGION_SIZE);
memcpy(r64d, r64b, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/sizeof(uint64_t)-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t);
gf.multiply_region.w64(&gf, (void *) (r64b+sindex), (void *) (r64b+sindex), a64, size, xor);
for (i = sindex; i < eindex; i++) {
b64 = r64b[i];
d64 = r64d[i];
if (!xor && gf.multiply.w64(&gf, a64, d64) != b64) problem("Failed buffer-constant, xor=0");
if (xor && (gf.multiply.w64(&gf, a64, d64) ^ d64) != b64) {
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, b64, d64);
printf("i=%d. 0x%llx 0x%llx 0x%llx\n", i, a64, r64b[i], r64d[i]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r64b);
free(r64c);
free(r64d);
}
} else if (w == 128) {
if (gf.multiply_region.w128 == NULL) {
printf("No multiply_region.\n");
} else {
r128b = (uint64_t *) malloc(REGION_SIZE);
r128c = (uint64_t *) malloc(REGION_SIZE);
r128d = (uint64_t *) malloc(REGION_SIZE);
fill_random_region(r128b, REGION_SIZE);
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src != dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
MOA_Random_128(a128);
fill_random_region(r128c, REGION_SIZE);
memcpy(r128d, r128c, REGION_SIZE);
sindex = MOA_Random_W(3, 1);
eindex = REGION_SIZE/(2*sizeof(uint64_t))-MOA_Random_W(3, 1);
size = (eindex-sindex)*sizeof(uint64_t)*2;
gf.multiply_region.w128(&gf, (void *) (r128b+sindex*2), (void *) (r128c+sindex*2), a128, size, xor);
for (i = sindex; i < eindex; i++) {
b128[0] = r128b[2*i];
b128[1] = r128b[2*i+1];
c128[0] = r128c[2*i];
c128[1] = r128c[2*i+1];
d128[0] = r128d[2*i];
d128[1] = r128d[2*i+1];
gf.multiply.w128(&gf, a128, b128, e128);
if (xor) {
e128[0] ^= d128[0];
e128[1] ^= d128[1];
}
if (!xor && !GF_W128_EQUAL(c128, e128)) {
printf("i=%d. 0x%016llx%016llx 0x%016llx%016llx 0x%016llx%016llx should be 0x%016llx%016llx\n",
i, a128[0], a128[1], b128[0], b128[1], c128[0], c128[1], e128[0], e128[1]);
problem("Failed buffer-constant, xor=0");
}
if (xor && !GF_W128_EQUAL(e128, c128)) {
printf("i=%d. 0x%016llx%016llx 0x%016llx%016llx 0x%016llx%016llx 0x%016llx%016llx\n", i,
a128[0], a128[1], b128[0], b128[1], c128[0], c128[1], d128[0], d128[1]);
problem("Failed buffer-constant, xor=1");
}
}
}
}
for (xor = 0; xor < 2; xor++) {
if (verbose) {
printf("Testing buffer-constant, src == dest, xor = %d\n", xor);
fflush(stdout);
}
for (j = 0; j < 1000; j++) {
MOA_Random_128(a128);
fill_random_region(r128b, REGION_SIZE);
memcpy(r128d, r128b, REGION_SIZE);
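/* The random offsets are drawn and then overridden, so this test
   currently always covers the whole region. */
sindex = MOA_Random_W(3, 1);
sindex = 0;
eindex = REGION_SIZE/(2*sizeof(uint64_t))-MOA_Random_W(3, 1);
eindex = REGION_SIZE/(2*sizeof(uint64_t));
size = (eindex-sindex)*sizeof(uint64_t)*2;
/* Each 128-bit element is two uint64_t words, so the word offset is sindex*2. */
gf.multiply_region.w128(&gf, (void *) (r128b+sindex*2), (void *) (r128b+sindex*2), a128, size, xor);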
for (i = sindex; i < eindex; i++) {
b128[0] = r128b[2*i];
b128[1] = r128b[2*i + 1];
d128[0] = r128d[2*i];
d128[1] = r128d[2*i + 1];
gf.multiply.w128(&gf, a128, d128, e128);
if (xor) {
e128[0] ^= d128[0];
e128[1] ^= d128[1];
}
if (!xor && !GF_W128_EQUAL(b128, e128)) problem("Failed buffer-constant, xor=0");
if (xor && !GF_W128_EQUAL(b128, e128)) {
problem("Failed buffer-constant, xor=1");
}
}
}
}
free(r128b);
free(r128c);
free(r128d);
}
}
}
exit(0);
}
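
The buffer-constant tests above rely on two properties: a region multiply must agree element by element with the single-element multiply (with the old destination XORed in when xor is 1), and multiplying a buffer in place by a and then by a^-1 must restore it. The following is a minimal sketch of those two checks on a toy GF(2^8). The names `toy_gf8_mul`, `toy_gf8_inv`, `toy_gf8_mul_region` and `toy_self_check` are hypothetical and are not part of the library; the polynomial 0x11d is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy GF(2^8) multiply using the polynomial x^8+x^4+x^3+x^2+1 (0x11d).
   Hypothetical stand-in for a single-element multiply; not the library's code. */
static uint8_t toy_gf8_mul(uint8_t a, uint8_t b)
{
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        /* Multiply a by x and reduce if the top bit falls off. */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
        b >>= 1;
    }
    return p;
}

/* Brute-force inverse; returns 0 for a == 0, which has no inverse. */
static uint8_t toy_gf8_inv(uint8_t a)
{
    unsigned x;
    for (x = 1; x < 256; x++)
        if (toy_gf8_mul(a, (uint8_t)x) == 1) return (uint8_t)x;
    return 0;
}

/* dest = a*src when xor_into is 0, dest ^= a*src otherwise.
   Safe for src == dest because each byte is read before it is written. */
static void toy_gf8_mul_region(const uint8_t *src, uint8_t *dest, uint8_t a,
                               size_t bytes, int xor_into)
{
    size_t i;
    for (i = 0; i < bytes; i++) {
        uint8_t prod = toy_gf8_mul(a, src[i]);
        dest[i] = xor_into ? (uint8_t)(dest[i] ^ prod) : prod;
    }
}

/* Runs both checks on a small buffer; returns 0 on success. */
static int toy_self_check(void)
{
    uint8_t src[8] = {0, 1, 2, 3, 0x53, 0x80, 0xca, 0xff};
    uint8_t dest[8], before[8], work[8];
    uint8_t a = 0x1d;
    int xor_into;
    size_t i;

    /* Check 1: the region result equals the single-element result. */
    for (xor_into = 0; xor_into < 2; xor_into++) {
        memset(dest, 0xa5, sizeof dest);
        memcpy(before, dest, sizeof dest);
        toy_gf8_mul_region(src, dest, a, sizeof src, xor_into);
        for (i = 0; i < sizeof src; i++) {
            uint8_t expect = toy_gf8_mul(a, src[i]);
            if (xor_into) expect ^= before[i];
            if (dest[i] != expect) return 1;
        }
    }

    /* Check 2: multiplying in place by a, then by a^-1, restores the buffer. */
    assert(toy_gf8_inv(a) != 0);
    memcpy(work, src, sizeof src);
    toy_gf8_mul_region(work, work, a, sizeof work, 0);
    toy_gf8_mul_region(work, work, toy_gf8_inv(a), sizeof work, 0);
    if (memcmp(work, src, sizeof src) != 0) return 2;
    return 0;
}
```

Calling `toy_self_check()` from a small `main` and treating a return value of 0 as a pass mirrors the structure of the tests above, minus the random offsets and the wider word sizes.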

0
tests.txt Normal file

0
tmp-10-out.txt Normal file

14
tmp-time-test.sh Normal file

@ -0,0 +1,14 @@
if [ $# -lt 4 ]; then
echo 'usage: sh tmp-time-test.sh w gf_specs (e.g. LOG - -)' >&2
exit 1
fi
w=$1
shift
i=1024
while [ $i -le 134217728 ]; do
iter=`echo $i | awk '{ print (134217728/$1)*1 }'`
gf_time $w R -1 $i $iter $* | head -n 3
echo $i $iter $w $*
i=`echo $i | awk '{ print $1*2 }'`
done

1583
tmp.c Normal file

File diff suppressed because it is too large

15
tmp.sh Normal file

@ -0,0 +1,15 @@
for i in 5 10 ; do
sed 's/1 }/'$i' }/' tmp-time-test.sh > tmp2.sh
sh tmp2.sh 4 LOG - - >> tmp-$i-out.txt
sh tmp2.sh 4 TABLE - - >> tmp-$i-out.txt
sh tmp2.sh 4 TABLE SINGLE,SSE - >> tmp-$i-out.txt
sh tmp2.sh 8 LOG - - >> tmp-$i-out.txt
sh tmp2.sh 8 TABLE - - >> tmp-$i-out.txt
sh tmp2.sh 8 SPLIT 8 4 SSE - >> tmp-$i-out.txt
sh tmp2.sh 16 LOG - - >> tmp-$i-out.txt
sh tmp2.sh 16 SPLIT 16 4 SSE,STDMAP - >> tmp-$i-out.txt
sh tmp2.sh 16 SPLIT 16 4 SSE,ALTMAP - >> tmp-$i-out.txt
sh tmp2.sh 32 SPLIT 8 8 - - >> tmp-$i-out.txt
sh tmp2.sh 32 SPLIT 32 4 SSE,STDMAP - >> tmp-$i-out.txt
sh tmp2.sh 32 SPLIT 32 4 SSE,ALTMAP - >> tmp-$i-out.txt
done

294
tmp.txt Normal file

@ -0,0 +1,294 @@
1024 1048576 4 LOG - - Seed: 1347471838 Buffer-Const,s!=d,xor=0: 4.824089 s 212.268 MB/s Buffer-Const,s!=d,xor=1: 5.341791 s 191.696 MB/s Buffer-Const,s==d,xor=0: 4.816530 s 212.601 MB/s Buffer-Const,s==d,xor=1: 5.333377 s 191.998 MB/s
2048 524288 4 LOG - - Seed: 1347471864 Buffer-Const,s!=d,xor=0: 4.796388 s 213.494 MB/s Buffer-Const,s!=d,xor=1: 5.355381 s 191.210 MB/s Buffer-Const,s==d,xor=0: 4.790053 s 213.776 MB/s Buffer-Const,s==d,xor=1: 5.342280 s 191.678 MB/s
4096 262144 4 LOG - - Seed: 1347471890 Buffer-Const,s!=d,xor=0: 4.785699 s 213.971 MB/s Buffer-Const,s!=d,xor=1: 5.272175 s 194.227 MB/s Buffer-Const,s==d,xor=0: 4.760163 s 215.119 MB/s Buffer-Const,s==d,xor=1: 5.285017 s 193.755 MB/s
8192 131072 4 LOG - - Seed: 1347471915 Buffer-Const,s!=d,xor=0: 4.772734 s 214.552 MB/s Buffer-Const,s!=d,xor=1: 5.301345 s 193.159 MB/s Buffer-Const,s==d,xor=0: 4.782723 s 214.104 MB/s Buffer-Const,s==d,xor=1: 5.294336 s 193.414 MB/s
16384 65536 4 LOG - - Seed: 1347471940 Buffer-Const,s!=d,xor=0: 4.779516 s 214.248 MB/s Buffer-Const,s!=d,xor=1: 5.311189 s 192.801 MB/s Buffer-Const,s==d,xor=0: 4.771980 s 214.586 MB/s Buffer-Const,s==d,xor=1: 5.294589 s 193.405 MB/s
32768 32768 4 LOG - - Seed: 1347471966 Buffer-Const,s!=d,xor=0: 4.745805 s 215.769 MB/s Buffer-Const,s!=d,xor=1: 5.289698 s 193.584 MB/s Buffer-Const,s==d,xor=0: 4.788919 s 213.827 MB/s Buffer-Const,s==d,xor=1: 5.323099 s 192.369 MB/s
65536 16384 4 LOG - - Seed: 1347471991 Buffer-Const,s!=d,xor=0: 4.782660 s 214.107 MB/s Buffer-Const,s!=d,xor=1: 5.279925 s 193.942 MB/s Buffer-Const,s==d,xor=0: 4.807014 s 213.022 MB/s Buffer-Const,s==d,xor=1: 5.296893 s 193.321 MB/s
131072 8192 4 LOG - - Seed: 1347472017 Buffer-Const,s!=d,xor=0: 4.792920 s 213.648 MB/s Buffer-Const,s!=d,xor=1: 5.460566 s 187.526 MB/s Buffer-Const,s==d,xor=0: 4.749562 s 215.599 MB/s Buffer-Const,s==d,xor=1: 5.267351 s 194.405 MB/s
262144 4096 4 LOG - - Seed: 1347472042 Buffer-Const,s!=d,xor=0: 4.785846 s 213.964 MB/s Buffer-Const,s!=d,xor=1: 5.336344 s 191.892 MB/s Buffer-Const,s==d,xor=0: 4.730902 s 216.449 MB/s Buffer-Const,s==d,xor=1: 5.312972 s 192.736 MB/s
524288 2048 4 LOG - - Seed: 1347472068 Buffer-Const,s!=d,xor=0: 4.768488 s 214.743 MB/s Buffer-Const,s!=d,xor=1: 5.302696 s 193.109 MB/s Buffer-Const,s==d,xor=0: 4.769302 s 214.706 MB/s Buffer-Const,s==d,xor=1: 5.322016 s 192.408 MB/s
1048576 1024 4 LOG - - Seed: 1347472093 Buffer-Const,s!=d,xor=0: 4.795875 s 213.517 MB/s Buffer-Const,s!=d,xor=1: 5.345346 s 191.569 MB/s Buffer-Const,s==d,xor=0: 4.810602 s 212.863 MB/s Buffer-Const,s==d,xor=1: 5.223796 s 196.026 MB/s
2097152 512 4 LOG - - Seed: 1347472118 Buffer-Const,s!=d,xor=0: 4.809727 s 212.902 MB/s Buffer-Const,s!=d,xor=1: 5.255259 s 194.852 MB/s Buffer-Const,s==d,xor=0: 4.853752 s 210.971 MB/s Buffer-Const,s==d,xor=1: 5.401798 s 189.567 MB/s
4194304 256 4 LOG - - Seed: 1347472144 Buffer-Const,s!=d,xor=0: 4.888658 s 209.464 MB/s Buffer-Const,s!=d,xor=1: 5.275764 s 194.095 MB/s Buffer-Const,s==d,xor=0: 4.880836 s 209.800 MB/s Buffer-Const,s==d,xor=1: 5.202162 s 196.841 MB/s
8388608 128 4 LOG - - Seed: 1347472170 Buffer-Const,s!=d,xor=0: 4.693878 s 218.156 MB/s Buffer-Const,s!=d,xor=1: 5.467869 s 187.276 MB/s Buffer-Const,s==d,xor=0: 4.752496 s 215.466 MB/s Buffer-Const,s==d,xor=1: 5.441666 s 188.178 MB/s
16777216 64 4 LOG - - Seed: 1347472195 Buffer-Const,s!=d,xor=0: 4.743789 s 215.861 MB/s Buffer-Const,s!=d,xor=1: 5.284770 s 193.764 MB/s Buffer-Const,s==d,xor=0: 4.864533 s 210.503 MB/s Buffer-Const,s==d,xor=1: 5.531778 s 185.112 MB/s
33554432 32 4 LOG - - Seed: 1347472221 Buffer-Const,s!=d,xor=0: 5.058158 s 202.445 MB/s Buffer-Const,s!=d,xor=1: 5.388520 s 190.034 MB/s Buffer-Const,s==d,xor=0: 5.017543 s 204.084 MB/s Buffer-Const,s==d,xor=1: 5.550337 s 184.493 MB/s
67108864 16 4 LOG - - Seed: 1347472247 Buffer-Const,s!=d,xor=0: 4.273755 s 239.602 MB/s Buffer-Const,s!=d,xor=1: 5.356849 s 191.157 MB/s Buffer-Const,s==d,xor=0: 4.884432 s 209.646 MB/s Buffer-Const,s==d,xor=1: 5.328478 s 192.175 MB/s
134217728 8 4 LOG - - Seed: 1347472272 Buffer-Const,s!=d,xor=0: 4.608675 s 222.190 MB/s Buffer-Const,s!=d,xor=1: 5.757140 s 177.866 MB/s Buffer-Const,s==d,xor=0: 4.494134 s 227.853 MB/s Buffer-Const,s==d,xor=1: 5.725754 s 178.841 MB/s
268435456 4 4 LOG - - Seed: 1347472298 Buffer-Const,s!=d,xor=0: 5.326828 s 192.234 MB/s Buffer-Const,s!=d,xor=1: 5.749257 s 178.110 MB/s Buffer-Const,s==d,xor=0: 3.930798 s 260.507 MB/s Buffer-Const,s==d,xor=1: 5.769782 s 177.476 MB/s
536870912 2 4 LOG - - Seed: 1347472325 Buffer-Const,s!=d,xor=0: 5.506971 s 185.946 MB/s Buffer-Const,s!=d,xor=1: 5.820843 s 175.920 MB/s Buffer-Const,s==d,xor=0: 5.151835 s 198.764 MB/s Buffer-Const,s==d,xor=1: 2.846869 s 359.693 MB/s
1073741824 1 4 LOG - - Seed: 1347472350 Buffer-Const,s!=d,xor=0: 5.887568 s 173.926 MB/s Buffer-Const,s!=d,xor=1: 5.696556 s 179.758 MB/s Buffer-Const,s==d,xor=0: 5.188843 s 197.346 MB/s Buffer-Const,s==d,xor=1: 5.662299 s 180.845 MB/s
1024 1048576 4 TABLE - - Seed: 1347472378 Buffer-Const,s!=d,xor=0: 2.090874 s 489.747 MB/s Buffer-Const,s!=d,xor=1: 2.333704 s 438.787 MB/s Buffer-Const,s==d,xor=0: 2.076584 s 493.117 MB/s Buffer-Const,s==d,xor=1: 2.341999 s 437.233 MB/s
2048 524288 4 TABLE - - Seed: 1347472393 Buffer-Const,s!=d,xor=0: 2.100408 s 487.524 MB/s Buffer-Const,s!=d,xor=1: 2.312246 s 442.859 MB/s Buffer-Const,s==d,xor=0: 2.095576 s 488.649 MB/s Buffer-Const,s==d,xor=1: 2.278695 s 449.380 MB/s
4096 262144 4 TABLE - - Seed: 1347472407 Buffer-Const,s!=d,xor=0: 2.051966 s 499.034 MB/s Buffer-Const,s!=d,xor=1: 2.292821 s 446.611 MB/s Buffer-Const,s==d,xor=0: 2.064646 s 495.969 MB/s Buffer-Const,s==d,xor=1: 2.306956 s 443.875 MB/s
8192 131072 4 TABLE - - Seed: 1347472421 Buffer-Const,s!=d,xor=0: 2.074299 s 493.661 MB/s Buffer-Const,s!=d,xor=1: 2.298558 s 445.497 MB/s Buffer-Const,s==d,xor=0: 2.066750 s 495.464 MB/s Buffer-Const,s==d,xor=1: 2.287467 s 447.657 MB/s
16384 65536 4 TABLE - - Seed: 1347472435 Buffer-Const,s!=d,xor=0: 2.152980 s 475.620 MB/s Buffer-Const,s!=d,xor=1: 2.282884 s 448.555 MB/s Buffer-Const,s==d,xor=0: 2.058036 s 497.562 MB/s Buffer-Const,s==d,xor=1: 2.298184 s 445.569 MB/s
32768 32768 4 TABLE - - Seed: 1347472449 Buffer-Const,s!=d,xor=0: 2.213344 s 462.648 MB/s Buffer-Const,s!=d,xor=1: 2.320572 s 441.271 MB/s Buffer-Const,s==d,xor=0: 2.206635 s 464.055 MB/s Buffer-Const,s==d,xor=1: 2.306156 s 444.029 MB/s
65536 16384 4 TABLE - - Seed: 1347472463 Buffer-Const,s!=d,xor=0: 2.201297 s 465.180 MB/s Buffer-Const,s!=d,xor=1: 2.309327 s 443.419 MB/s Buffer-Const,s==d,xor=0: 2.184618 s 468.732 MB/s Buffer-Const,s==d,xor=1: 2.301818 s 444.866 MB/s
131072 8192 4 TABLE - - Seed: 1347472477 Buffer-Const,s!=d,xor=0: 2.141175 s 478.242 MB/s Buffer-Const,s!=d,xor=1: 2.316740 s 442.000 MB/s Buffer-Const,s==d,xor=0: 2.187070 s 468.206 MB/s Buffer-Const,s==d,xor=1: 2.306461 s 443.970 MB/s
262144 4096 4 TABLE - - Seed: 1347472492 Buffer-Const,s!=d,xor=0: 2.166170 s 472.724 MB/s Buffer-Const,s!=d,xor=1: 2.306049 s 444.050 MB/s Buffer-Const,s==d,xor=0: 2.147129 s 476.916 MB/s Buffer-Const,s==d,xor=1: 2.309562 s 443.374 MB/s
524288 2048 4 TABLE - - Seed: 1347472506 Buffer-Const,s!=d,xor=0: 2.156061 s 474.940 MB/s Buffer-Const,s!=d,xor=1: 2.304203 s 444.405 MB/s Buffer-Const,s==d,xor=0: 2.155717 s 475.016 MB/s Buffer-Const,s==d,xor=1: 2.321065 s 441.177 MB/s
1048576 1024 4 TABLE - - Seed: 1347472520 Buffer-Const,s!=d,xor=0: 2.152224 s 475.787 MB/s Buffer-Const,s!=d,xor=1: 2.310472 s 443.199 MB/s Buffer-Const,s==d,xor=0: 2.151816 s 475.877 MB/s Buffer-Const,s==d,xor=1: 2.312655 s 442.781 MB/s
2097152 512 4 TABLE - - Seed: 1347472534 Buffer-Const,s!=d,xor=0: 2.170889 s 471.696 MB/s Buffer-Const,s!=d,xor=1: 2.361295 s 433.660 MB/s Buffer-Const,s==d,xor=0: 2.139913 s 478.524 MB/s Buffer-Const,s==d,xor=1: 2.316579 s 442.031 MB/s
4194304 256 4 TABLE - - Seed: 1347472548 Buffer-Const,s!=d,xor=0: 2.187952 s 468.018 MB/s Buffer-Const,s!=d,xor=1: 2.354228 s 434.962 MB/s Buffer-Const,s==d,xor=0: 2.193449 s 466.845 MB/s Buffer-Const,s==d,xor=1: 2.344275 s 436.809 MB/s
8388608 128 4 TABLE - - Seed: 1347472563 Buffer-Const,s!=d,xor=0: 2.211300 s 463.076 MB/s Buffer-Const,s!=d,xor=1: 2.382068 s 429.879 MB/s Buffer-Const,s==d,xor=0: 2.206019 s 464.185 MB/s Buffer-Const,s==d,xor=1: 2.333248 s 438.873 MB/s
16777216 64 4 TABLE - - Seed: 1347472577 Buffer-Const,s!=d,xor=0: 2.193599 s 466.813 MB/s Buffer-Const,s!=d,xor=1: 2.373979 s 431.343 MB/s Buffer-Const,s==d,xor=0: 2.181715 s 469.355 MB/s Buffer-Const,s==d,xor=1: 2.363553 s 433.246 MB/s
33554432 32 4 TABLE - - Seed: 1347472592 Buffer-Const,s!=d,xor=0: 2.205605 s 464.272 MB/s Buffer-Const,s!=d,xor=1: 2.388323 s 428.753 MB/s Buffer-Const,s==d,xor=0: 2.194591 s 466.602 MB/s Buffer-Const,s==d,xor=1: 2.352825 s 435.221 MB/s
67108864 16 4 TABLE - - Seed: 1347472606 Buffer-Const,s!=d,xor=0: 2.252406 s 454.625 MB/s Buffer-Const,s!=d,xor=1: 2.350086 s 435.729 MB/s Buffer-Const,s==d,xor=0: 2.186626 s 468.301 MB/s Buffer-Const,s==d,xor=1: 2.357336 s 434.389 MB/s
134217728 8 4 TABLE - - Seed: 1347472621 Buffer-Const,s!=d,xor=0: 2.312211 s 442.866 MB/s Buffer-Const,s!=d,xor=1: 2.397869 s 427.046 MB/s Buffer-Const,s==d,xor=0: 2.195088 s 466.496 MB/s Buffer-Const,s==d,xor=1: 2.354865 s 434.844 MB/s
268435456 4 4 TABLE - - Seed: 1347472635 Buffer-Const,s!=d,xor=0: 2.409825 s 424.927 MB/s Buffer-Const,s!=d,xor=1: 2.388709 s 428.683 MB/s Buffer-Const,s==d,xor=0: 2.217935 s 461.691 MB/s Buffer-Const,s==d,xor=1: 2.427467 s 421.839 MB/s
536870912 2 4 TABLE - - Seed: 1347472650 Buffer-Const,s!=d,xor=0: 2.572154 s 398.110 MB/s Buffer-Const,s!=d,xor=1: 2.357918 s 434.281 MB/s Buffer-Const,s==d,xor=0: 2.180809 s 469.551 MB/s Buffer-Const,s==d,xor=1: 2.330464 s 439.397 MB/s
1073741824 1 4 TABLE - - Seed: 1347472665 Buffer-Const,s!=d,xor=0: 2.942518 s 348.001 MB/s Buffer-Const,s!=d,xor=1: 2.349215 s 435.890 MB/s Buffer-Const,s==d,xor=0: 2.209902 s 463.369 MB/s Buffer-Const,s==d,xor=1: 2.368640 s 432.316 MB/s
1024 1048576 4 TABLE SINGLE,SSE - Seed: 1347472681 Buffer-Const,s!=d,xor=0: 0.160061 s 6397.547 MB/s Buffer-Const,s!=d,xor=1: 0.169124 s 6054.742 MB/s Buffer-Const,s==d,xor=0: 0.160015 s 6399.396 MB/s Buffer-Const,s==d,xor=1: 0.170060 s 6021.416 MB/s
2048 524288 4 TABLE SINGLE,SSE - Seed: 1347472688 Buffer-Const,s!=d,xor=0: 0.144030 s 7109.637 MB/s Buffer-Const,s!=d,xor=1: 0.149962 s 6828.377 MB/s Buffer-Const,s==d,xor=0: 0.143702 s 7125.880 MB/s Buffer-Const,s==d,xor=1: 0.149732 s 6838.902 MB/s
4096 262144 4 TABLE SINGLE,SSE - Seed: 1347472693 Buffer-Const,s!=d,xor=0: 0.129829 s 7887.273 MB/s Buffer-Const,s!=d,xor=1: 0.134809 s 7595.958 MB/s Buffer-Const,s==d,xor=0: 0.131632 s 7779.258 MB/s Buffer-Const,s==d,xor=1: 0.135138 s 7577.437 MB/s
8192 131072 4 TABLE SINGLE,SSE - Seed: 1347472699 Buffer-Const,s!=d,xor=0: 0.124071 s 8253.315 MB/s Buffer-Const,s!=d,xor=1: 0.127894 s 8006.605 MB/s Buffer-Const,s==d,xor=0: 0.124068 s 8253.505 MB/s Buffer-Const,s==d,xor=1: 0.127882 s 8007.382 MB/s
16384 65536 4 TABLE SINGLE,SSE - Seed: 1347472705 Buffer-Const,s!=d,xor=0: 0.120162 s 8521.845 MB/s Buffer-Const,s!=d,xor=1: 0.124806 s 8204.723 MB/s Buffer-Const,s==d,xor=0: 0.119825 s 8545.821 MB/s Buffer-Const,s==d,xor=1: 0.124612 s 8217.517 MB/s
32768 32768 4 TABLE SINGLE,SSE - Seed: 1347472711 Buffer-Const,s!=d,xor=0: 0.123173 s 8313.478 MB/s Buffer-Const,s!=d,xor=1: 0.129224 s 7924.250 MB/s Buffer-Const,s==d,xor=0: 0.118994 s 8605.476 MB/s Buffer-Const,s==d,xor=1: 0.123591 s 8285.397 MB/s
65536 16384 4 TABLE SINGLE,SSE - Seed: 1347472717 Buffer-Const,s!=d,xor=0: 0.120111 s 8525.465 MB/s Buffer-Const,s!=d,xor=1: 0.130905 s 7822.443 MB/s Buffer-Const,s==d,xor=0: 0.118989 s 8605.838 MB/s Buffer-Const,s==d,xor=1: 0.122049 s 8390.066 MB/s
131072 8192 4 TABLE SINGLE,SSE - Seed: 1347472722 Buffer-Const,s!=d,xor=0: 0.120384 s 8506.115 MB/s Buffer-Const,s!=d,xor=1: 0.131319 s 7797.817 MB/s Buffer-Const,s==d,xor=0: 0.118782 s 8620.849 MB/s Buffer-Const,s==d,xor=1: 0.124635 s 8215.976 MB/s
262144 4096 4 TABLE SINGLE,SSE - Seed: 1347472728 Buffer-Const,s!=d,xor=0: 0.151247 s 6770.381 MB/s Buffer-Const,s!=d,xor=1: 0.163074 s 6279.339 MB/s Buffer-Const,s==d,xor=0: 0.118564 s 8636.659 MB/s Buffer-Const,s==d,xor=1: 0.122733 s 8343.290 MB/s
524288 2048 4 TABLE SINGLE,SSE - Seed: 1347472734 Buffer-Const,s!=d,xor=0: 0.148822 s 6880.722 MB/s Buffer-Const,s!=d,xor=1: 0.160966 s 6361.595 MB/s Buffer-Const,s==d,xor=0: 0.129449 s 7910.429 MB/s Buffer-Const,s==d,xor=1: 0.129116 s 7930.864 MB/s
1048576 1024 4 TABLE SINGLE,SSE - Seed: 1347472740 Buffer-Const,s!=d,xor=0: 0.147404 s 6946.896 MB/s Buffer-Const,s!=d,xor=1: 0.159756 s 6409.758 MB/s Buffer-Const,s==d,xor=0: 0.128379 s 7976.391 MB/s Buffer-Const,s==d,xor=1: 0.128835 s 7948.168 MB/s
2097152 512 4 TABLE SINGLE,SSE - Seed: 1347472746 Buffer-Const,s!=d,xor=0: 0.236475 s 4330.268 MB/s Buffer-Const,s!=d,xor=1: 0.246601 s 4152.451 MB/s Buffer-Const,s==d,xor=0: 0.128898 s 7944.287 MB/s Buffer-Const,s==d,xor=1: 0.133252 s 7684.693 MB/s
4194304 256 4 TABLE SINGLE,SSE - Seed: 1347472752 Buffer-Const,s!=d,xor=0: 0.365861 s 2798.874 MB/s Buffer-Const,s!=d,xor=1: 0.361812 s 2830.198 MB/s Buffer-Const,s==d,xor=0: 0.209003 s 4899.441 MB/s Buffer-Const,s==d,xor=1: 0.202078 s 5067.354 MB/s
8388608 128 4 TABLE SINGLE,SSE - Seed: 1347472758 Buffer-Const,s!=d,xor=0: 0.369510 s 2771.238 MB/s Buffer-Const,s!=d,xor=1: 0.347091 s 2950.235 MB/s Buffer-Const,s==d,xor=0: 0.227157 s 4507.888 MB/s Buffer-Const,s==d,xor=1: 0.232318 s 4407.757 MB/s
16777216 64 4 TABLE SINGLE,SSE - Seed: 1347472764 Buffer-Const,s!=d,xor=0: 0.368891 s 2775.890 MB/s Buffer-Const,s!=d,xor=1: 0.356381 s 2873.326 MB/s Buffer-Const,s==d,xor=0: 0.226912 s 4512.772 MB/s Buffer-Const,s==d,xor=1: 0.219236 s 4670.758 MB/s
33554432 32 4 TABLE SINGLE,SSE - Seed: 1347472771 Buffer-Const,s!=d,xor=0: 0.379371 s 2699.205 MB/s Buffer-Const,s!=d,xor=1: 0.341562 s 2997.993 MB/s Buffer-Const,s==d,xor=0: 0.231817 s 4417.282 MB/s Buffer-Const,s==d,xor=1: 0.217154 s 4715.547 MB/s
67108864 16 4 TABLE SINGLE,SSE - Seed: 1347472777 Buffer-Const,s!=d,xor=0: 0.403540 s 2537.545 MB/s Buffer-Const,s!=d,xor=1: 0.360238 s 2842.563 MB/s Buffer-Const,s==d,xor=0: 0.230866 s 4435.479 MB/s Buffer-Const,s==d,xor=1: 0.180604 s 5669.879 MB/s
134217728 8 4 TABLE SINGLE,SSE - Seed: 1347472784 Buffer-Const,s!=d,xor=0: 0.441703 s 2318.301 MB/s Buffer-Const,s!=d,xor=1: 0.386278 s 2650.943 MB/s Buffer-Const,s==d,xor=0: 0.229751 s 4457.002 MB/s Buffer-Const,s==d,xor=1: 0.180658 s 5668.158 MB/s
268435456 4 4 TABLE SINGLE,SSE - Seed: 1347472791 Buffer-Const,s!=d,xor=0: 0.471682 s 2170.955 MB/s Buffer-Const,s!=d,xor=1: 0.383233 s 2672.005 MB/s Buffer-Const,s==d,xor=0: 0.236378 s 4332.041 MB/s Buffer-Const,s==d,xor=1: 0.243849 s 4199.315 MB/s
536870912 2 4 TABLE SINGLE,SSE - Seed: 1347472797 Buffer-Const,s!=d,xor=0: 0.666553 s 1536.262 MB/s Buffer-Const,s!=d,xor=1: 0.374508 s 2734.255 MB/s Buffer-Const,s==d,xor=0: 0.228591 s 4479.617 MB/s Buffer-Const,s==d,xor=1: 0.247453 s 4138.156 MB/s
1073741824 1 4 TABLE SINGLE,SSE - Seed: 1347472805 Buffer-Const,s!=d,xor=0: 0.739952 s 1383.873 MB/s Buffer-Const,s!=d,xor=1: 0.376333 s 2720.994 MB/s Buffer-Const,s==d,xor=0: 0.229283 s 4466.099 MB/s Buffer-Const,s==d,xor=1: 0.242894 s 4215.832 MB/s
1024 1048576 8 LOG - - Seed: 1347472813 Buffer-Const,s!=d,xor=0: 1.621880 s 631.366 MB/s Buffer-Const,s!=d,xor=1: 1.972670 s 519.093 MB/s Buffer-Const,s==d,xor=0: 1.703537 s 601.102 MB/s Buffer-Const,s==d,xor=1: 1.965952 s 520.867 MB/s
2048 524288 8 LOG - - Seed: 1347472825 Buffer-Const,s!=d,xor=0: 1.580008 s 648.098 MB/s Buffer-Const,s!=d,xor=1: 1.922355 s 532.680 MB/s Buffer-Const,s==d,xor=0: 1.619760 s 632.193 MB/s Buffer-Const,s==d,xor=1: 1.935444 s 529.078 MB/s
4096 262144 8 LOG - - Seed: 1347472838 Buffer-Const,s!=d,xor=0: 1.612208 s 635.154 MB/s Buffer-Const,s!=d,xor=1: 1.935781 s 528.985 MB/s Buffer-Const,s==d,xor=0: 1.619466 s 632.307 MB/s Buffer-Const,s==d,xor=1: 1.963975 s 521.391 MB/s
8192 131072 8 LOG - - Seed: 1347472850 Buffer-Const,s!=d,xor=0: 1.618882 s 632.535 MB/s Buffer-Const,s!=d,xor=1: 1.917912 s 533.914 MB/s Buffer-Const,s==d,xor=0: 1.604389 s 638.249 MB/s Buffer-Const,s==d,xor=1: 1.908338 s 536.593 MB/s
16384 65536 8 LOG - - Seed: 1347472863 Buffer-Const,s!=d,xor=0: 1.594616 s 642.161 MB/s Buffer-Const,s!=d,xor=1: 1.910674 s 535.936 MB/s Buffer-Const,s==d,xor=0: 1.609434 s 636.249 MB/s Buffer-Const,s==d,xor=1: 1.912407 s 535.451 MB/s
32768 32768 8 LOG - - Seed: 1347472875 Buffer-Const,s!=d,xor=0: 1.624596 s 630.311 MB/s Buffer-Const,s!=d,xor=1: 2.144199 s 477.568 MB/s Buffer-Const,s==d,xor=0: 1.588486 s 644.639 MB/s Buffer-Const,s==d,xor=1: 1.909198 s 536.351 MB/s
65536 16384 8 LOG - - Seed: 1347472887 Buffer-Const,s!=d,xor=0: 1.662282 s 616.020 MB/s Buffer-Const,s!=d,xor=1: 1.919168 s 533.565 MB/s Buffer-Const,s==d,xor=0: 1.591656 s 643.355 MB/s Buffer-Const,s==d,xor=1: 1.926590 s 531.509 MB/s
131072 8192 8 LOG - - Seed: 1347472900 Buffer-Const,s!=d,xor=0: 1.594085 s 642.375 MB/s Buffer-Const,s!=d,xor=1: 1.937719 s 528.456 MB/s Buffer-Const,s==d,xor=0: 1.648678 s 621.104 MB/s Buffer-Const,s==d,xor=1: 1.924335 s 532.132 MB/s
262144 4096 8 LOG - - Seed: 1347472912 Buffer-Const,s!=d,xor=0: 1.595497 s 641.806 MB/s Buffer-Const,s!=d,xor=1: 1.936042 s 528.914 MB/s Buffer-Const,s==d,xor=0: 1.608699 s 636.539 MB/s Buffer-Const,s==d,xor=1: 1.958862 s 522.752 MB/s
524288 2048 8 LOG - - Seed: 1347472925 Buffer-Const,s!=d,xor=0: 1.646453 s 621.943 MB/s Buffer-Const,s!=d,xor=1: 1.942311 s 527.207 MB/s Buffer-Const,s==d,xor=0: 1.621521 s 631.506 MB/s Buffer-Const,s==d,xor=1: 1.968560 s 520.177 MB/s
1048576 1024 8 LOG - - Seed: 1347472937 Buffer-Const,s!=d,xor=0: 1.627189 s 629.306 MB/s Buffer-Const,s!=d,xor=1: 1.938396 s 528.272 MB/s Buffer-Const,s==d,xor=0: 1.609066 s 636.394 MB/s Buffer-Const,s==d,xor=1: 1.940828 s 527.610 MB/s
2097152 512 8 LOG - - Seed: 1347472949 Buffer-Const,s!=d,xor=0: 1.654112 s 619.063 MB/s Buffer-Const,s!=d,xor=1: 1.977605 s 517.798 MB/s Buffer-Const,s==d,xor=0: 1.625274 s 630.048 MB/s Buffer-Const,s==d,xor=1: 1.957592 s 523.092 MB/s
4194304 256 8 LOG - - Seed: 1347472962 Buffer-Const,s!=d,xor=0: 1.637634 s 625.293 MB/s Buffer-Const,s!=d,xor=1: 1.960637 s 522.279 MB/s Buffer-Const,s==d,xor=0: 1.643415 s 623.093 MB/s Buffer-Const,s==d,xor=1: 1.968736 s 520.131 MB/s
8388608 128 8 LOG - - Seed: 1347472974 Buffer-Const,s!=d,xor=0: 1.642961 s 623.265 MB/s Buffer-Const,s!=d,xor=1: 1.997125 s 512.737 MB/s Buffer-Const,s==d,xor=0: 1.647587 s 621.515 MB/s Buffer-Const,s==d,xor=1: 1.982742 s 516.456 MB/s
16777216 64 8 LOG - - Seed: 1347472987 Buffer-Const,s!=d,xor=0: 1.628143 s 628.937 MB/s Buffer-Const,s!=d,xor=1: 1.988719 s 514.904 MB/s Buffer-Const,s==d,xor=0: 1.697431 s 603.265 MB/s Buffer-Const,s==d,xor=1: 1.982398 s 516.546 MB/s
33554432 32 8 LOG - - Seed: 1347472999 Buffer-Const,s!=d,xor=0: 1.655345 s 618.602 MB/s Buffer-Const,s!=d,xor=1: 1.987288 s 515.275 MB/s Buffer-Const,s==d,xor=0: 1.631150 s 627.778 MB/s Buffer-Const,s==d,xor=1: 1.988191 s 515.041 MB/s
67108864 16 8 LOG - - Seed: 1347473012 Buffer-Const,s!=d,xor=0: 1.716783 s 596.464 MB/s Buffer-Const,s!=d,xor=1: 2.007143 s 510.178 MB/s Buffer-Const,s==d,xor=0: 1.644218 s 622.789 MB/s Buffer-Const,s==d,xor=1: 1.981492 s 516.782 MB/s
134217728 8 8 LOG - - Seed: 1347473025 Buffer-Const,s!=d,xor=0: 1.744457 s 587.002 MB/s Buffer-Const,s!=d,xor=1: 2.015515 s 508.059 MB/s Buffer-Const,s==d,xor=0: 1.656633 s 618.121 MB/s Buffer-Const,s==d,xor=1: 1.984652 s 515.959 MB/s
268435456 4 8 LOG - - Seed: 1347473038 Buffer-Const,s!=d,xor=0: 1.943453 s 526.897 MB/s Buffer-Const,s!=d,xor=1: 1.982043 s 516.639 MB/s Buffer-Const,s==d,xor=0: 1.627782 s 629.077 MB/s Buffer-Const,s==d,xor=1: 1.963902 s 521.411 MB/s
536870912 2 8 LOG - - Seed: 1347473051 Buffer-Const,s!=d,xor=0: 1.984840 s 515.911 MB/s Buffer-Const,s!=d,xor=1: 1.999844 s 512.040 MB/s Buffer-Const,s==d,xor=0: 1.636470 s 625.737 MB/s Buffer-Const,s==d,xor=1: 1.961609 s 522.020 MB/s
1073741824 1 8 LOG - - Seed: 1347473064 Buffer-Const,s!=d,xor=0: 2.326284 s 440.187 MB/s Buffer-Const,s!=d,xor=1: 1.971229 s 519.473 MB/s Buffer-Const,s==d,xor=0: 1.628148 s 628.935 MB/s Buffer-Const,s==d,xor=1: 2.123621 s 482.195 MB/s
1024 1048576 8 TABLE - - Seed: 1347473078 Buffer-Const,s!=d,xor=0: 1.302151 s 786.391 MB/s Buffer-Const,s!=d,xor=1: 1.609089 s 636.385 MB/s Buffer-Const,s==d,xor=0: 1.298172 s 788.802 MB/s Buffer-Const,s==d,xor=1: 1.592942 s 642.836 MB/s
2048 524288 8 TABLE - - Seed: 1347473089 Buffer-Const,s!=d,xor=0: 1.270636 s 805.896 MB/s Buffer-Const,s!=d,xor=1: 1.564955 s 654.332 MB/s Buffer-Const,s==d,xor=0: 1.264134 s 810.040 MB/s Buffer-Const,s==d,xor=1: 1.553215 s 659.278 MB/s
4096 262144 8 TABLE - - Seed: 1347473100 Buffer-Const,s!=d,xor=0: 1.252112 s 817.818 MB/s Buffer-Const,s!=d,xor=1: 1.543294 s 663.516 MB/s Buffer-Const,s==d,xor=0: 1.248986 s 819.865 MB/s Buffer-Const,s==d,xor=1: 1.539075 s 665.335 MB/s
8192 131072 8 TABLE - - Seed: 1347473111 Buffer-Const,s!=d,xor=0: 1.254336 s 816.368 MB/s Buffer-Const,s!=d,xor=1: 1.556689 s 657.806 MB/s Buffer-Const,s==d,xor=0: 1.245368 s 822.247 MB/s Buffer-Const,s==d,xor=1: 1.556819 s 657.752 MB/s
16384 65536 8 TABLE - - Seed: 1347473122 Buffer-Const,s!=d,xor=0: 1.311951 s 780.517 MB/s Buffer-Const,s!=d,xor=1: 1.537484 s 666.023 MB/s Buffer-Const,s==d,xor=0: 1.236025 s 828.462 MB/s Buffer-Const,s==d,xor=1: 1.533817 s 667.615 MB/s
32768 32768 8 TABLE - - Seed: 1347473133 Buffer-Const,s!=d,xor=0: 1.185127 s 864.043 MB/s Buffer-Const,s!=d,xor=1: 1.560520 s 656.192 MB/s Buffer-Const,s==d,xor=0: 1.167446 s 877.128 MB/s Buffer-Const,s==d,xor=1: 1.548851 s 661.135 MB/s
65536 16384 8 TABLE - - Seed: 1347473144 Buffer-Const,s!=d,xor=0: 1.178377 s 868.992 MB/s Buffer-Const,s!=d,xor=1: 1.538837 s 665.438 MB/s Buffer-Const,s==d,xor=0: 1.174308 s 872.003 MB/s Buffer-Const,s==d,xor=1: 1.544995 s 662.785 MB/s
131072 8192 8 TABLE - - Seed: 1347473154 Buffer-Const,s!=d,xor=0: 1.209799 s 846.422 MB/s Buffer-Const,s!=d,xor=1: 1.556000 s 658.098 MB/s Buffer-Const,s==d,xor=0: 1.182813 s 865.733 MB/s Buffer-Const,s==d,xor=1: 1.532919 s 668.007 MB/s
262144 4096 8 TABLE - - Seed: 1347473165 Buffer-Const,s!=d,xor=0: 1.220862 s 838.751 MB/s Buffer-Const,s!=d,xor=1: 1.564978 s 654.322 MB/s Buffer-Const,s==d,xor=0: 1.212298 s 844.677 MB/s Buffer-Const,s==d,xor=1: 1.551679 s 659.930 MB/s
524288 2048 8 TABLE - - Seed: 1347473176 Buffer-Const,s!=d,xor=0: 1.293642 s 791.563 MB/s Buffer-Const,s!=d,xor=1: 1.576479 s 649.549 MB/s Buffer-Const,s==d,xor=0: 1.278135 s 801.167 MB/s Buffer-Const,s==d,xor=1: 1.551030 s 660.206 MB/s
1048576 1024 8 TABLE - - Seed: 1347473187 Buffer-Const,s!=d,xor=0: 1.255426 s 815.659 MB/s Buffer-Const,s!=d,xor=1: 1.552257 s 659.685 MB/s Buffer-Const,s==d,xor=0: 1.222667 s 837.513 MB/s Buffer-Const,s==d,xor=1: 1.537151 s 666.167 MB/s
2097152 512 8 TABLE - - Seed: 1347473198 Buffer-Const,s!=d,xor=0: 1.215521 s 842.437 MB/s Buffer-Const,s!=d,xor=1: 1.595463 s 641.820 MB/s Buffer-Const,s==d,xor=0: 1.183346 s 865.343 MB/s Buffer-Const,s==d,xor=1: 1.562343 s 655.426 MB/s
4194304 256 8 TABLE - - Seed: 1347473209 Buffer-Const,s!=d,xor=0: 1.253935 s 816.629 MB/s Buffer-Const,s!=d,xor=1: 1.607460 s 637.030 MB/s Buffer-Const,s==d,xor=0: 1.231988 s 831.177 MB/s Buffer-Const,s==d,xor=1: 1.585267 s 645.948 MB/s
8388608 128 8 TABLE - - Seed: 1347473220 Buffer-Const,s!=d,xor=0: 1.242970 s 823.833 MB/s Buffer-Const,s!=d,xor=1: 1.610712 s 635.744 MB/s Buffer-Const,s==d,xor=0: 1.217175 s 841.292 MB/s Buffer-Const,s==d,xor=1: 1.596637 s 641.348 MB/s
16777216 64 8 TABLE - - Seed: 1347473230 Buffer-Const,s!=d,xor=0: 1.250264 s 819.027 MB/s Buffer-Const,s!=d,xor=1: 1.613457 s 634.662 MB/s Buffer-Const,s==d,xor=0: 1.218879 s 840.116 MB/s Buffer-Const,s==d,xor=1: 1.610589 s 635.792 MB/s
33554432 32 8 TABLE - - Seed: 1347473241 Buffer-Const,s!=d,xor=0: 1.255215 s 815.797 MB/s Buffer-Const,s!=d,xor=1: 1.608961 s 636.436 MB/s Buffer-Const,s==d,xor=0: 1.288840 s 794.513 MB/s Buffer-Const,s==d,xor=1: 1.566385 s 653.735 MB/s
67108864 16 8 TABLE - - Seed: 1347473253 Buffer-Const,s!=d,xor=0: 1.266899 s 808.273 MB/s Buffer-Const,s!=d,xor=1: 1.589339 s 644.293 MB/s Buffer-Const,s==d,xor=0: 1.208477 s 847.347 MB/s Buffer-Const,s==d,xor=1: 1.563937 s 654.758 MB/s
134217728 8 8 TABLE - - Seed: 1347473264 Buffer-Const,s!=d,xor=0: 1.305275 s 784.509 MB/s Buffer-Const,s!=d,xor=1: 1.601313 s 639.475 MB/s Buffer-Const,s==d,xor=0: 1.205009 s 849.786 MB/s Buffer-Const,s==d,xor=1: 1.562011 s 655.565 MB/s
268435456 4 8 TABLE - - Seed: 1347473275 Buffer-Const,s!=d,xor=0: 1.393569 s 734.804 MB/s Buffer-Const,s!=d,xor=1: 1.588149 s 644.776 MB/s Buffer-Const,s==d,xor=0: 1.208391 s 847.408 MB/s Buffer-Const,s==d,xor=1: 1.568612 s 652.806 MB/s
536870912 2 8 TABLE - - Seed: 1347473286 Buffer-Const,s!=d,xor=0: 1.597872 s 640.852 MB/s Buffer-Const,s!=d,xor=1: 1.630728 s 627.940 MB/s Buffer-Const,s==d,xor=0: 1.222109 s 837.896 MB/s Buffer-Const,s==d,xor=1: 1.581507 s 647.484 MB/s
1073741824 1 8 TABLE - - Seed: 1347473298 Buffer-Const,s!=d,xor=0: 1.988823 s 514.877 MB/s Buffer-Const,s!=d,xor=1: 1.611288 s 635.516 MB/s Buffer-Const,s==d,xor=0: 1.222772 s 837.442 MB/s Buffer-Const,s==d,xor=1: 1.613392 s 634.688 MB/s
1024 1048576 8 SPLIT 8 4 SSE - Seed: 1347473310 Buffer-Const,s!=d,xor=0: 0.172217 s 5945.999 MB/s Buffer-Const,s!=d,xor=1: 0.175352 s 5839.687 MB/s Buffer-Const,s==d,xor=0: 0.163135 s 6277.026 MB/s Buffer-Const,s==d,xor=1: 0.177949 s 5754.459 MB/s
2048 524288 8 SPLIT 8 4 SSE - Seed: 1347473316 Buffer-Const,s!=d,xor=0: 0.154401 s 6632.099 MB/s Buffer-Const,s!=d,xor=1: 0.158355 s 6466.493 MB/s Buffer-Const,s==d,xor=0: 0.151740 s 6748.403 MB/s Buffer-Const,s==d,xor=1: 0.159225 s 6431.142 MB/s
4096 262144 8 SPLIT 8 4 SSE - Seed: 1347473322 Buffer-Const,s!=d,xor=0: 0.138276 s 7405.486 MB/s Buffer-Const,s!=d,xor=1: 0.152603 s 6710.226 MB/s Buffer-Const,s==d,xor=0: 0.136469 s 7503.516 MB/s Buffer-Const,s==d,xor=1: 0.142757 s 7173.019 MB/s
8192 131072 8 SPLIT 8 4 SSE - Seed: 1347473328 Buffer-Const,s!=d,xor=0: 0.129117 s 7930.820 MB/s Buffer-Const,s!=d,xor=1: 0.136170 s 7520.030 MB/s Buffer-Const,s==d,xor=0: 0.129542 s 7904.751 MB/s Buffer-Const,s==d,xor=1: 0.135069 s 7581.303 MB/s
16384 65536 8 SPLIT 8 4 SSE - Seed: 1347473334 Buffer-Const,s!=d,xor=0: 0.125494 s 8159.768 MB/s Buffer-Const,s!=d,xor=1: 0.133779 s 7654.412 MB/s Buffer-Const,s==d,xor=0: 0.126920 s 8068.075 MB/s Buffer-Const,s==d,xor=1: 0.131795 s 7769.618 MB/s
32768 32768 8 SPLIT 8 4 SSE - Seed: 1347473340 Buffer-Const,s!=d,xor=0: 0.128076 s 7995.233 MB/s Buffer-Const,s!=d,xor=1: 0.137454 s 7449.750 MB/s Buffer-Const,s==d,xor=0: 0.122883 s 8333.141 MB/s Buffer-Const,s==d,xor=1: 0.131206 s 7804.505 MB/s
65536 16384 8 SPLIT 8 4 SSE - Seed: 1347473346 Buffer-Const,s!=d,xor=0: 0.127193 s 8050.759 MB/s Buffer-Const,s!=d,xor=1: 0.137505 s 7447.025 MB/s Buffer-Const,s==d,xor=0: 0.123262 s 8307.496 MB/s Buffer-Const,s==d,xor=1: 0.130225 s 7863.331 MB/s
131072 8192 8 SPLIT 8 4 SSE - Seed: 1347473351 Buffer-Const,s!=d,xor=0: 0.138416 s 7398.011 MB/s Buffer-Const,s!=d,xor=1: 0.140051 s 7311.628 MB/s Buffer-Const,s==d,xor=0: 0.123666 s 8280.397 MB/s Buffer-Const,s==d,xor=1: 0.131191 s 7805.384 MB/s
262144 4096 8 SPLIT 8 4 SSE - Seed: 1347473357 Buffer-Const,s!=d,xor=0: 0.158679 s 6453.299 MB/s Buffer-Const,s!=d,xor=1: 0.174885 s 5855.291 MB/s Buffer-Const,s==d,xor=0: 0.126400 s 8101.297 MB/s Buffer-Const,s==d,xor=1: 0.131082 s 7811.901 MB/s
524288 2048 8 SPLIT 8 4 SSE - Seed: 1347473363 Buffer-Const,s!=d,xor=0: 0.153464 s 6672.571 MB/s Buffer-Const,s!=d,xor=1: 0.168238 s 6086.609 MB/s Buffer-Const,s==d,xor=0: 0.132881 s 7706.120 MB/s Buffer-Const,s==d,xor=1: 0.138437 s 7396.852 MB/s
1048576 1024 8 SPLIT 8 4 SSE - Seed: 1347473369 Buffer-Const,s!=d,xor=0: 0.153720 s 6661.456 MB/s Buffer-Const,s!=d,xor=1: 0.167944 s 6097.272 MB/s Buffer-Const,s==d,xor=0: 0.132126 s 7750.172 MB/s Buffer-Const,s==d,xor=1: 0.137430 s 7451.081 MB/s
2097152 512 8 SPLIT 8 4 SSE - Seed: 1347473375 Buffer-Const,s!=d,xor=0: 0.235302 s 4351.859 MB/s Buffer-Const,s!=d,xor=1: 0.252596 s 4053.902 MB/s Buffer-Const,s==d,xor=0: 0.141343 s 7244.769 MB/s Buffer-Const,s==d,xor=1: 0.142658 s 7177.994 MB/s
4194304 256 8 SPLIT 8 4 SSE - Seed: 1347473381 Buffer-Const,s!=d,xor=0: 0.380941 s 2688.078 MB/s Buffer-Const,s!=d,xor=1: 0.380621 s 2690.341 MB/s Buffer-Const,s==d,xor=0: 0.208288 s 4916.260 MB/s Buffer-Const,s==d,xor=1: 0.214539 s 4773.029 MB/s
8388608 128 8 SPLIT 8 4 SSE - Seed: 1347473387 Buffer-Const,s!=d,xor=0: 0.374304 s 2735.747 MB/s Buffer-Const,s!=d,xor=1: 0.371563 s 2755.926 MB/s Buffer-Const,s==d,xor=0: 0.228778 s 4475.957 MB/s Buffer-Const,s==d,xor=1: 0.232901 s 4396.716 MB/s
16777216 64 8 SPLIT 8 4 SSE - Seed: 1347473394 Buffer-Const,s!=d,xor=0: 0.378027 s 2708.800 MB/s Buffer-Const,s!=d,xor=1: 0.371758 s 2754.482 MB/s Buffer-Const,s==d,xor=0: 0.227618 s 4498.761 MB/s Buffer-Const,s==d,xor=1: 0.236930 s 4321.945 MB/s
33554432 32 8 SPLIT 8 4 SSE - Seed: 1347473400 Buffer-Const,s!=d,xor=0: 0.389149 s 2631.383 MB/s Buffer-Const,s!=d,xor=1: 0.373076 s 2744.748 MB/s Buffer-Const,s==d,xor=0: 0.227429 s 4502.496 MB/s Buffer-Const,s==d,xor=1: 0.232095 s 4411.986 MB/s
67108864 16 8 SPLIT 8 4 SSE - Seed: 1347473407 Buffer-Const,s!=d,xor=0: 0.404405 s 2532.117 MB/s Buffer-Const,s!=d,xor=1: 0.375439 s 2727.471 MB/s Buffer-Const,s==d,xor=0: 0.232084 s 4412.195 MB/s Buffer-Const,s==d,xor=1: 0.234886 s 4359.567 MB/s
134217728 8 8 SPLIT 8 4 SSE - Seed: 1347473413 Buffer-Const,s!=d,xor=0: 0.439466 s 2330.102 MB/s Buffer-Const,s!=d,xor=1: 0.373526 s 2741.445 MB/s Buffer-Const,s==d,xor=0: 0.238485 s 4293.774 MB/s Buffer-Const,s==d,xor=1: 0.242573 s 4221.405 MB/s
268435456 4 8 SPLIT 8 4 SSE - Seed: 1347473420 Buffer-Const,s!=d,xor=0: 0.522256 s 1960.723 MB/s Buffer-Const,s!=d,xor=1: 0.369594 s 2770.609 MB/s Buffer-Const,s==d,xor=0: 0.230843 s 4435.914 MB/s Buffer-Const,s==d,xor=1: 0.233838 s 4379.099 MB/s
536870912 2 8 SPLIT 8 4 SSE - Seed: 1347473427 Buffer-Const,s!=d,xor=0: 0.664837 s 1540.227 MB/s Buffer-Const,s!=d,xor=1: 0.374827 s 2731.926 MB/s Buffer-Const,s==d,xor=0: 0.234757 s 4361.958 MB/s Buffer-Const,s==d,xor=1: 0.244709 s 4184.566 MB/s
1073741824 1 8 SPLIT 8 4 SSE - Seed: 1347473434 Buffer-Const,s!=d,xor=0: 0.945331 s 1083.218 MB/s Buffer-Const,s!=d,xor=1: 0.378121 s 2708.129 MB/s Buffer-Const,s==d,xor=0: 0.232104 s 4411.819 MB/s Buffer-Const,s==d,xor=1: 0.236791 s 4324.491 MB/s
1024 1048576 16 LOG - - Seed: 1347473442 Buffer-Const,s!=d,xor=0: 2.210890 s 463.162 MB/s Buffer-Const,s!=d,xor=1: 2.500565 s 409.508 MB/s Buffer-Const,s==d,xor=0: 2.229298 s 459.337 MB/s Buffer-Const,s==d,xor=1: 2.488949 s 411.419 MB/s
2048 524288 16 LOG - - Seed: 1347473457 Buffer-Const,s!=d,xor=0: 2.183274 s 469.020 MB/s Buffer-Const,s!=d,xor=1: 2.471336 s 414.351 MB/s Buffer-Const,s==d,xor=0: 2.161852 s 473.668 MB/s Buffer-Const,s==d,xor=1: 2.537259 s 403.585 MB/s
4096 262144 16 LOG - - Seed: 1347473472 Buffer-Const,s!=d,xor=0: 2.158893 s 474.317 MB/s Buffer-Const,s!=d,xor=1: 2.430101 s 421.382 MB/s Buffer-Const,s==d,xor=0: 2.174221 s 470.973 MB/s Buffer-Const,s==d,xor=1: 2.417720 s 423.540 MB/s
8192 131072 16 LOG - - Seed: 1347473486 Buffer-Const,s!=d,xor=0: 2.139556 s 478.604 MB/s Buffer-Const,s!=d,xor=1: 2.411590 s 424.616 MB/s Buffer-Const,s==d,xor=0: 2.106997 s 486.000 MB/s Buffer-Const,s==d,xor=1: 2.374905 s 431.175 MB/s
16384 65536 16 LOG - - Seed: 1347473501 Buffer-Const,s!=d,xor=0: 2.131013 s 480.523 MB/s Buffer-Const,s!=d,xor=1: 2.424752 s 422.311 MB/s Buffer-Const,s==d,xor=0: 2.159855 s 474.106 MB/s Buffer-Const,s==d,xor=1: 2.340712 s 437.474 MB/s
32768 32768 16 LOG - - Seed: 1347473515 Buffer-Const,s!=d,xor=0: 2.080020 s 492.303 MB/s Buffer-Const,s!=d,xor=1: 2.353990 s 435.006 MB/s Buffer-Const,s==d,xor=0: 2.065719 s 495.711 MB/s Buffer-Const,s==d,xor=1: 2.348487 s 436.025 MB/s
65536 16384 16 LOG - - Seed: 1347473529 Buffer-Const,s!=d,xor=0: 2.083769 s 491.417 MB/s Buffer-Const,s!=d,xor=1: 2.401774 s 426.351 MB/s Buffer-Const,s==d,xor=0: 2.074343 s 493.650 MB/s Buffer-Const,s==d,xor=1: 2.341650 s 437.299 MB/s
131072 8192 16 LOG - - Seed: 1347473543 Buffer-Const,s!=d,xor=0: 2.191856 s 467.184 MB/s Buffer-Const,s!=d,xor=1: 2.369453 s 432.167 MB/s Buffer-Const,s==d,xor=0: 2.034702 s 503.268 MB/s Buffer-Const,s==d,xor=1: 2.307625 s 443.746 MB/s
262144 4096 16 LOG - - Seed: 1347473558 Buffer-Const,s!=d,xor=0: 2.084992 s 491.129 MB/s Buffer-Const,s!=d,xor=1: 2.385670 s 429.230 MB/s Buffer-Const,s==d,xor=0: 2.054360 s 498.452 MB/s Buffer-Const,s==d,xor=1: 2.374879 s 431.180 MB/s
524288 2048 16 LOG - - Seed: 1347473572 Buffer-Const,s!=d,xor=0: 2.107185 s 485.956 MB/s Buffer-Const,s!=d,xor=1: 2.368054 s 432.423 MB/s Buffer-Const,s==d,xor=0: 2.053791 s 498.590 MB/s Buffer-Const,s==d,xor=1: 2.313108 s 442.694 MB/s
1048576 1024 16 LOG - - Seed: 1347473586 Buffer-Const,s!=d,xor=0: 2.105079 s 486.443 MB/s Buffer-Const,s!=d,xor=1: 2.444869 s 418.836 MB/s Buffer-Const,s==d,xor=0: 2.271658 s 450.772 MB/s Buffer-Const,s==d,xor=1: 2.413470 s 424.285 MB/s
2097152 512 16 LOG - - Seed: 1347473600 Buffer-Const,s!=d,xor=0: 2.159018 s 474.290 MB/s Buffer-Const,s!=d,xor=1: 2.419327 s 423.258 MB/s Buffer-Const,s==d,xor=0: 2.031202 s 504.135 MB/s Buffer-Const,s==d,xor=1: 2.301943 s 444.842 MB/s
4194304 256 16 LOG - - Seed: 1347473615 Buffer-Const,s!=d,xor=0: 2.194868 s 466.543 MB/s Buffer-Const,s!=d,xor=1: 2.460607 s 416.158 MB/s Buffer-Const,s==d,xor=0: 2.142866 s 477.865 MB/s Buffer-Const,s==d,xor=1: 2.387212 s 428.952 MB/s
8388608 128 16 LOG - - Seed: 1347473629 Buffer-Const,s!=d,xor=0: 2.178997 s 469.941 MB/s Buffer-Const,s!=d,xor=1: 2.467580 s 414.982 MB/s Buffer-Const,s==d,xor=0: 2.240649 s 457.010 MB/s Buffer-Const,s==d,xor=1: 2.394828 s 427.588 MB/s
16777216 64 16 LOG - - Seed: 1347473644 Buffer-Const,s!=d,xor=0: 2.185086 s 468.632 MB/s Buffer-Const,s!=d,xor=1: 2.635728 s 388.508 MB/s Buffer-Const,s==d,xor=0: 2.144898 s 477.412 MB/s Buffer-Const,s==d,xor=1: 2.402648 s 426.196 MB/s
33554432 32 16 LOG - - Seed: 1347473658 Buffer-Const,s!=d,xor=0: 2.209707 s 463.410 MB/s Buffer-Const,s!=d,xor=1: 2.969263 s 344.867 MB/s Buffer-Const,s==d,xor=0: 2.144736 s 477.448 MB/s Buffer-Const,s==d,xor=1: 2.394575 s 427.633 MB/s
67108864 16 16 LOG - - Seed: 1347473673 Buffer-Const,s!=d,xor=0: 2.281165 s 448.893 MB/s Buffer-Const,s!=d,xor=1: 2.929988 s 349.489 MB/s Buffer-Const,s==d,xor=0: 2.160325 s 474.003 MB/s Buffer-Const,s==d,xor=1: 2.481787 s 412.606 MB/s
134217728 8 16 LOG - - Seed: 1347473689 Buffer-Const,s!=d,xor=0: 2.272319 s 450.641 MB/s Buffer-Const,s!=d,xor=1: 2.957651 s 346.221 MB/s Buffer-Const,s==d,xor=0: 2.097608 s 488.175 MB/s Buffer-Const,s==d,xor=1: 2.352378 s 435.304 MB/s
268435456 4 16 LOG - - Seed: 1347473704 Buffer-Const,s!=d,xor=0: 2.354143 s 434.978 MB/s Buffer-Const,s!=d,xor=1: 2.481760 s 412.610 MB/s Buffer-Const,s==d,xor=0: 2.108223 s 485.717 MB/s Buffer-Const,s==d,xor=1: 2.357849 s 434.294 MB/s
536870912 2 16 LOG - - Seed: 1347473719 Buffer-Const,s!=d,xor=0: 2.545543 s 402.272 MB/s Buffer-Const,s!=d,xor=1: 2.441404 s 419.431 MB/s Buffer-Const,s==d,xor=0: 2.135140 s 479.594 MB/s Buffer-Const,s==d,xor=1: 2.359027 s 434.077 MB/s
1073741824 1 16 LOG - - Seed: 1347473734 Buffer-Const,s!=d,xor=0: 2.985415 s 343.001 MB/s Buffer-Const,s!=d,xor=1: 2.422259 s 422.746 MB/s Buffer-Const,s==d,xor=0: 2.104547 s 486.566 MB/s Buffer-Const,s==d,xor=1: 2.372315 s 431.646 MB/s
1024 1048576 16 SPLIT 16 4 NOSSE - Seed: 1347473750 Buffer-Const,s!=d,xor=0: 1.606844 s 637.274 MB/s Buffer-Const,s!=d,xor=1: 1.804383 s 567.507 MB/s Buffer-Const,s==d,xor=0: 1.615952 s 633.682 MB/s Buffer-Const,s==d,xor=1: 1.797352 s 569.727 MB/s
2048 524288 16 SPLIT 16 4 NOSSE - Seed: 1347473762 Buffer-Const,s!=d,xor=0: 1.541486 s 664.294 MB/s Buffer-Const,s!=d,xor=1: 1.685396 s 607.572 MB/s Buffer-Const,s==d,xor=0: 1.503978 s 680.861 MB/s Buffer-Const,s==d,xor=1: 1.684867 s 607.763 MB/s
4096 262144 16 SPLIT 16 4 NOSSE - Seed: 1347473774 Buffer-Const,s!=d,xor=0: 1.458985 s 701.858 MB/s Buffer-Const,s!=d,xor=1: 1.642648 s 623.384 MB/s Buffer-Const,s==d,xor=0: 1.459893 s 701.421 MB/s Buffer-Const,s==d,xor=1: 1.647799 s 621.435 MB/s
8192 131072 16 SPLIT 16 4 NOSSE - Seed: 1347473785 Buffer-Const,s!=d,xor=0: 1.443518 s 709.378 MB/s Buffer-Const,s!=d,xor=1: 1.632999 s 627.067 MB/s Buffer-Const,s==d,xor=0: 1.445864 s 708.227 MB/s Buffer-Const,s==d,xor=1: 1.640243 s 624.298 MB/s
16384 65536 16 SPLIT 16 4 NOSSE - Seed: 1347473797 Buffer-Const,s!=d,xor=0: 1.440485 s 710.872 MB/s Buffer-Const,s!=d,xor=1: 1.610468 s 635.840 MB/s Buffer-Const,s==d,xor=0: 1.423815 s 719.194 MB/s Buffer-Const,s==d,xor=1: 1.616802 s 633.349 MB/s
32768 32768 16 SPLIT 16 4 NOSSE - Seed: 1347473808 Buffer-Const,s!=d,xor=0: 1.430503 s 715.832 MB/s Buffer-Const,s!=d,xor=1: 1.617286 s 633.159 MB/s Buffer-Const,s==d,xor=0: 1.450425 s 706.000 MB/s Buffer-Const,s==d,xor=1: 1.628290 s 628.881 MB/s
65536 16384 16 SPLIT 16 4 NOSSE - Seed: 1347473819 Buffer-Const,s!=d,xor=0: 1.431340 s 715.414 MB/s Buffer-Const,s!=d,xor=1: 1.603276 s 638.692 MB/s Buffer-Const,s==d,xor=0: 1.484436 s 689.824 MB/s Buffer-Const,s==d,xor=1: 1.626883 s 629.424 MB/s
131072 8192 16 SPLIT 16 4 NOSSE - Seed: 1347473831 Buffer-Const,s!=d,xor=0: 1.435691 s 713.245 MB/s Buffer-Const,s!=d,xor=1: 1.618436 s 632.710 MB/s Buffer-Const,s==d,xor=0: 1.450719 s 705.857 MB/s Buffer-Const,s==d,xor=1: 1.604518 s 638.198 MB/s
262144 4096 16 SPLIT 16 4 NOSSE - Seed: 1347473842 Buffer-Const,s!=d,xor=0: 1.434818 s 713.679 MB/s Buffer-Const,s!=d,xor=1: 1.685219 s 607.636 MB/s Buffer-Const,s==d,xor=0: 1.412647 s 724.880 MB/s Buffer-Const,s==d,xor=1: 1.606347 s 637.471 MB/s
524288 2048 16 SPLIT 16 4 NOSSE - Seed: 1347473854 Buffer-Const,s!=d,xor=0: 1.437057 s 712.568 MB/s Buffer-Const,s!=d,xor=1: 1.605284 s 637.893 MB/s Buffer-Const,s==d,xor=0: 1.424157 s 719.022 MB/s Buffer-Const,s==d,xor=1: 1.616495 s 633.469 MB/s
1048576 1024 16 SPLIT 16 4 NOSSE - Seed: 1347473865 Buffer-Const,s!=d,xor=0: 1.417949 s 722.170 MB/s Buffer-Const,s!=d,xor=1: 1.636933 s 625.560 MB/s Buffer-Const,s==d,xor=0: 1.413730 s 724.325 MB/s Buffer-Const,s==d,xor=1: 1.619097 s 632.451 MB/s
2097152 512 16 SPLIT 16 4 NOSSE - Seed: 1347473876 Buffer-Const,s!=d,xor=0: 1.449239 s 706.577 MB/s Buffer-Const,s!=d,xor=1: 1.642027 s 623.619 MB/s Buffer-Const,s==d,xor=0: 1.482682 s 690.640 MB/s Buffer-Const,s==d,xor=1: 1.606026 s 637.599 MB/s
4194304 256 16 SPLIT 16 4 NOSSE - Seed: 1347473888 Buffer-Const,s!=d,xor=0: 1.466069 s 698.467 MB/s Buffer-Const,s!=d,xor=1: 1.642522 s 623.431 MB/s Buffer-Const,s==d,xor=0: 1.439317 s 711.449 MB/s Buffer-Const,s==d,xor=1: 1.631946 s 627.472 MB/s
8388608 128 16 SPLIT 16 4 NOSSE - Seed: 1347473899 Buffer-Const,s!=d,xor=0: 1.454558 s 703.994 MB/s Buffer-Const,s!=d,xor=1: 1.650872 s 620.278 MB/s Buffer-Const,s==d,xor=0: 1.445065 s 708.619 MB/s Buffer-Const,s==d,xor=1: 1.628372 s 628.849 MB/s
16777216 64 16 SPLIT 16 4 NOSSE - Seed: 1347473911 Buffer-Const,s!=d,xor=0: 1.459724 s 701.503 MB/s Buffer-Const,s!=d,xor=1: 1.650864 s 620.281 MB/s Buffer-Const,s==d,xor=0: 1.440828 s 710.702 MB/s Buffer-Const,s==d,xor=1: 1.634402 s 626.529 MB/s
33554432 32 16 SPLIT 16 4 NOSSE - Seed: 1347473922 Buffer-Const,s!=d,xor=0: 1.466166 s 698.420 MB/s Buffer-Const,s!=d,xor=1: 1.644006 s 622.869 MB/s Buffer-Const,s==d,xor=0: 1.439810 s 711.205 MB/s Buffer-Const,s==d,xor=1: 1.644597 s 622.645 MB/s
67108864 16 16 SPLIT 16 4 NOSSE - Seed: 1347473934 Buffer-Const,s!=d,xor=0: 1.509510 s 678.366 MB/s Buffer-Const,s!=d,xor=1: 1.680938 s 609.184 MB/s Buffer-Const,s==d,xor=0: 1.465974 s 698.512 MB/s Buffer-Const,s==d,xor=1: 1.645416 s 622.335 MB/s
134217728 8 16 SPLIT 16 4 NOSSE - Seed: 1347473945 Buffer-Const,s!=d,xor=0: 1.553459 s 659.174 MB/s Buffer-Const,s!=d,xor=1: 1.670615 s 612.948 MB/s Buffer-Const,s==d,xor=0: 1.468984 s 697.080 MB/s Buffer-Const,s==d,xor=1: 1.640758 s 624.102 MB/s
268435456 4 16 SPLIT 16 4 NOSSE - Seed: 1347473957 Buffer-Const,s!=d,xor=0: 1.656677 s 618.105 MB/s Buffer-Const,s!=d,xor=1: 1.669660 s 613.299 MB/s Buffer-Const,s==d,xor=0: 1.457518 s 702.564 MB/s Buffer-Const,s==d,xor=1: 1.656764 s 618.072 MB/s
536870912 2 16 SPLIT 16 4 NOSSE - Seed: 1347473969 Buffer-Const,s!=d,xor=0: 1.828201 s 560.113 MB/s Buffer-Const,s!=d,xor=1: 1.649980 s 620.614 MB/s Buffer-Const,s==d,xor=0: 1.450785 s 705.825 MB/s Buffer-Const,s==d,xor=1: 1.643633 s 623.010 MB/s
1073741824 1 16 SPLIT 16 4 NOSSE - Seed: 1347473981 Buffer-Const,s!=d,xor=0: 2.171743 s 471.511 MB/s Buffer-Const,s!=d,xor=1: 1.651953 s 619.872 MB/s Buffer-Const,s==d,xor=0: 1.466839 s 698.100 MB/s Buffer-Const,s==d,xor=1: 1.650641 s 620.365 MB/s
1024 1048576 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347473994 Buffer-Const,s!=d,xor=0: 0.440479 s 2324.740 MB/s Buffer-Const,s!=d,xor=1: 0.442003 s 2316.728 MB/s Buffer-Const,s==d,xor=0: 0.434042 s 2359.220 MB/s Buffer-Const,s==d,xor=1: 0.438626 s 2334.561 MB/s
2048 524288 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474002 Buffer-Const,s!=d,xor=0: 0.316802 s 3232.303 MB/s Buffer-Const,s!=d,xor=1: 0.331473 s 3089.240 MB/s Buffer-Const,s==d,xor=0: 0.317398 s 3226.233 MB/s Buffer-Const,s==d,xor=1: 0.330775 s 3095.758 MB/s
4096 262144 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474008 Buffer-Const,s!=d,xor=0: 0.264277 s 3874.730 MB/s Buffer-Const,s!=d,xor=1: 0.274880 s 3725.263 MB/s Buffer-Const,s==d,xor=0: 0.262972 s 3893.956 MB/s Buffer-Const,s==d,xor=1: 0.275358 s 3718.789 MB/s
8192 131072 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474015 Buffer-Const,s!=d,xor=0: 0.232929 s 4396.190 MB/s Buffer-Const,s!=d,xor=1: 0.245429 s 4172.294 MB/s Buffer-Const,s==d,xor=0: 0.231047 s 4432.005 MB/s Buffer-Const,s==d,xor=1: 0.243273 s 4209.263 MB/s
16384 65536 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474021 Buffer-Const,s!=d,xor=0: 0.220903 s 4635.515 MB/s Buffer-Const,s!=d,xor=1: 0.232513 s 4404.047 MB/s Buffer-Const,s==d,xor=0: 0.217276 s 4712.908 MB/s Buffer-Const,s==d,xor=1: 0.235140 s 4354.860 MB/s
32768 32768 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474027 Buffer-Const,s!=d,xor=0: 0.209507 s 4887.660 MB/s Buffer-Const,s!=d,xor=1: 0.225435 s 4542.324 MB/s Buffer-Const,s==d,xor=0: 0.211855 s 4833.491 MB/s Buffer-Const,s==d,xor=1: 0.222550 s 4601.205 MB/s
65536 16384 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474033 Buffer-Const,s!=d,xor=0: 0.206457 s 4959.867 MB/s Buffer-Const,s!=d,xor=1: 0.226325 s 4524.466 MB/s Buffer-Const,s==d,xor=0: 0.209159 s 4895.805 MB/s Buffer-Const,s==d,xor=1: 0.222746 s 4597.167 MB/s
131072 8192 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474039 Buffer-Const,s!=d,xor=0: 0.211171 s 4849.147 MB/s Buffer-Const,s!=d,xor=1: 0.226647 s 4518.032 MB/s Buffer-Const,s==d,xor=0: 0.205702 s 4978.085 MB/s Buffer-Const,s==d,xor=1: 0.218717 s 4681.842 MB/s
262144 4096 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474046 Buffer-Const,s!=d,xor=0: 0.217228 s 4713.948 MB/s Buffer-Const,s!=d,xor=1: 0.228891 s 4473.752 MB/s Buffer-Const,s==d,xor=0: 0.208708 s 4906.375 MB/s Buffer-Const,s==d,xor=1: 0.218286 s 4691.098 MB/s
524288 2048 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474052 Buffer-Const,s!=d,xor=0: 0.213187 s 4803.301 MB/s Buffer-Const,s!=d,xor=1: 0.224506 s 4561.132 MB/s Buffer-Const,s==d,xor=0: 0.205507 s 4982.792 MB/s Buffer-Const,s==d,xor=1: 0.218068 s 4695.780 MB/s
1048576 1024 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474058 Buffer-Const,s!=d,xor=0: 0.212405 s 4820.980 MB/s Buffer-Const,s!=d,xor=1: 0.224411 s 4563.056 MB/s Buffer-Const,s==d,xor=0: 0.204215 s 5014.334 MB/s Buffer-Const,s==d,xor=1: 0.216172 s 4736.964 MB/s
2097152 512 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474064 Buffer-Const,s!=d,xor=0: 0.261045 s 3922.692 MB/s Buffer-Const,s!=d,xor=1: 0.282378 s 3626.339 MB/s Buffer-Const,s==d,xor=0: 0.206318 s 4963.209 MB/s Buffer-Const,s==d,xor=1: 0.223613 s 4579.335 MB/s
4194304 256 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474070 Buffer-Const,s!=d,xor=0: 0.404333 s 2532.567 MB/s Buffer-Const,s!=d,xor=1: 0.395253 s 2590.748 MB/s Buffer-Const,s==d,xor=0: 0.254806 s 4018.747 MB/s Buffer-Const,s==d,xor=1: 0.268242 s 3817.453 MB/s
8388608 128 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474077 Buffer-Const,s!=d,xor=0: 0.396502 s 2582.583 MB/s Buffer-Const,s!=d,xor=1: 0.384216 s 2665.169 MB/s Buffer-Const,s==d,xor=0: 0.259012 s 3953.481 MB/s Buffer-Const,s==d,xor=1: 0.268045 s 3820.258 MB/s
16777216 64 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474083 Buffer-Const,s!=d,xor=0: 0.398601 s 2568.986 MB/s Buffer-Const,s!=d,xor=1: 0.389561 s 2628.599 MB/s Buffer-Const,s==d,xor=0: 0.262993 s 3893.642 MB/s Buffer-Const,s==d,xor=1: 0.270538 s 3785.046 MB/s
33554432 32 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474090 Buffer-Const,s!=d,xor=0: 0.404290 s 2532.837 MB/s Buffer-Const,s!=d,xor=1: 0.377139 s 2715.179 MB/s Buffer-Const,s==d,xor=0: 0.263448 s 3886.919 MB/s Buffer-Const,s==d,xor=1: 0.271300 s 3774.415 MB/s
67108864 16 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474096 Buffer-Const,s!=d,xor=0: 0.422952 s 2421.079 MB/s Buffer-Const,s!=d,xor=1: 0.373653 s 2740.513 MB/s Buffer-Const,s==d,xor=0: 0.257768 s 3972.566 MB/s Buffer-Const,s==d,xor=1: 0.273398 s 3745.460 MB/s
134217728 8 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474103 Buffer-Const,s!=d,xor=0: 0.461641 s 2218.174 MB/s Buffer-Const,s!=d,xor=1: 0.383588 s 2669.532 MB/s Buffer-Const,s==d,xor=0: 0.269459 s 3800.207 MB/s Buffer-Const,s==d,xor=1: 0.270140 s 3790.625 MB/s
268435456 4 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474110 Buffer-Const,s!=d,xor=0: 0.530862 s 1928.940 MB/s Buffer-Const,s!=d,xor=1: 0.373995 s 2738.004 MB/s Buffer-Const,s==d,xor=0: 0.258169 s 3966.391 MB/s Buffer-Const,s==d,xor=1: 0.277354 s 3692.036 MB/s
536870912 2 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474117 Buffer-Const,s!=d,xor=0: 0.695551 s 1472.214 MB/s Buffer-Const,s!=d,xor=1: 0.374834 s 2731.876 MB/s Buffer-Const,s==d,xor=0: 0.259944 s 3939.311 MB/s Buffer-Const,s==d,xor=1: 0.278206 s 3680.724 MB/s
1073741824 1 16 SPLIT 16 4 SSE,STDMAP - Seed: 1347474124 Buffer-Const,s!=d,xor=0: 1.020324 s 1003.603 MB/s Buffer-Const,s!=d,xor=1: 0.379288 s 2699.796 MB/s Buffer-Const,s==d,xor=0: 0.257958 s 3969.639 MB/s Buffer-Const,s==d,xor=1: 0.269507 s 3799.531 MB/s
1024 1048576 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474132 Buffer-Const,s!=d,xor=0: 0.374943 s 2731.083 MB/s Buffer-Const,s!=d,xor=1: 0.381695 s 2682.773 MB/s Buffer-Const,s==d,xor=0: 0.360327 s 2841.861 MB/s Buffer-Const,s==d,xor=1: 0.388995 s 2632.423 MB/s
2048 524288 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474139 Buffer-Const,s!=d,xor=0: 0.257053 s 3983.608 MB/s Buffer-Const,s!=d,xor=1: 0.274414 s 3731.587 MB/s Buffer-Const,s==d,xor=0: 0.250040 s 4095.352 MB/s Buffer-Const,s==d,xor=1: 0.271855 s 3766.712 MB/s
4096 262144 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474146 Buffer-Const,s!=d,xor=0: 0.204864 s 4998.432 MB/s Buffer-Const,s!=d,xor=1: 0.218739 s 4681.388 MB/s Buffer-Const,s==d,xor=0: 0.198597 s 5156.172 MB/s Buffer-Const,s==d,xor=1: 0.220060 s 4653.274 MB/s
8192 131072 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474152 Buffer-Const,s!=d,xor=0: 0.170398 s 6009.469 MB/s Buffer-Const,s!=d,xor=1: 0.189123 s 5414.468 MB/s Buffer-Const,s==d,xor=0: 0.170900 s 5991.821 MB/s Buffer-Const,s==d,xor=1: 0.185934 s 5507.322 MB/s
16384 65536 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474158 Buffer-Const,s!=d,xor=0: 0.158379 s 6465.520 MB/s Buffer-Const,s!=d,xor=1: 0.173688 s 5895.623 MB/s Buffer-Const,s==d,xor=0: 0.153663 s 6663.916 MB/s Buffer-Const,s==d,xor=1: 0.169384 s 6045.427 MB/s
32768 32768 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474164 Buffer-Const,s!=d,xor=0: 0.151898 s 6741.381 MB/s Buffer-Const,s!=d,xor=1: 0.179828 s 5694.332 MB/s Buffer-Const,s==d,xor=0: 0.152833 s 6700.114 MB/s Buffer-Const,s==d,xor=1: 0.170787 s 5995.761 MB/s
65536 16384 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474170 Buffer-Const,s!=d,xor=0: 0.149569 s 6846.337 MB/s Buffer-Const,s!=d,xor=1: 0.170012 s 6023.104 MB/s Buffer-Const,s==d,xor=0: 0.147070 s 6962.685 MB/s Buffer-Const,s==d,xor=1: 0.164932 s 6208.628 MB/s
131072 8192 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474176 Buffer-Const,s!=d,xor=0: 0.147436 s 6945.391 MB/s Buffer-Const,s!=d,xor=1: 0.177689 s 5762.875 MB/s Buffer-Const,s==d,xor=0: 0.141681 s 7227.494 MB/s Buffer-Const,s==d,xor=1: 0.159149 s 6434.234 MB/s
262144 4096 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474181 Buffer-Const,s!=d,xor=0: 0.176476 s 5802.489 MB/s Buffer-Const,s!=d,xor=1: 0.190235 s 5382.819 MB/s Buffer-Const,s==d,xor=0: 0.145066 s 7058.832 MB/s Buffer-Const,s==d,xor=1: 0.161759 s 6330.400 MB/s
524288 2048 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474187 Buffer-Const,s!=d,xor=0: 0.170859 s 5993.259 MB/s Buffer-Const,s!=d,xor=1: 0.184301 s 5556.124 MB/s Buffer-Const,s==d,xor=0: 0.146564 s 6986.709 MB/s Buffer-Const,s==d,xor=1: 0.163810 s 6251.162 MB/s
1048576 1024 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474193 Buffer-Const,s!=d,xor=0: 0.173408 s 5905.164 MB/s Buffer-Const,s!=d,xor=1: 0.186853 s 5480.246 MB/s Buffer-Const,s==d,xor=0: 0.145153 s 7054.646 MB/s Buffer-Const,s==d,xor=1: 0.161184 s 6352.994 MB/s
2097152 512 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474199 Buffer-Const,s!=d,xor=0: 0.253428 s 4040.592 MB/s Buffer-Const,s!=d,xor=1: 0.282650 s 3622.849 MB/s Buffer-Const,s==d,xor=0: 0.151634 s 6753.114 MB/s Buffer-Const,s==d,xor=1: 0.162269 s 6310.523 MB/s
4194304 256 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474205 Buffer-Const,s!=d,xor=0: 0.392085 s 2611.678 MB/s Buffer-Const,s!=d,xor=1: 0.390785 s 2620.365 MB/s Buffer-Const,s==d,xor=0: 0.225171 s 4547.663 MB/s Buffer-Const,s==d,xor=1: 0.229246 s 4466.819 MB/s
8388608 128 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474212 Buffer-Const,s!=d,xor=0: 0.385162 s 2658.624 MB/s Buffer-Const,s!=d,xor=1: 0.384540 s 2662.920 MB/s Buffer-Const,s==d,xor=0: 0.236824 s 4323.885 MB/s Buffer-Const,s==d,xor=1: 0.242623 s 4220.542 MB/s
16777216 64 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474218 Buffer-Const,s!=d,xor=0: 0.390857 s 2619.887 MB/s Buffer-Const,s!=d,xor=1: 0.377723 s 2710.983 MB/s Buffer-Const,s==d,xor=0: 0.235221 s 4353.346 MB/s Buffer-Const,s==d,xor=1: 0.240751 s 4253.365 MB/s
33554432 32 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474225 Buffer-Const,s!=d,xor=0: 0.401487 s 2550.516 MB/s Buffer-Const,s!=d,xor=1: 0.375969 s 2723.626 MB/s Buffer-Const,s==d,xor=0: 0.235533 s 4347.591 MB/s Buffer-Const,s==d,xor=1: 0.238075 s 4301.161 MB/s
67108864 16 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474231 Buffer-Const,s!=d,xor=0: 0.412280 s 2483.751 MB/s Buffer-Const,s!=d,xor=1: 0.371134 s 2759.109 MB/s Buffer-Const,s==d,xor=0: 0.233195 s 4391.178 MB/s Buffer-Const,s==d,xor=1: 0.236408 s 4331.504 MB/s
134217728 8 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474238 Buffer-Const,s!=d,xor=0: 0.448526 s 2283.034 MB/s Buffer-Const,s!=d,xor=1: 0.370282 s 2765.460 MB/s Buffer-Const,s==d,xor=0: 0.233473 s 4385.940 MB/s Buffer-Const,s==d,xor=1: 0.240682 s 4254.570 MB/s
268435456 4 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474245 Buffer-Const,s!=d,xor=0: 0.524038 s 1954.058 MB/s Buffer-Const,s!=d,xor=1: 0.375533 s 2726.792 MB/s Buffer-Const,s==d,xor=0: 0.242939 s 4215.050 MB/s Buffer-Const,s==d,xor=1: 0.244587 s 4186.650 MB/s
536870912 2 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474251 Buffer-Const,s!=d,xor=0: 0.669426 s 1529.669 MB/s Buffer-Const,s!=d,xor=1: 0.376380 s 2720.655 MB/s Buffer-Const,s==d,xor=0: 0.238366 s 4295.912 MB/s Buffer-Const,s==d,xor=1: 0.246535 s 4153.567 MB/s
1073741824 1 16 SPLIT 16 4 SSE,ALTMAP - Seed: 1347474259 Buffer-Const,s!=d,xor=0: 0.980024 s 1044.872 MB/s Buffer-Const,s!=d,xor=1: 0.373039 s 2745.021 MB/s Buffer-Const,s==d,xor=0: 0.257666 s 3974.139 MB/s Buffer-Const,s==d,xor=1: 0.236541 s 4329.059 MB/s
1024 1048576 32 SPLIT 8 8 - - Seed: 1347474267 Buffer-Const,s!=d,xor=0: 4.078636 s 251.064 MB/s Buffer-Const,s!=d,xor=1: 4.030305 s 254.075 MB/s Buffer-Const,s==d,xor=0: 3.999870 s 256.008 MB/s Buffer-Const,s==d,xor=1: 4.004111 s 255.737 MB/s
2048 524288 32 SPLIT 8 8 - - Seed: 1347474288 Buffer-Const,s!=d,xor=0: 3.907939 s 262.031 MB/s Buffer-Const,s!=d,xor=1: 3.925473 s 260.860 MB/s Buffer-Const,s==d,xor=0: 3.923752 s 260.975 MB/s Buffer-Const,s==d,xor=1: 3.880185 s 263.905 MB/s
4096 262144 32 SPLIT 8 8 - - Seed: 1347474309 Buffer-Const,s!=d,xor=0: 3.841739 s 266.546 MB/s Buffer-Const,s!=d,xor=1: 3.807040 s 268.975 MB/s Buffer-Const,s==d,xor=0: 3.757250 s 272.540 MB/s Buffer-Const,s==d,xor=1: 3.823483 s 267.819 MB/s
8192 131072 32 SPLIT 8 8 - - Seed: 1347474330 Buffer-Const,s!=d,xor=0: 3.847896 s 266.119 MB/s Buffer-Const,s!=d,xor=1: 3.804132 s 269.181 MB/s Buffer-Const,s==d,xor=0: 3.718211 s 275.401 MB/s Buffer-Const,s==d,xor=1: 3.918893 s 261.298 MB/s
16384 65536 32 SPLIT 8 8 - - Seed: 1347474351 Buffer-Const,s!=d,xor=0: 3.864461 s 264.979 MB/s Buffer-Const,s!=d,xor=1: 3.730061 s 274.526 MB/s Buffer-Const,s==d,xor=0: 3.765701 s 271.928 MB/s Buffer-Const,s==d,xor=1: 3.654276 s 280.220 MB/s
32768 32768 32 SPLIT 8 8 - - Seed: 1347474371 Buffer-Const,s!=d,xor=0: 3.685288 s 277.862 MB/s Buffer-Const,s!=d,xor=1: 3.737605 s 273.972 MB/s Buffer-Const,s==d,xor=0: 3.696803 s 276.996 MB/s Buffer-Const,s==d,xor=1: 3.687180 s 277.719 MB/s
65536 16384 32 SPLIT 8 8 - - Seed: 1347474391 Buffer-Const,s!=d,xor=0: 3.750255 s 273.048 MB/s Buffer-Const,s!=d,xor=1: 3.722842 s 275.059 MB/s Buffer-Const,s==d,xor=0: 3.639989 s 281.320 MB/s Buffer-Const,s==d,xor=1: 3.657695 s 279.958 MB/s
131072 8192 32 SPLIT 8 8 - - Seed: 1347474411 Buffer-Const,s!=d,xor=0: 3.727132 s 274.742 MB/s Buffer-Const,s!=d,xor=1: 3.719290 s 275.321 MB/s Buffer-Const,s==d,xor=0: 3.658669 s 279.883 MB/s Buffer-Const,s==d,xor=1: 3.639225 s 281.379 MB/s
262144 4096 32 SPLIT 8 8 - - Seed: 1347474431 Buffer-Const,s!=d,xor=0: 3.699968 s 276.759 MB/s Buffer-Const,s!=d,xor=1: 3.708224 s 276.143 MB/s Buffer-Const,s==d,xor=0: 3.636058 s 281.624 MB/s Buffer-Const,s==d,xor=1: 3.663831 s 279.489 MB/s
524288 2048 32 SPLIT 8 8 - - Seed: 1347474451 Buffer-Const,s!=d,xor=0: 3.715168 s 275.627 MB/s Buffer-Const,s!=d,xor=1: 3.716467 s 275.531 MB/s Buffer-Const,s==d,xor=0: 3.642050 s 281.160 MB/s Buffer-Const,s==d,xor=1: 3.650928 s 280.477 MB/s
1048576 1024 32 SPLIT 8 8 - - Seed: 1347474471 Buffer-Const,s!=d,xor=0: 3.748222 s 273.196 MB/s Buffer-Const,s!=d,xor=1: 3.707973 s 276.162 MB/s Buffer-Const,s==d,xor=0: 3.633509 s 281.821 MB/s Buffer-Const,s==d,xor=1: 3.606194 s 283.956 MB/s
2097152 512 32 SPLIT 8 8 - - Seed: 1347474491 Buffer-Const,s!=d,xor=0: 3.772647 s 271.427 MB/s Buffer-Const,s!=d,xor=1: 3.751378 s 272.966 MB/s Buffer-Const,s==d,xor=0: 3.627191 s 282.312 MB/s Buffer-Const,s==d,xor=1: 3.683802 s 277.974 MB/s
4194304 256 32 SPLIT 8 8 - - Seed: 1347474511 Buffer-Const,s!=d,xor=0: 3.710802 s 275.951 MB/s Buffer-Const,s!=d,xor=1: 3.754683 s 272.726 MB/s Buffer-Const,s==d,xor=0: 3.725703 s 274.847 MB/s Buffer-Const,s==d,xor=1: 3.741130 s 273.714 MB/s
8388608 128 32 SPLIT 8 8 - - Seed: 1347474531 Buffer-Const,s!=d,xor=0: 3.725367 s 274.872 MB/s Buffer-Const,s!=d,xor=1: 3.786313 s 270.448 MB/s Buffer-Const,s==d,xor=0: 3.661536 s 279.664 MB/s Buffer-Const,s==d,xor=1: 3.663013 s 279.551 MB/s
16777216 64 32 SPLIT 8 8 - - Seed: 1347474551 Buffer-Const,s!=d,xor=0: 3.733772 s 274.254 MB/s Buffer-Const,s!=d,xor=1: 3.834374 s 267.058 MB/s Buffer-Const,s==d,xor=0: 3.682088 s 278.103 MB/s Buffer-Const,s==d,xor=1: 3.664883 s 279.409 MB/s
33554432 32 32 SPLIT 8 8 - - Seed: 1347474572 Buffer-Const,s!=d,xor=0: 3.788104 s 270.320 MB/s Buffer-Const,s!=d,xor=1: 3.787767 s 270.344 MB/s Buffer-Const,s==d,xor=0: 3.767594 s 271.792 MB/s Buffer-Const,s==d,xor=1: 3.785537 s 270.503 MB/s
67108864 16 32 SPLIT 8 8 - - Seed: 1347474592 Buffer-Const,s!=d,xor=0: 3.902282 s 262.411 MB/s Buffer-Const,s!=d,xor=1: 3.781487 s 270.793 MB/s Buffer-Const,s==d,xor=0: 3.793007 s 269.970 MB/s Buffer-Const,s==d,xor=1: 3.694762 s 277.149 MB/s
134217728 8 32 SPLIT 8 8 - - Seed: 1347474613 Buffer-Const,s!=d,xor=0: 3.803999 s 269.190 MB/s Buffer-Const,s!=d,xor=1: 4.004633 s 255.704 MB/s Buffer-Const,s==d,xor=0: 3.789479 s 270.222 MB/s Buffer-Const,s==d,xor=1: 3.750047 s 273.063 MB/s
268435456 4 32 SPLIT 8 8 - - Seed: 1347474633 Buffer-Const,s!=d,xor=0: 3.924148 s 260.948 MB/s Buffer-Const,s!=d,xor=1: 3.670553 s 278.977 MB/s Buffer-Const,s==d,xor=0: 3.616194 s 283.171 MB/s Buffer-Const,s==d,xor=1: 3.472242 s 294.910 MB/s
536870912 2 32 SPLIT 8 8 - - Seed: 1347474654 Buffer-Const,s!=d,xor=0: 3.879471 s 263.954 MB/s Buffer-Const,s!=d,xor=1: 3.399143 s 301.252 MB/s Buffer-Const,s==d,xor=0: 3.431460 s 298.415 MB/s Buffer-Const,s==d,xor=1: 3.755244 s 272.685 MB/s
1073741824 1 32 SPLIT 8 8 - - Seed: 1347474674 Buffer-Const,s!=d,xor=0: 4.345123 s 235.667 MB/s Buffer-Const,s!=d,xor=1: 3.869906 s 264.606 MB/s Buffer-Const,s==d,xor=0: 3.230677 s 316.961 MB/s Buffer-Const,s==d,xor=1: 3.296187 s 310.662 MB/s
1024 1048576 32 SPLIT 32 4 NOSSE - Seed: 1347474695 Buffer-Const,s!=d,xor=0: 3.426757 s 298.825 MB/s Buffer-Const,s!=d,xor=1: 3.431886 s 298.378 MB/s Buffer-Const,s==d,xor=0: 3.434533 s 298.148 MB/s Buffer-Const,s==d,xor=1: 3.427667 s 298.745 MB/s
2048 524288 32 SPLIT 32 4 NOSSE - Seed: 1347474714 Buffer-Const,s!=d,xor=0: 3.157299 s 324.328 MB/s Buffer-Const,s!=d,xor=1: 3.174731 s 322.547 MB/s Buffer-Const,s==d,xor=0: 3.147733 s 325.313 MB/s Buffer-Const,s==d,xor=1: 3.124081 s 327.776 MB/s
4096 262144 32 SPLIT 32 4 NOSSE - Seed: 1347474732 Buffer-Const,s!=d,xor=0: 2.994873 s 341.918 MB/s Buffer-Const,s!=d,xor=1: 2.995221 s 341.878 MB/s Buffer-Const,s==d,xor=0: 2.977546 s 343.907 MB/s Buffer-Const,s==d,xor=1: 3.072379 s 333.292 MB/s
8192 131072 32 SPLIT 32 4 NOSSE - Seed: 1347474749 Buffer-Const,s!=d,xor=0: 2.942536 s 347.999 MB/s Buffer-Const,s!=d,xor=1: 3.084363 s 331.997 MB/s Buffer-Const,s==d,xor=0: 3.405338 s 300.704 MB/s Buffer-Const,s==d,xor=1: 2.927504 s 349.786 MB/s
16384 65536 32 SPLIT 32 4 NOSSE - Seed: 1347474767 Buffer-Const,s!=d,xor=0: 2.898155 s 353.328 MB/s Buffer-Const,s!=d,xor=1: 2.916746 s 351.076 MB/s Buffer-Const,s==d,xor=0: 2.890006 s 354.325 MB/s Buffer-Const,s==d,xor=1: 2.884264 s 355.030 MB/s
32768 32768 32 SPLIT 32 4 NOSSE - Seed: 1347474784 Buffer-Const,s!=d,xor=0: 2.931310 s 349.332 MB/s Buffer-Const,s!=d,xor=1: 3.416450 s 299.726 MB/s Buffer-Const,s==d,xor=0: 2.887018 s 354.691 MB/s Buffer-Const,s==d,xor=1: 2.879376 s 355.633 MB/s
65536 16384 32 SPLIT 32 4 NOSSE - Seed: 1347474802 Buffer-Const,s!=d,xor=0: 2.836922 s 360.955 MB/s Buffer-Const,s!=d,xor=1: 2.840663 s 360.479 MB/s Buffer-Const,s==d,xor=0: 2.844574 s 359.984 MB/s Buffer-Const,s==d,xor=1: 2.861463 s 357.859 MB/s
131072 8192 32 SPLIT 32 4 NOSSE - Seed: 1347474818 Buffer-Const,s!=d,xor=0: 2.847367 s 359.630 MB/s Buffer-Const,s!=d,xor=1: 2.891688 s 354.118 MB/s Buffer-Const,s==d,xor=0: 2.893903 s 353.847 MB/s Buffer-Const,s==d,xor=1: 2.853056 s 358.913 MB/s
262144 4096 32 SPLIT 32 4 NOSSE - Seed: 1347474835 Buffer-Const,s!=d,xor=0: 2.845535 s 359.862 MB/s Buffer-Const,s!=d,xor=1: 2.883235 s 355.157 MB/s Buffer-Const,s==d,xor=0: 2.825692 s 362.389 MB/s Buffer-Const,s==d,xor=1: 2.824296 s 362.568 MB/s
524288 2048 32 SPLIT 32 4 NOSSE - Seed: 1347474852 Buffer-Const,s!=d,xor=0: 2.830584 s 361.763 MB/s Buffer-Const,s!=d,xor=1: 2.842232 s 360.280 MB/s Buffer-Const,s==d,xor=0: 2.855167 s 358.648 MB/s Buffer-Const,s==d,xor=1: 2.844668 s 359.972 MB/s
1048576 1024 32 SPLIT 32 4 NOSSE - Seed: 1347474868 Buffer-Const,s!=d,xor=0: 2.866585 s 357.220 MB/s Buffer-Const,s!=d,xor=1: 2.903829 s 352.638 MB/s Buffer-Const,s==d,xor=0: 2.842242 s 360.279 MB/s Buffer-Const,s==d,xor=1: 2.842065 s 360.301 MB/s
2097152 512 32 SPLIT 32 4 NOSSE - Seed: 1347474885 Buffer-Const,s!=d,xor=0: 2.881829 s 355.330 MB/s Buffer-Const,s!=d,xor=1: 2.869698 s 356.832 MB/s Buffer-Const,s==d,xor=0: 2.844980 s 359.932 MB/s Buffer-Const,s==d,xor=1: 2.879891 s 355.569 MB/s
4194304 256 32 SPLIT 32 4 NOSSE - Seed: 1347474902 Buffer-Const,s!=d,xor=0: 2.904719 s 352.530 MB/s Buffer-Const,s!=d,xor=1: 2.957000 s 346.297 MB/s Buffer-Const,s==d,xor=0: 2.897870 s 353.363 MB/s Buffer-Const,s==d,xor=1: 2.860694 s 357.955 MB/s
8388608 128 32 SPLIT 32 4 NOSSE - Seed: 1347474919 Buffer-Const,s!=d,xor=0: 2.891168 s 354.182 MB/s Buffer-Const,s!=d,xor=1: 2.912499 s 351.588 MB/s Buffer-Const,s==d,xor=0: 2.905373 s 352.450 MB/s Buffer-Const,s==d,xor=1: 2.877376 s 355.880 MB/s
16777216 64 32 SPLIT 32 4 NOSSE - Seed: 1347474936 Buffer-Const,s!=d,xor=0: 2.874882 s 356.188 MB/s Buffer-Const,s!=d,xor=1: 2.891762 s 354.109 MB/s Buffer-Const,s==d,xor=0: 2.882928 s 355.194 MB/s Buffer-Const,s==d,xor=1: 2.899087 s 353.215 MB/s
33554432 32 32 SPLIT 32 4 NOSSE - Seed: 1347474952 Buffer-Const,s!=d,xor=0: 2.927485 s 349.788 MB/s Buffer-Const,s!=d,xor=1: 2.908132 s 352.116 MB/s Buffer-Const,s==d,xor=0: 2.885413 s 354.889 MB/s Buffer-Const,s==d,xor=1: 2.878262 s 355.770 MB/s
67108864 16 32 SPLIT 32 4 NOSSE - Seed: 1347474969 Buffer-Const,s!=d,xor=0: 2.936867 s 348.671 MB/s Buffer-Const,s!=d,xor=1: 2.918213 s 350.900 MB/s Buffer-Const,s==d,xor=0: 2.860039 s 358.037 MB/s Buffer-Const,s==d,xor=1: 2.914023 s 351.404 MB/s
134217728 8 32 SPLIT 32 4 NOSSE - Seed: 1347474986 Buffer-Const,s!=d,xor=0: 3.004242 s 340.851 MB/s Buffer-Const,s!=d,xor=1: 2.907473 s 352.196 MB/s Buffer-Const,s==d,xor=0: 2.870576 s 356.723 MB/s Buffer-Const,s==d,xor=1: 2.869254 s 356.887 MB/s
268435456 4 32 SPLIT 32 4 NOSSE - Seed: 1347475003 Buffer-Const,s!=d,xor=0: 3.086275 s 331.792 MB/s Buffer-Const,s!=d,xor=1: 2.917016 s 351.044 MB/s Buffer-Const,s==d,xor=0: 2.872529 s 356.480 MB/s Buffer-Const,s==d,xor=1: 2.876983 s 355.928 MB/s
536870912 2 32 SPLIT 32 4 NOSSE - Seed: 1347475021 Buffer-Const,s!=d,xor=0: 3.246244 s 315.441 MB/s Buffer-Const,s!=d,xor=1: 2.880565 s 355.486 MB/s Buffer-Const,s==d,xor=0: 2.875096 s 356.162 MB/s Buffer-Const,s==d,xor=1: 2.880929 s 355.441 MB/s
1073741824 1 32 SPLIT 32 4 NOSSE - Seed: 1347475038 Buffer-Const,s!=d,xor=0: 3.564328 s 287.291 MB/s Buffer-Const,s!=d,xor=1: 2.865621 s 357.340 MB/s Buffer-Const,s==d,xor=0: 2.851534 s 359.105 MB/s Buffer-Const,s==d,xor=1: 2.894613 s 353.761 MB/s
1024 1048576 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475056 Buffer-Const,s!=d,xor=0: 1.386263 s 738.677 MB/s Buffer-Const,s!=d,xor=1: 1.388797 s 737.329 MB/s Buffer-Const,s==d,xor=0: 1.389924 s 736.731 MB/s Buffer-Const,s==d,xor=1: 1.397714 s 732.625 MB/s
2048 524288 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475068 Buffer-Const,s!=d,xor=0: 0.861808 s 1188.200 MB/s Buffer-Const,s!=d,xor=1: 0.862678 s 1187.002 MB/s Buffer-Const,s==d,xor=0: 0.858774 s 1192.397 MB/s Buffer-Const,s==d,xor=1: 0.877707 s 1166.676 MB/s
4096 262144 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475076 Buffer-Const,s!=d,xor=0: 0.599774 s 1707.309 MB/s Buffer-Const,s!=d,xor=1: 0.605536 s 1691.064 MB/s Buffer-Const,s==d,xor=0: 0.594488 s 1722.490 MB/s Buffer-Const,s==d,xor=1: 0.598080 s 1712.145 MB/s
8192 131072 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475084 Buffer-Const,s!=d,xor=0: 0.463454 s 2209.496 MB/s Buffer-Const,s!=d,xor=1: 0.476521 s 2148.911 MB/s Buffer-Const,s==d,xor=0: 0.463254 s 2210.451 MB/s Buffer-Const,s==d,xor=1: 0.475028 s 2155.663 MB/s
16384 65536 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475091 Buffer-Const,s!=d,xor=0: 0.403863 s 2535.512 MB/s Buffer-Const,s!=d,xor=1: 0.406392 s 2519.733 MB/s Buffer-Const,s==d,xor=0: 0.398511 s 2569.567 MB/s Buffer-Const,s==d,xor=1: 0.403220 s 2539.558 MB/s
32768 32768 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475098 Buffer-Const,s!=d,xor=0: 0.366185 s 2796.399 MB/s Buffer-Const,s!=d,xor=1: 0.371948 s 2753.077 MB/s Buffer-Const,s==d,xor=0: 0.365726 s 2799.912 MB/s Buffer-Const,s==d,xor=1: 0.373114 s 2744.470 MB/s
65536 16384 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475105 Buffer-Const,s!=d,xor=0: 0.355393 s 2881.320 MB/s Buffer-Const,s!=d,xor=1: 0.361260 s 2834.520 MB/s Buffer-Const,s==d,xor=0: 0.352397 s 2905.810 MB/s Buffer-Const,s==d,xor=1: 0.359221 s 2850.615 MB/s
131072 8192 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475111 Buffer-Const,s!=d,xor=0: 0.345755 s 2961.634 MB/s Buffer-Const,s!=d,xor=1: 0.349291 s 2931.656 MB/s Buffer-Const,s==d,xor=0: 0.349212 s 2932.317 MB/s Buffer-Const,s==d,xor=1: 0.346459 s 2955.613 MB/s
262144 4096 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475118 Buffer-Const,s!=d,xor=0: 0.339918 s 3012.490 MB/s Buffer-Const,s!=d,xor=1: 0.348600 s 2937.467 MB/s Buffer-Const,s==d,xor=0: 0.337834 s 3031.078 MB/s Buffer-Const,s==d,xor=1: 0.345001 s 2968.103 MB/s
524288 2048 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475125 Buffer-Const,s!=d,xor=0: 0.341756 s 2996.291 MB/s Buffer-Const,s!=d,xor=1: 0.350183 s 2924.183 MB/s Buffer-Const,s==d,xor=0: 0.339499 s 3016.213 MB/s Buffer-Const,s==d,xor=1: 0.344557 s 2971.934 MB/s
1048576 1024 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475131 Buffer-Const,s!=d,xor=0: 0.341502 s 2998.519 MB/s Buffer-Const,s!=d,xor=1: 0.346550 s 2954.845 MB/s Buffer-Const,s==d,xor=0: 0.335763 s 3049.766 MB/s Buffer-Const,s==d,xor=1: 0.341348 s 2999.872 MB/s
2097152 512 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475138 Buffer-Const,s!=d,xor=0: 0.359277 s 2850.168 MB/s Buffer-Const,s!=d,xor=1: 0.400653 s 2555.825 MB/s Buffer-Const,s==d,xor=0: 0.341248 s 3000.750 MB/s Buffer-Const,s==d,xor=1: 0.342671 s 2988.290 MB/s
4194304 256 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475145 Buffer-Const,s!=d,xor=0: 0.417889 s 2450.411 MB/s Buffer-Const,s!=d,xor=1: 0.473958 s 2160.527 MB/s Buffer-Const,s==d,xor=0: 0.376402 s 2720.496 MB/s Buffer-Const,s==d,xor=1: 0.382395 s 2677.859 MB/s
8388608 128 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475152 Buffer-Const,s!=d,xor=0: 0.438709 s 2334.120 MB/s Buffer-Const,s!=d,xor=1: 0.458300 s 2234.343 MB/s Buffer-Const,s==d,xor=0: 0.372593 s 2748.306 MB/s Buffer-Const,s==d,xor=1: 0.377737 s 2710.881 MB/s
16777216 64 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475158 Buffer-Const,s!=d,xor=0: 0.430602 s 2378.066 MB/s Buffer-Const,s!=d,xor=1: 0.457316 s 2239.150 MB/s Buffer-Const,s==d,xor=0: 0.380552 s 2690.828 MB/s Buffer-Const,s==d,xor=1: 0.377189 s 2714.818 MB/s
33554432 32 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475165 Buffer-Const,s!=d,xor=0: 0.435465 s 2351.507 MB/s Buffer-Const,s!=d,xor=1: 0.463290 s 2210.280 MB/s Buffer-Const,s==d,xor=0: 0.377565 s 2712.118 MB/s Buffer-Const,s==d,xor=1: 0.379561 s 2697.854 MB/s
67108864 16 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475172 Buffer-Const,s!=d,xor=0: 0.469558 s 2180.775 MB/s Buffer-Const,s!=d,xor=1: 0.454985 s 2250.625 MB/s Buffer-Const,s==d,xor=0: 0.375220 s 2729.069 MB/s Buffer-Const,s==d,xor=1: 0.378372 s 2706.333 MB/s
134217728 8 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475179 Buffer-Const,s!=d,xor=0: 0.500927 s 2044.211 MB/s Buffer-Const,s!=d,xor=1: 0.461080 s 2220.875 MB/s Buffer-Const,s==d,xor=0: 0.378093 s 2708.328 MB/s Buffer-Const,s==d,xor=1: 0.380782 s 2689.205 MB/s
268435456 4 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475187 Buffer-Const,s!=d,xor=0: 0.583161 s 1755.947 MB/s Buffer-Const,s!=d,xor=1: 0.454891 s 2251.089 MB/s Buffer-Const,s==d,xor=0: 0.370498 s 2763.848 MB/s Buffer-Const,s==d,xor=1: 0.380160 s 2693.602 MB/s
536870912 2 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475194 Buffer-Const,s!=d,xor=0: 0.773630 s 1323.630 MB/s Buffer-Const,s!=d,xor=1: 0.449662 s 2277.266 MB/s Buffer-Const,s==d,xor=0: 0.378359 s 2706.425 MB/s Buffer-Const,s==d,xor=1: 0.381460 s 2684.423 MB/s
1073741824 1 32 SPLIT 32 4 SSE,STDMAP - Seed: 1347475202 Buffer-Const,s!=d,xor=0: 1.140212 s 898.079 MB/s Buffer-Const,s!=d,xor=1: 0.448195 s 2284.720 MB/s Buffer-Const,s==d,xor=0: 0.371347 s 2757.529 MB/s Buffer-Const,s==d,xor=1: 0.383728 s 2668.557 MB/s
1024 1048576 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475210 Buffer-Const,s!=d,xor=0: 1.306085 s 784.023 MB/s Buffer-Const,s!=d,xor=1: 1.316872 s 777.600 MB/s Buffer-Const,s==d,xor=0: 1.312451 s 780.220 MB/s Buffer-Const,s==d,xor=1: 1.336282 s 766.305 MB/s
2048 524288 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475221 Buffer-Const,s!=d,xor=0: 0.780763 s 1311.537 MB/s Buffer-Const,s!=d,xor=1: 0.788499 s 1298.670 MB/s Buffer-Const,s==d,xor=0: 0.774973 s 1321.336 MB/s Buffer-Const,s==d,xor=1: 0.787734 s 1299.931 MB/s
4096 262144 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475229 Buffer-Const,s!=d,xor=0: 0.509444 s 2010.036 MB/s Buffer-Const,s!=d,xor=1: 0.528554 s 1937.360 MB/s Buffer-Const,s==d,xor=0: 0.515298 s 1987.198 MB/s Buffer-Const,s==d,xor=1: 0.533344 s 1919.963 MB/s
8192 131072 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475237 Buffer-Const,s!=d,xor=0: 0.385064 s 2659.301 MB/s Buffer-Const,s!=d,xor=1: 0.389860 s 2626.586 MB/s Buffer-Const,s==d,xor=0: 0.377777 s 2710.597 MB/s Buffer-Const,s==d,xor=1: 0.389788 s 2627.068 MB/s
16384 65536 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475244 Buffer-Const,s!=d,xor=0: 0.316446 s 3235.938 MB/s Buffer-Const,s!=d,xor=1: 0.327997 s 3121.980 MB/s Buffer-Const,s==d,xor=0: 0.313605 s 3265.256 MB/s Buffer-Const,s==d,xor=1: 0.323668 s 3163.736 MB/s
32768 32768 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475250 Buffer-Const,s!=d,xor=0: 0.280216 s 3654.318 MB/s Buffer-Const,s!=d,xor=1: 0.293557 s 3488.244 MB/s Buffer-Const,s==d,xor=0: 0.278453 s 3677.463 MB/s Buffer-Const,s==d,xor=1: 0.296944 s 3448.460 MB/s
65536 16384 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475256 Buffer-Const,s!=d,xor=0: 0.302436 s 3385.842 MB/s Buffer-Const,s!=d,xor=1: 0.277890 s 3684.909 MB/s Buffer-Const,s==d,xor=0: 0.262908 s 3894.892 MB/s Buffer-Const,s==d,xor=1: 0.272852 s 3752.951 MB/s
131072 8192 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475263 Buffer-Const,s!=d,xor=0: 0.253372 s 4041.493 MB/s Buffer-Const,s!=d,xor=1: 0.265148 s 3861.999 MB/s Buffer-Const,s==d,xor=0: 0.253380 s 4041.364 MB/s Buffer-Const,s==d,xor=1: 0.264949 s 3864.897 MB/s
262144 4096 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475269 Buffer-Const,s!=d,xor=0: 0.252645 s 4053.125 MB/s Buffer-Const,s!=d,xor=1: 0.265852 s 3851.771 MB/s Buffer-Const,s==d,xor=0: 0.252493 s 4055.552 MB/s Buffer-Const,s==d,xor=1: 0.261079 s 3922.183 MB/s
524288 2048 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475276 Buffer-Const,s!=d,xor=0: 0.250466 s 4088.377 MB/s Buffer-Const,s!=d,xor=1: 0.262400 s 3902.444 MB/s Buffer-Const,s==d,xor=0: 0.250604 s 4086.133 MB/s Buffer-Const,s==d,xor=1: 0.266080 s 3848.461 MB/s
1048576 1024 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475282 Buffer-Const,s!=d,xor=0: 0.261790 s 3911.528 MB/s Buffer-Const,s!=d,xor=1: 0.270695 s 3782.856 MB/s Buffer-Const,s==d,xor=0: 0.251318 s 4074.523 MB/s Buffer-Const,s==d,xor=1: 0.262138 s 3906.337 MB/s
2097152 512 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475288 Buffer-Const,s!=d,xor=0: 0.307526 s 3329.799 MB/s Buffer-Const,s!=d,xor=1: 0.335713 s 3050.225 MB/s Buffer-Const,s==d,xor=0: 0.251394 s 4073.291 MB/s Buffer-Const,s==d,xor=1: 0.261434 s 3916.854 MB/s
4194304 256 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475295 Buffer-Const,s!=d,xor=0: 0.392351 s 2609.907 MB/s Buffer-Const,s!=d,xor=1: 0.407979 s 2509.932 MB/s Buffer-Const,s==d,xor=0: 0.298268 s 3433.150 MB/s Buffer-Const,s==d,xor=1: 0.307510 s 3329.977 MB/s
8388608 128 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475301 Buffer-Const,s!=d,xor=0: 0.382194 s 2679.271 MB/s Buffer-Const,s!=d,xor=1: 0.401781 s 2548.653 MB/s Buffer-Const,s==d,xor=0: 0.299551 s 3418.447 MB/s Buffer-Const,s==d,xor=1: 0.310157 s 3301.553 MB/s
16777216 64 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475308 Buffer-Const,s!=d,xor=0: 0.380721 s 2689.631 MB/s Buffer-Const,s!=d,xor=1: 0.407525 s 2512.729 MB/s Buffer-Const,s==d,xor=0: 0.312348 s 3278.393 MB/s Buffer-Const,s==d,xor=1: 0.315135 s 3249.401 MB/s
33554432 32 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475315 Buffer-Const,s!=d,xor=0: 0.417304 s 2453.845 MB/s Buffer-Const,s!=d,xor=1: 0.399624 s 2562.406 MB/s Buffer-Const,s==d,xor=0: 0.301665 s 3394.496 MB/s Buffer-Const,s==d,xor=1: 0.308625 s 3317.940 MB/s
67108864 16 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475322 Buffer-Const,s!=d,xor=0: 0.409972 s 2497.732 MB/s Buffer-Const,s!=d,xor=1: 0.390287 s 2623.712 MB/s Buffer-Const,s==d,xor=0: 0.300863 s 3403.539 MB/s Buffer-Const,s==d,xor=1: 0.310340 s 3299.610 MB/s
134217728 8 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475328 Buffer-Const,s!=d,xor=0: 0.462659 s 2213.291 MB/s Buffer-Const,s!=d,xor=1: 0.394245 s 2597.369 MB/s Buffer-Const,s==d,xor=0: 0.304025 s 3368.140 MB/s Buffer-Const,s==d,xor=1: 0.305459 s 3352.329 MB/s
268435456 4 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475335 Buffer-Const,s!=d,xor=0: 0.535408 s 1912.560 MB/s Buffer-Const,s!=d,xor=1: 0.389819 s 2626.859 MB/s Buffer-Const,s==d,xor=0: 0.300778 s 3404.505 MB/s Buffer-Const,s==d,xor=1: 0.309189 s 3311.892 MB/s
536870912 2 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475342 Buffer-Const,s!=d,xor=0: 0.697741 s 1467.594 MB/s Buffer-Const,s!=d,xor=1: 0.391113 s 2618.169 MB/s Buffer-Const,s==d,xor=0: 0.297088 s 3446.786 MB/s Buffer-Const,s==d,xor=1: 0.312841 s 3273.229 MB/s
1073741824 1 32 SPLIT 32 4 SSE,ALTMAP - Seed: 1347475350 Buffer-Const,s!=d,xor=0: 1.069489 s 957.467 MB/s Buffer-Const,s!=d,xor=1: 0.396330 s 2583.705 MB/s Buffer-Const,s==d,xor=0: 0.315887 s 3241.666 MB/s Buffer-Const,s==d,xor=1: 0.307323 s 3331.999 MB/s

13
tmp2.sh Normal file

@ -0,0 +1,13 @@
if [ $# -lt 4 ]; then
echo 'usage: sh tmp2.sh w gf_specs (e.g. LOG - -)' >&2
exit 1
fi
w=$1
shift
i=1024
while [ $i -le 1073741824 ]; do
iter=`echo $i | awk '{ print (1073741824/$1)*10 }'`
echo $i $iter $w $* `gf_time $w R -1 $i $iter $*`
i=`echo $i | awk '{ print $1*2 }'`
done