<h3>Code structure as of 7/20/2012</h3>
written by Jim.
<p>
Ok -- once again, I have messed with the structure. My goal is a structure that is flexible and efficient.
It's similar to what came before, but better because it makes things like Euclid's
method much cleaner.
<p>
I think we're ready to hack.
<p>
<p>
<hr>
<h3>Files</h3>
<UL>
<LI> <a href=GNUmakefile><b>GNUmakefile</b></a>: Makefile
<LI> <a href=README><b>README</b></a>: Empty readme
<LI> <a href=explanation.html><b>explanation.html</b></a>: This file.
<LI> <a href=gf.c><b>gf.c</b></a>: Main gf routines
<LI> <a href=gf.h><b>gf.h</b></a>: Main gf prototypes and typedefs
<LI> <a href=gf_int.h><b>gf_int.h</b></a>: Prototypes and typedefs for common routines for the
internal gf implementations.
<LI> <a href=gf_method.c><b>gf_method.c</b></a>: Code to help parse argc/argv to define the method.
This way, various programs can be consistent with how they handle the command line.
<LI> <a href=gf_method.h><b>gf_method.h</b></a>: Prototypes for <b>gf_method.c</b>.
<LI> <a href=gf_methods.c><b>gf_methods.c</b></a>: This program prints out how to define
the various methods on the command line. My idea is to beef this up so that you can
give it a method spec on the command line, and it will tell you whether it's valid, or
why it's invalid. I haven't written that part yet.
<LI> <a href=gf_mult.c><b>gf_mult.c</b></a>: Program to do single multiplication.
<LI> <a href=gf_div.c><b>gf_div.c</b></a>: Program to do single divisions -- it's created
in the makefile with a sed script on gf_mult.c.
<LI> <a href=gf_time.c><b>gf_time.c</b></a>: Time tester
<LI> <a href=gf_unit.c><b>gf_unit.c</b></a>: Unit tester
<LI> <a href=gf_54.c><b>gf_54.c</b></a>: A simple example program that multiplies
5 and 4 in GF(2^4).
<LI> <a href=gf_w4.c><b>gf_w4.c</b></a>: Implementation of code for <i>w</i> = 4.
(For now, only SHIFT and LOG, plus EUCLID & MATRIX).
<LI> <a href=gf_w8.c><b>gf_w8.c</b></a>: Implementation of code for <i>w</i> = 8.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w16.c><b>gf_w16.c</b></a>: Implementation of code for <i>w</i> = 16.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w32.c><b>gf_w32.c</b></a>: Implementation of code for <i>w</i> = 32.
(For now, only SHIFT plus EUCLID & MATRIX).
<LI> <a href=gf_w64.c><b>gf_w64.c</b></a>: Implementation of code for <i>w</i> = 64.
(For now, only SHIFT and EUCLID).
<LI> I don't have gf_w128.c or gf_gen.c yet.
</UL>
<hr>
<h3>Prototypes and typedefs in gf.h</h3>
The main structure that users will see is in <b>gf.h</b>, and it is of type
<b>gf_t</b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef struct gf {
  gf_func_a_b    multiply;
  gf_func_a_b    divide;
  gf_func_a      inverse;
  gf_region      multiply_region;
  void           *scratch;
} gf_t;
</pre></td></table></center><p>
We can beef it up later with buf-buf or buf-acc. The problem is that the paper is
already bloated, so right now, I want to keep it lean.
<p>
The types of the procedures are big unions, so that they work with the following
types of arguments:
<p><center><table border=3 cellpadding=3><td><pre>
typedef uint8_t gf_val_4_t;
typedef uint8_t gf_val_8_t;
typedef uint16_t gf_val_16_t;
typedef uint32_t gf_val_32_t;
typedef uint64_t gf_val_64_t;
typedef uint64_t *gf_val_128_t;
typedef uint32_t gf_val_gen_t; /* The intent here is for general values <= 32 */
</pre></td></table></center><p>
To use a <b>gf_t</b>, you need to initialize it with <b>gf_init_easy()</b> or
<b>gf_init_hard()</b>. Let's concentrate on the former:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_init_easy(gf_t *gf, int w, int mult_type);
</pre></td></table></center><p>
You pass it memory for a <b>gf_t</b>, a value of <b>w</b> and
a variable that says how to do multiplication. The valid values of <b>mult_type</b>
are enumerated in <b>gf.h</b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef enum {GF_MULT_DEFAULT,
              GF_MULT_SHIFT,
              GF_MULT_GROUP,
              GF_MULT_BYTWO_p,
              GF_MULT_BYTWO_b,
              GF_MULT_TABLE,
              GF_MULT_LOG_TABLE,
              GF_MULT_SPLIT_TABLE,
              GF_MULT_COMPOSITE } gf_mult_type_t;
</pre></td></table></center><p>
After creating the <b>gf_t</b>, you use its <b>multiply</b> method
to multiply, using the union's fields to work with the various types.
It looks easier than my explanation. For example, suppose you wanted to multiply 5 and 4 in <i>GF(2<sup>4</sup>)</i>.
You can do it as in
<b><a href=gf_54.c>gf_54.c</a></b>
<p><center><table border=3 cellpadding=3><td><pre>
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "gf.h"

int main()
{
  gf_t gf;

  gf_init_easy(&gf, 4, GF_MULT_DEFAULT);
  printf("%d\n", gf.multiply.w4(&gf, 5, 4));
  exit(0);
}
</pre></td></table></center><p>
If you wanted to multiply in <i>GF(2<sup>8</sup>)</i>, then you'd pass 8 as <b>w</b>
to <b>gf_init_easy()</b>, and call the multiplier as <b>gf.multiply.w8()</b>.
<p>
When you're done with your <b>gf_t</b>, you should call <b>gf_free()</b> on it so
that it can free memory that it has allocated. We'll talk more about memory later, but if you
create your <b>gf_t</b> with <b>gf_init_easy()</b>, then it calls <b>malloc()</b>, and
<b>gf_free()</b> is how that memory gets released.
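<p>
To make the "big unions" idea concrete, here is a minimal sketch of a union of function pointers
with one member per word size. The names <b>sketch_gf_t</b>, <b>sketch_func_a_b</b> and
<b>sketch_init_w4()</b> are hypothetical and are not the definitions in <b>gf.h</b>; the sketch only
shows why a call site reads <b>gf.multiply.w4(...)</b>. The multiplier is a plain shift-and-reduce
routine that assumes the polynomial x^4 + x + 1.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_gf;   /* forward declaration so the function types can refer to it */

/* One member per word size; all members share the same storage. */
typedef union {
    uint8_t  (*w4)(struct sketch_gf *gf, uint8_t a, uint8_t b);
    uint16_t (*w16)(struct sketch_gf *gf, uint16_t a, uint16_t b);
} sketch_func_a_b;

typedef struct sketch_gf {
    sketch_func_a_b multiply;
    void *scratch;
} sketch_gf_t;

/* Shift-and-reduce multiply in GF(2^4), assuming polynomial 0x13. */
static uint8_t sketch_w4_multiply(struct sketch_gf *gf, uint8_t a, uint8_t b)
{
    unsigned p = 0, aa = a;

    (void) gf;                       /* this toy method keeps no tables */
    while (b) {
        if (b & 1) p ^= aa;          /* add a * x^i when bit i of b is set */
        aa <<= 1;                    /* a * x^(i+1) */
        if (aa & 0x10) aa ^= 0x13;   /* reduce when the degree reaches 4 */
        b >>= 1;
    }
    return (uint8_t) p;
}

/* Initialization fills in the member that matches w. */
int sketch_init_w4(sketch_gf_t *gf)
{
    gf->multiply.w4 = sketch_w4_multiply;
    gf->scratch = NULL;
    return 1;
}
```

After <b>sketch_init_w4(&gf)</b>, the call <b>gf.multiply.w4(&gf, 5, 4)</b> goes through the
union member for 4-bit words; a field initialized for another word size would be called through its own member.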
<p>
<hr>
<h3>Memory allocation</h3>
Each implementation of a multiplication technique keeps around its
own data. For example, <b>GF_MULT_TABLE</b> keeps around
multiplication and division tables, and <b>GF_MULT_LOG_TABLE</b> maintains log and
antilog tables. This data is stored in the pointer <b>scratch</b>. My intent
is that the memory there is all that's required. In other
words, the <b>multiply()</b>, <b>divide()</b>, <b>inverse()</b> and
<b>multiply_region()</b> calls don't do any memory allocation.
Moreover, <b>gf_init_easy()</b> only allocates one chunk of memory --
the one in <b>scratch</b>.
<p>
If you don't want to have the initialization call allocate memory, you can use <b>gf_init_hard()</b>:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_init_hard(gf_t *gf,
                        int w,
                        int mult_type,
                        int region_type,
                        int divide_type,
                        uint64_t prim_poly,
                        int arg1,
                        int arg2,
                        gf_t *base_gf,
                        void *scratch_memory);
</pre></td></table></center><p>
The first three parameters are the same as <b>gf_init_easy()</b>.
The <b>region_type</b> and <b>divide_type</b> arguments select how
<b>multiply_region</b> and division are performed.
Their values are also defined in <b>gf.h</b>. The <b>region_type</b> values are
bit flags, so you can mix them (e.g. "DOUBLE" and "SSE"):
<p><center><table border=3 cellpadding=3><td><pre>
#define GF_REGION_DEFAULT (0x0)
#define GF_REGION_SINGLE_TABLE (0x1)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_QUAD_TABLE (0x4)
#define GF_REGION_LAZY (0x8)
#define GF_REGION_SSE (0x10)
#define GF_REGION_NOSSE (0x20)
#define GF_REGION_STDMAP (0x40)
#define GF_REGION_ALTMAP (0x80)
#define GF_REGION_CAUCHY (0x100)
typedef uint32_t gf_region_type_t;
typedef enum { GF_DIVIDE_DEFAULT,
GF_DIVIDE_MATRIX,
GF_DIVIDE_EUCLID } gf_division_type_t;
</pre></td></table></center><p>
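Because the region types are distinct bits, they can presumably be combined with bitwise OR
and tested with bitwise AND. Here is a minimal sketch; it repeats four of the definitions above so
that it compiles on its own, and <b>region_has()</b> and <b>region_double_sse()</b> are hypothetical
helpers, not part of <b>gf.h</b>.

```c
#include <stdint.h>

/* Copied from the definitions above so the sketch is self-contained. */
#define GF_REGION_DEFAULT      (0x0)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_LAZY         (0x8)
#define GF_REGION_SSE          (0x10)

typedef uint32_t gf_region_type_t;

/* Hypothetical helper: nonzero if every bit of flag is set in region_type. */
int region_has(gf_region_type_t region_type, gf_region_type_t flag)
{
    return (region_type & flag) == flag;
}

/* Hypothetical helper: the combination that "DOUBLE" plus "SSE" suggests. */
gf_region_type_t region_double_sse(void)
{
    return GF_REGION_DOUBLE_TABLE | GF_REGION_SSE;
}
```

Whether a given combination is actually legal for a method is up to the implementation; this sketch only shows the bit arithmetic.
<p>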
You can change
the primitive polynomial with <b>prim_poly</b>, give additional arguments with
<b>arg1</b> and <b>arg2</b> and give a base Galois Field for composite fields.
Finally, you can pass it a pointer to memory in <b>scratch_memory</b>. That
way, you can avoid having <b>gf_init_hard()</b> call <b>malloc()</b>.
<p>
There is a procedure called <b>gf_scratch_size()</b> that lets you know the minimum
size for <b>scratch_memory</b>, depending on <i>w</i>, the multiplication type
and the arguments:
<p><center><table border=3 cellpadding=3><td><pre>
extern int gf_scratch_size(int w,
                           int mult_type,
                           int region_type,
                           int divide_type,
                           int arg1,
                           int arg2);
</pre></td></table></center><p>
You can specify default arguments in <b>gf_init_hard()</b>:
<UL>
<LI> <b>region_type</b> = <b>GF_REGION_DEFAULT</b>
<LI> <b>divide_type</b> = <b>GF_DIVIDE_DEFAULT</b>
<LI> <b>prim_poly</b> = 0
<LI> <b>arg1</b> = 0
<LI> <b>arg2</b> = 0
<LI> <b>base_gf</b> = <b>NULL</b>
<LI> <b>scratch_memory</b> = <b>NULL</b>
</UL>
If any argument is equal to its default, then default actions are taken (e.g. a
standard primitive polynomial is used, or memory is allocated for <b>scratch_memory</b>).
In fact, <b>gf_init_easy()</b> simply calls <b>gf_init_hard()</b> with the default
parameters.
<p>
<b>gf_free()</b> frees memory that was allocated with <b>gf_init_easy()</b>
or <b>gf_init_hard()</b>. Its <b>recursive</b> parameter is for composite fields,
when you want to free the base fields recursively as well.
If you pass <b>scratch_memory</b> to <b>gf_init_hard()</b>, then you typically
don't need to call <b>gf_free()</b>. It won't hurt to call it, though.
<hr>
<h3>gf_mult and gf_div</h3>
For the moment, I have few things completely implemented, but that's because I want
to be able to explain the structure, and how to specify methods. In particular, for
<i>w=4</i>, I have implemented <b>SHIFT</b> and <b>LOG</b>. For <i>w=8, 16, 32, 64</i>
I have implemented <b>SHIFT</b>. For all <i>w &le; 32</i>, I have implemented both
Euclid's algorithm for inversion, and the matrix method for inversion. For
<i>w=64</i>, it's just Euclid. You can
test these all with <b>gf_mult</b> and <b>gf_div</b>. Here are a few calls:
<pre>
UNIX> <font color=darkred><b>gf_mult 7 11 4</b></font> - Default
4
UNIX> <font color=darkred><b>gf_mult 7 11 4 SHIFT - -</b></font> - Use shift
4
UNIX> <font color=darkred><b>gf_mult 7 11 4 LOG - -</b></font> - Use logs
4
UNIX> <font color=darkred><b>gf_div 4 7 4</b></font> - Default
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - -</b></font> - Use logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - EUCLID</b></font> - Use Euclid instead of logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - MATRIX</b></font> - Use Matrix inversion instead of logs
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - -</b></font> - Default
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - EUCLID</b></font> - Use Euclid (which is the default)
11
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - MATRIX</b></font> - Use Matrix inversion instead of Euclid
11
UNIX> <font color=darkred><b>gf_mult 200 211 8</b></font> - The remainder are shift/Euclid
201
UNIX> <font color=darkred><b>gf_div 201 211 8</b></font>
200
UNIX> <font color=darkred><b>gf_mult 60000 65111 16</b></font>
63515
UNIX> <font color=darkred><b>gf_div 63515 65111 16</b></font>
60000
UNIX> <font color=darkred><b>gf_mult abcd0001 9afbf788 32h</b></font>
b0359681
UNIX> <font color=darkred><b>gf_div b0359681 9afbf788 32h</b></font>
abcd0001
UNIX> <font color=darkred><b>gf_mult abcd00018c8b8c8a 9afbf7887f6d8e5b 64h</b></font>
3a7def35185bd571
UNIX> <font color=darkred><b>gf_div 3a7def35185bd571 9afbf7887f6d8e5b 64h</b></font>
abcd00018c8b8c8a
UNIX> <font color=darkred><b></b></font>
</pre>
You can see all the methods with <b>gf_methods</b>. We have a lot of implementing to do:
<pre>
UNIX> <font color=darkred><b>gf_methods</b></font>
To specify the methods, do one of the following:
- leave empty to use defaults
- use a single dash to use defaults
- specify MULTIPLY REGION DIVIDE
Legal values of MULTIPLY:
SHIFT: shift
GROUP g_mult g_reduce: the Group technique - see the paper
BYTWO_p: BYTWO doubling the product.
BYTWO_b: BYTWO doubling b (more efficient thatn BYTWO_p)
TABLE: Full multiplication table
LOG: Discrete logs
LOG_ZERO: Discrete logs with a large table for zeros
SPLIT g_a g_b: Split tables defined by g_a and g_b
COMPOSITE k l [METHOD]: Composite field, recursively specify the
method of the base field in GF(2^l)
Legal values of REGION: Specify multiples with commas e.g. 'DOUBLE,LAZY'
-: Use defaults
SINGLE/DOUBLE/QUAD: Expand tables
LAZY: Lazily create table (only applies to TABLE and SPLIT)
SSE/NOSSE: Use 128-bit SSE instructions if you can
CAUCHY/ALTMAP/STDMAP: Use different memory mappings
Legal values of DIVIDE:
-: Use defaults
MATRIX: Use matrix inversion
EUCLID: Use the extended Euclidian algorithm.
See the user's manual for more information.
There are many restrictions, so it is better to simply use defaults in most cases.
UNIX> <font color=darkred><b></b></font>
</pre>
<hr>
<h3>gf_unit and gf_time</h3>
<b><a href=gf_unit.c>gf_unit.c</a></b> is a unit tester, and
<b><a href=gf_time.c>gf_time.c</a></b> is a time tester.
They are called as follows:
<p><center><table border=3 cellpadding=3><td><pre>
UNIX> <font color=darkred><b>gf_unit w tests seed [METHOD] </b></font>
UNIX> <font color=darkred><b>gf_time w tests seed size(bytes) iterations [METHOD] </b></font>
</pre></td></table></center><p>
The <b>tests</b> parameter is one or more of the following characters:
<UL>
<LI> A: Do all tests
<LI> S: Test only single operations (multiplication/division)
<LI> R: Test only region operations
<LI> V: Verbose Output
</UL>
<b>seed</b> is a seed for <b>srand48()</b> -- using -1 defaults to the current time.
<p>
For example, testing <b>LOG</b> and <b>SHIFT</b> with w=4:
<pre>
UNIX> <font color=darkred><b>gf_unit 4 AV 1 LOG - -</b></font>
Seed: 1
Testing single multiplications/divisions.
Testing Inversions.
Testing buffer-constant, src != dest, xor = 0
Testing buffer-constant, src != dest, xor = 1
Testing buffer-constant, src == dest, xor = 0
Testing buffer-constant, src == dest, xor = 1
UNIX> <font color=darkred><b>gf_unit 4 AV 1 SHIFT - -</b></font>
Seed: 1
Testing single multiplications/divisions.
Testing Inversions.
No multiply_region.
UNIX> <font color=darkred><b></b></font>
</pre>
There is no <b>multiply_region()</b> method defined for <b>SHIFT</b>.
Thus, the procedures are <b>NULL</b> and the unit tester ignores them.
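<p>
The four region cases in the verbose output vary two things: whether the source and destination
buffers are the same, and whether the product overwrites the destination (xor = 0) or is XORed
into it (xor = 1). The sketch below illustrates those semantics for w=4, assuming two 4-bit words
packed into each byte and the polynomial x^4 + x + 1. The function name, its signature and the
packing are assumptions made for illustration; they are not taken from <b>gf_w4.c</b>.

```c
#include <stddef.h>
#include <stdint.h>

/* Shift-and-reduce multiply in GF(2^4), assuming polynomial 0x13. */
static uint8_t nib_mult(uint8_t a, uint8_t b)
{
    unsigned p = 0, aa = a;

    while (b) {
        if (b & 1) p ^= aa;
        aa <<= 1;
        if (aa & 0x10) aa ^= 0x13;
        b >>= 1;
    }
    return (uint8_t) p;
}

/* Hypothetical region multiply: dest = src * val when xor_flag is 0,
   dest ^= src * val when xor_flag is nonzero. src may equal dest, because
   each source byte is read before the matching destination byte is written. */
void sketch_multiply_region(const uint8_t *src, uint8_t *dest, uint8_t val,
                            size_t bytes, int xor_flag)
{
    size_t i;

    for (i = 0; i < bytes; i++) {
        uint8_t hi = nib_mult((uint8_t) (src[i] >> 4), val);
        uint8_t lo = nib_mult((uint8_t) (src[i] & 0xf), val);
        uint8_t prod = (uint8_t) ((hi << 4) | lo);

        dest[i] = xor_flag ? (uint8_t) (dest[i] ^ prod) : prod;
    }
}
```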
<p>
At the moment, I only have the unit tester working for w=4.
<p>
<b>gf_time</b> takes the size of an array (in bytes) and a number of iterations, and
tests the speed of both single and region operations. The tests are:
<UL>
<LI> A: All
<LI> S: All Single Operations
<LI> R: All Region Operations
<LI> M: Single: Multiplications
<LI> D: Single: Divisions
<LI> I: Single: Inverses
<LI> B: Region: Multiply_Region
</UL>
Here are some examples with <b>SHIFT</b> and <b>LOG</b> on my mac.
<pre>
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - -</b></font>
Seed: 1
Multiply: 0.538126 s 185.830 Mega-ops/s
Divide: 0.520825 s 192.003 Mega-ops/s
Inverse: 0.631198 s 158.429 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.478395 s 209.032 MB/s
Buffer-Const,s!=d,xor=1: 0.524245 s 190.751 MB/s
Buffer-Const,s==d,xor=0: 0.471851 s 211.931 MB/s
Buffer-Const,s==d,xor=1: 0.528275 s 189.295 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - EUCLID</b></font>
Seed: 1
Multiply: 0.555512 s 180.014 Mega-ops/s
Divide: 5.359434 s 18.659 Mega-ops/s
Inverse: 4.911719 s 20.359 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.496097 s 201.573 MB/s
Buffer-Const,s!=d,xor=1: 0.538536 s 185.689 MB/s
Buffer-Const,s==d,xor=0: 0.485564 s 205.946 MB/s
Buffer-Const,s==d,xor=1: 0.540227 s 185.107 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - MATRIX</b></font>
Seed: 1
Multiply: 0.544005 s 183.822 Mega-ops/s
Divide: 7.602822 s 13.153 Mega-ops/s
Inverse: 7.000564 s 14.285 Mega-ops/s
Buffer-Const,s!=d,xor=0: 0.474868 s 210.585 MB/s
Buffer-Const,s!=d,xor=1: 0.527588 s 189.542 MB/s
Buffer-Const,s==d,xor=0: 0.473130 s 211.358 MB/s
Buffer-Const,s==d,xor=1: 0.529877 s 188.723 MB/s
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 SHIFT - -</b></font>
Seed: 1
Multiply: 2.708842 s 36.916 Mega-ops/s
Divide: 8.756882 s 11.420 Mega-ops/s
Inverse: 5.695511 s 17.558 Mega-ops/s
UNIX> <font color=darkred><b></b></font>
</pre>
At the moment, I only have the timer working for w=4.
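<p>
The rates in these transcripts are consistent with counting size &times; iterations units of work
and treating "Mega" as 2^20: 102400 &times; 1024 is 100 &times; 2^20, and 100 / 0.538126 is about 185.83.
That reading is inferred from the numbers above, not from <b>gf_time.c</b>. A sketch of the
arithmetic, with a hypothetical function name:

```c
/* Rate in units of 2^20 per second, assuming size * iterations units of work. */
double sketch_mega_rate(long size_bytes, long iterations, double seconds)
{
    double units = (double) size_bytes * (double) iterations;

    return units / (1024.0 * 1024.0) / seconds;
}
```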
<hr>
<h3>Walking you through <b>LOG</b></h3>
To see how <b>scratch</b> is used to store data, let's look at what happens when
you call <b>gf_init_easy(&gf, 4, GF_MULT_LOG_TABLE);</b>
First, <b>gf_init_easy()</b> calls <b>gf_init_hard()</b> with default parameters.
This is in <b><a href=gf.c>gf.c</a></b>.
<p>
<b>gf_init_hard()</b>'s first job is to set up the scratch.
The scratch's type is <b>gf_internal_t</b>, defined in
<b><a href=gf_int.h>gf_int.h</a></b>:
<p><center><table border=3 cellpadding=3><td><pre>
typedef struct {
  int mult_type;
  int region_type;
  int divide_type;
  int w;
  uint64_t prim_poly;
  int free_me;
  int arg1;
  int arg2;
  gf_t *base_gf;
  void *private;
} gf_internal_t;
</pre></td></table></center><p>
All the fields are straightforward, with the exception of <b>private</b>. That is
a <b>(void *)</b> which points to the implementation's private data.
<p>
Here's the code for
<b>gf_init_hard()</b>:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_init_hard(gf_t *gf, int w, int mult_type,
                        int region_type,
                        int divide_type,
                        uint64_t prim_poly,
                        int arg1, int arg2,
                        gf_t *base_gf,
                        void *scratch_memory)
{
  int sz;
  gf_internal_t *h;

  if (scratch_memory == NULL) {
    sz = gf_scratch_size(w, mult_type, region_type, divide_type, arg1, arg2);
    if (sz &lt;= 0) return 0;
    h = (gf_internal_t *) malloc(sz);
    h-&gt;free_me = 1;
  } else {
    h = scratch_memory;
    h-&gt;free_me = 0;
  }
  gf-&gt;scratch = (void *) h;
  h-&gt;mult_type = mult_type;
  h-&gt;region_type = region_type;
  h-&gt;divide_type = divide_type;
  h-&gt;w = w;
  h-&gt;prim_poly = prim_poly;
  h-&gt;arg1 = arg1;
  h-&gt;arg2 = arg2;
  h-&gt;base_gf = base_gf;
  h-&gt;private = (void *) gf-&gt;scratch;
  h-&gt;private += (sizeof(gf_internal_t));

  switch(w) {
    case 4: return gf_w4_init(gf);
    case 8: return gf_w8_init(gf);
    case 16: return gf_w16_init(gf);
    case 32: return gf_w32_init(gf);
    case 64: return gf_w64_init(gf);
    case 128: return gf_dummy_init(gf);
    default: return 0;
  }
}
</pre></td></table></center><p>
The first thing it does is determine if it has to allocate space for <b>scratch</b>.
If it must, it uses <b>gf_scratch_size()</b> to figure out how big the space must be.
It then sets <b>gf->scratch</b> to this space, and sets all of the fields of the
scratch to the arguments in <b>gf_init_hard()</b>. The <b>private</b> pointer is
set to be the space just after the <b>gf_internal_t</b> header in <b>gf->scratch</b>. Again, it is up to
<b>gf_scratch_size()</b> to make sure there is enough space for the scratch, and
for all of the private data needed by the implementation.
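<p>
The layout described above, a header followed immediately by the implementation's private data in
one chunk, can be sketched on its own. The struct and function names below are hypothetical
stand-ins, not the definitions in <b>gf_int.h</b>. The sketch does its pointer arithmetic through a
<b>char *</b>, since arithmetic on a <b>void *</b> is a compiler extension, and it checks the result
of <b>malloc()</b>.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the header that lives at the start of scratch. */
typedef struct {
    int free_me;          /* 1 if setup allocated the chunk, 0 if the caller did */
    void *private_data;   /* points just past this header */
} sketch_header_t;

/* Total bytes needed: the header plus the implementation's private data. */
size_t sketch_scratch_size(size_t private_bytes)
{
    return sizeof(sketch_header_t) + private_bytes;
}

/* Set up the header in caller memory, or in a malloc()ed chunk if mem is NULL.
   Caller memory must be suitably aligned and at least sketch_scratch_size()
   bytes long. Returns NULL if allocation fails. */
sketch_header_t *sketch_setup(void *mem, size_t private_bytes)
{
    sketch_header_t *h;

    if (mem == NULL) {
        h = malloc(sketch_scratch_size(private_bytes));
        if (h == NULL) return NULL;
        h->free_me = 1;
    } else {
        h = mem;
        h->free_me = 0;
    }
    h->private_data = (char *) h + sizeof(sketch_header_t);
    return h;
}

/* Free only what sketch_setup() itself allocated. */
void sketch_teardown(sketch_header_t *h)
{
    if (h != NULL && h->free_me) free(h);
}
```

The <b>free_me</b> flag is what makes it harmless to call the teardown routine on caller-supplied memory: nothing is freed in that case.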
<p>
Once the scratch is set up, <b>gf_init_hard()</b> calls <b>gf_w4_init()</b>. This is
in <b><a href=gf_w4.c>gf_w4.c</a></b>, and it is a
simple dispatcher to the various initialization routines, plus it
sets <b>EUCLID</b> and <b>MATRIX</b> if need be:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_w4_init(gf_t *gf)
{
  gf_internal_t *h;

  h = (gf_internal_t *) gf-&gt;scratch;
  if (h-&gt;prim_poly == 0) h-&gt;prim_poly = 0x13;

  gf-&gt;multiply.w4 = NULL;
  gf-&gt;divide.w4 = NULL;
  gf-&gt;inverse.w4 = NULL;
  gf-&gt;multiply_region.w4 = NULL;

  switch(h-&gt;mult_type) {
    case GF_MULT_SHIFT:     if (gf_w4_shift_init(gf) == 0) return 0; break;
    case GF_MULT_LOG_TABLE: if (gf_w4_log_init(gf) == 0) return 0; break;
    case GF_MULT_DEFAULT:   if (gf_w4_log_init(gf) == 0) return 0; break;
    default: return 0;
  }
  if (h-&gt;divide_type == GF_DIVIDE_EUCLID) {
    gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
    gf-&gt;inverse.w4 = gf_w4_euclid;
  } else if (h-&gt;divide_type == GF_DIVIDE_MATRIX) {
    gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
    gf-&gt;inverse.w4 = gf_w4_matrix;
  }

  if (gf-&gt;inverse.w4 != NULL && gf-&gt;divide.w4 == NULL) {
    gf-&gt;divide.w4 = gf_w4_divide_from_inverse;
  }
  if (gf-&gt;inverse.w4 == NULL && gf-&gt;divide.w4 != NULL) {
    gf-&gt;inverse.w4 = gf_w4_inverse_from_divide;
  }
  return 1;
}
</pre></td></table></center><p>
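When <b>divide_type</b> is <b>EUCLID</b>, division is built from inversion. Here is a minimal
sketch of that idea for w=4: a binary-polynomial form of the extended Euclidean algorithm, plus a
divide that multiplies by the inverse. The function names are hypothetical and the code is not
taken from <b>gf_w4.c</b>, whose <b>gf_w4_euclid()</b> may well be organized differently.

```c
#include <stdint.h>

#define SKETCH_PRIM_POLY 0x13u   /* x^4 + x + 1, the w=4 default shown above */

/* Degree of a binary polynomial; -1 for the zero polynomial. */
static int poly_degree(unsigned v)
{
    int d = -1;

    while (v) { v >>= 1; d++; }
    return d;
}

/* Shift-and-reduce multiply in GF(2^4). */
static uint8_t sketch_mult(uint8_t a, uint8_t b)
{
    unsigned p = 0, aa = a;

    while (b) {
        if (b & 1) p ^= aa;
        aa <<= 1;
        if (aa & 0x10) aa ^= SKETCH_PRIM_POLY;
        b >>= 1;
    }
    return (uint8_t) p;
}

/* Inverse of a nonzero element by the extended Euclidean algorithm.
   Invariant: g1 * a == u and g2 * a == v, modulo the polynomial. */
uint8_t sketch_inverse_euclid(uint8_t a)
{
    unsigned u = a, v = SKETCH_PRIM_POLY, g1 = 1, g2 = 0, t;
    int j;

    if (a == 0) return 0;            /* zero has no inverse; 0 is a placeholder */
    while (u != 1) {
        j = poly_degree(u) - poly_degree(v);
        if (j < 0) {                 /* keep u as the higher-degree remainder */
            t = u; u = v; v = t;
            t = g1; g1 = g2; g2 = t;
            j = -j;
        }
        u ^= v << j;                 /* cancel the leading term of u */
        g1 ^= g2 << j;               /* apply the same step to the coefficient */
    }
    return (uint8_t) g1;
}

/* a / b computed as a * b^-1. */
uint8_t sketch_divide_from_inverse(uint8_t a, uint8_t b)
{
    return sketch_mult(a, sketch_inverse_euclid(b));
}
```

Each division costs an inversion plus a multiplication, which fits with <b>EUCLID</b> being a generic fallback rather than the fast path for table-based methods.
<p>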
The code in <b>gf_w4_log_init()</b> sets up the log and antilog tables, and sets
the <b>multiply.w4</b>, <b>divide.w4</b> etc routines to be the ones for logs. The
tables are put into the scratch's <b>private</b> area, which is typecast to a <b>struct
gf_logtable_data *</b>:
<p><center><table border=3 cellpadding=3><td><pre>
struct gf_logtable_data {
    gf_val_4_t      log_tbl[GF_FIELD_SIZE];
    gf_val_4_t      antilog_tbl[GF_FIELD_SIZE * 2];
    gf_val_4_t      *antilog_tbl_div;
};
.......

static
int gf_w4_log_init(gf_t *gf)
{
  gf_internal_t *h;
  struct gf_logtable_data *ltd;
  int i, b;

  h = (gf_internal_t *) gf-&gt;scratch;
  ltd = h-&gt;private;

  ltd-&gt;log_tbl[0] = 0;

  ltd-&gt;antilog_tbl_div = ltd-&gt;antilog_tbl + (GF_FIELD_SIZE-1);
  b = 1;
  for (i = 0; i &lt; GF_FIELD_SIZE-1; i++) {
      ltd-&gt;log_tbl[b] = (gf_val_8_t)i;
      ltd-&gt;antilog_tbl[i] = (gf_val_8_t)b;
      ltd-&gt;antilog_tbl[i+GF_FIELD_SIZE-1] = (gf_val_8_t)b;
      b &lt;&lt;= 1;
      if (b & GF_FIELD_SIZE) {
          b = b ^ h-&gt;prim_poly;
      }
  }

  gf-&gt;inverse.w4 = gf_w4_inverse_from_divide;
  gf-&gt;divide.w4 = gf_w4_log_divide;
  gf-&gt;multiply.w4 = gf_w4_log_multiply;
  gf-&gt;multiply_region.w4 = gf_w4_log_multiply_region;
  return 1;
}
</pre></td></table></center><p>
And of course the individual routines use <b>h->private</b> to access the tables:
<p><center><table border=3 cellpadding=3><td><pre>
static
inline
gf_val_8_t gf_w4_log_multiply (gf_t *gf, gf_val_8_t a, gf_val_8_t b)
{
  struct gf_logtable_data *ltd;

  ltd = (struct gf_logtable_data *) ((gf_internal_t *) (gf-&gt;scratch))-&gt;private;
  return (a == 0 || b == 0) ? 0 : ltd-&gt;antilog_tbl[(unsigned)(ltd-&gt;log_tbl[a] + ltd-&gt;log_tbl[b])];
}
</pre></td></table></center><p>
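The doubled antilog table and the offset pointer <b>antilog_tbl_div</b> let multiplication and
division skip the mod-15 reduction: for nonzero operands a sum of two logs lies in 0..28, and a
difference lies in -14..14. The sketch below rebuilds the w=4 tables as standalone globals and adds
a division routine in that style. The division routine is a guess at how <b>antilog_tbl_div</b>
might be used, since <b>gf_w4_log_divide()</b> is not shown here, and all of the names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_FIELD_SIZE 16

static uint8_t log_tbl[SKETCH_FIELD_SIZE];
static uint8_t antilog_tbl[SKETCH_FIELD_SIZE * 2];
static uint8_t *antilog_tbl_div;     /* antilog_tbl shifted by 15 entries */

/* Build the tables by repeated multiplication by x, assuming polynomial 0x13. */
void sketch_log_init(void)
{
    int i, b = 1;

    log_tbl[0] = 0;                  /* log of zero is undefined; never used */
    antilog_tbl_div = antilog_tbl + (SKETCH_FIELD_SIZE - 1);
    for (i = 0; i < SKETCH_FIELD_SIZE - 1; i++) {
        log_tbl[b] = (uint8_t) i;
        antilog_tbl[i] = (uint8_t) b;
        antilog_tbl[i + SKETCH_FIELD_SIZE - 1] = (uint8_t) b;   /* second copy */
        b <<= 1;
        if (b & SKETCH_FIELD_SIZE) b ^= 0x13;
    }
}

/* Sum of logs is at most 28, which the doubled table covers. */
uint8_t sketch_log_multiply(uint8_t a, uint8_t b)
{
    if (a == 0 || b == 0) return 0;
    return antilog_tbl[log_tbl[a] + log_tbl[b]];
}

/* Difference of logs is in -14..14; the offset pointer maps it to 1..29.
   Division by zero is undefined in the field; this sketch returns 0. */
uint8_t sketch_log_divide(uint8_t a, uint8_t b)
{
    if (a == 0 || b == 0) return 0;
    return antilog_tbl_div[(int) log_tbl[a] - (int) log_tbl[b]];
}
```

Trading 16 extra table bytes for the absence of a modulo in the inner routines is the design choice here.
<p>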
Finally, it's important that the proper sizes are put into
<b>gf_w4_scratch_size()</b> for each implementation:
<p><center><table border=3 cellpadding=3><td><pre>
int gf_w4_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2)
{
  int region_tbl_size;

  switch(mult_type)
  {
    case GF_MULT_DEFAULT:
    case GF_MULT_LOG_TABLE:
      return sizeof(gf_internal_t) + sizeof(struct gf_logtable_data) + 64;
      break;
    case GF_MULT_SHIFT:
      return sizeof(gf_internal_t);
      break;
    default:
      return -1;
  }
}
</pre></td></table></center><p>
I hope that's enough explanation for y'all to start implementing. Let me know if you have
problems -- thanks -- Jim
<hr>
The initial structure has been set for w=4, 8, 16, 32 and 64, with implementations of SHIFT and EUCLID, and for w <= 32, MATRIX. There are some weird caveats:
<UL>
<LI> For w=32 and w=64, the primitive polynomial does not have the leading one.
<LI> I'd like for naming to be:
<p>
<UL>
<b>gf_w</b><i>w</i><b>_</b><i>technique</i><b>_</b><i>functionality</i><b>()</b>.
</UL>
<p>
For example, the log techniques for w=4 are:
<pre>
gf_w4_log_multiply()
gf_w4_log_divide()
gf_w4_log_multiply_region()
gf_w4_log_init()
</pre>
<p>
<LI> I'd also like a header block on implementations that says who wrote it.
</UL>
<hr>
<h3>Things we need to Implement: <i>w=4</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Single TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Double TABLE, SSE </td> <td> Done - Jim </td> </tr>
<tr> <td> Quad TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> Lazy Quad TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=8</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim </td> </tr>
<tr> <td> Single TABLE </td> <td> Done - Kevin </td> </tr>
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Lazy Double TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> Split 2 1 (Half) SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Composite, k=2 </td> <td> Done - Kevin (alt mapping not passing unit test) </td> </tr>
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
<tr> <td> LOG ZERO</td> <td> Done - Jim</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=16</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Lazy TABLE </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 No-SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 16 SSE, lazy, alternate mapping </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 8 16, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
<tr> <td> LOG ZERO</td> <td> Done - Kevin </td> </tr>
<tr> <td> Group 4 4 </td> <td>Done - Jim: I don't see a reason to implement others, although 4-8 will be faster, and 8 8 will have faster region ops. They'll never beat SPLIT.</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=32</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 2 32,lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 2 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, SSE,ALTMAP lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 4 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
<tr> <td> Split 8 8 </td> <td>Done - Jim </td> </tr>
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=64</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
<tr> <td> Split 16 1 SSE, maybe lazy </td> <td> - </td> </tr>
<tr> <td> Split 8 1 lazy </td> <td> - </td> </tr>
<tr> <td> Split 8 8 </td> <td> - </td> </tr>
<tr> <td> Split 8 8 lazy </td> <td> - </td> </tr>
<tr> <td> Group </td> <td> - </td> </tr>
<tr> <td> Composite, k=2, alternate mapping </td> <td> - </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=128</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> SHIFT </td> <td> Done - Will </td> </tr>
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
<tr> <td> Split 32 1 SSE, maybe lazy </td> <td> - </td> </tr>
<tr> <td> Split 16 1 lazy </td> <td> - </td> </tr>
<tr> <td> Split 16 16 - Maybe that's insanity</td> <td> - </td> </tr>
<tr> <td> Split 16 16 lazy </td> <td> - </td> </tr>
<tr> <td> Group (SSE) </td> <td> - </td> </tr>
<tr> <td> Composite, k=?, alternate mapping </td> <td> - </td> </tr>
</table><p>
<hr>
<h3>Things we need to Implement: <i>w=general between 1 & 32</i></h3>
<p><table border=3 cellpadding=2>
<tr> <td> CAUCHY Region (SSE XOR)</td> <td> Done - Jim </td> </tr>
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
<tr> <td> TABLE </td> <td> Done - Jim </td> </tr>
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
<tr> <td> Split - do we need it?</td> <td>Done - Jim</td></tr>
<tr> <td> Composite - do we need it?</td> <td> - </td></tr>
<tr> <td> Split - do we need it?</td> <td> - </td></tr>
<tr> <td> Logzero?</td> <td> - </td></tr>
</table><p>