<h3>Code structure as of 7/20/2012</h3>
written by Jim.
<p>
|
|
Ok -- once again, I have messed with the structure. My goal is a structure that is both flexible and efficient.
It's similar to the earlier version, but better because it makes things like Euclid's
method much cleaner.
|
|
<p>
|
|
I think we're ready to hack.
|
|
<p>
|
|
<p>
|
|
<hr>
|
|
<h3>Files</h3>
|
|
<UL>
|
|
<LI> <a href=GNUmakefile><b>GNUmakefile</b></a>: Makefile
|
|
<LI> <a href=README><b>README</b></a>: Empty readme
|
|
<LI> <a href=explanation.html><b>explanation.html</b></a>: This file.
|
|
<LI> <a href=gf.c><b>gf.c</b></a>: Main gf routines
|
|
<LI> <a href=gf.h><b>gf.h</b></a>: Main gf prototypes and typedefs
|
|
<LI> <a href=gf_int.h><b>gf_int.h</b></a>: Prototypes and typedefs for common routines for the
|
|
internal gf implementations.
|
|
<LI> <a href=gf_method.c><b>gf_method.c</b></a>: Code to help parse argc/argv to define the method.
|
|
This way, various programs can be consistent with how they handle the command line.
|
|
<LI> <a href=gf_method.h><b>gf_method.h</b></a>: Prototypes for ibid.
|
|
<LI> <a href=gf_methods.c><b>gf_methods.c</b></a>: This program prints out how to define
|
|
the various methods on the command line. My idea is to beef this up so that you can
|
|
give it a method spec on the command line, and it will tell you whether it's valid, or
|
|
why it's invalid. I haven't written that part yet.
|
|
<LI> <a href=gf_mult.c><b>gf_mult.c</b></a>: Program to do single multiplication.
|
|
<LI> <a href=gf_div.c><b>gf_div.c</b></a>: Program to do single divisions -- it's created
|
|
in the makefile with a sed script on gf_mult.c.
|
|
<LI> <a href=gf_time.c><b>gf_time.c</b></a>: Time tester
|
|
<LI> <a href=gf_unit.c><b>gf_unit.c</b></a>: Unit tester
|
|
<LI> <a href=gf_54.c><b>gf_54.c</b></a>: A simple example program that multiplies
|
|
5 and 4 in GF(2^4).
|
|
<LI> <a href=gf_w4.c><b>gf_w4.c</b></a>: Implementation of code for <i>w</i> = 4.
|
|
(For now, only SHIFT and LOG, plus EUCLID & MATRIX).
|
|
<LI> <a href=gf_w8.c><b>gf_w8.c</b></a>: Implementation of code for <i>w</i> = 8.
|
|
(For now, only SHIFT plus EUCLID & MATRIX).
|
|
<LI> <a href=gf_w16.c><b>gf_w16.c</b></a>: Implementation of code for <i>w</i> = 16.
|
|
(For now, only SHIFT plus EUCLID & MATRIX).
|
|
<LI> <a href=gf_w32.c><b>gf_w32.c</b></a>: Implementation of code for <i>w</i> = 32.
|
|
(For now, only SHIFT plus EUCLID & MATRIX).
|
|
<LI> <a href=gf_w64.c><b>gf_w64.c</b></a>: Implementation of code for <i>w</i> = 64.
|
|
(For now, only SHIFT and EUCLID).
|
|
<LI> I don't have gf_w128.c or gf_gen.c yet.
|
|
</UL>
|
|
|
|
<hr>
|
|
<h3>Prototypes and typedefs in gf.h</h3>
|
|
|
|
The main structure that users will see is in <b>gf.h</b>, and it is of type
|
|
<b>gf_t</b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
typedef struct gf {
|
|
gf_func_a_b multiply;
|
|
gf_func_a_b divide;
|
|
gf_func_a inverse;
|
|
gf_region multiply_region;
|
|
void *scratch;
|
|
} gf_t;
|
|
</pre></td></table></center><p>
|
|
|
|
We can beef it up later with buf-buf or buf-acc. The problem is that the paper is
|
|
already bloated, so right now, I want to keep it lean.
|
|
<p>
|
|
The types of the procedures are big unions, so that they work with the following
|
|
types of arguments:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
typedef uint8_t gf_val_4_t;
|
|
typedef uint8_t gf_val_8_t;
|
|
typedef uint16_t gf_val_16_t;
|
|
typedef uint32_t gf_val_32_t;
|
|
typedef uint64_t gf_val_64_t;
|
|
typedef uint64_t *gf_val_128_t;
|
|
typedef uint32_t gf_val_gen_t; /* The intent here is for general values <= 32 */
|
|
</pre></td></table></center><p>
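As a rough illustration of the "big union" idea, here is a minimal sketch with one function-pointer member per word size. The names <b>sketch_func_a_b</b>, <b>sketch_gf_t</b>, <b>sketch_w4_multiply</b> and <b>sketch_init_w4</b> are hypothetical and only two members are shown; the real declarations are in <b>gf.h</b> and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_gf;   /* forward declaration so the function types can refer to it */

/* Hypothetical union: one function-pointer member per word size. */
typedef union {
    uint8_t  (*w4)(struct sketch_gf *gf, uint8_t a, uint8_t b);
    uint32_t (*w32)(struct sketch_gf *gf, uint32_t a, uint32_t b);
} sketch_func_a_b;

typedef struct sketch_gf {
    sketch_func_a_b multiply;
    void *scratch;
} sketch_gf_t;

/* Bit-by-bit multiply in GF(2^4), assuming the polynomial x^4 + x + 1 (0x13). */
uint8_t sketch_w4_multiply(sketch_gf_t *gf, uint8_t a, uint8_t b)
{
    uint8_t product = 0;

    (void) gf;                       /* this toy version keeps no state */
    assert(a < 16 && b < 16);
    while (b != 0) {
        if (b & 1) product ^= a;     /* add a for each set bit of b */
        b >>= 1;
        a <<= 1;                     /* multiply a by x ... */
        if (a & 0x10) a ^= 0x13;     /* ... and reduce when it overflows 4 bits */
    }
    return product;
}

/* Fill in the member that matches the word size. */
void sketch_init_w4(sketch_gf_t *gf)
{
    gf->multiply.w4 = sketch_w4_multiply;
    gf->scratch = NULL;
}
```

The caller picks the union member that matches <i>w</i>, for example <b>gf.multiply.w4(&gf, 5, 4)</b>, so a single structure can serve every word size without casts.
<p>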
|
|
|
|
To use one of these, you need to create one with <b>gf_init_easy()</b> or
|
|
<b>gf_init_hard()</b>. Let's concentrate on the former:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
extern int gf_init_easy(gf_t *gf, int w, int mult_type);
|
|
</pre></td></table></center><p>
|
|
|
|
You pass it memory for a <b>gf_t</b>, a value of <b>w</b> and
|
|
a variable that says how to do multiplication. The valid values of <b>mult_type</b>
|
|
are enumerated in <b>gf.h</b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
typedef enum {GF_MULT_DEFAULT,
|
|
GF_MULT_SHIFT,
|
|
GF_MULT_GROUP,
|
|
GF_MULT_BYTWO_p,
|
|
GF_MULT_BYTWO_b,
|
|
GF_MULT_TABLE,
|
|
GF_MULT_LOG_TABLE,
|
|
GF_MULT_SPLIT_TABLE,
|
|
GF_MULT_COMPOSITE } gf_mult_type_t;
|
|
</pre></td></table></center><p>
|
|
|
|
After creating the <b>gf_t</b>, you use its <b>multiply</b> method
|
|
to multiply, using the union's fields to work with the various types.
|
|
It looks easier than my explanation. For example, suppose you wanted to multiply 5 and 4 in <i>GF(2<sup>4</sup>)</i>.
|
|
You can do it as in
|
|
<b><a href=gf_54.c>gf_54.c</a></b>
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include "gf.h"

int main()
|
|
{
|
|
gf_t gf;
|
|
|
|
gf_init_easy(&gf, 4, GF_MULT_DEFAULT);
|
|
printf("%d\n", gf.multiply.w4(&gf, 5, 4));
|
|
exit(0);
|
|
}
|
|
</pre></td></table></center><p>
|
|
|
|
|
|
If you wanted to multiply in <i>GF(2<sup>8</sup>)</i>, then you'd have to use 8 as a parameter
|
|
to <b>gf_init_easy</b>, and call the multiplier as <b>gf.multiply.w8()</b>.
|
|
<p>
|
|
When you're done with your <b>gf_t</b>, you should call <b>gf_free()</b> on it so
|
|
that it can free memory that it has allocated. We'll talk more about memory later, but if you
|
|
create your <b>gf_t</b> with <b>gf_init_easy</b>, then it calls <b>malloc()</b>, and
|
|
if you care about freeing memory, you'll have to call <b>gf_free()</b>.
|
|
<p>
|
|
|
|
<hr>
|
|
<h3>Memory allocation</h3>
|
|
|
|
Each implementation of a multiplication technique keeps around its
|
|
own data. For example, <b>GF_MULT_TABLE</b> keeps around
|
|
multiplication and division tables, and <b>GF_MULT_LOG_TABLE</b> maintains log and
|
|
antilog tables. This data is stored in the pointer <b>scratch</b>. My intent
|
|
is that the memory that is there is all that's required. In other
|
|
words, the <b>multiply()</b>, <b>divide()</b>, <b>inverse()</b> and
|
|
<b>multiply_region()</b> calls don't do any memory allocation.
|
|
Moreover, <b>gf_init_easy()</b> only allocates one chunk of memory --
|
|
the one in <b>scratch</b>.
|
|
<p>
|
|
If you don't want to have the initialization call allocate memory, you can use <b>gf_init_hard()</b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
extern int gf_init_hard(gf_t *gf,
|
|
int w,
|
|
int mult_type,
|
|
int region_type,
|
|
int divide_type,
|
|
uint64_t prim_poly,
|
|
int arg1,
|
|
int arg2,
|
|
gf_t *base_gf,
|
|
void *scratch_memory);
|
|
</pre></td></table></center><p>
|
|
|
|
The first three parameters are the same as <b>gf_init_easy()</b>.
|
|
You can add additional arguments for performing <b>multiply_region</b>, and
|
|
for performing division in the <b>region_type</b> and <b>divide_type</b>
|
|
arguments. Their values are also defined in <b>gf.h</b>. You can
|
|
mix the <b>region_type</b> values (e.g. "DOUBLE" and "SSE"):
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
#define GF_REGION_DEFAULT (0x0)
|
|
#define GF_REGION_SINGLE_TABLE (0x1)
|
|
#define GF_REGION_DOUBLE_TABLE (0x2)
|
|
#define GF_REGION_QUAD_TABLE (0x4)
|
|
#define GF_REGION_LAZY (0x8)
|
|
#define GF_REGION_SSE (0x10)
|
|
#define GF_REGION_NOSSE (0x20)
|
|
#define GF_REGION_STDMAP (0x40)
|
|
#define GF_REGION_ALTMAP (0x80)
|
|
#define GF_REGION_CAUCHY (0x100)
|
|
|
|
typedef uint32_t gf_region_type_t;
|
|
|
|
typedef enum { GF_DIVIDE_DEFAULT,
|
|
GF_DIVIDE_MATRIX,
|
|
GF_DIVIDE_EUCLID } gf_division_type_t;
|
|
</pre></td></table></center><p>
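Because the <b>region_type</b> values are distinct bits, they combine with bitwise OR. The sketch below repeats a few of the definitions above so that it stands alone. The consistency check <b>sketch_region_type_ok()</b> is hypothetical, and the assumption that SSE/NOSSE and STDMAP/ALTMAP are mutually exclusive pairs is mine, not something the library is known to enforce this way.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the definitions above so the sketch is self-contained. */
#define GF_REGION_DEFAULT      (0x0)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_SSE          (0x10)
#define GF_REGION_NOSSE        (0x20)
#define GF_REGION_STDMAP       (0x40)
#define GF_REGION_ALTMAP       (0x80)

typedef uint32_t gf_region_type_t;

/* "DOUBLE" and "SSE" mixed into a single region_type value. */
gf_region_type_t sketch_double_sse(void)
{
    return GF_REGION_DOUBLE_TABLE | GF_REGION_SSE;
}

/* Hypothetical check: returns 1 if no assumed-exclusive pair is set together. */
int sketch_region_type_ok(gf_region_type_t rt)
{
    assert((rt & ~(gf_region_type_t) 0x1ff) == 0);   /* only the nine defined bits */
    if ((rt & GF_REGION_SSE) && (rt & GF_REGION_NOSSE)) return 0;
    if ((rt & GF_REGION_STDMAP) && (rt & GF_REGION_ALTMAP)) return 0;
    return 1;
}
```

A value built this way is what would be passed as the <b>region_type</b> argument of <b>gf_init_hard()</b>.
<p>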
|
|
You can change
|
|
the primitive polynomial with <b>prim_poly</b>, give additional arguments with
|
|
<b>arg1</b> and <b>arg2</b> and give a base Galois Field for composite fields.
|
|
Finally, you can pass it a pointer to memory in <b>scratch_memory</b>. That
|
|
way, you can avoid having <b>gf_init_hard()</b> call <b>malloc()</b>.
|
|
<p>
|
|
There is a procedure called <b>gf_scratch_size()</b> that lets you know the minimum
|
|
size for <b>scratch_memory</b>, depending on <i>w</i>, the multiplication type
|
|
and the arguments:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
extern int gf_scratch_size(int w,
|
|
int mult_type,
|
|
int region_type,
|
|
int divide_type,
|
|
int arg1,
|
|
int arg2);
|
|
</pre></td></table></center><p>
|
|
|
|
You can specify default arguments in <b>gf_init_hard()</b>:
|
|
<UL>
|
|
<LI> <b>region_type</b> = <b>GF_REGION_DEFAULT</b>
|
|
<LI> <b>divide_type</b> = <b>GF_DIVIDE_DEFAULT</b>
|
|
<LI> <b>prim_poly</b> = 0
|
|
<LI> <b>arg1</b> = 0
|
|
<LI> <b>arg2</b> = 0
|
|
<LI> <b>base_gf</b> = <b>NULL</b>
|
|
<LI> <b>scratch_memory</b> = <b>NULL</b>
|
|
</UL>
|
|
If any argument is equal to its default, then default actions are taken (e.g. a
|
|
standard primitive polynomial is used, or memory is allocated for <b>scratch_memory</b>).
|
|
In fact, <b>gf_init_easy()</b> simply calls <b>gf_init_hard()</b> with the default
|
|
parameters.
|
|
<p>
|
|
<b>gf_free()</b> frees memory that was allocated with <b>gf_init_easy()</b>
|
|
or <b>gf_init_hard()</b>. The <b>recursive</b> parameter is in case you
|
|
use composite fields, and want to recursively free the base fields.
|
|
If you pass <b>scratch_memory</b> to <b>gf_init_hard()</b>, then you typically
|
|
don't need to call <b>gf_free()</b>. It won't hurt to call it, though.
|
|
|
|
<hr>
|
|
<h3>gf_mult and gf_div</h3>
|
|
|
|
For the moment, I have few things completely implemented, but that's because I want
|
|
to be able to explain the structure, and how to specify methods. In particular, for
|
|
<i>w=4</i>, I have implemented <b>SHIFT</b> and <b>LOG</b>. For <i>w=8, 16, 32, 64</i>
|
|
I have implemented <b>SHIFT</b>. For all <i>w ≤ 32</i>, I have implemented both
|
|
Euclid's algorithm for inversion, and the matrix method for inversion. For
|
|
<i>w=64</i>, it's just Euclid. You can
|
|
test these all with <b>gf_mult</b> and <b>gf_div</b>. Here are a few calls:
|
|
|
|
<pre>
|
|
UNIX> <font color=darkred><b>gf_mult 7 11 4</b></font> - Default
|
|
4
|
|
UNIX> <font color=darkred><b>gf_mult 7 11 4 SHIFT - -</b></font> - Use shift
|
|
4
|
|
UNIX> <font color=darkred><b>gf_mult 7 11 4 LOG - -</b></font> - Use logs
|
|
4
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4</b></font> - Default
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - -</b></font> - Use logs
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - EUCLID</b></font> - Use Euclid instead of logs
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 LOG - MATRIX</b></font> - Use Matrix inversion instead of logs
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - -</b></font> - Default
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - EUCLID</b></font> - Use Euclid (which is the default)
|
|
11
|
|
UNIX> <font color=darkred><b>gf_div 4 7 4 SHIFT - MATRIX</b></font> - Use Matrix inversion instead of Euclid
|
|
11
|
|
UNIX> <font color=darkred><b>gf_mult 200 211 8</b></font> - The remainder are shift/Euclid
|
|
201
|
|
UNIX> <font color=darkred><b>gf_div 201 211 8</b></font>
|
|
200
|
|
UNIX> <font color=darkred><b>gf_mult 60000 65111 16</b></font>
|
|
63515
|
|
UNIX> <font color=darkred><b>gf_div 63515 65111 16</b></font>
|
|
60000
|
|
UNIX> <font color=darkred><b>gf_mult abcd0001 9afbf788 32h</b></font>
|
|
b0359681
|
|
UNIX> <font color=darkred><b>gf_div b0359681 9afbf788 32h</b></font>
|
|
abcd0001
|
|
UNIX> <font color=darkred><b>gf_mult abcd00018c8b8c8a 9afbf7887f6d8e5b 64h</b></font>
|
|
3a7def35185bd571
|
|
UNIX> <font color=darkred><b>gf_div 3a7def35185bd571 9afbf7887f6d8e5b 64h</b></font>
|
|
abcd00018c8b8c8a
|
|
UNIX> <font color=darkred><b></b></font>
|
|
</pre>
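To make the first calls above concrete, here is a minimal sketch of a shift-style multiply: a carry-less product followed by reduction modulo the primitive polynomial. The function name <b>sketch_shift_multiply()</b> is hypothetical, the polynomial is passed with its leading term (0x13 for <i>w</i>=4, as in <b>gf_w4_init()</b>), and the sketch is limited to <i>w</i> &le; 16 so everything fits in 32 bits. It is not the code in <b>gf_w4.c</b>.

```c
#include <assert.h>
#include <stdint.h>

/* prim_poly includes the leading term, e.g. 0x13 for x^4 + x + 1. */
uint32_t sketch_shift_multiply(uint32_t a, uint32_t b, int w, uint32_t prim_poly)
{
    uint32_t product = 0;
    int i;

    assert(w >= 1 && w <= 16);
    assert(a < (1u << w) && b < (1u << w));

    /* Carry-less multiplication: XOR in a shifted copy of a for each set bit of b. */
    for (i = 0; i < w; i++) {
        if (b & (1u << i)) product ^= a << i;
    }

    /* Reduce the (2w-1)-bit product modulo prim_poly, highest bit first. */
    for (i = 2 * w - 2; i >= w; i--) {
        if (product & (1u << i)) product ^= prim_poly << (i - w);
    }
    return product;
}
```

With <i>w</i>=4 and polynomial 0x13, multiplying 7 by 11 gives 4, which matches the first <b>gf_mult</b> call in the transcript.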
|
|
|
|
You can see all the methods with <b>gf_methods</b>. We have a lot of implementing to do:
|
|
|
|
<pre>
|
|
UNIX> <font color=darkred><b>gf_methods</b></font>
|
|
To specify the methods, do one of the following:
|
|
- leave empty to use defaults
|
|
- use a single dash to use defaults
|
|
- specify MULTIPLY REGION DIVIDE
|
|
|
|
Legal values of MULTIPLY:
|
|
SHIFT: shift
|
|
GROUP g_mult g_reduce: the Group technique - see the paper
|
|
BYTWO_p: BYTWO doubling the product.
|
|
BYTWO_b: BYTWO doubling b (more efficient thatn BYTWO_p)
|
|
TABLE: Full multiplication table
|
|
LOG: Discrete logs
|
|
LOG_ZERO: Discrete logs with a large table for zeros
|
|
SPLIT g_a g_b: Split tables defined by g_a and g_b
|
|
COMPOSITE k l [METHOD]: Composite field, recursively specify the
|
|
method of the base field in GF(2^l)
|
|
|
|
Legal values of REGION: Specify multiples with commas e.g. 'DOUBLE,LAZY'
|
|
-: Use defaults
|
|
SINGLE/DOUBLE/QUAD: Expand tables
|
|
LAZY: Lazily create table (only applies to TABLE and SPLIT)
|
|
SSE/NOSSE: Use 128-bit SSE instructions if you can
|
|
CAUCHY/ALTMAP/STDMAP: Use different memory mappings
|
|
|
|
Legal values of DIVIDE:
|
|
-: Use defaults
|
|
MATRIX: Use matrix inversion
|
|
EUCLID: Use the extended Euclidian algorithm.
|
|
|
|
See the user's manual for more information.
|
|
There are many restrictions, so it is better to simply use defaults in most cases.
|
|
UNIX> <font color=darkred><b></b></font>
|
|
</pre>
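A minimal sketch of how the comma-separated REGION argument could be turned into a <b>region_type</b> bit mask. The function <b>sketch_parse_region()</b> and the mapping from the command-line words to the <b>GF_REGION_*</b> constants are assumptions for illustration; the real parsing lives in <b>gf_method.c</b> and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Values repeated from gf.h so the sketch stands alone. */
#define GF_REGION_DEFAULT      (0x0)
#define GF_REGION_SINGLE_TABLE (0x1)
#define GF_REGION_DOUBLE_TABLE (0x2)
#define GF_REGION_QUAD_TABLE   (0x4)
#define GF_REGION_LAZY         (0x8)
#define GF_REGION_SSE          (0x10)
#define GF_REGION_NOSSE        (0x20)
#define GF_REGION_STDMAP       (0x40)
#define GF_REGION_ALTMAP       (0x80)
#define GF_REGION_CAUCHY       (0x100)

/* Returns the OR of the named flags, or -1 if a word is not recognized. */
int sketch_parse_region(const char *spec)
{
    static const struct { const char *name; int flag; } names[] = {
        { "SINGLE", GF_REGION_SINGLE_TABLE }, { "DOUBLE", GF_REGION_DOUBLE_TABLE },
        { "QUAD",   GF_REGION_QUAD_TABLE },   { "LAZY",   GF_REGION_LAZY },
        { "SSE",    GF_REGION_SSE },          { "NOSSE",  GF_REGION_NOSSE },
        { "STDMAP", GF_REGION_STDMAP },       { "ALTMAP", GF_REGION_ALTMAP },
        { "CAUCHY", GF_REGION_CAUCHY }
    };
    const size_t n = sizeof(names) / sizeof(names[0]);
    const char *p = spec;
    int flags = 0;

    assert(spec != NULL);
    if (strcmp(spec, "-") == 0) return GF_REGION_DEFAULT;   /* single dash: defaults */

    while (*p != '\0') {
        size_t len = strcspn(p, ",");      /* length of the next word */
        size_t i;
        for (i = 0; i < n; i++) {
            if (strlen(names[i].name) == len && strncmp(p, names[i].name, len) == 0) break;
        }
        if (i == n) return -1;             /* unknown or empty word */
        flags |= names[i].flag;
        p += len;
        if (*p == ',') p++;                /* skip the separator */
    }
    return flags;
}
```

The input string is never modified, so the function can be called directly on an <b>argv</b> entry.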
|
|
|
|
<hr>
|
|
<h3>gf_unit and gf_time</h3>
|
|
|
|
<b><a href=gf_unit.c>gf_unit.c</a></b> is a unit tester, and
|
|
<b><a href=gf_time.c>gf_time.c</a></b> is a time tester.
|
|
|
|
They are called as follows:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
UNIX> <font color=darkred><b>gf_unit w tests seed [METHOD] </b></font>
|
|
UNIX> <font color=darkred><b>gf_time w tests seed size(bytes) iterations [METHOD] </b></font>
|
|
</pre></td></table></center><p>
|
|
|
|
The <b>tests</b> parameter is one or more of the following characters:
|
|
|
|
<UL>
|
|
<LI> A: Do all tests
|
|
<LI> S: Test only single operations (multiplication/division)
|
|
<LI> R: Test only region operations
|
|
<LI> V: Verbose Output
|
|
</UL>
|
|
|
|
<b>seed</b> is a seed for <b>srand48()</b> -- using -1 defaults to the current time.
|
|
<p>
|
|
For example, testing <b>LOG</b> and <b>SHIFT</b> with w=4:
|
|
|
|
<pre>
|
|
UNIX> <font color=darkred><b>gf_unit 4 AV 1 LOG - -</b></font>
|
|
Seed: 1
|
|
Testing single multiplications/divisions.
|
|
Testing Inversions.
|
|
Testing buffer-constant, src != dest, xor = 0
|
|
Testing buffer-constant, src != dest, xor = 1
|
|
Testing buffer-constant, src == dest, xor = 0
|
|
Testing buffer-constant, src == dest, xor = 1
|
|
UNIX> <font color=darkred><b>gf_unit 4 AV 1 SHIFT - -</b></font>
|
|
Seed: 1
|
|
Testing single multiplications/divisions.
|
|
Testing Inversions.
|
|
No multiply_region.
|
|
UNIX> <font color=darkred><b></b></font>
|
|
</pre>
|
|
|
|
There is no <b>multiply_region()</b> method defined for <b>SHIFT</b>.
|
|
Thus, the procedures are <b>NULL</b> and the unit tester ignores them.
|
|
<p>
|
|
At the moment, I only have the unit tester working for w=4.
|
|
<p>
|
|
<b>gf_time</b> takes the size of an array (in bytes) and a number of iterations, and
|
|
tests the speed of both single and region operations. The tests are:
|
|
|
|
<UL>
|
|
<LI> A: All
|
|
<LI> S: All Single Operations
|
|
<LI> R: All Region Operations
|
|
<LI> M: Single: Multiplications
|
|
<LI> D: Single: Divisions
|
|
<LI> I: Single: Inverses
|
|
<LI> B: Region: Multiply_Region
|
|
</UL>
|
|
|
|
Here are some examples with <b>SHIFT</b> and <b>LOG</b> on my mac.
|
|
|
|
<pre>
|
|
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - -</b></font>
|
|
Seed: 1
|
|
Multiply: 0.538126 s 185.830 Mega-ops/s
|
|
Divide: 0.520825 s 192.003 Mega-ops/s
|
|
Inverse: 0.631198 s 158.429 Mega-ops/s
|
|
Buffer-Const,s!=d,xor=0: 0.478395 s 209.032 MB/s
|
|
Buffer-Const,s!=d,xor=1: 0.524245 s 190.751 MB/s
|
|
Buffer-Const,s==d,xor=0: 0.471851 s 211.931 MB/s
|
|
Buffer-Const,s==d,xor=1: 0.528275 s 189.295 MB/s
|
|
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - EUCLID</b></font>
|
|
Seed: 1
|
|
Multiply: 0.555512 s 180.014 Mega-ops/s
|
|
Divide: 5.359434 s 18.659 Mega-ops/s
|
|
Inverse: 4.911719 s 20.359 Mega-ops/s
|
|
Buffer-Const,s!=d,xor=0: 0.496097 s 201.573 MB/s
|
|
Buffer-Const,s!=d,xor=1: 0.538536 s 185.689 MB/s
|
|
Buffer-Const,s==d,xor=0: 0.485564 s 205.946 MB/s
|
|
Buffer-Const,s==d,xor=1: 0.540227 s 185.107 MB/s
|
|
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 LOG - MATRIX</b></font>
|
|
Seed: 1
|
|
Multiply: 0.544005 s 183.822 Mega-ops/s
|
|
Divide: 7.602822 s 13.153 Mega-ops/s
|
|
Inverse: 7.000564 s 14.285 Mega-ops/s
|
|
Buffer-Const,s!=d,xor=0: 0.474868 s 210.585 MB/s
|
|
Buffer-Const,s!=d,xor=1: 0.527588 s 189.542 MB/s
|
|
Buffer-Const,s==d,xor=0: 0.473130 s 211.358 MB/s
|
|
Buffer-Const,s==d,xor=1: 0.529877 s 188.723 MB/s
|
|
UNIX> <font color=darkred><b>gf_time 4 A 1 102400 1024 SHIFT - -</b></font>
|
|
Seed: 1
|
|
Multiply: 2.708842 s 36.916 Mega-ops/s
|
|
Divide: 8.756882 s 11.420 Mega-ops/s
|
|
Inverse: 5.695511 s 17.558 Mega-ops/s
|
|
UNIX> <font color=darkred><b></b></font>
|
|
</pre>
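The rates in the transcript appear to follow from the arguments: 102400 bytes times 1024 iterations is 100 &times; 2<sup>20</sup>, and dividing 100 by the elapsed seconds reproduces the printed figures. The sketch below encodes that reading. It is an inference from the numbers above, not taken from <b>gf_time.c</b>, and the function name <b>sketch_mega_per_second()</b> is hypothetical.

```c
#include <assert.h>

/* Assumption: "Mega" means 2^20, and the work done is size * iterations. */
double sketch_mega_per_second(long size_bytes, long iterations, double seconds)
{
    assert(size_bytes > 0 && iterations > 0);
    assert(seconds > 0.0);
    return ((double) size_bytes * (double) iterations) / (1024.0 * 1024.0) / seconds;
}
```

For the first run, 100 / 0.538126 is about 185.83, in line with the Multiply row; for the first region row, 100 / 0.478395 is about 209.03.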
|
|
|
|
At the moment, I only have the timer working for w=4.
|
|
|
|
<hr>
|
|
<h3>Walking you through <b>LOG</b></h3>
|
|
|
|
To see how <b>scratch</b> is used to store data, let's look at what happens when
|
|
you call <b>gf_init_easy(&gf, 4, GF_MULT_LOG_TABLE);</b>
|
|
First, <b>gf_init_easy()</b> calls <b>gf_init_hard()</b> with default parameters.
|
|
This is in <b><a href=gf.c>gf.c</a></b>.
|
|
<p>
|
|
<b>gf_init_hard()</b>'s first job is to set up the scratch.
|
|
The scratch's type is <b>gf_internal_t</b>, defined in
|
|
<b><a href=gf_int.h>gf_int.h</a></b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
typedef struct {
|
|
int mult_type;
|
|
int region_type;
|
|
int divide_type;
|
|
int w;
|
|
uint64_t prim_poly;
|
|
int free_me;
|
|
int arg1;
|
|
int arg2;
|
|
gf_t *base_gf;
|
|
void *private;
|
|
} gf_internal_t;
|
|
</pre></td></table></center><p>
|
|
|
|
All the fields are straightforward, with the exception of <b>private</b>. That is
|
|
a <b>(void *)</b> which points to the implementation's private data.
|
|
<p>
|
|
Here's the code for
|
|
<b>gf_init_hard()</b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
int gf_init_hard(gf_t *gf, int w, int mult_type,
|
|
int region_type,
|
|
int divide_type,
|
|
uint64_t prim_poly,
|
|
int arg1, int arg2,
|
|
gf_t *base_gf,
|
|
void *scratch_memory)
|
|
{
|
|
int sz;
|
|
gf_internal_t *h;
|
|
|
|
|
|
if (scratch_memory == NULL) {
|
|
sz = gf_scratch_size(w, mult_type, region_type, divide_type, arg1, arg2);
|
|
if (sz <= 0) return 0;
|
|
h = (gf_internal_t *) malloc(sz);
if (h == NULL) return 0;   /* allocation failed */
h->free_me = 1;
|
|
} else {
|
|
h = scratch_memory;
|
|
h->free_me = 0;
|
|
}
|
|
gf->scratch = (void *) h;
|
|
h->mult_type = mult_type;
|
|
h->region_type = region_type;
|
|
h->divide_type = divide_type;
|
|
h->w = w;
|
|
h->prim_poly = prim_poly;
|
|
h->arg1 = arg1;
|
|
h->arg2 = arg2;
|
|
h->base_gf = base_gf;
|
|
/* Private data starts right after the header; cast to char * for byte arithmetic. */
h->private = (void *) ((char *) gf->scratch + sizeof(gf_internal_t));
|
|
|
|
switch(w) {
|
|
case 4: return gf_w4_init(gf);
|
|
case 8: return gf_w8_init(gf);
|
|
case 16: return gf_w16_init(gf);
|
|
case 32: return gf_w32_init(gf);
|
|
case 64: return gf_w64_init(gf);
|
|
case 128: return gf_dummy_init(gf);
|
|
default: return 0;
|
|
}
|
|
}
|
|
</pre></td></table></center><p>
|
|
|
|
The first thing it does is determine if it has to allocate space for <b>scratch</b>.
|
|
If it must, it uses <b>gf_scratch_size()</b> to figure out how big the space must be.
|
|
It then sets <b>gf->scratch</b> to this space, and sets all of the fields of the
|
|
scratch to the arguments in <b>gf_init_hard()</b>. The <b>private</b> pointer is
|
|
set to point to the space just after the <b>gf_internal_t</b> header in <b>gf->scratch</b>. Again, it is up to
|
|
<b>gf_scratch_size()</b> to make sure there is enough space for the scratch, and
|
|
for all of the private data needed by the implementation.
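<p>
The layout described above, one chunk holding a header followed directly by the implementation's private data, can be sketched on its own. All names here (<b>sketch_header_t</b>, <b>sketch_tables</b>, <b>sketch_init()</b> and so on) are hypothetical stand-ins for <b>gf_internal_t</b> and friends, and the field is called <b>private_data</b> only to keep the sketch compilable as C++ too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for gf_internal_t: bookkeeping plus a pointer to the payload. */
typedef struct {
    int free_me;          /* 1 if sketch_init() called malloc() */
    void *private_data;   /* points just past this header */
} sketch_header_t;

/* Stand-in for an implementation's private tables. */
struct sketch_tables {
    uint8_t log_tbl[16];
    uint8_t antilog_tbl[32];
};

/* Header plus payload: the size a caller must provide. */
size_t sketch_scratch_size(void)
{
    return sizeof(sketch_header_t) + sizeof(struct sketch_tables);
}

/* Use caller memory if given (it must be suitably aligned), else allocate one chunk. */
sketch_header_t *sketch_init(void *scratch_memory)
{
    sketch_header_t *h;

    if (scratch_memory == NULL) {
        h = malloc(sketch_scratch_size());
        if (h == NULL) return NULL;
        h->free_me = 1;
    } else {
        h = scratch_memory;
        h->free_me = 0;
    }
    h->private_data = (char *) h + sizeof(sketch_header_t);
    return h;
}

struct sketch_tables *sketch_tables_of(sketch_header_t *h)
{
    assert(h != NULL);
    return h->private_data;
}

/* Only frees memory that sketch_init() allocated itself. */
void sketch_free(sketch_header_t *h)
{
    if (h != NULL && h->free_me) free(h);
}
```

Because the payload lives in the same chunk as the header, a single <b>free()</b> releases everything, and calling the free routine on caller-supplied memory is harmless.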
|
|
<p>
|
|
Once the scratch is set up, <b>gf_init_hard()</b> calls <b>gf_w4_init()</b>. This is
|
|
in <b><a href=gf_w4.c>gf_w4.c</a></b>, and it is a
|
|
simple dispatcher to the various initialization routines, plus it
|
|
sets <b>EUCLID</b> and <b>MATRIX</b> if need be:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
int gf_w4_init(gf_t *gf)
|
|
{
|
|
gf_internal_t *h;
|
|
|
|
h = (gf_internal_t *) gf->scratch;
|
|
if (h->prim_poly == 0) h->prim_poly = 0x13;
|
|
|
|
gf->multiply.w4 = NULL;
|
|
gf->divide.w4 = NULL;
|
|
gf->inverse.w4 = NULL;
|
|
gf->multiply_region.w4 = NULL;
|
|
|
|
switch(h->mult_type) {
|
|
case GF_MULT_SHIFT: if (gf_w4_shift_init(gf) == 0) return 0; break;
|
|
case GF_MULT_LOG_TABLE: if (gf_w4_log_init(gf) == 0) return 0; break;
|
|
case GF_MULT_DEFAULT: if (gf_w4_log_init(gf) == 0) return 0; break;
|
|
default: return 0;
|
|
}
|
|
if (h->divide_type == GF_DIVIDE_EUCLID) {
|
|
gf->divide.w4 = gf_w4_divide_from_inverse;
|
|
gf->inverse.w4 = gf_w4_euclid;
|
|
} else if (h->divide_type == GF_DIVIDE_MATRIX) {
|
|
gf->divide.w4 = gf_w4_divide_from_inverse;
|
|
gf->inverse.w4 = gf_w4_matrix;
|
|
}
|
|
|
|
if (gf->inverse.w4 != NULL && gf->divide.w4 == NULL) {
|
|
gf->divide.w4 = gf_w4_divide_from_inverse;
|
|
}
|
|
if (gf->inverse.w4 == NULL && gf->divide.w4 != NULL) {
|
|
gf->inverse.w4 = gf_w4_inverse_from_divide;
|
|
}
|
|
return 1;
|
|
}
|
|
</pre></td></table></center><p>
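For readers who have not seen the Euclid option spelled out, here is a minimal sketch of inversion by the extended Euclidean algorithm over GF(2) polynomials, plus a divide built from it. The function names are hypothetical and this is not the code behind <b>gf_w4_euclid</b>. It assumes <i>w</i> &le; 16 and an irreducible <b>prim_poly</b> that includes its leading term.

```c
#include <assert.h>
#include <stdint.h>

/* Degree of a polynomial over GF(2); -1 for the zero polynomial. */
int sketch_degree(uint32_t p)
{
    int d = -1;
    while (p != 0) { p >>= 1; d++; }
    return d;
}

/* Shift-and-reduce multiply in GF(2^w). */
uint32_t sketch_multiply(uint32_t a, uint32_t b, int w, uint32_t prim_poly)
{
    uint32_t product = 0;
    int i;
    for (i = 0; i < w; i++) if (b & (1u << i)) product ^= a << i;
    for (i = 2 * w - 2; i >= w; i--) if (product & (1u << i)) product ^= prim_poly << (i - w);
    return product;
}

/* Invariant: t0 * a == r0 and t1 * a == r1 (mod prim_poly). Stops when r1 == 1. */
uint32_t sketch_inverse_euclid(uint32_t a, uint32_t prim_poly)
{
    uint32_t r0 = prim_poly, r1 = a, t0 = 0, t1 = 1, tmp;

    assert(a != 0);   /* zero has no inverse */
    while (r1 != 1) {
        int shift = sketch_degree(r0) - sketch_degree(r1);
        if (shift < 0) {              /* keep the larger-degree remainder in r0 */
            tmp = r0; r0 = r1; r1 = tmp;
            tmp = t0; t0 = t1; t1 = tmp;
            continue;
        }
        r0 ^= r1 << shift;            /* cancel the leading term of r0 */
        t0 ^= t1 << shift;            /* apply the same step to the cofactor */
    }
    return t1;
}

/* a / b computed as a * inverse(b), the "divide from inverse" pattern. */
uint32_t sketch_divide_from_inverse(uint32_t a, uint32_t b, int w, uint32_t prim_poly)
{
    return sketch_multiply(a, sketch_inverse_euclid(b, prim_poly), w, prim_poly);
}
```

In GF(2<sup>4</sup>) with polynomial 0x13 the inverse of 7 is 6, so 4 / 7 = 4 &times; 6 = 11, matching the <b>gf_div 4 7 4</b> result shown earlier.
<p>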
|
|
|
|
The code in <b>gf_w4_log_init()</b> sets up the log and antilog tables, and sets
|
|
the <b>multiply.w4</b>, <b>divide.w4</b> etc routines to be the ones for logs. The
|
|
tables are put into the scratch's <b>private</b> area, which is typecast to a <b>struct
|
|
gf_logtable_data *</b>:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
struct gf_logtable_data {
|
|
gf_val_4_t log_tbl[GF_FIELD_SIZE];
|
|
gf_val_4_t antilog_tbl[GF_FIELD_SIZE * 2];
|
|
gf_val_4_t *antilog_tbl_div;
|
|
};
|
|
.......
|
|
|
|
static
|
|
int gf_w4_log_init(gf_t *gf)
|
|
{
|
|
gf_internal_t *h;
|
|
struct gf_logtable_data *ltd;
|
|
int i, b;
|
|
|
|
h = (gf_internal_t *) gf->scratch;
|
|
ltd = h->private;
|
|
|
|
ltd->log_tbl[0] = 0;
|
|
|
|
ltd->antilog_tbl_div = ltd->antilog_tbl + (GF_FIELD_SIZE-1);
|
|
b = 1;
|
|
for (i = 0; i < GF_FIELD_SIZE-1; i++) {
|
|
ltd->log_tbl[b] = (gf_val_8_t)i;
|
|
ltd->antilog_tbl[i] = (gf_val_8_t)b;
|
|
ltd->antilog_tbl[i+GF_FIELD_SIZE-1] = (gf_val_8_t)b;
|
|
b <<= 1;
|
|
if (b & GF_FIELD_SIZE) {
|
|
b = b ^ h->prim_poly;
|
|
}
|
|
}
|
|
|
|
gf->inverse.w4 = gf_w4_inverse_from_divide;
|
|
gf->divide.w4 = gf_w4_log_divide;
|
|
gf->multiply.w4 = gf_w4_log_multiply;
|
|
gf->multiply_region.w4 = gf_w4_log_multiply_region;
|
|
return 1;
|
|
}
|
|
</pre></td></table></center><p>
|
|
|
|
And of course the individual routines use <b>h->private</b> to access the tables:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
static
|
|
inline
|
|
gf_val_8_t gf_w4_log_multiply (gf_t *gf, gf_val_8_t a, gf_val_8_t b)
|
|
{
|
|
struct gf_logtable_data *ltd;
|
|
|
|
ltd = (struct gf_logtable_data *) ((gf_internal_t *) (gf->scratch))->private;
|
|
return (a == 0 || b == 0) ? 0 : ltd->antilog_tbl[(unsigned)(ltd->log_tbl[a] + ltd->log_tbl[b])];
|
|
}
|
|
</pre></td></table></center><p>
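Putting the table setup and the lookups together, the following standalone sketch builds log and antilog tables for GF(2<sup>4</sup>) and uses them to multiply and divide. It mirrors the structure above but is not the library code. The names are hypothetical, and it indexes the doubled antilog table with an offset of 15 instead of keeping a separate <b>antilog_tbl_div</b> pointer.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FIELD_SIZE 16   /* 2^4 elements */

struct sketch_logtable {
    uint8_t log_tbl[SKETCH_FIELD_SIZE];
    uint8_t antilog_tbl[SKETCH_FIELD_SIZE * 2];   /* doubled so sums need no modulo */
};

/* Walk the powers of x (the element 2), recording log and antilog as we go. */
void sketch_log_init(struct sketch_logtable *t, uint32_t prim_poly)
{
    uint32_t b = 1;
    int i;

    t->log_tbl[0] = 0;   /* log of zero is undefined; callers test for zero first */
    for (i = 0; i < SKETCH_FIELD_SIZE - 1; i++) {
        t->log_tbl[b] = (uint8_t) i;
        t->antilog_tbl[i] = (uint8_t) b;
        t->antilog_tbl[i + SKETCH_FIELD_SIZE - 1] = (uint8_t) b;
        b <<= 1;
        if (b & SKETCH_FIELD_SIZE) b ^= prim_poly;
    }
    t->antilog_tbl[30] = 0;   /* the last two slots are never indexed */
    t->antilog_tbl[31] = 0;
}

/* a * b = antilog(log a + log b); the sum is at most 28. */
uint8_t sketch_log_multiply(const struct sketch_logtable *t, uint8_t a, uint8_t b)
{
    if (a == 0 || b == 0) return 0;
    return t->antilog_tbl[t->log_tbl[a] + t->log_tbl[b]];
}

/* a / b = antilog(log a - log b); adding 15 keeps the index in 1..29. */
uint8_t sketch_log_divide(const struct sketch_logtable *t, uint8_t a, uint8_t b)
{
    assert(b != 0);
    if (a == 0) return 0;
    return t->antilog_tbl[(SKETCH_FIELD_SIZE - 1) + t->log_tbl[a] - t->log_tbl[b]];
}
```

With polynomial 0x13, log 7 = 10 and log 11 = 7, so 7 &times; 11 is the antilog of 17, which is 4.
<p>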
|
|
|
|
Finally, it's important that the proper sizes are put into
|
|
<b>gf_w4_scratch_size()</b> for each implementation:
|
|
|
|
<p><center><table border=3 cellpadding=3><td><pre>
|
|
int gf_w4_scratch_size(int mult_type, int region_type, int divide_type, int arg1, int arg2)
|
|
{
|
|
int region_tbl_size;
|
|
switch(mult_type)
|
|
{
|
|
case GF_MULT_DEFAULT:
|
|
case GF_MULT_LOG_TABLE:
|
|
return sizeof(gf_internal_t) + sizeof(struct gf_logtable_data) + 64;
|
|
break;
|
|
case GF_MULT_SHIFT:
|
|
return sizeof(gf_internal_t);
|
|
break;
|
|
default:
|
|
return -1;
|
|
}
|
|
}
|
|
</pre></td></table></center><p>
|
|
I hope that's enough explanation for y'all to start implementing. Let me know if you have
|
|
problems -- thanks -- Jim
|
|
|
|
<hr>
|
|
The initial structure has been set for w=4, 8, 16, 32 and 64, with implementations of SHIFT and EUCLID, and for w <= 32, MATRIX. There are some weird caveats:
|
|
|
|
<UL>
|
|
<LI> For w=32 and w=64, the primitive polynomial does not have the leading one.
|
|
<LI> I'd like for naming to be:
|
|
<p>
|
|
<UL>
|
|
<b>gf_w</b><i>w</i><b>_</b><i>technique</i><b>_</b><i>functionality</i><b>()</b>.
|
|
</UL>
|
|
<p>
|
|
For example, the log techniques for w=4 are:
|
|
<pre>
|
|
gf_w4_log_multiply()
|
|
gf_w4_log_divide()
|
|
gf_w4_log_multiply_region()
|
|
gf_w4_log_init()
|
|
</pre>
|
|
<p>
|
|
<LI> I'd also like a header block on implementations that says who wrote it.
|
|
</UL>
|
|
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=4</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Single TABLE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> Double TABLE, SSE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> Quad TABLE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Lazy Quad TABLE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
|
|
</table><p>
|
|
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=8</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td>Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td>Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim </td> </tr>
|
|
<tr> <td> Single TABLE </td> <td> Done - Kevin </td> </tr>
|
|
<tr> <td> Double TABLE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> Lazy Double TABLE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> Split 2 1 (Half) SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Composite, k=2 </td> <td> Done - Kevin (alt mapping not passing unit test) </td> </tr>
|
|
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
|
|
<tr> <td> LOG ZERO</td> <td> Done - Jim</td> </tr>
|
|
</table><p>
|
|
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=16</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Lazy TABLE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 16 No-SSE, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 16 SSE, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 16 SSE, lazy, alternate mapping </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 8 16, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
|
|
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
|
|
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
|
|
<tr> <td> LOG </td> <td> Done - Kevin </td> </tr>
|
|
<tr> <td> LOG ZERO</td> <td> Done - Kevin </td> </tr>
|
|
<tr> <td> Group 4 4 </td> <td>Done - Jim: I don't see a reason to implement others, although 4-8 will be faster, and 8 8 will have faster region ops. They'll never beat SPLIT.</td> </tr>
|
|
</table><p>
|
|
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=32</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 2 32,lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 2 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 32, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 32, SSE,ALTMAP lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 4 32, SSE, lazy </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Split 8 8 </td> <td>Done - Jim </td> </tr>
|
|
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
|
|
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
|
|
<tr> <td> Composite, k=2, stdmap recursive </td> <td> Done - Kevin</td> </tr>
|
|
<tr> <td> Composite, k=2, altmap recursive </td> <td> Done - Kevin</td> </tr>
|
|
<tr> <td> Composite, k=2, stdmap inline </td> <td> Done - Kevin</td> </tr>
|
|
</table><p>
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=64</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
|
|
<tr> <td> Split 16 1 SSE, maybe lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Split 8 1 lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Split 8 8 </td> <td> - </td> </tr>
|
|
<tr> <td> Split 8 8 lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Group </td> <td> - </td> </tr>
|
|
<tr> <td> Composite, k=2, alternate mapping </td> <td> - </td> </tr>
|
|
</table><p>
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=128</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> SHIFT </td> <td> Done - Will </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_p, SSE </td> <td> - </td> </tr>
|
|
<tr> <td> BYTWO_b, SSE </td> <td> - </td> </tr>
|
|
<tr> <td> Split 32 1 SSE, maybe lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Split 16 1 lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Split 16 16 - Maybe that's insanity</td> <td> - </td> </tr>
|
|
<tr> <td> Split 16 16 lazy </td> <td> - </td> </tr>
|
|
<tr> <td> Group (SSE) </td> <td> - </td> </tr>
|
|
<tr> <td> Composite, k=?, alternate mapping </td> <td> - </td> </tr>
|
|
</table><p>
|
|
<hr>
|
|
<h3>Things we need to Implement: <i>w=general between 1 & 32</i></h3>
|
|
|
|
<p><table border=3 cellpadding=2>
|
|
<tr> <td> CAUCHY Region (SSE XOR)</td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> SHIFT </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> TABLE </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> LOG </td> <td> Done - Jim </td> </tr>
|
|
<tr> <td> BYTWO_p </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> BYTWO_b </td> <td>Done - Jim</td> </tr>
|
|
<tr> <td> Group, g_s == g_r </td> <td>Done - Jim</td></tr>
|
|
<tr> <td> Group, any g_s and g_r</td> <td>Done - Jim</td></tr>
|
|
<tr> <td> Split - do we need it?</td> <td>Done - Jim</td></tr>
|
|
<tr> <td> Composite - do we need it?</td> <td> - </td></tr>
|
|
<tr> <td> Split - do we need it?</td> <td> - </td></tr>
|
|
<tr> <td> Logzero?</td> <td> - </td></tr>
|
|
</table><p>
|