[1]
PUFKY: A Fully Functional PUF-based Cryptographic Key Generator
3.3 Syndrome Generation and Error Decoding for C_REP and C_BCH
XOR-ing x_1 with each remaining bit of x^{n_REP}, or h_i = x_1 ⊕ x_{i+1}. Error decoding is based on a Hamming weight check of the syndrome s^{n_REP − 1}, which immediately yields the value for the first error bit e_1. The remaining error bits are again obtained by a pairwise XOR of e_1 with each of the syndrome bits, but this step is discarded in the syndrome construction. In our design, both syndrome generation and error decoding of a repetition code are fully combinatorial.
So for bch(7,1,3), i.e. the (7,1,3) repetition code:

encoder:
  input:  PUF readout {m0, m1, ..., m6}, 7 bits
  output: helper data {m1^m0, m2^m0, ..., m6^m0}, 6 bits

decoder (located in Pufkey/source/rep_decoder.vhd):
  input:  PUF readout {m'0, m'1, ..., m'6}, 7 bits;
          helper data {m1^m0, m2^m0, ..., m6^m0}, 6 bits
  output: recovered bit, by majority vote over the 7 estimates of m0
          r = {m'0, m0^(m1^m'1), ..., m0^(m6^m'6)}, 7 bits
          if weight(r) >= 4
              recovered = 1
          else
              recovered = 0
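The repetition-code steps can be sketched in Python as follows. This is only an illustration of the syndrome and Hamming-weight check described in the excerpt, not a transcription of rep_decoder.vhd; the function names `rep_helper` and `rep_decode` and the sample bit vectors are made up for the example.

```python
def rep_helper(m):
    """Helper data for one repetition block: h_i = m_0 XOR m_(i+1)."""
    return [m[0] ^ b for b in m[1:]]


def rep_decode(m_noisy, helper):
    """Recover m_0 from a noisy readout m' and the stored helper data."""
    # Syndrome: s_i = m'_0 ^ m'_(i+1) ^ h_i = e_0 ^ e_(i+1),
    # where e is the (unknown) error vector m ^ m'.
    syndrome = [m_noisy[0] ^ b ^ h for b, h in zip(m_noisy[1:], helper)]
    t = len(helper) // 2                 # correctable errors, (n - 1) / 2
    # Hamming weight check: more than t ones means the first bit was flipped.
    e0 = 1 if sum(syndrome) > t else 0
    return m_noisy[0] ^ e0


enrolled = [1, 0, 1, 1, 0, 0, 1]         # hypothetical enrollment readout
helper = rep_helper(enrolled)
noisy = [0, 0, 1, 0, 0, 1, 1]            # bits 0, 3 and 5 flipped
recovered = rep_decode(noisy, helper)
```

With n = 7 the threshold `sum(syndrome) > 3` is the same as weight >= 4, and up to three flipped bits per block are tolerated.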
BCH code C_BCH. Since BCH codes are cyclic codes, their syndrome generation is a finite field division by the code’s generator polynomial. This is efficiently implemented in hardware as an LFSR evaluation of length (n_BCH − k_BCH).
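A minimal Python sketch of that LFSR division, bit-serial and fed highest-degree coefficient first. For illustration it assumes the small BCH(15,7) code with generator polynomial g(x) = x^8 + x^7 + x^6 + x^4 + 1, which is not the code used in PUFKY; the function name `lfsr_syndrome` is likewise made up.

```python
def lfsr_syndrome(bits, g_taps):
    """Remainder of bits(x) divided by g(x) over GF(2).

    bits   : coefficients of the input word, highest degree first
    g_taps : coefficients g_0 .. g_(r-1) of the monic degree-r generator
             polynomial (the leading coefficient 1 is implicit)
    Returns the r register bits; index i holds the coefficient of x^i.
    """
    r = len(g_taps)
    reg = [0] * r
    for b in bits:
        fb = reg[-1]  # bit leaving the top stage
        # Multiply the state by x, add the input bit, then reduce by g(x).
        reg = [b ^ (fb & g_taps[0])] + [
            reg[i - 1] ^ (fb & g_taps[i]) for i in range(1, r)
        ]
    return reg


# Assumed example code: BCH(15,7), g(x) = x^8 + x^7 + x^6 + x^4 + 1
G_TAPS = [1, 0, 0, 0, 1, 0, 1, 1]        # g_0 .. g_7

# g(x) itself, padded to 15 bits, is a codeword: remainder must be zero.
codeword = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1]
clean = lfsr_syndrome(codeword, G_TAPS)

# Flip the x^0 coefficient: the remainder becomes the error polynomial 1.
corrupted = codeword[:-1] + [codeword[-1] ^ 1]
dirty = lfsr_syndrome(corrupted, G_TAPS)
```

The register has n − k = 8 stages here, matching the LFSR length stated above.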
The error decoding step of a BCH code is more complex and requires the largest design effort of all elements in our secure sketch. Most BCH decoders are designed with a focus on throughput and use systolic array designs, e.g. [19, 20, 22]. Aiming for a size-optimized implementation, we propose a serialized, minimalistic coprocessor design with a 10-bit application-specific instruction set and limited conditional execution support. Although highly optimized towards BCH decoding, the architecture is generic in the sense that it can decode any BCH code, including shortened versions, requiring only a slight change of firmware and memory size.
The datapath consists of two blocks: an address and a data block. To optimize array indexing, all addressing is done indirectly using a five-element address RAM, which is efficiently updated by a dedicated address ALU. The output of the address RAM is directly connected to the data RAM. The data block consists of data RAM and an ALU which is used mainly for multiply-accumulate operations over F_{2^u}. To minimize the size, this ALU contains only a single register. All other necessary operands come directly from the data RAM. A high-level overview of the coprocessor architecture is shown in Fig. 3.
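To make the multiply-accumulate operation concrete, here is a small Python sketch of acc + a·b in a binary extension field. It assumes u = 8 and the field polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D); the excerpt does not state which field polynomial the coprocessor uses, so both values, and the names `gf_mul` and `gf_mac`, are assumptions for illustration.

```python
def gf_mul(a, b, u=8, poly=0x11D):
    """Multiply a and b in GF(2^u) by shift-and-add, reducing modulo poly.

    u and poly are assumed example values, not taken from the paper.
    """
    acc = 0
    while b:
        if b & 1:           # current bit of b set: add the shifted a
            acc ^= a
        b >>= 1
        a <<= 1             # multiply a by x
        if a >> u:          # degree reached u: subtract the field polynomial
            a ^= poly
    return acc


def gf_mac(acc, a, b, u=8, poly=0x11D):
    """Multiply-accumulate: acc + a*b, where addition in GF(2^u) is XOR."""
    return acc ^ gf_mul(a, b, u, poly)


product = gf_mul(3, 7)      # (x + 1)(x^2 + x + 1) = x^3 + 1
wrapped = gf_mul(2, 0x80)   # x * x^7 = x^8, reduced by the field polynomial
total = gf_mac(5, 3, 7)
```

A single accumulator is enough for this operation, which fits the single-register ALU described above.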
[2]
Implementation of BCH Code (n, k) Encoder using lfsr
Input polynomial: h(x) = h0 + h1*x + h2*x^2 + ... + hn*x^n
Input order: hn, ..., h2, h1, h0, from left to right.
For bch(255,171,11), n should be 254, i.e. 255 bits are input in total.
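A hedged Python sketch of an LFSR-style systematic encoder that follows the same input order (highest-degree coefficient first). Since the generator polynomial of bch(255,171,11) is not given in these notes, the example assumes the small BCH(15,7) code with g(x) = x^8 + x^7 + x^6 + x^4 + 1; the name `bch_encode` is hypothetical. In this sketch only the k message bits are shifted in, and the parity bits are then read out of the register.

```python
def bch_encode(msg, g_taps):
    """Systematic encoding: codeword = msg followed by the parity bits,
    where parity(x) = msg(x) * x^r mod g(x).

    msg    : message bits, highest-degree coefficient first
    g_taps : coefficients g_0 .. g_(r-1) of the monic degree-r generator
             polynomial (the leading coefficient 1 is implicit)
    """
    r = len(g_taps)
    reg = [0] * r                        # reg[i] = coefficient of x^i
    for b in msg:
        fb = b ^ reg[-1]                 # input enters at the top stage
        reg = [fb & g_taps[0]] + [
            reg[i - 1] ^ (fb & g_taps[i]) for i in range(1, r)
        ]
    return list(msg) + reg[::-1]         # parity, highest degree first


# Assumed example code: BCH(15,7), g(x) = x^8 + x^7 + x^6 + x^4 + 1
G_TAPS = [1, 0, 0, 0, 1, 0, 1, 1]        # g_0 .. g_7

# Message m(x) = 1 encodes to g(x) itself.
cw_one = bch_encode([0, 0, 0, 0, 0, 0, 1], G_TAPS)
# The all-ones message encodes to the all-ones codeword.
cw_ones = bch_encode([1] * 7, G_TAPS)
```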