3 RNS to Binary Conversion Based on New CRT-I, New CRT-II, Mixed-Radix CRT and New CRT-III
5 RNS to Binary Conversion
Thus, Y can be first found and appending x2 as LSBs, we can obtain X.
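Wang et al. [14] have suggested computation of Y in (5.24) as $|A + 2^n B|_{2^{2n}-1}$, where
\[ A = \left\lfloor \frac{x_1 + (x_{1,0} \oplus x_{3,0})\,2^n + (2^n - 1 - x_3) + 2^n - 1}{2} \right\rfloor \tag{5.25a} \]
and
\[ B = \left\lfloor \frac{x_1 + (x_{1,0} \oplus x_{3,0})\,2^n + x_3 + 2(2^n - 1 - x_2)}{2} \right\rfloor \tag{5.25b} \]
where $x_{1,0}$ and $x_{3,0}$ are the LSBs of $x_1$ and $x_3$, respectively. The value A can be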
computed using a 2-input adder to yield the sum and carry vectors A1 and A2.
Similarly, B can be estimated to yield the sum and carry vectors B1 and B2 and a
carry bit using a three-input n-bit adder. Next, Y can be obtained from A and B using
a 2n-bit adder (Converter I) or n-bit adders to reduce the propagation delay.
Two solutions for n-bit case have been suggested denoted as Converter II and
Converter III.
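The decomposition into A and B can be checked numerically. The following Python sketch is only an illustration: it assumes that $x_1$, $x_2$, $x_3$ are the residues for $2^n - 1$, $2^n$, $2^n + 1$, that the outer brackets in (5.25a) and (5.25b) denote the integer part, and that $X = x_2 + 2^n Y$. The function names are hypothetical.

```python
def wang_decode(x1, x2, x3, n):
    """Decode residues for (2^n - 1, 2^n, 2^n + 1) using A and B of (5.25)."""
    p = (x1 & 1) ^ (x3 & 1)              # XOR of the LSBs of x1 and x3
    ones = (1 << n) - 1                  # 2^n - 1
    a = (x1 + (p << n) + (ones - x3) + ones) // 2
    b = (x1 + (p << n) + x3 + 2 * (ones - x2)) // 2
    y = (a + (b << n)) % ((1 << (2 * n)) - 1)
    return x2 + (y << n)                 # x2 supplies the n LSBs


def check_all(n):
    """Exhaustively compare the decoder against every X in the dynamic range."""
    m1, m2, m3 = (1 << n) - 1, 1 << n, (1 << n) + 1
    return all(wang_decode(x % m1, x % m2, x % m3, n) == x
               for x in range(m1 * m2 * m3))


print(wang_decode(1, 3, 2, 2))           # residues of 7 in {3, 4, 5}
print(check_all(4))
```

The software version uses an ordinary modulo operation in place of the end-around-carry adders of Converters I to III, so it says nothing about delay or area.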
Bi and Gross [34] have described a Mixed-Radix Chinese Remainder Theorem
(Mixed-Radix CRT) for RNS to binary conversion. The result of RNS to binary
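conversion can be computed in this approach for an RNS having moduli $\{m_1, m_2, \ldots, m_n\}$ with residues $(x_1, x_2, \ldots, x_n)$ as
\[ X = x_1 + m_1 \left| \gamma_1 x_1 + \gamma_2 x_2 \right|_{m_2} + m_1 m_2 \left| \left\lfloor \frac{\gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3}{m_2} \right\rfloor \right|_{m_3} + \cdots + m_1 m_2 \cdots m_{n-1} \left| \left\lfloor \frac{\gamma_1 x_1 + \gamma_2 x_2 + \gamma_3 x_3 + \cdots + \gamma_n x_n}{m_2 m_3 \cdots m_{n-1}} \right\rfloor \right|_{m_n} \tag{5.26a} \]
where
\[ \gamma_1 = \frac{M_1 \left| \frac{1}{M_1} \right|_{m_1} - 1}{m_1} \quad \text{and} \quad \gamma_i = \frac{M}{m_1 m_i} \left| \frac{1}{M_i} \right|_{m_i}. \tag{5.26b} \]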
Note that the first two terms use MRC and the other terms use a CRT-like expansion.
The advantage of this formulation is that the various MRC digits can be computed in parallel, enabling fast comparison of two numbers. This comes at the expense of hardware, since the numerators of the expressions for several Mixed Radix digits contain many terms, and division by a product of moduli followed by taking the integer value is cumbersome except for special moduli. The topic of comparison using this technique is discussed in Chapter 6. An example will be illustrative.
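Example 5.7 We wish to find the number corresponding to residues (1, 2, 3, 4) in the RNS {3, 5, 7, 11}. We can compute, as in CRT, $M_1 = 385$, $M_2 = 231$, $M_3 = 165$, $M_4 = 105$ and the various $\left|1/M_i\right|_{m_i}$ as $\left|1/385\right|_3 = 1$, $\left|1/231\right|_5 = 1$, $\left|1/165\right|_7 = 2$, $\left|1/105\right|_{11} = 2$. Next, we compute $\gamma_1 = 128$, $\gamma_2 = 77$, $\gamma_3 = 110$, $\gamma_4 = 70$. Thus, X can be computed as
\[ X = 1 + 3\,|128 + 154|_5 + 15 \left| \left\lfloor \frac{128 + 154 + 330}{5} \right\rfloor \right|_7 + 105 \left| \left\lfloor \frac{128 + 154 + 330 + 280}{35} \right\rfloor \right|_{11} = 1 + 3 \times 2 + 15 \times 3 + 105 \times 3 = 367. \]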
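The example can be reproduced with a few lines of Python. This is a minimal sketch of the Mixed-Radix CRT expansion, assuming pairwise coprime moduli; the function name is hypothetical and the running sum `partial` plays the role of the numerators $\gamma_1 x_1 + \cdots + \gamma_k x_k$.

```python
from math import prod


def mixed_radix_crt(residues, moduli):
    """Decode residues with the Mixed-Radix CRT expansion (5.26a)."""
    M = prod(moduli)
    m1 = moduli[0]
    Mi = [M // m for m in moduli]
    inv = [pow(Mi[i], -1, m) for i, m in enumerate(moduli)]   # |1/M_i| mod m_i
    # gamma_1 = (M_1 |1/M_1| - 1)/m_1 ; gamma_i = (M/(m_1 m_i)) |1/M_i|
    gamma = [(Mi[0] * inv[0] - 1) // m1]
    gamma += [(Mi[i] // m1) * inv[i] for i in range(1, len(moduli))]

    x, weight, divisor = residues[0], m1, 1
    partial = gamma[0] * residues[0]
    for i in range(1, len(moduli)):
        partial += gamma[i] * residues[i]
        digit = (partial // divisor) % moduli[i]   # mixed-radix digit
        x += weight * digit
        weight *= moduli[i]                        # m_1 m_2 ... m_i
        divisor *= moduli[i]                       # m_2 ... m_i
    return x


print(mixed_radix_crt((1, 2, 3, 4), (3, 5, 7, 11)))   # 367
```

In this sequential sketch the digits are produced one after another; the attraction of the formulation in hardware is that each digit depends only on the residues, so all digits can be evaluated concurrently.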
New CRT III [35, 36] can be used to perform RNS to binary conversion when the
moduli have common factors. Considering two moduli m1 and m2 with common
factors d, and considering m1 > m2, the decoded number corresponding to residues
x1 and x2 can be obtained as
\[ X = x_1 + m_1 \left| \frac{1}{m_1/d} \cdot \frac{x_2 - x_1}{d} \right|_{m_2/d} \tag{5.27} \]
As an illustration, consider the moduli set {15, 12} with d = 3 as a common factor and given residues (5, 2). The decoded number can be obtained from (5.27) as
\[ X = 5 + 15 \left| \frac{1}{5} \cdot \frac{2 - 5}{3} \right|_4 = 50. \]
We will later consider application of this technique for Reverse conversion for an
eight moduli set.
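A minimal Python sketch of (5.27) follows. It takes d as the greatest common divisor of the two moduli and rejects residue pairs that do not agree modulo d; the function name and the error handling are illustrative assumptions, not part of the published technique.

```python
from math import gcd


def new_crt_iii(x1, x2, m1, m2):
    """Decode (x1, x2) for two moduli sharing the common factor d = gcd(m1, m2)."""
    d = gcd(m1, m2)
    if (x2 - x1) % d:
        raise ValueError("residues are inconsistent modulo the common factor")
    inv = pow(m1 // d, -1, m2 // d)            # |1/(m1/d)| mod (m2/d)
    return x1 + m1 * ((inv * ((x2 - x1) // d)) % (m2 // d))


print(new_crt_iii(5, 2, 15, 12))               # 50
```

The result lies in the range $[0, m_1 m_2 / d)$, which is the dynamic range of the pair when the moduli are not coprime.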
5.4 RNS to Binary Converters for Other Three Moduli Sets
Premkumar [37], Premkumar et al. [38], Wang et al. [39], and Gbolagade et al. [40]
have investigated the three moduli set $\{m_1, m_2, m_3\} = \{2n + 1, 2n, 2n - 1\}$. The
reverse converter for this moduli set based on CRT described by Premkumar [37]
uses the expressions
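\[ X = \left\{ \frac{M}{2} + \frac{m_2 m_3}{2} x_1 + \frac{m_1 m_2}{2} x_3 - m_1 m_3 x_2 \right\} \bmod M \quad \text{for } (x_1 + x_3) \text{ odd} \tag{5.28a} \]
and
\[ X = \left\{ \frac{m_2 m_3}{2} x_1 + \frac{m_1 m_2}{2} x_3 - m_1 m_3 x_2 \right\} \bmod M \quad \text{for } (x_1 + x_3) \text{ even} \tag{5.28b} \]
where $M = 2n(4n^2 - 1)$.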
Note that the output of the adder computing the value inside the brackets needs
to be tested and based on the sign, M has to be added or subtracted once.
The hardware implementation needs three 2k-bit × k-bit multipliers, where $k = \log_2(2n + 1)$, and a four-input 3k-bit adder. Premkumar et al. [38] suggested a simplification which needs one 2k-bit × k-bit multiplier, one k-bit × k-bit multiplier and 7 or 9 adders in Architectures A and B, respectively. They divide both sides of the CRT expression by $m_2$ and find the integer part as
\[ \left\lfloor \frac{X}{m_2} \right\rfloor = \left| n(x_1 + x_3 - 2x_2) + \frac{x_1 - x_3}{2} \right|_{m_1 m_3} \quad \text{both } x_1, x_3 \text{ odd or both even} \tag{5.29a} \]
and
\[ \left\lfloor \frac{X}{m_2} \right\rfloor = \left| n(x_1 + x_3 - 2x_2) + \frac{x_1 - x_3 + m_1 m_3}{2} \right|_{m_1 m_3} \quad x_1 \text{ even, } x_3 \text{ odd or vice-versa.} \tag{5.29b} \]
Note that in this case, $m_1 = 2n - 1$, $m_2 = 2n$ and $m_3 = 2n + 1$. The final result is given by $X = m_2 \left\lfloor X/m_2 \right\rfloor + x_2$. The authors suggest a high-speed version as well as a cost-effective version.
Wang et al. [39] have given another technique for reverse conversion using the formula based on New CRT II,
\[ X = x_2 + 2n \left\{ (x_2 - x_3) + (x_1 - 2x_2 + x_3)\, n (2n + 1) \right\} \bmod \big( (2n + 1)(2n - 1) \big) \tag{5.30} \]
which needs one 2k-bit × k-bit multiplier, one k-bit × k-bit multiplier and a few adders. Note that in this case, $m_1 = 2n - 1$, $m_2 = 2n$ and $m_3 = 2n + 1$. More recently, Gbolagade et al. [40] have suggested computing X as
\[ X = \left| m_2 (x_2 - x_3) + x_2 + m_3 m_2 \left| \frac{x_1 + x_3}{2} - x_2 \right|_{m_1} \right|_M \tag{5.31} \]
Note that in this case, $m_1 = 2n - 1$, $m_2 = 2n$ and $m_3 = 2n + 1$. This needs at most one corrective addition of M. The critical path has been shown to be less than that of the Wang et al. converter [39], with reduced hardware complexity.
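The four closed forms for this moduli set can be cross-checked against each other by brute force. The Python sketch below is an illustration under stated assumptions: the residues $x_1$, $x_2$, $x_3$ correspond to $2n - 1$, $2n$, $2n + 1$ (the expressions (5.28a) and (5.28b) are symmetric in the two odd moduli, so the same labelling can be used for them), the modulo in (5.30) is applied to the braces only, and the division by 2 in (5.31) is taken as a multiplication by the inverse of 2 modulo $m_1$. All function names are hypothetical.

```python
def moduli(n):
    return 2 * n - 1, 2 * n, 2 * n + 1


def decode_528(x1, x2, x3, n):              # CRT form of Premkumar [37]
    m1, m2, m3 = moduli(n)
    M = m1 * m2 * m3
    t = (m2 * m3 // 2) * x1 + (m1 * m2 // 2) * x3 - m1 * m3 * x2
    if (x1 + x3) % 2:                       # odd case adds M/2
        t += M // 2
    return t % M


def decode_529(x1, x2, x3, n):              # integer part of X/m2, then append x2
    m1, m2, m3 = moduli(n)
    num = x1 - x3 if (x1 + x3) % 2 == 0 else x1 - x3 + m1 * m3
    q = (n * (x1 + x3 - 2 * x2) + num // 2) % (m1 * m3)
    return m2 * q + x2


def decode_530(x1, x2, x3, n):              # New CRT II form of Wang et al. [39]
    m1, m2, m3 = moduli(n)
    y = ((x2 - x3) + (x1 - 2 * x2 + x3) * n * (2 * n + 1)) % (m1 * m3)
    return x2 + 2 * n * y


def decode_531(x1, x2, x3, n):              # form of Gbolagade et al. [40]
    m1, m2, m3 = moduli(n)
    half = pow(2, -1, m1)                   # 1/2 modulo the odd modulus m1
    inner = ((x1 + x3) * half - x2) % m1
    return (m2 * (x2 - x3) + x2 + m3 * m2 * inner) % (m1 * m2 * m3)


def agree(n):
    m1, m2, m3 = moduli(n)
    return all(f(x % m1, x % m2, x % m3, n) == x
               for x in range(m1 * m2 * m3)
               for f in (decode_528, decode_529, decode_530, decode_531))


print(agree(2), agree(5))
```

Here the `%` operator hides the corrective addition or subtraction of M that the hardware performs explicitly after testing the sign of the adder output.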
Premkumar [41, 42], Wang et al. [39] and Gbolagade [43] have considered
another moduli set $\{2n, 2n + 1, 2n + 2\}$ which has 2 as a common factor and hence
half the dynamic range compared to the moduli set $\{2n + 1, 2n, 2n - 1\}$. It may be
remarked that the moduli sets $\{2n + 1, 2n, 2n - 1\}$ and $\{2n, 2n + 1, 2n + 2\}$ are not
attractive compared to powers of two related moduli sets since the hardware needed
has quadratic dependence on the bit size of the moduli.
Reverse converters for the moduli set $\{2^k, 2^k - 1, 2^{k-1} - 1\}$ have also been described [44–47]. The design due to Hiasat and Abdel-Aty-Zohdy [44] was based on CRT. Denoting $m_1 = 2^k$, $m_2 = 2^k - 1$, and $m_3 = 2^{k-1} - 1$, the authors start with CRT and estimate $X \bmod M_3$ and $\left\lfloor X/M_3 \right\rfloor$, where $M_3 = M/m_3$. Wang et al. [45, 46] have used New CRT II and have shown that the conversion time can be reduced whereas area is increased. Ananda Mohan [47] has suggested both CRT and MRC-based converters. The CRT-based converter has reduced conversion time and uses ROM. On the other hand, the MRC-based converter has reduced area but higher conversion time.
The moduli set $\{2^{2n} - 1, 2^{2n}, 2^{2n} + 1\}$ has been suggested by Ananda Mohan [48, 49], for which cost-effective as well as high-speed converters using CRT have been described. Note that the moduli have word lengths of n bits, 2n bits and 2n + 1 bits. The dynamic range is 5n + 1 bits. Another moduli set with (3n + 1)-bit dynamic range, $\{2^n, 2^n - 1, 2^{n+1} - 1\}$, has also been explored [50] using CRT as well as MRC techniques. The multiplicative inverses needed in the case of the MRC technique are very simple. The CRT-based converter needs modulo $(2^n - 1)(2^{n+1} - 1)$ reduction after a CPA, which has been suggested to be realized using ROMs by looking at the MSBs and subtracting the appropriate residue. Thus, one converter using ROM and two converters not using ROM have been suggested. This moduli set has the advantage that, due to the absence of the modulus $(2^n + 1)$, the multiplication and addition operations for all moduli channels can be simpler.
5.5 RNS to Binary Converters for Four and More Moduli Sets
Some reverse converters of four moduli sets [51–54] are extensions of the converters for the three moduli sets. These use the optimum converters for the three moduli set M1 $\{2^n - 1, 2^n, 2^n + 1\}$ and use MRC to get the final result to include the fourth modulus $2^{n+1} - 1$, $2^{n-1} + 1$, $2^{n-1} - 1$, $2^{n+1} + 1$, etc.
The reverse converter due to Vinod and Premkumar [51] for the moduli set $\{m_1, m_2, m_3, m_4\} = \{2^n - 1, 2^n + 1, 2^n, 2^{n+1} - 1\}$ uses CRT but computes the higher Mixed Radix Digit $\left\lfloor X/M_4 \right\rfloor \bmod (2^{n+1} - 1)$, where X is the desired decoded number and $M_i = M/m_i$. On the other hand, $X \bmod M_4$ is computed using the three moduli RNS to binary converter. Next, X is computed as $\left\lfloor X/M_4 \right\rfloor M_4 + (X \bmod M_4)$.
The reverse converter due to Bhardwaj et al. [52] for the moduli set $\{m_1, m_2, m_3, m_4\} = \{2^n - 1, 2^n + 1, 2^n, 2^{n+1} + 1\}$ uses CRT but computes first $E = \left\lfloor X/2^n \right\rfloor$. Note that E can be obtained by using CRT on the four moduli set and subtracting the residue $r_3$ and dividing by $m_3$. However, the multiplicative inverses needed in CRT are quite complex and hence, $E_1$ and $E_2$ are estimated from the expression for E. Next, from $E_1$ and $E_2$ using CRT, E can be obtained:
\[ E_1 = |E|_{2^{2n}-1} = \left| 2^{n-1}(2^n + 1)\, r_1 - 2^n r_2 - 2^{n-1}(2^n - 1)\, r_3 \right|_{2^{2n}-1} \tag{5.32a} \]
\[ E_2 = |E|_{2^{n+1}+1} = \left| 2 r_2 - 2 r_4 \right|_{2^{n+1}+1} \tag{5.32b} \]
Ananda Mohan and Premkumar [53] have suggested using MRC for obtaining
E from E1 and E2.
Ananda Mohan and Premkumar [53] have given a unified architecture for RNS to binary conversion for the moduli sets $\{2^n - 1, 2^n + 1, 2^n, 2^{n+1} - 1\}$ and $\{2^n - 1, 2^n + 1, 2^n, 2^{n+1} + 1\}$ which uses a front-end RNS to binary converter for the moduli set $\{2^n - 1, 2^n + 1, 2^n\}$ and then uses MRC to include the fourth modulus. Both ROM-based and non-ROM-based solutions have been given.
Hosseinzadeh et al. [55] have suggested an improvement for the converter of Ananda Mohan and Premkumar [53] for the moduli set $\{2^n - 1, 2^n + 1, 2^n, 2^{n+1} - 1\}$ for reducing the conversion delay at the expense of area. They suggest using (n + 1)-bit adders in place of the (3n + 1)-bit CPA to compute the three parts of the final result. They do not perform the final addition of the output of the multiplier that evaluates the product of $(x_4 - X_a)$ and the multiplicative inverse modulo $(2^{n+1} - 1)$, where $X_a$ is the decoded output corresponding to the moduli set $\{2^n - 1, 2^n + 1, 2^n\}$, but preserve it as two carry and sum output vectors and compute the final output.
Sousa et al. [56] have described an RNS to binary converter for the moduli set $\{2^n + 1, 2^n - 1, 2^n, 2^{n+1} + 1\}$. They have used two-level MRC. In the first level, reverse conversion using MRC for moduli sets $\{x_1, x_2\} = \{2^n + 1, 2^n - 1\}$ and $\{x_4, x_3\} = \{2^{n+1} + 1, 2^n\}$ is performed and the decoded words $X_{12}$, $X_{34}$ are obtained. Note that the various multiplicative inverses are
\[ \left| \frac{1}{x_1} \right|_{x_2} = 2^{n-1}, \quad \left| \frac{1}{x_4} \right|_{x_3} = 1 \quad \text{and} \quad \left| \frac{1}{m_3 m_4} \right|_{m_1 m_2} = \sum_{i=0}^{\frac{n-3}{2}} 2^{2i+1} + \sum_{i=\frac{n-1}{2}}^{n-1} 2^{2i+2}. \]
Since the architecture uses MRC, it can be pipelined. The multiplication with multiplicative inverses mod $(2^n - 1)$, mod $2^n$, and mod $(2^{2n} - 1)$ can be easily performed. The resulting area is more than that of the Ananda Mohan and Premkumar converter [53], whereas the conversion time is less.
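The three inverses can be confirmed numerically for odd n. The sketch below assumes the moduli $2^n + 1$, $2^n - 1$, $2^n$, $2^{n+1} + 1$ and compares the stated closed forms with inverses computed by Python's built-in `pow`; the function name is hypothetical.

```python
def sousa_inverses_hold(n):
    """Check the closed-form inverses for {2^n+1, 2^n-1, 2^n, 2^(n+1)+1}, n odd."""
    p1, p2, p3, p4 = 2**n + 1, 2**n - 1, 2**n, 2**(n + 1) + 1
    modulus = p1 * p2                                   # 2^(2n) - 1
    closed = (sum(2**(2 * i + 1) for i in range((n - 1) // 2))      # i = 0..(n-3)/2
              + sum(2**(2 * i + 2) for i in range((n - 1) // 2, n)))  # i = (n-1)/2..n-1
    return (pow(p1, -1, p2) == 2**(n - 1)
            and pow(p4, -1, p3) == 1
            and pow(p3 * p4, -1, modulus) == closed % modulus)


print([sousa_inverses_hold(n) for n in (3, 5, 7, 9)])
```

Each power of two in the closed form corresponds to one rotated copy of the operand in a modulo $(2^{2n} - 1)$ multi-operand adder, which is why the multiplication is regarded as simple.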
Cao et al. [54] have described reverse converters for the two four moduli sets $\{2^n + 1, 2^n - 1, 2^n, 2^{n+1} - 1\}$ and $\{2^n + 1, 2^n - 1, 2^n, 2^{n-1} - 1\}$, both for n even. They use a front-end RNS to binary converter due to Wang et al. [14] for the three moduli set to obtain the decoded word $X_1$ and use MRC later to include the fourth modulus $m_4$ (i.e. $(2^{n+1} - 1)$ or $(2^{n-1} - 1)$). The authors suggest three-stage and four-stage converters which differ in the way the MRC in the second level is performed. In the three-stage converter considering the first moduli set, the second stage computes
\[ Z = \left| \frac{1}{2^n (2^{2n} - 1)} (x_4 - X_1) \right|_{2^{n+1}-1} \tag{5.33a} \]
and the third stage computes $X = X_1 + 2^n (2^{2n} - 1) Z$. Noting that $\left| \frac{1}{2^n (2^{2n} - 1)} \right|_{2^{n+1}-1} = \left| \frac{2^{n+2} - 10}{3} \right|_{2^{n+1}-1}$, the authors realize Z as
\[ Z = \left| \frac{2^{n+2} - 10}{3} (x_4 - X_1) \right|_{2^{n+1}-1} = |SQ|_{2^{n+1}-1} \tag{5.33b} \]
where $S = \left| \frac{1}{3} \right|_{2^{n+1}-1}$ and $Q = \left| (2^{n+2} - 10)(x_4 - X_1) \right|_{2^{n+1}-1}$. Note that S can be realized as
\[ S = \left| \frac{1}{3} \right|_{2^{n+1}-1} = 2^0 + 2^2 + 2^4 + \cdots + 2^n. \]
Thus, Z can be computed as the sum of shifted and rotated versions of Q available in carry save form using a tree of CSA with end-around-carry. In the four-stage converter, the sum and carry vectors realizing Q are first added in a mod $(2^{n+1} - 1)$ adder and then multiplied with S realized by summing shifted and rotated terms. The same technique has been used for the other moduli set as well.
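The factorisation of the multiplicative inverse into S and Q can be illustrated in Python. In this sketch the front-end three-moduli converter is replaced by a plain `X % M3`, which is an assumption made purely to keep the example short; n is even, and the function names are hypothetical.

```python
def cao_second_level(x123, x4, n):
    """Include the fourth modulus 2^(n+1)-1 by MRC, using Z = |S*Q| as in (5.33b)."""
    m4 = 2**(n + 1) - 1
    s = sum(2**k for k in range(0, n + 1, 2))         # S = 2^0 + 2^2 + ... + 2^n
    q = ((2**(n + 2) - 10) * (x4 - x123)) % m4        # Q
    z = (s * q) % m4
    return x123 + 2**n * (2**(2 * n) - 1) * z


def cao_check(n):
    m3_product = 2**n * (2**(2 * n) - 1)              # product of the first three moduli
    m4 = 2**(n + 1) - 1
    s = sum(2**k for k in range(0, n + 1, 2))
    return ((3 * s) % m4 == 1 and
            all(cao_second_level(x % m3_product, x % m4, n) == x
                for x in range(m3_product * m4)))


print(cao_check(2), cao_check(4))
```

Since S has n/2 + 1 non-zero bits, the product SQ modulo $(2^{n+1} - 1)$ reduces to adding that many rotated copies of Q.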
The reverse converters for the four moduli set $\{2^n - 1, 2^n + 1, 2^n - 3, 2^n + 3\}$ have also been described which use ROMs and combinational logic [48, 57–59]. The designs in [48, 57, 58] consider in the first level two 2-moduli sets $\{2^n - 3, 2^n + 1\}$ and $\{2^n + 3, 2^n - 1\}$ to compute the decoded numbers $X_a$ and $X_b$ respectively using MRC. Sheu et al. [57] use a ROM-based approach. In the design in [58], the Montgomery algorithm is used to perform the multiplication with the multiplicative inverse needed in MRC. This takes advantage of the fact that $\left| \frac{1}{m_2} \right|_{m_1} = \left| \frac{1}{4} \right|_{m_1}$ where $m_1 = 2^n - 3$ and $m_2 = 2^n + 1$. Thus, computing $\left| \frac{|x_1 - x_2|_{m_1}}{4} \right|_{m_1}$ implies adding a multiple of $m_1$ to $|x_1 - x_2|_{m_1}$ to make the two LSBs zero so that division by 4 implies ignoring the two LSBs. In the case of computation of $X_b$, $\left| \frac{1}{m_3} \right|_{m_4} = \left| \frac{1}{4} \right|_{m_4} = 2^{n-2}$ where $m_3 = 2^n + 3$ and $m_4 = 2^n - 1$. The multiplication with $2^{n-2}$ mod $(2^n - 1)$ can be carried out in a simple manner by bit rotation of $|x_3 - x_4|_{m_4}$. In the case of MRC in the second level, note that $\left| \frac{1}{m_3 m_4} \right|_{m_1 m_2} = \left| \frac{1}{2^{n+2}} \right|_{m_1 m_2}$, enabling the Montgomery technique to be used easily.
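The division by 4 described above can be sketched in Python. The helper below adds the multiple $k m_1$, with k in {0, 1, 2, 3}, that clears the two LSBs and then shifts right by two places. The function names are hypothetical, and the pair decoder assumes the usual MRC form $X_a = x_2 + m_2 \left| (x_1 - x_2)/m_2 \right|_{m_1}$ for the pair $\{2^n - 3, 2^n + 1\}$.

```python
def div4_mod(v, m):
    """Return v/4 mod m for odd m and 0 <= v < m using only an add and a shift."""
    k = (-v * pow(m, -1, 4)) % 4        # multiple of m that zeroes the two LSBs
    return (v + k * m) >> 2             # already below m, no further reduction


def decode_pair(x1, x2, n):
    """MRC for {m1, m2} = {2^n - 3, 2^n + 1}, using 1/m2 = 1/4 (mod m1)."""
    m1, m2 = 2**n - 3, 2**n + 1
    return x2 + m2 * div4_mod((x1 - x2) % m1, m1)


print(decode_pair(100 % 13, 100 % 17, 4))   # 100
```

In hardware the value of k depends only on the two LSBs of the operand, so it can be selected by a very small amount of logic.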
In [58], MRC using ROMs and CRT using ROMs also have been explored. In MRC techniques, modulo subtractions are realized using logic, whereas multiplication with the multiplicative inverse is carried out using ROMs. In the CRT-based method, the various $M_i \left| \frac{1}{M_i} \right|_{m_i}$ values are stored in ROM. A carry-save-adder followed by a CPA and a modulo reduction stage are used to compute the decoded result.
Jaberipur and Ahmadifar [59] have described a ROM-less adder-only reverse converter for this moduli set. They consider a two-stage converter. The first stage performs mixed radix conversion corresponding to the two pairs of moduli $\{2^n - 1, 2^n + 1\}$ and $\{2^n - 3, 2^n + 3\}$ to obtain residues corresponding to the pair of composite moduli $\{2^{2n} - 1, 2^{2n} - 9\}$. The multiplicative inverses needed are as follows:
\[ \left| \frac{1}{2^n - 1} \right|_{2^n + 1} = 2^{n-1}, \quad \left| \frac{1}{2^n + 3} \right|_{2^n - 3} = -\left( 2^{n-3} + 2^{n-5} + \cdots + 2^3 + 2 \right) \text{ for } n \text{ even}, \]
\[ \left| \frac{1}{2^n + 3} \right|_{2^n - 3} = 2^{n-3} + 2^{n-5} + \cdots + 2^2 + 2^0 \text{ for } n \text{ odd}, \quad \left| \frac{1}{2^{2n} - 9} \right|_{2^{2n} - 1} = -2^{2n-3}. \]
The decoded words in the first and second stages can be easily obtained using multi-operand addition of circularly shifted words.
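These inverses can be checked with a short Python sketch. It assumes that the inverse of $2^n + 3$ modulo $2^n - 3$ is the negative of the sum of odd powers of two for n even and the sum of even powers of two for n odd, and that the inverse of $2^{2n} - 9$ modulo $2^{2n} - 1$ is $-2^{2n-3}$; both signs are what the arithmetic requires, since $2^n + 3 \equiv 6$ and $2^{2n} - 9 \equiv -8$ for the respective moduli. The function names are hypothetical.

```python
def inverse_closed_form(n):
    """Closed form for |1/(2^n + 3)| modulo 2^n - 3."""
    m = 2**n - 3
    if n % 2 == 0:
        s = -sum(2**k for k in range(1, n - 2, 2))    # -(2 + 2^3 + ... + 2^(n-3))
    else:
        s = sum(2**k for k in range(0, n - 2, 2))     # 2^0 + 2^2 + ... + 2^(n-3)
    return s % m


def jaberipur_inverses_hold(n):
    big = 2**(2 * n) - 1
    return (pow(2**n - 1, -1, 2**n + 1) == 2**(n - 1)
            and pow(2**n + 3, -1, 2**n - 3) == inverse_closed_form(n)
            and pow(2**(2 * n) - 9, -1, big) == (-2**(2 * n - 3)) % big)


print([jaberipur_inverses_hold(n) for n in range(4, 10)])
```

A negated sum of powers of two still maps to adder-only hardware, because negation modulo $2^{2n} - 1$ is a bitwise complement.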
Patronik and Piestrak [60] have considered residue to binary conversion for a new moduli set $\{m_1, m_2, m_3, m_4\} = \{2^n + 1, 2^n, 2^n - 1, 2^{n-1} + 1\}$ for n odd. They have described two converters. The first converter is based on MRC of a two moduli set $\{m_1 m_2 m_3, m_4\}$. This uses the Wang et al. converter [12] for the three moduli set to obtain the number $X_1$ in the moduli set $\{m_1, m_2, m_3\}$. The multiplicative inverse needed in MRC is
\[ \left| \frac{1}{2^n (2^{2n} - 1)} \right|_{2^{n-1}+1} = -k_1 = -\left( \sum_{i=0}^{\frac{n-3}{2} - 1} 2^{2i+1} + 1 \right) \tag{5.34} \]
Note that since the lengths of residues corresponding to the moduli $m_1 m_2 m_3$ and $m_4$ are different, the operation $(x_4 - X_1)$ mod $(2^{n-1} + 1)$ needs to be carried out using periodic properties of residues. The multiplication with the multiplicative inverse in (5.34) needs circular left shifts, one's complementing of bits arriving in LSBs due to circular shift and addition of all these modified partial products with a correction term using several CSA stages. Note that mod $(2^{n-1} + 1)$ addition needs correction to cater for inverting the carry and adding in the LSB position. The number of partial products can be seen to be $\left( \frac{n-3}{2} + 2 \right)$. The final computation of $X_1 + m_1 m_2 m_3 \left| -k_1 (x_4 - X_1) \right|_{m_4}$ can be rearranged to take advantage of the fact that LSBs of the decoded word are already available as $x_3$.
The second converter uses two-stage conversion comprising of moduli sets $\{m_1 m_2, m_3 m_4\}$ using MRC. The numbers corresponding to moduli sets $m_1 m_2$ and $m_3 m_4$ are obtained using CRT and MRC respectively in the first stage. The various multiplicative inverses used in CRT and MRC in this stage are as follows:
\[ \left| \frac{1}{2^n + 1} \right|_{2^n - 1} = 2^{n-1}, \quad \left| \frac{1}{2^n - 1} \right|_{2^n + 1} = 2^{n-1}, \quad \left| \frac{1}{2^{n-1} + 1} \right|_{2^n} = 2^{n-1} + 1 \tag{5.35a} \]
The multiplicative inverse needed in MRC in the second stage is
\[ \left| \frac{1}{2^n (2^{n-1} + 1)} \right|_{2^{2n}-1} = \left| \frac{1}{2^n} \left( \sum_{i=0}^{\frac{n-3}{2}} \left( 2^{2i+2} + 2^{2i+n+2} \right) + 2 \right) \right|_{2^{2n}-1} \tag{5.35b} \]
The multiplication with this multiplicative inverse mod $(2^{2n} - 1)$ can be obtained by using a multi-operand carry-save-adder mod $(2^{2n} - 1)$ which can yield sum and carry vectors RC and RS. Two versions of the second converter have been presented which differ in the second stage.
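The inverses used by the two converters can be verified for odd n with the following Python sketch. It assumes the readings $\left|1/(2^n(2^{2n}-1))\right|_{2^{n-1}+1} = -k_1$ for (5.34), three pairwise inverses for (5.35a), and a $1/2^n$-scaled sum of powers of two for (5.35b); the function name is hypothetical.

```python
def patronik_inverses_hold(n):
    """Compare closed-form inverses for {2^n+1, 2^n, 2^n-1, 2^(n-1)+1}, n odd."""
    m4 = 2**(n - 1) + 1
    big = 2**(2 * n) - 1

    # (5.34): k1 = sum_{i=0}^{(n-3)/2 - 1} 2^(2i+1) + 1
    k1 = sum(2**(2 * i + 1) for i in range((n - 3) // 2)) + 1
    ok_534 = pow(2**n * big, -1, m4) == (-k1) % m4

    # (5.35a): first-stage inverses
    ok_535a = (pow(2**n + 1, -1, 2**n - 1) == 2**(n - 1)
               and pow(2**n - 1, -1, 2**n + 1) == 2**(n - 1)
               and pow(m4, -1, 2**n) == 2**(n - 1) + 1)

    # (5.35b): second-stage inverse modulo 2^(2n) - 1
    t = sum(2**(2 * i + 2) + 2**(2 * i + n + 2)
            for i in range((n - 3) // 2 + 1)) + 2
    closed = (pow(2**n, -1, big) * t) % big
    ok_535b = pow(2**n * m4, -1, big) == closed

    return ok_534 and ok_535a and ok_535b


print([patronik_inverses_hold(n) for n in (3, 5, 7, 9)])
```

The factor $1/2^n$ in (5.35b) is itself a power of two modulo $2^{2n} - 1$, so it only rotates the partial products by n positions.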
Didier and Rivaille [61] have described a two-stage RNS to binary converter for moduli specially chosen to simplify the converter using ROMs. They suggest choosing pairs of moduli with a difference of power of two and difference between products of pairs of moduli being powers of two. Specifically, the set is of the type $\{m_1, m_2, m_3, m_4\} = \{m_1, m_1 + 2^{p_1}, m_3, m_3 + 2^{p_2}\}$ such that $m_1 m_2 - m_3 m_4 = 2^{pp}$ where pp is an integer. In the first stage, the decoded numbers corresponding to residues of $\{m_1, m_2\}$ and $\{m_3, m_4\}$ can be found and in the second stage, the decoded number corresponding to the moduli set $\{m_1 m_2, m_3 m_4\}$ can be found. The basic converter for the two moduli set $\{m_1, m_2\}$ can be realized using one addition without needing any modular reduction. Denoting the residues as $(r_1, r_2)$, the decoded number $B_1$ can be written as $B_1 = r_2 + (r_1 - r_2, 0)$ where the second term corresponds to the binary number corresponding to $(r_1 - r_2, 0)$. Since $r_1 - r_2$ can be negative, it can be written as an α-bit two's complement number with a sign bit S and (α − 1) remaining bits. The authors suggest that the decoded number be obtained using a look-up table T addressed by the sign bit and p LSBs, where $m_2 - m_1 = 2^p$, and using the addition operation as follows:
\[ B_1 = r_2 + m_2 \times \mathrm{MSB}(r_1 - r_2)_{p}^{\alpha-1} + T\left( \mathrm{sign}(r_1 - r_2), \mathrm{LSB}(r_1 - r_2)_{0}^{p-1} \right) \tag{5.36} \]
Some of the representative moduli sets are {7, 9, 5, 13}, {23, 39, 25, 41}, {127, 129, 113, 145} and {511, 513, 481, 545}. As an illustration, the implementation for the RNS {511, 513, 481, 545} needs $170A_{FA}$, 2640 bits of ROM and needs a conversion time of $78\Delta_{FA} + 2\Delta_{ROM}$ where $\Delta_{FA}$ is the delay of a full adder and $\Delta_{ROM}$ is ROM access time.
We next consider four moduli sets with dynamic range (DR) of the order of 5n and 6n bits. The four moduli set $\{2^n, 2^n - 1, 2^n + 1, 2^{2n} + 1\}$ [62] is attractive since New CRT-I-based reduction can be easily carried out. However, the bit length of one modulus is double that of the other three moduli. Note that this moduli set can be considered to be derived from $\{2^{2n} - 1, 2^{2n}, 2^{2n} + 1\}$ [48, 49].
The reverse converters for the moduli set $\{2^n - 1, 2^n + 1, 2^{2n+1} - 1, 2^n\}$ with DR of about (5n + 1) bits and $\{2^n - 1, 2^n + 1, 2^{2n}, 2^{2n} + 1\}$ with a DR of about 6n bits based on New CRT II and New CRT I respectively have been described in [63]. In the first case, MRC is used for the two two-moduli sets $\{m_1, m_2\} = \{2^n, 2^{2n+1} - 1\}$ and $\{m_3, m_4\} = \{2^n + 1, 2^n - 1\}$ to compute Z and Y. A second MRC stage computes X from Y and Z:
\[ Z = x_1 + 2^n \left| 2^{n+1} (x_2 - x_1) \right|_{2^{2n+1}-1} \tag{5.37a} \]
\[ Y = x_3 + (2^n + 1) \left| 2^{n-1} (x_4 - x_3) \right|_{2^n-1} \tag{5.37b} \]
\[ X = Z + 2^n \left( 2^{2n+1} - 1 \right) \left| 2^n (Y - Z) \right|_{2^{2n}-1} \tag{5.37c} \]
Due to the modulo reductions which are convenient, the hardware can be simpler.
In the case of the moduli set $\{m_1, m_2, m_3, m_4\} = \{2^n - 1, 2^n + 1, 2^{2n}, 2^{2n} + 1\}$, New CRT-I has been used. The decoded number in this case is given by
\[ X = x_1 + 2^{2n} \left| 2^{2n} (x_2 - x_1) + 2^{2n-1} \left( 2^{2n} + 1 \right) (x_3 - x_2) + 2^{n-2} \left( 2^{2n} + 1 \right) (2^n + 1)(x_4 - x_3) \right|_{2^{4n}-1} \tag{5.38} \]
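Both decoding formulas of [63] can be exercised by brute force for small n. In the sketch below, the first function assumes residues ordered as for $\{2^n, 2^{2n+1} - 1, 2^n + 1, 2^n - 1\}$. The second function assumes that in the New CRT-I expression the residues $x_1, \ldots, x_4$ correspond to $2^{2n}$, $2^{2n} + 1$, $2^n + 1$, $2^n - 1$ in that order, which is the ordering under which the leading factor $2^{2n}$ and the modulus $2^{4n} - 1$ are consistent. The function names are hypothetical.

```python
def decode_two_level_mrc(x1, x2, x3, x4, n):
    """(5.37): residues for (2^n, 2^(2n+1)-1, 2^n+1, 2^n-1)."""
    m2 = 2**(2 * n + 1) - 1
    z = x1 + 2**n * ((2**(n + 1) * (x2 - x1)) % m2)
    y = x3 + (2**n + 1) * ((2**(n - 1) * (x4 - x3)) % (2**n - 1))
    return z + 2**n * m2 * ((2**n * (y - z)) % (2**(2 * n) - 1))


def decode_new_crt_i(x1, x2, x3, x4, n):
    """(5.38): residues assumed for (2^(2n), 2^(2n)+1, 2^n+1, 2^n-1), n >= 2."""
    a = 2**(2 * n)
    inner = (a * (x2 - x1)
             + 2**(2 * n - 1) * (a + 1) * (x3 - x2)
             + 2**(n - 2) * (a + 1) * (2**n + 1) * (x4 - x3)) % (2**(4 * n) - 1)
    return x1 + a * inner


def roundtrip(decoder, moduli, n, step=1):
    total = 1
    for m in moduli:
        total *= m
    return all(decoder(*(x % m for m in moduli), n) == x
               for x in range(0, total, step))


n = 2
print(roundtrip(decode_two_level_mrc, (2**n, 2**(2*n+1) - 1, 2**n + 1, 2**n - 1), n))
print(roundtrip(decode_new_crt_i, (2**(2*n), 2**(2*n) + 1, 2**n + 1, 2**n - 1), n))
```

All constants multiplying the residue differences are powers of two or products of the form $2^a(2^b + 1)$, so each modular multiplication reduces to rotations and additions.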
Zhang and Siy [64] have described an RNS to binary converter for the moduli set $\{2^n - 1, 2^n + 1, 2^{2n} - 2, 2^{2n+1} - 3\}$ with a DR of about (6n + 1) bits. They consider two-level MRC using the two moduli sets $\{m_1 = 2^n - 1, m_2 = 2^n + 1\}$ and $\{m_3 = 2^{2n} - 2, m_4 = 2^{2n+1} - 3\}$. The multiplicative inverses are very simple:
\[ \left| \frac{1}{m_2} \right|_{m_1} = 2^{n-1}, \quad \left| \frac{1}{m_4} \right|_{m_3} = 1, \quad \left| \frac{1}{m_3 m_4} \right|_{m_1 m_2} = 1 \tag{5.39} \]
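With the inverses of (5.39), the two-level MRC needs only one multiplication by a power of two. The following Python sketch assumes the standard MRC form at each level (second residue plus second modulus times the scaled difference); the function name is hypothetical.

```python
def zhang_siy_decode(x1, x2, x3, x4, n):
    """Two-level MRC for {2^n-1, 2^n+1, 2^(2n)-2, 2^(2n+1)-3}."""
    m1, m2 = 2**n - 1, 2**n + 1
    m3, m4 = 2**(2 * n) - 2, 2**(2 * n + 1) - 3
    x12 = x2 + m2 * ((2**(n - 1) * (x1 - x2)) % m1)   # inverse of m2 mod m1 is 2^(n-1)
    x34 = x4 + m4 * ((x3 - x4) % m3)                  # inverse of m4 mod m3 is 1
    return x34 + m3 * m4 * ((x12 - x34) % (m1 * m2))  # inverse of m3*m4 is 1


print(zhang_siy_decode(1000 % 3, 1000 % 5, 1000 % 14, 1000 % 29, 2))   # 1000
```

Two of the three modular multiplications disappear because the corresponding inverses equal 1, leaving modular subtractions only.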
Sousa and Antao [65] have described MRC-based RNS to binary converters for the moduli sets $\{2^n + 1, 2^n - 1, 2^n, 2^{2n+1} - 1\}$ and $\{2^n - 1, 2^n + 1, 2^{2n}, 2^{2n+1} - 1\}$. They consider in the first level $\{x_1, x_2\} = \{2^n - 1, 2^n + 1\}$ and $\{x_3, x_4\} = \{2^{n(1+\alpha)}, 2^{2n+1} - 1\}$, where α = 0, 1 correspond to the two moduli sets, to compute $X_{12}$ and $X_{34}$ respectively. The multiplicative inverses in the first level are $\left| \frac{1}{2^n + 1} \right|_{2^n - 1} = 2^{n-1}$ and $\left| \frac{1}{2^{2n+1} - 1} \right|_{2^{n(1+\alpha)}} = 2^{(1+\alpha)n} - 1$, and in the second level are $\left| \frac{1}{2^{3n+1} - 2^n} \right|_{2^{2n}-1} = 2^n$ for α = 0 and $\left| \frac{1}{2^{4n+1} - 2^{2n}} \right|_{2^{2n}-1} = 1$ for α = 1.
Note that all modulo operations are mod $(2^n - 1)$, $2^{(1+\alpha)n}$ and $2^{2n} - 1$ which are convenient to realize. The authors use $X_{12}$ and $X_{34}$ in carry save form for computing $\left| X_{12} - X_{34} \right|_{2^{2n}-1}$, thus reducing the critical path.
Stamenkovic and Jovanovic [66] have described a reverse converter for the four moduli set $\{2^n - 1, 2^n, 2^n + 1, 2^{2n+1} - 1\}$. They have suggested exploring the 24 possible orderings of the moduli for being used in MRC so that the multiplicative inverses are ±1 and $2^{n-1}$. The recommended ordering is $\{2^{2n+1} - 1, 2^n, 2^n + 1, 2^n - 1\}$. This leads to MRC using only subtractors and not needing modulo multiplications. They have not, however, presented the details of hardware requirement and conversion delay.
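The search over orderings is easy to reproduce. The Python sketch below enumerates the 24 permutations and keeps those for which every inverse $\left|1/m_i\right|_{m_j}$ with i < j, the set used by a conventional MRC, equals 1, −1 or $2^{n-1}$ modulo $m_j$. This acceptance criterion and the function name are assumptions made for illustration.

```python
from itertools import permutations


def simple_orderings(n):
    """Orderings of {2^n-1, 2^n, 2^n+1, 2^(2n+1)-1} with trivial MRC inverses."""
    moduli = (2**n - 1, 2**n, 2**n + 1, 2**(2 * n + 1) - 1)
    good = []
    for order in permutations(moduli):
        ok = True
        for i in range(4):
            for j in range(i + 1, 4):
                inv = pow(order[i], -1, order[j])
                allowed = (1, order[j] - 1, 2**(n - 1) % order[j])
                if inv not in allowed:
                    ok = False
        if ok:
            good.append(order)
    return good


n = 3
recommended = (2**(2 * n + 1) - 1, 2**n, 2**n + 1, 2**n - 1)
print(recommended in simple_orderings(n))
```

For the recommended ordering the six inverses are −1, 1, 1, −1, 1 and $2^{n-1}$, so each MRC step reduces to a subtraction, possibly followed by a rotation.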
The reverse converter for the five moduli set [67] $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1, 2^{n-1} - 1\}$ for n even uses in the first level the converter for the four moduli set $\{2^n - 1, 2^n, 2^n + 1, 2^{n+1} - 1\}$ due to [54] and then uses MRC to include the fifth modulus $(2^{n-1} - 1)$.
Hiasat [68] has described reverse converters for two five moduli sets based on CRT: $\{2^n, 2^n - 1, 2^n + 1, 2^n - 2^{\frac{n+1}{2}} + 1, 2^n + 2^{\frac{n+1}{2}} + 1\}$ when n is odd and n ≥ 5, and $\{2^{n+1}, 2^n - 1, 2^n + 1, 2^n - 2^{\frac{n+1}{2}} + 1, 2^n + 2^{\frac{n+1}{2}} + 1\}$ when n is odd and n ≥ 7. Note that this moduli set uses the factored form of the two moduli $(2^{2n} - 1)$ and $(2^{2n} + 1)$ in the moduli set $\{2^n, 2^{2n} - 1, 2^{2n} + 1\}$. The reverse conversion procedure is similar to the Andraros and Ahmad technique [4] of evaluating the 4n MSBs since n LSBs of the decoded result are already available. The architecture needs addition of eight 4n-bit words using a 4n-bit CSA with EAC followed by a 4n-bit CPA with EAC or a modulo $(2^{4n} - 1)$ adder using parallel prefix architectures.
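The factored form can be confirmed directly: the two conjugate moduli multiply to $2^{2n} + 1$, and the pair $2^n \mp 1$ multiplies to $2^{2n} - 1$. The short Python sketch below builds the set for a given odd n and checks these products together with pairwise coprimality; the function name and the `extra` parameter (0 for the first set, 1 for the second) are illustrative.

```python
from math import gcd
from itertools import combinations


def hiasat_set(n, extra=0):
    """Five moduli set with first modulus 2^(n+extra); n must be odd."""
    r = 2**((n + 1) // 2)
    return (2**(n + extra), 2**n - 1, 2**n + 1, 2**n - r + 1, 2**n + r + 1)


def is_factored_form(n, extra=0):
    m = hiasat_set(n, extra)
    return (m[1] * m[2] == 2**(2 * n) - 1
            and m[3] * m[4] == 2**(2 * n) + 1
            and all(gcd(a, b) == 1 for a, b in combinations(m, 2)))


print(hiasat_set(5), is_factored_form(5), is_factored_form(7, extra=1))
```

Because the product of the four odd moduli is $2^{4n} - 1$, the CRT sum can be reduced with end-around-carry adders of width 4n.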
Skavantzos and Stouraitis [69] and Skavantzos and Abdallah [70] have suggested general converters for moduli products of the form $2^a(2^b - 1)$ where $2^b - 1$ is made up of several conjugate moduli pairs such as $(2^n - 1)$, $(2^n + 1)$ or $\left( 2^n + 2^{\frac{n+1}{2}} + 1 \right)$, $\left( 2^n - 2^{\frac{n+1}{2}} + 1 \right)$. The reverse converter for conjugate moduli is quite simple, needing rotation of bits, one's complementing and addition using modulo $(2^{4n} - 1)$ adders or modulo $(2^{2n} - 1)$ adders. The authors suggest two-level converters which will find the final binary number using MRC corresponding to the intermediate residues. The first level converter uses CRT, whereas the second level uses MRC. The four moduli sets $\{2^{n+1}, 2^n - 1, 2^{n+1} - 1, 2^{n+1} + 1\}$ for n odd and $\{2^n, 2^n - 1, 2^{n-1} - 1, 2^{n-1} + 1\}$ for n odd, the five moduli sets $\{2^{n+1}, 2^n - 1, 2^n + 1, 2^{n+1} - 1, 2^{n+1} + 1\}$ and $\{2^n, 2^n - 1, 2^n + 1, 2^n + 2^{(n+1)/2} + 1, 2^n - 2^{(n+1)/2} + 1\}$, and the RNS with seven moduli $\{2^{n+3}, 2^n - 1, 2^n + 1, 2^{n+2} - 1, 2^{n+2} + 1, 2^{n+2} + 2^{(n+3)/2} + 1, 2^{n+2} - 2^{(n+3)/2} + 1\}$ have been suggested. Other RNS with only pairs of conjugate moduli up to 8 moduli also have been suggested.
Note that care must be taken to see that the moduli are relatively prime. Note that in case of one common factor existing among the two sets of moduli, this should be taken into account in the application of CRT in the second level converter.
Pettenghi et al. [71] have described general RNS to binary converters for the moduli sets $\{2^{n+\beta}, 2^n - 1, 2^n + 1, 2^n + k_1, 2^n - k_1\}$ and $\{2^{n+\beta}, 2^n \pm 1, 2^n \pm k_1, 2^n \pm k_2, \ldots, 2^n \pm k_f\}$ using CRT. In the case of the first moduli set, they compute $\left\lfloor X/m_1 \right\rfloor$, where $m_1 = 2^{n+\beta}$, as $\left\lfloor X/m_1 \right\rfloor = \left| \sum_{i=1}^{5} V_i x_i \right|_{M_1}$, where $V_i = \frac{M_i}{m_1} \left| \frac{1}{M_i} \right|_{m_i}$ for i = 2, . . ., 5, which are integers since $m_1$ divides $M_i$ exactly. On the other hand, in case of $V_1$, we have
\[ V_1 = \left| \frac{1}{M_1} \right|_{m_1} \left( 2^{3n-\beta} - 2^{n-\beta} \left( k_1^2 + 1 \right) \right) + \psi \tag{5.40a} \]
where ψ is defined as
\[ k_1^2 \left| \frac{1}{M_1} \right|_{m_1} = \psi m_1 + 1 \tag{5.40b} \]
Note that the fractional part in the computation of $\left\lfloor X/m_1 \right\rfloor$ can be removed using this technique. As an illustration, for $m_1 = 2^{n+\beta}$, $k_1 = 3$, β = n = 4, $m_1 = 256$, $m_2 = 15$, $m_3 = 17$, $m_4 = 13$, $m_5 = 19$, we have ψ = 2, $\left| \frac{1}{M_1} \right|_{m_1} = 57$ and $V_1 = 14{,}024$, $V_2 = 58{,}786$, $V_3 = 59{,}280$, $V_4 = 43{,}605$ and $V_5 = 13{,}260$. Note that the technique can be extended to the case of additional moduli pairs with different $k_1$, $k_2$, etc.
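The weights of the illustration can be recomputed in Python. The sketch below takes $m_1 = 2^8 = 256$, which is the value of the power-of-two modulus for which the listed $V_i$, ψ = 2 and the inverse 57 are all reproduced with the moduli 15, 17, 13 and 19. It assumes that the weighted sum is reduced modulo $M_1 = M/m_1$ and that $V_1 = (M_1 \left|1/M_1\right|_{m_1} - 1)/m_1$; the function name is hypothetical.

```python
from math import prod


def floor_weights(moduli):
    """Weights V_i such that floor(X/m1) = (sum V_i x_i) mod M1, with m1 = moduli[0]."""
    M = prod(moduli)
    m1 = moduli[0]
    Mi = [M // m for m in moduli]
    inv = [pow(Mi[i], -1, m) for i, m in enumerate(moduli)]
    v = [(Mi[0] * inv[0] - 1) // m1]                   # V_1: the division is exact
    v += [(Mi[i] // m1) * inv[i] for i in range(1, len(moduli))]
    return v


moduli = (256, 15, 17, 13, 19)
V = floor_weights(moduli)
print(V)                                               # [14024, 58786, 59280, 43605, 13260]

M1 = prod(moduli[1:])
X = 1234567
print(sum(v * (X % m) for v, m in zip(V, moduli)) % M1 == X // 256)
```

The n + β LSBs of X are simply $x_1$, so only the integer part computed here needs the multi-operand modulo addition.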
Skavantzos et al. [72] have suggested, in case of the balanced eight moduli RNS using the moduli set $\{m_1, m_2, m_3, m_4, m_5, m_6, m_7, m_8\} = \{2^{n-5} - 1, 2^{n-3} - 1, 2^{n-3} + 1, 2^{n-2} + 1, 2^{n-1} - 1, 2^{n-1} + 1, 2^n, 2^n + 1\}$, four first level converters comprising of moduli $\{2^{n-3} - 1, 2^{n-3} + 1\}$, $\{2^{n-5} - 1, 2^{n-2} + 1\}$, $\{2^{n-1} - 1, 2^{n-1} + 1\}$, $\{2^n, 2^n + 1\}$ to obtain the results B, D, C and E respectively. The computation of
\[ D = x_4 + m_4 X'_1 \tag{5.41a} \]
where
\[ X'_1 = \left| \frac{1}{2^{n-2} + 1} (x_1 - x_4) \right|_{2^{n-5}-1} \tag{5.41b} \]
needs a multi-operand modulo $(2^{n-5} - 1)$ CSA tree followed by a modulo $(2^{n-5} - 1)$ CPA. The computation of E is simpler, where
\[ E = x_8 + m_8 \left| \frac{1}{m_8} (x_7 - x_8) \right|_{m_7} \tag{5.42} \]
where $m_8 = 2^n + 1$ and $m_7 = 2^n$.
The second level converter takes the pairs {B, D} and {C, E} and evaluates the
corresponding numbers F and G respectively which also uses MRC which can
also be realized by multi-operand modulo $(2^{2n-6} - 1)$ CSA tree followed by a