The Pentium division Flaw - Chapter 3
The SRT division algorithm was named after the 3 scientists who
discovered it independently at about the same time: D. Sweeney of IBM,
J.E. Robertson of the University of Illinois, and T.D. Tocher of the
Imperial College of London.
The Pentium uses a Radix 4 SRT Division algorithm for floating point
divisions. It's simply the division we've been doing above, except
it's done in base 4, and the quotient digits we can choose are
{-2,-1,0,1,2}.
(I think the reason we don't include 3 and -3 as possible quotent
digits is because hardware to implement multiplies by anything
other than a power of 2 is very expensive both in chip area and
time. Multiplications by positive and negative powers of two are
just shifts and inverts. Having radix 4 and the possible quotient
digits of {-2,-1,0,1,2} meet the multiplication criteria nicely.)
Our iterative formula for base 4 is:
R[j+1] = 4 * (R[j] - q[j]*D) (3-1)
At the beginning of each iteration we must choose a quotient digit
which will result in the next remainder R[j+1] being
within the acceptable limits of:
_ _____
(D * 2.22222...) <= R[j+1] < (D * 2.22222...)
[base 4] [base 4]
_
where: 2 = lowest quotient digit we can use (-2)
2 = highest quotient digit we can use
Using a little calculus we can show that:
2.22222... (base 4) = 8/3 (decimal)
Proof: 2.22222... = 2*(x0 + x1 + x2 + x3 +...) [x = 1/4]
(base 4) (decimal)
Recall from calculus if |x| < 1 then:
1
(x0 + x1 + x2 + x3 +...) = -----
1 - x
1 8
= 2 * ------- = ---
1 - 1/4 3
Q.E.D.
Similarly, it can be shown that:
_ _____
2.22222... (base 4) = -8/3 (decimal)
Our restriction on the remainder R[j+1] thus becomes:
(D * -8/3) <= R[j+1] < (D * 8/3)
Substituting Eq. (3-1) for R[j+1] gives:
(-8/3)*D <= 4*(R[j] - qD) < (8/3)*D (3-2)
Given a remainder R[j] at the beginning of an iteration, we can
now determine what values of q will satisfy the above restraint.
FOR q=+2 (-8/3)*D <= 4*(R[j] - 2*D) < (8/3)*D
becomes: (4/3)*D <= R[j] < (8/3)*D
We may thus choose quotient digit 2 when our remainder R[j] is between
(4/3)*D and (8/3)*D.
FOR q=+1: (-8/3)*D <= 4*(R[j] - 1*D) < (8/3)*D
becomes: (1/3)*D <= R[j] < (5/3)*D
We may thus choose quotient digit 1 when our remainder R[j] is between
(1/3)*D and (5/3)*D.
FOR q=0: (-8/3)*D <= 4*(R[j] - 0*D) < (8/3)*D
becomes: (-2/3)*D <= R[j] < (2/3)*D
We may thus choose quotient digit 0 when our remainder R[j] is between
(-2/3)*D and (2/3)*D.
FOR q=-1: (-8/3)*D <=4*(R[j] - (-1)*D) < (8/3)*D
becomes: (-5/3)*D <= R[j] < (-1/3)*D
We may thus choose quotient digit -1 when our remainder R[j] is between
(-5/3)*D and (-1/3)*D.
FOR q=-2: (-8/3)*D < 4*(R[j] - (-2)*D) < (8/3)*D
becomes: (-8/3)*D < R[j] < (-4/3)*D
We may thus choose quotient digit -2 when our remainder R is between
(-8/3)*D and (-4/3)*D.
Notice there is some overlap in the choices of q:
If R is between (5/3)*D and (4/3)*D we can choose either q=2 or q=1.
If R is between (2/3)*D and (1/3)*D we can choose either q=1 or q=0.
If R is between (-1/3)*D and (-2/3)*D we can choose either q=0 or q=-1.
If R is between (-4/3)*D and (-5/3)*D we can choose either q=-1 or q=-2.
The above information leads to the following plot, called the Partial
remainder vs. Divisor plot, or P-D plot for short. Dividing the plot
up into cells generates a lookup table that we can use to aid us in
choosing the next quotient digit q given the current remainder R[j].
The divisions up the vertical axis are 0.001 [binary] apart,
representing R[j] values. The divisions across the horizontal axis are
0.0001 [binary] apart, representing Divisor values. (The divisor
values range from 1 to 1.11111..., since a normalized divisor mantissa
will always be in this range).
There are five red cells marked with a 0 near the top of the graph
running along the (8/3)D line. These are the error cells that should
specify a quotient digit of 2, but instead specify a quotient digit of
0. Selecting a quotient digit of 0 when R[j] is within this range is
unacceptable -- the digit value is too low, and no matter what
subsequent quotient digits are chosen the total quotient answer will
always be lower than the correct answer from this point on. This is
the Pentium flaw. Those five cells failed to get the proper value
loaded into them when Intel made the chip.
These 5 error cells are very rarely used being positioned where they
are and with less than half of each error cell actually below the (8/3)D
line. It is also impossible to hit any of these error cells early on
in the divison process, so even if an error cell is encountered the
generated result will still be close to the correct answer.
THE PENTIUM P-D PLOT:
Let's look at this graph for a moment. There are diagonal lines
(8/3)D, (5/3)D, (4/3)D, (2/3)D, (1/3)D, (-1/3)D, (-2/3)D, (-4/3)D,
(-5/3)D, (-8/3)D. These lines would pass through the origin if we
extended the graph to the left. These are the theoretical boundaries
discussed above.
Above (8/3)D is out of bounds. If we ever end up there,
something has gone wrong. The Pentium returns a value
of q=0 if we ever hit this area. |
|
Between (8/3)D and (5/3)D we must have q=2. Each cell which
is in or partly overlaps this area must have the value 2.
The five red error cells marked with have a value of 0
instead of 2. |
|
Between (5/3)D and (4/3)D we can have either q=2 or q=1. The
actual division line (as best as we know) is the staircase line drawn in brown.
Cells above that staircase shaded light blue have q=2,
cells below that staircase shaded yellow have q=1. Cells immediately
above the division line, marked with a '?', are cells which
might return either q=2 or q=1 depending upon other factors
that will be discussed in chapter 4. [2004—I apologize,
I no longer remember the reason for the '?' in these cells, and
chapter 4 doesn't explain them. Chapter 4 does explain why every
cell must return a value suitable also for the cell directly above it.
The dividing lines shown here I believe are from Tim Coe's Source Code.] |
|
Between (4/3)D and (2/3)D we must have q=1. Each cell which is
in or partly overlaps this area must have the value 1. |
|
Between (2/3)D and (1/3)D we can have either q=1 or q=0. The
actual division line (as best as we know) is the staircase
line drawn in brown. Cells above that staircase have q=1, cells
below that staircase have q=0. Cells immediately above the
division line, marked with a '?', are cells which might return
either q=1 or q=0 depending upon other factors that will be
discussed in chapter 4. |
|
Between (1/3)D and (-1/3)D we must have q=0. Each cell which
is in or partly overlaps this area must have the value 0. |
|
Between (-1/3)D and (-2/3)D we can have either q=0 or q=-1. The
actual division line (as best as we know) is the staircase
line drawn in brown. Cells above that staircase have q=0, cells
below that staircase have q=-1. Note this division staircase
is not a mirror image of the one in the positive plane, but is
instead shifted down by two cells. Cells immediately above the
division line, marked with a '?', are cells which might return
either q=0 or q=-1 depending upon other factors that will be
discussed in chapter 4. |
|
Between (-2/3) and (-4/3) we must have q=-1. Each cell which
is in or partly overlaps this area must have the value -1. |
|
Between (-4/3)D and (-5/3)D we can have either q=-1 or q=-2.
The actual division line (as best as we know) is the staircase
line drawn in brown. Cells above that staircase have q=-1, cells below
that staircase have q=-2. Note again this division staircase
is not a mirror image of the one in the positive plane, but is
instead shifted down by two cells. Cells immediately above the
division line, marked with a '?', are cells which might return
either q=-1 or q=-2 depending upon other factors that will be
discussed in chapter 4. |
|
Between (-5/3)D and (-8/3)D we must have q=-2. Each cell which
is in or partly overlaps this area must have the value -2.
Note there are no error cells down here in the negative plane.
The 5 error cells are all in the positive plane. |
|
Below (-8/3)D is out of bounds. If we ever end up there,
something has gone wrong. The Pentium returns a value of q=0
if we ever hit this area. (We actually do hit this area after
hitting the error cell.) |
The cells in the above graph define the lookup table used during the
Pentium's SRT division process. The iteration goes something like
this:
Given:
D - Divisor (note this value remains fixed)
R - Dividend (initial Remainder)
Step 1:
Using D and R, look in P-D table above for next quotient digit q. Note
that we need only know the first 4 digits after the radix point of the
Divisor, and we need only know the first 3 digits after the radix point
of the Remainder.
Step 2:
Do the math:
R[j+1] = 4 * (R[j] - q*D)
Repeat steps 1 & 2 until enough quotient digits have been determined.