Floating-point arithmetic
Integer arithmetic is fairly straightforward, and most engineers understand it well. Floating-point arithmetic is a very different story.
Addition and subtraction
To add or subtract floating-point numbers, we first need to represent them with the same exponent: $$\begin{align}2.5 \cdot 10^3 + 3.1 \cdot 10^2 = 2.5 \cdot 10^3 + 0.31 \cdot 10^3\end{align}$$
Once the numbers have the same exponent, we can add the significands: $$\begin{align}(2.5 + 0.31) \cdot 10^3 = 2.81 \cdot 10^3 \end{align}$$
In some cases, we may need to normalize the result so that it stays in standard scientific notation, with exactly one nonzero digit before the decimal point.
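For instance, taking two operands chosen here purely for illustration, the significands can sum to 10 or more. The result then has to be shifted right by one digit and the exponent incremented: $$\begin{align}9.5 \cdot 10^3 + 7.0 \cdot 10^2 = (9.5 + 0.70) \cdot 10^3 = 10.2 \cdot 10^3 = 1.02 \cdot 10^4\end{align}$$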
With the approach above, a problem arises when we add numbers of drastically different magnitude while having only a limited number of digits. To demonstrate that, let's assume we can store only 5 digits after the decimal point of the significand.
Let's add $$2.345 \cdot 10^5 + 1.312 \cdot 10^{-2}$$:
ADD
e= 5 s=2.345
e=-2 s=1.312
Step 1: shift
e=5 s=2.3450000000
e=5 s=0.0000001312
Step 2: add
e=5 s=2.3450001312
Step 3: round and normalize
e=5 s=2.34500
As you can see, we lost precision when adding the two numbers. In fact, in this example the result equals the first number: the second operand was rounded away entirely.
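The three steps of the addition can be sketched in Python. This is only a toy model, not how hardware does it: it works in decimal using the standard `decimal` module, and the names `fp_add`, `FRACTION_DIGITS` and `QUANTUM` are made up for this illustration. It assumes round-half-even rounding and five stored digits after the decimal point.

```python
from decimal import Decimal, ROUND_HALF_EVEN

FRACTION_DIGITS = 5  # assumption: digits kept after the decimal point
QUANTUM = Decimal(1).scaleb(-FRACTION_DIGITS)  # 0.00001


def fp_add(s1, e1, s2, e2):
    """Add s1 * 10**e1 and s2 * 10**e2, returning (significand, exponent)."""
    # Step 1: shift both operands to the larger exponent.
    e = max(e1, e2)
    s1 = s1.scaleb(e1 - e)
    s2 = s2.scaleb(e2 - e)

    # Step 2: add the aligned significands.
    s = s1 + s2

    # Step 3: normalize so that 1 <= |s| < 10 ...
    while abs(s) >= 10:
        s = s.scaleb(-1)
        e += 1
    while s != 0 and abs(s) < 1:
        s = s.scaleb(1)
        e -= 1
    # ... then round to the digits we can store.
    s = s.quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
    if abs(s) >= 10:  # rounding carried into a new leading digit
        s = s.scaleb(-1).quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
        e += 1
    return s, e


s, e = fp_add(Decimal("2.345"), 5, Decimal("1.312"), -2)
print(f"e={e} s={s}")  # e=5 s=2.34500
```

Subtraction works the same way if the second significand is passed as a negative number.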
To alleviate this and similar problems with floating-point arithmetic, implementations use a guard bit, a round bit and a sticky bit. We will cover those another time.
Multiplication
To multiply floating-point numbers, we multiply their significands and add their exponents. $$\begin{align}(2.5 \cdot 10^3) \cdot (3.1 \cdot 10^2) = (2.5 \cdot 3.1) \cdot 10^{3+2} = 7.75 \cdot 10^5 \end{align}$$
Another example:
MUL
e= 5 s=2.345
e=-2 s=7.246
Step 1: add exponents, multiply significands
1) e=5 + (-2)
2) s=2.345 * 7.246
------------------
e=3 s=16.99187
Step 2: normalize
e=4 s=1.699187
Step 3: round
e=4 s=1.69919
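The multiplication steps can be sketched the same way. Again, this is a decimal toy model built on Python's `decimal` module rather than a description of real hardware. The names `fp_mul`, `FRACTION_DIGITS` and `QUANTUM` are hypothetical, and round-half-even is an assumed rounding mode.

```python
from decimal import Decimal, ROUND_HALF_EVEN

FRACTION_DIGITS = 5  # assumption: digits kept after the decimal point
QUANTUM = Decimal(1).scaleb(-FRACTION_DIGITS)  # 0.00001


def fp_mul(s1, e1, s2, e2):
    """Multiply s1 * 10**e1 by s2 * 10**e2 (both significands normalized)."""
    # Step 1: add exponents, multiply significands.
    e = e1 + e2
    s = s1 * s2

    # Step 2: normalize. With 1 <= |s1|, |s2| < 10 the product is below 100,
    # so at most one right shift is needed.
    if abs(s) >= 10:
        s = s.scaleb(-1)
        e += 1

    # Step 3: round to the digits we can store.
    s = s.quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
    if abs(s) >= 10:  # rounding carried into a new leading digit
        s = s.scaleb(-1).quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
        e += 1
    return s, e


s, e = fp_mul(Decimal("2.345"), 5, Decimal("7.246"), -2)
# 1.699187 rounded to five fractional digits is 1.69919.
print(f"e={e} s={s}")  # e=4 s=1.69919
```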
Division
To divide floating-point numbers, we divide their significands and subtract their exponents. $$\frac{4.2 \cdot 10^3}{2.1 \cdot 10^2} = \frac{4.2}{2.1} \cdot 10^{3-2} = 2 \cdot 10^1$$
Another example:
DIV
e= 5 s=8.2
e=-2 s=2.5
Step 1: subtract exponents, divide significands
1) e = 5 - (-2) = 7
2) s = 8.2 / 2.5
--------------------
e=7 s=3.28
Step 2: round and normalize
e=7 s=3.28
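Division can be sketched along the same lines. As before, this is a decimal toy model using Python's `decimal` module, with hypothetical names (`fp_div`, `FRACTION_DIGITS`, `QUANTUM`) and an assumed round-half-even rounding mode. It also assumes a nonzero divisor. Note that the quotient of two normalized significands can fall below 1, in which case normalization shifts left instead of right.

```python
from decimal import Decimal, ROUND_HALF_EVEN

FRACTION_DIGITS = 5  # assumption: digits kept after the decimal point
QUANTUM = Decimal(1).scaleb(-FRACTION_DIGITS)  # 0.00001


def fp_div(s1, e1, s2, e2):
    """Divide s1 * 10**e1 by s2 * 10**e2 (normalized significands, s2 != 0)."""
    # Step 1: subtract exponents, divide significands.
    # Decimal's default 28-digit context leaves plenty of digits before rounding.
    e = e1 - e2
    s = s1 / s2

    # Step 2: normalize. With 1 <= |s1|, |s2| < 10 the quotient lies strictly
    # between 0.1 and 10, so at most one left shift is needed.
    if s != 0 and abs(s) < 1:
        s = s.scaleb(1)
        e -= 1

    # Then round to the digits we can store.
    s = s.quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
    if abs(s) >= 10:  # rounding carried into a new leading digit
        s = s.scaleb(-1).quantize(QUANTUM, rounding=ROUND_HALF_EVEN)
        e += 1
    return s, e


s, e = fp_div(Decimal("8.2"), 5, Decimal("2.5"), -2)
print(f"e={e} s={s}")  # e=7 s=3.28000

# A quotient below 1 needs a left shift: 1 / 3 = 0.333... -> 3.33333 * 10**-1
print(fp_div(Decimal("1"), 0, Decimal("3"), 0))
```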
TODO: add diagrams
Sources:
University of Maryland: Floating Point Arithmetic Unit by Dr A. P. Shanti
George Mason University: Floating Point Arithmetic
Drexel University: Systems Architecture Lecture: Floating Point Arithmetic
Wikipedia: Floating-point arithmetic
24 Mar 2022 - Hasan Al-Ammori