# Floating-point arithmetic

Integer arithmetic is fairly straightforward, and most engineers understand it well. Floating-point arithmetic is a very different story.

## Addition and subtraction

To add or subtract floating-point numbers, we first need to represent them with the same exponent: $$\begin{align}2.5 \cdot 10^3 + 3.1 \cdot 10^2 = 2.5 \cdot 10^3 + 0.31 \cdot 10^3\end{align}$$

Once the numbers have the same exponent, we can add the significands: $$\begin{align}(2.5 + 0.31) \cdot 10^3 = 2.81 \cdot 10^3 \end{align}$$

In some cases the resulting significand is no longer between 1 and 10, so we need to normalize the result to keep it in standard scientific notation.
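The align, add, and normalize steps above can be sketched in Python. This is only an illustration, not how hardware does it: `fp_add` is a hypothetical helper, numbers are passed as (significand, exponent) pairs in base 10, and `decimal.Decimal` is used so that the decimal digits stay exact.

```python
from decimal import Decimal

def fp_add(sig_a, exp_a, sig_b, exp_b):
    """Add sig_a * 10**exp_a and sig_b * 10**exp_b (hypothetical helper)."""
    # Align: express both numbers with the larger of the two exponents.
    exp = max(exp_a, exp_b)
    sig_a = sig_a.scaleb(exp_a - exp)
    sig_b = sig_b.scaleb(exp_b - exp)

    # Add the aligned significands.
    sig = sig_a + sig_b

    # Normalize so that 1 <= |sig| < 10 (zero is left as it is).
    while abs(sig) >= 10:
        sig = sig.scaleb(-1)
        exp += 1
    while sig != 0 and abs(sig) < 1:
        sig = sig.scaleb(1)
        exp -= 1
    return sig, exp

# The example from the text: 2.5 * 10^3 + 3.1 * 10^2
print(fp_add(Decimal("2.5"), 3, Decimal("3.1"), 2))
# A case that needs normalization: 9.5 + 0.70 = 10.20, shifted to 1.020 * 10^4
print(fp_add(Decimal("9.5"), 3, Decimal("7.0"), 2))
```

Note that this sketch never rounds, so it keeps every digit of the sum.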

With the above approach, a problem arises if we add numbers of
drastically different magnitudes while having a limited number of digits. To
demonstrate that, let’s assume we have only 5 digits to store the
significand.

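Let’s add $$\begin{align}2.345 \cdot 10^{5} + 1.312 \cdot 10^{-2}\end{align}$$

Step 1: shift the smaller number so that both have the exponent 5: $$\begin{align}1.312 \cdot 10^{-2} = 0.0000001312 \cdot 10^{5}\end{align}$$

Step 2: add the significands: $$\begin{align}(2.345 + 0.0000001312) \cdot 10^{5} = 2.3450001312 \cdot 10^{5}\end{align}$$

Step 3: round to 5 digits and normalize: $$\begin{align}2.3450001312 \cdot 10^{5} \approx 2.3450 \cdot 10^{5}\end{align}$$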

As you can see, after adding the two numbers we lost precision. In fact, in this example the result is equal to the first number.
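The same effect can be reproduced with Python's `decimal` module, which lets us limit the number of significant digits. This is a small sketch: the 5-digit limit is modelled with `ctx.prec = 5`, and the variable names are chosen only for illustration.

```python
from decimal import Decimal, localcontext

a = Decimal("2.345E5")
b = Decimal("1.312E-2")

# With only 5 significant digits, the small addend is rounded away.
with localcontext() as ctx:
    ctx.prec = 5
    limited = a + b

# With the default precision (28 digits) the sum is represented exactly.
exact = a + b

print(limited == a)  # the limited sum is indistinguishable from a
print(exact)
```

The first comparison holds because every digit contributed by `b` lies beyond the fifth significant digit of the sum.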

To alleviate this and similar problems with floating-point arithmetic, a guard bit, a round bit, and a sticky bit are used. We will cover them another time.

## Multiplication

To multiply floating-point numbers, we multiply their significands and add their exponents. $$\begin{align}(2.5 \cdot 10^3) \cdot (3.1 \cdot 10^2) = (2.5 \cdot 3.1) \cdot 10^{3+2} = 7.75 \cdot 10^5 \end{align}$$

Another example:

Step 1: add exponents, multiply significands

Step 2: normalize

Step 3: round
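The three steps can be sketched in Python as follows. This is a minimal illustration under a few assumptions: `fp_mul` is a hypothetical helper, the inputs are non-zero normalized base-10 significands, the result keeps 5 significant digits, and rounding is half-to-even.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def fp_mul(sig_a, exp_a, sig_b, exp_b, digits=5):
    """Multiply sig_a * 10**exp_a by sig_b * 10**exp_b (hypothetical helper)."""
    # Step 1: add exponents, multiply significands.
    sig = sig_a * sig_b
    exp = exp_a + exp_b

    # Step 2: normalize. The product of two significands in [1, 10)
    # is below 100, so at most one shift is needed.
    if abs(sig) >= 10:
        sig = sig.scaleb(-1)
        exp += 1

    # Step 3: round to `digits` significant digits.
    quantum = Decimal(1).scaleb(-(digits - 1))  # 0.0001 when digits == 5
    sig = sig.quantize(quantum, rounding=ROUND_HALF_EVEN)

    # Rounding can carry up to 10.0000, which needs one more shift.
    if abs(sig) >= 10:
        sig = sig.scaleb(-1)
        exp += 1
    return sig, exp

# value: 7.75 * 10^5
print(fp_mul(Decimal("2.5"), 3, Decimal("3.1"), 2))
# 4.5 * 3.2 = 14.40, so the result is shifted to 1.44 * 10^6
print(fp_mul(Decimal("4.5"), 3, Decimal("3.2"), 2))
```

The second shift after rounding is easy to forget: a significand such as 9.999999999 rounds up to 10.0000 and is no longer normalized.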

## Division

To divide floating-point numbers, we divide their significands and subtract their exponents. $$\frac{4.2 \cdot 10^3}{2.1 \cdot 10^2} = \frac{4.2}{2.1} \cdot 10^{3-2} = 2 \cdot 10^1$$

Another example:

Step 1: subtract exponents, divide significands

Step 2: round and normalize
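A matching Python sketch for division is shown below. The same caveats apply: `fp_div` is a hypothetical helper, the inputs are assumed to be non-zero normalized base-10 significands, and rounding to 5 significant digits is delegated to the `decimal` context precision.

```python
from decimal import Decimal, localcontext

def fp_div(sig_a, exp_a, sig_b, exp_b, digits=5):
    """Divide sig_a * 10**exp_a by sig_b * 10**exp_b (hypothetical helper)."""
    # Step 1: subtract exponents, divide significands.
    # The context precision rounds the quotient to `digits` significant digits.
    with localcontext() as ctx:
        ctx.prec = digits
        sig = sig_a / sig_b
    exp = exp_a - exp_b

    # Step 2: normalize. The quotient of two significands in [1, 10)
    # lies between 0.1 and 10, so at most one shift is needed.
    if sig != 0 and abs(sig) < 1:
        sig = sig.scaleb(1)
        exp -= 1
    return sig, exp

# The example from the text: 4.2 * 10^3 / 2.1 * 10^2, value 2 * 10^1
print(fp_div(Decimal("4.2"), 3, Decimal("2.1"), 2))
# 1.5 / 4.0 = 0.375, so the result is shifted to 3.75 * 10^1
print(fp_div(Decimal("1.5"), 3, Decimal("4.0"), 1))
```

Because the precision counts significant digits rather than decimal places, shifting the quotient afterwards does not lose any of the digits that were kept.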

TODO: add diagrams

### Sources:

University of Maryland: Floating Point Arithmetic Unit by Dr A. P. Shanti

George Mason University: Floating Point Arithmetic

Drexel University: Systems Architecture Lecture: Floating Point Arithmetic

Wikipedia: Floating-point arithmetic

24 Mar 2022 - Hasan Al-Ammori