## Computer architecture, solved problems

v. 01 2018/19

1. Minicomputers in the eighties (eg. DEC PDP-11) had 18 address signals and of course, the 18 -bit address bus. Answer the following questions:
a) What was the address space of these computers?
b) What may have been the largest possible memory of these computers in bytes if the memory location is 1 Byte?
c) How long should the program counter (PC) of these computers be?
d) What would be needed to change in these computers if we would like to increase address space 8 -times?

## Solution:

a) $2^{\wedge} 18=2 \wedge 8 * 2^{\wedge} 10=256 \mathrm{k}$
$\left(2^{\wedge} 10=1 \mathrm{~K}, 2^{\wedge} 20=1 \mathrm{M}, 2^{\wedge} 30=1 \mathrm{G}, 2^{\wedge} 40=1 \mathrm{~T} \ldots\right)$
b) $2{ }^{\wedge} 18 \mathrm{~B}=256 \mathrm{~KB}$
c) 18 b , because the program can be anywhere in memory ...
d) add 3 address signals, extend the PC for 3 bits, change the instruction format (the address field extended for 3 bits)
2. Write down a signed decimal number with a value of -25 in 8-bit fixed point presentation in each of the four 8-bit modes for the presentation of signed numbers. Do the same also with the number 33. Write resulting presentations as binary and hexadecimal numbers. Consider, how to perform addition of these numbers in binary form for each presented mode.
a) sign and magnitude:

$$
\begin{aligned}
& V(b)=(-1)^{b_{n-1}} \sum_{i=0}^{n-2} b_{i} 2^{i} \quad+\ldots 0 /-. .1 \\
& 25_{(10)}=?_{(2)} \quad=11001_{(2)} \\
& 25: 2=12+1 \\
& 12: 2=6+0 \\
& 6: 2=3+0 \\
& 3: 2=1+1 \\
& 1: 2=\underline{0}+1 \\
& 33_{(10)}=?_{(2)} \quad=100001_{(2)} \\
& 33: 2=16+1 \\
& 16: 2=8+0 \\
& 8: 2=4+0 \\
& 4: 2=2+0 \\
& 2: 2=1+0 \\
& 1: 2=\underline{0}+1 \\
& \text { therefore: }-25_{(10)}=10011001_{(2)}\left(99_{(16)}\right) \\
& \text { addition: it is necessary to take into account the sign }=>\text { additional complexity } . . .
\end{aligned}
$$

b) the presentation with offset

$$
\begin{aligned}
& V(b)=\sum_{i=0}^{n-1} b_{i} 2^{i}-2^{n-1} \quad \text { tudi }\left(2^{n-1}-1\right) \quad-128 . .127 \\
& -25+2^{\mathrm{n}-1}=-25+128=103 \\
& \\
& 103_{(10)}=?_{(2)}=1100111_{(2)} \\
& 103: 2=51+1 \\
& 51: 2=25+1 \\
& 25: 2=12+1 \\
& 12: 2=6+0 \\
& 6: 2 \\
& 3: 2 \\
& 1: 2+0 \\
& 1: 2
\end{aligned}=\underline{0}+1 \quad \begin{aligned}
& \text { 2 } \\
& 3+2^{\mathrm{n}-1}=33+128=161
\end{aligned}
$$

$$
\text { therefore: }-25_{(10)}=01100111_{(2)}\left(67_{(16)}\right) \quad 33_{(10)}=10100001_{(2)}\left(\mathrm{A} 1_{(16)}\right)
$$

addition: the result includes two offsets, so one offset has to be subtracted separately
c) ones' complement

$$
\begin{array}{ll}
V(b)=\sum_{i=0}^{n-1} b_{i} 2^{i}-b_{n-1}\left(2^{n}-1\right) & -127 . .127 \\
25_{(10)}=11001_{(2)}
\end{array}
$$

therefore: $-25_{(10)}=11100110_{(2)}\left(\mathrm{E}_{(16)}\right), \quad 33_{(10)}=00100001_{(2)}\left(21_{(16)}\right)$
addition: in case of carry from place $n-1,1$ has to be added to result; on the other hand, sign bit can be treated the same as the other (value) bits ...
d) two's complement

$$
\begin{aligned}
& V(b)=\sum_{i=0}^{n-1} b_{i} 2^{i}-b_{n-1}\left(2^{n}\right) \\
& 25_{(10)}=00011001_{(2)} \\
& \quad 11100110_{(2)} \\
& +00000001_{(2)} \\
& =11100111_{(2)} \\
& \text { therefore: }-25_{(10)}=11100111_{(2)}\left(\mathrm{E} 7_{(16)}\right) \quad 33_{(10)}=00100001_{(2)}
\end{aligned}
$$

3. Write the value of 4.75 in 32-bit floating point format IEEE 754. Another floating point value is written in the memory as $44 \mathrm{FAC} 000_{(16)}$. Which value does it represent if it is written in the format IEEE 754 ?
(This problem is informative, similar floating point problems will not appear in written exams)
```
\(4_{(10)}=?_{(2)}=100_{(2)}\)
\(4: 2=2+0\)
\(2: 2=1+0\)
\(1: 2=\underline{0}+1\)
\(0,75_{(10)}=?_{(2)}=0,11_{(2)}\)
\(0,75 * 2=0,5+1\)
\(0,5 * 2=\underline{0}+1\)
\(4,75_{(10)}=100,11_{(2)}\)
```

In floating point: $4,75=-1^{\wedge} \mathrm{s}^{*} \mathrm{~m} * 2^{\wedge} \exp =1 * 100,11 * 2^{\wedge} 0$
mantissa is converted into a form of 1, ???:
$100,11 * 2^{\wedge} 0=1,0011 * 2^{\wedge} 2$
$\mathrm{s}=0$
$\mathrm{m}=\underline{1}, 0011$
$\exp =2$
IEEE 754:

| s | $\exp +127$ | m |
| :---: | :---: | :---: |
| 1 | 8 | 23 |

$\mathrm{s}=0$
exp written in a presentation with offset 127:
$\exp +127=129_{(10)}=10000001_{(2)}$
mantissa it always in a form of 1, ???, the implicit bit (1,) is not part of the presentation:
$\mathrm{m}=0011$
the final presentation result:
$01000000100110000000000000000000=40980000_{(16)}$

## Second problem: The conversion in the opposite direction:

```
44FAC000 (16) }=0\underline{1000100111110101100000000000000 (2)
s=0
exp+127=100010001(2)}=137
therefore exp=10, m=1,111101011 (2)
2^0*m*2^exp
1*1,111101011*2^10
11111010110 (2) = 2006 (10)
```

4. We want to compare the computers R1 and R2, which differ that R1 has the machine instructions for the floatingpoint operations, while R2 has not (FP operations are implemented in the software using several non-FP instructions). Both computers have a clock frequency of 400 MHz . In both we perform the same program, which has the following mixture of commands:

| Type the command | Dynamic Share of <br> instructions in <br> program ( $\left.\mathrm{p}_{\mathrm{i}}\right)$ | Instruction duration (Number of clock periods $\mathrm{CPI}_{\mathrm{i}}$ ) |  |
| :---: | :---: | :---: | :---: |
|  |  | R 1 | R 2 |
| FP addition | $16 \%$ | 6 | 20 |
| FP multiplication | $10 \%$ | 8 | 32 |
| FP division | $8 \%$ | 10 | 66 |
| Non - FP instructions | $66 \%$ | 3 | 3 |

a) Calculate the MIPS for the computers R1 and R2.
b) Calculate the CPU program execution time on the computers R1 and R2, if there are 12000 instructions in the program?
c) At what mixture of instructions in the program will both computers R1 and R2 be equally fast?

## Solution:

a) $\mathrm{CPI}=\sum_{i=0}^{3} C P I_{i} * p_{i}$

$$
\mathrm{MIPS}=\frac{f_{C P E}}{C P I * 10^{6}}
$$

## Computer R1:

$\mathrm{CPI}=\sum_{i=1}^{3} C P I_{i} * p_{i}=0,16 * 6+0,1 * 8+0,08 * 10+0,66 * 3=4,54$

Computer R1 needs an average of 4.54 clock periods for one instruction

MIPS $=\frac{f_{C P E}}{C P I * 10^{6}}=\frac{400 * 10^{6}}{4,54 * 10^{6}}=88,1$

Computer R1 executes an average of 88100000 instructions per second.

## Computer R2:

$\mathrm{CPI}=\sum_{i=1}^{3} C P I_{i} * p_{i}=0,16 * 20+0,1 * 32+0,08 * 66+0,66 * 3=13,66$

Computer R2 needs an average of 13.66 clock periods for one instruction
MIPS $=\frac{\mathrm{f}_{C P E}}{\mathrm{CPI} * 10^{6}}=\frac{400 * 10^{6}}{13,66 * 10^{6}}=29,28$
Computer R2 executes an average of 29280000 instructions per second.
b) $\quad \boldsymbol{C} \boldsymbol{P} \boldsymbol{U}_{\text {time }}=\frac{\text { Number_of_instructions }}{M I P S * 10^{6}}$

Another form of the equation to calculate the CPU time is:
$C P U_{\text {time }}=$ Number_of_instructions $* C P I * t_{\text {CPU }}$

## Computer R1:

$$
\boldsymbol{C P} \boldsymbol{U}_{\text {time }}=\frac{\text { Number_of_instructions }}{\boldsymbol{M I P S} * \mathbf{1 0}^{6}}=\frac{12000}{88,1 * 10^{6}}=136,2 * 10^{-6}=136,2 \mu \mathrm{~s}
$$

## Computer R2:

$$
\boldsymbol{C P} \boldsymbol{U}_{\text {time }}=\frac{\text { Number_of_instructions }}{\boldsymbol{M I P S} * \mathbf{1 0}^{\mathbf{6}}}=\frac{12000}{29,28 * 10^{6}}=410 * 10^{-6}=410 \mu \mathrm{~s}
$$

c) Programs without FP instructions ...
5. We want to speed up computer performance with an additional unit for calculating in floating point format. This unit is 20 times faster than the same operations without unit. What percentage of a total computer time must this unit be active to achieve an overall increase in computer speed for 2.5 times?

## Solution:

Use Amdahl's law:

$$
S(N)=\frac{N}{1+(N-1) * f}
$$

| f | $=$ | share of operations that do not speed up |
| :--- | :--- | :--- |
| N | $=$ | Factor of speedup for $(1-\mathrm{f})$ share of operations |
| $\mathrm{S}(\mathrm{N})=$ | Increase of the overall speed |  |

In this case, is $S(N)=2.5 ; N=20$, however, we are looking for ( $1-\mathrm{f}$ )
$1-f=\frac{N-S(N)}{S(N) *(N-1)}=1-\frac{20-2,5}{2,5 * 19}=1-0,3684=0,6315$

FP unit must be active $63.15 \%$ of the time so that the computer's performance is 2.5 times faster.
6. The computer has a main memory access time of 60 ns . We want to reduce this time to 20 ns by adding cache. Determine how fast the cache must be (access time) if we can expect a $90 \%$ probability of a hit.

$$
\begin{aligned}
& \mathrm{t}_{\mathrm{ag}}=60 \mathrm{~ns} \\
& \mathrm{t}_{\mathrm{a}}=20 \mathrm{~ns} \\
& \mathrm{H}=90 \%=0,9 \\
& \mathrm{t}_{\mathrm{ap}}=? \\
& \mathrm{t}_{\mathrm{a}}=\mathrm{t}_{\mathrm{ap}}+(1-\mathrm{H}) \times \mathrm{t}_{\mathrm{ag}} \\
& \mathrm{t}_{\mathrm{ap}}=\mathrm{t}_{\mathrm{a}}-(1-\mathrm{H}) \times \mathrm{t}_{\mathrm{ag}} \\
& \mathrm{t}_{\mathrm{ap}}=20 \times 10^{-9}[\mathrm{~s}]-(1-0,9) \times 60 \times 10^{-9}[\mathrm{~s}]=20 \times 10^{-9}-6 \times 10^{-9}=14 \times 10^{-9}[\mathrm{~s}]=14[\mathrm{~ns}]
\end{aligned}
$$

7. In a computer with cache, we have the average number of clock periods per instruction equal to 4 , if there are no misses in the cache.
a) What is the real number of clock periods per instruction, if the probability of miss in the cache is $10 \%$ ? For the replacement of the block (line) in the cache, we need 5 clock periods for read and 10 for write accesses. Assume that each instruction requires an average of 2 memory accesses and that $20 \%$ of all are write accesses.
```
\(\mathrm{CPI}_{\mathrm{I}}=4\)
\((1-\mathrm{H})=10 \%=0,1\)
\(\mathrm{M}_{\mathrm{I}}=2\)
\(\mathrm{N}_{\mathrm{R}}=5\)
\(\mathrm{N}_{\mathrm{W}}=10\)
\(\mathrm{P}_{\mathrm{W}}=0,2\)
\(\underline{P}_{R}=0,8\)
\(\mathrm{CPI}_{\mathrm{R}}=\mathrm{CPI}_{\mathrm{I}}+\mathrm{M}_{\mathrm{I}} \times(1-\mathrm{H}) \times\) miss_penalty
miss_penalty \(=P_{W} \times\) num_periods_for_write \(+P_{R} \times\) num_periods_for_read \(=P_{W} \times N_{W}+P_{R} \times N_{R}\)
\(\mathrm{CPI}_{\mathrm{R}}=4+2 \times 0.1 \times(0.2 \times 10+0.8 \times 5)=5.2\)
```

b) What is the real CPI, if we increase the probability of hit to $95 \%$ ?

```
CPI }=
M
(1-H)=0,05
miss_penalty=6
CPI
CPI}=\mp@subsup{\textrm{CPI}}{\textrm{I}}{}+\mp@subsup{\textrm{M}}{\textrm{I}}{}\times(1-H)\times\mathrm{ miss_penalty
CPI
```

8. On a computer with a 32-bit memory address and the length of the memory location of 1 byte is installed set-associative cache. Cache size is 16 KB , block (line) size is 16 Bytes, set associative cache is 4-way.
a) How many sets are there in cache?
$\mathrm{n}=32$
$\mathrm{M}=16 \mathrm{~KB}=2^{4} \mathrm{~KB}=2{ }^{14} \mathrm{~B}$
block $=\mathrm{B}=16 \mathrm{~B}=2^{4} \mathrm{~B}$
$\mathrm{E}=4=2^{2}$
$\mathrm{M}=\mathrm{S} \times \mathrm{E} \times \mathrm{B}$
$\mathrm{S}=\mathrm{M} \div(\mathrm{E} \times \mathrm{B})=2^{14} \div\left(2^{2} \times 2^{4}\right)=2^{14-6}=2^{8}=256$ sets
b) Which bits in the memory address determine the address of the set?

address of the location in the block
Address bits of set: 4-11.
c) Into which set is mapped the content of the memory address $10 \mathrm{FFCFF}_{(\mathrm{HEX})}$ ? 10001
20010
30011
40100
50101
60110
70111
81000
91001
A 1010 Address belongs to set 207.
B 1011
C 1100
D 1101
E 1110
F 1111
9. A computer with virtual memory has an access time tomain memory 50 ns , the time to transfer a block from the virtual into main memory is 10 ms . The probability for the page-fault is $10^{-6}$.

What is the average access time, if the page-table is in the main memory?
$\mathrm{t}_{\mathrm{ag}}=50 \mathrm{~ns}$
$\mathrm{t}_{\mathrm{B}}=10 \mathrm{~ms}$
$(1-H)=10^{-6}$
$\mathrm{t}_{\mathrm{a}}=$ ?

(access to the page-frame)

$$
\begin{aligned}
\mathrm{t}_{\mathrm{a}} & =50 \times 10^{-9}+50 \times 10^{-9}+10^{-6} \times 10 \times 10^{-3}= \\
& =100 \times 10^{-9}+10 \times 10^{-9}=110 \times 10^{-9}=110[\mathrm{~ns}]
\end{aligned}
$$

10. Computer with virtual memory has the following features:

- length of the virtual address is 38 bits,
- the page size is 16 KB ,
- length of the physical address is 32 bits.
a) How many bits is the length of page descriptor, if in addition to the frame number (FN), additional parameters occupy another 6 bits?
$\mathrm{n}=38$
$\mathrm{f}=32$
page size $=16 \mathrm{~KB}=2^{\mathrm{p}}=2^{14} \mathrm{~B}$
- a number of pages in virtual memory:

$$
2^{n-p}=2^{n} \div 2^{p}=2^{38} \div 2^{14}=2^{24}
$$

- a number of page frames in main memory: $2^{\mathrm{f}-\mathrm{p}}=2^{\mathrm{f}} \div 2^{\mathrm{p}}=2^{32} \div 2^{14}=2^{18} \quad(\mathrm{FN})$

Page descriptor $=$ frame number $(\mathrm{FN})+6$
Page descriptor $=24$ bits $=3 \mathrm{~B}$

b) What is the maximum size of the page-table in bytes?
num_pages x page_descriptor
$2^{24} 3 \mathrm{Bx}=2^{20} \times 2^{4} 3 \mathrm{Bx}=2^{4} \times 3 \mathrm{MB}=16 \times 3 \mathrm{MB}=48 \mathrm{MB} \quad$ page-table size in main memory.

