Computer Architecture

Computer architecture is the set of specifications that defines how a computer is organized.

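  • Microcontroller: An all-in-one embedded device (processor, memory, and peripherals such as timers on one chip). Built for specific tasks.
  • Microprocessor: A processor only, connected to external memory / timers. More general-purpose.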
RISC                           | CISC
-----------------------------------------------------------
simpler                        | complex
fixed len: 32 bit only         | variable len: 32, 64 bit
multiple register sets         | single register set
single cycle                   | multi cycle
hardware control               | microprogram control
highly pipelined               | less pipelining
only LOAD / STORE reach memory | many memory instructions

ARM

Advanced RISC Machine. It’s a family of instruction set architectures (ISAs) for computer processors. ARM processors are used in a variety of devices, including mobile phones, portable media players, and GPS navigation systems.

Features of ARM

  1. Conditional instructions
  2. Load/store architecture
  3. 32-bit width
  4. A general shift/ALU operation in a single clock cycle
  5. 3-address instruction format

Tradeoffs

  1. Moving data from one place to another takes the most time. A common misconception is that most time goes into ALU work.
  2. Part of the ALU work is itself used to calculate addresses of data or of where the program is stored, which supports (1).
  3. The RISC compiler bridges the gaps, but we should also design a good ISA.
Instruction type         | Share
--------------------------------
Data movement            | 43%
Control flow (branching) | 23%
ALU                      | 15%
Comparison               | 13%
Logical                  | 5%

Instead of finishing one instruction before fetching the next, we can start a new fetch while the previous instruction is still being decoded: a 3-stage pipeline.

Fetch | Decode | Execute
Each instruction passes through the stages one by one.
  1. Concurrency: via Pipelining
  2. Caching: To reduce average time for frequently used data
  3. Superscalar execution (HPCA)
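
As a rough illustration of why overlapping the stages helps, the sketch below counts cycles for an idealised 3-stage pipeline. It assumes one cycle per stage and no stalls or branches, which real cores do not guarantee. The function names are hypothetical and exist only for this example.

```python
STAGES = ("Fetch", "Decode", "Execute")

def cycles_unpipelined(n_instructions, n_stages=len(STAGES)):
    """Each instruction finishes every stage before the next one starts."""
    return n_instructions * n_stages

def cycles_pipelined(n_instructions, n_stages=len(STAGES)):
    """Once the pipeline is full, one instruction completes per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)

def schedule(n_instructions):
    """For each cycle, list which instruction index sits in each stage (or None)."""
    rows = []
    for cycle in range(cycles_pipelined(n_instructions)):
        row = []
        for stage in range(len(STAGES)):
            instr = cycle - stage  # instruction i enters stage s at cycle i + s
            row.append(instr if 0 <= instr < n_instructions else None)
        rows.append(row)
    return rows

if __name__ == "__main__":
    for n in (1, 4, 100):
        print(n, cycles_unpipelined(n), cycles_pipelined(n))
    for row in schedule(4):
        print(row)
```

Under these assumptions, n instructions take n + 2 cycles instead of 3n, so the speed-up approaches 3 for long instruction streams.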

ARM Instructions

  1. Mnemonic (short form): ADD, SUB, with an optional condition (modifier): EQ, GE, MI, GT, LE
  2. {S} optional suffix: sets the N, Z, C, V flags
  3. {Rd}: destination register
  4. Operand 1 and 2
    1. Operand 1: always a register
    2. Operand 2: flexible; can be an immediate value or a register with an optional shift

They can be classified as:

  1. Data processing: MOV, ADD, SUB
  2. Data Transfer: LDR, STR
  3. Control Flow: B, BL, BEQ, BGT

Program Structure

Addr | Instr, Data
------------------
set	 | .text
by 	 | ADD <instr>
proc |
	 | .data
	 | var <x>
	 | .end

7 ARM modes

Code  | Mode
------------------
10000 | User
10001 | FIQ
10010 | IRQ
10011 | Supervisor
10111 | Abort
11011 | Undefined
11111 | System

Register Windows

  • Large number of registers
  • Procedure entry / exit moves the visible window to give each procedure access to new registers.
  • Without windows, a call saves state on the stack and then branches.
  • Windows reduce the traffic between processor and memory.

Delayed Branches

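Some RISC designs use delayed branches so that a branch does not interrupt the smooth flow of the pipeline. Whether a branch is taken or not is unknown when the next instruction is fetched, so the instruction following the branch is executed either way. This approach does not work well for superscalar processors.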

Status Registers (SR)

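The CPSR (Current Program Status Register) holds the current processor state. Whenever an exception causes a mode transition, the state of the CPSR is saved to the SPSR (Saved Program Status Register).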

  • N: the result was negative
  • Z: the result was zero
  • C: carry out
  • V: signed overflow (the sign bit of the result is wrong)
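
To make the flag definitions concrete, here is a small Python model of how a flag-setting 32-bit addition would produce N, Z, C and V, and how a few condition codes read them. This is a sketch of addition only, and `add_with_flags` and `condition_passed` are hypothetical helper names, not part of any ARM tool.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b):
    """Add two 32-bit values; return (result, flags) as a flag-setting ADD would."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    flags = {
        "N": result >> 31,            # bit 31 of the result
        "Z": int(result == 0),        # result is zero
        "C": int(full > MASK32),      # unsigned carry out of bit 31
        # Signed overflow: both operands share a sign that the result lacks.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }
    return result, flags

def condition_passed(cond, f):
    """Evaluate a condition code against a flags dict."""
    table = {
        "EQ": f["Z"] == 1,
        "MI": f["N"] == 1,
        "GE": f["N"] == f["V"],
        "GT": f["Z"] == 0 and f["N"] == f["V"],
        "LE": f["Z"] == 1 or f["N"] != f["V"],
    }
    return table[cond]

if __name__ == "__main__":
    result, flags = add_with_flags(0x7FFFFFFF, 1)
    print(hex(result), flags)
    print({c: condition_passed(c, flags) for c in ("EQ", "MI", "GE", "GT", "LE")})
```

Adding 1 to the largest positive signed value is a useful test case: the result looks negative (N = 1) and V = 1 reports that the sign is wrong.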

Flags

  • I = 1: disables IRQ
  • F = 1: disables FIQ
  • T bit (architectures with Thumb mode only): T = 0 (ARM state), T = 1 (Thumb state)
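
The status bits and mode codes can be pulled out of a status register value with shifts and masks. The sketch below assumes the classic 32-bit CPSR layout (N, Z, C, V in bits 31 to 28, I, F, T in bits 7, 6, 5, mode in bits 4 to 0). `decode_cpsr` is a hypothetical helper written for this example.

```python
# Mode codes as listed in the mode table of these notes.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,   # 1 = IRQ disabled
        "F": (cpsr >> 6) & 1,   # 1 = FIQ disabled
        "T": (cpsr >> 5) & 1,   # 0 = ARM state, 1 = Thumb state
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }

if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```

Here 0x600000D3 is just an example value: Z and C set, both interrupts disabled, ARM state, Supervisor mode.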

Thumb Mode: 16 bit

  • Most instructions can access only registers r0-r7
  • Improves performance from memory with a narrow (16-bit) data bus
  • A subset of the functionality of the ARM instruction set

Memory System

  1. Little Endian
  2. Big Endian
  • 8-bit signed/unsigned bytes
  • 16-bit signed/unsigned half-words: aligned on 2-byte boundaries
  • 32-bit signed/unsigned words: aligned on 4-byte boundaries
  • A word in ARM is 32 bits
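
The difference between the two byte orders, and between signed and unsigned loads, can be seen with Python's built-in integer/bytes conversions. This is only an illustration of the concepts; the helper names `store_word`, `load` and `is_aligned` are made up for the example.

```python
def store_word(value, byteorder):
    """Return the 4 bytes a 32-bit word occupies at increasing addresses."""
    return (value & 0xFFFFFFFF).to_bytes(4, byteorder)

def load(data, byteorder, signed):
    """Interpret 1, 2 or 4 bytes from memory as a signed or unsigned value."""
    return int.from_bytes(data, byteorder, signed=signed)

def is_aligned(address, size_in_bytes):
    """Half-words need 2-byte alignment, words need 4-byte alignment."""
    return address % size_in_bytes == 0

if __name__ == "__main__":
    print(store_word(0x12345678, "little").hex())  # lowest address holds 0x78
    print(store_word(0x12345678, "big").hex())     # lowest address holds 0x12
    print(load(b"\xff", "little", signed=True), load(b"\xff", "little", signed=False))
    print(is_aligned(0x1002, 2), is_aligned(0x1002, 4))
```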

Important

  • LOAD: memory value → reg
  • STORE: reg → memory

Warning

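STORE [R1][R2] is not allowed: data cannot be moved directly from memory to memory. Load the value into a register first, then store it.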


Multiplication Using Barrel Shifter

The barrel shifter in ARM assembly can be used to perform efficient multiplication by powers of two, and by sums and differences of powers of two.

  1. Multiplying by 2^n: Ra = Ra << n
MOV Ra, Ra, LSL #n
  2. Multiplying by 2^n + 1: Ra = Ra + (Ra << n)
ADD Ra, Ra, Ra, LSL #n
  3. Multiplying by 2^n - 1: Ra = (Ra << n) - Ra
RSB Ra, Ra, Ra, LSL #n
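
The three patterns can be checked with a small Python model that mimics 32-bit registers. It only models the arithmetic, not the processor, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def lsl(value, n):
    """Logical shift left with 32-bit wrap-around."""
    return (value << n) & MASK32

def mul_pow2(ra, n):
    """MOV Ra, Ra, LSL #n  ->  Ra * 2^n"""
    return lsl(ra, n)

def mul_pow2_plus1(ra, n):
    """ADD Ra, Ra, Ra, LSL #n  ->  Ra + (Ra << n) = Ra * (2^n + 1)"""
    return (ra + lsl(ra, n)) & MASK32

def mul_pow2_minus1(ra, n):
    """RSB Ra, Ra, Ra, LSL #n  ->  (Ra << n) - Ra = Ra * (2^n - 1)"""
    return (lsl(ra, n) - ra) & MASK32

if __name__ == "__main__":
    ra, n = 3, 4
    print(mul_pow2(ra, n), mul_pow2_plus1(ra, n), mul_pow2_minus1(ra, n))
```

With Ra = 3 and n = 4 the three functions multiply by 16, 17 and 15 respectively.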

Multiplying by 6

We can calculate 6 * Ra as 2 * 3 * Ra:

  • Multiply Ra by 2 using MOV Ra, Ra, LSL #1.
  • Multiply the result by 3 (which is 2^1 + 1) using ADD Ra, Ra, Ra, LSL #1.
MOV Ra, Ra, LSL #1        ; Ra = Ra * 2
ADD Ra, Ra, Ra, LSL #1    ; Ra = (Ra * 2) * 3 = original Ra * 6

Multiplying by 45

We can calculate 45 * Ra as 5 * 9 * Ra:

  • Multiply Ra by 5 (which is 2^2 + 1) using ADD Ra, Ra, Ra, LSL #2.
  • Multiply the result by 9 (which is 2^3 + 1) using ADD Ra, Ra, Ra, LSL #3.
ADD Ra, Ra, Ra, LSL #2    ; Ra = Ra + Ra * 4 = Ra * 5
ADD Ra, Ra, Ra, LSL #3    ; Ra = (Ra * 5) * 9 = original Ra * 45
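
Because every instruction in such a sequence overwrites Ra, the constant it multiplies by is easy to get wrong. One way to sanity-check a sequence is to replay it in Python. The sketch below assumes each step has the form OP Ra, Ra, Ra, LSL #n. The decompositions 6 = 2 * 3 and 45 = 5 * 9 are choices made for this example, and `run` and `multiplier` are hypothetical helpers.

```python
MASK32 = 0xFFFFFFFF

def run(sequence, ra):
    """Replay (op, shift) steps where Ra is the destination and both sources."""
    for op, n in sequence:
        shifted = (ra << n) & MASK32
        if op == "MOV":      # MOV Ra, Ra, LSL #n
            ra = shifted
        elif op == "ADD":    # ADD Ra, Ra, Ra, LSL #n
            ra = (ra + shifted) & MASK32
        elif op == "RSB":    # RSB Ra, Ra, Ra, LSL #n
            ra = (shifted - ra) & MASK32
        else:
            raise ValueError(f"unsupported op: {op}")
    return ra

def multiplier(sequence):
    """Constant the sequence multiplies by (valid while it fits in 32 bits)."""
    return run(sequence, 1)

TIMES_6 = [("MOV", 1), ("ADD", 1)]    # 2 * (2^1 + 1)
TIMES_45 = [("ADD", 2), ("ADD", 3)]   # (2^2 + 1) * (2^3 + 1)

if __name__ == "__main__":
    print(multiplier(TIMES_6), multiplier(TIMES_45))
    print(run(TIMES_45, 7))
```

Feeding in Ra = 1 returns the overall multiplier directly, which makes it quick to compare alternative decompositions such as 45 = 15 * 3 via [("RSB", 4), ("ADD", 1)].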