Fundamentally, transistors encode a switching state in a circuit interpreted as a number in binary, denoting a numeric value, a character shape, or a selector code for a sub-circuit.

A computer, composed of physical devices that encode and transform a binary state (represented as bits), needs to know what to do in terms of its physical circuitry.

Let's examine closely how a typical <u>modern</u> machine expects **operations** and **operands**, but first a major realization ...

### Previously...

Physical device to (binary) code: see Slide 0

# Specifying operations and operands

### From logical representation (high-level) to machine code

Ops Instructions

© 2024 Dr. Muhammad Al-Hashimi

An advanced review, not about MIPS or assembly programming; the focus is on instruction issues, tradeoffs, and <mark>examples</mark>.

KAU • CS-704


# Motivation




Register operands, fundamental to machine instructions, focus on speedy access (number and bit size reflect tradeoffs driven by the state of technology).

In 1985, the original MIPS processor had 32 registers, denoted \$0-\$31, each 32 bits wide, with \$0 fixed at zero (\$0 ← 0).

Registers \$16-\$23 are symbolically referred to as \$s0-\$s7 in MIPS assembly.

## **MIPS Register Operands Basic Arithmetic**

### -> Register, opcode, assembly language




High-level variables are kept in memory; \$s0-\$s7 (\$16-\$23) are used conventionally by the compiler to track up to 8 of them at a time.

## **MIPS Memory Operands Load-Store**

### -> Memory address

A numbering scheme for logically usable memory cells, conventionally, every 8 bits (byte) given a unique numerical code.




# **Machine Instructions**

### -> Instruction word




## B- Connections **Program Execution**

### -> Program counter (PC)

A special *programmer* register not part of the **general purpose register** (**GPR**) set (\$0-\$31 in MIPS).




## Decision Instructions Conditional Branch

### Solution → Memory label






001000 0x61A8



Essentially, ways to specify **operands**, an important aspect of machine instructions.

- Immediate operand
- Load-store machine

Register addressing

✓ Base (displacement)

## Immediate addressing

addi \$16,\$0,25000
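To make the immediate operand concrete, the following Python sketch packs the `addi` example into one 32-bit word. The helper name `encode_i_format` is hypothetical; the field layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) is the classic MIPS I-format, and the constant travels inside the instruction itself.

```python
def encode_i_format(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack classic MIPS I-format fields into one 32-bit instruction word."""
    assert 0 <= opcode < 64 and 0 <= rs < 32 and 0 <= rt < 32
    assert -(1 << 15) <= imm < (1 << 16)  # accept signed or raw 16-bit values
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# addi $16,$0,25000: opcode 001000, rs = $0, rt = $16, immediate 25000 (0x61A8)
word = encode_i_format(0b001000, 0, 16, 25000)
print(f"{word:#010x}")  # 0x201061a8
```

Since only 16 bits are available for the constant, larger values need more than one instruction.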

## **PC-relative addr**

beq \$1,\$0,EXIT
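A minimal Python sketch of how a PC-relative target could be computed, assuming the classic MIPS convention that the 16-bit field counts instructions (words) relative to the instruction after the branch. The function names and the addresses are made up for illustration.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a 2's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc: int, offset16: int) -> int:
    """Target of a taken classic MIPS branch located at address pc."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF

# Hypothetical layout: beq at 0x00400010, EXIT three instructions past the next one.
print(hex(branch_target(0x00400010, 3)))       # 0x400020
print(hex(branch_target(0x00400010, 0xFFFF)))  # 0x400010 (offset -1: branch to itself)
```

The label EXIT never appears in the machine word; the assembler replaces it with the computed offset.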

**Pseudodirect** 


PC + 32-bit branch target

Is it a good idea to <u>provide an opcode</u> for adding a memory operand to a register operand? Specify changes to instruction design.


Separate instructions deal with important operand types and the complications resulting from finite bit representations.

Typically 2's complement signed integers in as many bits as fit in a data register or GPR (in MIPS-like designs), corresponding to the int data type in C-like languages.

Unsigned integers

**Engineering and scientific** calculations require a versatile representation of real numbers.

Byte/double-byte operands are important for processing text (ASCII and Unicode characters).

Shorter floating point operands optionally allow faster processing at the expense of range and precision.


## **Machine Operands** Variations

Quiz

Different operands are indistinguishable bits for the machine; how can it tell them apart? What are the consequences for a compiler? Hint: a physical machine's concerns may differ from those of programmers (see Wulf).

## Default operands


### Short operands

### Complications (overflow...)

Answer: Different opcodes are used for different operand types and related operations.
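The point can be illustrated with a small Python model of two MIPS comparison opcodes, `slt` and `sltu`, applied to the same bit pattern. This is a sketch of their comparison behavior only; the helper names are assumptions, not part of any toolchain.

```python
MASK32 = 0xFFFFFFFF

def as_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a 2's complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits

def slt(a_bits: int, b_bits: int) -> int:
    """Signed set-on-less-than: operands read as 2's complement."""
    return int(as_signed32(a_bits) < as_signed32(b_bits))

def sltu(a_bits: int, b_bits: int) -> int:
    """Unsigned set-on-less-than: same bits read as non-negative values."""
    return int((a_bits & MASK32) < (b_bits & MASK32))

pattern = 0xFFFFFFFF          # -1 if signed, 4294967295 if unsigned
print(slt(pattern, 1))        # 1
print(sltu(pattern, 1))       # 0
```

The bits carry no type; the compiler must pick the opcode that matches the declared type of the operands.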


Execution thread
Control flow

An OS is concerned with how instructions flow through hardware, not their execution details.

We may visualize the default execution sequence, due to incrementing a PC, as an imaginary **thread** through consecutive instructions; <u>sequencing control</u> may be thought to <u>flow</u> along threaded instructions.

**Control flow** is an abstraction of sequencing performed by the physical **control unit** of a CPU (the actual execution sequence may be more complex than what programmers see).
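As a rough sketch of this abstraction, the Python model below traces the addresses a PC visits: it advances by 4 by default and is overwritten by a transfer of control. The toy program, its `"jump"` and `"nop"` operations, and the function name are all hypothetical.

```python
def trace_control_flow(program: dict[int, tuple], pc: int) -> list[int]:
    """Return the addresses visited, i.e. the control-flow trace."""
    trace = []
    while pc in program:
        trace.append(pc)
        op, *args = program[pc]
        next_pc = pc + 4                  # default: next consecutive instruction
        if op == "jump":                  # hypothetical unconditional transfer
            next_pc = args[0]
        pc = next_pc
    return trace

program = {0: ("nop",), 4: ("jump", 12), 8: ("nop",), 12: ("nop",)}
print(trace_control_flow(program, 0))  # [0, 4, 12]
```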

In practice, some instruction sequences from the same program are logically independent and may run in any order or *threaded* concurrently.

Extra resources are needed to run different threads concurrently. At a minimum, an OS is needed that <u>simulates</u> concurrence by transferring control around.


## **Instruction Models**

### Instruction set

Essentially a set of **opcodes** and **operand** <u>specifications</u> that the hardware can recognize.



⇒ CISC (Intel 8086): do more

⇒ RISC (MIPS): do less

⇒ VLIW: do in parallel (later)

Pack multiple independent operations into one instruction to relieve fetch-decode burdens and facilitate parallel execution.

Natural **data flow** <u>allowed</u> <u>concurrence</u> where centralized control forced artificial sequencing, which motivated adding functional units to exploit it.

**Multithreading** can support apparent concurrency; programs will seem to run faster.

**Exercise** Compare parallelism and concurrency.


**Instruction execution** 

Solution Dataflow model origins

Where multi-threading fits?

Hardware threads


## **Register Size Limitation**

The number of bits internal registers can hold is a major architectural feature.

Historically, separate registers were used for data operands, indexing, and addresses; MIPS uses GPR in all cases.

Most machines will not provide circuitry to handle more bits than associated registers can store.

Register size can influence instruction design and capabilities in subtle ways.


Old microprocessors had economical variants with a narrower data bus (fewer pins), labeled 8/16-bit despite having the same wider internal registers (e.g., 8086/88, 68000/68008).


## Memory addressing

Limit directly accessible memory

## Computation

Limit computation performed directly in hardware

## What if operands don't fit?

Break down computation in software


Quiz: Why would the move from a 32-bit processor to a 64-bit one be significant from an instruction set viewpoint?

## **What Exactly is Software?**

Most computers can add two 32/64-bit integers in physical registers using physical circuits but can't calculate *f* directly since they mostly lack a dedicated circuit to perform a *difference-of-2-sums* operation.

A compiler can easily generate a sequence of machine instructions to perform the operation.

f = (x+y) - (s+t);

\$s1-\$s5 ← f, x, y, s, t

\$t1 ← x + y

\$t2 ← s + t

add \$t1,\$s2,\$s3
add \$t2,\$s4,\$s5
sub \$s1,\$t1,\$t2

Similarly, while classic MIPS (circa 1985) can't add two 128-bit integers in hardware, a small **program** (schedule of instructions) can do it.

The operation is said to be performed in **software** since no circuitry is dedicated to it.
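A minimal Python sketch of such a program, assuming each 128-bit integer is held as four 32-bit words, least significant first. The function names are hypothetical. On a real MIPS, which has no carry flag, the carry would typically be derived with a comparison such as `sltu` rather than read from a wider sum as done here.

```python
MASK32 = 0xFFFFFFFF

def add128(a: list[int], b: list[int]) -> list[int]:
    """Add two 128-bit values given as four 32-bit words, least significant first."""
    result, carry = [], 0
    for wa, wb in zip(a, b):
        total = wa + wb + carry          # one 32-bit add plus carry-in
        result.append(total & MASK32)    # keep what fits in a register
        carry = total >> 32              # carry-out feeds the next word
    return result                        # final carry dropped (wraps mod 2**128)

def to_words(n: int) -> list[int]:
    """Split an integer into four 32-bit words, least significant first."""
    return [(n >> (32 * i)) & MASK32 for i in range(4)]

def from_words(words: list[int]) -> int:
    """Reassemble an integer from 32-bit words, least significant first."""
    return sum(w << (32 * i) for i, w in enumerate(words))

x, y = 2**100 + 0xFFFFFFFF, 1
print(from_words(add128(to_words(x), to_words(y))) == x + y)  # True
```

Four hardware additions, scheduled in order with carries passed along, stand in for the missing 128-bit adder.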

Different machine programs (why?) even though they logically perform the same operation:

| Program A | Program B |
|-----------|-----------|
| add \$t2,\$s4,\$s5<br>add \$t1,\$s2,\$s3<br>sub \$s1,\$t1,\$t2 | add <mark>\$t0</mark>,\$s2,\$s3<br>add <mark>\$t1</mark>,\$s4,\$s5<br>sub \$s1,\$t0,\$t1 |

A computer designer may decide to provide physical circuits for such expressions and a corresponding machine instruction though, as was often the case with CISC.
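The equivalence of the two programs can be checked with a toy register-file model in Python. This is a sketch only: the `execute` helper and the operand values are made up, and 32-bit wraparound is ignored.

```python
def execute(program, regs):
    """Execute a list of (op, dest, src1, src2) tuples on a register dict."""
    regs = dict(regs)                       # leave the caller's registers untouched
    for op, rd, rs, rt in program:
        if op == "add":
            regs[rd] = regs[rs] + regs[rt]
        elif op == "sub":
            regs[rd] = regs[rs] - regs[rt]
    return regs

# x, y, s, t live in $s2..$s5; f is written to $s1 (values are made up)
start = {"$s2": 7, "$s3": 5, "$s4": 2, "$s5": 1}
prog_a = [("add", "$t2", "$s4", "$s5"), ("add", "$t1", "$s2", "$s3"),
          ("sub", "$s1", "$t1", "$t2")]
prog_b = [("add", "$t0", "$s2", "$s3"), ("add", "$t1", "$s4", "$s5"),
          ("sub", "$s1", "$t0", "$t1")]
print(execute(prog_a, start)["$s1"], execute(prog_b, start)["$s1"])  # 9 9
```

The programs differ in instruction order and in the temporaries chosen, yet leave the same value in \$s1.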


## **Instruction Design Simplified vs. Complex**

### Intel 8088 ADD

A memory operand can considerably complicate the execution profile of the instruction on the same hardware.

**Memory operand** 9 cycles + 4 cycles (transfer) + address computation

**Register operand** 3 cycles

### **Conditional branch**

### Intel 8088/8086

JG/JNLE JGE/JNL JL/JNGE JLE/JNG JO JS JNO JNS JA/JNBE JAE/JNB JB/JNAE JBE/JNA JC JE/JZ JP/JPE JNC JNE/JNZ JNP/JPO JCXZ CMP\*

### **MIPS R2000/3000**

**beq bne** bltz blez bgtz bgez bltzal bgezal slt slti sltu sltiu


The 8088/8086 instruction set supports every potentially useful condition with a variety of operand scenarios.

MIPS supports a minimum necessary to compose the rest, favoring the more frequent scenarios, with fast register operands only.
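How the remaining conditions might be composed from the MIPS minimum can be sketched in Python. The model below decides four two-register comparisons using only `slt` followed by `beq` or `bne` against \$0; the function `branch_taken` and its condition names are hypothetical, and the instruction pairs in the comments show one common composition.

```python
def slt(a: int, b: int) -> int:
    """Set-on-less-than: 1 if a < b else 0 (signed compare)."""
    return 1 if a < b else 0

def branch_taken(cond: str, a: int, b: int) -> bool:
    """Decide a two-register branch using only slt plus beq/bne against $0."""
    if cond == "lt":                 # slt $at,a,b ; bne $at,$0,L
        return slt(a, b) != 0
    if cond == "ge":                 # slt $at,a,b ; beq $at,$0,L
        return slt(a, b) == 0
    if cond == "gt":                 # slt $at,b,a ; bne $at,$0,L
        return slt(b, a) != 0
    if cond == "le":                 # slt $at,b,a ; beq $at,$0,L
        return slt(b, a) == 0
    raise ValueError(f"unknown condition: {cond}")

print([branch_taken(c, 3, 5) for c in ("lt", "ge", "gt", "le")])
# [True, False, False, True]
```

Each composed condition costs two instructions instead of one, which is the tradeoff accepted for a smaller opcode set.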


## Instruction Design Note on Composability

$r4 \leftarrow r1 + r2 \times r3$

$r1 \leftarrow r1 + r2 \times r3$

A frequent operation in significant applications (part of a sum-of-products calculation) for which a dedicated machine instruction (circuit) may be justified.

### \$t1←mem[\$a0+\$s3]

Memory address obtained from 2 registers, one for array base address and the 2nd for a variable index (instead of a constant).

Note specialized register for (base) indexing in 8086/8088 vs. GPR in RISC.


### **Fused multiply-add** One instruction, two jobs

**ARMv8 (RISC)** madd r4,r2,r3,r1

**MIPS (R10000)** madd.d f4,f1,f2,f3

### MIPS R2000/R3000

madd \$s4,\$s1,\$s2,\$s3
multu \$s2,\$s3
mflo \$at
add \$s4,\$s1,\$at

Or, a machine may offer a **pseudo-instruction** (gray) to make programming easy, to be <u>composed</u> by a sequence of machine instructions (*in color*).
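The three-instruction expansion can be mimicked step by step in Python. This is a sketch of the data flow only: the function name is hypothetical, and the overflow trap of `add` is not modeled (the sum simply wraps to 32 bits).

```python
MASK32 = 0xFFFFFFFF

def madd_expanded(s1: int, s2: int, s3: int) -> int:
    """Mimic: multu $s2,$s3 ; mflo $at ; add $s4,$s1,$at (returns $s4)."""
    product = (s2 & MASK32) * (s3 & MASK32)   # multu: 64-bit product in HI:LO
    lo = product & MASK32                     # mflo: low 32 bits go to $at
    return (s1 + lo) & MASK32                 # add, modeled without the overflow trap

print(madd_expanded(10, 6, 7))  # 52
```

The upper half of the product (HI) is discarded, so the composed operation is only exact when the product fits in 32 bits.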

### **Indexed Addressing** Processing large arrays

### **PowerPC (RISC)**

lw \$t1,\$a0+\$s3

### 8086/8088 (CISC)

MOV AX,[BX+**SI**]

### MIPS R2000/R3000

addu \$t0,\$a0,\$s3
lw \$t1,0(\$t0)

Note the version (opcode) of add needed to treat operands as unsigned, since we are dealing with an address.
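A small Python model of the two-instruction MIPS sequence, assuming memory is a dictionary keyed by byte address and that the index register already holds a byte offset. The function name and the memory contents are made up.

```python
MASK32 = 0xFFFFFFFF

def load_word_indexed(memory: dict[int, int], base: int, index: int) -> int:
    """Mimic: addu $t0,$a0,$s3 ; lw $t1,0($t0)."""
    t0 = (base + index) & MASK32    # addu: wraps around, never traps on overflow
    assert t0 % 4 == 0, "lw requires a word-aligned address"
    return memory[t0]               # lw with a zero displacement

# Made-up memory: a 3-word array starting at byte address 0x1000
memory = {0x1000: 11, 0x1004: 22, 0x1008: 33}
print(load_word_indexed(memory, 0x1000, 8))  # 33
```

A machine with indexed addressing does the address addition inside the load itself, saving one instruction and one temporary register per access.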