Introduction to Pipelining

Tom Kelliher, CS 240

Nov. 11, 2005

Administrivia

Simple MIPS datapath implementation.

Pipelined datapath.

The laundry analogy:

[Figure: the laundry analogy (Figures/f0601.eps)]

The five stage MIPS pipeline:

Instruction fetch.
Decode and read registers.
The consistent placement of the source register fields in every instruction format lets the register reads begin while the instruction is still being decoded.
Execute ALU operation or calculate an address.
Access memory.
Result write-back.

Assume:

Instruction class times:

Instruction class	Instruction fetch	Register read	ALU operation	Data access	Register write	Total time
lw	2 ns	1 ns	2 ns	2 ns	1 ns	8 ns
sw	2 ns	1 ns	2 ns	2 ns	-	7 ns
R-format	2 ns	1 ns	2 ns	-	1 ns	6 ns
beq	2 ns	1 ns	2 ns	-	-	5 ns

Clock periods for the two implementations?
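One way to work out the clock periods, as a minimal sketch: the single-cycle clock has to fit the slowest instruction from start to finish, while the pipelined clock only has to fit the slowest stage. The stage times are copied from the instruction class table; the function and variable names are made up for this illustration.

```python
# Stage latencies in ns, copied from the instruction class table.
# None marks a stage that the instruction class does not use.
STAGE_TIMES_NS = {
    "lw":       [2, 1, 2, 2, 1],
    "sw":       [2, 1, 2, 2, None],
    "R-format": [2, 1, 2, None, 1],
    "beq":      [2, 1, 2, None, None],
}

def single_cycle_period_ns(table):
    """The clock must fit the slowest instruction's total time."""
    return max(sum(t for t in stages if t is not None)
               for stages in table.values())

def pipelined_period_ns(table):
    """The clock must fit the slowest individual stage."""
    return max(t for stages in table.values()
               for t in stages if t is not None)

print(single_cycle_period_ns(STAGE_TIMES_NS))  # 8
print(pipelined_period_ns(STAGE_TIMES_NS))     # 2
```

Under these assumptions the single-cycle period is set by lw (8 ns) and the pipelined period by the 2 ns stages.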

Execution example:

[Figure: execution example (Figures/f0603.eps)]
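The overlap in a pipelined execution can be sketched as a stage-by-cycle grid. This is only an illustration, assuming three back-to-back instructions with no hazards; the stage abbreviations and the `pipeline_schedule` helper are names chosen here, not taken from the figure.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_schedule(n_instructions):
    """Return a grid: row i is instruction i, column c is clock cycle c + 1."""
    total_cycles = n_instructions + len(STAGES) - 1
    grid = [["."] * total_cycles for _ in range(n_instructions)]
    for i in range(n_instructions):
        # Instruction i enters IF in cycle i + 1, then advances one stage per cycle.
        for s, stage in enumerate(STAGES):
            grid[i][i + s] = stage
    return grid

for row in pipeline_schedule(3):
    print(" ".join(f"{cell:<3}" for cell in row))
```

Three instructions finish in 7 cycles, which is 14 ns at a 2 ns clock, compared with 3 x 8 ns = 24 ns for the single-cycle implementation.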

Note that pipelined register file reads are done during the second half of the clock cycle and writes are done during the first half. Why?
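One plausible answer, sketched as a toy model: if the write happens in the first half of the cycle and the read in the second half, an instruction in write-back and a later instruction in decode can use the same register in the same cycle, and the reader sees the new value. The `RegisterFile` class and its `clock_cycle` method are hypothetical names for this illustration only.

```python
class RegisterFile:
    """Toy 32-register file with a write-then-read clock cycle."""

    def __init__(self):
        self.regs = [0] * 32

    def clock_cycle(self, write=None, read=()):
        # First half of the cycle: the write-back stage writes its result.
        if write is not None:
            reg, value = write
            if reg != 0:  # register 0 is hard-wired to zero in MIPS
                self.regs[reg] = value
        # Second half of the cycle: the decode stage reads its source
        # registers and therefore sees the value written just above.
        return tuple(self.regs[r] for r in read)

rf = RegisterFile()
print(rf.clock_cycle(write=(8, 42), read=(8, 9)))  # (42, 0)
```

With the halves swapped, the read of register 8 would return the stale value and the reading instruction would have to wait a cycle.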

Consider the speedup:

Assumption: Stages are of equal length. What if they aren't?
Speedup is at most the number of pipeline stages.
Do we achieve that?
Consider the execution of 1,000 instructions and compute the actual speedup.
What happened? The cost of the pipeline registers.
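The 1,000-instruction case can be worked through as a rough sketch, assuming an 8 ns single-cycle clock, a 2 ns pipelined clock, five stages, and no hazards. The instruction class table does not list a pipeline register latency, so it appears here as a hypothetical `reg_overhead_ns` parameter that defaults to 0; with it at zero, the shortfall from 5 comes from the 1 ns register stages being padded to the 2 ns clock and from the cycles spent filling the pipeline.

```python
def single_cycle_time_ns(n, period_ns=8):
    """Each instruction takes one full (long) clock cycle."""
    return n * period_ns

def pipelined_time_ns(n, stages=5, period_ns=2, reg_overhead_ns=0.0):
    """The first instruction needs `stages` cycles; each later one
    completes one cycle after its predecessor.
    reg_overhead_ns is a hypothetical per-cycle pipeline register cost."""
    return (stages + n - 1) * (period_ns + reg_overhead_ns)

n = 1000
speedup = single_cycle_time_ns(n) / pipelined_time_ns(n)
print(round(speedup, 2))  # 3.98
```

Passing a nonzero `reg_overhead_ns` lengthens every cycle and lowers the speedup further.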

Consider:

How does the speedup occur: shortened instruction execution time or higher instruction bandwidth?
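A quick calculation separates the two effects. It is a sketch that assumes the 8 ns single-cycle clock and the 2 ns, five-stage pipelined clock derived from the instruction class times; the helper names are invented here.

```python
def latency_ns(cycles_per_instruction, period_ns):
    """Time for one instruction to go from fetch to completion."""
    return cycles_per_instruction * period_ns

def throughput_per_ns(period_ns):
    """Steady-state instructions completed per ns (one per clock)."""
    return 1 / period_ns

single = (latency_ns(1, 8), throughput_per_ns(8))
pipelined = (latency_ns(5, 2), throughput_per_ns(2))
print(single)     # (8, 0.125)
print(pipelined)  # (10, 0.5)
```

Under these assumptions an individual lw actually takes longer in the pipeline (10 ns versus 8 ns), so the gain comes from instruction bandwidth, not from shorter instruction execution time.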

MIPS instruction set features that make pipelining easier:

A single instruction size: simplifies instruction fetch.
A small number of instruction formats, with register fields in common locations for all formats: simplifies instruction decode and allows register fetch to proceed in parallel.
Memory operands occur only in sw/lw instructions: simplifies pipeline design and decreases pipeline depth.
Memory data is aligned: a memory operation requires only a single memory read or write.
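The point about register fields in common locations can be illustrated with the standard MIPS encoding, where rs occupies bits 25-21 and rt occupies bits 20-16 in both R-format and I-format instructions. The sketch below extracts both fields without looking at the opcode; the function name is made up for this example.

```python
def source_register_fields(word):
    """Extract rs and rt from a 32-bit MIPS instruction word.
    No opcode check is needed because the fields sit in the same
    bit positions in every format that has them."""
    rs = (word >> 21) & 0x1F  # bits 25-21
    rt = (word >> 16) & 0x1F  # bits 20-16
    return rs, rt

# add $t0, $s1, $s2  (R-format): rs = 17, rt = 18
print(source_register_fields(0x02324020))  # (17, 18)
# lw $t0, 32($s3)    (I-format): rs = 19, rt = 8
print(source_register_fields(0x8E680020))  # (19, 8)
```

Because the extraction does not depend on the instruction type, the register file can be read in the same stage that decodes the opcode.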

Thomas P. Kelliher 2005-11-08