DMA, Introduction to Caches

Tom Kelliher, CS26

Nov. 12, 1996

We'll skip 4.5.2--4.7.

Direct Memory Access

DMA.

How does a disk perform a transfer?

CPU driven method:

Prepare memory buffer, pointer into buffer, and count.
Send disk a command through I/O registers:
1. Cylinder, head, sector.
2. Read or write.
3. Go.
Do other things.
Repeat:
1. Receive interrupt.
2. Transfer byte between disk and memory.
3. Update pointer, count.
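
The CPU driven method above can be sketched as a small simulation. This is only a sketch, not a real driver: the `Disk` class, its `command` and `next_byte` methods, and `interrupt_handler` are hypothetical names, and the "interrupt" is just a function call made once per byte.

```python
class Disk:
    """Hypothetical disk model: holds one sector, hands it out a byte at a time."""

    def __init__(self, sector_data):
        self.sector_data = bytes(sector_data)
        self.position = 0

    def command(self, cylinder, head, sector, operation):
        # A real controller would seek here; this model only resets its position.
        if operation != "read":
            raise ValueError("this sketch only models reads")
        self.position = 0

    def next_byte(self):
        byte = self.sector_data[self.position]
        self.position += 1
        return byte


def cpu_driven_read(disk, count):
    buffer = bytearray(count)  # prepare memory buffer
    state = {"pointer": 0, "count": count, "interrupts": 0}

    def interrupt_handler():
        # One interrupt per byte: transfer the byte, then update pointer and count.
        buffer[state["pointer"]] = disk.next_byte()
        state["pointer"] += 1
        state["count"] -= 1
        state["interrupts"] += 1

    # Send the command: cylinder, head, sector; read or write; go.
    disk.command(cylinder=0, head=0, sector=0, operation="read")
    while state["count"] > 0:
        interrupt_handler()  # stands in for the disk raising an interrupt
    return bytes(buffer), state["interrupts"]


sector = bytes(i % 256 for i in range(512))
data, interrupts = cpu_driven_read(Disk(sector), 512)
```

In this model a 512 byte sector costs 512 trips through the handler, which is the overhead that block transfers try to reduce.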

One possible speed-up: block transfers.

How efficient is this? Assume:

Disk:
1. 3600 RPM.
2. 512 bytes/sector.
3. 56 sectors/track.
4. How many bytes/sec.?
CPU:
1. 100 MHz.
2. 200 instructions/interrupt.

The problem here?
DMA removes the per-byte transfer burden from the CPU.
DMA controller is a bus master --- arbitration required.
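
The disk and CPU numbers assumed above can be worked through as follows. This is a rough sketch under added assumptions that are not in the notes: one interrupt per byte (as in the CPU driven method), one instruction per clock cycle, and a disk that delivers a full track per revolution with no gaps or seeks.

```python
RPM = 3600
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 56
CPU_HZ = 100_000_000
INSTRUCTIONS_PER_INTERRUPT = 200
INSTRUCTIONS_PER_CYCLE = 1  # assumption, not from the notes

revolutions_per_second = RPM // 60  # 60
bytes_per_second = revolutions_per_second * SECTORS_PER_TRACK * BYTES_PER_SECTOR

# One interrupt per byte, so the handler runs bytes_per_second times a second.
instructions_needed = bytes_per_second * INSTRUCTIONS_PER_INTERRUPT
instructions_available = CPU_HZ * INSTRUCTIONS_PER_CYCLE
cpu_load = instructions_needed / instructions_available

print(bytes_per_second)     # 1720320
print(instructions_needed)  # 344064000
print(f"{cpu_load:.2f}")    # 3.44
```

Under these assumptions the handler alone would need about 3.4 times the instructions the CPU can execute, so the CPU cannot keep up with the disk.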

DMA Hardware

DMA architecture (2 channels)

DMA channel interface:

Is it memory mapped?

Programming a DMA Channel

Write starting address, count registers.
Write control register.
Interrupt will be received upon completion.

Is the order important?
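
The programming steps above can be sketched with a toy model of one channel. Everything here is an assumption for illustration: the `DmaChannel` class, the register names, and the `START` and `DEVICE_TO_MEMORY` bits are hypothetical, and the transfer happens instantly instead of concurrently with the CPU.

```python
START = 0x1             # hypothetical control bit: begin the transfer
DEVICE_TO_MEMORY = 0x2  # hypothetical control bit: direction of the transfer


class DmaChannel:
    """Toy model of one DMA channel with address, count, and control registers."""

    def __init__(self, memory, device_data):
        self.memory = memory
        self.device_data = device_data
        self.address = 0
        self.count = 0
        self.interrupt_pending = False

    def write_address(self, value):
        self.address = value

    def write_count(self, value):
        self.count = value

    def write_control(self, value):
        if value & START:
            # The transfer uses whatever the address and count registers hold now.
            for i in range(self.count):
                self.memory[self.address + i] = self.device_data[i]
            self.interrupt_pending = True  # interrupt upon completion


memory = bytearray(64)
channel = DmaChannel(memory, b"sector data")
channel.write_address(16)   # starting address
channel.write_count(11)     # count
channel.write_control(START | DEVICE_TO_MEMORY)  # control register last
```

In this model the order matters: the control write starts the transfer, so the address and count must already be in place. Whether a given real controller behaves this way depends on its design.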

Schematic operation:

CPU reserves an area of memory as a buffer for the I/O (assume a read is performed).
CPU loads the starting address of the buffer into the DMA controller.
CPU loads the transfer count into the DMA controller (assume that it's the same as the buffer size).
Concurrently:
- CPU goes on to another task (the requesting process is blocked).
- I/O device starts transfer, writes data directly to memory.
Memory arbitration problems here.
I/O device interrupts CPU upon completion.
CPU receives interrupt, checks status, schedules formerly blocked process.

Bus Arbitration

Only one device --- bus master --- can control bus.
CPU and DMA controller are bus masters.
How is control passed back and forth?

Centralized arbitration:

Operation. Assume CPU is bus master at start:

DMA n asserts bus request.
CPU accepts request, asserts bus grant (BG).
BG daisy chains until reaching requesting controller.
Controller releases bus request, waits for bus busy to go away.
Controller asserts bus busy and begins using bus.
Controller releases bus busy when done.
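
The grant sequence above can be sketched as a walk down the daisy chain. This is a simplified model: the function name, the list representation of the request lines, and the event strings are made up for illustration, and signal timing is ignored.

```python
def daisy_chain_grant(requests):
    """Return (winner, events) for one arbitration round.

    requests[i] is True if controller i asserts bus request;
    controller 0 is the first one on the grant chain after the CPU.
    """
    events = []
    if not any(requests):
        return None, events  # no request, so the CPU stays bus master

    events.append("CPU asserts BG")
    for position, requesting in enumerate(requests):
        if requesting:
            # The requesting controller keeps BG instead of passing it on.
            events.append(f"DMA {position} releases bus request")
            events.append(f"DMA {position} asserts bus busy, uses bus")
            events.append(f"DMA {position} releases bus busy")
            return position, events
        events.append(f"DMA {position} passes BG along")
    return None, events


winner, events = daisy_chain_grant([False, True, True])
```

Here `winner` is 1: when two controllers request at once, the one earlier in the chain sees BG first, so position in the chain acts as a fixed priority.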

Decentralized arbitration?

Memory

Definitions

Addressing conventions for 32-bit memory:
CPU/memory behavior on word/byte accesses. Consider writing a single byte.
Memory organization:
Memory access is a bottleneck to CPU operation. Speed-ups:
1. Caches.
2. Interleaving. Pipelined access to multiple memory banks. Example:
  1. 200 ns RAM.
  2. 50 ns CPU cycle.
  3. No interleaving vs. 4-way interleaved.
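
The interleaving example above can be worked through with a short timing model. The assumptions are mine, not the notes': consecutive words live in consecutive banks, the CPU can issue at most one access per cycle, and a bank is busy for the full RAM time once an access starts. The function name is hypothetical.

```python
def sequential_read_time(words, banks, ram_ns=200, cpu_cycle_ns=50):
    """Time in ns to read `words` consecutive words from `banks` memory banks."""
    bank_free_at = [0] * banks
    issue_time = 0
    finish_time = 0
    for word in range(words):
        bank = word % banks  # consecutive words map to consecutive banks
        if word > 0:
            # Wait for the next CPU cycle and for the target bank to be free.
            issue_time = max(issue_time + cpu_cycle_ns, bank_free_at[bank])
        finish_time = issue_time + ram_ns
        bank_free_at[bank] = finish_time
    return finish_time


no_interleaving = sequential_read_time(words=8, banks=1)  # 1600 ns
four_way = sequential_read_time(words=8, banks=4)         # 550 ns
```

With one bank every word costs a full 200 ns. With four banks the first word still takes 200 ns, but each later word arrives 50 ns after the previous one, because a bank is reused only every fourth access.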

Memory Cell Array Organization

Organization:

Static RAM

No address bit sharing.
Memory cell organization:

How many transistors?
Reading, Writing?

A static RAM:

Dynamic RAM

Row, Column share address lines --- must strobe and latch.
Memory cell organization:

How many transistors?
Reading, Writing?
Refresh. Stalls.
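
The shared row and column address lines noted above can be illustrated by splitting one address into two halves that are sent in turn. This is a sketch under assumed sizes: the 10 bit row and 10 bit column widths and the function names are hypothetical, and RAS and CAS are the usual names for the two strobes, not names taken from these notes.

```python
def split_address(address, row_bits=10, column_bits=10):
    """Split a cell address into (row, column) for multiplexed address pins."""
    column = address & ((1 << column_bits) - 1)
    row = (address >> column_bits) & ((1 << row_bits) - 1)
    return row, column


def strobe_sequence(address, row_bits=10, column_bits=10):
    """Return the two steps that present the address over the shared lines."""
    row, column = split_address(address, row_bits, column_bits)
    return [
        ("RAS", row),     # row address is strobed and latched first
        ("CAS", column),  # then the same pins carry the column address
    ]


address = (769 << 10) | 3
steps = strobe_sequence(address)
```

For this address `steps` is `[("RAS", 769), ("CAS", 3)]`: a 20 bit address needs only 10 address pins, at the cost of two steps per access.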

A dynamic RAM:

Fast page mode DRAMs.
EDO DRAMs.

Static, Dynamic RAM Comparison

Faster?
Denser?

Why?

Caches use static RAM.

Main memory uses dynamic RAM.

Design techniques:

Modular.
Scalable.
Reducing number of drawn transistors.

A 4Mx32 DRAM Memory System

Uses DRAMs in a chip array.
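
The size of such a chip array can be estimated as follows. The notes do not say which DRAM part is used, so the 1Mx4 chip here is purely an assumption for illustration, as is the function name.

```python
M = 2 ** 20


def chip_array(words, word_bits, chip_words, chip_bits):
    """Return (rows, chips_per_row, total_chips) for a memory built from chips."""
    chips_per_row = word_bits // chip_bits  # chips side by side form one word
    rows = words // chip_words              # rows of chips extend the address range
    return rows, chips_per_row, rows * chips_per_row


# Hypothetical part: 1Mx4 DRAM chips building a 4Mx32 system.
rows, chips_per_row, total_chips = chip_array(4 * M, 32, 1 * M, 4)

capacity_bytes = 4 * M * 32 // 8                  # 16777216 bytes (16 MB)
word_address_bits = (4 * M - 1).bit_length()      # 22
chip_address_bits = (1 * M - 1).bit_length()      # 20, leaving 2 bits to pick a row
```

With that assumed part the array is 4 rows of 8 chips, 32 chips in all. A different chip organization changes the row and column counts but not the 16 MB total.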

The Memory Hierarchy

Registers --- flip-flops.
L1 cache --- on-chip SRAM.
L2 cache --- off-chip SRAM.
Main memory --- DRAM.
Secondary storage --- magnetic disk.

Size, speed, cost? Management?

Thomas P. Kelliher
Mon Nov 11 15:28:37 EST 1996