Associative Caches

Tom Kelliher, CS 240

Apr. 10, 2000

Administrivia

Announcements

Homework due Wednesday.

Assignment

From Last Time

Memory system design.

CAMs, fully-associative caches.

Outline

Finish-up fully associative caches.
Set-associative caches. Design exercise.
Virtual memory, paging.

Coming Up

Virtual memory.

Associative Caches

Fully Associative Caches

One set; all blocks in this set.
The entire cache is organized as a CAM.
Format of address: tag, word offset, byte offset.
Benefit: Maximum flexibility because an incoming block can be placed anywhere, allowing us to optimize the replacement policy.
(Predicting the future.)
Cost: One tag comparator per block, using up silicon resources and increasing access time.

Set-Associative Caches

A compromise.

n blocks per set: an n-way set-associative cache.
Blocks evenly distributed among sets.
Number of sets a power of two, number of blocks/set a power of two. Why?
Associative search within a set. Number of comparators is n.
Format of address: tag, set index, word offset, byte offset.
Some flexibility in block placement. Less cost than fully associative because of reduced number of comparators.
Organization: (figure not reproduced.)
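The address format listed above (tag, set index, word offset, byte offset) can be sketched in a few lines. This is only an illustration, not part of the original notes: the function name split_address and its default parameters (4 words per block, 4 bytes per word) are assumptions, and every size is assumed to be a power of two so that each field is a plain bit slice.

```python
def split_address(addr, num_sets, words_per_block=4, bytes_per_word=4):
    """Split a byte address into (tag, set_index, word_offset, byte_offset).

    Hypothetical helper: assumes all sizes are powers of two.
    """
    # Field widths in bits: log2 of each power-of-two size.
    byte_bits = bytes_per_word.bit_length() - 1
    word_bits = words_per_block.bit_length() - 1
    set_bits = num_sets.bit_length() - 1

    byte_offset = addr & (bytes_per_word - 1)   # lowest bits: byte within word
    addr >>= byte_bits
    word_offset = addr & (words_per_block - 1)  # next bits: word within block
    addr >>= word_bits
    set_index = addr & (num_sets - 1)           # next bits: which set to search
    tag = addr >> set_bits                      # remaining high bits: the tag
    return tag, set_index, word_offset, byte_offset
```

With num_sets = 1 the set index field disappears and the function yields the fully associative format (tag, word offset, byte offset).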

Design Example

Starting with a data size of 64KB, assuming 4 32-bit words per block, show the address organization and compute the total number of bits for these organizations:

Direct-mapped.
Two-way set-associative.
Four-way set-associative.
Eight-way set-associative.
Fully associative.
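One way to check answers to this exercise is a small calculator. The sketch below rests on assumptions the exercise does not state: 32-bit byte addresses, exactly one valid bit per block, and no dirty or replacement-state bits. The name cache_bits is made up for illustration.

```python
def cache_bits(data_bytes, ways, words_per_block=4, bytes_per_word=4,
               addr_bits=32):
    """Return (tag_bits, index_bits, offset_bits, total_bits).

    Assumes 32-bit byte addresses and one valid bit per block;
    both are assumptions, not given in the exercise.
    """
    block_bytes = words_per_block * bytes_per_word
    num_blocks = data_bytes // block_bytes
    num_sets = num_blocks // ways
    offset_bits = block_bytes.bit_length() - 1   # word offset + byte offset
    index_bits = num_sets.bit_length() - 1       # 0 when there is one set
    tag_bits = addr_bits - index_bits - offset_bits
    bits_per_block = 8 * block_bytes + tag_bits + 1  # data + tag + valid
    return tag_bits, index_bits, offset_bits, num_blocks * bits_per_block


if __name__ == "__main__":
    data = 64 * 1024
    blocks = data // 16
    for name, ways in [("direct-mapped", 1), ("2-way", 2), ("4-way", 4),
                       ("8-way", 8), ("fully associative", blocks)]:
        print(name, cache_bits(data, ways))
```

Under these assumptions the direct-mapped case has a 16-bit tag and a 12-bit index, for 593,920 bits in total. Each doubling of associativity moves one bit from the index to the tag.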

Replacement Policies

Choices:

Optimal: Replace the block whose next access lies furthest in the future.
Problem: can't predict future.
A reasonable compromise: least recently used (LRU).
Assume future access behavior resembles past access behavior.
Implementation: Shift register access counters, linked lists.
Other policies: stack, FIFO, MRU, LFU, MFU.
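A minimal software sketch of LRU replacement within a single set follows. It is not how hardware does it: Python's OrderedDict stands in for the linked-list implementation mentioned above, and the class name LRUSet is hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement (illustrative sketch only)."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tags, least recently used first

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)     # mark as most recently used
            return True
        if len(self.blocks) == self.ways:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[tag] = None
        return False
```

For example, a two-way set accessed with tags 1, 2, 1, 3 evicts tag 2 on the last access, because tag 1 was touched more recently.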

Virtual Addresses

Logical = Virtual

Recall:

Program (CPU) generates logical addresses.
MMU converts them to physical addresses.
Compile-time, load-time binding: physical address = logical address.

Paging

The idea:

Entire process in memory.
Partition memory into frames of size 2^m bytes.
1. Typical frame sizes: 512 to 8K bytes.
2. Frame size constrained by MMU design.
Logical address space is broken into pages.
1. Page size = frame size.
2. Pages can be arbitrarily mapped onto frames.
3. However, the process "sees" a contiguous, flat address space.
MMU splits logical address:
1. Assume logical address is n bits.
2. Page number field is n - m most significant bits.
3. Page offset field is m least significant bits.
4. Using table look-up, convert logical page number to physical frame number.
5. Frame offset = page offset.
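The translation steps above can be sketched as follows. The function name translate and the use of a Python dict or list as the page table are assumptions made for illustration; m is the number of page offset bits, as in the list above.

```python
def translate(logical_addr, page_table, m):
    """Map a logical address to a physical address.

    page_table maps page number -> frame number (hypothetical structure);
    m is the number of offset bits, so the page size is 2**m bytes.
    """
    page_number = logical_addr >> m          # the n - m most significant bits
    offset = logical_addr & ((1 << m) - 1)   # the m least significant bits
    frame_number = page_table[page_number]   # table look-up
    return (frame_number << m) | offset      # frame offset = page offset
```

Because the page size is 2^m, the split is a shift and a mask and the recombination is a shift and an OR; no division or addition is needed.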

Why are the frame and page sizes a power of two?

Paging hardware: (figure not reproduced.)

Design issues:

Page size:
1. Internal fragmentation.
2. Maximizing I/O transfer rate.
I/O: process passes logical address to kernel.
Implementation of the page table.
1. Small register file.
2. Array in memory.
Issues for memory implementation:
1. Page table must be in contiguous memory.
2. Page table base register.
3. "Logical memory access" requires two physical accesses.
  1. Translation look-aside buffer.
  2. TLB entries contain: Page number, frame number pairs.
  3. Issue: Context switches.
  4. Only a few entries needed.
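A rough software model of a TLB in front of an in-memory page table is sketched below. The class name TLB, the default capacity of 4, and the FIFO eviction are all assumptions chosen for brevity; real TLBs are associative hardware and their replacement policy varies.

```python
class TLB:
    """Tiny model of a translation look-aside buffer (illustrative only)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = {}  # page number -> frame number, in insertion order

    def lookup(self, page, page_table):
        """Return (frame, hit). A miss models the extra page table access."""
        if page in self.entries:
            return self.entries[page], True
        frame = page_table[page]              # second physical access
        if len(self.entries) == self.capacity:
            oldest = next(iter(self.entries))
            del self.entries[oldest]          # FIFO eviction (assumption)
        self.entries[page] = frame
        return frame, False

    def flush(self):
        """Discard all entries, e.g. on a context switch."""
        self.entries.clear()
```

The flush method reflects the context switch issue noted above: cached page number, frame number pairs belong to one process's page table and are stale for the next process.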

Size, Structure of Page Table

Problem: huge page tables. How did this happen?
Solutions:
1. Valid/invalid bit.
2. Page table limit register.
3. Multi-level paging.
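To get a feel for the size problem and for multi-level paging, here is a small sketch. The example numbers (32-bit addresses, 4 KB pages, 4-byte entries, a 10-bit inner index) are assumptions, not taken from the notes, and both function names are hypothetical.

```python
def flat_table_bytes(addr_bits, offset_bits, entry_bytes):
    """Size of a single-level page table: one entry per logical page."""
    return (1 << (addr_bits - offset_bits)) * entry_bytes


def split_two_level(addr, offset_bits, inner_bits):
    """Split a logical address into (outer_index, inner_index, offset).

    The page number is cut in two: the outer index selects an inner
    page table, and the inner index selects the entry within it.
    """
    offset = addr & ((1 << offset_bits) - 1)
    page = addr >> offset_bits
    inner = page & ((1 << inner_bits) - 1)
    outer = page >> inner_bits
    return outer, inner, offset
```

With the assumed numbers, flat_table_bytes(32, 12, 4) is 4 MB per process. With two levels, inner tables need to exist only for the regions of the address space that are actually in use.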

Protection

Is it possible for a process to access an arbitrary memory location?
Using the valid bit to introduce "holes" into the logical address space.
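A sketch of how a valid bit and a table length check can stop a process from reaching arbitrary memory. The exception name AddressFault, the function name checked_translate, and the (frame, valid) tuple layout are assumptions for illustration; the length check plays the role of a page table limit register.

```python
class AddressFault(Exception):
    """Raised when a logical address has no valid mapping (hypothetical)."""


def checked_translate(logical_addr, page_table, m):
    """Translate with protection checks.

    page_table is a list of (frame_number, valid) pairs; m is the
    number of page offset bits.
    """
    page = logical_addr >> m
    if page >= len(page_table):        # beyond the page table limit
        raise AddressFault(f"page {page} is outside the page table")
    frame, valid = page_table[page]
    if not valid:                      # a "hole" in the address space
        raise AddressFault(f"page {page} is marked invalid")
    return (frame << m) | (logical_addr & ((1 << m) - 1))
```

Since every access goes through the table, a process can reach only the frames that its own valid entries name.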

Page Sharing

What can be shared?
Read-only pages.
Page alignment: segment the logical address space.
Page de-allocation.

Thomas P. Kelliher
Mon Apr 10 07:50:18 EDT 2000
