Associative Caches
Tom Kelliher, CS 240
Apr. 10, 2000
Homework due Wednesday.
Memory system design.
CAMs, fully-associative caches.
- Finish-up fully associative caches.
- Set-associative caches. Design exercise.
- Virtual memory, paging.
Virtual memory.
Fully associative caches:
- One set; all blocks are in this set.
- The entire cache is organized as a CAM.
- Format of address: tag, word offset, byte offset.
- Benefit: Maximum flexibility. An incoming block can be placed
anywhere, so the replacement policy is free to pick the best victim.
(The ideal choice would require predicting the future.)
- Cost: One tag comparator per block, using up silicon resources and
increasing access time.
Set-associative caches: a compromise.
- n blocks per set (n-way set-associative).
Blocks are evenly distributed among sets.
- Number of sets a power of two, number of blocks/set a power of two.
Why?
- Associative search within a set. Number of comparators is n.
- Format of address: tag, set index, word offset, byte offset.
- Some flexibility in block placement. Less cost than fully
associative because of reduced number of comparators.
- Organization: [figure not reproduced]

Starting with a data size of 64KB and assuming four 32-bit words per block,
show the address organization and compute the total number of bits for each
of these organizations:
- Direct-mapped.
- Two-way set-associative.
- Four-way set-associative.
- Eight-way set-associative.
- Fully associative.
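The exercise above can be checked with a short script. This is only a sketch under assumptions the notes do not state: 32-bit byte addresses, and exactly one valid bit per block with no dirty or replacement-state bits. The names `cache_layout` and `log2_exact` are made up for this illustration.

```python
ADDRESS_BITS = 32        # assumption: 32-bit byte addresses (not stated in the notes)
DATA_BYTES = 64 * 1024   # 64KB of data, from the exercise
WORDS_PER_BLOCK = 4      # from the exercise
BYTES_PER_WORD = 4       # 32-bit words
VALID_BITS = 1           # assumption: one valid bit per block, nothing else


def log2_exact(value):
    """Return log2 of a power of two as an integer."""
    assert value > 0 and (value & (value - 1)) == 0, "expected a power of two"
    return value.bit_length() - 1


def cache_layout(ways):
    """Return address field widths and total storage bits for `ways` blocks per set."""
    block_bytes = WORDS_PER_BLOCK * BYTES_PER_WORD
    num_blocks = DATA_BYTES // block_bytes
    num_sets = num_blocks // ways
    byte_offset = log2_exact(BYTES_PER_WORD)
    word_offset = log2_exact(WORDS_PER_BLOCK)
    index = log2_exact(num_sets)          # 0 bits when there is only one set
    tag = ADDRESS_BITS - index - word_offset - byte_offset
    # Each block stores its data, its tag, and the valid bit.
    total_bits = num_blocks * (block_bytes * 8 + tag + VALID_BITS)
    return {"tag": tag, "index": index, "word_offset": word_offset,
            "byte_offset": byte_offset, "total_bits": total_bits}


NUM_BLOCKS = DATA_BYTES // (WORDS_PER_BLOCK * BYTES_PER_WORD)
for name, ways in [("direct-mapped", 1), ("two-way", 2), ("four-way", 4),
                   ("eight-way", 8), ("fully associative", NUM_BLOCKS)]:
    print(name, cache_layout(ways))
```

Under these assumptions, each doubling of the associativity moves one bit from the set index to the tag, so the total grows by one bit per block.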
Replacement policy choices:
- Optimal: Replace the block whose next access lies furthest in
the future.
Problem: can't predict the future.
- A reasonable compromise: least recently used (LRU).
Assume future behavior resembles past behavior.
Implementation: Shift register access counters, linked lists.
- Other policies: stack, FIFO, MRU, LFU, MFU.
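To make LRU concrete, here is a minimal software model of a set-associative cache with LRU replacement. It is a sketch, not a hardware design: it tracks recency with Python's `OrderedDict` rather than the counters or linked lists mentioned above, and the class name `SetAssociativeLRU` and its block-address interface are assumptions made for this example.

```python
from collections import OrderedDict


class SetAssociativeLRU:
    """Model of an n-way set-associative cache that evicts the LRU block."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One ordered map per set: keys are tags, oldest (LRU) entry first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, block_address):
        """Access one block; return True on a hit, False on a miss."""
        index = block_address % self.num_sets   # set index field
        tag = block_address // self.num_sets    # remaining high-order bits
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)              # now the most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) == self.ways:
            lines.popitem(last=False)           # evict the least recently used
        lines[tag] = None
        return False


# One set with two ways behaves like a tiny fully associative cache.
cache = SetAssociativeLRU(num_sets=1, ways=2)
print([cache.access(block) for block in [0, 1, 0, 2, 1]])
```

In this trace, block 1 is evicted when block 2 arrives because block 0 was touched more recently, so the final access to block 1 misses.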
Logical = Virtual
Recall:

- Program (CPU) generates logical addresses.
- MMU converts them to physical addresses.
- Compile-time, load-time binding: physical address = logical address.
The idea:
- Entire process in memory.
- Partition memory into frames of size $2^m$ bytes (m is the number of page offset bits).
- Typical frame sizes: 512 to 8K bytes.
- Frame size constrained by MMU design.
- Logical address space is broken into pages.
- Page size = frame size.
- Pages can be arbitrarily mapped onto frames.
- However, process "sees" a contiguous, flat address space.
- MMU splits logical address:
  - Assume logical address is n bits.
  - Page number field is n - m most significant bits.
  - Page offset field is m least significant bits.
- Using table look-up, convert logical page number to physical frame
number.
- Frame offset = page offset.
Why are the frame and page sizes a power of two?
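The address split can be sketched in a few lines. The example assumes 4KB pages (m = 12) and a made-up page table held in a dictionary; `translate` is a hypothetical name, not part of any real MMU interface. It also suggests one answer to the question above: with a power-of-two page size, the split is a shift and a mask, with no division needed.

```python
PAGE_OFFSET_BITS = 12                  # m; assumption: 4KB pages
PAGE_SIZE = 1 << PAGE_OFFSET_BITS
OFFSET_MASK = PAGE_SIZE - 1


def translate(logical_address, page_table):
    """Map a logical address to a physical address via a page table lookup."""
    page_number = logical_address >> PAGE_OFFSET_BITS   # n - m high bits
    offset = logical_address & OFFSET_MASK              # m low bits
    frame_number = page_table[page_number]              # table look-up
    # Frame offset = page offset, so just glue the two fields together.
    return (frame_number << PAGE_OFFSET_BITS) | offset


# Hypothetical mapping: page 0 -> frame 5, page 1 -> frame 2, page 2 -> frame 7.
page_table = {0: 5, 1: 2, 2: 7}
print(hex(translate(0x1ABC, page_table)))   # page 1, offset 0xABC
```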
Paging hardware: [figure not reproduced]

Design issues:
- Page size:
  - Internal fragmentation.
  - Maximizing I/O transfer rate.
- I/O: process passes logical address to kernel.
- Implementation of the page table.
  - Small register file.
  - Array in memory.
- Issues for memory implementation:
  - Page table must be in contiguous memory.
  - Page table base register.
  - "Logical memory access" requires two physical accesses.
- Translation look-aside buffer.
  - TLB entries contain (page number, frame number) pairs.
  - Issue: Context switches.
  - Only a few entries needed.
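The cost of an in-memory page table, and how a TLB hides it, can be illustrated with a small access counter. This is a sketch under assumptions: the TLB holds at least one entry and uses LRU replacement (the notes do not name a TLB policy), the whole TLB is flushed on a context switch, and `PagedMemory` is a name invented for the example.

```python
from collections import OrderedDict


class PagedMemory:
    """Counts physical memory accesses with a small TLB in front of the page table."""

    def __init__(self, page_table, tlb_entries):
        self.page_table = page_table       # page number -> frame number
        self.tlb_entries = tlb_entries     # assumption: at least 1
        self.tlb = OrderedDict()           # cached (page, frame) pairs, LRU first
        self.physical_accesses = 0

    def frame_for(self, page_number):
        """Translate a page number and count the memory accesses it costs."""
        if page_number in self.tlb:
            self.tlb.move_to_end(page_number)      # TLB hit: no table access
            frame = self.tlb[page_number]
        else:
            self.physical_accesses += 1            # read the page table entry
            frame = self.page_table[page_number]
            if len(self.tlb) == self.tlb_entries:
                self.tlb.popitem(last=False)       # assumed LRU replacement
            self.tlb[page_number] = frame
        self.physical_accesses += 1                # the data access itself
        return frame

    def context_switch(self, new_page_table):
        """Switch processes: old translations are stale, so flush the TLB."""
        self.page_table = new_page_table
        self.tlb.clear()


memory = PagedMemory({0: 5, 1: 2}, tlb_entries=2)
memory.frame_for(0)    # miss: table access plus data access
memory.frame_for(0)    # hit: data access only
print(memory.physical_accesses)
```

A miss costs two physical accesses and a hit costs one, which is why a few entries go a long way when references cluster on a handful of pages.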

- Problem: huge page tables. How did this happen?
- Solutions:
  - Valid/invalid bit.
  - Page table limit register.
  - Multi-level paging.
- Is it possible for a process to access an arbitrary memory location?
- Using valid bit to introduce "holes" into logical address space.
- What can be shared?
  - Read-only pages.
- Page alignment: segment the logical address space.
- Page de-allocation.
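The huge page table problem noted above can be put into numbers. The figures here are assumptions chosen for illustration, not taken from the notes: a 32-bit logical address, 4KB pages, 4-byte page table entries, and a 10/10/12 split for two-level paging. Both function names are hypothetical.

```python
def page_table_bytes(address_bits, page_offset_bits, entry_bytes):
    """Size of a flat, single-level page table covering the whole address space."""
    num_pages = 1 << (address_bits - page_offset_bits)
    return num_pages * entry_bytes


def split_two_level(logical_address, offset_bits, inner_bits):
    """Split an address into (outer index, inner index, page offset)."""
    offset = logical_address & ((1 << offset_bits) - 1)
    inner = (logical_address >> offset_bits) & ((1 << inner_bits) - 1)
    outer = logical_address >> (offset_bits + inner_bits)
    return outer, inner, offset


# Assumed parameters: 32-bit addresses, 4KB pages, 4-byte entries.
print(page_table_bytes(32, 12, 4))               # bytes per process, flat table
print(split_two_level(0x12345678, 12, 10))       # (outer, inner, offset)
```

With these numbers a flat table takes 4MB per process no matter how little of the address space is in use. With two levels, only the outer table and the inner tables that cover valid pages need to exist.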
Thomas P. Kelliher
Mon Apr 10 07:50:18 EDT 2000