Associative Caches

Tom Kelliher, CS 240

Apr. 10, 2000

Administrivia

Announcements

Homework due Wednesday.

Assignment

From Last Time

Memory system design.

CAMs, fully-associative caches.

Outline

  1. Finish up fully associative caches.

  2. Set-associative caches. Design exercise.

  3. Virtual memory, paging.

Coming Up

Virtual memory.

Associative Caches

Fully Associative Caches

  1. One set; all blocks in this set.

  2. The entire cache is organized as a CAM.

  3. Format of address: tag, word offset, byte offset.

  4. Benefit: Maximum flexibility because an incoming block can be placed anywhere, allowing us to optimize the replacement policy.

    (Predicting the future.)

  5. Cost: One tag comparator per block, using up silicon resources and increasing access time.
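The address format and tag search above can be sketched in a few lines of Python. This is only an illustration: the field widths (4 words per block, 4 bytes per word) and the names `split_address` and `lookup` are assumptions, not part of the notes, and the loop is a sequential stand-in for what a CAM does in parallel.

```python
# Hypothetical parameters: 4 words per block, 4 bytes per word.
WORD_OFFSET_BITS = 2
BYTE_OFFSET_BITS = 2
BLOCK_OFFSET_BITS = WORD_OFFSET_BITS + BYTE_OFFSET_BITS


def split_address(addr):
    """Split a byte address into (tag, word offset, byte offset)."""
    byte_offset = addr & ((1 << BYTE_OFFSET_BITS) - 1)
    word_offset = (addr >> BYTE_OFFSET_BITS) & ((1 << WORD_OFFSET_BITS) - 1)
    tag = addr >> BLOCK_OFFSET_BITS  # no set index: there is only one set
    return tag, word_offset, byte_offset


def lookup(stored_tags, addr):
    """Return the index of the block holding addr, or None on a miss.

    Hardware compares every stored tag at once (one comparator per block);
    this loop only models the result of that parallel match.
    """
    tag, _, _ = split_address(addr)
    for i, stored in enumerate(stored_tags):
        if stored == tag:
            return i
    return None
```

Because there is no set index, every address bit above the block offset belongs to the tag, which is why each block needs its own comparator.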

Set-Associative Caches

A compromise.

  1. n blocks per set (an n-way set-associative cache).

    Blocks evenly distributed among sets.

  2. Number of sets a power of two, number of blocks/set a power of two. Why?

  3. Associative search within a set. Number of comparators is n.

  4. Format of address: tag, set index, word offset, byte offset.

  5. Some flexibility in block placement. Less cost than fully associative because of reduced number of comparators.

  6. Organization:
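As a rough sketch of the address format in item 4, the function below pulls the four fields out of a byte address. The name `split_set_associative` and the default block geometry are assumptions for illustration. It also hints at the answer to the power-of-two question: when every count is a power of two, each field is a plain shift and mask.

```python
def split_set_associative(addr, num_sets, words_per_block=4, bytes_per_word=4):
    """Return (tag, set index, word offset, byte offset) for a byte address.

    Assumes num_sets, words_per_block and bytes_per_word are powers of two,
    so each field can be extracted with a shift and a mask.
    """
    byte_bits = bytes_per_word.bit_length() - 1    # log2(bytes per word)
    word_bits = words_per_block.bit_length() - 1   # log2(words per block)
    index_bits = num_sets.bit_length() - 1         # log2(number of sets)

    byte_offset = addr & (bytes_per_word - 1)
    word_offset = (addr >> byte_bits) & (words_per_block - 1)
    set_index = (addr >> (byte_bits + word_bits)) & (num_sets - 1)
    tag = addr >> (byte_bits + word_bits + index_bits)
    return tag, set_index, word_offset, byte_offset
```

Only the n tags stored in the selected set need to be compared against `tag`, which is where the saving in comparators comes from.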

Design Example

Starting with a data size of 64KB, assuming 4 32-bit words per block, show the address organization and compute the total number of bits for these organizations:
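The list of organizations is not reproduced in these notes, so the sketch below assumes three common ones: direct-mapped, 4-way set-associative, and fully associative. It further assumes 32-bit byte addresses and one valid bit per block; the function name `cache_bits` and all of these parameters are illustrative choices, not given in the exercise.

```python
def cache_bits(data_bytes, ways, addr_bits=32, words_per_block=4,
               bytes_per_word=4, valid_bits=1):
    """Return field widths and total storage bits for one organization.

    Assumptions (not from the notes): 32-bit byte addresses and a single
    valid bit per block; replacement-policy state is not counted.
    """
    block_bytes = words_per_block * bytes_per_word
    num_blocks = data_bytes // block_bytes
    num_sets = num_blocks // ways
    offset_bits = block_bytes.bit_length() - 1   # word + byte offset
    index_bits = num_sets.bit_length() - 1       # 0 when there is one set
    tag_bits = addr_bits - index_bits - offset_bits
    bits_per_block = 8 * block_bytes + tag_bits + valid_bits
    return {
        "sets": num_sets,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "total_bits": num_blocks * bits_per_block,
    }


if __name__ == "__main__":
    size = 64 * 1024
    blocks = size // 16
    for name, ways in [("direct-mapped", 1), ("4-way", 4),
                       ("fully associative", blocks)]:
        print(name, cache_bits(size, ways))
```

Under these assumptions the tag grows from 16 to 18 to 28 bits as associativity increases, and the totals come to 593,920, 602,112, and 643,072 bits respectively.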

Replacement Policies

Choices:

  1. Optimal: Replace the block whose next access lies furthest in the future.

    Problem: can't predict future.

  2. A reasonable compromise: least recently used (LRU).

    Assume future access behavior is predicted by past access behavior.

    Implementation: Shift register access counters, linked lists.

  3. Other policies: stack, FIFO, MRU, LFU, MFU.
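A minimal software sketch of LRU replacement within one set follows. It uses Python's `collections.OrderedDict` as the recency list, which is closest in spirit to the linked-list implementation mentioned above; the class name `LRUSet` is hypothetical, and real hardware would use counters or similar state bits instead.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set holding up to `ways` blocks, with LRU replacement."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> block data, least recent first

    def access(self, tag):
        """Touch `tag`; return True on a hit, False on a miss."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)     # now the most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[tag] = None              # placeholder for the block's data
        return False
```

For example, with two ways the tag sequence 1, 2, 1, 3 evicts tag 2 rather than tag 1, because tag 1 was touched more recently.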

Virtual Addresses

Logical = Virtual

Recall:

  1. Program (CPU) generates logical addresses.

  2. MMU converts them to physical addresses.

  3. Compile-time, load-time binding: physical address = logical address.

Paging

The idea:

  1. Entire process in memory.

  2. Partition memory into frames of size 2^m bytes.
    1. Typical frame sizes: 512 to 8K bytes.

    2. Frame size constrained by MMU design.

  3. Logical address space is broken into pages.
    1. Page size = frame size.

    2. Pages can be arbitrarily mapped onto frames.

    3. However, process "sees" a contiguous, flat address space.

  4. MMU splits logical address:
    1. Assume logical address is n bits.

    2. Page number field is n - m most significant bits.

    3. Page offset field is m least significant bits.

    4. Using table look-up, convert logical page number to physical frame number.

    5. Frame offset = page offset.
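The address split performed by the MMU can be sketched as below. The function name `translate` and the use of a Python list as the page table are assumptions made for illustration; `m` is the offset width from the list above.

```python
def translate(logical_addr, page_table, m):
    """Map a logical address to a physical address.

    page_table[p] holds the frame number for logical page p, and m is the
    number of page-offset bits (page size = frame size = 2**m bytes).
    """
    page_number = logical_addr >> m          # the n - m most significant bits
    offset = logical_addr & ((1 << m) - 1)   # the m least significant bits
    frame_number = page_table[page_number]   # table look-up
    return (frame_number << m) | offset      # frame offset = page offset
```

With 4 KB pages (m = 12) and a table mapping page 1 to frame 2, logical address 0x1ABC becomes physical address 0x2ABC: only the upper bits change.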

Why are the frame and page sizes a power of two?

Paging hardware:

Design issues:

  1. Page size:
    1. Internal fragmentation.

    2. Maximizing I/O transfer rate.

  2. I/O: process passes logical address to kernel.

  3. Implementation of the page table.
    1. Small register file.

    2. Array in memory.

  4. Issues for memory implementation:
    1. Page table must be in contiguous memory.

    2. Page table base register.

    3. Each logical memory access requires two physical accesses: one for the page table entry, one for the data.
      1. Translation look-aside buffer.

      2. TLB entries contain: Page number, frame number pairs.

      3. Issue: Context switches.

      4. Only a few entries needed.
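The role of the TLB can be illustrated with a small software model. Everything here is an assumption for the sake of the sketch: the class name `TLB`, the capacity of four entries, FIFO eviction, and flushing the whole TLB on a context switch (one possible answer to the context-switch issue, not necessarily the one intended in the notes).

```python
class TLB:
    """A tiny cache of (page number -> frame number) pairs."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # in-memory table of frame numbers
        self.capacity = capacity
        self.entries = {}             # page number -> frame number
        self.hits = 0
        self.misses = 0

    def frame_for(self, page_number):
        """Return the frame for a page, consulting the TLB first."""
        if page_number in self.entries:
            self.hits += 1            # no page table access needed
            return self.entries[page_number]
        self.misses += 1
        frame = self.page_table[page_number]  # the extra memory access
        if len(self.entries) >= self.capacity:
            # FIFO eviction: dicts keep insertion order, so drop the oldest.
            del self.entries[next(iter(self.entries))]
        self.entries[page_number] = frame
        return frame

    def flush(self):
        """Discard all entries, e.g. when another process is scheduled."""
        self.entries.clear()
```

Because programs tend to reuse a small number of pages over short intervals, even a few entries can absorb most translations, so the second physical access is rarely needed.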

Size, Structure of Page Table

  1. Problem: huge page tables. How did this happen?

  2. Solutions:
    1. Valid/invalid bit.

    2. Page table limit register.

    3. Multi-level paging.
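A short calculation shows how page tables become huge, and how multi-level paging splits the page number. The figures used (32-bit addresses, 4 KB pages, 4-byte entries, a 10/10/12 split) are common textbook assumptions rather than values from these notes, and both function names are hypothetical.

```python
def flat_table_bytes(addr_bits, offset_bits, entry_bytes):
    """Size of a single-level page table covering the whole address space."""
    num_pages = 1 << (addr_bits - offset_bits)
    return num_pages * entry_bytes


def split_two_level(addr, offset_bits=12, inner_bits=10):
    """Split an address into (outer index, inner index, page offset).

    The outer index selects an entry in the top-level table, which points
    to a second-level table; the inner index selects the frame there.
    """
    offset = addr & ((1 << offset_bits) - 1)
    inner = (addr >> offset_bits) & ((1 << inner_bits) - 1)
    outer = addr >> (offset_bits + inner_bits)
    return outer, inner, offset
```

Under these assumptions a flat table needs 2^20 entries, or 4 MB per process, even if the process touches only a few pages. With two levels, second-level tables for unused regions need not be allocated at all.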

Protection

  1. Is it possible for a process to access an arbitrary memory location?

  2. Using valid bit to introduce "holes" into logical address space.
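One way to picture the protection check is a translation routine that refuses any page not marked valid. In this sketch the table length stands in for a page table limit register, and the names `AddressFault` and `checked_translate` are hypothetical; a real MMU would raise a hardware trap rather than a Python exception.

```python
class AddressFault(Exception):
    """Raised when a process touches a page it does not own."""


def checked_translate(logical_addr, page_table, m):
    """Translate an address, trapping on out-of-range or invalid pages.

    page_table is a list of (valid, frame number) pairs; its length plays
    the role of a page table limit register. m is the page-offset width.
    """
    page_number = logical_addr >> m
    offset = logical_addr & ((1 << m) - 1)
    if page_number >= len(page_table):
        raise AddressFault(f"page {page_number} is beyond the table limit")
    valid, frame_number = page_table[page_number]
    if not valid:
        raise AddressFault(f"page {page_number} is a hole (valid bit clear)")
    return (frame_number << m) | offset
```

Since every physical address is built from a frame number taken out of the process's own page table, a process can only reach frames the kernel has placed there.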

Page Sharing

  1. What can be shared?

  2. Read-only pages.

  3. Page alignment: segment the logical address space.

  4. Page de-allocation.



Thomas P. Kelliher
Mon Apr 10 07:50:18 EDT 2000