Control Hazards and Exceptions

Tom Kelliher, CS 240

Feb. 9, 2000

Administrivia

Announcements

Paper handouts.

Assignment

Read Sections 6.8-6.12.

From Last Time

Data hazards.

Outline

  1. Control hazards, optimizations.

  2. Exceptions, interrupt types, imprecise interrupts.

Coming Up

Advanced pipelining techniques.

Control Hazards

How would you stall the pipeline to wait for a branch to resolve? When would you begin stalling? For how long?

Consider this code segment executing on an un-optimized pipeline:

We "predict" branch-not-taken, a form of speculative execution. This helps only when the branch is not taken: about 50% of the time if both outcomes are equally likely.

  1. How does this conflict with the delayed branch behavior we've discussed?

  2. Can you explain why it takes so long to load the branch target into the PC?

  3. We have to kill three instructions. Will any of them have modified architectural state? Suppose the instruction at address 44 was a sw. Would that have modified state?
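A back-of-the-envelope sketch of what the three killed instructions cost. Only the three-cycle penalty comes from the discussion above; the function name, the base CPI of 1, and the branch-frequency and taken-fraction parameters are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical helper: average cycles per instruction for a pipeline that
 * predicts branch-not-taken and kills `penalty` instructions whenever a
 * branch turns out to be taken. Assumes a base CPI of 1 (no other stalls). */
double effective_cpi(double branch_freq, double taken_frac, int penalty)
{
    assert(branch_freq >= 0.0 && branch_freq <= 1.0);
    assert(taken_frac >= 0.0 && taken_frac <= 1.0);
    assert(penalty >= 0);

    /* Only taken branches pay the penalty; everything else costs 1 cycle. */
    return 1.0 + branch_freq * taken_frac * penalty;
}
```

For example, if 20% of instructions are branches and half of them are taken, effective_cpi(0.2, 0.5, 3) comes to about 1.3 under these assumptions.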

Optimizations

Can we improve this? Three options: decide earlier, predict (guess), or use a delayed branch.

Deciding Earlier

Make the decision during the ID stage:

  1. Test register file outputs for equality, using XNORs and an AND gate.

  2. Complicating factors? (Forwarding.)

  3. Conditionally flush the instruction in the IF stage.
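The equality test in step 1 can be modeled in software as follows. This is a minimal sketch of the XNOR-then-AND idea, not a description of the actual datapath; the function names are hypothetical and 32-bit registers are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise XNOR of two 32-bit register values: a result bit is 1 wherever
 * the corresponding input bits agree. */
uint32_t xnor32(uint32_t a, uint32_t b)
{
    return ~(a ^ b);
}

/* AND together all 32 XNOR outputs: the result is 1 only if every bit
 * position matched, i.e. the two registers are equal. */
int regs_equal(uint32_t rs, uint32_t rt)
{
    uint32_t match = xnor32(rs, rt);
    int all = 1;

    for (int bit = 0; bit < 32; ++bit)
        all &= (int)((match >> bit) & 1u);

    /* Cross-check the gate-level model against C's own comparison. */
    assert(all == (rs == rt));
    return all;
}
```

In hardware the 32 XNORs operate in parallel and feed one wide AND, so the loop here stands in for a single level of gates rather than 32 sequential steps.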

The modified pipeline:

Dynamic Branch Prediction

Branch history table:

  1. One-bit entries: branch taken/not taken.

  2. Indexed by low-order bits of branch instruction's address.

  3. Aliasing? Not really a problem. Show for a two-entry table.
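A minimal sketch of such a table, sized at two entries to match the aliasing example. The names are hypothetical, and dropping the two low address bits before indexing is an assumption based on word-aligned instructions.

```c
#include <assert.h>
#include <stdint.h>

#define BHT_ENTRIES 2u   /* two-entry table; must be a power of two */

/* One bit of history per entry: 1 = predict taken, 0 = predict not taken. */
uint8_t bht[BHT_ENTRIES];

/* Index with the low-order bits of the branch address. Assumption: the
 * instructions are word aligned, so the two always-zero low bits are
 * dropped first. */
unsigned bht_index(uint32_t branch_pc)
{
    assert((BHT_ENTRIES & (BHT_ENTRIES - 1u)) == 0u);
    return (branch_pc >> 2) & (BHT_ENTRIES - 1u);
}

/* Predict whatever this entry saw last time. */
int bht_predict(uint32_t branch_pc)
{
    return bht[bht_index(branch_pc)];
}

/* Record the actual outcome once the branch resolves. */
void bht_update(uint32_t branch_pc, int taken)
{
    bht[bht_index(branch_pc)] = taken ? 1 : 0;
}
```

With two entries, branches at addresses 0x40 and 0x48 share entry 0, so each overwrites the other's history. The entry is only a guess either way, so aliasing can cost accuracy but never correctness.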

Consider the following C fragment:

for (i = 0; i < lastI; ++i)
   for (j = 0; j < lastJ; ++j)
      sum += array[i][j];

It compiles to this structure:

With one-bit prediction, the inner beq is mis-predicted twice for each iteration of the outer loop. Why?

We can improve this with a two-bit saturating counter.
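The difference between the two schemes can be checked with a small simulation. This is a sketch under stated assumptions: the function name is hypothetical, the inner-loop branch is modeled as taken on every pass that stays in the loop and not taken once on exit (the mirror-image polarity gives the same steady-state counts), and the counter arbitrarily starts at 0.

```c
#include <assert.h>

/* Count mispredictions of one loop branch using a saturating counter of
 * `bits` bits (1 or 2). Assumed outcome stream: in each of `outer` trips
 * through the inner loop, the branch is taken `inner - 1` times and then
 * not taken once (loop exit). */
int count_mispredictions(int bits, int outer, int inner)
{
    assert(bits == 1 || bits == 2);
    assert(outer >= 0 && inner >= 1);

    int max = (1 << bits) - 1;      /* 1 for one bit, 3 for two bits */
    int threshold = (max + 1) / 2;  /* predict taken when counter >= threshold */
    int counter = 0;                /* arbitrary initial state */
    int misses = 0;

    for (int i = 0; i < outer; ++i) {
        for (int j = 0; j < inner; ++j) {
            int taken = (j < inner - 1);
            int predicted = (counter >= threshold);

            if (predicted != taken)
                ++misses;

            /* Saturating update: step toward the actual outcome. */
            if (taken && counter < max)
                ++counter;
            else if (!taken && counter > 0)
                --counter;
        }
    }
    return misses;
}
```

For a 10 by 10 loop nest, the one-bit scheme misses 20 times (twice per trip: once on exit and once on re-entry), while the two-bit counter misses 12 times (three during warm-up in the first trip, then once per trip on exit).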

Delayed Branch

Use of this technique is waning due to:

  1. Deeper pipelines.

  2. Superscalar execution.

The compiler can't cope: it gets harder to find useful instructions to fill a growing number of delay slots.

Exception Handling

Exception types:

  1. Arithmetic exceptions: overflow, divide by zero, etc.

  2. Hardware exception: parity error.

  3. System call.

  4. Page fault, access to non-existent memory.

Responsibilities:

  1. Hardware: record the PC of the offending instruction (actually PC + 4), and the cause.

  2. Operating system: recover. Terminate the program, enter kernel mode to service a system call, bring the page in, etc., as required.

Points:

  1. An interrupt is like a conditional branch.

  2. Branches to a fixed location: 0x40000040.

  3. Can occur anywhere in the pipeline.

  4. Subsequent instructions must be flushed.

  5. Multiple interrupts can occur during a cycle. Prioritizing?
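The points above can be pulled together in a small software model. This is only a sketch: the struct layout, the cause encoding, and the function name are hypothetical, and resolving simultaneous exceptions in favor of the earliest instruction in program order (the one furthest down the pipeline) is one common policy, assumed here. Only the handler address and the PC + 4 convention come from the notes.

```c
#include <assert.h>
#include <stdint.h>

#define EXC_VECTOR 0x40000040u   /* fixed handler address from the notes */

/* Hypothetical cause encoding, for illustration only. */
enum cause { CAUSE_NONE = 0, CAUSE_OVERFLOW, CAUSE_UNDEFINED_INSTR };

/* Later stages hold older instructions. */
enum stage { STAGE_IF, STAGE_ID, STAGE_EX, STAGE_MEM, STAGE_WB, STAGE_COUNT };

struct cpu {
    uint32_t pc;                /* next fetch address */
    uint32_t epc;               /* saved PC + 4 of the faulting instruction */
    enum cause cause;           /* why the exception was taken */
    int valid[STAGE_COUNT];     /* 0 = stage holds a flushed instruction */
};

/* Returns 1 if an exception was taken this cycle, 0 otherwise. */
int take_exception(struct cpu *c,
                   const uint32_t stage_pc[STAGE_COUNT],
                   const enum cause raised[STAGE_COUNT])
{
    assert(c != 0);

    /* Scan from the oldest instruction so it wins any tie. */
    for (int s = STAGE_COUNT - 1; s >= 0; --s) {
        if (c->valid[s] && raised[s] != CAUSE_NONE) {
            c->epc = stage_pc[s] + 4;
            c->cause = raised[s];

            /* Flush the faulting instruction and everything younger. */
            for (int younger = s; younger >= 0; --younger)
                c->valid[younger] = 0;

            c->pc = EXC_VECTOR;   /* behaves like a taken branch */
            return 1;
        }
    }
    return 0;
}
```

Instructions older than the faulting one stay valid and run to completion, which is what lets the operating system treat the saved PC as a clean restart point.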

Pipeline with exception handling added:

This won't handle memory exceptions. Why? (Consider a lw of a non-existent address.)

Imprecise interrupts:

  1. What are they? Why do we care?

  2. Example: Consider the following floating point code segment running on a CPU with two FP pipes:
          DIVF   F0, F2, F4
          ADDF   F10, F10, F8
          SUBF   F12, F12, F14
    
    This is an example of out-of-order completion on a superscalar machine.

    1. Are there any dependencies?

    2. What would the completion order be?

    3. Suppose the SUBF faults. What does the exception hardware do?

    4. Suppose the DIVF faults subsequent to the SUBF faulting. Now we're really in a mess. (The ADDF has completed and committed.)
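To make the completion order concrete, here is a small sketch that sorts the three instructions by finishing time. The latencies (10 cycles for the divide, 2 for the add and subtract), the one-issue-per-cycle model, and all names are hypothetical; the notes do not specify them.

```c
#include <assert.h>

struct fp_op {
    const char *name;
    int issue_cycle;   /* cycle the op enters its pipe, in program order */
    int latency;       /* hypothetical execution latency in cycles */
};

int completion_cycle(const struct fp_op *op)
{
    return op->issue_cycle + op->latency;
}

/* Fill order[] with indices into ops[], sorted by completion cycle.
 * Ties keep program order. Insertion sort is fine for a handful of ops. */
void completion_order(const struct fp_op ops[], int n, int order[])
{
    assert(n >= 0);

    for (int i = 0; i < n; ++i) {
        int k = i;
        while (k > 0 &&
               completion_cycle(&ops[order[k - 1]]) > completion_cycle(&ops[i])) {
            order[k] = order[k - 1];
            --k;
        }
        order[k] = i;
    }
}
```

With DIVF issued at cycle 0, ADDF at cycle 1, and SUBF at cycle 2 under these assumed latencies, the order comes out as ADDF, SUBF, DIVF. A fault raised by the DIVF therefore arrives after two younger instructions have already finished, which is the imprecise case described above.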



Thomas P. Kelliher
Tue Feb 8 15:16:47 EST 2000