Transactions: Atomic, Durable, and Distributed

Tom Kelliher, CS 318

May 1, 2002

Administrivia

Announcements

Final will only cover material since the midterm. It will be one hour. We will begin with a demonstration of the three projects.

Assignment

From Last Time

Transactions and isolation.

Outline

  1. Atomic transactions.

  2. Durable transactions.

  3. Distributed transactions, serializability and replication.

Coming Up

Course evaluation, review.

Atomic Transactions

  1. Atomicity: what is it?

  2. How do we achieve atomicity when:
    1. a transaction aborts?

    2. the system crashes while transactions are active?

  3. Maintain a write-ahead log. Records contained therein:
    1. Begin record for each transaction.

    2. Commit or abort record for each transaction.

    3. Before- (undo) and after-image (redo) records for each database item modified.

    4. Checkpoint records.

    Example:
    ...
    Begin T1
    Undo T1 X <val>
    Redo T1 X <val'>
    Begin T2
    Begin T3
    Checkpoint T0 T1 T2 T3
    Commit T1
    Undo T2 X <val>
    Redo T2 X <val'>
    Undo T2 Y <val>
    Redo T2 Y <val'>
    Undo T3 Z <val>
    Redo T3 Z <val'>
    Abort T2
    

    Using the WAL:

    1. when a transaction aborts. Scan backwards, applying the transaction's undo records.

    2. when a crash occurs.

      Locate and roll back active transactions. Use of checkpoint records.

  4. Write ahead: log records written before data updated. Why?
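
The two uses of the WAL described above can be sketched in Python. This is only an illustration under stated assumptions: the tuple record layout, the function names rollback and active_at_crash, and the concrete values standing in for <val> and <val'> are all invented here. It also assumes an Abort record is written only after the rollback has finished.

```python
# Illustrative log mirroring the example above; the values are made up.
log = [
    ("begin", "T1"),
    ("undo", "T1", "X", 1), ("redo", "T1", "X", 2),
    ("begin", "T2"),
    ("begin", "T3"),
    ("checkpoint", ["T0", "T1", "T2", "T3"]),
    ("commit", "T1"),
    ("undo", "T2", "X", 2), ("redo", "T2", "X", 3),
    ("undo", "T2", "Y", 10), ("redo", "T2", "Y", 11),
    ("undo", "T3", "Z", 20), ("redo", "T3", "Z", 21),
    ("abort", "T2"),
]

def rollback(log, db, txn):
    """Scan the log backwards, restoring txn's before-images."""
    for rec in reversed(log):
        if rec[0] == "undo" and rec[1] == txn:
            _, _, item, before = rec
            db[item] = before
        elif rec[0] == "begin" and rec[1] == txn:
            break  # nothing older belongs to this transaction

def active_at_crash(log):
    """Return transactions with no Commit or Abort record.

    The backward scan stops at the most recent checkpoint, whose record
    lists the transactions that were active when it was taken.
    """
    finished, active = set(), set()
    for rec in reversed(log):
        kind = rec[0]
        if kind in ("commit", "abort"):
            finished.add(rec[1])
        elif kind == "begin" and rec[1] not in finished:
            active.add(rec[1])
        elif kind == "checkpoint":
            active |= set(rec[1]) - finished
            break
    return active

# Assumed database state at the moment of the crash (T2 already rolled back).
db = {"X": 2, "Y": 10, "Z": 21}
active = active_at_crash(log)
for txn in active:
    rollback(log, db, txn)
```

With this log, the active set is T0 and T3. T0 has no undo records in the excerpt, so only T3's write to Z is undone. The checkpoint record is what lets the scan for active transactions stop early instead of reading the whole log.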

Durable Transactions

  1. Durability: what is it?

  2. How do we achieve durability when:
    1. machines crash?

    2. disks fatally crash?

  3. Strategies:
    1. Maintain data and log on different drives.

    2. Mirror data on different drives.

    3. Take database offline and dump.

      "Fuzzy" dumps of online databases.

  4. Recovery from lost data:
    1. Scan forwards using redo records, to re-establish data at time of crash.

    2. Scan backwards using undo records, to roll back uncompleted transactions.
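
A minimal sketch of the two recovery passes, in Python. Several things here are assumptions rather than part of the notes: the tuple record layout, the name recover, the sample values, and the premise that the dump was taken offline (so no transaction was active during it) and that the log holds every record written since. Under strict locking, a transaction that logged an Abort record was already rolled back before the crash, so the sketch skips its redo records instead of replaying and then undoing them.

```python
def recover(dump, log):
    """Rebuild the database from an offline dump plus the surviving log."""
    db = dict(dump)
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    aborted = {rec[1] for rec in log if rec[0] == "abort"}

    # Pass 1: scan forwards, reapplying after-images to reach the
    # state the data was in at the time of the crash.
    for rec in log:
        if rec[0] == "redo" and rec[1] not in aborted:
            _, _, item, after = rec
            db[item] = after

    # Pass 2: scan backwards, restoring before-images written by
    # transactions that neither committed nor aborted.
    for rec in reversed(log):
        if rec[0] == "undo" and rec[1] not in committed | aborted:
            _, _, item, before = rec
            db[item] = before
    return db

# Made-up data: T1 commits, T2 aborts, T3 is still active at the crash.
dump = {"X": 1, "Y": 10, "Z": 20}
log = [
    ("begin", "T1"),
    ("undo", "T1", "X", 1), ("redo", "T1", "X", 2),
    ("begin", "T2"),
    ("begin", "T3"),
    ("commit", "T1"),
    ("undo", "T2", "X", 2), ("redo", "T2", "X", 3),
    ("undo", "T3", "Z", 20), ("redo", "T3", "Z", 21),
    ("abort", "T2"),
]
recovered = recover(dump, log)
```

Here only T1's update to X survives: T2's write is skipped because it aborted, and T3's write to Z is redone in the first pass and undone in the second.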

Distributed Transactions

The other day I was configuring a Sun server. Transactions involved:

  1. Checking warehouse database for inventory.

  2. Checking logistics database for shipping times.

  3. Checking corporate database for pricing.

These are distributed multidatabase transactions.

  1. Consider a transaction of a transfer of $1,000 from one bank branch to another. Each branch locally maintains its own customer database.

  2. Two subtransactions:
    1. Withdraw $1,000 from branch A.

    2. Deposit $1,000 to branch B.

    What can go wrong?

  3. How do we ensure atomicity?
    1. A transaction manager manages the distributed transaction. It is the coordinator.

      Responsible for atomicity.

    2. The individual DBMSs are known as cohorts.

    3. Two-phase commit protocol:
      1. Coordinator sends Prepare message to cohorts.

      2. Each cohort responds with a vote: Commit or Abort.

        Silence is interpreted as abort. How long to wait?

        Once a cohort votes to commit, it cannot reverse the decision.

      3. Depending upon the outcome of the vote, the coordinator sends a Commit or Abort message to each cohort.

      4. Upon completion, each cohort responds with a Done message.

  4. Strict two-phase locking at the cohorts and two-phase commit at the transaction manager ensure global serializability!
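
The message flow of the two-phase commit protocol can be sketched as a single-process Python simulation. The Cohort class, its flags, and the function two_phase_commit are hypothetical names chosen for illustration. There is no real networking, logging, or timeout handling here: a cohort that fails to answer is simply modeled as returning None.

```python
from enum import Enum

class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"

class Cohort:
    """Hypothetical stand-in for one DBMS taking part in the transaction."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds
        self.state = "active"

    def prepare(self):
        if not self.responds:
            return None  # models a timeout: no vote arrived
        if self.can_commit:
            # After voting Commit the cohort may not reverse its decision.
            self.state = "prepared"
            return Vote.COMMIT
        self.state = "aborted"
        return Vote.ABORT

    def finish(self, decision):
        self.state = "committed" if decision is Vote.COMMIT else "aborted"
        return "done"

def two_phase_commit(cohorts):
    # Phase 1: send Prepare and collect votes. Silence counts as Abort.
    votes = [c.prepare() for c in cohorts]
    unanimous = all(v is Vote.COMMIT for v in votes)
    decision = Vote.COMMIT if unanimous else Vote.ABORT
    # Phase 2: send the decision to every cohort and collect Done replies.
    acks = [c.finish(decision) for c in cohorts]
    assert all(a == "done" for a in acks)
    return decision

# The bank transfer: both branches must agree before either commits.
branch_a = Cohort("A")
branch_b = Cohort("B")
outcome = two_phase_commit([branch_a, branch_b])

# Same transfer, but branch B never answers the Prepare message.
silent_b = Cohort("B", responds=False)
waiting_a = Cohort("A")
outcome_silent = two_phase_commit([waiting_a, silent_b])
```

In the first run both cohorts end up committed. In the second, the missing vote forces an abort, so the withdrawal at branch A is not applied without the matching deposit at branch B.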

Replication

  1. Replicate distributed databases to:
    1. improve response time.

    2. improve availability.

  2. Synchronous update systems: all replicas are updated before any DB modifications commit.

    Lower throughput when modification rate high. Global serializability ensured.

  3. Asynchronous update systems: one replica is updated before any modifications commit. Other replicas get updated later.
    1. Closest replica updated on commit (or other metric).

      Conflict resolution required.

    2. Primary copy replica updated on commit.

      No conflict resolution necessary.

    Greater throughput. Can produce non-serializable schedules.
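
The difference between the two update styles can be illustrated with a toy Python model of one replicated item. The class ReplicatedItem and its method names are invented for this sketch, replica 0 is assumed to be the primary copy, and propagation is triggered by hand instead of by a background process.

```python
class ReplicatedItem:
    """Toy model of one data item stored at several replicas."""

    def __init__(self, n_replicas, value=0):
        self.replicas = [value] * n_replicas
        self.pending = []  # updates not yet propagated (asynchronous only)

    def write_sync(self, value):
        # Synchronous: every replica is updated before the commit returns.
        for i in range(len(self.replicas)):
            self.replicas[i] = value

    def write_async_primary(self, value):
        # Asynchronous, primary copy: only replica 0 is updated at commit.
        self.replicas[0] = value
        self.pending.append(value)

    def propagate(self):
        # Later, the queued updates reach the other replicas in commit order.
        for value in self.pending:
            for i in range(1, len(self.replicas)):
                self.replicas[i] = value
        self.pending.clear()

item = ReplicatedItem(3)
item.write_async_primary(5)
before_propagation = list(item.replicas)
item.propagate()
after_propagation = list(item.replicas)
```

Between the commit and the call to propagate, a reader at replica 1 or 2 still sees the old value. That window is why asynchronous systems can produce non-serializable schedules, while write_sync avoids it at the cost of touching every replica on each commit.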



Thomas P. Kelliher
Tue Apr 30 18:13:15 EDT 2002