Transactions: Atomic, Durable, and Distributed
Tom Kelliher, CS 318
May 1, 2002
Final will only cover material since the midterm. It will be one hour. We
will begin with a demonstration of the three projects.
Transactions and isolation.
- Atomic transactions.
- Durable transactions.
- Distributed transactions, serializability and replication.
Course evaluation, review.
- Atomicity: what is it?
- How do we achieve atomicity when:
- a transaction aborts?
- the system crashes while transactions are active?
- Maintain a write-ahead log. Records contained therein:
- Begin record for each transaction.
- Commit or abort record for each transaction.
- Before- (undo) and after-image (redo) records for each database
item modified.
- Checkpoint records.
Example:
...
Begin T1
Undo T1 X <val>
Redo T1 X <val'>
Begin T2
Begin T3
Checkpoint T0 T1 T2 T3
Commit T1
Undo T2 X <val>
Redo T2 X <val'>
Undo T2 Y <val>
Redo T2 Y <val'>
Undo T3 Z <val>
Redo T3 Z <val'>
Abort T2
Using the WAL:
- When a transaction aborts: scan the log backwards, restoring the
before-images of the items it modified.
- When a crash occurs: locate and roll back the active transactions,
using the checkpoint records to find them.
- Write ahead: log records are written before the data is
updated. Why?
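The two uses of the log above can be sketched in Python. This is a minimal illustration, not a real DBMS recovery routine: the tuple layout of the log records, the item values, and the T0 records standing in for the elided "..." are all assumptions, and the database is a plain dictionary that is assumed to reflect every logged update at the time of the crash.

```python
# Log records as tuples; this layout and all values are illustrative assumptions.
LOG = [
    ("begin", "T0"),                       # stands in for the elided "..." prefix
    ("undo", "T0", "W", 5), ("redo", "T0", "W", 6),
    ("begin", "T1"),
    ("undo", "T1", "X", 10), ("redo", "T1", "X", 11),
    ("begin", "T2"),
    ("begin", "T3"),
    ("checkpoint", ("T0", "T1", "T2", "T3")),
    ("commit", "T1"),
    ("undo", "T2", "X", 11), ("redo", "T2", "X", 12),
    ("undo", "T2", "Y", 20), ("redo", "T2", "Y", 21),
    ("undo", "T3", "Z", 30), ("redo", "T3", "Z", 31),
    ("abort", "T2"),
]


def rollback(log, db, txns):
    """Scan the log backwards, restoring before-images written by txns."""
    pending = set(txns)
    for rec in reversed(log):
        if not pending:
            break  # every Begin record we care about has been passed
        if rec[0] == "undo" and rec[1] in pending:
            db[rec[2]] = rec[3]
        elif rec[0] == "begin" and rec[1] in pending:
            pending.discard(rec[1])


def active_at_crash(log):
    """Transactions with no commit/abort record, found via the last checkpoint."""
    active, start = set(), 0
    for i in range(len(log) - 1, -1, -1):
        if log[i][0] == "checkpoint":
            active, start = set(log[i][1]), i + 1
            break
    for rec in log[start:]:
        if rec[0] == "begin":
            active.add(rec[1])
        elif rec[0] in ("commit", "abort"):
            active.discard(rec[1])
    return active


# T2 aborts: the database state just before the abort record is written.
db = {"W": 6, "X": 12, "Y": 21, "Z": 31}
rollback(LOG[:-1], db, {"T2"})
after_abort = dict(db)

# The system then crashes: roll back whatever was still active.
losers = active_at_crash(LOG)
rollback(LOG, db, losers)
```

Scanning backwards matters when a transaction writes the same item more than once: the oldest before-image is applied last, so the item ends at its value from before the transaction began.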
- Durability: what is it?
- How do we achieve durability when:
- machines crash?
- disks fatally crash?
- Strategies:
- Maintain data and log on different drives.
- Mirror data on different drives.
- Take database offline and dump.
"Fuzzy" dumps of online databases.
- Recovery from lost data:
- Scan forwards using redo records, to re-establish data at time of
crash.
- Scan backwards using undo records, to roll back uncompleted
transactions.
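A sketch of the two-pass recovery just described, starting from a dump. The record layout, the values, and the choice to replay an aborted transaction's rollback when its abort record is reached are assumptions made for illustration; the dump is assumed to predate every record in the log shown.

```python
def recover_from_dump(dump, log):
    """Rebuild the database from a dump plus the log (illustrative only)."""
    db = dict(dump)
    finished = set()

    # Forward pass: re-establish the data as of the crash.
    for i, rec in enumerate(log):
        if rec[0] == "redo":
            db[rec[2]] = rec[3]
        elif rec[0] == "abort":
            # Replay the rollback that happened when this transaction aborted.
            for old in reversed(log[:i]):
                if old[0] == "undo" and old[1] == rec[1]:
                    db[old[2]] = old[3]
            finished.add(rec[1])
        elif rec[0] == "commit":
            finished.add(rec[1])

    # Backward pass: roll back transactions that never completed.
    for rec in reversed(log):
        if rec[0] == "undo" and rec[1] not in finished:
            db[rec[2]] = rec[3]
    return db


log = [
    ("begin", "T1"),
    ("undo", "T1", "X", 10), ("redo", "T1", "X", 11),
    ("commit", "T1"),
    ("begin", "T2"),
    ("undo", "T2", "X", 11), ("redo", "T2", "X", 12),
    ("abort", "T2"),
    ("begin", "T3"),
    ("undo", "T3", "Z", 30), ("redo", "T3", "Z", 31),
]
dump = {"X": 10, "Z": 30}
recovered = recover_from_dump(dump, log)
```

Here T1's committed update to X survives, T2's aborted update does not, and T3, which was still running at the crash, is rolled back.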
The other day I was configuring a Sun server. Transactions involved:
- Checking warehouse database for inventory.
- Checking logistics database for shipping times.
- Checking corporate database for pricing.
These are distributed multidatabase transactions.
- Consider a transaction that transfers $1,000 from one bank branch
to another. Each branch locally maintains its own customer database.
- Two subtransactions:
- Withdraw $1,000 from branch A.
- Deposit $1,000 to branch B.
What can go wrong?
- How do we ensure atomicity?
- A transaction manager manages the distributed
transaction. It is the coordinator.
Responsible for atomicity.
- The individual DBMSs are known as cohorts.
- Two-phase commit protocol:
- Coordinator sends Prepare message to cohorts.
- Each cohort responds with a vote: Commit or Abort.
Silence is interpreted as abort. How long to wait?
Once a cohort votes to commit, it cannot reverse the decision.
- Depending upon the outcome of the vote, the coordinator sends a
Commit or Abort message to each cohort.
- Upon completion, each cohort responds with a Done
message.
- Strict two-phase locking at the cohorts and two-phase commit at the
transaction manager ensure global serializability!
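The message flow of two-phase commit can be sketched with the bank transfer example. This is a single-process simulation under stated assumptions: the Cohort class, its single balance, and the use of a None return value to model a cohort that stays silent are all hypothetical stand-ins for real DBMSs, network messages, and timeouts.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Cohort:
    """A branch database, reduced to a single account balance (hypothetical)."""

    def __init__(self, name, balance, reachable=True):
        self.name = name
        self.balance = balance
        self.reachable = reachable
        self.pending = None

    def prepare(self, delta):
        """Phase 1: vote on the subtransaction. None models silence."""
        if not self.reachable:
            return None
        if self.balance + delta < 0:
            return Vote.ABORT
        self.pending = delta  # a Commit vote is a promise; it cannot be reversed
        return Vote.COMMIT

    def finish(self, decision):
        """Phase 2: apply or discard the pending change, then report Done."""
        if decision is Vote.COMMIT and self.pending is not None:
            self.balance += self.pending
        self.pending = None
        return "Done"


def two_phase_commit(work):
    """Coordinator: work is a list of (cohort, balance change) pairs."""
    votes = [cohort.prepare(delta) for cohort, delta in work]
    # Anything other than a unanimous Commit vote, including silence, aborts.
    decision = Vote.COMMIT if all(v is Vote.COMMIT for v in votes) else Vote.ABORT
    for cohort, _ in work:
        cohort.finish(decision)
    return decision


a, b = Cohort("A", 1500), Cohort("B", 200)
first = two_phase_commit([(a, -1000), (b, +1000)])   # both vote Commit
second = two_phase_commit([(a, -1000), (b, +1000)])  # A lacks funds, votes Abort
b.reachable = False
third = two_phase_commit([(a, -100), (b, +100)])     # B is silent
```

Only the first transfer changes either balance; in the other two runs neither branch is updated, which is the atomicity the coordinator is responsible for.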
- Replicate distributed databases to:
- improve response time.
- improve availability.
- Synchronous update systems: all replicas are updated before any
DB modifications commit.
Lower throughput when modification rate high. Global serializability
ensured.
- Asynchronous update systems: one replica is updated before any
modifications commit. Other replicas get updated later.
- Closest replica updated on commit (or other metric).
Conflict resolution required.
- Primary copy replica updated on commit.
No conflict resolution necessary.
Greater throughput. Can produce non-serializable schedules.
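The difference between the two update styles can be sketched with in-memory replicas. The class and function names here are hypothetical, and the asynchronous case models only the primary-copy variant, with propagation triggered by hand rather than by a background process.

```python
class Replica:
    """One copy of the database, reduced to a dictionary (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


def synchronous_write(replicas, key, value):
    """Update every replica before the modification is considered committed."""
    for replica in replicas:
        replica.data[key] = value


class PrimaryCopySystem:
    """Asynchronous updates: commit at the primary, propagate later."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries
        self.backlog = []  # committed writes not yet sent to the secondaries

    def write(self, key, value):
        self.primary.data[key] = value
        self.backlog.append((key, value))

    def propagate(self):
        for key, value in self.backlog:
            for replica in self.secondaries:
                replica.data[key] = value
        self.backlog.clear()


p, s = Replica("primary"), Replica("secondary")
synchronous_write([p, s], "price", 100)   # both copies agree at commit time

system = PrimaryCopySystem(p, [s])
system.write("price", 120)                # committed at the primary only
stale = s.data["price"]                   # the secondary still holds the old value
system.propagate()
fresh = s.data["price"]
```

Because all writes go through one primary, no two replicas ever accept conflicting updates, but a reader at a secondary can see old data until propagation catches up.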
Thomas P. Kelliher
Tue Apr 30 18:13:15 EDT 2002