Access Paths for Joins
Tom Kelliher, CS 318
Apr. 24, 2002
Read 15.1.
Group assignment due beginning of class 5/1: demonstrate that each of your
project's tables are in BCNF or 3NF. For any tables that are not in these
normal forms, provide a decomposition that is in dependency-preserving BCNF
(preferably) or 3NF. Post your demonstration to your group web site.
Access paths: sorting, selection.
- Finish up select processing from last time.
- Access paths for joins.
Transactions --- Isolation.
Three of many paths:
- Nested loops join.
- Block-nested loops join.
- Hash join.
Preliminaries:
- Two tables:
- Table 1 has pages and rows.
- Table 2 has pages and rows.
- We are computing where A is in Table 1
and B is in Table 2.
- When counting page accesses, we don't count writes.
The simplest method:
for each tuple t1 in Table 1
for each tuple t2 in Table 2
if t1.A = t2.B
output <t1, t2>
How many pages accesses?
Does the ordering of the tables make a difference?
- Optimization of nested loops join.
- Idea: Rather than scan inner table once per tuple of outer table,
scan once per page of outer table.
Page accesses count?
- Further optimization: If we have M buffers, read M - 2 pages of
outer table then scan inner table.
(1 page each needed for inner table input buffer and output buffer.)
Page accesses count?
- Only useful for equijoins.
- Hash Table 1 on A and Table 2 on B.
- After hashing, tuples which can possibly join are in the same bucket.
- Re-read each bucket, separate by table, and join.
- Assuming each bucket fits into memory, how many page accesses?
Thomas P. Kelliher
Wed Apr 24 08:37:37 EDT 2002
Tom Kelliher