Architectural Support for Operating Systems

Tom Kelliher, CS42

Sept. 6, 1996

Simple System Communication

Block Diagram of a Computer System:

How does the CPU manage all these resources?

What are their relative speeds?

Example

Consider reading a sector from a disk:

  1. Physical disk interface implemented in kernel as struct:
    typedef struct
    {
       short busy;
       short status;
       short command;
       ...
    } disk;
    
  2. Scheme:
    1. Verify that disk is idle.
    2. Issue a new command (read, write, etc.).
    3. Wait for disk to become idle again.
    4. Check status.
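
A minimal sketch of how the kernel might get at such a struct, assuming the controller's registers are memory-mapped at a known physical address (DISK_BASE and CMD_READ are assumptions for illustration, not a real controller):

#define DISK_BASE 0xFFFF0000               /* assumed controller address */
#define CMD_READ  1                        /* assumed command code       */

void issue_read(void)
{
   volatile disk *d = (volatile disk *) DISK_BASE;

   while (d->busy)                         /* 1. verify that the disk is idle */
      ;
   d->command = CMD_READ;                  /* 2. issue the command            */
   /* Steps 3 and 4 (wait for idle, check status) follow as in the
      ReadBlock() routine of the next section. */
}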

Busy Waiting

Kernel-level routine for reading a disk block:

int ReadBlock (volatile disk *d, char *buffer, int blockNum)
{
   while (d->busy)        /* Verify idle */
      ;

   /* Send read command to disk (load blockNum, the buffer address,
      and the read command into the controller's registers). */

   while (d->busy)        /* Busy wait on read */
      ;

   return d->status;
}

Used at a low level by scanf().

OS operates on user's behalf --- user doesn't directly manipulate hardware.

How efficient is this?
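
For a rough sense of the cost (numbers assumed for illustration): a disk access takes on the order of 10 ms, while a CPU executing 100 million instructions per second finishes an instruction every 10 ns. Spinning in the while loop for a single access therefore burns about

   10 ms / 10 ns = 1,000,000

instructions' worth of CPU time that could have gone to other work.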

Polling

Kernel-level routine for reading a disk block:

int ReadBlock (volatile disk *d, char *buffer, int blockNum)
{
   while (d->busy)        /* Verify idle */
   {
      /* Go do something else and check back. */
   }

   /* Send read command to disk. */

   while (d->busy)        /* Poll on the read */
   {
      /* Go do something else and check back. */
   }

   return d->status;
}

How safe is this?

Interrupts

Introduce parallelism into the system by decoupling the CPU from the I/O devices.

Scheme:

  1. Start device.
  2. Go do something useful.
  3. Device signals completion via interrupt.
  4. Kernel handles interrupt.

Interrupt Features:

  1. The current process is temporarily suspended --- in effect, a function call at an arbitrary point.
  2. Several priority levels.
  3. Most levels can be masked.
  4. Several devices can share the same interrupt line. How do we identify the sender?
  5. The kernel contains interrupt handlers, which service interrupts. How do they identify the device if several exist?
  6. After the handler finishes, unmask interrupts and return to the previous process.
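
One common answer to the identification questions in items 4 and 5 is a table of handlers indexed by an interrupt vector that the device or the interrupt controller supplies. A minimal sketch, assuming a vectored scheme (NVECTORS and the dispatch() entry point are illustrative, not a real interface):

#define NVECTORS 16

typedef void (*handler_t)(void);

static handler_t handlers[NVECTORS];     /* one handler per vector */

void register_handler(int vector, handler_t h)
{
   handlers[vector] = h;                 /* kernel installs its handlers here */
}

/* Entered from the hardware interrupt mechanism; the device (or the
   interrupt controller) supplies the vector identifying the sender. */
void dispatch(int vector)
{
   /* Processor state has already been saved; this level is masked. */
   if (vector >= 0 && vector < NVECTORS && handlers[vector] != 0)
      handlers[vector]();                /* service the device */
   /* Unmask interrupts and return to the interrupted process. */
}

When several devices genuinely share one line, the installed handler typically polls each device's status register to find the sender.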

A Longer Example

CPU instruction cycle:

do
{
   fetch instruction;
   increment pc;
   decode instruction;
   execute instruction;
   if (interrupt)
   {
      push pc;
      pc = loc(handler);
   }
} while (forever);

Global declarations:

#define FALSE 0
#define TRUE 1

Kernel area:

int busy;      /* local flag of channel (I/O device) status */

User data area:

#define MAXBUF 5
buffer buf[MAXBUF];
int nextget = 0;  /* index of next full buffer */
int nextio = 0;   /* index of next buffer to fill */
int free = MAXBUF;   /* number of free buffers */

System calls:

int getbuf()
{
   int current;

   while (free == MAXBUF)
   {
      /* No buffer has been filled yet: call the CPU scheduler
         and check back later. */
   }
   current = nextget;
   nextget = (nextget + 1) % MAXBUF;
   return (current);
}

relbuf()
{
   free++;
   if (!busy)
   {
      busy = TRUE;
      startread(buf[nextio]);
   }
}

user code:

USER()
{
   int cur;

   busy = TRUE;
   startread(buf[nextio]);
   do
   {
      cur = getbuf();
      /* Use buf[cur]. */
      relbuf();
      /* Compute. */
   } while (!done);
}

Interrupt handler, called when an interrupt is received from the channel (an independent process):

int_hndl()
{
   /* Save processor state. */
   busy = FALSE;
   nextio = (nextio + 1) % MAXBUF;
   free--;
   if (free > 0)
   {
      busy = TRUE;
      startread(buf[nextio]);
   }
   /* Restore processor state. */
}
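
A short trace of the example, assuming MAXBUF = 5 and a channel that is slower than the user:

  1. Initially free = 5 and nextget = nextio = 0; USER() sets busy and starts a read into buf[0].
  2. getbuf() finds free == MAXBUF (no buffer filled yet) and waits, yielding the CPU.
  3. The read completes; int_hndl() sets nextio = 1, free = 4, and starts filling buf[1].
  4. getbuf() now returns 0; the user works on buf[0] while the channel fills buf[1] in parallel.
  5. relbuf() makes buf[0] free again (free = 5); the channel is still busy, so nothing more happens until its next interrupt.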

Interrupt driven OS called:

Direct Memory Access

DMA.

A block I/O device, interrupting once for every word transferred, could swamp the CPU.

Solution: let the device access memory itself.

Here's what's going on with DMA:

  1. CPU reserves an area of memory as a buffer for the I/O (assume a read is performed).

  2. CPU loads the base (bottom) address of the buffer into the I/O device.

  3. CPU loads the length of the transfer into the I/O device (assume that it's the same as the buffer size).

  4. Concurrently, the CPU continues executing other work while the device transfers the data directly into the buffer.

    Memory arbitration problems here: the CPU and the device compete for access to memory.

  5. I/O device interrupts CPU upon completion.

  6. CPU receives interrupt, checks status, schedules formerly blocked process.
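
A minimal sketch of programming such a transfer, following the steps above. The register layout, names, and buffer size are assumptions for illustration, not a real controller:

typedef struct
{
   volatile char  *base;     /* base address of the buffer               */
   volatile int    length;   /* number of bytes to transfer              */
   volatile short  command;  /* start the operation                      */
   volatile short  status;   /* completion/error code, set by the device */
} dma_dev;

#define DMA_READ 1

char iobuf[4096];                     /* 1. buffer reserved by the CPU */

void start_dma_read(volatile dma_dev *d)
{
   d->base    = iobuf;                /* 2. load the base address      */
   d->length  = sizeof(iobuf);        /* 3. load the transfer length   */
   d->command = DMA_READ;             /* start the device              */
   /* 4. The CPU now runs other processes while the device moves data
      into iobuf on its own.  Steps 5 and 6: the device interrupts on
      completion, and the handler checks d->status and unblocks the
      waiting process. */
}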

Caches

  1. General principle.
  2. Hit rates over 80%.
  3. CPU caches.
  4. Memory as a cache for virtual memory.
  5. Disk buffers.
  6. Consistency problems with multiple copies of data (OS shutdown).
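
A quick illustration of why the hit rate matters (times assumed for illustration): with a 90% hit rate, a 10 ns cache, and 100 ns memory, the effective access time is

   0.90 * 10 ns + 0.10 * (10 ns + 100 ns) = 20 ns,

a factor of five better than going to memory on every reference.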

Storage Hierarchy

  1. Registers.
  2. Cache.
  3. Memory --- resource allocation:
  4. Disk --- resource allocation:
  5. Tape --- sequential, not DASD.
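
Rough orders of magnitude (assumed for illustration): registers and cache respond in nanoseconds, main memory in tens to hundreds of nanoseconds, disk in milliseconds, and tape in seconds to minutes. Each level trades speed for capacity and cost, which is why the OS manages the lower levels as allocated resources.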


Thomas P. Kelliher
Thu Sep 5 16:11:17 EDT 1996