SYNAPSE Web Server Project

CS 318

100 points, due Apr. 26, 2000

SYNAPSE is the Simple Synchronous Capable Web Server. It is capable of serving HTML, GIF, and JPEG files and executing CGI programs called via the HTTP GET operation. It is an iterative server, hence it is synchronous. Its implementation requires only a few hundred lines of Perl, so it is quite a simple server.

Design and Grading

As we discussed earlier in the semester, I expect that no body of code will be longer than a single page, no magic numbers will be embedded in the code (use symbolic constants, instead), and that your code will be readable. The programs server.pl and client.pl are excellent examples of what I expect from you. I recommend top-down design followed by bottom-up implementation and testing. During the early stages of implementation, you may wish to use a telnet client in place of a browser.

Additionally, you are to turn in a testing document. Begin by designing a set of test cases. In your testing document, describe each test case and the actual result of applying it to your server. You are to turn in a hard copy of your server code and the testing document. Also, e-mail a copy of your server code to kelliher AT DOMAIN bluebird.goucher.edu. Everything is due at the beginning of class on the 26th. For each day your project is late, 5% will be deducted. No projects will be accepted after the 30th. (The weekend counts as one day.)

Your server should work with Netscape and Internet Explorer.

Points will be distributed as follows: effort, 30 points; design, 20 points; documentation, 20 points; runs, 10 points; testing, 20 points.

SYNAPSE Pseudocode

SYNAPSE performs some initialization and enters a server loop:

   determine what port to run on;
   create and initialize a socket, using the port determined earlier;
   print the port number used;
   do forever
      wait for and accept a client connection;
      parse client request;
      print information regarding the client and the request;
      if the request is for CGI execution
         open a filehandle whose input is the piped output of the CGI
            program;
      else if the request is for an HTML page or a GIF or JPEG image
         open a filehandle whose input is the requested file;
      send client a reply header;
      while not at end-of-file on the filehandle
         read the next packet's worth of data from the handle;
         send the next packet's worth of data to the client;
      close filehandle;
      close client socket;
The structure of the server is similar to that of server.pl, discussed in class. For SYNAPSE, however, the server executes a recv to get the client request before sending any data to the client. The person starting the server may specify a port number on the command line. Otherwise, the port number should default to 8080. After accepting a client connection, the server should print the hostname, IP address, and port of the client. If the hostname cannot be obtained, print "Could not resolve hostname" in place of the hostname. The client's complete request should also be printed. The remainder of the pseudocode will be covered in the next few sections.
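
The port selection and client-reporting steps might be sketched as follows. The helper names choose_port and describe_client and the constant DEFAULT_PORT are illustrative assumptions, not names taken from server.pl; the address functions come from Perl's core Socket module.

```perl
use strict;
use warnings;
use Socket;

use constant DEFAULT_PORT => 8080;

# Return the port given on the command line, or the default if none
# (or something that is not a number) was given.
sub choose_port
{
   my @args = @_;
   if (@args && $args[0] =~ /^\d+$/)
   {
      return $args[0];
   }
   return DEFAULT_PORT;
}

# Build the line to print for a client, given the packed address
# returned by accept().
sub describe_client
{
   my ($paddr) = @_;
   my ($port, $iaddr) = sockaddr_in($paddr);
   my $host = gethostbyaddr($iaddr, AF_INET);
   $host = "Could not resolve hostname" unless defined $host;
   return "$host [" . inet_ntoa($iaddr) . "], port $port";
}
```

A server would call choose_port(@ARGV) once at startup and describe_client on the value returned by each accept.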

HTTP Client Requests

The structure of a client request is:

   Operation Path HTTP-Protocol-Version
   0 or more request attributes, each on a separate line
   Empty line.
If the operation is anything other than a GET, return a page with a 501 Not Implemented status. Keep the path and ignore the protocol version. Also ignore any of the request attributes. The empty line at the end of the request implies that the last four characters of a request are \r\n\r\n. It is possible that a request could be fragmented into several packets. Thus, the code you use to read a client request should look something like:
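   recv(CLIENT, $buffer, $BUFFER_LEN, 0);

   # Keep reading until the empty line that ends the request arrives.
   while (substr($buffer, -4, 4) ne "\r\n\r\n")
   {
       recv(CLIENT, $line, $BUFFER_LEN, 0);
       last if length($line) == 0;   # Client closed the connection.
       $buffer .= $line;
   }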
By the way, under some conditions Internet Explorer sends an empty request. If the size of the buffer after the first receive is zero bytes, close the connection and begin waiting for the next client request.
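
Once the complete request is in the buffer, only its first line matters. A minimal sketch of pulling out the operation and path, assuming a hypothetical helper named parse_request:

```perl
use strict;
use warnings;

# Split a complete request into its operation and path.  The protocol
# version and any request attributes are ignored.  Returns an empty
# list for an empty request.
sub parse_request
{
   my ($request) = @_;
   my ($first_line) = split(/\r\n/, $request);
   return () unless defined $first_line;
   my ($operation, $path) = split(' ', $first_line);
   return ($operation, $path);
}

my ($op, $path) = parse_request("GET /two.html HTTP/1.0\r\nHost: phoenix\r\n\r\n");
print "$op $path\n";    # prints "GET /two.html"
```

The caller would then compare the operation against GET and send the 501 page for anything else.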

Interpreting the Path

All request paths should be relative to a document root, the root of the web site. For example, I might choose /home1/kelliher/web as my document root. Thus, if the request path is /two.html, the path of the file to serve is /home1/kelliher/web/two.html. If the path is /, append index.html to it. The content-type can be derived from the last four characters of the path's extension (path's suffix) as follows:
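
A minimal sketch of the path translation and content-type lookup. The %CONTENT_TYPE table below is an assumption built from the three file types SYNAPSE serves and their standard MIME types, and the names resolve_path, content_type, and SUFFIX_LEN are hypothetical; replace the table with the assignment's own suffix list.

```perl
use strict;
use warnings;

use constant SUFFIX_LEN => 4;

# Assumed suffix table (not the assignment's own list).
my %CONTENT_TYPE = (
   'html' => 'text/html',
   '.gif' => 'image/gif',
   '.jpg' => 'image/jpeg',
   'jpeg' => 'image/jpeg',
);

# Map a request path to a file under the document root.
sub resolve_path
{
   my ($doc_root, $path) = @_;
   $path .= 'index.html' if $path eq '/';
   return $doc_root . $path;
}

# Look up the content-type from the last four characters of the path.
# Returns undef for a suffix that is not in the table.
sub content_type
{
   my ($path) = @_;
   return if length($path) < SUFFIX_LEN;
   my $suffix = substr($path, length($path) - SUFFIX_LEN);
   return $CONTENT_TYPE{$suffix};
}
```

With a document root of /home1/kelliher/web, resolve_path turns / into /home1/kelliher/web/index.html, and content_type reports text/html for it.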

Server Reply

The structure of a server reply header is:

   HTTP-Protocol-Version Status-Code Status-Message
   0 or more reply attributes, each on a separate line
   Empty line.
You should always send HTTP/1.0 for the protocol version. Normally, the status code and message will be 200 OK. In addition to the content-type attribute, the server should send the attribute Connection: close. The code you use to send the reply header will be similar to:
   # HTTP header lines end in \r\n; an empty line ends the header.
   send(CLIENT, "HTTP/1.0 200 OK\r\n"
        . "Connection: close\r\n"
        . "Content-Type: text/html\r\n\r\n",  0);
If the requested path cannot be opened, the server should send a page with a status of 404 Not Found. (For an example of what this page and the 501 page should look like, try viewing http://phoenix.goucher.edu/~kelliher/foo.) Assuming that the page could be opened, the reply header is followed by the reply body, that is, the contents of a file or the output of a CGI program. Refer to the Perl booklet I gave you for using open to open a pipe from a command, which is what the CGI program is. Once the file/pipe is open, the code to read and send a packet to the client will be similar to:
   while (read(FILE, $buffer, $BUFFER_LEN))
   {
      if (!send(CLIENT, $buffer, 0))
      {
         last;   # Deal with Netscape closing the socket early.
      }
   }
You should never send more than 1,400 bytes of data in a single packet.
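
The header and open steps in this section could be factored into two small helpers, sketched below. The names reply_header and open_source, the %STATUS_MESSAGE table, and the $is_cgi flag are assumptions made for illustration; how the server decides that a path names a CGI program is left to the caller. Header lines end in \r\n, matching the request format described earlier.

```perl
use strict;
use warnings;

my %STATUS_MESSAGE = (
   200 => 'OK',
   404 => 'Not Found',
   501 => 'Not Implemented',
);

# Build a reply header for the given status code and content-type.
sub reply_header
{
   my ($code, $content_type) = @_;
   return "HTTP/1.0 $code $STATUS_MESSAGE{$code}\r\n"
        . "Connection: close\r\n"
        . "Content-Type: $content_type\r\n\r\n";
}

# Open the source of the reply body: a pipe from a CGI program, or a
# plain file.  Returns the filehandle, or undef if the open fails.
sub open_source
{
   my ($path, $is_cgi) = @_;
   my $mode = $is_cgi ? '-|' : '<';
   open(my $fh, $mode, $path) or return;
   binmode($fh);    # GIF and JPEG data must not be translated.
   return $fh;
}
```

If open_source returns undef, the server would send reply_header(404, 'text/html') followed by the error page. Keeping $BUFFER_LEN at or below 1,400 in the read loop keeps each send within the packet limit.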

Odds and Ends

You'll need the following to close your server down, by typing the interrupt character (Ctrl-c), and to handle Netscape's early socket close scenario:

   sub TERM_handler
   {
      print "\nTerminating.\n";
      # Add code here to close any file/socket handles that may be open.
      exit 0;
   }

   $SIG{'TERM'} = 'TERM_handler';
   $SIG{'INT'} = 'TERM_handler';

   sub PIPE_handler
   {
      print "Remote side closed the connection early.\n";
      print "=====\n";
   }

   $SIG{'PIPE'} = 'PIPE_handler';

A Sample Web Site

I've left a small web site in ~kelliher/pub/cs318/docs.tar. Copy the file to your document root and use the following command to unpack it:

   tar -xvf docs.tar
You may then remove docs.tar. Start your server, then enter the following URL into a browser: http://phoenix:8080/ (if necessary, substitute another port number). You should see index.html display in the browser. (You did remember to translate the path / to /index.html, didn't you?) Test all the links on the page. For comparison purposes, the same web site is available at http://phoenix.goucher.edu/~kelliher/s2000/cs318/docs/.

Thomas P. Kelliher
Mon Apr 10 09:15:42 EDT 2000