CS 119 Lab 7: Trees

Objectives

  • Use and manipulate trees
  • Perform the following tasks in the order given.

    1. Download the Lab7 project and import it into Eclipse as a Haskell project.
       
    2. Test the example functions from the notes and make sure you understand how they work.  You can always use the substitution model to trace through some computations.
       
    3. Data compression illustrates one use of binary trees.  Ordinarily, each character is represented by a fixed-length 8-bit code.  We can reduce the total number of bits required to encode a text by replacing this fixed-length code with a coding scheme based on how frequently each character occurs in the text.  Characters that appear frequently should have short codes, whereas characters that appear infrequently can have longer codes.  For example, in the word "text" we could encode t->0, e->10, and x->11.  The encoding of "text" would then be 010110, which is 6 bits instead of 32.

      We have to be careful, however, to choose the codes so that every message can be uniquely decoded.  If we had chosen t->0, e->10, x->1, then both "text" and "tee" would have the same code, 01010.  Not good.  To prevent this from happening, we must choose the codes so that no code is a proper prefix of any other.
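
The prefix condition can be checked mechanically.  The following is a minimal sketch, assuming codes are written as strings of '0' and '1' characters; the names prefixFree, goodCodes, and badCodes are hypothetical and are not part of the Lab7 project.

```haskell
import Data.List (isPrefixOf)

-- | True when no code is a prefix of another code in the list.
-- Assumption: each code is a String of '0' and '1' characters.
prefixFree :: [String] -> Bool
prefixFree codes =
  and [ not (c `isPrefixOf` d)
      | (i, c) <- indexed
      , (j, d) <- indexed
      , i /= j ]          -- compare each code only against the others
  where
    indexed = zip [0 :: Int ..] codes

-- The two code assignments discussed above, in the order t, e, x.
goodCodes, badCodes :: [String]
goodCodes = ["0", "10", "11"]
badCodes  = ["0", "10", "1"]
```

In GHCi, prefixFree goodCodes evaluates to True, while prefixFree badCodes evaluates to False because "1" is a prefix of "10".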

      To construct an optimal code satisfying the prefix property, we will use a technique called Huffman coding (named after David Huffman).  Each character is stored as a leaf in a binary tree in such a way that more frequently used characters are at a lesser depth in the tree than less frequently used ones.  The code of a character is a sequence of 0's and 1's describing the path from the root of the tree to the character, where a 0 represents a left branch and a 1 a right branch.  Consider the following tree data type and example tree:

      data HTree = Leaf Char | Branch HTree HTree deriving Show

      Branch (Branch (Leaf 'x') (Leaf 'e')) (Leaf 't')

      In this tree, the character 'x' is coded by 00, 'e' by 01, and 't' by 1.
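
As a quick check on the example tree, the length of each character's code equals the depth of its leaf.  The sketch below is one way to list those depths; leafDepths and exampleTree are hypothetical names introduced here for illustration only.

```haskell
data HTree = Leaf Char | Branch HTree HTree deriving Show

-- The example tree from the text.
exampleTree :: HTree
exampleTree = Branch (Branch (Leaf 'x') (Leaf 'e')) (Leaf 't')

-- | Pair every character with the depth of its leaf (the root has depth 0).
-- The depth is also the number of bits in that character's code.
leafDepths :: HTree -> [(Char, Int)]
leafDepths (Leaf c)     = [(c, 0)]
leafDepths (Branch l r) = [ (c, d + 1) | (c, d) <- leafDepths l ++ leafDepths r ]
```

Here leafDepths exampleTree evaluates to [('x',2),('e',2),('t',1)], which matches the code lengths of 00, 01, and 1.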

      To build a Huffman tree we start with a list of characters along with their frequencies.  For example:

      [('g',8),('r',9),('a',11),('t',13),('e',17)]

      We convert this list of pairs into a list of trees and then repeatedly combine the two trees with the lightest weights until just one tree remains.  The weight of a single leaf is the frequency of the character at that leaf.  The weight of a binary node is the sum of the weights of its two subtrees.  We will need another tree data type for this weighted tree:

      data WeightedTree = Tip Int Char | Node Int WeightedTree WeightedTree deriving Show

      After the weighted tree is constructed we can simply remove the weights and get our HTree.

      The code for construction of the weighted tree is given to you.  Look through the code and trace through it with the given frequencies.  Test it out and see if you get the weighted tree that you expect.
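
The combining procedure described above can be sketched as follows.  This is an illustration only and is not necessarily how the provided project code is written: the names sketchWeightedTree, weight, insertTree, and combine are hypothetical, and both the tie-breaking rule (a new node goes after existing trees of equal weight) and the left/right order of combined subtrees are assumptions, so the provided code may produce a differently shaped tree.

```haskell
import Data.List (sortOn)

data WeightedTree = Tip Int Char | Node Int WeightedTree WeightedTree deriving Show

-- | Weight stored at the root of a tree.
weight :: WeightedTree -> Int
weight (Tip w _)    = w
weight (Node w _ _) = w

-- | Insert a tree into a list sorted by ascending weight.
-- Assumption: the new tree goes after trees of equal weight.
insertTree :: WeightedTree -> [WeightedTree] -> [WeightedTree]
insertTree t [] = [t]
insertTree t (u:us)
  | weight t < weight u = t : u : us
  | otherwise           = u : insertTree t us

-- | Repeatedly merge the two lightest trees until one remains.
combine :: [WeightedTree] -> WeightedTree
combine []  = error "combine: no characters given"
combine [t] = t
combine (a:b:rest) =
  combine (insertTree (Node (weight a + weight b) a b) rest)

-- | Build a weighted tree from (character, frequency) pairs.
sketchWeightedTree :: [(Char, Int)] -> WeightedTree
sketchWeightedTree freqs =
  combine (sortOn weight [ Tip n c | (c, n) <- freqs ])
```

Whatever the tie-breaking rule, the weight at the root must equal the sum of all the frequencies.  For the sample list that is 8 + 9 + 11 + 13 + 17 = 58, which is a quick sanity check for any implementation.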

       

    4. We will make our HTree as follows:

      makeHTree :: [(Char,Int)] -> HTree
      makeHTree x = unweight (makeWeightedTree x)

       

      Assignment:
      Write the function unweight, which takes a weighted tree and converts it to an HTree by stripping off the weights.
      Test it out with the frequencies above.

       

    5. Now that we have our HTree, decoding a message should be fairly straightforward.  Traverse the tree from the root, taking the left branch for each 0 and the right branch for each 1, until you reach a leaf.  Output the character at that leaf; if bits remain, repeat the process from the root for the next character.
       
      Assignment:
      Write the function decode :: HTree -> [Bit] -> [Char].
      Test it out with the HTree from question #4 and the bit string [1,1,0,1,1,1,1,0,0,0,0,1].

       

    6. Encoding is not as easy, since the tree is good for finding the character associated with a bit string but poor at finding the bit string associated with a given character.  So we will write a function transform, which converts the tree into a table where we can look up the code for a given character.  The table has the following type:

      type CodeTable = [(Char,[Bit])]

      The code for transform is:

      transform :: HTree -> CodeTable
      transform (Leaf x) = [(x,[])]
      transform (Branch t1 t2) = hufmerge (transform t1) (transform t2)

      The function hufmerge takes two code tables and merges them, adding a zero bit to the front of all the codes coming from the first table, and a one bit to the front of all the codes coming from the second table.
       

      Assignment:
      Write the function hufmerge :: CodeTable -> CodeTable -> CodeTable
      Test it out by doing a transform of your HTree.


       

    7. We need a lookup function that finds the bit string for a given character in the CodeTable.
       
      Assignment:
      Write the function codeLookup :: Char -> CodeTable -> [Bit].
      Test it out.

       

    8. With the CodeTable in hand, we can now write the encode operation.
       
      Assignment:
      Write the function encode :: HTree -> [Char] -> [Bit].
      Test it out with the HTree from question #4 and the string "great".


       

    9. Email your modified zipped project to me for grading.