CS340    Lab 9 - NLP: From words to (some) information extraction


  1. Read section 3.4 in this reading.

  2. Copy the file ~jillz/cs340/code_random_text.py that is described in the reading and try it out. 

  3. Do parts a) and b) of exercise #2 in the reading, on predicting the next word.

  4. Read sections 4.1 and 4.2 in this reading on tagging, and complete exercise #1 on ambiguity resolved by part-of-speech tags.  You can use this website to find headlines where POS tagging helps with the ambiguity and headlines where it does not.

  5. Complete exercise #2, parts b and e, in the second reading.  You can get the Brown tags from nltk.corpus.brown.tagged_words().  (I found ConditionalFreqDist and list comprehensions in Python very useful.)

  6. Here is the last reading for this lab.

  7. Copy the file ~jillz/cs340/code_cascaded_chunker.py and test it out by parsing the sentences with

    print(cp.parse(sentence))

  8. Write a tag pattern to cover noun phrases that contain gerunds, e.g. "the/DT receiving/VBG end/NN", "assistant/NN managing/VBG editor/NN". Add these patterns to the grammar and test your work using some tagged sentences of your own devising.

  9. Submit your code and a write-up of your results.
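
A note on step 5: the sketch below shows, in plain Python, the kind of counting that a conditional frequency distribution does over (word, tag) pairs. The `tagged` list is a tiny hand-made stand-in for the output of nltk.corpus.brown.tagged_words(), and the helper name `tags_by_word` is made up for illustration; it is not part of NLTK.

```python
from collections import Counter, defaultdict

# Hand-made stand-in for nltk.corpus.brown.tagged_words(); the real
# corpus yields (word, tag) pairs in the same shape.
tagged = [
    ("The", "AT"), ("race", "NN"), ("is", "BEZ"), ("on", "IN"),
    ("They", "PPSS"), ("race", "VB"), ("the", "AT"), ("race", "NN"),
]

def tags_by_word(pairs):
    """Map each lowercased word to a Counter of the tags it appears with."""
    table = defaultdict(Counter)
    for word, tag in pairs:
        table[word.lower()][tag] += 1
    return table

table = tags_by_word(tagged)

# List comprehension style: words that occur with more than one distinct tag.
ambiguous = sorted(w for w in table if len(table[w]) > 1)

print(table["race"].most_common())
print(ambiguous)
```

With the real corpus, the same idea should carry over by swapping the toy list for the Brown tagged words; whether to lowercase the words first is a design choice that changes the counts.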
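
A note on step 8: test sentences are easy to write in word/TAG notation and then convert into a list of (word, tag) pairs. The helper below is a minimal sketch in plain Python; the name `parse_tagged` is made up for illustration, and it assumes that `cp` in code_cascaded_chunker.py accepts a list of (word, tag) tuples, which is not confirmed here.

```python
def parse_tagged(text):
    """Turn 'the/DT receiving/VBG end/NN' into a list of (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # Split on the last slash so words that contain '/' stay intact.
        # A token with no slash raises ValueError, which flags a typo early.
        word, tag = token.rsplit("/", 1)
        pairs.append((word, tag))
    return pairs

sentence = parse_tagged("the/DT receiving/VBG end/NN")
print(sentence)
```

The resulting `sentence` can then be passed to the chunker as in step 7. NLTK's own nltk.tag.str2tuple does a similar conversion for a single token.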