Schedule of Topics

This is the schedule of topics for Computational Linguistics II, Spring 2006.

Readings are from Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language Processing, unless otherwise specified. The "Other" entries contain optional links pointing either to material you should already know (but might want to review) or to related material you might be interested in.

Note also that some topic areas may take longer than expected, so keep an eye on the class mailing list or e-mail me for "official" dates.

Jan 25: Course administrivia, semester plan; corpus-driven and computational linguistics
  Readings: Ch 1, 2.1.[1-9] (for review)
  Topics: Word counts; tokenization; frequency and Zipf's law; concordances
  Assignments: Assignment 0 (given in class)
  Other: Corpus Colossal (The Economist, 20 Jan 2005); Language Log; Resnik and Elkiss (DRAFT); Linguist's Search Engine

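As a toy illustration of the word-count and Zipf's-law topics above: the sketch below counts word frequencies in a small invented text and prints rank times frequency, which Zipf's law predicts to be roughly constant across ranks. The corpus and all numbers here are made up for illustration, not taken from course materials.

```python
from collections import Counter

# Tiny invented corpus, whitespace "tokenization" only.
text = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat and the dog saw the mat"
)
tokens = text.split()
counts = Counter(tokens)

# Sort words by descending frequency; Zipf's law says rank * freq
# should be roughly constant (very roughly, on a corpus this small).
ranked = sorted(counts.items(), key=lambda kv: -kv[1])
for rank, (word, freq) in enumerate(ranked, start=1):
    print(rank, word, freq, rank * freq)
```

On a real corpus you would use a proper tokenizer and plot log rank against log frequency instead of eyeballing the products.
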
Words and lexical association
  Readings: Ch 5
  Topics: Collocations; mutual information; hypothesis testing
  Assignments: Assignment 1a, Assignment 1b
  Other: Dunning (1993), Bland and Altman (1995)

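The pointwise mutual information score used for collocation detection, one of the topics above, fits in a few lines. The toy sentence and the maximum-likelihood probability estimates below are invented for illustration only.

```python
import math
from collections import Counter

tokens = "new york is a big city and new york is busy".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n = len(tokens)

def pmi(x, y):
    """PMI(x, y) = log2 P(x, y) / (P(x) P(y)), estimated by MLE."""
    p_xy = bigrams[(x, y)] / (n - 1)   # n - 1 adjacent bigram positions
    p_x, p_y = unigrams[x] / n, unigrams[y] / n
    return math.log2(p_xy / (p_x * p_y))

print(pmi("new", "york"))  # large and positive: behaves like a collocation
```

As Ch 5 discusses, raw PMI is unreliable for low counts, which is why hypothesis tests such as Dunning's likelihood ratio are covered alongside it.
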
Information theory, n-gram models
  Readings: Ch 2.2, Ch 6
  Topics: Information theory essentials; noisy channel model; maximum likelihood estimation
  Assignments: Assignment 2

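To connect the information-theory essentials with maximum likelihood estimation, here is a minimal sketch (toy data invented) that fits an MLE unigram model and computes its entropy in bits per token:

```python
import math
from collections import Counter

tokens = "a b a a b a c".split()   # toy data, invented for illustration
counts = Counter(tokens)
n = len(tokens)

# Maximum-likelihood unigram model: P(w) = count(w) / N
probs = {w: c / n for w, c in counts.items()}

# Entropy H(P) = -sum_w P(w) log2 P(w), in bits per token
entropy = -sum(p * math.log2(p) for p in probs.values())
print(round(entropy, 3))  # prints 1.379
```

Cross-entropy of a model against held-out data, computed the same way but with model probabilities inside the log, is the standard way to compare n-gram models (Ch 2.2).
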
Smoothing; hidden Markov models
  Readings: Ch 9-10
  Topics: Smoothing methods; review of the forward and Viterbi algorithms; EM and the forward-backward algorithm
  Assignments: Assignment 3

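Since the Viterbi algorithm is listed above as review material, here is a compact reference implementation over a toy two-state HMM. The states, words, and all probabilities are invented purely for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # v[t][s] = (probability of the best path ending in state s at time t,
    #            back-pointer to the best previous state)
    v = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        v.append({})
        for s in states:
            prev = max(states, key=lambda r: v[t - 1][r][0] * trans_p[r][s])
            v[t][s] = (v[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                       prev)
    # Recover the best path by following back-pointers from the best end state.
    best = max(states, key=lambda s: v[-1][s][0])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(v[t][path[-1]][1])
    return list(reversed(path))

# Toy two-state "tagger" with invented probabilities.
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.7, "V": 0.3}}
emit_p = {"N": {"dog": 0.6, "barks": 0.1}, "V": {"dog": 0.1, "barks": 0.7}}
print(viterbi(["dog", "barks"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

A real implementation would work in log space to avoid underflow on long sequences; the forward algorithm has the same shape with `sum` in place of `max`.
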
Treebanks and probabilistic parsing
  Readings: Ch 11-12, Abney (1996)
  Topics: PCFGs; inside probabilities; dependency-based models; NLP evaluation paradigms and parser evaluation
  Other: Pereira (2000); Detlef Prescher, A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars

EM Revisited: Inside-Outside Algorithm
  Topics: A fairly general, intuitive schema for deriving EM update equations, and the inside-outside algorithm as an instance of it
  Assignments: Assignment 4

Mar 15: Guest Lecture: Jimmy Lin on Information Retrieval
  Readings: Ch 8.5, 15.{1,2,4}
  Assignments: Take-home midterm assigned
  Other: Lecture slides

Mar 22: Spring Break

Mar 29: Supervised classification
  Readings: Ch 16
  Topics: Experimental setups and evaluation; k-nearest neighbor classification; naive Bayes; decision lists; decision trees
  Assignments: Assignment 5 (Project), due April 26

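As a sketch of the naive Bayes topic above: a minimal multinomial naive Bayes text classifier with add-one (Laplace) smoothing. The tiny training set and labels are invented for illustration.

```python
import math
from collections import Counter

# Toy labeled training data (invented).
train = [("spam", "buy cheap pills now"),
         ("spam", "cheap pills cheap"),
         ("ham",  "meeting agenda for monday"),
         ("ham",  "monday meeting notes")]

class_counts = Counter(label for label, _ in train)
word_counts = {c: Counter() for c in class_counts}
for label, text in train:
    word_counts[label].update(text.split())
vocab = {w for c in word_counts for w in word_counts[c]}

def predict(text):
    """argmax_c log P(c) + sum_w log P(w | c), with add-one smoothing."""
    def log_score(c):
        total = sum(word_counts[c].values())
        score = math.log(class_counts[c] / len(train))
        for w in text.split():
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        return score
    return max(class_counts, key=log_score)

print(predict("cheap pills"))     # spam
print(predict("monday meeting"))  # ham
```

The same experimental-setup questions listed above (train/test splits, evaluation metrics) apply to this and every other classifier in the row.
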
Apr 5: Maximum entropy models
  Readings: Ch 16
  Topics: The maximum entropy principle; log-linear models; feature selection for supervised classification
  Other: Adwait Ratnaparkhi, A Simple Introduction to Maximum Entropy Models for Natural Language Processing (1997) and A Maximum Entropy Model for Part-of-Speech Tagging (EMNLP 1996); Adam Berger's maxent tutorial; Noah Smith's notes on log-linear models

Apr 12: Word sense disambiguation
  Readings: Ch 7
  Topics: Characterizing the WSD problem; WSD evaluation; unsupervised methods and Lesk's algorithm; supervised techniques; semi-supervised learning and Yarowsky's algorithm

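The gloss-overlap idea behind Lesk's algorithm, listed among the unsupervised methods above, can be sketched in its "simplified Lesk" form: pick the sense whose dictionary gloss shares the most words with the context. The two-sense mini inventory below is invented for illustration.

```python
# Invented two-sense inventory for the ambiguous word "bank".
senses = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river":   "the sloping land beside a body of water such as a river",
}

def lesk(context, senses):
    """Simplified Lesk: sense with maximal context/gloss word overlap."""
    ctx = set(context.lower().split())
    return max(senses, key=lambda s: len(ctx & set(senses[s].split())))

print(lesk("he sat on the bank of the river fishing", senses))  # bank/river
```

A serious version would remove stopwords and stem both gloss and context; as-is, function words like "the" can inflate overlaps.
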
Apr 19: Word sense disambiguation in NLP applications
  Readings: Resnik (2006), "WSD in NLP Applications" (to appear in Edmonds and Agirre 2006)
  Topics: "Traditional" WSD in IR, QA, MT, and related applications

Apr 26: Machine translation
  Readings: Ch 13 and Adam Lopez, Statistical Machine Translation (survey article, submitted)
  Topics: Historical view of MT approaches; noisy channel for SMT; IBM Models 1 and 4; HMM distortion model; going beyond word-level models
  Other: Mihalcea and Pedersen (2003); Philip Resnik, Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation, in Alexander Gelbukh (ed.), Lecture Notes in Computer Science 2945: Computational Linguistics and Intelligent Text Processing, Springer, 2004, pp. 283-299

May 3: Phrase-based statistical MT
  Topics: Components of a phrase-based system: language modeling, translation modeling, sentence alignment, word alignment, phrase extraction, parameter tuning, decoding, rescoring, evaluation
  Assignments: Assignment 6
  Other: Koehn, PHARAOH: A Beam Search Decoder for Phrase-Based Statistical Machine Translation

May 10: Computational approaches to human language acquisition
  Readings: Mintz, T. H. (2006). Finding the verbs: distributional cues to categories available to young learners. In K. Hirsh-Pasek & R. M. Golinkoff (Eds.), Action Meets Word: How Children Learn Verbs, pp. 31-63. New York: Oxford University Press. [link]

Readings are from Manning and Schuetze unless otherwise specified. Do the reading before the class where it is listed!

Return to course home page

This page last updated 5 April 2006.

Many thanks to David Chiang, Bonnie Dorr, Christof Monz, and Amy Weinberg for discussions about the syllabus. Responsibility for the outcome is, of course, completely indeterminate. :-)