Schedule of Topics

This is the schedule of topics for Computational Linguistics II, Spring 2006.

Readings are from Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language Processing, unless otherwise specified. The "Other" entries contain optional links pointing either to material you should already know (but might want to review) or to related material you might be interested in.

Note also that some topic areas may take longer than expected, so keep an eye on the class mailing list or e-mail me for "official" dates.

Jan 25: Course administrivia, semester plan; corpus-driven and computational linguistics
  Readings: Ch 1, 2.1.[1-9] (for review)
  Topics: Word counts; tokenization; frequency and Zipf's law; concordances
  Assignments: Assignment 0 (given in class)
  Other: Corpus Colossal (The Economist, 20 Jan 2005); Language Log; Resnik and Elkiss (DRAFT); Linguist's Search Engine

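As a toy illustration of the word-count and Zipf's-law topics above: the sketch below counts word frequencies in a small invented text and prints rank times frequency, which Zipf's law predicts to be roughly constant across ranks. The corpus and all numbers here are made up for illustration, not taken from course materials.

```python
from collections import Counter

# Tiny invented corpus, whitespace "tokenization" only.
text = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat and the dog saw the mat"
)
tokens = text.split()
counts = Counter(tokens)

# Sort words by descending frequency; Zipf's law says rank * freq
# should be roughly constant (very roughly, on a corpus this small).
ranked = sorted(counts.items(), key=lambda kv: -kv[1])
for rank, (word, freq) in enumerate(ranked, start=1):
    print(rank, word, freq, rank * freq)
```

On a real corpus you would use a proper tokenizer and plot log rank against log frequency instead of eyeballing the products.
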
Words and lexical association
  Readings: Ch 5
  Topics: Collocations; mutual information; hypothesis testing
  Assignments: Assignment 1a, Assignment 1b
  Other: Dunning (1993), Bland and Altman (1995)

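The pointwise mutual information score used for collocation detection, one of the topics above, fits in a few lines. The toy sentence and the maximum-likelihood probability estimates below are invented for illustration only.

```python
import math
from collections import Counter

tokens = "new york is a big city and new york is busy".split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
n = len(tokens)

def pmi(x, y):
    """PMI(x, y) = log2 P(x, y) / (P(x) P(y)), estimated by MLE."""
    p_xy = bigrams[(x, y)] / (n - 1)   # n - 1 adjacent bigram positions
    p_x, p_y = unigrams[x] / n, unigrams[y] / n
    return math.log2(p_xy / (p_x * p_y))

print(pmi("new", "york"))  # large and positive: behaves like a collocation
```

As Ch 5 discusses, raw PMI is unreliable for low counts, which is why hypothesis tests such as Dunning's likelihood ratio are covered alongside it.
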
Information theory, n-gram models
  Readings: Ch 2.2, Ch 6
  Topics: Information theory essentials; noisy channel model; maximum likelihood estimation
  Assignments: Assignment 2

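To connect the information-theory essentials with maximum likelihood estimation, here is a minimal sketch (toy data invented) that fits an MLE unigram model and computes its entropy in bits per token:

```python
import math
from collections import Counter

tokens = "a b a a b a c".split()   # toy data, invented for illustration
counts = Counter(tokens)
n = len(tokens)

# Maximum-likelihood unigram model: P(w) = count(w) / N
probs = {w: c / n for w, c in counts.items()}

# Entropy H(P) = -sum_w P(w) log2 P(w), in bits per token
entropy = -sum(p * math.log2(p) for p in probs.values())
print(round(entropy, 3))  # prints 1.379
```

Cross-entropy of a model against held-out data, computed the same way but with model probabilities inside the log, is the standard way to compare n-gram models (Ch 2.2).
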
Smoothing; hidden Markov models
  Readings: Ch 9-10
  Topics: Smoothing methods; review of the forward and Viterbi algorithms; EM and the forward-backward algorithm
  Assignments: Assignment 3

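Since the Viterbi algorithm is listed above as review material, here is a compact reference implementation over a toy two-state HMM. The states, words, and all probabilities are invented purely for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # v[t][s] = (probability of the best path ending in state s at time t,
    #            back-pointer to the best previous state)
    v = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        v.append({})
        for s in states:
            prev = max(states, key=lambda r: v[t - 1][r][0] * trans_p[r][s])
            v[t][s] = (v[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                       prev)
    # Recover the best path by following back-pointers from the best end state.
    best = max(states, key=lambda s: v[-1][s][0])
    path = [best]
    for t in range(len(obs) - 1, 0, -1):
        path.append(v[t][path[-1]][1])
    return list(reversed(path))

# Toy two-state "tagger" with invented probabilities.
states = ["N", "V"]
start_p = {"N": 0.7, "V": 0.3}
trans_p = {"N": {"N": 0.4, "V": 0.6}, "V": {"N": 0.7, "V": 0.3}}
emit_p = {"N": {"dog": 0.6, "barks": 0.1}, "V": {"dog": 0.1, "barks": 0.7}}
print(viterbi(["dog", "barks"], states, start_p, trans_p, emit_p))  # ['N', 'V']
```

A real implementation would work in log space to avoid underflow on long sequences; the forward algorithm has the same shape with `sum` in place of `max`.
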
Treebanks and probabilistic parsing
  Readings: Ch 11-12, Abney (1996)
  Topics: PCFGs; inside probabilities; dependency-based models; NLP evaluation paradigms and parser evaluation
  Other: Pereira (2000); Detlef Prescher, A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars

EM Revisited: Inside-Outside Algorithm
  Topics: A fairly general, intuitive schema for deriving EM update equations, and the inside-outside algorithm as an instance of it
  Assignments: Assignment 4

Mar 15: Guest Lecture: Jimmy Lin on Information Retrieval
  Readings: Ch 8.5, 15.{1,2,4}
  Assignments: Take-home midterm assigned
  Other: Lecture slides

Mar 22: Spring Break

Mar 29: Supervised classification
  Readings: Ch 16
  Topics: Experimental setups and evaluation; k-nearest neighbor classification; naive Bayes; decision lists; decision trees
  Assignments: Assignment 5 (Project), due April 26

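As a sketch of the naive Bayes topic above: a minimal multinomial naive Bayes text classifier with add-one (Laplace) smoothing. The tiny training set and labels are invented for illustration.

```python
import math
from collections import Counter

# Toy labeled training data (invented).
train = [("spam", "buy cheap pills now"),
         ("spam", "cheap pills cheap"),
         ("ham",  "meeting agenda for monday"),
         ("ham",  "monday meeting notes")]

class_counts = Counter(label for label, _ in train)
word_counts = {c: Counter() for c in class_counts}
for label, text in train:
    word_counts[label].update(text.split())
vocab = {w for c in word_counts for w in word_counts[c]}

def predict(text):
    """argmax_c log P(c) + sum_w log P(w | c), with add-one smoothing."""
    def log_score(c):
        total = sum(word_counts[c].values())
        score = math.log(class_counts[c] / len(train))
        for w in text.split():
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        return score
    return max(class_counts, key=log_score)

print(predict("cheap pills"))     # spam
print(predict("monday meeting"))  # ham
```

The same experimental-setup questions listed above (train/test splits, evaluation metrics) apply to this and every other classifier in the row.
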
Apr 5: Maximum entropy models
  Readings: Ch 16
  Topics: The maximum entropy principle; log-linear models; feature selection for supervised classification
  Other: Adwait Ratnaparkhi, A Simple Introduction to Maximum Entropy Models for Natural Language Processing (1997) and A Maximum Entropy Model for Part-of-Speech Tagging (EMNLP 1996); Adam Berger's maxent tutorial; Noah Smith's notes on log-linear models

Apr 12: Word sense disambiguation
  Readings: Ch 7
  Topics: Characterizing the WSD problem; WSD evaluation; unsupervised methods and Lesk's algorithm; supervised techniques; semi-supervised learning and Yarowsky's algorithm

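The gloss-overlap idea behind Lesk's algorithm, listed among the unsupervised methods above, can be sketched in its "simplified Lesk" form: pick the sense whose dictionary gloss shares the most words with the context. The two-sense mini inventory below is invented for illustration.

```python
# Invented two-sense inventory for the ambiguous word "bank".
senses = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river":   "the sloping land beside a body of water such as a river",
}

def lesk(context, senses):
    """Simplified Lesk: sense with maximal context/gloss word overlap."""
    ctx = set(context.lower().split())
    return max(senses, key=lambda s: len(ctx & set(senses[s].split())))

print(lesk("he sat on the bank of the river fishing", senses))  # bank/river
```

A serious version would remove stopwords and stem both gloss and context; as-is, function words like "the" can inflate overlaps.
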
Apr 19: Word sense disambiguation in NLP applications
  Readings: Resnik (2006), "WSD in NLP Applications" (to appear in Edmonds and Agirre 2006)
  Topics: "Traditional" WSD in IR, QA, MT, and related applications

Apr 26: Machine translation
  Readings: Ch 13 and Adam Lopez, Statistical Machine Translation (survey article, submitted)
  Topics: Historical view of MT approaches; noisy channel for SMT; IBM Models 1 and 4; HMM distortion model; going beyond word-level models
  Other: Mihalcea and Pedersen (2003); Philip Resnik, Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation, in Alexander Gelbukh (ed.), Lecture Notes in Computer Science 2945: Computational Linguistics and Intelligent Text Processing, Springer, 2004, pp. 283-299

May 3: Phrase-based statistical MT
  Topics: Components of a phrase-based system: language modeling, translation modeling, sentence alignment, word alignment, phrase extraction, parameter tuning, decoding, rescoring, evaluation
  Assignments: Assignment 6
  Other: Koehn, PHARAOH: A Beam Search Decoder for Phrase-Based Statistical Machine Translation

May 10: Computational approaches to human language acquisition
  Readings: Mintz, T. H. (2006). Finding the verbs: distributional cues to categories available to young learners. In K. Hirsh-Pasek & R. M. Golinkoff (Eds.), Action Meets Word: How Children Learn Verbs, pp. 31-63. New York: Oxford University Press. [link]

Readings are from Manning and Schuetze unless otherwise specified. Do the reading before the class where it is listed!

Return to course home page

This page last updated 5 April 2006.

Many thanks to David Chiang, Bonnie Dorr, Christof Monz, and Amy Weinberg for discussions about the syllabus. Responsibility for the outcome is, of course, completely indeterminate. :-)