This is the schedule of topics for Computational Linguistics II, Spring 2007.
Readings are from Christopher D. Manning and Hinrich Schuetze, Foundations of Statistical Natural Language Processing, unless otherwise specified. The "other" column has optional links pointing either to material you should already know (but might want to review) or to related material you might be interested in.
THIS SCHEDULE IS A WORK IN PROGRESS!
In addition, some topic areas may take longer than expected, so keep
an eye on the class mailing list or e-mail me for "official"
dates.
Class | Topic | Readings* | Assignments | Other | |
---|---|---|---|---|---|
Jan 24 | Course administrivia, semester plan; corpus-driven and computational linguistics | Ch 1, 2.1.[1-9] (for review). Word counts; tokenization; frequency and Zipf's law; concordances | Assignment 0 | Corpus Colossal (The Economist, 20 Jan 2005); Language Log; Resnik and Elkiss (DRAFT); Linguist's Search Engine | |
Jan 31 | Words and lexical association | Ch 5. Collocations; mutual information; hypothesis testing | Assignment 1a, Assignment 1b | Dunning (1993), Bland and Altman (1995) | |
Feb 7 | Information theory, n-gram models | Ch 2.2, Ch 6. Information theory essentials; entropy, relative entropy, mutual information, perplexity; noisy channel model; maximum likelihood estimation | Assignment 2 | | |
Feb 14 | Cancelled: snow | | | | |
Feb 21 | Smoothing | Ch 9-10. Smoothing methods | Assignment 3 | An empirical study of smoothing techniques for language modeling (Stanley Chen and Joshua Goodman, Technical report TR-10-98, Harvard University, August 1998); Revised Chapter 4 from the updated Jurafsky and Martin textbook. | |
Feb 28 | Probabilistic grammar | Ch 11-12, Abney (1996). HMM review (forward and Viterbi algorithms, EM using the forward-backward algorithm); PCFGs; inside probabilities; revisiting EM with the inside-outside algorithm; lexicalized and dependency-based models | | Pereira (2000); Detlef Prescher, A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars; McClosky, Charniak, and Johnson (2006), Effective Self-Training for Parsing | |
Mar 7 | Cancelled: snow | | | | |
Mar 14 | Beyond CFG; Parser Evaluation and NLP Evaluation in General | History-based grammars; dependency representations; evaluation paradigms for NLP; parser evaluation | | | |
Mar 21 | Spring Break | Have fun! | | | |
Mar 28 | Supervised classification | Ch 16. Supervised learning: k-nearest neighbor classification; naive Bayes; decision lists; decision trees; transformation-based learning (Sec 10.4); linear classifiers; the maximum entropy principle and maxent models; feature selection | Take-home midterm handed out | Other useful readings include Adwait Ratnaparkhi's A Simple Introduction to Maximum Entropy Models for Natural Language Processing (1997) and A Maximum Entropy Model for Part-Of-Speech Tagging (EMNLP 1996); Adam Berger's maxent tutorial; and Noah Smith's notes on loglinear models. | |
Apr 4 | Word sense disambiguation | Ch 7; Resnik, "WSD in NLP Applications" (Ch 11 in Edmonds and Agirre (2006)). Characterizing the WSD problem; WSD in applications; WSD evaluation; unsupervised methods/Lesk's algorithm; supervised techniques; semi-supervised learning and Yarowsky's algorithm | Assignment 4 | | |
Apr 11 | Information Retrieval | Ch 8.5, 15.{1,2,4} | | Lecture slides | |
Apr 18 | Guest lecture (Smaranda Muresan) on graph-based methods in NLP | (a) Rada Mihalcea and Paul Tarau, TextRank: Bringing Order into Texts, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004; (b) Rada Mihalcea, Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization, in Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2004), Barcelona, Spain, July 2004; (c) Paper/data of Pang and Lee on sentiment analysis with min-cuts. PageRank and variants; HITS; min-cuts | Team Project | Lecture slides. Optional readings of interest: (a) Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze, Introduction to Information Retrieval, Cambridge University Press: Chapter 21 "Link Analysis"; (b) L. Page et al., The PageRank Citation Ranking: Bringing Order to the Web; (c) Jon Kleinberg, Authoritative Sources in a Hyperlinked Environment, in Proceedings of SODA 1998; (d) Kurt Bryan and Tanya Leise, The $25,000,000,000 Eigenvector: The Linear Algebra Behind Google (SIAM Review 48(3), 2006, pp. 569-581) | |
Apr 25 | The Web as a Corpus | (a) A. Kilgarriff and G. Grefenstette, Introduction to the special issue on the web as corpus, Computational Linguistics 29(3): 333-348 (2003); (b) Lapata, Mirella and Frank Keller. 2004. The Web as a Baseline: Evaluating the Performance of Unsupervised Web-based Models for a Range of NLP Tasks. Proc HLT/NAACL, pp. 121-128; (c) Lapata, Maria. 2001. A Corpus-based Account of Regular Polysemy: The Case of Context-sensitive Adjectives. Proc NAACL; (d) Philip Resnik, Aaron Elkiss, Ellen Lau and Heather Taylor. The Web in Theoretical Linguistics Research: Two Case Studies Using the Linguist's Search Engine. Proc. 31st Meeting of the Berkeley Linguistics Society, pp. 265-276, February 2005. What is a corpus?; using the Web for NLP tasks; ways linguists can use the Web. | | Also of possible interest: Linguist's Search Engine; Mirella Lapata and Frank Keller. 2005. Web-based Models for Natural Language Processing. ACM Transactions on Speech and Language Processing 2:1, 1-31. (Extends Lapata and Keller 2004); WebExp software for Web-based psycholinguistics | |
May 2 | Machine translation | Ch 13 and Adam Lopez, A Survey of Statistical Machine Translation, Technical Report LAMP-TR-135/CS-TR-4831/UMIACS-TR-2006-47, University of Maryland, College Park, April 2007. Historical view of MT approaches; noisy channel for SMT; IBM models 1 and 4; HMM distortion model; going beyond word-level models | | Also potentially useful or of interest: Kevin Knight, A Statistical MT Tutorial Workbook; Mihalcea and Pedersen (2003); Philip Resnik, Exploiting Hidden Meanings: Using Bilingual Text for Monolingual Annotation. In Alexander Gelbukh (ed.), Lecture Notes in Computer Science 2945: Computational Linguistics and Intelligent Text Processing, Springer, 2004, pp. 283-299. | |
May 9 | Phrase-based statistical MT | Papineni, Roukos, Ward and Zhu. 2001. BLEU: A Method for Automatic Evaluation of Machine Translation. Components of a phrase-based system: language modeling, translation modeling; sentence alignment, word alignment, phrase extraction, parameter tuning, decoding, rescoring, evaluation. | Take-home final handed out | Koehn, PHARAOH: A Beam Search Decoder for Phrase-Based Statistical Machine Translation; Koehn (2004) presentation on PHARAOH decoder | |