INST 734
Information Retrieval Systems
Fall 2014
Module 2: Monday, September 8 to Sunday, September 14
This module takes us through the process of getting from a collection
of text documents to the inverted index that we will use in the next
module as a basis for ranked retrieval. The module is designed to be
completed in 12 hours over 7 days. As with every module, you must
complete all of its components by midnight on the module's end date
(in this case, Sunday, September 14).
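As a rough preview of where the module is headed, the sketch below builds a tiny inverted index in Python. It is an illustration under simplifying assumptions, not the method used in the lectures or in Exercise E2: the `tokenize` and `build_inverted_index` helpers and the three-document collection are hypothetical, tokenization is just lowercasing plus splitting on non-alphanumeric characters, and postings hold document IDs only (no positions or term frequencies).

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to a sorted list of IDs of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    # Sorted postings lists make later merge-style intersection straightforward.
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical three-document collection, for illustration only.
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
}
index = build_inverted_index(docs)
```

Looking up `index["july"]` then returns the IDs of the documents that contain that term, which is the basic operation that Boolean retrieval builds on.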
Module Checklist
The recommended order for completing the activities in this module is:
Check ELMS to see whether you have been assigned an additional reading
summary this week. If so, complete that summary by midnight Thursday
and submit it using ELMS.
Additional readings are assigned to five students. All students should
read the five summaries once they are posted to ELMS, and all students
other than those who wrote a summary this week should comment on the
discussion board about at least one of the summaries (or about someone
else's comment on one of them).
A lecture on The Inverted Index from Chris Manning's Stanford MOOC on
Natural Language Processing.
A lecture on Boolean Retrieval from Chris Manning's Stanford MOOC on
Natural Language Processing.
A lecture on Phrase Queries from Chris Manning's Stanford MOOC on
Natural Language Processing.
A guest lecture by Bill Graham on Introduction to Hadoop from Marti
Hearst's Fall 2012 i290 course on Analyzing Big Data with Twitter at
the University of California, Berkeley.
Finally, complete Exercise E2. Like all assignments that you are asked
to turn in, this is due at midnight on the last day of the module
(which is always a Sunday night). Note that you are allowed to work
with other students on exercises, but you must type in the results
yourself (no cut and paste -- see the course description for details).
Doug Oard
Last modified: Sat Oct 18 23:25:07 2014