Please turn in this homework by linking it from your course Web page DURING CLASS on the due date. Please do not link it before that time.
This Excel spreadsheet contains seven separate sets of judgments for the hits examined in Homework 2. For one of the topics (you can choose either one), you will:
1. Agreement on Relevance Judgments
The above spreadsheet contains seven sets of judgments for every document (Web page). The first question you'll answer is: How often do judges agree on relevance? Pick the first (leftmost) three columns of judges. There are four possibilities:
For your chosen topic (please say which one you chose!), figure out how often each of those four cases happens (both as counts and as percentages). Turn this information in. Then pick three cases in which the judgments about a particular hit are not all the same (i.e., not 000 or 111), and briefly speculate on why this may be so in these specific cases (i.e., with reference to the contents of the pages), drawing on the nature of relevance as we discussed it in class. Turn this in.
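If you would rather not tally by hand, the counting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: that you have copied the three leftmost judge columns into a list of 0/1 tuples (1 meaning relevant), and that the four cases correspond to 0, 1, 2, or 3 of the three judges marking a hit relevant. The function name `agreement_counts` and the sample data are hypothetical, not part of the spreadsheet.

```python
from collections import Counter


def agreement_counts(rows):
    """Tally how many of the three judges marked each hit relevant.

    rows: list of (j1, j2, j3) tuples of 0/1 judgments, one tuple per hit.
    Returns {k: (count, percentage)} for k = 0..3 judges saying "relevant".
    Assumption: binary judgments and exactly three judges per hit.
    """
    counts = Counter(sum(row) for row in rows)
    total = len(rows)
    result = {}
    for k in range(4):
        n = counts.get(k, 0)
        pct = 100.0 * n / total if total else 0.0
        result[k] = (n, pct)
    return result


# Hypothetical judgments for four hits (not taken from the spreadsheet).
sample = [(0, 0, 0), (1, 1, 1), (1, 0, 1), (0, 0, 1)]
for k, (n, pct) in agreement_counts(sample).items():
    print(f"{k} of 3 judges relevant: {n} hits ({pct:.1f}%)")
```

Hits that land in the 1-of-3 or 2-of-3 rows are the disagreements from which you can pick your three cases to discuss.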
2. Adjudication
Adjudication is simply the process of reconciling inconsistent judgments. Do this by simple majority voting. You do not need to turn anything in for this, but you will need the results for the third question.
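Majority voting is easy to do by eye, but a minimal sketch may help if you prefer to script it. It assumes binary 0/1 judgments and an odd number of judges per hit (for example the same three columns used in question 1), so ties cannot occur. The function name `adjudicate` is hypothetical.

```python
def adjudicate(judgments):
    """Return the majority label (0 or 1) for one hit.

    judgments: sequence of 0/1 values, one per judge.
    Assumption: an odd number of judges, so there is always a strict majority.
    """
    if len(judgments) % 2 == 0:
        raise ValueError("majority voting here assumes an odd number of judges")
    relevant_votes = sum(judgments)
    # Relevant only if strictly more than half of the judges said so.
    return 1 if 2 * relevant_votes > len(judgments) else 0


# Hypothetical judgments for three hits (not taken from the spreadsheet).
print(adjudicate((1, 0, 1)))  # two of three judges say relevant
print(adjudicate((0, 0, 1)))  # only one of three says relevant
print(adjudicate((1, 1, 1)))  # unanimous
```

Applying this to every hit for your chosen topic gives one adjudicated judgment per hit, which is what the third question needs.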
3. Evaluation of Bing and Google
Now, evaluate Bing and Google using the adjudicated relevance judgments you just created (for the topic you chose). Make sure you are pooling judgments from both systems! Turn in the following information for both search engines:
In addition, answer the following questions:
Assignment adapted from James Allan's CMPSCI 646 course (Fall 2004) at the University of Massachusetts