1
|
- LBSC 796/INFM 719R
- Douglas W. Oard
- Session 7, October 22, 2007
|
2
|
|
3
|
|
4
|
|
5
|
|
6
|
|
7
|
- Use ratings to describe objects
- Personal recommendations, peer review, …
- Beyond topicality:
- Accuracy, coherence, depth, novelty, style, …
- Has been applied to many modalities
- Books, Usenet news, movies, music, jokes, beer, …
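As a rough illustration of how ratings can drive personal recommendations, the sketch below predicts one user's rating for an unseen item from the ratings of users who have agreed with them in the past. The slides do not prescribe a method; the similarity measure (based on mean absolute rating difference), the function names, and the sample data are all assumptions made for this example.

```python
def similarity(a, b):
    """Agreement between two users' rating dicts, in (0, 1].

    Illustrative choice: 1 / (1 + mean absolute difference) over the
    items both users rated; 0.0 when they share no rated items.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_diff = sum(abs(a[i] - b[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + mean_diff)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for item.

    Returns None when no similar user has rated the item.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        w = similarity(ratings[user], their_ratings)
        weighted_sum += w * their_ratings[item]
        weight_total += w
    if weight_total == 0.0:
        return None
    return weighted_sum / weight_total


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"book1": 5, "film1": 1, "joke1": 4},
    "bob": {"book1": 5, "film1": 1, "joke1": 5, "beer1": 4},
    "cat": {"book1": 1, "film1": 5, "beer1": 1},
}
print(predict(ratings, "ann", "beer1"))
```

Because ann's past ratings agree with bob's far more than with cat's, the prediction for `beer1` lands much closer to bob's rating than to cat's.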
|
8
|
- Self-interest
- Use the ratings to improve system’s user model
- Economic benefit
- If a market for ratings is created
- Altruism
|
9
|
- A1: User has sufficient knowledge for a reasonable initial query
- A2: Selected examples are representative
- A3: The user will try giving feedback
- A4: The user will keep giving feedback
|
10
|
- Two problems:
- User may not have sufficient initial knowledge
- Few or no relevant documents may be retrieved
- Examples:
- Misspellings (Brittany Speers)
- Cross-language information retrieval
- Vocabulary mismatch (e.g., cosmonaut/astronaut)
|
11
|
- There may be several clusters of relevant documents
- Examples:
- Burma/Myanmar
- Contradictory government policies
- Opinions
|
12
|
- Efficiency
- Longer queries require more processing time
- Understandability
- Harder to see why subsequent documents were retrieved
- Risk
- Users are reluctant to provide negative feedback
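To make the efficiency and understandability costs concrete, here is a minimal sketch of one well-known form of feedback-driven query modification, the Rocchio formula. The slides do not name a specific method, so treating Rocchio as the technique, along with the weights `alpha`, `beta`, `gamma` and the toy documents, is an assumption of this example.

```python
from collections import Counter


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query as a term -> weight dict.

    query and every document in relevant / nonrelevant are
    term -> weight dicts. Terms from relevant documents are added,
    terms from nonrelevant documents are subtracted.
    """
    new_query = Counter()
    for term, w in query.items():
        new_query[term] += alpha * w
    for doc in relevant:
        for term, w in doc.items():
            new_query[term] += beta * w / len(relevant)
    for doc in nonrelevant:
        for term, w in doc.items():
            new_query[term] -= gamma * w / len(nonrelevant)
    # Terms whose weight is not positive are dropped, a common simplification.
    return {t: w for t, w in new_query.items() if w > 0}


# Hypothetical one-term query and judged documents.
query = {"cosmonaut": 1.0}
relevant = [{"cosmonaut": 1.0, "astronaut": 1.0, "soyuz": 1.0}]
nonrelevant = [{"cosmetics": 1.0}]

expanded = rocchio(query, relevant, nonrelevant)
print(len(query), "term ->", len(expanded), "terms")
```

The one-term query grows to three weighted terms after a single round of feedback. That growth is the source of the extra processing time, and the added terms are why later results are harder for the user to explain.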
|
13
|
|
14
|
- Maximize the value
- Provide for continuous user model adaptation
- Minimize the costs
- Use implicit feedback rather than explicit ratings
- Minimize privacy concerns through encryption
- Build an efficient scalable architecture
- Limit the scope to noncompetitive activities
|
15
|
|
16
|
|
17
|
|
18
|
|
19
|
|
20
|
- Sometimes yields unexpected results:
- Google ranked Microsoft #1 for “evil empire”
|
21
|
- Browsing histories are easily captured
- Make all links initially point to a central site
- Encode the desired URL as a parameter
- Build a time-annotated transition graph for each user
- Cookies identify users (when they use the same machine)
- Redirect the browser to the desired page
- Reading time is correlated with interest
- Can be used to build individual profiles
- Used to target advertising by doubleclick.com
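The redirect-based capture described above can be sketched as follows. This is a simplified in-memory model, not a web server: the endpoint `log.example.org`, the parameter name `url`, the `ClickLog` class, and the use of a plain string in place of a cookie value are all hypothetical choices for illustration.

```python
from collections import defaultdict
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical central logging endpoint (not from the slides).
REDIRECT_BASE = "https://log.example.org/go"


def wrap_link(target_url):
    """Point a link at the central site, carrying the desired URL
    as a query parameter."""
    return REDIRECT_BASE + "?" + urlencode({"url": target_url})


def unwrap_link(wrapped_url):
    """Recover the desired URL from a wrapped link."""
    return parse_qs(urlparse(wrapped_url).query)["url"][0]


class ClickLog:
    """Per-user, time-annotated transition graph built from redirect hits."""

    def __init__(self):
        # user_id -> list of (from_url, to_url, seconds_between_clicks)
        self.transitions = defaultdict(list)
        # user_id -> (last_url, time_of_last_click)
        self._last = {}

    def handle_click(self, user_id, wrapped_url, timestamp):
        """Record the click and return the URL to redirect the browser to.

        user_id stands in for the value of an identifying cookie.
        """
        target = unwrap_link(wrapped_url)
        if user_id in self._last:
            prev_url, prev_time = self._last[user_id]
            self.transitions[user_id].append(
                (prev_url, target, timestamp - prev_time)
            )
        self._last[user_id] = (target, timestamp)
        return target


log = ClickLog()
log.handle_click("cookie-123", wrap_link("http://a.example/page1"), 0.0)
log.handle_click("cookie-123", wrap_link("http://a.example/page2"), 42.5)
print(log.transitions["cookie-123"])
```

The time between consecutive clicks is only an upper bound on reading time for the earlier page, since the user may have been doing something else in between.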
|
22
|
|
23
|
- User selects an article
- Interpretation: Summary was interesting
- User quickly prints the article
- Interpretation: They want to read it
- User selects a second article
- Interpretation: Another interesting summary
- User scrolls around in the article
- Interpretation: Parts with high dwell time and/or repeated revisits are interesting
- User stops scrolling for an extended period
- Interpretation: User was interrupted
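The interpretations above can be expressed as simple rules over a time-ordered event stream. The sketch below is one possible encoding under stated assumptions: the event format `(timestamp, action, detail)`, the action names, and both thresholds (`quick_print`, `interrupted_after`) are invented for illustration and would need tuning against real behavior.

```python
from collections import defaultdict


def interpret(events, quick_print=10.0, interrupted_after=300.0):
    """Turn observed behavior into interpretations.

    events: time-ordered list of (timestamp, action, detail), where
    action is "select" or "print" (detail = article id) or "scroll"
    (detail = label of the part scrolled to).
    Returns (notes, dwell): notes is a list of (detail, interpretation);
    dwell maps each part to its accumulated viewing time in seconds.
    """
    notes = []
    dwell = defaultdict(float)
    selected_at = {}  # article id -> time it was selected
    for i, (t, action, detail) in enumerate(events):
        if action == "select":
            first = not selected_at
            selected_at[detail] = t
            notes.append((detail, "summary was interesting" if first
                          else "another interesting summary"))
        elif action == "print":
            # Printing soon after selection suggests intent to read.
            if detail in selected_at and t - selected_at[detail] <= quick_print:
                notes.append((detail, "user wants to read it"))
        elif action == "scroll" and i + 1 < len(events):
            # Time until the next event approximates time on this part.
            gap = events[i + 1][0] - t
            if gap > interrupted_after:
                notes.append((detail, "user was interrupted"))
            else:
                dwell[detail] += gap
    return notes, dict(dwell)


# Hypothetical session.
events = [
    (0, "select", "A"), (5, "print", "A"),
    (20, "select", "B"), (25, "scroll", "B:intro"),
    (40, "scroll", "B:methods"), (100, "scroll", "B:intro"),
    (110, "scroll", "B:results"), (1000, "select", "C"),
]
notes, dwell = interpret(events)
print(notes)
print(dwell)
```

Long pauses are excluded from dwell time rather than counted as interest, which follows the last rule above: an extended stop is more likely an interruption than close reading.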
|
24
|
|
25
|
|
26
|
- Protecting privacy
- What absolute assurances can we provide?
- How can we make remaining risks understood?
- Scalable rating servers
- Is a fully distributed architecture practical?
- Non-cooperative users
- How can the effect of spamming be limited?
|
27
|
- Observe public behavior
- Hypertext linking, publication, citing, …
- Policy protection
- EU: Privacy laws
- US: Privacy policies + FTC enforcement
- Statistical assurance of privacy
- Distributed architecture
- Model and mitigate privacy risks
|
28
|
|
29
|
- http://hannu.biz/aolsearch/
|
30
|
|