http://www.informatik.uni-trier.de/~ley/db/conf/sigir/sigir2002.html
SIGIR 2002
http://doi.acm.org/10.1145/564376.564390	9	Predicting Category Accesses for a User in a Structured Information Space	access accesses actions albrecht amazon applications approach based browsing chen cluster collections company conference cooley cutting data dayal deshpande discovery document dynamic efficient engineering explorations extracting fetching first from garcia gather http hypertext ieee information international internet issue jacobsen karger karypis knowledge large linking markov mining mobasher model modeling models molina multimedia needs nicholson page park path patterns pedersen pirolli pitkow predicting prediction preparation proc proceedings references request scatter scent selective siam sigir sigkdd silk srivastava stratify structures system systems trans traversal tukey usage useable user users using wide world yang zhang zukerman
http://doi.acm.org/10.1145/564376.564424	35	Empirical Studies in Strategies for Arabic Retrieval	algorithm alignment aljlayl allan analysis applications approach approaches arabic automatic ballerini banerjee based beesley beitzel berger brown buckley buckwalter butterworth callan chowdhury classification clir coling communications computational cooccurrence costello croft cross crosslanguage darwish deerwester della documents doermann dumais durand effective english estimation evaluating evens experiments feng filtering finite french frieder from furnas generation grossman harman harshman hidden holmes improved indexing individual information inquery isabella jasis jensen jones june keyword lafferty landauer language latent leek lingual linguistics london machine malin markov maryland mathematics mayfield mccarley mcnamee mercer methodologies miller mining mitra model modeling models morphological multilingual nguyen nist noamany oard omari over pages parallel parameter pedersen personal piatko pietra ponte porter probabilistic proceedings program publication publications published queries query rautiainen references retrieval riao salem salton schwartz searching semantic sheridan should sigir simard singhal smart spark special spider state statistical stemming stripping suffix suffixing system text texts thesaurus track translate translation trec using video voorhees weischedel words
http://doi.acm.org/10.1145/564376.564468	76	Content-based Music Indexing and Organization	acmdl acoustical advanced america amplitude analyzing archives atal audio automatic automatically bainbridge band berlin broad cation classi cluster coders conf cook data database digital essl europ exploiting facts fast fastl fluctuation fruhwirth genre ghias hall hearing histograms human humming icann information kohonen kosugi large libraries library maps masking merkl models modulated multimedia music musical networks neural noise optimizing organizing pampalk patterns popular practical proc properties psychoacoustics query rauber references research retrieval schroder self signals smoothed society speech springer strength symp system technology temporal towards using verlag visualization zanetakis zwicker
http://doi.acm.org/10.1145/564376.564457	65	Spatial Information Retrieval and Geographical Ontologies An Overview of the SPIRIT Project	able alani alternatively analogous analysis area been both buyukokkten cdrom collections combining conducted constantopoulos could create croft data delineate describe descriptors develop developed digital direction disambiguate displayed dlib document draw drawing editor effective evaluated expansion exploiting extent features field fields frew gazetteer general geographic geographical geometric georeferenced geospatial global hagedoorn harpring have help hill html http image implementation imprecise include information interact interest interface interfaces interpretation january jones known large lexical library lncs local location maintains many mapping mark matching mccurley measures merging method methods modal monolingual multi names navigation novel ontologies ontology opengis pages papers parallel pattern place places principles proper proximity qualitative quantitative query ranking recognition references reflect region regions relating relations relationships relevance required research retrieval revealing search selecting semantic server shape sigir sigmod similarity sintichakis spatial specification specs spirit springer state strands study system techniques techno terminology terms textually that theory thesauri thesaurus these this topic tudhope upon used user users using veltkamp visual voorhees webdb will with words years zheng
http://doi.acm.org/10.1145/564376.564423	34	Term Selection for Searching Printed Arabic	ahmed aitao aljlayl analysis analyzer appear applications approaches arabic areeb attia baird beesley beitzel bibliography building cairo chen chowdhury clir coling combining comparing computational conference cross darwish defects degradation document doermann egypt electronic engineering evens experiments faculty finite frieder generation grossman henry holmes iapr ibrahim icdar image index information international jasis jensen jones july kareem kenneth kharashi language languages large laurel malcolm martha maryland master modeling models mohamed morphological morphology oard proceedings processor publishers rautiainen recognition resources retrieval roots scale second semitic shallow state stems system systems term terms their thesis translation trec university uses video weighting words workshop
http://doi.acm.org/10.1145/564376.564391	10	Detecting and Browsing Events in Unstructured Text	access accurate achieves aggregation allan also although american amit among amount anne annual anton arbitrary architecture architectures articles associated association athens atlanta attention august australia automatic balance bergmark between broad browse browsing buckley building built cance cant carbonell carl case catherine charles chavez choose chris christian church cikm city cluster clustering clusters coincidence collocation collocations come communications compendium computational computer computing concentrated concentration conclusions conference corpora corpus could crane cunningham dana darmstadt data date dates david days deciding deep department designing detected detection development digital disambiguating displays distance distinct document documents does donna drudgery dunning dyer ecdl ects edition editor eighth enabled etzioni evaluating evaluation event events evidence exhibit explore extracting extraction fast features first focus found francisco frederick frequency from full future gain gary gather generation generic geographic geospatial gerard given greece greene gregory grouping gupta hanks have hearst historical hong hope human humanities hypertextual hypothesis ieee incorporate individual information interactive interface international interpreting into intuitive jaime james joint journal july june kansas karp kaufmann kenneth kevin khandelwal kinds know kong lack lagoze language ledge length leuski lexicography libraries library libya likelihood line linguistics linking lion london lrec machinery madani mahoney management mandar mapping maps marchionini marti mccurley mckay measurable measure measures melbourne methods milbank minimize mining mitra models more morgan multilingual mutual names navigation needs news norms november number occurrence occurrences often omid once oren other overview overviews pages particular patrick pedersen 
phrases pierce pittsburgh place places plaisant plaunt press previews proceedings processing provide rahul ranked ranking rebel recording reexamining reference references regions reliable report repr requires research resources rest results retrieval retrospective richard roanoke robert russell rydberg sally salton scalable scatter scholarly science scope seeking september shneiderman sighal sigir sigkdd signi since smith society some spatial stand statistical statistics stephan structure structuring study subcorpus subtopic successful such summarization support surprise surrogates swan system systems table tabular technical technology temporal tenth term terms testbed text that their these they thomas thought through tight time timelines topic tracking university useful user users using varying vikash visual waikato wayne which wish with within word work wulfman yang years yiming york yoselo zamir zurich
http://doi.acm.org/10.1145/564376.564449	57	Effective Collection Metasearch in a Hierarchical Environment: Global vs. Localized Retrieval Performance	ability activity actual allan alternative analysis apply assigned ballesteros based battle browse byrd callan categorization centric concept conclusions consuming contextual control could croft database databases determined digital directories directory document does domain ective ectiveness ehaviors emphasis employment energy engine environment erformance eriments exert figure flood full general given hierarchy implications individual information initial inquery integrated jasis ject judgments june libraries likely lower march materials meng metasearch modes more moreover narrower natural need nist obtained other ound over pages paradigm park placed point precision preferences previously proc productively professional provides queries query recall references relevance relevant represents restricted result resulted retrieval returned sample scenario search searches searching selected selection since source stands subsets swann system text than their there these they this thompson those thus time trec turtle under understanding usability used user users using wang were westlaw when which wise with would yang
http://doi.acm.org/10.1145/564376.564444	52	ICA and SOM in Text Document Analysis	accuracy adaptivity advances algorithms american analysis analyzed assigned averages berlin bingham both categories category cation cause chapter characterizing chosen classi classify clustering code coherent collection comon comp comparable comparison complexity component computation concept contexts corresp counted data deerwester describ dialogue dimensional dimensionality discussion document documents domain dumais dynamical each ecially ective ectively editor eginnings either elsnews endent endings entity erent errors estimated etween experiment extracting fast fastica form found fully furnas girolami grid groups hansen hard harshman have help here high hill http icann identi ieee ijcnn implementations indep independent indexing information initializations interact interscience into introduction ject jected jokinen jority journal karhunen kaski keyword keywords kohonen kolenda lagus landauer large latent letters limitations listed mapping maps massive matching matlab mcgill mcgraw meaningful method methods minimum minority modern networks neural next nicely numb oint onent onents only order organization organized organizing original ortion over paatero pages placed presented proc processing projects prop quite random randompro realized reduction references resp restucturing results retrieval rinen robust roughly runs saarela salo salton science sciences selection self semantic series sigdial signal sigurdsson similar similarity size society somewhat space sparse springer square statistically successfully summary summer systems table text that their there these they time times topic topics uctuations unit units used uses using usix verlag viola volume vote were when where which whole wiley within york
http://doi.acm.org/10.1145/564376.564397	14	The Use of Unlabeled Data to Improve Supervised Learning for Text Summarization	abstracts advances algorithm alignment american analysis anderson annual approach artificial automatic banko barzilay based berger bias biometrics bloedorn blum boardwatch both buckley cambridge carbonell celeux chains chen chuang classification classifier clustering coling collection combining comp computational computer conf conference corpora correction creation criteria data development discourse discriminant discrimination diversity document documents domain duda eacl elhadad estimation evaluation experts extracting extraction fairfax fast fifteenth first from functions gaithersburg general generating generation generic goldstein govaert group hand hart heuristics highlighting html http iaui ifip information institute intelligence intelligent interactive internet john jones journal kantrowitz kernel klavans knaus kupiec labeled laboratory language learning lexical likelihood linguistics literature logistic luhn machine mani marcu maryland matsumoto maximum mccallum mckeown mclachlan methods metrics miller mitchell mitra mittal mittendorf mixture mmac modeling moens month multiple multivariate national natural neural nigam nist nomoto nonlinear normal notes ocelot online pages paragraph passages pattern pedersen phase practical proceedings processing producing projects radev ratio recognition references related relevance relevant reordering report reranking research retrieval richardson robertson robust roth scalable scene schauble science scott search segments selecting selection semantics sentence sentences shaw sheridan sigir singhal society sons sources spans sparck spider standards statistical statistics steinhage stochastic structures strzalkowski summaries summarization summarizer summarizing symons system systems task technical technology technometrics terms teufel text theory thrun tipster trainable training trec university 
unlabeled unsupervised userfocused users using uyar versions wang weighting wiley wise with working workshop written yang york zechner
http://doi.acm.org/10.1145/564376.564429	39	Predicting Query Performance	academic advances ambiguity american analysis annual applied approach approaches automatic azzilini based basics bigi boston bowman buckley cambridge carlo carpineto cation chakraborty ciir city classi cognition combining computational conference constraints cover croft cronen culty data dekker digital duda editor editors elements employing expansion explorations foundations from general gibbons harman hart human ieee inference information informatioon innovations international interscience jarvelin jelinek john joint journal july june kalos keys kluwer know krovetz kwok language lavrenko ledge libraries locating manning marcel march massachusetts measure measurement meeting method methods model modeling models monte mori morpholgy natural ninth nist nonparametric oxford pages pattern pirkola ponte power press proc proceedings process processing publication publishers quantifying query question realization recent recognition references relevance research resnik resolution retrieval romano rorvig scene schutze science search selectional september sigir smoothing society song sons space speci special speech statistical sullivan systems techniques technology term terms text theoretic theory thomas through townsend track transactions trec university viewing volume voorhees weighting whitlock wiley wong york
http://doi.acm.org/10.1145/564376.564381	2	Analysis of Lexical Signatures for Finding Lost or Related Documents	accessibility activity answers april autonomous bell bollacker bringing caughey characterization characterizations cheap citation coetzee comments company computer conference consortium ddep december design dhtml diego digital documents edition electronic engineering evaluation everywhere extended fausey february flake functional gigabytes giles glover group harcourt highly http hyperlinks hypermedia ieee ietf indexing inet information ingham international internet introduction isdn january jarvelin ject jects journal july kekalainen krovetz kruger laboratory lawrence libraries link little locators managing masinter methods names nature networks nielsen oberholzer open oriented pages pdganswers pennock persistence persistent phelps pitkow proceedings protocol publishing questions references relevant report reports request requirements research resource reston retrieving robust science scienti searching september shafer shrivastava sigir society sollins street suite summary system systems technical technology uniform visualization voorhees weibel wide wilde wilensky with witten world zrich
http://doi.acm.org/10.1145/564376.564463	71	A Logistic Regression Approach to Distributed IR	according accumulated across advances algorithm algorithms always analog averaged axis based between boston build calculated callan center chapter chen cients collection collections common comparing conference constant contain coop cori croft cross dabney data database databases discovery disks distributed divided dividing document documents domain ecause editor emmitt equation erformance eriment estimating evaluating evaluation example figure formed french frequency from full gaithersburg garc gloss gravano harman have ideal implementation information intel internet into inverse jcdl judgements kluwer larson lecm lection ligent line logistic measures molina month multi nist numb number only optimal order ortion ossible over pages powell preliminary prey probabilistic prop queries query rank ranked ranking recall recent references regression relevance relevant research resource results retrieval root sample second selection servers sets short sigir size some source square staged study summarizes systems techniques term terms testb text that thesis they this tion tipster title tomasic total transactions trec tted university used using viles virginia were where whether with york
http://doi.acm.org/10.1145/564376.564462	70	The Boomerang Effect: Retrieving Scientific Documents via the Network of References and Citations	analogy automatic back because berlin bollacker boomerang chiaramella citation citeseer cognitive digital documentation documents effect elements entities essir exploiting field from generating giles high ideally indexing information ingwersen interaction journal larsen lawrence lectures libraries lncs loop made management needs network overlaps papers past perspectives polyrepresentation precision proceedings processing references research retrieval returning scientific scientometrics semantic sigir springer structured study system term theory time yield
http://doi.acm.org/10.1145/564376.564445	53	Improving Hierarchical Text Classification Using Unlabeled Data	aaai algorithm amounts applied appropriately articles assumption averaged bars baseline bayes below benefit between boost both boyapati cases categorization child class classes classification classifiers classify classifying collection comp compared comparison comprehensive containing create data dataset dempster described divided document documents each ecml error event examples experiments figure filter forty from hierarchical hierarchically hierarchy honours icml ignored implicit improvement incomplete independence information into journal koller labeled laird lang learning lewis likelihood main make maximum mccallum measure mitchell models naive naming neither networks neural news newsgroup newsgroups newsweeder nigam organized parent performance pool previous proceedings provided randomly references relationship remaining repeated results retrieval royal rubin ruiz sahami section separate sets shown sigir since size society split srinivasan statistical taken test text that then these thesis this thrun times topic towards training unlabeled used uses using varying very were which with words workshop would
http://doi.acm.org/10.1145/564376.564421	33	Methods and Metrics for Cold-Start Recommendations	acknowledgments actors actual against agents alexandrin algorithm algorithmic algorithms allows alternatives analysis andrew annual application applications applying approach appropriate architecture area arti associate automating available average averaging axiomatic based baselines basu bayes bayesian bergstrom billsus borchers breese budapest built called case cases categorization cation caused characterize chickering choice choices cial cient class classi clayp clustering cohen cold collab combination combines combining commerce communications community computational computer computing conclusion conclusions condli conference content cooperative corresp could croc curve curves data delos demonstrate department deploy development diagnosis digital dimensionality draw each easiest ective ectiveness electronic empirical endency environments eral erating erent erformance erforming erforms ersonal ersonality ersp escul etter evaluating evaluation exactly factors feasible feel fifteenth fifth filtering folding fortunately foster foundations fourteenth framework freund from furnas future gain generally generative genomics giles give goal gokhale good gordon grant groc grouplens have heckerman herlocker heuristic hill hirsh hofmann horvitz however human hybrid iacovou icting implement implementation implicit imputation include indexing information instance institute intel international introduction item items iyer joint jority journal kadie karypis konstan laborative latent lawrence learning lewis libraries ligence link live ltering lters machine madigan maes make makes maltz many mcgill mcgraw measure measures meek melville memory method methods metrics miller mining miranda mixed model models modern mooney mouth movie nagara nakamura national needed netnews networks news newspap odied often onding online oosted oosting orative order ortant orted osed ossible other outcome 
outp over pages part pazzani pennasp pennock pennsylvania plots posse prediction predictions predictive predictors preferences probabilistic problems proceedings prop publicly puzicha random rating real recommend recommendation recommendations recommender recommenders recommending reduction regarding research resnick results retrieval riedl rosenstein rounthwaite salton same sarwar schafer schapire schein science semantic sense seventeenth shardanand should show sigir since singer situations sixteenth soap social sophisticated sort sparse start stead study success suchak supp supported surprising system systems task tasks technical techniques tells tenth tested testing text that theory these they things this through training tting uncertainty under underp ungar university unrated usenet users using value variable varian virtual visualization webkdd weighted well when where while wide with word work workshop world
http://doi.acm.org/10.1145/564376.564461	69	Higher Precision for Two-Word Queries	after approach automatic burkowski chen chinese clarke conclusion conference coord cormack crouch document effect effectiveness english evidence experiments fagan fourth hawking holtz improving indexing info information initial jasis kwok mngmt multitext near nist nonsyntactic nostem occurrence operators original performance phrase process project proximity queries ranking references report rerank retrieval short shortest stem stge substring table text thistlewaite tipster trec very washington windw word
http://doi.acm.org/10.1145/564376.564452	60	How Many Bits are Needed to Store Term Frequencies?	analysis annual baeza baseline beaulieu bell binning bins bits changing comparing compressing conclusion conference croft dependent desc development document documents eighth ertson expansion figure frequency gatford gigabytes global hancock harman images indexing information international jones local managing modern neto nist nostrand numb number observe okapi ossible overview pages pass precision press proceedings publication query reduce references reinhold research retrieval ribiero right seventh sigir special static term text that third title trec unbinned using values voorhees walker with witten word yates york
http://doi.acm.org/10.1145/564376.564448	56	The Impact of Corpus Size on Question Answering Performance	accurate actually after answer answering answers appears artifact asymptote asymtote automatic banko both breadth brill building clarke collection conference context cormack corpora corpus correct crawling data declines determine directly discussion does dumais estimating etzioni evaluation examination expected experience exploiting further generally have high human improve indicates intensive international jork judge kwok lhotak lynam manual marking match mclearn methodology missing needed observed overview pages palmer performance possible precision quality question random range reach real redundancy references reinforced related relationship represents responses retrieval returned sampling scaling script scripts search sigir similar size slightly suggests support surrounding system systems tenth test text that then tice track traditional trec voorhees weakness weld when while wide wiener with work world yields
http://doi.acm.org/10.1145/564376.564430	40	Using Part-of-speech Patterns to Reduce Query Ambiguity	academic acquisition advances allan also ambiguity amherst analysis anick anton applications applying aspect assistant available broglio bruce bruza build building byrd callan canada center chapter church ciir cikm citeseer cluster clustering cobb combinations compared comparing comparisons computer concept conference croft cronen data database databases dawn delos dennis deriving design dexa digital directory disambiguation discovering document editor empirical engine english evaluating evaluation expansion expert exploiting feedback finding franzen from gale gather grefenstette hand hanks harding hearst hierarchical hierarchies hindle html http human hyperindex hypothesis ieee image implementation indirect information infovis inquery institute intel intelligent inter interactive interface intermediate international internet iterative james karlgren keyword kluwer know krovetz language large lavrenko lawrie ledge leuski lexical lexicon libraries ligent lighthouse line ling linguistic lists livetopics management mark marti massachusetts mcarthur mechanisms mirages modeling montreal organization pages palliating paraphrase part pedersen performance personalization perspective peter predicting press proceedings processing providing publishers quantifying queries query ranked recent recommender reexamining references reformulation relevance relevant report research resources results retrieval riao richard robert rosenberg russell samizdat sanderson scatter science search seeking seltzer sense short showing sics sigir similarity simon snapshots speech sqlet statistics stephen steve structure summarization suresh swan swedish symposium system systems tagger taking technical techniques technology tenth terminological testing text thesis tipirneni topic townsend university using verbosity visualization visualizations watch windows with word words worshop zhou
http://doi.acm.org/10.1145/564376.564481	89	Example-Based Phrase Translation in Chinese-English CLIR	alignment amta approach based beijing best bilingual books brown carbonell chinese cilin corpora each ebmt english example forest frederking from generate hong hybrid ijcai information into issue jaime journal jscl langhorne learning lexicon ming november october pair phrase pick press proceedings process ralf references result retrieval robert shanghai shiwen should simple some special substituting synonym system target template text through tongyici transeasy translated translation translingual used very viewed wang what word words xiang yang yiming zhang
http://doi.acm.org/10.1145/564376.564407	22	Statistical Cross-Language Information Retrieval using N-Best Query Translations	according acoustics adaptive added advanced after algorithm algorithms along also altos annual append appendix applications applied arti assigned assuming ation average backtrack backward baeza based bell berger berlin bertoldi best bestlist better brie brown call cambridge canada cannot carried case chain cial ciently clef combining complete complexity compression computational compute computed computer computes conference continuation continuous cost covers cross current darmstatd data decomposition della dependences determined development digital disambiguation document dynamic each easy editor editors else empty ending englewood erty essen estimating estimation european evaluating evaluation events examination examines expand expansion expansions experiments fast federico finally following forward frakes frequency from gaithersburg generation germany hall heidelberg hence hidden hiemstra huang hull hypotheses ieee incoming inform information initial initialize input insert inserted insertion insertions intel international into involves irst italian iter iteration iterations johnson jones jong jourlin just kaufmann kneser kraaij language lecture leek libraries ligence likelihood lingual linguistics link lisbon list machine markov mathematics maxe maximum mercer methods miller model modelling models monolingual morgan nding nguyen nilsson notes novel number obtained opens openset operation operations optimal ordered otherwise pages parameter partial path perform performed peters pietra pohlmann pops portugal position positions possible prentice principles probabilistic probabilities probability problem proceeding proceedings processing programming proposed query rabiner ratio readings recognition references research results retrieval reverted schwartz science score scores search second selected sentence sept sequence shown sigir signal 
simpli smoothing solution soong source speech spoken springer stack stage starting statistical step steps stochastic strategies structures structuring study table technology term testing text that then theories theory third this time times toronto total track trans translation translations trec tree trellis tutorial twenty university uses using verlag version viterbi volume weibel weischedel while whole with witten woodland word working workshop yates zero zhai
http://doi.acm.org/10.1145/564376.564420	32	Inverted File Search Algorithms for Collaborative Filtering	access addison agents algorithmic algorithms analysis annual application applying approach arti automating bayer bell billsus borchers breese buckley case cial collaborative commerce communications compressing computer computing conference cooperative database development dimensionality documents dordrecht edition editor eight eighteenth empirical experiments factors fast fifteenth file folk fourteenth framework gigabytes gordon grouplens heckerman herlocker hill human images improve indexing information integers intel international introduction inverted journal july kadie karlgren karypis kaufman kaufmann kluwer konstan language learning lewit ligence ltering lters machine madison maes maltz managing march mcgill mcgraw miller mining modern morgan mouth natural news object optimization oriented pages pazzani performing prediction predictive proceedings publisher publishing quality recommender reduction references research retrieval riccardi riedl salton sarwar searches shardanand sigir social structures strzalkowski study stylistic supported system systems trans trees uncertainty unterauer usenet using vector volume webkdd wesley williams with witten word work workshop zobel zoellick
http://doi.acm.org/10.1145/564376.564382	3	Using Sampled Data and Regression to Merge Search Engine Results	academic advances algorithms analyses annual applications approach approaches aslam australasian bailey based broglio broker buckley callan calv chang client cluster collection collections combination combining comparing computed conf conference connell craswell croft data database databases decision development digital dissemination distributed distributions document documents editor effective ellen emmitt engines evidence experiments feng fifth french from fuhr fusion garcia generalizing gloss gravano gupta hawking hierarchies impact information inquery institute international internet isolated johnson kirsch kluwer knowledge laird language large larkey learning lemur libraries logistic management manmatha merging meta metasearch mitra modeling models molina montague multiple narendra national networked networks ogilvie organized outputs over paepcke pages patent patents performance powell prey proc proceedings processing proposal publication publishers query ranking rath references regression relevance research results retrieval salton sampling savoy score scores search searching selection server sigir sigmod singhal smart space special standards stanford starts strategies strategy system systems technology tenth text theoretic thistlewaite tipster toolkit topically transactions trec using vector very viles vldb voorhees wherein wide with world yuwono
http://doi.acm.org/10.1145/564376.564492	99	Adaptive Information Extraction for Document Annotation in Amilcare	aberdeen adaptive algorithm anlp applications august author automatic bontcheva challenges ciravegna components conference copyright cream creation cunningham developing development domingue driven extraction fifth finland from gate generalisation handschuh held hirschman ieee ijcai induction information initiative intelligent knowledge kozierok language lanzoni management markup maynard metadata mining mixed motta ontology owner paper proc proceedings processing references related robinson rule seattle semantic semi sigir staab submitted support systems tablan tampere text texts their ursu vargas vera vilain washington with workshop
http://doi.acm.org/10.1145/564376.564489	96	UTACLIR - General Query Translation Framework for Several Language Pairs	airio appear august author available based bilingual categories clarity clef clir compound copyright cross dealing description descriptors design dictionary effects experimentation findings finland finnish form general german gollins gram hedlund held html http improving information informationr keskustalo keywords language languages lepp lingual lisbon matching methods monolingual morphology novel orleans owner paper pirkola portugal press problems proceedings project queries query references research retrieval rvelin sanderson sepponen shef sigir splitting springer storage structure subject swedish tampere targeted technique techniques terms tests translation triangulated utaclir variants with word words workshop
http://doi.acm.org/10.1145/564376.564417	30	Set-Based Model: A New Approach for Information Retrieval	addison agrawal algorithms alsaffar american application approaches associated association august automatic average baeza based bases basis bell between bollmann boolean boston brosis buckley burkowski cant cation causes charlote chile city clarke cliffs closed collections compressing computation computer concept concepts conference consequence content control cormack curves cystic data database databases davis deogun dependence dependencies described determination development discard discovery docs document documentation documents early effective employ employed englewood enhancing evaluation execution experiments fast feedback figure filtered first fourth framework frequency function gaithersburg generalized generating germany gigabytes hafez hall harman harper hawking identify images imielinski impact implementation increase indexes indexing information intelligent international interpretation inverted irrelevant items january journal june kaufmann knowledge large lesk library lists main management managing manipulated maryland measured memory mentioned method methods minimal mining model modeling models modern moffat morgan munich near necessity neto november number occurrence october operators opportunities overview pages penalty persin possible precision prentice presented proceedings processing proximity pruning publishers query raghavan ranking reasons reductions redundant reference references related relationships relatively research response retrieval ribeiro rijsbergen rules sacks salton santiago science sdorra second section september sets sever shaw shortest shows sigir sigkdd sigmod signi small smart society sorted space spaces sparck speci srikant statistical substring swami symposium system systems table techniques tend term terms termsets text that theoretical there third thistlewaite threshold thus tibbo time transactions trec used using usually 
values varied varying vector very volume warehousing washington weighted weighting wesley where with witten wong wood yang yates york zaki ziarko zobel
http://doi.acm.org/10.1145/564376.564482	90	Probabilistic Multimedia Retrieval	advance algorithm always annual appear applied approach automatic background ballegooij barnard bayesian behind belong belonging block both calculate centre class classes collection colour compute computer conference consideration content context covariance data dempster describe descriptions development different document documents does drawn each entirely estimate estimated estimation eurasip evaluating expectation experiments fact feature features foreground forsyth frequency from function have hiemstra however idea image images incomplete information institut international issue jong journal know lafferty laird language lazy learning likelihood lowlands management massachusetts maximisation maximum means measures minimization mixing mixture model modelling models multimedia multimodal need only optimal outlined pages parameter parameters paris part performance pictures pixels practice previous priors probabilities proc proceedings processing proportion query range references research respectively retrieval riao risk royal rubin sample section semantics series shows sigir signal simply society sources space special standard statistical taking team technology telematics term test text textual that their themselves thesis this tools trec twente under university unstructured used users using usually value values vasconcelos vectors versus video vision visual volume vries weighed westerveld what which with words work ycbcr zhai
http://doi.acm.org/10.1145/564376.564465	73	A Critical Examination of TDT’s Cost Function	above academic achieve across actual adaptive after allan arampatzis assessment assumed assumes assuming average averages based binary cannot case cation classi combining computed conf considerable constant corpus cost curve depending described detection developement different distributional distributions doddington editor engines eurospeech evaluation even event feng figure fiscus from function further goals hameren histogram ideal indicates information judgements kluwer least manmatha martin modeling number obtain optimization ordowski organization outputs overview pages performance probability proc proceedings przybocki publishers rath references relevance research retrieval score search select sept shows sigir standard stories task tasks that there they this threshold topic topics tracking training used using value variation vary volume which with
http://doi.acm.org/10.1145/564376.564403	19	Text Genre Classification with Genre-Revealing and Subject-Revealing Features	actor addition adjustments against algorithms although always american analysis analyzing anders andrew angel annual another appendix assembling assigned assistant associate association athenes attempted attorney automatic available avenue award balanced barbara based bayesian because best better between born both bretan brett brettan brown business career case categorization celebrity charset class classes classification classified classifier clustering coling collections college combined commercial common company compariosn computational computer conclusion conference confusing consider constructed copenhagen corpus correlations county course court cutting delos department detail detection determining deviation dewe digital dillon discriminant discrimination document documents done douglass easily education effect effectiveness eliminate email employed employee english examination experience experiment explore expressions faculty fakotakis fast features film first formula formulas framework frequencies friend from further future gave general genre genreclassified genres geoffrey graduate greece gushrowski hair hallberg hand hawaii height high higher hinrich home homepage html hypertext include increase increasing incrementally information institute interfaces international internet into introduced iterative itself ivan jasis johan john july jussi karlgren kessler kokkinakis korean kyoto language larger lawyer learning legal lewis libraries life linguistically linguistics links live love lower mail major makes means method methodology methods metrics more movie music mutually myaeng name national need nemlap niklas nordic number nunberg observations obtain obtained office ordinary oriented other outperformed page pages paper particular people performance personal play possibilities practical practice precision presented probably proc processing 
production professional professor proposed publication ratio ratios realized recall recognizing reduce references related research result resulting results retrieval revealing ringuette role same school science select separated series service share sigir significantly similarity simple site smaller society some specific stage stamatatos star state statistics strategy student study stylistic subject symposium system table teach teacher technology television tend terms tested text than that theatre they this threshold time train training turned uniquely university usage user using values variation visualization ways well were when with within wolkert word workshop would yang
http://doi.acm.org/10.1145/564376.564456	64	Implementation of Relevance Feedback for Content-based Music Retrieval Based on User Preferences	acoustic assp audio automatic average based baseline calculated comp compared comparison conducted content continuously data david document each eriment eriments evaluation experiments feedback foote hall have ieee information list mermelstein monosyllabic music obtained package parametric precision prentice prior proc proceedings processing randomness ranked recognition references referred relevance representations results retrieval rocchio sentences singal smart speech spie spoken system therefore times tools training trans treeq variance vectors were which wide word
http://doi.acm.org/10.1145/564376.564398	15	Generic Summarization and Keyphrase Extraction Using Mutual Reinforcement Principle and Sentence Clustering	advances algorithm amsterdam analysis appear applications applied approach authoritative automatic based becker benjamins berger bipartite bradley buckley cambridge chen chuang chunking clustering comp conf conference conroy constraint contiguity corpora correlating decomposition department dept dietterich ding document documents edited environment extracting fayyad generic ghahramani gong graph hartigan hidden holland horn http hyperlinked indexing information initial international john johnson journal kleinberg kupiec large latent learning leary lovasz machine mani marcus markov martin maryland matching matrices matrix matsumoto maybury means measure method mitra mittal modeling models multilingual neural ning nist nlpir nomoto north ocelot pages partial pedersen penalized penn pivoted plummer plus points porter porterstemmer press proc proceedings processing projects ramshaw rank references relaxation relevance report research retrieval salton segments semantic sentence shift siam sigir simon singhal soda sources spatial spectral state statistics stemming structures structuring summarization summarizer summarizing system systems tartarus technical text theory third trainable transformation under understanding univ university unsupervised usama using very with wong workshop yang zhang
http://doi.acm.org/10.1145/564376.564467	75	Modeling (In)Variability of Human Judgments for Text Summarization	approaches articles asso biased ciated classi comparison compressing computer conference consists corpus data december experimental from ieee international jose journal matsumoto means mercury mining news nomoto note pages picking press proceedings references scalable sentences several society sources splitting ssdt street subspace such summarization supervised text that unsupervised wall wang wire
http://doi.acm.org/10.1145/564376.564415	28	Efficient Phrase Querying with an Auxiliary Index	american anderson antonio assistee athens auckland australasian australia bahle based belkin bell berkeley biased bookstein browsing bruza buchanan california case challenges chiaramella chicago chile cient clarke clustering coast collections compaction compared compressing computer conference cormack craswell croft cunningham database databases davis decision dennis development digital directory document documents early ective edition editor editors eighth evaluation expansion exploring fast filtered forum francisco frank frequency gigabytes gold gutwin harman harp hawking images improving index indexes indexing information ingwersen interactive international internet inverted jansen journal kaufmann keyphrase keyword kraft kretser large leong lewis libraries lima managing mcarthur mechanisms montreal morgan nevillmanning next nextword optimised ordinateur orlowska oudshoorn pages paynter pedersen persin phrase phrases precision proc processing public queries query querying rafael raghavan ranking recherche recognition references reformulation relevance research results retrieval riao sacks salton saracevic scalable science search searching second self sept short sigir similarity society sorted space spink spire string structured structures study support symposium syntactic systems techniques term termination text their thistlewaite three transactions tudhop turtle vector vidick volume what wide williams with witten wolfram world york zealand
http://doi.acm.org/10.1145/564376.564399	16	Cross-Document Summarization by Concept Classification	advances annual association automatic barzilay based bayes beaulieu buckley carbonell chrzanowski clustering combining computational conference critical cruces database discourse diversitybased document documents electronic evaluation expository fellbaum firmin flexible gatford generic goldstein hancock harman hatzivassiloglou hearst heijden hierarchical holcombe hovy http iaui index information institute intelligence interactive intrinsic introduction jones klavans kraaij language length lexical linguistics management mani marcu maybury mckeown meeting mitra mixture model multi multidocument multiple naacl national neats news nist normalization okapi orleans over paragraph personalized pittsburgh pivoted practical press proceedings processing producing projects publication radev recent recommendation references reordering reranking retrieval review robertson robust segmentation sigir simfinder singhal slides special spitters standards stein strzalkowski summaries summarization summarizer system systems technology text textbased third tool trec trends walker wang webinessence willett wise wordnet workshop zhang
http://doi.acm.org/10.1145/564376.564484	92	A Hierarchical Approach: Query Large Music Database by Acoustic Input	acoustic acts analysis angeles approach april august based beat because chai chamberlin coding conference context contour correlation cunningham database degree digital downie each effective efficient every feng figure foote from garcia ghias henderson hidden humming icme ieee increase index indices input ismir kosugi large layer lewicki libraries library linearly logan master matrix mcnab melody memorizing multimedia music natural nature network neural neuroscience november output piece pitch practical proceedings query recurrent references representation result retrieval rhythm same september sigir similar simple size smith sounds spectrum structure system text then thesis this tokyo towards trained tune uchihashi used vercoe weight where whole witten zhang zhuang
http://doi.acm.org/10.1145/564376.564433	43	Liberal Relevance Criteria of TREC – Counting on Negligible Documents?	american annual august background belkin blair burgin cited commun conference croft development document documents effectiveness engines evaluation experiments finding full gordon harman hawking highly http inen information ingwersen international january journal judgments kantor large leong management maron measurement melbourne methodology methods nist orleans overview papers pathak performance press proceedings processing pubs references relevance relevant reliable research results retrieval retrieving rvelin saracevic scale science search seeking september seventh sigir society study system text track trec variations voorhees wide world york zobel
http://doi.acm.org/10.1145/564376.564389	8	Finding Relevant Documents using Top Ranking Sentences: An Evaluation of Two Alternative Schemes	aaai about access actual addition also alternative analysis annual approach assess assessing athens away based baseline beaulieu behaviour berger better biased borlund both cambridge cognitive collect colloquium completion components computational computer computing conclusions conducted conference content data development differentials documentation documents edit effectiveness effects electronic encouraging ended european evaluation evidence expansion experienced experiment experimental experimented experiments feature feedback felt finding fowkes from fully gather generate glasgow greater have helped helpful hill hollan implicit indications inexperienced influencing information interact interactive interface international into investigation irsg jansen jose journal leads lecture life likert loads locate madison magennis management mccandless methodology methods mittal more most moves need needs notes oard ocelot okapi only open oriented pages paper philadelphia popular possible potential potentially preferred present presented presenting press proceedings processing queries query questionnaires questions ranking read real recommender reducing references relevance relevant research results retrieval retrieved rijsbergen robertson ruthven saracevic scales science search searcher searchers searching selection semantic sentence sentences show shown sigchi sigir significance simply simulated sound spink statistical study subjects suggest summarisation summarizing summary system systems targeting task tasks techniques term than that they this those thus titles topranking trec unseen used user users using wear were which white wisconsin with without work workshop workshops wroblewski
http://doi.acm.org/10.1145/564376.564416	29	Compression of Inverted Indexes For Fast Query Evaluation	access adding addressing altos american australia baeza bell block cient codeword collection compressed compressing compression computer conference creswell croft davis decoding development document documents dublin early ecial ective edition editors elias encodings ergen exible fast filtered frequency generation gigabytes golomb harman harp hawking ieee images indexes indexing information institute integers international inverted ireland jansen journal july kaufmann kraft kretser large length ltering management managing melb memory morgan moura national navarro neub next ourne overheads overview pages persin proc processing public publication publishers queries ranking reduced references representations research retrieval rijsb sacks saracevic science searching second self sept sets sigir society sorted space spink standards systems technology termination text their theory thistlewaite track transactions trec universal vector very voorhees washington wilkinson williams with witten wolfram word yates york ziviani
http://doi.acm.org/10.1145/564376.564455	63	User-Centered Interface Design for Cross-Language Information Retrieval	analysis august australia based braschler clir conference cross davis design dictionary effects gaithersburg hackos harman hawaii hicss human improving information institute interactions interface international language lingual melbourne national ogden overview peters pirkola proceedings query redish references retrieval science setups sigir standards structure system task technology text track trec uble user voorhees wiley with
http://doi.acm.org/10.1145/564376.564441	49	Task Orientation in Question Answering	algorithms allan analysis annotation annual answer approaches automatic berger birmingham bridging buckley caruana chasm cohn communications conference croft crystal december description development document editor expansion extract freitag giles global glover gordon group harmon information lawrence learn lexical local message miller mittal nding nist pages proceedings publication query ramshaw references research retrieval salton schwartz search seventh sift sigir singhal smart special statistical stone system text that third trec understanding used using weischedel your
http://doi.acm.org/10.1145/564376.564406	21	Comparing Cross-Language Query Expansion Techniques by Degrading Translation Resources	align aligning ambiguity american analysis annual appear approach association automated available ballesteros based berger best bilingual buckley campaign cardie carol center centre chan char character chen chinese church clef clir clustering comparing computational computer concept concepts conference converting corpora croft cross crosslanguage delos development dictionaries dictionary dictionarybased diekema document emnlp empirical english europa evaluation expansion experiments feedback findings foreward form forum fourth franz fraser frie from fully global gonzolo harman hedlund hidden hiemstra http human improving indexing information international joint journal june keskusalo kraaij kwok lafferty landauer language large latent lecture level line lingual linguistics lisbon littman local machine markov mayfield mccarley mcnamee meeting methods mining mitra modeling models natural netherlands ninth nist normalization notebook notes ntcir oard online overview oxford paper papers parallel performance perspective peters phrasal pirkola ponte portugal position problems proceedings processing program project publication quantifying queries query readable references relevance research resnik resolving resources retrieval review revisited rvelin salton science score semantic short sigdat sigir sixth smart society special springer stage statistical super team techniques technology telematics text texts these thesis track translation travlang trec using utility very voorhees walz ward weischedel williams within workshop
http://doi.acm.org/10.1145/564376.564425	36	Improving Stemming for Arabic Information Retrieval: Light Stemming and Co-occurrence Analysis	abdul abdullah access acknowledgments acquisition against algorithm algorithms aljlayl analysis analyzer anzi approach arabia arabic arabiclexicons arbitrary around articles asian assessment automated automatic average backoff baeza banerjee based beesley beginning behavioral beitzel believe berlian best better bias bilingual books both brent bressan broglio buckwalter butterworths cabezas callan cambridge carlberger case center changes chapter chicago chowdhury classes clef cliffs clir clustering code cognitive coling collection comp comparable compared comparing complex computatio computational computer computing conclusions conference conflation connell consistent consisting contained contradictory cooccurrence corpus costello could croft cross crosslanguage crosslingual dalianis darwish data deal definite demonstrated department derived design designed designing detailed development dhahran dictionary difference discovery documentation documents doermann does dufresne eacl effect effective effectiveness eines ekmekcioglu elgadi empirical ends englewood english enhancement essentials european evaluation even evens expansion experiments fahd fang fares fedaghi feng fewer fifth filtering find finite finland flenner followed forms found frakes france franzosischer fraser french frequency freund frieder from gaafar gaithersburg garside generate generation german goldsmith good goweder gram grammar great greengrass grossman group hafer hall hand hassel have help helping higgins highly hill holmes hong however http hull hurt hypothesis identification identifying illinois implementation improve improved improvements improving independent index indexing individual indonesian inference inferior inflected information inquery institute intelligent international issues italian jaleel janssen jasis jensen jones journal kharashi khoja king 
klenk knutsson kong kraaij krovetz lancaster lancs language languages large larger larkey latin lavrenko learning least less letter levow lexicography lexicon lexikons light like likely linguae linguistics london lovins lund lynch make malay malaysian management many marcken maryland match matching mayfield mcculloh mcgraw mcnamee measure mechanical members method methodologies might minerals miscellaneous modeling modifying mohamed mono monolingual monz more morphe morphological morphologically morphologischer morphology morphsegmentierungssytem moulinier nasreen national natural nature news nicholas nist nodalida nonparametric nordic number oard occur occurrence ohne omari online only other over overall parallel part particularly passport pattern performance performed performers perspective peters petroleum piatko pirkola pohlmann popovi popovic porter possibility poster precision prentice presented proceedings process processes processing produced program providing qamus quantitatives queries query rautiainen reason recall references related removal removed repartitioning research results retrieval rijke rijsbergen robertson robyn roeck root roots ruled salem same saudi science sciences searches searching seem segmentation segmentierung sensitive shalabi shallow shereen short should siegel sigir significant similarity slovene small soglasnova some spanische spanischen spawarsyscen specific speech springer state statistical statistics steiner stem stemmer stemmers stemming stems still stop storage strategies string stripping structures studies study stuttgart successor sufficient suffix suffixes support supported suspect sweden swedish system systems taha tampere technology tenth term terms text texts textual than thank that their these thesis this tipster toulouse track transactions translation translations trec trends truncation turkish typology umass university unlikely unnecessary unsupervised uppsala users using variants varieties vector vega verbs verfahren 
verlag verwendung very victor video viewing vowel weischedel weiss well were west while wide wightwick willett with without word words work workshop world wortformen wortstruktur would writing yates yielded york
http://doi.acm.org/10.1145/564376.564383	4	The Importance of Prior Probabilities for Entry Page Search	algorithms amento american analysis anatomy anchor anchors annual approach approaches approximations assumption authoritative authority automatic bailey based bayes bharat brin buckley callan centre chang chemnitz city cohn collection collections combining computer conference connectivity content corresponding craswell crivellari croft cross csiro davison development different distillation distributed document documents does dublin ecml editors effective eighth engine engineering entry entrypage environment etri european experiment experiments expert exploring forty forum generation gurrin harman harper hawking heidelberg heijden henzinger hidden hiemstra hill homepage hyperlinked hypertextual identify improved independence informati information institute interactive international isdn jang jones jordan journal kleinberg kraaij kraft lafferty language large learning lecture leek length lewis link links locality machine management markov mcgraw mean melucci miller mitchell mitra mixture model modeling models moffat multi naive national nding nedellec networks ninth nist normalization notes number only orleans overview page pages park passage pivoted poisson ponte predicting press probabilistic probabilistically proceedings processing properties purpose quality query rank rasolofo ratings references relevance report research retrieval retrieving returned rijsbergen robertson rouveirol salton savoy scale schwartz science search searching seventh sigir similarity simple singhal site smeaton society some sources space sparck spitters springer stable standards summarisation systems tasks technology telematics tenth term terms terveen test text thesis topic topical track tracks trec twente twenty university using utilizing verlag volume voorhees walker weighted weighting westerveld wilkinson working workshop yonsei zhai zheng zobel
http://doi.acm.org/10.1145/564376.564386	6	Title Language Model for Information Retrieval	acknowledgements activity adesina advanced affect agreement also anonymous applied applying approach appropriate arda assumed based berger bridge brown callan collections comments computational conference contract cooperative croft cross data della development different digital directions document education effective effectiveness engineering estimation evaluate explore feedback fifth finally first foundation fourth from further future generate grant harman have helpful hidden hiemstra high information interesting jamie jones kraaij lafferty laffety language lavrenko leek library linguistics long machine many markov material mathematics mercer methods miller minimization model modeling models national nist number okapi only opinions other pages parameter part partial pietra ponte possibility proceeding proceedings program provided publication quality queries query quite references relevance research retrieval reviewers risk robertson robustness schwartz science second selection seventh several sigir similar smoothing special statistical study style summarization support supported system techniques technology term text thank that their there this title titles track translation trec twenty types under using variances verbose very voorhees where work would yang yiming zhai
http://doi.acm.org/10.1145/564376.564404	20	A New Family of Online Algorithms for Category Ranking	advances algorithm analysis annual cambridge categorization cation classi computational computer conference cristianini document eled eleventh elissee erceptron freund helmb images information introduction isri ittner journal kernel large learning lewis machine machines margin method multilab neural nevada pages press proceedings processing quality references retrieval schapire sciences shawe support symposium system systems taylor text theory univ university using vector vegas warmuth weak weston
http://doi.acm.org/10.1145/564376.564476	84	Biterm Language Models for Document Retrieval	analysis approach approximations around assistee automated based berger biterm brown buckley cambridge cardie caropreso categorization cocke computational conference croft data dell digital document elaborazione erent erformance eriments erty european experimental general hidden hiemstra information informazione international istituto jelinek language leek libraries linguistically linguistics machine markov massachusetts matwin mercer methods miller minimization mitra model modeling models montreal motivated ordinateur pages phrases pietra pisa ponte press probabilistic proceedings query recherche recognition references report results retrieval riao risk roossin schwartz sebastiani sigir singhal song speech statistical syntactic system table technical text translation york zhai
http://doi.acm.org/10.1145/564376.564471	79	Building Thematic Lexical Resources by Term Categorization	algorithm analysis application approach athens automated based boostexter boosting categorization cavagli chen cikm codes computing concept conference creating decision digital engineering evaluation ieee illinois improved information initiative integrating intel international into ject kircho know language learning ledge library ligence lrec machine magnini management manchester martinez mclean methods pages parallel part pattern press probabilistic proceedings processing references resources retrieval schapire schatz schmid sebastiani semantic singer spaces speech sperduti surveys system tagging text transactions trees using valdambrini wordnet york
http://doi.acm.org/10.1145/564376.564439	47	Automatic Classification in Product Catalogs	accuracies accuracy after agreement algorithms allan alternate appears approaches autocat automatic baeza between bishop bring buckley built categories categorization categorized category chapter choices classification classified cliffs communications computer conference confidence considerably containing cornell correct creecy data database decides definition department designed editing editor element engine engineering englewood facilitate features filtering find frakes generate hall harmon highest html http incorrectly indexing information initially instances interface invoked issue ithaca knowledge large less list lists locate manipulation masand matrices maximal measured memory minimum mips miscategorized model networks neural nist observed order output over oxford part pattern possible prentice presented press price proceedings process product productprice products publication quickly recognition references report required respectively results retrieval reverse routing salton schema schemas science scoring second services sets short smart smith solution space special standard stored structures technical tends term test text than this time trading training trec tree universal university unspsc user using value vector view viewing volume waltz weighting were when with wong xcbl yang yates york
http://doi.acm.org/10.1145/564376.564385	5	Term-Specific Smoothing for the Language Modeling Approach to Information Retrieval: The Importance of a Query Term	able academic accessibility algorithm american application applied approach approaches approximations australian automatic axioms based basis berger bigrams blanken blok bruza buckley butterworths cation centre choenni cikm clarke complex conclusion conference considered cormack cost croft cross ctit data database davis decide decision decisions default dempster design development digital discussions distance document documentation doing each ecial ective edition editor editors ehaviour eighth elds engine engines english entry ergen erience ertson erty etween experiments explicitly extended facilitating fast feedback frequent from general giles graphical hall harman harp hawking hidden hiemstra html http huib including incomplete information interactive international interpretation inverted investigating jelinek jones jong jordan journal justi keith kluwer know kraaij kraft laird language lawrence learning ledge leek lewit libraries lightweight likelihood ltering management mandatory mark markov maximum mayb measures mental methods might miller minimization model modeling models much muramatsu national nature nist note occurrences only optimisation optimization ortance orts others outness override page pages pairs persin phrases plus poisson ponte practice pratt predicting preface prentice press prior probabilistic probabilities probability proceedings processing publication publications quality queries query ranking rather ratio recognition references relating relevance representations research resources retrieval riao rijsb risk rocchio rose royal rubin sacks salton satis schwartz science search searches second seventh short should sigir similarity simple smart smoothing society some song sparck speech statistical stevens stop study such system take taken technical technology techrep telematics term terms 
text that this thistlewaite three track trade traditional translation transparent trec tudhop twenty twin typically university user users using utwente vector very volume voorhees vries walker weighted weighting westerveld which wilkinson will with word words workshop would zhai
http://doi.acm.org/10.1145/564376.564470	78	Experiments on Data Fusion Using Headline Information	across alone alternative amounts analyses analysis anne annual applying aslam assumptions atlanta based became being best better between bound calv cikm collection combination combining comparison computing conclusion conditions conference context cost cross dana data despite development different distributions document documents effective effectiveness efficient encouraging engines evidence experiment experiments feng fewer found from full fused fusing fusion gaithersburg general georgia giles have headline headlines here high ieee ignored ignoring improved improvement information inter international internet jacques javed joon july known lalmas lawrence lengths list listed local louisiana make manmatha mark markedly means merging metasearch method methods minimal mixture modelling models montague more most mounia multiple nist note november number only opening orleans other outputs overlap page pennsylvania performance performing philadelphia position presented presenting proceedings publication quite rank ranked ranking rankings rath reason references regardless relevant report reported research result resulting results retrieval retrieved returned robin round rows savoy score search searched searching section seen sets sigir similar similarity since slightly smaller sorted source speculate steve success such system systems table techniques text than that their theodora therefore these trec tried tsikrika under upper used using various volume vrajitoru were when where which words work works worse
http://doi.acm.org/10.1145/564376.564409	24	Resolving Query Translation Ambiguity using a Decaying Co-occurrence Model and Syntactic Dependence Relations	across adaptation advanced alignment ambiguities ambiguity annual approach association august australia automatic ballesteros based bian brown cache cambridge canada chen cheung chinese clarkson clir clustering coling college combining comparable computational computer conference context corpora corpus croft cross davis decaying della dependency dictionary different ding disambiguation document eacl estimation evidence expansion experiments exponentially filtering free from fung grefenstette harman huang hull icassp improving independent information integrating jang language languages lecture lingual linguistic linguistics local louisiana machine madrid mandala maryland mathematics meeting melbourne mercer microsoft mixed mixtures model models monolingual montreal msrcn multilingual multiple mutual myaeng ninth notes ogden orleans overview parameter park peters phrasal picchi pietra proceedings processing query querying references resolve resolving resources retrieval robertson robinson science selection sense sigir similar soup spring statistical syntactic system tanaka techniques term text thesaurus tokunaga track trained translation trec types using verlag voorhees walker weighting weischedel with word words workshop zhang zhou
http://doi.acm.org/10.1145/564376.564435	44	Robust Temporal and Spectral Modeling for Query By Melody	accompaniment algorithm arias arien aucouturier audio best bloomington causation clements computational computer conference dannenb dean decca disky domingo dorma duets durey engineering ersistent famous foote hidden information intel international kanazawa ligence line luciano markov melody model models multimedia music musical nessun octob otting overview pages pavarotti proc proceedings real reasoning references retrieval segmentation signals society symposium systems tenor tenors time using young
http://doi.acm.org/10.1145/564376.564408	23	Cross-Lingual Relevance Models	abiteboul able acquisition addition additional addressed advanced agency algorithm allan also ambiguity annual approach area around aspect attempt august australia average ballerini ballesteros based baseline baselines battle been belkin berger berkeley better bigrams bilingual both brown byrd callan canada cantly carry centre chrisment clir cocke combination commerce commercial comparable computational computer conclusion conference consider corpus coverage croft cross darpa data defense della dels demonstrate department development devroye dictionary digital directly disambiguation discussed docs document documentaire documentation documents does domain ecdl ects editors either encouraging erent erform erty estimated estimates estimation etudes european evaluating excellent exceptionally exhibit expansion expensive experiments explore extremely fifth figure first formal fourth francisco frei frequently from future gaithersburg gollins good graph harman harper hautes have hearst heuristic hiemstra high higher highprecision huang important impressive improve improving increase information informatique initial inquery institute integral international internationales jects jelinek jones jong journal july kaufmann kraaij kraft language lavrenko lecture left lexicon libaries like lingual linguistics list long machine massive melbourne mercer methods metric metrics minimization model models mono monolingual montr morgan most msrcn multilingual multimedia narasimhalu national need nguyen nineteenth ninth nist notes noticeably november number observe open oriented orleans other outp outperform outperforming outperforms pages parallel paris part particularly perform performance performing performs philadelphia phrasal pietra plan porter precision press previously principle probabilistic probability proceedings program proposed provide purposes queries query questions ranked ranking ranks rather 
readings recall recently references relevance relevant remain reported reprinted research resolving resources results retrieval retrieved riao right rijsbergen risk robertson roossin sanderson science search second sentence september setting sheridan short show shows sigir signi since sixth somewhat source sparck speci spider springer standards starts statistical still strategies stripping strong suggest swan switzerland system systems table target task techniques technology terms text than that these they third this though tong tool translate translation trec triangulated twentieth twenty typical uble unigrams unlabeled user using vercoustre verlag very volume voorhees weischedel well what wilkinson willett with without words work worse would zhai zhang zhou zobel zurich
http://doi.acm.org/10.1145/564376.564491	98	Query Performance Analyser — a Web-based tool for IR research and instruction	accepted acta analyser analysis applying artificial august author best boolean bridging categories colloquium cross databases descriptors doctoral education educational electronica european evaluation exact experiments finland full game gaming glasgow halttunen held http inen information instruction intelligence interactive irsg isbn issues kemppainen keskustalo laaksonen laitinen language learning march match measuring method opyright owner performance phtml pirkola pricai queries query range rapid references research retrieval retrospective rvelin scotland search september sigir singapore sormunen subject submitted sufficient tampere tamperensis teos text thesis through tool universitatis university wide workshop
http://doi.acm.org/10.1145/564376.564411	25	Document Clustering with Cluster Refinement and Model Selection Capabilities	abstraction algorithm allan american approach association baker based broadcast brown browsing carb carp cation church cient classi cluster clustering clusters collections computational conference corpus corresp critical croft cutting darpa data detection distributional document documents duda ective edition eech english ervised estimation event extending fast from full gale gather gillick goldszmidt grove hart hierarchical hierarchies hofmann html http icame icml identifying ijcai index information international inverted itad journal jplatt june karger language large lavrenko learning linguistics link lowe machine machines management mccallum means method microsoft minimal model moore mulbregt natural news nformaton nist numb ofthe ondences onell online optimization paci page pages papka parallel pattern pederson pelleg pereira pierce platt probabilistic proc proceedings processing recent references report research retrosp review sahami scatter science second sequential seventeenth sigir single society speech stork stream study supp tagged technical tests text texts tishby topic tracking training trends tukey unsup using vector wiley willett with word words workshop yamron yang york
http://doi.acm.org/10.1145/564376.564387	7	Two-Stage Language Models for Information Retrieval	abstract american analysis applied approach approaches automatic based beaulieu berger buckley chan cikm communications conference croft cross development divergence document editor editors essen estimation extended feedback gatford generation hancock harman hidden hiemstra html http ieee improving indexing information intelligence international jones journal kneser knowledge kraaij kwok lafferty language lavrenko leaving leek length machine management markov methods miller minimization mitra model modeling models nist normalization okapi pages pattern pivoted ponte probabilistic probabilities proc proceedings processing publications pubs queries query references relevance research retrieval risk robertson salton schwartz science search seventh short sigir singhal small smoothing society space sparck special stage statistical study system tenth term terms text third track transactions translation trec twenty vector voorhees walker weighting wong workshop yang zhai
http://doi.acm.org/10.1145/564376.564475	83	Does WT10g Look Like the Web?	academy algorithms also amit amsterdam andrei andrew appear april bailey broder case characterizing collection competition component components conference conferenece connected consideration craswell crawls david distributions engineering equivalent eric every examined experiments exponents farzin figure flake follow following forward found from gary giles glover graph graphs hawking hong information international into jagopalan janet kaszkiel kong kumar lawrence link links maghoul management marcin means multi national nding netherlands nick node other pages pennock peter power prabhakar proceedings processing purpose raghavan ravi raymie reachable references retrieval sccs sciences searching shows similar singhal sridhar stata steve strong strongly structure study subgraph such take taking test that their these this through tomkins trec undirected union using wccs weak weakly when wide wiener winners world
http://doi.acm.org/10.1145/564376.564438	46	Using Self-Supervised Word Segmentation in Chinese Information Retrieval	after algorithm approach average beaulieu characters chinese collection containing daily data documents equal erformance eriments erton ertson ervised evaluate experimental gatford harman huang include information irsg known management measures news numb oints okapi over pages peng people performance precision probabilistic proceedings processing query recall references relevant results retrieval retrieved schuurmans segmentation segmenter self service sets stories supervised system theory trained trec using walker williams with word words year
http://doi.acm.org/10.1145/564376.564477	85	Selecting Indexing Strings using Adaptation	adaptation adaptive aist akira analysis based borrowed cactus chance chasen church closer coling conclusion counter dictionary empirical estimates expansion fact free frequency from hirano html http ideas imaichi imamura japanese kenneth kitauchi kyoji language likely literature manual matsumoto method model modeling models morphological naist nara noriegas ntcir osamu project proposed recognition references repeated report research runs selection sigdat speech standard statistical system tatsuo technical term than that tomoaki umemura used very weighting words workshop yamashita yoshitaka yuji
http://doi.acm.org/10.1145/564376.564436	45	Video Retrieval using an MPEG-7 Based Inference Network	abou accessibility addison advanced angeles application applications applied approximately april association audio australia baeza based bimbo booth broadcast broadcasting browsing california callen chang cient cleese colloquium colombo common communication como conference consumer content crete croft database december description development digital dimension distance document documents edit electronics england essex eurasip european exible expert fatemi fawlty florence framework fuhr future glasgow gourmet graves greece harding healey huang hunter iannella identifying ieee image indexing inference information inquery interface international italy ivory jang johann journal july june kazai khaled lago lalmas language libraries lleke london louisiana ltering march mary massachusetts master media melbourne merchant metadata model modern moutogianni mpeg multimedia myaeng naphade neto network networks news night orleans pages pala papworth part pearmain probabilistic problems processing productions programs putz queen query quicker references research restricted retrieval riao ribeiro room ruthven schemes school sciences scotland searching second semantic semantics september sgml sigir signal smeaton speech standards structured substructures summer system systems technology terminal text thesis towers trees turtle university using varenna video view visual volume wang wesley with xirql yates zhang zhoo
http://doi.acm.org/10.1145/564376.564488	95	Indexing, Searching, and Retrieving of Recorded Live Presentations with the AOF (Authoring on the Fly) Search Engine	aofse approaches audio august author authoring automated based brusilowsky consist copyright dealing documents education electronic finland found freiburg have held http index informatik journal lectures ller magazine media multimedia only other ottmann owner presentation presentations reason recorded recording references replay retrieval search sigir site slides springer such syllabus system systems tampere tele text that this well with
http://doi.acm.org/10.1145/564376.564443	51	The Relationship Between ASK and Relevance Criteria	able academic according affecting although american annual anomalous applies asis attributed automatic barry based basis bateman bear been belkin bring brooks came canadian characteristics choosing classification comparison conclusions conference confirms convincing cool core criteria criterion cross currency data definition design developing development differences different direct discussion diverse document documentation documents during dynamic effectiveness eisenberg evaluating evaluation exists explanations find first formal fourteenth france frequently frieder grenoble hapeshi have highly idea identify importance important information instead interactive interest italy journal judgments kantor know knowledge kwasnik learned least line management measurement medford mediated meeting modeling moment more most national need nilan number oddy online part person pisa preliminary problem proceedings processing purely quite reexamination references related relationships relevance representation representations research results retrieval riao saracevic scale schamber science search searching shown similar situational small society sources spink statements states strategies structural studies study subjects suggested system terms texts that their there these they this time topicality topics toward types used user users using variety were when which wide williams would
http://doi.acm.org/10.1145/564376.564447	55	)	aberdeen able adaptive amilcare applications applied artificial august bontcheva challenges ciravegna components conclusion conference cope cunningham designed developing development different eacl extraction fifth france from gate generalisation hirschman human iaui ieee ijcai induction information initiative intelligence intelligent international joint july knowledge kozierok language major management maybury maynard meeting message mixed natural nist november porting proc processing projects purposes recoding references related requires requiring robinson rule seattle skills systems tablan technology text texts that their tool toulouse types understanding ursu vilain washington with without workshop
http://doi.acm.org/10.1145/564376.564413	27	Probabilistic Combination of Text Classifiers Using Reliability Indicators: Models and Results	aaai accv additional advances algorithm algorithms american among approach apte arti asian ault automatic bartell bartlett based bayes bayesian belew belkin boostexter boosting butterworths callan categorization cation chen chickering cial cikm classi collaborative combination combining comparison comparisons computer conference contactinfo content cool corporation corrigendum cottrell croft cross damerau dasgupta data daviddlewis dependency development distribution dmax document documents dumais ecml editor editors effect effective environments event examination fall fawcett features forum fourth frakes fusion gale generalization goets hampp harman head heckerman hierarchical horvitz html http hull icml ieee imprecise index inductive inference information integration intelligence intelligent issues jackson jain january joachims johnson journal kadie katzer kofahi large larkey learning lewis likelihood linear local london ltering machine machines manual many margin maximizing mccallum mcgill meek meta method methods microsoft mining modality models multiple naive networks neural nigam nist notes number oles outputs overlap pages papka pedersen performance pierce platt press probabilistic provost publication query rajashekar ranked references regularized relevant representations research resources retrieval reuters rijsbergen robust rounthwaite sahami schapire scholkopf schuetze schuurmans science searches sequential shaw sigir singer smola society special stacked strategies structure study support system systems technology tessier testcollections text ting toolkit toyama tracking training travers trec tyrrell vacher validation vector vision visualization weiss winmine with witten wolpert working workshop yang
http://doi.acm.org/10.1145/564376.564474	82	Automatic Evaluation of World Wide Web Search Services	above algorithms amit among automated avoid buckley calculated case challenges changes chris cmis commerce completely computer conf confidence constraints constructing craswell csiro described details directory dmoz document each effectiveness eighth engine engines error evaluation examine exceeds finding first foobar from gives gordon guidelines harman hawking html http ieee information intended interval jaideep jansen jasis july kaszkiel last leighton like limiting list mailing marcin matched million more needed open original pages pairs pathak population precision proc process project queried query references resource respective results retrieval sample sampling saracevic search services sets since singhal size spink srivastava study table tenth thistlewaite those track trec trecweb trivially using vernon were wide wolfram world would
http://doi.acm.org/10.1145/564376.564401	17	Unsupervised Document Classification using Sequential Information Maximization	about above accuracy adaptive agglomerative algorithm algorithms also among analysis annual appendix arti association assume assumption asymptotic automatic average averaged based becomes bekkerman belong best between bialek bottleneck browsing butterworths categorization cation centroid characterized cial class classes classi clearly cluster clustering clusters columbus communication computation computational conditional conf conference context convenience correlated cover demonstration denote denoting described developments disjoint distortion distribution distributional distributions divergence document documents does double each ecir ects editor eguchi elements english entropy equivalent ergen ervised etzioni european every except expanded feasibility feature fine following frequencies friedman gather give given graphical hard hearst hence hinton ieee increased incremental incrementally independence induces information informative insight intel into iterative jects jensen john jordan justi labeling lang learning length lerton less ligence linguistics london loquium lter machine markov maximization maximizes means measures meeting membership method micro models more mosenzon multivariate mutual neal netnews neural nips nition nonetheless notice ohio only other othesis over ower pages parenthesis particular partition partitions pedersen pereira perfect perfectly performance precision preliminary preparation present preserved prior probabilities problem proc processing proposition provide queries reexamining references refers relates relative repeated represent representation represented research respect restarts result results retrieval reuter rijsb roughly salton sample scatter scheme schriebman science section seeking semi setting severe shannon sigir similar size slonim smal small some sons sources souroujon space sparse speaking speci 
strong such supervised systems table tests text that then theoretical theory therefore these this thomas through thus tishby transactions true turn typically uncertainty under unknown unsup using variants view violation well where which while wiley winter with word words yaniv york zamir
http://doi.acm.org/10.1145/564376.564380	1	Impact Transformation: Effective and Efficient Web Retrieval	annual australasian australia badue baeza chile conference croft database development distributed early editor editors effective effectiveness harper impact improved information international inverted january kraft kretser laguna melbourne moffat navarro neto november orleans pages partitioned press proc processing query rafael ranking references research retrieval ribeiro september sigir space string symp termination through transformation using vector with yates york zhou ziviani zobel
http://doi.acm.org/10.1145/564376.564394	12	Improving Realism of Topic Tracking Evaluation	academic activity addison adjust advances after algorithm allan allow also alternative although amherst approaches arriving articles assumed automatic based basis batch batched batches beaulieu best boston built cambridge cant case catory cieri classi clustering computer conference context corpora cost cuments currently data demonstrated dent detection determined digital directing discussed distributions document documents doddington during eacl early ective editor editors elapses enough erent evaluating evaluation event examined examines examining experiments feedback finally firmin fiscus form frey from gale gatford general generating gives grossly hancock harm harman hersh hierarchical higher hirschman house however hull impact improves information interactive interacts interested international investigating jones journal judgment judgments khandelwal klein kluwer lance lavrenko less leuski lewis liberman libraries library list ltering mani martey massachusetts maybury mckeown mind models more much multiple navigation necessary needing news next nist notebook noticeably numb okapi once ones only optimize order organization output over overview pages paper participants penalty performance phase pmiss positive precision presentations presented press proc proceedings processing proved providing publication publishers putting radev ranked reasonable references relevance relevant remainder rennert report requested results retrieval retrieved robertson salton score sees sequential should showed sigir signi similarity small smaller sorting stays stops strassel strategies such summac summaries summarization sundheim system systems table techniques text than that their theory there these thesis third this those time times tipster topic towards track tracking training trec umass university user uses using visual voorhees walker week were wesley when williams with workshop would
http://doi.acm.org/10.1145/564376.564393	11	Novelty and Redundancy Detection in Adaptive Filtering	acknowledgments adaptive allan american annual applied approach approaches based broadcast brown callan carb carthy cation chengxiang chun classes classi combining conferenc conference cument ddington deling dels detection detetion development discussions distributional documentation dragon eighteenth eighth ertson erty estimation evaluation event feedback filtering first franz furnas hard helpful hidden hiemstra hierarchy hull improve improving information international ittycheriah john jones journal knecht know kraaij language lavrenko learning ledge leek likeliho line lteirng ltering machine management markov maximum mccallum mccarley measures metho miller mitchell mulbregt news ninth novelty onell othing pages papka pictures pilot pohlmann proc proceedings references relevance report research retrieval rosenfeld schwartz science semantic setting shrinkage sigir similarity society spitters stokes story study syntactic system systems technology tenth text thank their threshold thresholds topic track tracking transcription trec twenty understanding using very ward workshop yamron yang yiming zhai zhang
http://doi.acm.org/10.1145/564376.564427	37	Automatic Query Refinement using Lexical Affinities with Maximal Information Gain	achievable adding allan analysis annual application approach april assistee associates association australia automatic based beaulieu bigi buckley callan cambridge candidate carpineto change choi clearly combining communication conclusions conference constraints construction contain corpus croft crouch databases demonstrates denmark describ development distributions document documents dublin editor editors endency engines englewood enhagen equivalent erent erformance eriments ertson ervised estimation exactly expansion experiments extension feedback feng figure filtering focuses full function gatford gauch global hall hancock hand harman have hawking hearst html http ideal illinois impact improved improving indexing information institute international invocation ireland january jarvelin jing jones july june kang kekalainen language level lexical libraries likelihood local lower ltering maarek management manmatha mathematical maximum melb method methods microsoft mitra modeling mori multiple mutual national natural nement ninth nist nities note okapi ordinateur original orleans other others ound ourne outputs overview pages precision prentice press proceeding proceedings processing publication pubs quality queries query rachakonda random ranked ranking rath recherche references relations relevance relevant rerieval research results retrieval riao rocchio romano salton score scoring search selection semantic septemb seventh shannon short sigir simple singhal smadja smart software special standards statistical strong structure symposium system systems technique technology terms text that theoretic theory thesaurus these third this thresholds thus track transactions treated trec twelfth university unsup urbana used using voorhees walker wang weaver which with yang york zhang
http://doi.acm.org/10.1145/564376.564428	38	Web Question Answering: Is More Always Better?	aaai abney agichtein analysis anlp annotation annual answer answering answers applied askmsr banko bases blair brill brown buchholz chen cikm clues coden collins conference data development diekema domain dumais eacl emnlp empirical engine expressions extraction frequencies from goldensohn grammatical gravano harabagiu harman high information intensive international knowledge language lawrence learning liddy management mccracken methods mining natural ninth nist open ozgencil pasca patterns performance potential prager predictive proceeding proceedings processing publication query question questions radev references relations research retrieval right search series sigir singhal soubbotin special specific spring symposium system taffet tenth text transformations trec using voorhees wide workshop world yilmazel zhang zheng
http://doi.acm.org/10.1145/564376.564419	31	Collaborative Filtering with Privacy via Factor Analysis	aaai agents algorithm algorithmic algorithms analysis application applications approach architecture arti asplos august aware based basu better bindel bishop borchers breese canny case cation chen cial classi claypool cohen collaborative combining commerce communication conference content context czerwinski data demonstration dempster detreville diagnosis digiovanni dimensionality dimensions eaton edition editor empirical environments evaluation factor filtering fourth framework free frey from full geels ghahramani global gokhale goldberg good goteborg graphical guilford gummadi gupta handbook heckermen herlocker hirsh horvitz hybrid iaai ieee ijcai incomplete information innovations intel introduction jester john jordan journal kadie karypis konstan kubiatowicz laird language lawrence learning length ligence likelihood linear ltering lters machine maximum memory methodological microsoft mining miranda model models murnikov narita natural netes neural newspaper november oakland oceanstore october online pages paper pennock performing persistent personal personality pervin popescul poster predictive press privacy probabilistic proc processing questionnaires recommendation recommendations recommender reduction references report research rhea riedl rogers royal rubin sartin sarwar scale schafer seattle security sept series session sigir social society some sparse statistical stockholm storage study submitted sweden symposium system systems taxonomy technical techniques theory time turbo ubicomp uncertainty ungar using usion weatherspoon webkdd weimer wells with workshop zhao
http://doi.acm.org/10.1145/564376.564395	13	Bayesian Online Classifiers for Text Classification and Filtering	accurate active addition advances algorithm algorithms although amherst analysis annual appendix approach approaches arti automatic automomous average averages based basis bayesian binary bohr buckley burges butterworths calculate callan cant carlo case categorization cation cauwenb chapman chapter choice choose cial classi coincidence collection combridge computation computational computations computer conditioned conditioning conference connect csat data decremental department describ development dietterich dimension drucker dunning ecause ecml ecome ects edition editor editors effects endence enhagen equal erations erceptron erence erent ergen erghs eriments ertson estimated etween european evaluating evaluation evidence evidences examination examples feature features feedback field flip follows formulation framework from function functions gaussian general gibb gives hall having hence here hersh hickam highest hold hull ijcai illustrates implementation improves incremental information institute intel interactive international interp january jective jitter joachims joint kernel kernels large learning leen leone less letters lewis ligence likelihood likely limited line linear linguistics logit london lost ltering machine machines mackay macro making management many massachusetts matrix maxf mean measure method methods micro model models monte more much multiple nature neal needed networks neural niels nips nite noise noted numb nystr ohsumed olation online optimal optimizing osterior otherwise over pages papka parameter parameters passes perceptron physical poggio practical preliminary press probit proceedings process processes processing random recommended references regression relevance relevant representation research results retrieval reuters review rijsb saad salton scale schapire scholkopf science section seeger selected selecting selection sets shahrary show sigir signi similar sizes small smola snell solla space sparse springer statistical statistics strategies study suggested sung supp support surprise syed systems table task technical term test text than that their theory there thesis this threshold thresholding thresholds through toronto track training treating trec tresp university usability using vapnik vector vectors versus volume weighting where while williams winther with without workshop yang york
http://doi.acm.org/10.1145/564376.564469	77	Relative and Absolute Term Selection Criteria: A Comparative Study for English and Japanese IR	appear categorization combining cross direct documentation ecir english erent evaluation evidence expanding expansion feedback flexible four hearst http information irex irsg isahara japanese ject journal keenbow kids language lrec management mapping methods model ntcadm ntcir ogawa okapi onlineproceedings optimization probabilistic proceedings processing pseudo queries query references relevance report requests research retrieval robertson sakai search sekine selection sigir site structuring tables term toshiba tracks trec using walker workshop xerox