http://www.informatik.uni-trier.de/~ley/db/conf/sigir/sigir2006.html SIGIR 2006 http://doi.acm.org/10.1145/1148170.1148234 47 Type Less, Find More: Fast Autocompletion Search with a Succinct Index access aggarwal aggregation algorithm algorithms aligned alstrup approach arge autocompletion baeza beaulieu bell bickel binary brodal browsing buchanan case clarke codes collections commonsense communications complete completion complexity compressed compressing compression comput computer computing concept conference context craswell croft cunningham darragh data database development digital dimensional documents ecml edition entry erman ertson european expansion faab factors fagin fast ferragina file files finkelstein focs forum foundations full gabrilovich gaede gatford gigabytes grabski gunther haider human humanized ieee images index indexability indexing indri information input integers international intersection inverted jakobsson james journal kaufmann keyb koudas large learning lecture lexical libraries lieb lotem machine managing manzini matias method methods metzler middleware moffat morgan multidimensional muthukrishnan naor notes oard okapi optimal oroff orthogonal output pages payne paynter placing pods predictive principles problems query ramamohanarao range rauhe reactive references related relations research retrieval revisited rivlin ruppin samoladas scalable scheffer science search searching self semantic sentence sentences sequences sigir signature solan sorted sorting srivastava stocky strohman structures study substring surveys symposium syst system systems terabyte text track transaction transactions trec turtle typing using versus vitter voorhees walker wide williams witten wolfman word world yates http://doi.acm.org/10.1145/1148170.1148218 34 Evaluation in (XML) Information Retrieval: Expected Precision-Recall with User Modelling (EPRUM) addison annual artificial aslib assessments assist assisted august australia avignon baeza billingsley 
bollmann breese brin bringing buckley causal citation cleverdon common computer conceptions conference content copenhagen cranfield critical danemark development devices digital documentation dortmund editors effectiveness effort evaluating evaluation expected france francisco fuhr gallinari geva graded group heckerman independence index inen inex information intelligence international investigation irrelevance jasis journal judgments july jung kaufmann kazai lalmas language lawrence library look malik measure measurement measures melbourne metrics modern morgan motwani neto nist number order ordinateur oriented overlap pagerank pages performance piwowarski precision predefined press principle probability problem proceedings project publishers raghavan ranking ratio recall recherche reconsidered references relevance relevant report research retrieval riao ribeiro robertson rvelin saracevic science second sheffield sigir stability stanford structured system systems technical technologies tenth tests text tolerance transactions trec twelfth uncertainty unit units university user using variations vert volume voorhees vries wesley wiley winograd without woodley yates york http://doi.acm.org/10.1145/1148170.1148273 77 Clustering of Search Results using Temporal Attributes annotated annotation applications approach associated attributes augmentation aula baeza based bender bontcheva cluster clustering compute computing construction contains create cunningham data demonstration development document each elements engine enters environment etzioni experienced expressions extract feasibility first flynn form fragments frames framework from future gate graphical group habel http ieer index information internal involves jain jhaveri journal july kaki koen list load main maynard merging messages murthy news nist oracle original phase phases proc processing query reaccess references retrieves retrieving robust satisfy schlider search searching second semantic separate sigir 
snippet spatial strategies structure such summary survey surveys system tablan tagging temporal tests that then time title tools topics tree used user users using version view when with workshop xpath yates zamir http://doi.acm.org/10.1145/1148170.1148285 89 Term Proximity Scoring for Ad-Hoc Retrieval on Very Large Text Collections accumulator affect approach april based beaulieu chunks collection conference containing difference document documents each ecause ecir equation eriments ertson european experimental followed gaithersburg gatford hancock impact into jones keyword limited neighb novemb okapi only original oring other pages proceedings proximity query random rasolofo references research results retrieval same savoy score scoring seventh smaller split strategy systems term text that third trec walker weight where http://doi.acm.org/10.1145/1148170.1148330 134 Why Structural Hints in Queries do not Help XML-Retrieval appendix available conference duisburg effort evaluation extended fuhr http inex informatik information initiative international january kamps kazai knowledge lalmas malik management marx narrowed nexi pehcevski proceedings queries references reliability retrieval rijke rmit rnsson sanderson sensitivity sigir sigurbj structured system tahaghoghi thom trotman university workshop xpath zobel http://doi.acm.org/10.1145/1148170.1148254 63 Graph-based Text Classification: Learn from Your Neighbors algorithm algorithms analysis approximation available barrier based bioinformatics both breaking categorization catergorization cesses chakrabarti challenge chang class classification classifier classifiers cohen cohn combining computer concepts conditional connectivity content continuous coupling cument current customers data dataset dblp degree dels dictionaries discovering distance domingos ecml edge edia eling enhanced entities entity eppstein erlinks ertext especially estimates exploiting explor extraction feldman field fields finding focs foundations 
from geto graph guide helps hofmann hypertext icml ieee image improving incrementally indyk influence information initial initialization integration kamb kauffman kaufmann kleinb know label labeling lafferty learning ledge less library libsvm link links machine machines markov mccallum metho metric mining missing modeling more morgan moto multi myaeng named network newsl nips note observe only pairwise paths pelkowitz pereira performance poor positive powerful practical press probabilistic probability problems proc prominent random references relations relationships relaxation research results richardson role same sarawagi science searching segmenting semi septemb sequence shatkay shortest show shown sigir sigkdd sigmod significant similarity size small springer state still supp susceptible symp syntax table tardos techniques tendency there through training tutorial using value variant vector verlag very washio weighting weng when which wikip wikipedia with wmgc would york http://doi.acm.org/10.1145/1148170.1148205 24 Adapting Ranking SVM to Document Retrieval active advances agarwal algorithm altavista american analysis annual approach automatic based bengio boosting bounds brockhausen burges cambridge care case center changes cliffs collaborative combining commerce computer conference data development digital document efficient englewood experiments exploiting extensions first freund from gallinari gaussian generalization ghahramani grangier hall henzinger hyperlinks icml ieee information intensive international iyer jansen joachims journal knowledge kpartite lafferty language large learn learning levin machine marais margin massih minimization model models monitoring moricz morik neural nips ordinal outputs pages preferences prentice press principle problems proceedings processes processing public queries query rajaram ranking regression report research retrieval risk salton saracevic schapire science search searching semi shashua sigir silverstein singer smart 
society spink statistical structured study supervised system systems taxonomy technical technology their tresp truong unlabeled usunier very with wolfram workshop zhai http://doi.acm.org/10.1145/1148170.1148225 40 Generalizing PageRank: Damping Functions for Link-Based Ranking Algorithms about acknowledgements adamic addison advance affect affected albert algorithms along also analysis analyze another appl applications applied approach approximating approximation arasu arbitrary asist assumption athens attributes australia authoritative authority authors baeza barabasi based beach because becchetti bharat billion boldi bollobas branching brin bringing brinkmeier broad browsing calculating calculations cannot carefully castillo catedra change characteristic characterize chen chiba chosen chung cikm citation clara class classifications clearly closer clustering cocoon collection collective collude colluded collusion combinatorica combined comp compiegne complex computation compute computed computing conclusions condition conditional conf connecting contribution correct cost could counting crawlers cutoff damping daniel davis davison defined degree degrees delis depend derived detection diameter different difficult digital discussion distillation distributions diverse donato dynamically dynamics easier easy effectiveness eigenvalue eiron engine engines environment estimating evaluation evolving exponential fabra factor factors faster figure find fixed flow focs fogaras found framework france free from frontier function functional functions future garcia general generated germany given global good google grams graph graphs greece groups grow gulli haas have haveliwala hawaii helps henzinger here high hill hits honolulu huberman hubs hyper hyperlink hyperlinked icrea ieee iics importance improved improving include index indexable information instance internet involve iral iterations iterative japan jeong journal july june kamvar katz kleinberg known kumar large leipzig 
lengths leonardi less lexical libraries lifantsev like linear linearrank link linkanalysis links lncs local locality logarithmically many marchiori markov math matrix matter mccurley mcgraw measure melbourne menczer method methods model models modern modify molina moment more moreover motivated motwani multivalued myaeng nature necessary need needed neto networks nevada newman nonlin number obtain occurs order orderings other over paepcke page pagerank pages pandurangan pant paper papers parameters part particular path paths phys physica physics plan pompeu poster posters precision presented press probabilistic probably problem proc propagation provide psychometrika qualitative queries query quest raghavan rajagopalan random rank ranking rankings reaching real redondo references report research resources rest results retrieval revisited ribeiro riordan rudin santa santini scale scores search searching second semantic sensitive setting show sigir signorini similar similarity singapore sivakumar size small smaller social sociometric soft sources spam sparse special springer srinivasan stanford start stat statistical status stochastic strogatz structure such suel support tech techniques telefonica than thank that their them theoretical there this those time toit tomkins tomlin topic topical totalrank toward track truncated tyler under understand universitat university upfal used using valuable value values variation vegas vigna voting walk watts webpage webtrec wesley where which wide will winograd with without work world would yates york http://doi.acm.org/10.1145/1148170.1148344 147 DeWild: A Tool for Searching the Web Using Wild Cards answering august author automatic buckley conf copyright etzioni expansion extraction held improving information knowitall mitra overview owner pages preliminary proc query question references results retrieval scale seattle sigir singhal text track trec voorhees washington york http://doi.acm.org/10.1145/1148170.1148223 38 
Structure-Driven Crawler Generation by Example accelerated aggarwal ahnizeret approach arasu aware bases berg building calado chakrabarti chawathe china comparing computer conceptual conference crawler crawling data design diego digital discovery ecific edinburgh erience external extracting feedback fifth focused from garawi garcia generation hierarchical honolulu information international know large learning ledge libraries management mclean memory modeling modelling molina networks online pages proceedings punera references relevance resource retrieval scotland shanghai sigmod site structured subramanyam systems through topic topical transactions twenty very wide world http://doi.acm.org/10.1145/1148170.1148280 84 User Expectations from XML Element Retrieval advances august bates berry betsi browsing computer conference content design evaluation expectations fuhr inex information interactive interface kazai lalmas larsen lecture lncs london malik management mary notes online oriented overlap picking problem proceedings queen references retrieval review science search sheffield sigir springer techniques tombros track university users verlag vries workshop http://doi.acm.org/10.1145/1148170.1148307 111 Building a Test Collection for Complex Document Information Processing agam analysis announcement argamon based borsack building butter caag california cancer center complex condit data digital dlib document documents effects evra francisco frieder grants grossman guide hirschhorn html http industry information institute january jasis knowledge legacy legal lewis libraries library magazine management manuscript march msasumm national nist noisy pafiles processing program prototype publications references reports research resources retrieval review rider schmidt sigir state system taghva text tobacco trec ucsf umiacs university http://doi.acm.org/10.1145/1148170.1148200 20 Improving the Estimation of Relevance Models Using Large External Corpora analysis annual 
archives automatic based blog boughanem clarke cognitive complex conference context cormack croft data dels deng development diaz ecific ecir effect effectiveness eled endencies engine eriments ertson ervised expansion field filtering fourteenth geoscience growing grunfeld http hughes ieee improving index indri information intel international invited jones kwok landgreb language lavrenko learning ligence loquium lynam markov mayer metzler mishne mitchell mitigating mixtures model notebook okapi operations orhees osting overview pages phenomenon pircs press problem proceedings qsdr queries query random reducing references relevance remote research retrieval rijke robust role routing sample samples science sensing septemb serach shahshahani sigir sixth size small strohman syst task term terra text through track trans transactions trec turtle twelfth umass unlab using walker with yeung york ysearchblog http://doi.acm.org/10.1145/1148170.1148246 57 High Accuracy Retrieval with Multiple Nested Ranker baum burges deeds distributions ervised hamilton hullender information lazier learning networks neural pages probability processing references renshaw shaked systems wilczek http://doi.acm.org/10.1145/1148170.1148243 55 Near-Duplicate Detection by Instance-level Constrained Clustering administration advances agreement alspector alto american application applied april assessing assessment atlanta austin based bernstein between bremen brin broder bruce callan cardie chowdhury cikm clustering collection conference conrad constraints copy croft current data davis derivative detection digital discovery distance document documents duplicate dynamic elsevier environment erulemaking extent fast federal flow fourteenth frieder from garcia germany glassman government grossman group gwet hash hoad icml identifying improved information instancelevel inter interest international issue issues italy jordan journal june kamvar kappa klein knowledge kolcz lafferty language learning level 
lexicon libraries machine making management manasse manning mccabe measures mechanism mechanisms methods metric metzler mining models molina most national near neural nist november october online orleans padova page pages palo plagiarized practice press prior proc proceedings processing public publication randomization rater raters references reliability replica research retrieval robustness rulemaking russell satisfactory scalable scam schriber science seattle secure september shivakumar shulman sideinformation sigir sigkdd sigmod signature similarity smoothing society space special spire standard standards statistic statistical statistics string study symposium syntactic system systems technology tenth texas theory tois tracking transactions twelfth versioned volume wagstaff with xing yang zhai zobel zweig http://doi.acm.org/10.1145/1148170.1148276 80 Automatic Construction of Known-Item Finding Test Beds approaches azzopardi balog based berger berkeley callan connell databases discrim enterprise information lafferty language modeling pages popular proceedings query references retrieval rijke sampling sigir statistical systems tasks text translation trec uniform http://doi.acm.org/10.1145/1148170.1148319 123 Unity: Relevance Feedback using User Query Logs aboutus abstract algorithm august balasubramanyan based bayes build calculating classification click clustering document engines evaluated evaluation explicit google have html http implemented implicit interleaving live mixture most multinomials naive onestat only pavlov people phrases planning preprocessing pressbox proceedings prototyped ranking references relevance replacement results scores search tested thus title used while with word yahoo http://doi.acm.org/10.1145/1148170.1148328 132 An Experimental Study on Automatically Labeling Hierarchical Clusters using Statistical Features approach automatic available based browsing chen chien chuang cikm citeseer cluster clusters collections constant cutting 
descriptions different document features figure gather generating glover hierarchical hierarchy html http inferring interaction karger krovetz labeling large lawrence learning manuscript mean pederson pennock popescul practical rank reciprocal references results scatter search segments shows sigir text tfidf time topic ungar unpublished used very zeng http://doi.acm.org/10.1145/1148170.1148314 118 Content-based Video Retrieval: Does Video's Semantic Visual Feature Matter? about alight average browsing centric chang collaboratively collection columbia combining compared concept conducted connor created data doulaverakis duration effectiveness evaluate extraction feature features focus from half hauptmann heesch herrmann high hour hours howarth including information interactive kender kennedy keyframe kompatsiaris lehane level magalhaes mezaris multimedia naphade news obtained ontology over participants participation pickering pilot proceedings processing program project references retrieval ruger scale schema search searching segments selected semantic smith strintzis study systems tasks technology temporal textual three trecvid types understanding university user using video videos visual were with yanagawa yavlinsky zavesky zhang http://doi.acm.org/10.1145/1148170.1148292 96 On Hierarchical Web Catalog Integration with Conceptual Relationships in Thesaurus agrawal airs approach catalog catalogs chakrabarti chen classification content cross dumais etween framework godb hierarchical integrating integration iterative jasist learning machines mappings measurement pages performance probabilistic proc references sarawagi sigir sigkdd srikant supp text topics training vector with yang http://doi.acm.org/10.1145/1148170.1148298 102 Evaluating Sources of Query Expansion Terms brazil context database design document dollu effectiveness elicitation examining expansion feedback implications independent information interactive investigation kelly loquacious management mediated 
potential practice proceedings processing query references relevance retrieval ruthven salvador searching seattle sigir source spink systems term terms toronto user http://doi.acm.org/10.1145/1148170.1148281 85 Theoretical Benchmarks of XML Retrieval application attitudes axiomatic barwise bruza cambridge cheng chiaramella cuments definition element enchmarking ergen functional glasgow huib inex information july kazai lalmas lectures logic measure methodology notes outness pages perry press references relevance retrieval rijsb situations song springer storage structured syst theory thesis towards trans universiteit utrecht verlag what wong workshop york http://doi.acm.org/10.1145/1148170.1148227 41 Capturing Collection Size for Distributed Non-Cooperative Retrieval academy access accessordered addison agichtein algorithms anagnostop analysis anatomy anchor annual australasian australia automatic baeza bailey based bases beijing benton boston brazil brin brisbane broder california callan cambridge canada cannane carmel categorizing census chiba china cikm classification classify collection comparing computer conference connell count craswell darlinghurst data database databases development diego discovering distributed distribution document ecological effect effective eiro eirotis engine engineering engines enough environment erative erformance eriments ertextual ertson eschmeyer estimating estimation finding fish framework france french gaithersburg garcia gravano harman hawking hidden hiddenweb hierarchical hong hybrid improving indexes information international jansen japan journal karnatapu know kong lakes language large ledge life link longman louisiana managed management maryland maximization mclean meng merging method methods modeling modern multi needs neto ogilvie onds opulations orleans ortal oulos over overview page pages paris powell press prob proceedings processing proeedings publishing purp qprob queries query raghavan ramachandran real record 
references relevant representative research resource results retrieval sahami salvador sample sampling saracevic scale schumacher science search selection server shah shrinkage sigir sigmod site sixth size souza spink study support sutherland system systems techniques tennesse test text thom thomas toronto track transactions trec uncoop unified university user users using utility very virginia voorhees washington wesley when wide williams workshop world yates york http://doi.acm.org/10.1145/1148170.1148177 3 Improving Web Search Ranking by Incorporating User Behavior Information accuracy accurately addison agichtein allan anatomy baeza behavior brill brin brown burges chickering claypool clickthrough computing conference data datamining deeds descent development discovery documents dumais engine engines evaluating evaluation experience filtering from goecks gradient granka hamilton hard hembrooke high highly hullender hypertextual ieee ijcai implicit improve inferring information interaction interest interests international internet interpreting jarvelin joachims karnawat kekalainen knowledge large lazier learning machine measures methods microsoft models modern mydland neto normal observing optimizing overview page pang predicting preferences proceedings ragno rank relevant renshaw report research result retrieval retrieving ribeiro scale search shaked shavlick sigir sigkdd systems technical their toolkit track transactions trec unobtrusively user users using waseda wesley white winmine workshop yates http://doi.acm.org/10.1145/1148170.1148346 149 DiLight: an Ontology-based Information Access System for e-Learning Environments about access accurate achieves acknowledgments acquisition across active advantage also among approach approaches architecture associated august author award backgrounds based become becomes belong between browsing build built child classes clear complete comprehensive concentrate concepts conceptual connections content copyright core course 
curve demonstration development different digital dilight discovering distance diverse documents dspace easily educating enables engine environment explicitly exploring face field figure form functions goals grasp gruber held hierarchical hierarchy however http illustrated important indirectly inference information innovation integrated interactive internal into knowledge leading learn learning library link linked links locate long management many materials mean methods more navigate nodes ontology open organize organizes over overall owner parent part pittsburgh points portable powerful practices preferences present presentation problem provided provides provost query quickly rapid recommendation recommendations reference related relationship relationships repository research results search seattle semantic sigir situation some source specifications steep structure students support supported system take tasks taught teacher teaching terms that their they this three through thus together tools topic topics translation truly types understand understanding universities university utilize utilizes views visual washington what when where whereas whole will with work http://doi.acm.org/10.1145/1148170.1148325 129 The Effect of OCR Errors on Stylistic Text Classification agam analysis argamon august author binary book building capable codes collection comparative complex correcting cunningham deletions different document documents engine feature frank frieder gmbh grossman hairetes hall heard holmes implementations information insertions intl java kazem language learning levenshtein lewis machine models ones overall page pages practical problems processing recognition references results search seen sets sigir springer spurious study systems taghva techniques test tools transmission trigg uzuner verlag weka with witten workshop http://doi.acm.org/10.1145/1148170.1148192 14 A Parallel Derivation of Probabilistic Information Retrieval Models after alistair amati american 
amit annual approach approximations arguments arjen based bates beaulieu belkin bomb brazil bruce buckley cambridge chapter chengxiang chris church collection conference corpora crestani croft development deviation digital distributions divergence djoerd document documentation editors effective entropy entry european event experiments fifteenth first framework frei frequency from gabriella gain gale general generation geometry getting gianni girolami glasgow hancock harmann hiemstra importance information ingwersen interactive international inverse irsg issue john jones journal june justification justin kazai keith kluwer kraaij lafferty language large length libraries ling lleke london loquium loss management mandar marcia margulis matrix measure measuring mitra model modeling modelling models moffat monday normalistation normalization october okapi operational page pages pareto pejtersen pivoted poisson ponte press prior probabilistic probabilities probability proceedings processing query randomness references relevance research retrieval right rijsbergen robertson roelleke ross salvador science scotland search seventeenth sigir simple singhal society some spaces sparck special springer stephen system systems term terms test theodora theoretical theory thesis thijs third this thomas time tois transaction trec tsikrika uble understanding university using verlag very vries walker weighted weighting wessel westerveld wilkinson workshop york zaragoza zhai zobel http://doi.acm.org/10.1145/1148170.1148332 136 A Study of Real-Time Query Expansion Effectiveness american analysis anick annual based borlund buckley components conference croft development document documentation effectiveness efthimiadis evaluation examining expansion experimental feedback global improving information interactive jinxi journal local management marchionini performance potential proceedings processing query real references refinement relevance research retrieval review ruthven salton science 
search sigir society study submission systems technology terminological time using white http://doi.acm.org/10.1145/1148170.1148320 124 Improving Personalized Web Search using Result Diversification access activities adaptive alternative anagnostopoulos analysis automated averaging based beyond bigram broder browser callan carbonell carmel chosen commons conference considered constructed curves detection different diversification diversity documents dumais each effect effort engine enough evaluated evaluation experiment figure filtering first five fixed follow frequency frequent from general generate goldstein hatano higher history horvitz improving increase increases information interest interestingly interests international investigating konstan lausen leading lists lower main marginally match mcnee measures method methods minka most novelty number omit pages patterns performed personalized personalizing poster producing profile queries query recent recommendation reduce redundancy references reformulation reformulations reliable reordering reranking result results same sampling score scores search show showing shows sigir simplicity some somewhat specific suggests sugiyama summaries tech teevan that these this three through topic trends type types typical unigram upper user users value values varied very volunteers wide with without workshop world yoshikawa zhang ziegler http://doi.acm.org/10.1145/1148170.1148210 28 Probabilistic Model for Definitional Question Answering abraham adam algorithm american annual answer answering answers applied approach approaches artificial association automatic automatically based bauman beaulieu bensley berger beyond bikel blair boris bowden branches bridging brill carroll caruana chang chapter chasm chengxiang christopher clark cohn collection combining computational computer conference czuba david dayne definition definitional demner development dina dirichlet divergence ductive echihabi emnlp empirical engine engineering 
entropy erational eric eriments ertson evaluating evaluation extraction factoid finding flairs florida freitag fushman gaizauskas goldensohn hancock harabagiu hermjakob hierarchical hildebrandt horacio hovy human hybrid ieee information intel interactive international ittycheriah jennifer jimmy john katz knowledge krzysztof kyoung lafferty language large learning learns lecture lexical licuanan ligence linda line linguistics machine mackay mahindru management marcu mckeown measures melz methods mining mittal model models moldovan multiple naacl name natural north note okapi overview package pages peto phrase piquant post prager proceedings processing question questions radu ravichandran reasoning references research retrieval rich rouge ruchi saggion sang schlaikjer schwartz science shannon sigir smoothing society song soricut sources statistical study summaries summarization system systems techniques technology terminology test text textmap that theory track transactions trec using vibhu voorhees walker weischedel welty wesley what williams with workshop young zhai http://doi.acm.org/10.1145/1148170.1148278 82 PENG: Integrated Search of Distributed News Archives academic adaptive advances algorithm antoniolli august azzopardi baillie based bordogna callan chapter clustering crestani distributed filtering france fuzzy hierarchical incremental information invernizzi ipmu july kluwer news pagani pages paris pasi publishers query references retrieval sampling seattle sigir supporting http://doi.acm.org/10.1145/1148170.1148195 16 On-line Spam Filter Fusion adaptive advances algorithms anal analysis androutsopoulos annual anti approach artificial aslam attia australian automated automatic back bartell batch bayesian belew belkin bennett bentley binary blind bratko buckley burges callan cambridge categorization ceas cikm clarke classifier classifiers combination combine combining comput computer computing condorcet conference considerations cormack cottrell crawford data 
datasets development diagnostic dietterich document duin dumais dzeroski ecificity eleventh email ensemble enterprise etter evaluation evidence eyond fast fawcett feedback filter filtering first fourteenth frei friedman fusion future gaithersburg generalization gosh graphs harman hatef help horvitz http hull ieee improved indicators information intel international interpret joachims kantor karkaletsis kephart kernel kittler know komarek large learn learning lecture ledge leiba lewis ligence likelihood line linear lncs logistic lynam mach machine machines mail making management matas meta method methods montague moore mountain moving multi multiclassifier multiple networks neural nist notes oratories outputs overview paliouras papka pattern pedersen practical precision prescriber press priors proceedings query range ranked ratios references regression reliability representations research researchers retr retrieval robust roli sakkis scale schapire scholkopf schutze science searches searching seattle sebastiani second segal selecting selection sensistivity shaw sigir spam spamguru spamjig sparse spyropoulos stacked stacking stamatopoulos statistical statistics structures supp support surv surveys system systems tech term tests text than thirteenth took track training trans trec trlynam uble using uwaterloo vector view voorhees wilkinson with wolpert york zenko zhang zurich http://doi.acm.org/10.1145/1148170.1148212 30 A Framework to Predict the Quality of Answers with Non-Textual Features aaai after algorithms annotation annual answer answering answers applied approach archives asian asked aspects authoritative automated automatic based burke categorization centralized chen civr classification clustering communications comparative comparison computational computer conference context continued croft data density development distributed document domain emnlp empirical entropy entry environment essay estimation experience experiences figure files filtering finder 
finding first flexible fourteenth frequently from gauch grading graph graphs hammond harman hiemstra high hubner hwang hyperlinked ieee ijcai image importance incorp incorporating independent information international into jeon jijkoun journal kleinberg know kraaij kulyukin kunze lafferty language large larkey learning lecture ledge lenz likeliho lippman logs lytinen machine magazine malouf management manmatha maximum mccallum measure method methods metrics modeling models multivariate natural nigam nonparametric notes orating overview page pages pang parameter performance ponte precision prior probabilities proceedings processing quality query question questions recall references research result retrieval retrieving rijke same schoenberg science search sentiment series shallow sigir signal similar smoothing sneiders sources specific strong study symposium system systems techniques text textual third thumbs time tomuro transactions trec twelfth understanding using vaithyanathan video wang westerveld wide with workshop world zhai zhou http://doi.acm.org/10.1145/1148170.1148214 31 Latent Semantic Analysis for Multiple-Type Interrelated Data Objects achieve addison again algebra algorithm algorithmic algorithms also although always american among analysis ando annotation applications applied appropriately associate authoritative auto baeza bartell based baseline bast bekkerman belew berry best better between birkhauser borchers boston breese brien calculate case cases categories categorization category centroid change chen clear clustering collaborative comparison comput computation computations computer concepts conclusions conducts confirms consider content context cooccurrences cottrell covers cullum data davison decomposition deerwester denoted department different dimension dimensions ding distributional document does domains dumais each edition effective effectiveness eigenvalue eigenvectors email empirical encouraging engineering environment environments 
equation evaluated even examination experiments exploit extraction factorization figure filtering finally first framework from furnas gatica general generic golub gong graph harshman heckerman herlocker higher hofmann hopkins hyperlinked icml identifies identifying image impact important improves incorporate incorporates incorporating indexing indicates information insensitive intelligent inter interactions into introducing involving iterative jasis jasist johns journal kadie karypis keyphrase kleinberg konstan kumar lanczos landauer large latent lathauwer lawrence learning linear link loan majumdar many matrix mccallum meaningful meaningfully measure measurement measuring methods micro minnesota mlsa model modelled models modern monay moor most multi multidimensional multilinear multimedia multiple multipletype mutual nature negative neto number object objects obtain obtained occurrence optimal overlap pages pairwise papadimitriou paper parts pennock perez performing popescul potentially precision predictive press principle probabilistic proceedings products projected proposed raghavan really recommendation references reinforcement relations relationship relevance report reported representing represents result results retrieval review ribeiro riedl salient same scaling science semantic sentence seung several show showed shown shows siam sigir simfusion similar similarities similarity simrank since singular smaller society sources space sparse special spectral spirit steinbach step structural summarization supervised symmetric syst table tamaki technical techniques technology text than that their theory there third this three thus toward traditional type types ungar unification unified university unsupervised used using utilizing value variants vary vector vempala very volumn wandewalle weight wesley when which widom willoughby with word words works yang yaniv yates zhang zhuang http://doi.acm.org/10.1145/1148170.1148183 7 Spoken Document Retrieval from Call-Center 
Conversations acero acoustics affinities algorithm amitay analysis annual application applications applied arbor association automatic auzanne based bechet best beyond boosting boston brill brown call cambridge carmel center chelba choi cikm classical classification clements college computational computer conference confusion consensus conversational conversations decomposition development digital distance document documents downing education error expansion experiments extending farchi finding foote gain garofolo general graph hakkani herscovici hindle hong hoory icassp ieee index indexing information institute internation international interspeech james jones journal juru kingsbury know kong language lattice lattices learning ledge lewis lexical line linguistics lisbon maarek mail main management mangu march massachusetts matrix maximal miller minimization mishne modules multimedia naacl national networks ninth nist november open other pages pereira petruschka phonetic portugal position posterior povey press proceedings processing pruning query recognition references refinement research retrieval riccardi rich robertson roytman saon saraclar search searching seventh sigir signal singhal soffer soltau specific speech spoken sproat standards stolcke story success system techniques technology telephony tenth text thesis track transcription trec uncertainty understanding university using utterance video vocabulary voice voorhees with word workshop york young zweig http://doi.acm.org/10.1145/1148170.1148228 42 Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources achieve adaptive algorithm algorithms also american anal analysis annals annual aplqa applic applications approach approaches aslam associate automatic automatically average based basic benchmarking bplqa builds called carmel categorization chang chen christel chua cikm civr class classes classification coleman collection collections combination computer conclusion conf conference 
confounded content contrast cument darlow data defined dels demonstrate dempster detection development difficulty digital dimension discover discovery discriminative discussed distributed distributions ectations effectiveness efficient eighth endent erates eriments eriority ernick ertson estimate estimating evolve extraction feature feng finally fine framework from fuhr functions fusion gains gupta harman hauptmann have hofmann huang identical image including incomplete indep indexing informaiton information informedia interior international intl into ject johnson jones journal kang kennedy kernel kimeldorf kplqa laird large latent learning likeliho likelihood manmatha manually margin math maximizing maximum mercer merge meta metasearch methods minimization missing mixture model models montague multimedia multimo multiple nallapati natsev nist nonlinear normalization novemb numb obtain optimal optimization orhees osed ounds over overview pages plqa precision press principles pris probabilistic proc proceedings prop query rank rath rather references region regularized relevance relying representation research results retrieval royal rubin scale schwarz science score search searches selection semantic series shaw siam sigir significant singap smeaton society some sources space spline statistical statistics strategies task tasks tchebycheffian terms text than that this three thus training trec trecvid trust typical unifies using velivelli version video wactlar wahba wang weight weighting weights which with yang york zhao http://doi.acm.org/10.1145/1148170.1148326 130 History Repeats Itself: Repeat Queries in Yahoo's Logs among anick anita aula average baluja based behavior being better capra case center certain change changes click clicked clicks common compared computer computers conditions consortium could decreased deletion design distinct doctoral during eight eighty empirical engine engines evidenced experience experienced fain feedback find finding findings 
found from given google graphic greenberg helping histories history human idea ieee immediately implications influence information initial interfered international issued jhaveri jones journal kamvar komlodi large likelihood likely list looked management measured mobile more negative number observed obvious occur october ones only pages people percent positive prediction probability proceedings queries query rank reaccess reduces references refind refinding refinement reflects repeat result return revisit same scale search searcher searches seconds seeking session shown shows sigir significantly slows strategies studies study support survey systems table task tauscher teevan tenth terminological that there time took under until usability user users using visualization were when where whether while wireless with word would http://doi.acm.org/10.1145/1148170.1148327 131 Early Precision Measures: Implications from the Downside of Blind Feedback access buckley clef enterprise eriments european experiments harman hummingbird information notes nrrc ntcir overview proc proceedings references reliable retrieval robust searchservertm sigir terabyte tomlinson track trec voorhees with working workshop http://doi.acm.org/10.1145/1148170.1148229 43 User Modeling for Full-Text Federated Search in Peer-to-Peer Networks able accuracy activities adaptive agent algorithms allan allows also amherst amount amounts analysis applied approach appropriate arise assistant australasian automated automatically baeza based been between browsing buddynet callan captures carnegie chaffee characteristics ciir cikm closely clustering clusters coarse collaborative collection community compared computer computing conclusions conference conjunction constantly constructed consumer content context contrast creating crespo data databases days degrading delivering department detailed develop dhts diaz different digital directory distinctive distinguishes documents dumais each ecir effective effectively 
efficiency efficient effort engines environment evaluate evolving explicitly explore federated feedback filtering find finding finest first focuses from full fusion garc gauch generates glance good grained granularity gupta hatano hierarchical histories history horvitz hull hybrid implicit improve improving inference infocom information initial institute intelligence intelligent interest interests interfaces international ipdps johnson kalogeraki kinds krovetz laird language large last learned learning libraries locality location logs long maggs make manner massachusetts mellon mining model modeling models molina months morphology most mostly naturally naumann necessarily nejdl networks ones only ontology open overlay paper parameter particular past peer peers performance persistent personalized personalizing pretschner previous proc process profile pruyne queries query questions ramanathan range ranked ratnasamy reassessing reduce references regard related relevant reminds report require required requiring research results retrieval robertson robust routing science search second selection semantic sensitive service several shao shen shenker short shows siberski sigir significantly small some span specifically speretta sripanidkulchai stanford stoica strategies studied study such sugiyama super system systems task tasks technical technologies teevan term text thaden that they third this timely track tracks training transactions trec type under university unsupervised useful user users using values very viewing vldb voorhees wang which with within without work works workshop yates years yoshikawa zhai zhang http://doi.acm.org/10.1145/1148170.1148242 54 Text Clustering with Extended User Feedback aaai activities aifb algorithm applications background based basu bilenko blum bootstrapping cardie carvalho categorization chakrabarti classification classify cluster clustering cohen combining comparative computational conference constrained data dempster document 
documents email external feature first foundations framework from godbole govindara harpale hotho huang icml ijcai incomplete inference inferring information institute interactive joachims jones journal karlsruhe knowledge labeled labeling labels laird learning likelihood machines madani maximum mccallum means measure mining mitchell mooney nigam ongoing pages pedersen pkdd probabilistic proceedings raghavan references report riloff rogers royal rubin sarawagi schroedl selection semi society spam staab statistical study stumme supervised supervision support tasks technical techniques term text theoretic theory through thrun training transductive university unlabeled users using validity vector volume wagstaff with words workshop workstation yang http://doi.acm.org/10.1145/1148170.1148348 151 MathFind: A Math-Aware Search Engine boolean callan communication extended flat inex information language models ogilvie pages proceedings queries references retrieval salton text using http://doi.acm.org/10.1145/1148170.1148300 104 Information Retrieval with Commonsense Knowledge activation also always andrews annotated annual approach ascertain automatic benefit both buckley canary caption categories category classification clef clough commonsense communications comparatively comparing concept conceptnet conference confirms consists context contrast covered creating cross dealing deeply deselaers different document effective effects employ enough evaluation even expansion field fields filtering forum grubinger headline hersh however humans image improvement improvements increases information infrastructure international introduce introducing investment islands jensen journal knowledge language large lehmann lenat less librarians library lieberman little ller lncs lrec meng methods much multiple needed note observe only performance photo phrases practical proceedings reason reasoning recognizing references reflects retrieval rich robust root salton scale semantics sense senses 
sensitive shows sigir significant singh smaller spreading tagging tags task technology than that their this three toolkit track used useful using utilizing vary when with word wordnet working workshop world http://doi.acm.org/10.1145/1148170.1148238 51 What Makes a Query Difficult? access advances amitay annual apostolico applications approach based beyond buckley burges cambridge carmel cikm classification cohen company computer computers conference content croft cronen curriculum darlow deng detection development difficulty dinstl disambiguation distributed duda editor editors estimate evaluation evans experiments features fine freeman garey grunfeld harman hart herscovici hummingbird including independent index inferring information institute interactive international intractability joachims john johnson juru kernel know kwok lafferty language large learning lecture ledge linguistic lkopf lrec maarek machine machines making management melucci meng methods metrics missing modeling mothe national nist notes nrrc ounis over overview pages palmas pattern performance petruschka pircs practical prager predict predicting predictors press proceedings processing pruning queries query references relevance reliable report research resources retrieval roadmap robust scale science search sense server shanahan sheftel sigir soffer sons spain spire springer standards statistics stork string structure subtopic support tanguy technology tenth terabyte text tomlinson topic townsend track trec twelfth using vector volume voorhees wasserman wiley with word workshop york zhai zhou http://doi.acm.org/10.1145/1148170.1148297 101 Enterprise Search Behaviour of Software Engineers accessibility added also american annual april asis assess authorship because behaviour borlund brazil budapest carry case changes charlotte clarke community comparison conference control corporate created cues current data desire detailed discussion distinct documents domain engineers evaluation expertise 
faces fagin familiarity features fidel figure findings framework freund from functionality future general genre greater green group heavy hungary information interactive international intranet jansen journal knowledge known kumar length like likely longer longitudinal make management many mccurley meaning meeting model modeling more much novak only paper pennanen percenta perception pooch proceedings processing proposal public queries query quite references reflect reflected reflects relationships rely repository research result retrieval review salvador saracevic science search searching serola sigir sivakumar society software source sources space specialized spink stenmark strong studies study suggest syntax systems tactics taskgenre technical technology terms that their them they this tomlin toms tools topical upon useful users vakkari which while wide williamson with workplace world would writing http://doi.acm.org/10.1145/1148170.1148336 140 Concept-Based Biomedical Text Retrieval abbreviations addition adjusted altman american ando approach association automatic based beaulieu bioinformatics biomedical briefings buttcher chang charles clarke cohen concept conduct conference containing cormack corpus creating current dictionary documents domain dredze each ecific effectiveness eriments ertson evaluate expansion experimental experiments feedback frequency from full gatford genomics gordon hersh http huang hyphens informatics information into journal mark measures medical medline ming mining name number ohsu okapi online only phrase prec preprocess present proc proceedings proposed provided pseudo query references relevance relevant replacing represents respectively results retrieval retrieved runs schtze series shows spaces stefan stemming stephen survey synonym table term text this three tokenize tong topics total track trec university used validation walker watson where williams with work xiang xiangji york zhang zhong 
http://doi.acm.org/10.1145/1148170.1148291 95 Stylistic Text Segmentation algorithm angheluta annual association august automatic boundaries carb cation chains changes cohesion collo computational computed context corpora critique cuments curves data documents each efficiently evaluation expected false feature figure find found from functional generated goldstein grammar halliday hearst here hypothesize improvement independent indicate indicates indication info intended intermediate international introduction july june kantrowitz kaufmann klavans large lexical linear lines linguistics london longman lope many marked mccoy mckeown meeting metric metrics multi obtain occurs onell pages paragraph passages pennsylvania pevzner philadelphia precision prevalent previously quite recall references reported represent representation results section sections segment segmentation segmenting selection sentence session sigir significance silb single some student stylistic subtopic summarization summarizing system text textiling texttiling that this thresholds topic topical true used using value vectors vertical very visual were which while with workshop http://doi.acm.org/10.1145/1148170.1148321 125 Using Small XML Elements to Support Relevance blok clarke components content controlling cused databases decemb editors embracing evaluation fourth fqas fuhr germany hiemstra inex information inititive jlovi journal kamps kazai lalmas length list malik mandekbro mass methods miha most normalization oriented overlap pages press proceedings references relationships relevant retrieval retrieving rijke rnsson sigir sigurb springer structural tijah using vries westerveld workshop http://doi.acm.org/10.1145/1148170.1148310 114 Representing Clusters For Retrieval academic based classification cluster clustering conference corpus croft effectiveness hierarchic hypothesis information international kluwer kurland lafferty language management model modeling models proceedings processing 
publishers query references retrieval revisited rijsbergen searching series sigir specific structure systems tombros using villa volume voorhees http://doi.acm.org/10.1145/1148170.1148323 127 Feature Diversity in Cluster Ensembles for Robust Document Clustering american analysis austin automated based categorization classification cluster clustering computing conll cspa data deerwester dimensional document dumais effective ensembles ervised factorization features flynn furnas harshman high indexing information jain jects journal landauer latent learning machine matrix mcla mining murty nature negative neural pages parts proc references relationship restructuring retrieval science sebastiani semantic seung society sparse srinivasan strehl survey surveys systems terms texas text thesis university unsup viola http://doi.acm.org/10.1145/1148170.1148171 0 Salton Award Lecture american aspects association behind butterworths centre closed continuous documentation fairthorne geometries geometry gleason hilbert information interpretation journal language logic mackay magazine maron mathematical mathematics meaning measures mechanics mechanized methods neumann philosophical probabilistics probability quantal references report retrieval scientific society space stanford statistical stevens study subspaces towards transition widdows with http://doi.acm.org/10.1145/1148170.1148343 146 Cheshire3: Retrieving from Tera-Scale Grid-Based Digital Libraries advanced amsterdam architecture based blueprint broker chen compromises computer computing conference cowart data digital distributed document ecdl edition elsevier european first foster grid indexing india information infoscale infrastructure international jagatheesan jasekar journal kesselman kremenek larson libraries managing moore olschanowsky pages phelps preservation press proceedings references research resource sanderson scalable scale schroeder searching society storage systems technology tera watry 
http://doi.acm.org/10.1145/1148170.1148258 66 Personalized Recommendation Driven by Information Flow academy adomavicius adopter agrawal algebra algorithm algorithms allocation amazon american analysis anatomy applications applied approach architecture arising artificial authoritative automating baldi based behavior behaviors bergstrom blei blogspace breese brin buyers cardona categories clickstream clustering collaborative commerce community computer computing conf contextual cooperative customers data deeper descriptor determination diffusion dirichlet discovery dissemination distrust domingos donovan dynamic dynamics early effective empirical engine environment evolutionary evolving extensions filtering finding frasconi free from generation gonzalez griffiths grouplens gruhl guha hampton handbook heckerman hyperlinked hypertextual iacovou identifying ieee immune incorporating influence information innovation innovations innovators inside instead intelligence intelligent interest interfaces internet intl ioerger item itemto john jordan journal kadie kempe kleinberg knowl knowledge konstan kumar langville large latent launch learning liben linden linear london machine maes mahajan majority marketing mathematics matrix maximizing methods meyer mining modeling models mouth muller multidimensional nasraoui national netnews network networks newsgroup next noisy nowell open page pagerank personal philadelphia possible predicting predictive premise press probabilistic proc product profiles propagation publications purchase raghavan rajagopalan recommendation recommendations recommender references relational representation research resnick richardson riedl rogers rojas rusmevichientong sage sankaranarayanan scalable scale schafer science sciences scientific scott search selinger shardanand sharing siam sigkdd sites smith smyth social society song sons sources spread srikant srivastava state steyvers suchak supported survey system systems tardos targeting technology three 
through tomkins topics toward trans transactions trust tseng tuzhilin uncertainty user using valente value viral when wide widyantoro wiley with word work workshop world worthwhile york http://doi.acm.org/10.1145/1148170.1148295 99 NMF and PLSI: Equivalence and A Hybrid Algorithm aaai adaptive algorithm algorithms analysis arises between clustering conf converge could data detailed dietterich different ding document editors equivalence factorization from function gaussier goutte have hofmann hybrid implications indexing infinitesimal interesting iteration jump latent local makes matrix method minima minimum mining negative nips nonnegative objective ogihara optimize peng plsi probabilistic proc question references relation running same seen semantic seung siam sigir simon speaking spectral square starting statistic step strictly subspace that their tresp will http://doi.acm.org/10.1145/1148170.1148287 91 Tensor Space Model for Document Analysis american analysis approximations april communications computer deerwester department document dumais furnas generalized harshman indexing information journal landauer latent learning machine matrices model rank references report retrieval salton science semantic society space technical tensor uiuc uiucdcs vector wong yang http://doi.acm.org/10.1145/1148170.1148188 11 Respect My Authority! 
HITS Without Hyperlinks, Utilizing Cluster-Based Language Models accuracy active advanced affinity agglomerative algorithms allan american analysis anatomy annual applied approach artificial authoritative automatic balinski based beeferman berger better bipartite brin bringing browsing butterworths centrality cessing chains chen chicago cikm classification clef cluster clustering collections computational conference corpus croft cross cument cuments cused cutting danilowicz deep deling dels demonstration denmark dhillon diaz digital disambiguation discrete distances documentation domshlak ecdl ecific ectral edelman edition editors effectiveness emnlp empirical engine environment ergen erkan erlinked erlinks ertextual etzioni european evaluating expansion extended feasibility feedback forum fourteenth gather graph griffiths hearst hierarchic high human improving induced information inside intel inter interactive interdo international internet into iterative jardine jasis jones jordan journal june karen karger karov kaufmann kleinb kluwer know kurland lafferty language langville large ledge leuski levow lexical lexrank libraries library ligence linguistics link links luckhurst management managment markov mathematics matveeva metho methods meyer mihalcea minimization modeling morgan naacl natural navigation numb oken option order othesis othing otterbacher output page pagerank pages partitioning pedersen peter ponte poster preece proceedings processing pseudo query question radev random ranking readings real reexamining references regularizing reprinted research results retrieval rijsb risk salience scale scatter science scores search searching second sense sentence series seventh shah shen siam sigir sigkdd similarity society soda sources sparck stable storage structural structure study summarization symposium system systems tarau techniques technology tenth text textrank texts than thing tombros tukey university using version villa visual walks wang wide willett 
with without word words world zamir zhai zhang zheng http://doi.acm.org/10.1145/1148170.1148207 25 A Study of Statistical Models for Query Translation: Finding a Good Unit of Translation aachen acquisition adriani alignment ambiguity approach assistance automatic ballesteros based bian bilingual butterworths california cambridge chai chen cherry chiang chinese classification clir coherence cohesion computational computer computing corpora corpus croft cross crosslanguage dale david decaying dekker dependence dependency dictionary ding disambiguation discriminant document duda ecir editor emnlp english entity error filtering flannery from gaithersbury ghahramani grammars graphical handbook hart heidorn hierarchical huang improving independent information informed insertion integrating intelligent introduction inversion jaakkola john jordan knight koehn language learning lecture linear linguistics london louisiana machine marcel maximum menezes methods microsoft minimum model models monolingual named natural nist notes noun numerical occurrence orleans palmer parallel parsing pattern peter phrasal phrase pragmatic press probabilistic processing query quirk rate recipes recognition references relations resnik resolving retrieval richard rijsbergen robertson rwth saul science scientific segmentation selection sense sigir similarity single smith sons soup southern spring statistical stochastic stork synchronous syntactic syntactically syntax system templates term teukolsky thesis track trained training transduction translation trec treelet univ university using variational verlag vetterling vines walker weischedel wiley with word writing yamada york zhang zhou http://doi.acm.org/10.1145/1148170.1148317 121 An Analysis of the Coupling between Training Set and Neighborhood Sizes for the kNN Classifier analysis approaches benchmark categorization classifiers collection evaluation information learn lewis mach references research retrieval scalability sigir statistical text 
yang http://doi.acm.org/10.1145/1148170.1148251 61 Exploring the Limits of Single-Iteration Clarification Dialogs abels academic access accuracy allan american analysis annual answering approach asking batch bates behavior belkin berry buckley california cambridge carterette case chan chua cluster comparison concepts conference constraints croft dependency dervin design development dewdney diego dillon document documentation documents during editor effectiveness eighth electronic elicitations england enough ensuing environments evaluations expansion fifth final finding framework from full function gather give good goodrum greenwood hard harman harter hearst hersh hierarchical high human hypothesis implications improving info information initial interaction interactions interactive interface interfaces interview interviews jasis jersey koenemann kramer lawrie lewis lexical library mail marchionini neutral nordlie notion olson online overview pages passage pedersen picking precision press price principles proceedings process queries query question questioning questions reexaming reference references relations relevance reliable report research results retrieval revealment review robins rosenberg sacherek same saracevic scatter science sdair search searching seeking selection semantic short sigir simple spink state study summarization swigger symposium system systems taylor techniques terms text thinking topic track trec turpin university user using voorhees when white will words workshop http://doi.acm.org/10.1145/1148170.1148267 73 You Are What You Say: Privacy Risks of Public Mentions about above accept accepted achieving ackerman acknowledgements advise advisor agrawal algorithms aliasing also although amazon analysis anonymity anti architecture association author based becomes begun being berkovsky blogs books both bridge build cacm canny changes chosen citations classi classification cloak collaborative combine commerce computing conclusion conversation cost 
cranor create data dataset dave demonstrated drenner easier empirical enable enhanced estimate evfimievski examining example explore explored extensive extraction eytani face factor fairly fication filtering find forums found foundation frankowski friendship from fruitful further fuzziness gallery gehrke generalization grama grants grouplens hard haritsa harper harris have here hong however icdm ideas identification identified identifier identify identifying ieee impractical impractically information inquiry insert interactive internet invent investigate investigated item items judgment karypis keep keller kidentification knowl konstan kuflik landay lawrence learning leave leaves level lines machine maintaining many march mathematical mcnee mention mentioning mentions methods might mining mirza misdirection mitigate mobisys model modeling models more most motivated movie much myspace national novak only opinion oriented other owners pang paper peanut pennock people personalization perturbation phoaks polat poll popular possible possibly posts potential power pragmatists predictions preferences preserving privacy privacysensitive private probably problem proc product profit protect protecting protection provides public purchase quasi quasiidentifier questions raghavan ramakrishnan randomized rare rated ratings reagle recommendation recommendations recommender reference relation relations relationships required rest reveal reviewers reviews ricci riedl risk risks rizvi rule rules sarwar scenarios science sean seems semantic sentiment serious sharing shilling should sigir sigmod sites some sophisticated space spaces sparse srikant state strategy substantial successful such supported suppressing suppression sweeney syst system systems talk taylor techniques temporal terveen text than thanks that their them themselves theoretical theoretically there they things think this thumbs tomkins tony ubiquitous uncertain understand used useful user users using vaithyanathan 
verykios violate vldb vulnerable ways when whether wish wishes with words work workshop would http://doi.acm.org/10.1145/1148170.1148216 33 Tackling Concept Drift by Temporal Inductive Transfer analysis annual application baker class classification clustering complement conference considerations data development distributional distributions fawcett feature flach forman graphs hewlett hewlettpackard http information intl labs learning machine mccallum melbourne mining notes packard performance practical predict proc references report research researchers response retrieval scaling selection sigir tech technical techreports text ting under varying webb words http://doi.acm.org/10.1145/1148170.1148249 59 Elicitation of Term Relevance Feedback: An Investigation of Term Source and Context academic accuracy acquisition actual allan american anick annual applied approaches automatic based beaulieu behavior belkin brazil brussels buckland canada carballo carolina case cheshire classification clustering computing conceptual conference cool coverson croft design development document documentation documents dollu dublin effectiveness efthimiadis environment evaluation examining expansion experiments exploration feedback fourteenth from fuller gatford germany government grenoble hancock hard harman hierarchical high independent information interaction interactive interfaces international ireland iris iterative jansen joho jones journal kelly kluwer koenemann larson length longitudinal loquacious madrid magennis maglaughlin maimon management muresan nameth newby north office okapi outcome overview park passage pennanen perceived perez philadelphia potential preparing presentation printing proceedings process processing proposal public publishers query real references refinement reformulation relation relevance research retrieval review rijsbergen robertson ruthven sanderson science search searching seeking shapira sheffield sigchi sigir sikora society source spain spink 
structure students study support symposium systems taeib tang technology term terminological terms text texts third toronto towards track trec university user using vakkari value voorhees walker washington while wilkinson with yang yuan http://doi.acm.org/10.1145/1148170.1148269 74 A Compositional Context Sensitive Multi-document Summarizer: Exploring the Factors That Influence Summarization abstracts across agreement american among analysis annual approach arabic artificial association automated automatic banko barzilay based bayesian bigram brain callan carb carthy categorization centric challenges chapter chen cikm columbia common comparing compression computational conference confidence conroy consensus content creating creation data daum development differences diversity document documentation documents doran dunnion elson english eriments etween evaluating evaluation evans event examining extention extract extraction extracts eyond factoid formation frequency fusion generation generic gentleman gerb goldstein gram grams halteren hovy human identifying information initial inquiry intel international intervals into intrinsic introduction ject jing journal judging june knight know kupiec language large leary ledge left ligence linguistics literature logline longest luhn machine machines management marcu master mckeown meeting menezes method mittal model modeling multi multidocument multilingual naacl nature neats nenkova newman news ninth north occurance onell over overlap package pages part passonneau paste perersen probabilistic proceedings producing pyramid quality radev rath redundancy references removal reordering reranking research resnick retrieval rhetorical right rouge saggion savage scale schenker schiffman schlesinger selection sentence sentences sigir sigleman significance similarities skip source stairs statistician statistics stokes subsequence summaries summarisation summarization summarizer summary suzuki systems szpakowicz task techniques 
technology teufel text their thesis trainable translation undersatnding understand understanding university using vanderwende vocabulary with workshop http://doi.acm.org/10.1145/1148170.1148201 21 Regularized Estimation of Mixture Models for Robust Pseudo-Relevance Feedback algorithm american amherst analysis annual applied asian automatic based blind callan cikm classification cluster clustering conference context croft development divergence document documents editors effect effectiveness evans expansion experiment experiments extensions federation feedback flexible generative global hall harman html http improving information international iterative jones journal knowledge koyama krishnan lafferty language lavrenko local manabe management massachusetts mclachlan meeting methods minimization mixture model models montgomery nist nrrc number numdocs pages prentice press proceeding proceedings processing pseudo publications pubs query references relevance research retrieval risk robertson rocchio sakai sampling science search selective sept sigir smart smoothing societies society sparck special spriner study suite system systems tenth terms text theory thesis transactions trec univ using varying voorhees weighting wiley with workshop york zhai http://doi.acm.org/10.1145/1148170.1148305 109 Simple Questions to Improve Pseudo-Relevance Feedback Results access across analysis anick around asked automatically based buckley causing cikm collections combined croft cronen discussion document ecify effectiveness erformance erimental etter etters examining expansion feedback final framework future gains global gmap group groups harman identifiable identify identifying improve improvement improves indicates information instead interactive known language lavrenko local many models more observe only over overview pages patterns phrases potential provided queries query refered references refinement relevance reliable report results retrieval ruthven scores search selective sigir 
still study succeeded system systems table terminological terms than that their them this towards townsend translated trying useful usenet user using well were when while work workshop worse zhou http://doi.acm.org/10.1145/1148170.1148203 22 Context-Sensitive Semantic Smoothing for the Language Modeling Approach to Genomic IR advanced algorithm american annual applications applied approach april association based berger biomedical bremen bruza bunescu cikm cluster concept conference context croft dasfaa data database databases dempster development document ecir european expansion explorations extraction feedback fourteenth from genomic genomics germany grefenstette harabagiu hauptmann hersh hidden improve incomplete indexing inference information integrating international into issue jasist journal knowledge lacatusu lafferty laird language leek likelihood lists literature london management markov maximum methods miller minimization mining model modeling models mooney multidocument natural november overview ponte proceedings processing produce query references relationbased relationships research retrieval risk royal rubin schwartz science sensitive sigir sigkdd singapore smoothing society song special stage statistical study summarization syntactic system systems technology term text themes thirteenth title topic towards track translation trec using word zhai zhang zhou http://doi.acm.org/10.1145/1148170.1148302 106 Quantative Analysis of The Impact of Judging Inconsistency on The Performance of Relevance Feedback active adaptive approaches automatic buckley cbir consistent document editor erformance eriments evaluation experiments feedback french hall hard improving information jasis michel multimedia pages prentice processing references relevance retrieval rocchio saic salton shen smart system toward track trec uiuc university virginia zhai zhang http://doi.acm.org/10.1145/1148170.1148271 76 News to Go: Hierarchical Text Summarization for Mobile Devices access 
adams adaptive advances analysis annual applied april august automated automatic based brazil browsing budap budzikowska buyukkokten canada centroid chen comprehension condensing conference contents delivery demonstration desktop detecting development devices document documents eacl edition editors edmonton effects efficient elson enabled evaluation extraction factor financial flexible florida form fractal frayling from garcia generic hand handheld help hierarchical hilb hirschb hong http human hungary improve information interaction interface intrinsic irwin january jing july kaljuvee kareem kasp kong kutner language large layout limitations linear link madrid mani maybury mckeown milic mobile models molina morris multi multiple nenkova neter news newsinessence orlando osal otterbacher over paep page pages parts passonneau performance phonetop proceedings prop radev reading references research retrieval salvador schilit seattle seeking sentence session sigir small smartview software sommerer spain statistical structure studies summaries summarization symposium systems task technology text tois transactions trevor user using utility very viewing wang wasserman whole wide winograd workshop world yang york zhang http://doi.acm.org/10.1145/1148170.1148232 45 Load Balancing for Term-Distributed Parallel Retrieval analysis annual april architecture architectures august badue baeza barroso cacheda cahoon chile cluster computer conf dean development distributed ecir editor editors european evaluation files frei google harman holzle ieee index information inter inverted laguna lncs mcdonald mckinley micro national navarro neto nineteenth november ounis pages partitioned performance plachouras planet press proc processing query rafael references research retrieval ribeiro schauble search sigir society springer string sunderland switzerland symp tait terabyte text using verlag volume wilkinson yates york ziviani zurich http://doi.acm.org/10.1145/1148170.1148185 9 Music 
Structure based Vector Space Retrieval acoustic analysis annual approach audio based berenzweig categorization cavnar chai computer conf dafx davies deller detection discretetime document doraisamy downie duxburg ellis evaluating evaluation foote germany gram grams hamburg hansen hybrid ieee indexing information intelligent jcdl journal july large logan measures method music musical musicsimilarity nelson note onset polyphonic press proakis proc processing references retrieval robust sandler scale self sept sigir signals similarity simple speech structure subjective summer symposium systems text thumbnailing trenkle using vercoe visualizing whitman with http://doi.acm.org/10.1145/1148170.1148282 86 Question Classification with Log-Linear Models actually adding additional answer answering approach attempt barcelona builds clark classification classifier classifiers coarse cogcomp coling component computational conditioned conference curran current data defined described distribution empirical employed employing encode entropy feature feed first functions gerber here hierarchical hierarchy honours hovy http information initial integrated international junk kocik label labels language learning linear linguistics makes maximum meeting methods model models natural ninth only over page pages parsing part proceedings processing propose question ratnaparkhi references retrieval roth scheme schemes second section semantic sense spain speech stage standard sydney tagger text that thesis this types uiuc university upon using webclopedia while http://doi.acm.org/10.1145/1148170.1148308 112 Enhancing Topic Tracking with Temporal Information allan articles automatically baoli based broadcast choose conference corpus darpa detection event experiments extracted features final from information james juha makkonen mandarin method metric minimal news normalized pages part pilot proceedings profile proposed pyung references report retrieval semantics sigir simple study talip temporal 
test topic tracking transcription understanding usefulness weighted workshop http://doi.acm.org/10.1145/1148170.1148275 79 Inferring Document Relevance via Average Precision analyzing annual aslam august average companion conference details development entropy estimates evaluation experimental formed found incomplete inferred information international judgments maximum measures method obtained pages pavlu press proceedings qrels quality references research results retrieval shows sigir statistical system table these topic trec using were yilmaz http://doi.acm.org/10.1145/1148170.1148196 17 Building Bridges for Web Query Classification aaai according advances agglomerative algorithm annual augmenting automatic background bansaghi bartlett bayes beckwith beeferman beitzel berger bickel bridges butterworths categorization categorizing challenge chowdhury cikm classication classification classifiers classifying clustering comparative comparison comparisons conference content data database development discovery document edition editors engine event explor facing feature feedback fellbaum ferrety fourteenth frieder geographical gravano great gross grossman haider hatzivassiloglou icml informaion information international introduction jensen journal kang kardkovacs kddcup know knowledge kolcz labeled language large learning ledge lewis lexical lexicography lichtenstein likelihood line locality london machine machines management margin mccallum methods miller mining models naive newsl nigam outputs pages pedersen peng platt press probabilistic problem proceedings queries query references regularized report research retr retrieval scheffer schimpfky scholkopf schuurmans search second selection shen siemen sigir sigkdd sixth smola solution statistical study support text tikk training twelfth type unlabeled user using vector vogel wang winning with wordnet words workshop yang zhang zheng http://doi.acm.org/10.1145/1148170.1148345 148 Searching for Expertise using the Terrier 
Platform amore august author automating candidates case copyright documents enterprise experiments expert figure finding glasgow held house intern macdonald manage marriage maybury ounis owner plachouras presentation proc query ranked references related research results search seattle showing sigir stable technology terabyte terrier their this tracks trec university washington with http://doi.acm.org/10.1145/1148170.1148219 35 Minimal Test Collections for Retrieval Evaluation above achieved actual actually added adding after algorithm allan allow always american analysis annotation annotators anywhere appendix applicable arbitrary argmaxi association assume available average back base because been begin better both buckley cahan carterette case cases challenges cikm claim clarke clef collections commercial comparison comparisons complete conciseness conclude conclusion condition confidence construction contain contains continues contradiction cormack correct correlation cument cuments decreaj defined degree denoted depends difference differences document documentation documents edition effectiveness efficient effort engine ensures enumeration environments estimate estimates etween evaluate evaluating evaluation every exists expected expectes experiments fairly figure follows forming forthcoming four fourth from furthermore general gentleman given goodness greater griffin hand have here high hours hypothesis ihaka implemented imposing impossible incomplete increases incremental indices indistinguishable induction inequalities information instead instec interface intuition intuitively iteration joho jones journal judged judgments kcontradiction kendall keynote language large leads least left lemma lemmas likely lines little london make many maximize maximizes maximizing means measure measurement methods minimal minimize more must need next nicholas nine nonrelevant note notebook number obtained oling only optimality order ordering orderings other otherwise overview 
page pages palmer paper papers part particular perspective philosophy pick place placed pooling possible precision present proceedings produced proof prove proves rank ranking rankings ranks ready reasoning reciprocal references relative relevance reliability reliable reliably remains remove replacing restatement results retrieval reused revised right rijsbergen robust running same sanderson scale search second section select selected selecting sensitivity sets shall should show showed shown side sigir similar simple singhal size soboroff some sparck spot springer stability stated states statistical statistically statistics stephens stopping such sufficient suggests suppose swapped symposium system systems talk technical test than that then theorem there therefore they this time track trec trivial true tually used using value variations variety verlag violate voorhees want weight when which while will wish with without workshop would ypothesis zobel http://doi.acm.org/10.1145/1148170.1148189 12 Topical Link Analysis for Web Search acknowledgments baoning based datasets ecific foundation grant material national orted providing querysp references science supp thank this twenty under work http://doi.acm.org/10.1145/1148170.1148222 37 Finding Near-Duplicate Web Pages: A Large-Scale Evaluation of Algorithms accurate achieved actually additionally affects alfredo algorithm algorithms alone also american amount annual another applications apply applying approach approximately brin broder building caused center chances changes charikar clustering clusters combined commerce communication communications computed computer computing conclusions conference congress consecutive copy correct could cutoff data databases davis dean deserve design detecting detection difference different digital document documents duplicate duplicates duplication during editors either electronic erform erformance erformed erior estimation etter evaluation evolution example explore fetterly file 
files filtering finding fingerprinting following formed fraction frequency from further garcia generate ghemawat glassman harvard heintze help hoad however identified identifying implementation implemented improve improvement incorrect incorrectly increase individual information instead interesting international jork journal june large latin less level libraries management manasse manb mapreduce march mechanism mechanisms method methods might molina mostly much near nearduplicates neither note numb ocelli oilerplate olynomials only oorly operating otentially pages pairs phrase plagiarised practice precision presented private probably proc processing rabin random recall reducing references renato replicas research results returned returning returns rounding running sacrificing same santis scalable scam science security sequence sequences shingle shingles shivakumar show shows sigir sigmod similar similarity simplified since site sites slower society some space stricter strings study symposium syntactic system table technical techniques technology testing text than that theory these this thus tokens undecided university usenix using vaccaro value versioned weighting well whether while wide winter with without work workshop world would zweig http://doi.acm.org/10.1145/1148170.1148304 108 Authorship Attribution with Thousands of Candidate Authors analysis anderson applications applied approaches attribution augumenting author authorship automatic barbara bayes blind buckley citations classifier computational computer corney data diederich double email exploiting explorations forensics hill identification idiosyncrasies ijcai information intelligence jajodia kindermann kluwer koppel language leopold machines management mining models mohay myth naive only paass peng proceedings processing provost references retrieval review salton schler schuurmans security sigkdd statistical style stylistic support synthesis term text using vector wang weighting with workshop 
http://doi.acm.org/10.1145/1148170.1148247 58 Semantic Search via XML Fragments: A High-Precision Approach to IR aaai academic advances analysis analytics anlp annoation annotation answering architecture bell bikel broder brown bruce candidate carmel carroll coden cohen computational conf conference control coop cornell customizable czuba dang derived description disambiguate document documents domain donnell eacl emnlp ench engine engineering entities erformance evaluation extraction facts february ferrucci finder fragment fragments framework from fuhr gathering good grosjohann grosso guha hatzivassilogou heflin hendler high http identifying improve indexing inex informatics informatik information infoxtract inokuchi integrating intel interchange intermediate iyer jective journal kanza katz kazai kelledy kluwer knowledge lalmas language learning levas level life ligence linguistic linguistics maarek mack mamou management mandelbrod martin mass matsuzawa mccool metrics metricsv mihalcea miller moldovan mukherjea murdock name named natural nymble olarity open opinion opinions overview passage practice prager precision predictive proc proceedings publishers query question questions radev recomendation references relations retrieval retrieving riao sagiv sanderson schwartz science search searching selectively semantic sense senses sentences separating shoe sigir smeaton soffer srihari structures studies subramaniam syntax synthesis system systems text theory tiedemann towards track trec uniduisburg unstructured uramoto using veillard vldb voorhees weischedel wieb wilson with word wordnet workb workshop xirql xsearch http://doi.acm.org/10.1145/1148170.1148318 122 Fact-Focused Novelty Detection: a Feasibility Study agree agreed agreement allan among annual assessing august based being brazil calculate carletta carterette classification columbia computer conference considered control department enough fact first found gaithersburg good harman identify information 
interjudge judge judges judgments kappa labeled learning lewis nist novelty order other overview proceedings prop references relevance relevant retrieval salvador schiffman science seen sentences sets shown sigir soboroff statistic status table tasks text that thesis three topic track trec twelfth union university when will words http://doi.acm.org/10.1145/1148170.1148299 103 Comparing Two Blind Relevance Feedback Techniques clarit conference design evaluation evans feedback harman lefferts proceedings references relevance retrieval revisited second sigir system text trec http://doi.acm.org/10.1145/1148170.1148347 150 Supporting Semantic Visual Feature Browsing in Content-based Video Retrieval access august author based browsing copyright features field figure geisler gruss heesch held howarth hughes input interface magalhaes main marchionini owner part performance pickering proceedings provided query references retrieval ruger satisfaction search seattle sigir system text traditional transcripts trecvid user using utilized versus video videos washington were wildemuth yang yavlinsky http://doi.acm.org/10.1145/1148170.1148204 23 LDA-Based Document Models for Ad-hoc Retrieval activity advances allocation american analysis azzopardi based bayesian berger between blei budapest cambridge chains chinese conference data deerwester development dirichlet discovery distributions dumais equivalence furnas geman gibbs girolami griffiths harshman hierarchical hungary ieee images indexing information intelligence international joint jordan journal kaban knowledge lafferty landauer language latent learning machine markov mining models nested networks neural pattern plsi press proceedings process processing profiling references relaxation research restaurant restoration retrieval rijsbergen science semantic sequential sigir society statistical stochastic systems tenenbaum topic transactions translation http://doi.acm.org/10.1145/1148170.1148233 46 Hybrid Index Maintenance for 
Growing Text Collections accumulating addison american analysis approach april august australasian australia behavior bell bremen brook build buttcher cambridge chiueh cikm citations clarke complex compressed compression computer conf conference constant construction cutting darlinghurst data database databases dates development document dynamic ecir efficient effort euler eulermascheroniconstant european evaluation fast files from garc generation geometric germany hash heinz html http huang human hybrid incremental index indexes indexing information inverted journal july know least ledge lester letters line lists london maintenance management mandelbrot mascheroni mathworld memory merge moffat molina novemb offs online optimization pages partitioning pass pedersen performance place principle proceedings processing query ramamohanarao real references research retrieval scholer science shoens sigir sigmod signature silagadze single situ society stony strategies suny synthetic systems tables technical technology text time tomasic trade trans versus vocabularies weisstein wesley williams wolfram workload yiannis york zipf http://doi.acm.org/10.1145/1148170.1148262 69 Statistical Precision of Information Retrieval Evaluation analysis athens australia blustein bootstrap brazil british buckley cambridge chapman clarke collections conference construction controversy cormack corpus data development effect effectiveness efficient effort efron epidemiology eriment erimentation eriments error estimation etween evaluating evaluation experiment finland fisher fixed gaithersburg gene glass greece greenland hall harman html http hull incomplete inference information introduction johno journal judgements large lenhard lippincott management measure measurement melb meta models modern neyman ooling opulation ourne over overview palmer particular pearson philosophical philosophy pragmatics press proceedings processing references relevance reliability reliable research results 
retrieval revisited robust rothman salvador sanderson savoy scale science sensitivity sheffield sigir sixth size society stability statistical sutcliffe system tacitly tague tamp test testing text theory third topic topics track trec tsibirani using variations voorhees which wilkins williams with york zobel http://doi.acm.org/10.1145/1148170.1148277 81 Adaptive Query-Based Sampling For Distributed IR approach april based callan classics classification connell database databases decision decisions degroot duda fuhr hart interscience library networked optimal pattern publication query references sampling selection statistical stork syst text theoretic trans wiley http://doi.acm.org/10.1145/1148170.1148294 98 A New Web Page Summarization Method algorithms applied artificial australia barcelona based brin bringing callan carbonell centrality cikm citation clair creating diversitybased document documents erkan evaluating extract extraction goldstein graph http intelligence interactive jair journal july lada lexrank mihalcea mittal motwani multi news newsinessence order page pagerank proc producing radev ranking references reordering report reranking research salience sentence sigir source spain stanford summaries summarization technical text umich university winograd http://doi.acm.org/10.1145/1148170.1148279 83 Examining Assessor Attributes at HARD 2005 appear assessing baillie belkin chaleva cole confidence crestani documentation effects elsweiler emerald experiences experiments gordon hard harper interest journal knowledge koychev landoni muresan nicol references relative relevance robert rutgers ruthven smith strathclyde sweeney track trec university wettschereck wiratunga yakici yuan zhang http://doi.acm.org/10.1145/1148170.1148241 53 Document Clustering with Prior Knowledge adaptive advances aided algorithm alto analysis analyzing applicability applications applied approach artificial background banerjee based basu batch bennett bias biometrics browsing cardie 
categorization chan chen classification cluster clustering cognitive collections comparisons computations computer computing conceptual conf conference constrained constraints corpus cument cuts cutting data demiriz design development ding discovery document effectiveness eichmann examination february fields filtering flynn francisco freitas from functions gaussian ghahramani ghani given golub graph grouping harmonic hartigan hopkins http icml ieee image individual inference information instance intel internation international jain joachim joachims john jose kaestner karger kaufmann know knowledge labelled lafferty large learning ledge level ligence loan london loquium machine machines malik management matrix means metaclustering methods michalski ming mining mitchell mooney morgan murty neto neural newsgroups nigam ninth nist normalized observation organizing pages palo partial partitioning pattern pederson percentage practical press proc proceedings processing publishing ranking ratio references relaxation research restrictive results retrieval review rogers role ruiz santos schlag schroedl science search seeding segmentation self semi september septermber siersdorfer sigir simon sixth sizov spectral srinivasan statistics stepp summarization supervised support surveys svmlight sydney systems table text tioga training transaction transactions transductive tsvm tukey twentieth unlabeled using value vector wagstaff wilcoxon with wong yang zeng zien http://doi.acm.org/10.1145/1148170.1148333 137 A Graph-based Framework for Relation Propagation and Its Application to Multi-Label Learning apply binary categories category collection computation compute denote directed document documents each element elong energy equation ervised fields framework function functions gaussian ghahramani graphs harmonic hofmann html http icml imageclef infinite lafferty learning lkopf minimize multi networks neural nips onds otherwise outputs preference proc processes propagation references 
relation resp semi shef simi simii solution stand straightforward that then using vector when where will williams with zhou http://doi.acm.org/10.1145/1148170.1148191 13 The Role of Knowledge in Conceptual Retrieval: A Study in the Domain of Clinical Medicine aaai access advances amati american amia analysis annals anomalous answered answering approach approaches arist aronson asked association automatic based basis beaulieu being belkin bergus bhupatira biomedical buckley built cacm canadian care carroll chambliss chua churchill clarke clinical club cogdill cognitive cohen coling combining comparisons computational conley cornell covell croft czuba databases decisions demner dependence dependencies dependency derived description disambiguate disambiguation divergence doctors document domain domains donnell dorsch ebell edition effective engineering error evans evidence expansion experiments extraction fagan family ferrucci field final first fiszman forum freund from fushman gaizauskas gatford genomics genre gorman groote hancock harabagiu harman haynes hayward hearst here hersh hirschman humphreys hypernymic indexing inference informatics information infrastructure ingwersen interaction internal interpreting investment issues jones journal journals kando kelledy knowledge language large lege lenat leong letin levy lexical library lindberg linguistic linguistics literature livingstone management manning mapping markov mccray measuring medical medicine metamap metathesaurus methods metzler mihalcea model modeling models moldovan moore multi narayanan natural needs network nishikawa office okapi online open osheroff overview passage patent patient patterns performance phrase physicians ponte practice prager preliminary press primary probabilistic proc processing program propositions query question questions random randomness recent references regarding relations relationships reliable report resource responses restricted results retrieval richardson rijsbergen 
rindflesch roberts robertson root rosenberg sackett salton sanderson scale scenario science second selection semantic sense senses sigir smeaton source space states statistical strategy straus structure structures students surdeanu syntactic syntax system tagging task teach techniques term text they tois toms track trec uman umls unified using vector view voorhees walker well welty wilson with word wordnet workplace workshop wykoff yang year http://doi.acm.org/10.1145/1148170.1148303 107 Swordfish: An Unsupervised Ngram Based Approach to Morphological Analysis addison analysis approach arrays baeza based challenge church comput compute conclusions constructed corp corpus creutz document eiro ervised extracted frequency from goldsmith helsinki induction information into lagus language learning lexicon linguist modern morfessor morpheme morphemes morphochallenge morphological morphology natural neto ngram ossible probabilities purely recursively references retrieval segmentation splits substrings such suffix swordfish tech technology term text university unsup using wesley where words yamamoto yates http://doi.acm.org/10.1145/1148170.1148180 5 Thread Detection in Dynamic Text Message Streams addison algorithms allan analysis anisotropic annual application arizona automatic based bengel bett bingham boston brill broadcast brower butterworths candan carbonell categorization chat chatroom chattrack classification clustering combining comparative comparison complexity comput computer conference consulting conversations darpa data department detection development diffusion document doddington domain dynamic dynamical edition editor effective elnahrawy engineering entities event evolution features finke fisher functions gartner gauch girolami gravano hatzivassiloglou hierarchies http iasted identification ieee independent indexing informaion informatics information instmessaging intel interactions international investigation islands june kaban khan know kumar kumaran 
language learning ledge lehigh lett ligence lind line linguistic london longman maganti marketplace meeting meetings message methodologies mining minnesota mittur model models monitoring named natural navigation nderler networks neural news newsgroup november page pages pierce pilot pottenger probabilistic proceedings process processing programming progress publishing pursuit ranking references report research retrieval retrospective room salton science search second security segmentation semantic services sharing shuler sigir smeaton smith social society steinbach stiefelhagen streams study summarizing support symposium syntactic syst tasks technical techniques text thomas tirri topic topical tracking transcription transformation tucson tuulos understanding university using vijayaraghavan virgin visualise waibel washington wesley wide workshop world yamron yang http://doi.acm.org/10.1145/1148170.1148215 32 Identifying Comparative Sentences in Text Documents aaai access adjective advances agrawal amazon apfa assoc attribute ayres based bitmap boards brill burges caaw carenini chen classification coling comparative comparatives coverage customer database dave direction doran edition editor effects egedi electronic elsevier enclycopedia english erlbaum evaluative extracting extraction fellbaum flannick from gallery gehrke gradability grammar hatzivassiloglou hearst hockey icde ickc iida information intelligent interpretation inui jacobs jindal joachims kennedy kernel knowledge kobayashi language large lawrence learning lexical linguistics lkopf making market matsumoto message methods mining opinion orientation part pattern patterns peanut pennock practical press product references refinement relations representation reviews rule scale second semantic semantics sentence sentences sentiment sequential simple smola speech srikant srinivas stock subject subjectivity summarizing support system systems tagger text using value vector wide wiebe wordnet xtag yahoo zaidel 
zwart http://doi.acm.org/10.1145/1148170.1148211 29 Answering Complex Questions with Random Walk Models aaai aarseth abductive acquisition annotated annual answer answering answers association automated bank based bases bensley bootstrapping boston bowden brill canada case clark coling collins combining communications computational computing conference content contexts corpus database decomposition development document driven emnlp empirical employing english erkan error evaluating evaluation experiments extraction faqfinder fifth focused fourteenth from geneva germany gildea gistexter harabagiu head hickl high hovy human impact incremental information interactive international kingsbury knowledge lacatusu lafferty language learning lehmann lexical lexicons linguistics lite lrec lytinen machinery match meeting method methods miller minimization mining models moldovan naacl narayanan natural nenkova otterbacher pages palmer papers parsing part pasca passonneau pattern pennsylvania performance proceedings processing proposition pyramid quality query question questions radev random reasoning references representations research resources responsiveness retrieval riloff risk roles saarbrucken selection semantic sentence sigir signatures speech spring statistical structures study summaries summarization switzerland symposium systems tagging taylor techniques technology text texts thelen thesis tomuro topic transformation trec twelfth types understanding university using vancouver walks wang williams with wordnet workshop zhai http://doi.acm.org/10.1145/1148170.1148184 8 Towards Efficient Automated Singer Identification in Large Music Databases accuracy acoustics adaboost addition against also amounts amplification analysis annual application applications approaches artist audio automated automatic available bartsch based because becchetti been berenzweig better boosting bregman calculate capability carried chang chih chung cikm cjlin classification clearly collins colt 
communications comparative compared comparison competitors computationally computer conditions conf conference constrained contain content continuous could cropping csie dafx data database databases dataset deamplification decision delay demo demonstrate detection development different digital discussion distances distorted distortion distortions distribution during each echo effects efficient electronic elements ellis emerges entertainment envelop equation estimation evaluation even examples experimental experiments expo exponential extremely figure flake framework francois freund friedman from function fundamentals gaussian generalzation genre hall hastie however http huang human icde icme identification identify ieee illinios important improve inference information instrument international items john jordan journal juang karjalainen kinds knowledge lafferty language laplace large lawrence learning lebanon length levels library libsvm likelihood limitations line livshin locating logistic machines maddage management maximum median methods michael mining minnowmatch mitsunori model modeling moderate modern modified most multimedia multipitch music musical networks neural nips noise noises nondistorted novel objects obtained often ogihara omit other pachet page pages paper perception performance performs pink popular power prediction prentice presence press previous probabilities proc processing property query rabiner ratio real recognition recordings references regression report require research results retrieval ricotti robert robust robustness rodet same schapire schemes science sciences section segments series shao shen shepherd shows sigir signal signals similar singer singing software solo song sons sound space spectral speech springer statistical study summaries summarization sundberg superior support synthetic system systems technical technique techniques terms test tests than that them theoretic theory this three tian tibshirani tolonen trans trend tsai 
tutorial types typically under university used using vapnik various vary vector verlag virtual vocal voice volume wakefield wang where white whitman wiley with within workshop world yoram zhang http://doi.acm.org/10.1145/1148170.1148315 119 Action Modeling: Language Models That Predict Query Behavior activities analysis annual august automated baseline boulder brazil bremen changes cikm college colorado conference dumais edit extensible from germany gram horvitz icslp implicit information interests international knowledge language length libraries longer management measure modeling move negotiation october overall pennanen personalized personalizing predictions press prev proceedings processing proposal question recall references remove repeat research return salvador same search seeking serola shen shorter sigir significant spoken srilm stolcke table tactics taylor teevan terms toolkit user vakkari while writing york zhai http://doi.acm.org/10.1145/1148170.1148224 39 Building Implicit Links from Content for Forum Search achieved advances algorithm algorithms also always analysis anatomy annual appendix artificial august authoritative baeza baker based basic because bergmark best bias bipartite block boldi brazil brill brin bringing build built cambridge canada castillo categorization chen chiba citation classification classifying clear clustering column combination combine combining comparative compute computing conclusion conference considered construct content continue crawls current damping data development dhillon diag different digital diligenti dimensional dirichlet discovery discriminant distributional documentation documents domingos dropped dubes dumais dynamics easily effective engine enhanced environment equals european evaluate eventually experimental explicit exploiting explore factor feature fgrank filter fine focused follow forums from function functions further generation give given golub google gori grained graph hall haveliwala hierarchical 
hierarchically hierarchy horizontal http hyperlinked hyperlinks hypertextual idea identifying impact implicit important increase information intel intelligent international interpreted jain japan jean jection jects jordan journal jump kamvar kingdom kleinberg know koller kumar lagoze lang large learning ledge libraries ligence likely lind linear link links lisbon machine maggini mallela manning mccallum mining model modified more most motwani named navigations netnews neural news newsgroup nodes notions number ogihara okapi operations order otherwise overcome overview page pagerank pages paper partitioning pedersen performance personalized portugal prentice presented press probabilistic problem proc processing proof proposed quality random rank ranking recommendations references relevant replace report research retrieval rewrite richardson robertson sahami saint salvador santini sbityakov scale scaling scoring search selection sensitive september shakery sheffield shown sigir sigkdd simplify sink sources spectral spire stanford still structure study surfer systems takes technical text than that their them then this through topic toronto tunneling uncertainty uniform united university useful using vaithyanathan values vector vertical very vigna volume wang weeder weight weiss when where wide widom winograd with without word words world yahoo yang yates zeng http://doi.acm.org/10.1145/1148170.1148235 48 Pruned Query Evaluation Using Pre-Computed Impacts annual august baeza brazil conference croft development early editors effective harper information international kraft kretser louisiana marchionini moffat orleans pages press proc ranking ranks references research retrieval salvador scoring september sigir similarity simplified space tait term termination using vector with yates york ziviani zobel http://doi.acm.org/10.1145/1148170.1148331 135 Searching the Web Using Composed Pages according agrawal among average banks based behind bhalotia brin bringing browsing 
candan chakrabarti chosen cikm citation compare compared comparing contain databases document edge exhaustive expanding expansion fagin figure focused fragment graph heuristic high hristidis hulgeri hyperlinks icde imply information intuition inverse keyword keywords kumar likely lists metric modified more motwani nakhe optimal order organizing page pagerank pages quality query random ranking ravi references relevant report results retrieving ronald scores search searching short shows similar since sivakumar soda spearman specific stanford structure sudarshan summarization technical text that unit university used using varadarajan weighted weights winograd with http://doi.acm.org/10.1145/1148170.1148284 88 Bias and the Limits of Pooling aslam august buckley charles chris christopher clarke classified collections construction cormack data efficient efficiently emine erformance eriments estimating feedback gerard gordon icml incomplete information javed judgments justin large learning measures optimization pages palmer partial pavlu proceedings proceesings query references relevance reliable results retrieval salton sampling scale sigir technique test training using virgiliu weights with workshop yilmaz http://doi.acm.org/10.1145/1148170.1148193 15 Semantic Term Matching in Axiomatic Approaches to Information Retrieval american analysis answering applications applied approach asian automatic automatically based buckley butterworths cchio ccurence ccurrence chains cikm computational computer concept conference constructed croft cross cument data deling dels development disambiguation divergence document effective effects engine expansion experiments feedback fifth fourteenth frei global hall hill improving indexing information international introduction journal know kuhns lafferty language languages ledge lexical limitations linguistics maeda management mandala maron mcgill mcgraw meng metho mitra modern moldovan novischi okumura orhees othing overview pages peat 
pedersen phrases ponte prentice probabilistic proceedings processing query question recognizing references relations relevance research retrieval rijsbergen robust sadat salton satoh schutze science search semantic sept seventh sigir singhal smart smeaton society study system systems tanaka tenth term text thesauri thesaurus thirteenth tokunaga track trec uemura using utilizing willett with wordnet workshop yoshikawa zhai http://doi.acm.org/10.1145/1148170.1148312 116 Combining Fields in Known-Item Email Search enterprise eriments glasgow macdonald ounis plachouras proceedings references terabyte terrier tracks trec university with http://doi.acm.org/10.1145/1148170.1148290 94 Automated Performance Assessment in Interactive QA answering burger conference definition document http index information international issues joachims july learning machine madison mallet nist proceedings program question references research roadmap similarity structures svmlight tasks theoretic umass waikato weka wisconsin http://doi.acm.org/10.1145/1148170.1148309 113 A Comparative Study of the Effect of Search Feature Design on User Experience in Digital Libraries (DLs) accomplish affected among annual apparently based belkin better canadian card case cases collections communications construction cool design designing digital effort except feature features feedback great halvorsen harley hartson hearst higher hits however ieee ieeecs information inspection interaction interactive interface interfaces international journal kelly kengeri knowledge length libraries library likert lower mackinlay made many masinter molberg most muresan ncstrl ndltd online organization other participants pearson pedersen perez plainsant point poor proceedings query quinones rating ratings reddy references refinement retrieval returned review rich robertson satisfaction scales seals search shenierderman shiri shivakumar shows sigir significances statement statements statistical study support system systems 
table tang task than there three toronto usability user users were with xplore york yuan zero http://doi.acm.org/10.1145/1148170.1148340 143 Appraisal Navigator appraisal appraisalnavigator argamon bremen casey classification conference garg germany groups http information know ledge lingcog management navendu proc references sentiment shlomo using whitelaw http://doi.acm.org/10.1145/1148170.1148250 60 Find-Similar: Similarity Browsing as a Search Tool aalbersberg abstract access accurate adaptive addison advantages agent allan annual applications applied approach approaches articles artificial assists automatic automatically based beaulieu behavior belkin biased boost breadth bringing browsing buckley campbell case catalogue categorizing chen ciir cikm ckner cluster clustering coffee cohen combining complex comput construction cornell croft dang database dependencies depth developing diaz directed document documentation documents draft dumais ecir editor effectiveness effects eguchi electronic empirical engine engineering enhancement eval evaluation excite existing expanded expansion extended factors feedback field fieldhouse first forum from gather glasgow graph graphical hall hancock harman hearst hierarchic html http human hypertext hypothesis ijcai implicit incremental incrementally indri inference information integrating intel intelligent interact interaction interactive interface internet iterative iwayama jameson jansen jasist jose journal judgments know koenemann krovetz lafferty lalmas language lavrenko ledge lemur lemurproject letizia leuski library lieberman ligence lists lumsdaine machine magazine markov methods metzler mixtures model modeling models morphology needs neighboring network networking nist notebook number olston online order ostensive overview ozmultu page pages pedersen people policy ponte poster prentice press proc process processing public pubmed queries query random reexamining references reformulation reinforcement related relevance 
report research result results retrieval review revisited rijsbergen robust rocchio ruthven salton sanderson saracevic scatter scenttrails schmitt search searching siek sigir simulated small smart smoothing spink strohman studies study summaries summarization support survey system systems technical term text that their thesis thompson tois tombros toolkit tools track trans trec turtle tutorial umass university user users using version viewing voorhees want wesley what white wilbur wirschum with wolfram york zhai http://doi.acm.org/10.1145/1148170.1148208 26 Combining Bidirectional Translation and Synonymy for Cross-Language Information Retrieval adam aligned alignment analysis anne annual approach approaches asian association august based bilingual boughanem braschler bruce cambridge chiang chinese chrisment christof cjke clef clir combination combined computational computer conference corpus coverage croft cross darwish david development dictionary directional disambiguation distinguishing documents douglas draft ecir effective effects emnlp engineering english ertunc europarl evaluating evaluation experiments exploiting extensions february forum franz funda gina gmbh hermann hiero hoon html http hyeok improved index indexing informatics information institute institutes international investigation jacques jianqiang jinxi jones jong josef july kang kareem karen koehn korean kraaij kwok laboratory language languages lavrenko levow lexical lingual linguistics lopez machine madnani martin maryland matching mccarley meaning methods michael modeling models monolingual monz multilingual nassr national natural nineth nist nitin notes november ntcir oard october pages philip philipp pirkola postech press probabilistic proceedings processing proven queries query ralph references related relevance report research resnik retrieval robertson savoy scott second sense senses setups seung should sigir simple sparck springer standards statistical strategies structure structured 
subotin system systems tcir technology template text thesis translate translation trec twente university unpublished variations verlag victor wang weischedel wessel with word wordlist working workshop yarowsky http://doi.acm.org/10.1145/1148170.1148289 93 Learning a Ranking from Pairwise Preferences algorithm american approach aslam automatic available bartell belew borda college combination conclusion contradicts cottrell count data decoste dels dified documents empirically enhanced estimation example exclude fast finite flexible formal given independent information journal keerthi know large learn learning likeliho linear machine making manmatha margins maximum mease metasearch metho models montague more multiple newton normalization otball other outperform pages penalized preference proceedings ranked ranking rankings references relevance research retrieval scale score scores searches sever shaw sigir solution some statistical statistician svms systems teams that them they training trec victory voting when with http://doi.acm.org/10.1145/1148170.1148179 4 Contextual Search and Name Disambiguation in Email Using Graphs activation acts adaptive adding additional aery airoldi algorithms analysis apccm applying archives artificial associative authority automatic back balmin based bekkerman berger between boosting brin bringing buckley callan carley carvalho chakravarthy chen chute cikm citation classification cleaning clustering cohen collective college collins company comparison computational computer confidence content context corpus croft data databases dataset dates department dependency diehl diligenti directed disambiguation discriminative distance distributional distributions dittenbach domain ecml edges electronic email emails emailsift emnlp enron entity error example expansion exploiting extracting extraction fields fienberg from functions future gaussian getoor ghahramani global gori graphs harmonic haykin hearst hofmann hristidis hyperlinks icdm icml 
iiweb ijcai improved incorporate independent induced inducing informal information integration intel interactions into jair ject jectrank jects joins journal kalashnikov keyword klimt knowles kurland lafferty language learning level lewis ligence linguistics links lists machine macmillan maggini mail malin management manning mapping matching mathematical matrix mccallum measure measuring mehrotra merkl method methods metrics minkov mitra model modeling models motwani multi namata name named names natural network networks neural nips nodes order organization organizational page pagerank pairwise papakonstantinou paragraph parsing passages perceptron personal plan predictions preliminary processing propagation publishing query random ranking rated ravikumar recognition reference references relationship report representation reranking research resolution retrieval salton schapire scholkopf science scores search segmenting semi siam sigir sigkdd simfusion similarity simrank singer singhal social speech spreading springer stanford string structural structure structuring study subtopic summarization supervised system systems tasks technical text texttiling theory things thompson threading time toutanova transactions unified university using vldb voted walk wang widom winograd without word work yang yaniv zhang zhou zhuang http://doi.acm.org/10.1145/1148170.1148187 10 AggregateRank: Bringing Order to Web Sites acad analysis artificial asia automatic biological bootstrapping brin bringing challenges citation community concept conference craig data dataintensive davulcu despeyroux digital distillation documents domain domingue dong eiron engines enrico feng france from frontier getoor girvan guang hasan henzinger information intelligence international island jeju joint june knoblock korea kristina latvia lerman libraries lise management mccurley minton modeling modelling motta motwani nagarajan natl networks newman novel october ontologies ontominer ontoweaver order 
overlapping page pagerank pages paris practical proc proceedings ranking references report retrieval riga saravanakumar search segmentation semantic silverstein site sites social specific srinivas stanford steven structure subsite symposium systems tables technical thirteenth tomlin topic using vadrevu weiying wide winograd with workshop world york yuangui zhang http://doi.acm.org/10.1145/1148170.1148311 115 One-Sided Measures for Evaluating Ranked Retrieval Effectiveness with Spontaneous Conversational Speech access actual again approximate archives arons assessments chose constant cutoff digital discard distribution document effectiveness either emitted evaluation experiment finding garofolo graded ground gtidi gustman here history inen interactively jasist jcdl judgments large list measurement modeled multiple nearly number occupied onset oral otherwise output place placed placing point points position predict probability product randomly rank recorded references relevance reported retrieval select should sigir since skimming some speech speechskimmer spoken start state story success supporting system that there therefore tochi track trec truth twice under using variations voorhees where which will would zipf http://doi.acm.org/10.1145/1148170.1148257 65 Unifying User-based and Item-based Collaborative Filtering Approaches by Similarity Fusion again algorithm algorithmic algorithms alleviate amazon anal analysis appendix application applying approach approaches architecture aspects associative automatic average back based bergstrom borchers both breese canny case cases chai chen classifiers cluster collaborative combine combining comparison computing conclusions conditional conference constant cscw depend derived deshpande diagnosis different dimensionality does duin each ecir eigentaste either empirical erformance etter factor factorization fast filtering first flexible following framework from function furthermore fusion giles given goldberg grouplens gupta 
hatef heckerman here herlocker hiemstra hofmann horvitz huang hybrid iacovou icml ieee importance individual info information intel internet interpolation item items journal july kadie karypis kittler konstan language latent lawrence limitations linden linear london mach margin matas matrix maximum mean means memory mixture model modeling models more netnews normalization normalize normalized normalizes novel obtain only open order orted other overcome pages pattern pennock performing perkins personality prediction predictive predictor predictors privacy probabilistic probabilities problem proc proposed query rating ratings recommendation recommendations recommender reduction references reinders relevance rennie replacing resnick result results retrieval riedl roeder sarwar scalable scbpcc scheme semantic showed sigir similar smaller smith smoothing sources sparsity special specific specifically srebro study subtracting suchak suir syst system table techniques term test that them three time towards training trans treated treating unified unify unifying user users using value vries wang webkdd weighting weights where whether with workshop yang york zeng http://doi.acm.org/10.1145/1148170.1148288 92 First Large-Scale Information Retrieval Experiments on Turkish Texts advantage after algorithm also altintas among analysis appear approach approaches automatic before bell best better between bitirim bpref buckley change chooseone cicling comparison comparisons compressing computing conf content definite difference documents effective effects especially evaluation explained extremes fashion final findstem francisco from function functions gigabytes good graphs hafer have images improvement include incomplete indexing inexact infor information insignificant iscis kaufmann language last letter linguistic literary lncs losers managing matching method methods mexico moffat morgan most much only order others parallel patton performer point practical precision prefix problem 
process purpose quantification recall references representative respect resubmitted result retr retrieval salton segmentation separated sever shows sigir similar simple slightly solak solving stemming stor streamline successor table takes term terms text than that these this time tonta translations truncation turkish using value varieties voorhees weighting weiss with witten word worse http://doi.acm.org/10.1145/1148170.1148313 117 Improving QA Retrieval using Document Priors answering charniak chua conference dang dependency digest entropy fourteenth haircut hidden hopkins http information inspired johns leek markov maximum mayfield mcnamee meeting miller models naacl nist notebook overview parser passage proceedings question references relations retrieval schwartz seattle seventh sigir site system technical text track trec using voorhees washington http://doi.acm.org/10.1145/1148170.1148263 70 A Statistical Method for System Evaluation Using Incomplete Judgments american analysis anderson annual aslam atasoy august australia authors baeza based brooks buckley carlo clarke classified cole collections comparisons conference construction contribute copyright cormack croft crossvalidation data depth depths development directly documents ecause editor editors efficient efficiently enormous equivalent erformance ergen eriments error estimates estimating evaluation figure frieder from gaithersburg generalization genetics good government hammer happ harman held ibraev icml incomplete increases indicated information international judged judgments kantor know large learning lecture ledge management marchionini mathematical measure measures meeting melb metasearch methods model moffat monte note notes novemb numb octob office only ooling ortance ortant ourne over overview pages palmer partial pavlu percentage plot pool pooling press printing proceedings quershi query references relevant reliable research results retreival retrieval rice rijsb runs sample sampled sampling 
savell scale science seligman seventh sigir size sociaty statistical statistics system systems tait technique test testing text that their they third train training trec trecs trend tthe twelfth unified using virtually volume voorhees wadsworth washington wilkinson with workshop yates yilmaz york ziviani http://doi.acm.org/10.1145/1148170.1148230 44 Distributed Query Sampling: A Quality-Conscious Approach aaai access addison agichtein algorithms automatic baeza bailey based callan cohen collections comparing connell craswell croft database databases digital discovery distributed effects french gravano hawking inference info information internet ipeirotis language learning libraries modeling models modern neto networks performance press query references report retrieval ribeiro sampling searching selection server sigir sigmod singer systems technical text webdb wesley wide with workshop world yates http://doi.acm.org/10.1145/1148170.1148220 36 Dynamic Test Collections: Measuring Search Effectiveness on the Live Web allan amsterdam andrei andrew annual april august australia bernstein bremen brewington brian broder buckley carterette catherine chowdhury chris cikm clustering collection collections computer conference crawler databases david decay derivative detection development document documents duplicate dynamic eleventh ellen enko evaluation evolution fast fourteenth frieder garcia geoffrey george germany glassman gloria grossman hector identifying implications incomplete incremental information international isdn italy james july junghoo justin know kumar large ledge management manasse mark mary mccab melb molina netherlands networks novemb octob ophir ourne padova pages press proceedings processing ravi references research retrieval scalable septemb sheffield sigir spire statistics steven string symposium syntactic system systems telae test thirteenth tomkins towards transactions transit understading very vldb voorhees wide with world yaniv york yossef zweig 
http://doi.acm.org/10.1145/1148170.1148181 6 Formal Models for Expert Finding in Enterprise Corpora aaai ackerman agent amore amplified analysis annual applied approach architecture artificial australasian ausweb beyond boston browser business campbell challenges cikm collection commerce communication communications communities community computer computing conference cooperative cozzi craswell croft cscw csiro curry cybernetics database davenport demoir design detection development documents domain ecscw effort electronic email engineering engineers enterprise evaluation expert expertise experts fifteenth finding flexible harvard hattori hawking herbsleb hertzum hidden hiemstra html http icse identification identifying ieee implementation inductive information intel international journal just kautz know kobsa lafferty language ledge leek ligence maglio manage management managing markov maron matsubara mcdonald methods microsoft milewski miller mockus model modeling models multiagent national network nickcr noptic notebook ohguro organizational organizations overview pages pejtersen people ponte practices press problem proceedings process projects prusak pubs quantitative recommendation recommender references reliability research retrieval sanderson school schwartz search searching seeking seid selman sensitivity sigir smoothing soboroff socialware software study summary supported supporting system systems test theory thesis they thirteenth thompson track transaction trec twelfth twente university users using vercoustre vries well what wiki wilkins work working workshop yimam yokoo york yoshida zhai zobel http://doi.acm.org/10.1145/1148170.1148296 100 Using Historical Data to Enhance Rank Aggregation academic advances after aggregation agia analysis applied approaches aslam atlanta based benchmark called castells center cikm collections combination combining combmnz combsum comparative compared comparison computed computing conf conference croft cyprus dcombmnz 
dcombsum development different distributions engines evaluation evidence feng florida followed four framework from functions fused have ifip information intelligent intl kluwer knowledge label lncs management manmatha melbourne metasearch method methods modeling montague multiple named namely napa normalization ones ontology orleans other outputs personalised prior published publishers rank rath rcombsum recent refer reference references relevance renda research results retrieval same scombmnz scombsum score search self semantics shows sigir standard step straccia swws symposium table taken technique techniques test tested these track trec tried tuning used where which will with work workshop york http://doi.acm.org/10.1145/1148170.1148239 52 On Ranking the Effectiveness of Searches advanced algorithm algorithms amati analyzing annual application applications automatic biometrika brazil britain butterworths carmel carpineto change choice cluster clustering computers computing conclusions conditional conference content croft cronen darlow data detection development difficulty dimensionality distance distributed document documents dubes edition effective effectiveness estimate estimating european examining exhibits expansion finding fine finland focused four fukunaga great hall http hypothesis ieee including inferring information international intrinsic investigate invited italy jain knowledge laboratory language learning lemur lemurproject lewis likely local london measures media method methods minka missing modeling olsen original ounis padova paper patterns perceptual performance perturbation predicting predictors prentice present proceedings processing properties query rate ratio reference references relevant report research respectively result retrieval retrieved rijsbergen robustness romano salvador search second section selective sensitive sensitivity series sigir similar similarity since spatial start string structure sunderland symposium systems tampere 
technical tendency that these this three together tombros toolkit townsend transactions using vicinity while will with year zhou http://doi.acm.org/10.1145/1148170.1148306 110 Is XML Retrieval Meaningful to Users? Searcher Preferences for Full Documents vs. Elements advances duisburg element fuhr glasgow http inex inexmw informatik information interactive july lalmas larsen lncs malik methodology otago proceedings references retrieval tombros track trotman users wanted workshop http://doi.acm.org/10.1145/1148170.1148253 62 Large Scale Semi-supervised Linear SVMs accuracy added algorithms also annealing applied arise available barbados based benchmark bennett bilbro bottou categorization chap class classification classifier clustering collection collob conclusion considered data decoste demirez density deterministic develop easily effective ehind eled elle enefits enhance erformance ervised even examples existing extended family fast fastest field finite frequently fung future generalization gives good handle have html http icml include inference international january jects joachims john journal keerthi kernel large learning lewis linear machine machines mangasarian mann mapping mean method methods miller modified multiple networks neural never newton nips numb obtain often onto optimization osed over peterson plenty presence primal problems prop provide purely reducing references relatively research returns rose scale scarce scenarios semi semisup separation setting significantly sindhwani sinz situations slower snyder soderb software solution solutions sons sparse stabilizing statistical statistics submitted such supp svmlight svms switching systems technical technique terms text theory these thing this training transductive tsvm uchicago unlab using utilizing valuable vapnik variability vector very vikass weights weston where wiley with work worthwhile yahoo yang york zien http://doi.acm.org/10.1145/1148170.1148197 18 ProbFuse: A Probabilistic Approach to Data 
Fusion achieved aggregation aics algorithm among analyses annual architecture artificial aslam auckland australasian australia automatic average bartell based bayes beitzel belew brisbane buckley calculated callan chang chowdhury cikm cognitive collection collections collier combination combinations combining combmnz condorcet conference conferences connell cottrell craswell croft culate data database development different distributed distributions document documents dreilinger dunnion effective eleventh elsevier engine engines erform eriments etzioni evaluation evidence expert february feng first forum french frieder from fused fusion garcia giles goharian gravano grossman gupta harman hawking higher howe ieee impact improved incomplete inference information inputs inquirus institute intel international internet ireland irish isolated january jensen johnson katzer know laird larkey lawrence learning learns ledge ligence lillis linear magazine manage management manmatha merge merging meta metacrawler metasearch model modeling models molina montague multiple national neci networks ninth normalization northern numb onse optimal opular organized osal osition outp outputs over overlap overview paep pages patents peng portstewart powell press probabilistic probabilities probability probfuse probfuseal problem proceedings process prop protocol publication queries query ranked ranking rath references regression relevance relevant representations research resource resp result results retrieval returned same sampled savvysearch science score scores search searches searching selb selection sets seventh shaw shown sigir significantly sixth special springer standards stanford starts strategies study system systems technical technol technology tenth text than that these third thistlewaite toolan topically track training trec ulster university using variations verlag viles vogt voorhees were which wide with world york zealand http://doi.acm.org/10.1145/1148170.1148199 19 Using 
Web-Graph Distance for Relevance Feedback in Web Search activities adam agglomerative alenex algorithms allan amanda analysis andrew annual arge authoritative automated automatic azadeh beeferman behavior belkin berger breadth brin bringing brodal case chen chengxiang chris citation cliffs clustering comparing computing conference cool dean development digital distillation documents dong doug dumais editors effectiveness efficient ellis engine englewood environment eric evaluation experiments exploratory external feedback finding first frayling gerth glen goldberg graph hagino haim hall harrelson henzinger highly horvitz http huseyin hyperlinked indexing information ingemar interaction interactive interest interests international jaana jaime jarvelin jennifer journal jurgen kalervo kaplan kekalainen kenneth kleinberg koenemann kurt lars laura lawrence library life lting manage meets mehlhorn memory methods meyer milic mindset motwani multi natasa nicholas order ozmutlu page pagerank pages park path personalized personalizing planar point posters prentice press proceedings process project propagation query rajeev ranking reach references related relevance relevant report research retrieval retrieving rocchio scaling search searching seda separation sergey shakery shortest sigir smart soda sources special spink sssp stanford study sublinear susan swat system tatsuya technical technologies teevan terry theory toma topic track tracks trec uiuc ulrich using vinay vishwa vivisimo wide widom winograd with wood world yahoo ying york zhai zhang zheng http://doi.acm.org/10.1145/1148170.1148274 78 A Complex Document Information Processing Prototype agam argamon building chen classification collection complex distance doclib document doermann effect efficiency figure filtration forecast frieder grossman heard identification ijprai income information jaeger kalera lewis logo mention more offline processing references research results retrieval reynolds sdiut sigir signature 
srihari statistics stein stylistic test text than tool using verification with words http://doi.acm.org/10.1145/1148170.1148270 75 Information Graphics: An Untapped Resource for Digital Libraries aaai academic acts advances analysis application approach artificial ashley association attention atuobrief automated automatic based bayesian bradshaw briefings cambridge captions carb carenini chang charniak chester classification cognitive cole color communicate communications communicative complex computational computer computers computing conceptual conf constraint content data demir diagram diagrams discourse discriminant document documents editor editors edjiev eech efficient effort elzer england erceptual erimental erry essay exploiting exploring figueiredo flickner from futrelle generate generation getting goals goldman gorkani graphics green grice grosz gupta hafner have hoffman huang human hunter ieee image incorp indexing indirect information integrated intel intelligent intention intentions interfaces into jain journal kerp language larkin ligence ligent limited linguistics london mani mapping mattis maybury meaning meeting methodologies model moore morgan multimedia multimodal netica niblack nikolakis oliva orating organization pages parsing patterns petkovic philosophical philosophy plan press probabilistic proc qbic query querying recognition recognizing references regions reiter retrieval review roth sawhney saying scenes science searle semantic semantics series sidner simon smith sometimes speech srihari sripada steele structural structure studies summarization swain symposium syntax system systems task tasks templates text theory third thousand time torralba trnka turbine university user users using utility utterer vailaya video vision visual visualization visualizations visualseek volume words worth yanker york zhang zukerman http://doi.acm.org/10.1145/1148170.1148286 90 An Exploratory Web Log Study of Multitasking analysis automatic cognitive combining 
during european evidence forum goker harper identification information jansen jasist jung logs management mining monsell multitasking outlier ozmutlu park pedersen processes processing references sciences search searching seeking semantic session sessionizing sessions spink switching task trends http://doi.acm.org/10.1145/1148170.1148334 138 Measuring Similarity of Semi-structured Documents with Context Weights based canada carmel chose collection condition configurable consists contains dataset describing description document documents each ecir effectiveness elements encoding evaluation following fragments from horses http indexing inex information initiative items kakade location maarek mandelbrod manually mass measure museum object proceedings qmir qmul raghavan ranking record reference references relevant repository resource retrieval santiago searching selected sigir similarity size soffer source spaces spain structure study terracotta their title toronto total type types used vector warriors with http://doi.acm.org/10.1145/1148170.1148266 72 Getting Work Done on the Web: Supporting Transactional Queries able adcs advances airs algorithm already amitay amount analysis anchor anchors annotating annotator apache approach appropriate architecture associated automatic based bikel bring broder business businesses buying capable case choose cities classificatio classification commission companies conclusions constructing content corp craswell data dataset design detection develop different digital dillon distinct docs document dramatic driven ecial ecific economic edia effective efore eling enefit engine engines entry estsellers etter even eventually evidences examples exploiting extraction extranet fagin finding first flight forum free from furnkranz generic genre genres geographic geotagging goals google growing gushrowski hawking help highly home hotel html http hybrid identification identifying improvement including information infoxtract instance integration 
intelligent interests intranet introduced jasis java kang kessler kraaij learning learns levinson like link links list location lucene machine making manage many matches methodology model modifications money multiple name naming need normalization online ooking optimization optimizing orate orbitz ortance ortant orting otential page pages pair pkdd princeton prior probabilities process produced provide providing query references results retrieval retrieving robustness rose rule search searchability searches searching selected services serving show showed sigir significant site smaller software standard structural study surprisingly taxonomy tell template text than that therefore this topic transaction transactional trec typically understanding university unwilling upstill urls user using very westerveld wget what where which will willing with without wordnet workplace workshop http://doi.acm.org/10.1145/1148170.1148259 67 Analysis of a Low-dimensional Linear Model under Recommendation Attacks acknowledgment against algorithm algorithmic algorithms already analysis annual application applied approach approximation architecture artificial attack attackers attacking attacks available average award azar bandwagon based been belief bergstorm better bhaumik billsus borchers bots brand built burke canny case cold collaborative commerce computation computational computations compute computer computing conclusions conference constant cooperative cost data decomposition demonstrated detect detecting detection deterministic diagnosis dimensionality directions divergence each edition effective effectively eigentaste either empirically examine experiment explore factor fast fewer fiat filtering filters find ford foundation framework future giles goldberg golub grouplens gupta have herlocker higher hofmann hopkins horvitz hurley hybrid iacovou icdm ieee incorporated incur information injecting injection instead intel interested international internet item items iterations 
jaakkola johns karlin karypis konstan kushmerick lanczos larger latent lawrence learning ligence lightweight likelihood literature loan machine mahony makedon many marlin material matrix mcsherry memory method methods metrics might mining mobasher model modeling models more moreover mounting much national need netnews neural number ones online open operators other pages paper papers part pazzani pearlman pennock performance performing perkins personality popescul possibility prefer press previous privacy procedure proceedings processing profiles profit random rank rating ratings readers reason recommendation recommendations recommender reduction refer references relatively representative research resistant resnick respectively retrieval revisions riedl robust roeder saia same sarwar schein science second segment semantic several shilling show shown siam sigir silvestre since singular specific spectral srebro start study success successful suchak suggests supported suppose symposium system systems technology than that them theory there third this time transaction transactions true uncertainty under ungar university upon used user users using value vectors wang webkdd weighted where whether which will williams with work workshop zhang http://doi.acm.org/10.1145/1148170.1148322 126 Give Me Just One Highly Relevant Document: P-measure absolute appear avep based bootstrap buckley chinese clir confidence databases diff document effect eguchi error evaluating evaluation experiment finding graded high highly informatics information institute japan japanese management measure metrics national ntcir overview precision proceedings processing references relevance relevant reliability report required retrieval sakai sensitivity sigir size society stability table task technical third topic transactions voorhees with workshop http://doi.acm.org/10.1145/1148170.1148301 105 Refining Hierarchical Taxonomy Structure Via Semi-supervised Learning algorithms artificial assigned based 
basu bilenko clustering conference constraints corpus cument data dataset datasets derived description discovery documents english eriments ervised experimental first found framework from hierarchical html http human integrating intel international jects karypis know learning ledge ligence machine metric mining native news newswire oney overview pages probabilistic proceedings references results selected semi sets sigkdd tenth uncertainty used vaithyanathan with zhao http://doi.acm.org/10.1145/1148170.1148316 120 A Method of Rating the Credibility of News Documents on the Web abdulla affect agency annual application applications articles assessment association attitude available beach calculated casey categorization categorized certainty cited collected communication computer computing convention credibility credible daily danielson divided dordrecht driscoll ebusinessnews education encyclopedia evaluation evening every experiment four from full garrison google googlenewspatentapplicationfulltext group groups having high html http human idea identification interaction into japanese journalism kando liddy manual manually mass method miami minutes model netherlands news newspaper newspapers november october online overseas pages patent proc proposed published reference references results rubin salwen score scores selected seven shown site sites springer stories table tagging television text texts them then theories topic webpronews were written http://doi.acm.org/10.1145/1148170.1148283 87 Community-Based Snippet-Indexes for Pseudo-Anonymous Personalization in Web Search abstracts adapted adaptive adar analysis annual apache august automatic balfe based boydell breuel briggs cass communications community computing conference context coyle creation current data development discovery edmonds efore engine engines etition european exploiting final freyne generic giles http ieee implementation improved indexing information interaction international internet jones journal 
july kemp know lawrence learning ledge list literature long lucene luhn mining modeling oley page pages personalized pitkow press principles proceedings query querysim ramamohanarao references regularity research result results retrieval returned sakai schutze search searcher sigir smyth sparck summaries term thus turnbull union user with york http://doi.acm.org/10.1145/1148170.1148245 56 Less is More accuracy accurate action algorithm algorithmically algorithms allan allow american amherst annual application applied approach appropriate argues asis assuming assumption assumptions automatic average based bayes better beyond bookstein buckley call carbonell case choosing classifiers clustering cohen collections common conclusions conference cooper could croft desired development directly discriminant discuss distinct distribution diversity document documentation documents does done ecml effectiveness efthimiadis empirical evaluate evaluating evaluation example expansion expect expected explore exponential extreme feasible filling focus focused formal forty fourth framework function functions future general given goldstein greedy guide harman harter have hedge hersh heuristics high hopefully icml identified include includes independence independent indexing indicates indicating ineffectiveness information interactive interpretations journal judged judgments karger kaufmann keyword known lafferty language learning length leouski lewis likelihood likely linear literature local locality many massachusetts mathematical means measure measurement measuring methods metric metrics middle might minimization model modeling models more morgan much naive need never objective objectives optimal optimization optimize optimized optimizing ordering other over overview pages part perform performance performs points poisson ponte poor possible possibly potential precision principle probabilistic probability proceedings process producing publishers quality query ranking readings 
reasonably references reflection relevance relevant remains rennie reordering report reranking result results retrieval review right risk robertson robust room salton says scenario science search seem sequential setting settings several shah shih shown sigir single sixth smart society some sophisticated soviet specialty subtopic such summaries system systems tackling teaching tech technical techniques technology teevan test text that them theoretically there this those towards track transfer trec union universally university used user using value variations voorhees wants weak well which while will with words work workshop worth would yield zhai http://doi.acm.org/10.1145/1148170.1148265 71 Learning to Advertise aaai adaptation addison adrevenues advert advertiser advertising affects african algorithm algorithms alternate alternative anegon annealing annual applications applied applying april article articleid attardi attitudes august automatic available baeza bahia based best bets bhargava bipartite boston bote brazil brighton btobonline calado cambridge carrasco chen ciety cikm classification clicks clickz client clustering collection combining communications computational computer computers computing conference consumers content context continuum conundrum conversion cordon coupling craswell cristo cument data datasets delivery description descriptions detecting development digital discovery distribution ecial ecific edance edition editor editors effective effectiveness effects eiro eled empirical enablement eneva engine engines erts esuli ethical ethicomp evaluation evolutionary experimental experts fain fawcett feedback feng fitness florida form forrester foster foundations fourth framework from function functions fusion fuzzy gaithersburg gallagher generic genetic gleich golgher gonalves gordon graph guerrero haig harman hawaii hawking hicss hill horng http ieee implementing information informs institute intelligent international internet into invalid 
investigation issues january jasis jasist ject journal july keyword keywordstudy kluwer know koza labeled landing lang langdon large learning ledge length listings longman machine maddox manage management maryland matching mcgraw mechanisms medium melb messages methodologies mining mishra mitchell modern moura moya multiple natural neto nist novemb november oneupweb online opers optimization order orts osed ourne overview page pages paid papers parsons pathak penno placement posters press pricewaterhouseco probabilistic problem proc proceedings process programming publication publications published publishing pujalte queries query ralph ranking rates redescribing references relevance research results retrieval revenue salvador science sciences scientists search selection seventh shift sigir simi simulated smith soft sources south sprague spring strat structures study suggestion system systems target targeted technol technologists technology term text thistlewaite thousands through tkde toward track trec troubador unlab unlabeled user using very visibility volume washington website weideman wesley with workshop yahoo yates york zarco zhang zhukov http://doi.acm.org/10.1145/1148170.1148237 50 Mining Dependency Relations for Query Expansion in Passage Retrieval amati answering application approaches association attar banko brill broglio buckley callan carpineto computing conference croft dataintensive difficulty document documentation dumais ecir expansion experiments feedback fraenkel full harper information inquery journal local machinery management mitra models probabilistic proceedings processing query question references relevance retrieval robustness romano salton selective singhal smart susan systems text tipster trec using with without http://doi.acm.org/10.1145/1148170.1148255 64 Constructing Informative Prior Distributions from Domain Knowledge in Text Classification aaai acero adaptation adding algorithm allan analysis approaches april automated automatic 
based bayes bayesian bhuptira bootstrapping buckley butterworth capitalizer carlin categorization ceas chai chapman chelba chieu classification classifiers classify cohen collection combining communications comparison computing data dayanik description dimacs discussion document ecml edition editor effect effective eliciting eling email emnlp empirical enchmark enhance entropy environment erformance ergen experiments feature features feedback filtering filters fradkin frequentist fuhr function gabrilovich gavrin generation genkin genomics graphical gupta hall headings heinemann help hersh hill home http hull iaai icml ijcai incorp indexing inference information interactive into introduction ject jmlr joachims johnson jones july kantor kluwer knowledge kraemer kudenko lafferty large lasso learned learning lewis linear little logistic london loss louis machine machines madani madigan many margin markovitch maximum mccallum mcgill mcgraw medical menkov mesh methods meyer mining model models modern nigam oles online oosting orating oriented overview oxford page pages parametric pedersen pfeifer porter predictive prentice press prior probabilistic problem processing program raftery raghavan rahim references regression regularized relevance relevant representations research retraining retrieval rijsb riloff robustness rochery rose ross routing royal salton scale schapire schutze sebastiani selection shrinkage sigir smart smith source spambayes srihari statistical statistics stripping suffix supp support surveys system tasks technometrics term text theory tibshirani track transferring trec univ using vector weighted weighting whateley with words workshop world yang zhang http://doi.acm.org/10.1145/1148170.1148329 133 Strict and Vague Interpretation of XML-Retrieval Queries analysis baas beitzel bricks categorized chowdhury clark consortium could derose exception extended formulation frieder grossman hourly inex jensen keefe language large narrowed nexi next oostendorp 
path possibly proc query recommendation references requires retrieval rnsson sigir sigurbj simplest that topic topically trotman union very which wiering with work xpath zwol http://doi.acm.org/10.1145/1148170.1148261 68 Evaluating Evaluation Metrics based on the Bootstrap accusystem agreement airs also applicable approximation arithmetic asakawa assumption assumptionretrieval avep based basic binary bootproceedings bootstrap bridje buckley cally chapman clir collection comparing comparison comparisons conclusions correlation cumulated dard data defect develop directly distribpp distributed effect effectiveness effects effort efron empirical engine eriment eriments error estimate etween evaluating evaluation evaluations even fail finally free from functions gain gallinari geometric graded hall hauptmann heuristics http hull identimeasures indep inen inex inference information insensitive insignificance introduction issue japan japanese johnson journal known less life lncs magazine management mean means measure methods metric metrics microsoft middle moreover most ncgl ndcg noted ntcir observed only opulation original othesis overview ower paired pdoc proceedings processing provides quantifying racy rank ranking references relevance reliability relies research results retrieval revisiting robust rvelin sakai samples sampling sanderson savoy search section selb sensitive sensitivity should showed sigir significance simple size society stability stan statistical strap strategies swap systems techniques test testing tests that these this those through tibshirani tioned topic topics toshiba track transactions trec unconventional unpaired useful using ution very voorhees when while wild with http://doi.acm.org/10.1145/1148170.1148172 27 Social Networks, Incentives, and Search acad academic adamic adar advances agent agents algorithmic algorithms alstyne annual archives artificial aspnes australian automatic autonomous barrier based broadening building callan cassel 
chakrabarti chapter chen classification collab collective combining communications comparison complex comput computer computing conf conference congress connected craswell cresp crestani croft crowcroft data databases decentralized degree degrees development digital discovering discovery disparity distributed duncan dynamics ecir economics ective ects editor effective effectiveness efficient eirotis engine engines erability erimental ersp europ factors federated field filtering fostering foundations framework from games garcia geographic goncalves gravano harvesting hidden hierarchical homophily human hypertext icdcs ieee incentive indices information initiative intelligence international internet interop jensen joint july kaufmann kautz kleinb kluwer know krowne kumar lagoze ledge libraries mathematicians mcdevitt mechanism meng message metasearch methods milgram mining modeling molina morgan multi nation natl nature network networks norton novak nowell orative osium overlay pages papadimitriou peer perez phenomenon pias press problem proc proceedings psychology publishers qprob query quinones raghavan references referralweb relaying research retrieval richardson routing sahami sanderson schemes science search searching selman sensor shah sharma sigchi sigir simsek singh small social sociometry somp strogatz structures study surv survey surveys swim sycara symp symposium system systems tardos text theoretical theory thesis today tomkins transactions travers tutorials university using watts wireless workshop world zhang