http://www.informatik.uni-trier.de/~ley/db/conf/sigir/sigir2003.html SIGIR 2003 http://doi.acm.org/10.1145/860435.860481 33 Building a Filtering Test Collection for TREC 2002 alistair analysis andrew annual appear australia based benchmark bruce buckley butterworths canaria categorization cation chris classi clustering collection combination conference corpus croft david description development donald donna dublin editor editors edward eighth eleventh ellen evaluation experiments fifth filtering final fourth from gaithersburg gran harman harper hersh hickam http hull information institute interactive international ireland joseph journal justin kraft language large lavrenko learning leone lewis machine mccallum melbourne modeling models multiple national news ninth nist ohsumed orleans overview palmas press proc references relevance reliable report rerieval research resources results retrieval reuters rijsbergen robertson rose ross scale searches second seventeenth seventh shaw sigir sixth soboro standards statistical stephen stevenson technology tenth test text third tomorrow tony toolkit track trec victor volume voorhees whitehead wilkinson william yang yesterday yiming zobel http://doi.acm.org/10.1145/860435.860499 47 Automatic Transliteration for Japanese-to-English Text Retrieval academic acknowledgments advantage alternative although ambiguities ambiguity annual anonymous appear applications approach approaches arabic artificial asian asru association attested augment automatic automatically back ballesteros based been believe between bilingual both braschler bringing buckley candidates capitals cardie case characters chen chinese choi choose clarit clef clir clustering cognates cohen colng combinations comments commercial complexity compound computational computer conference congress constraintcontrolled cooccurrence corpora corpus creation croft cross crosslanguage cutoff database david davis decaying demonstrated dependence derived described development 
dictionary differences different disambiguation discussion docherty document donna earlier editors effective efficiently ellen empirical engine english englishchinese entire entities euralex evaluation evans evidence existing expensive experiments expert exploration extracted feedback fifth first forum four frayling from fujii future gaithersburg gefenstette generate generated generating glosses gonzalo government graehl grefenstette handel hangul harman have heid here hull humanities hypotheses hypothesis hypothetical improve improved information insightful intelligence interest international invalid involving ishikawa jang japan japanese kando kang katakana keep kluck kluwer knight korean language languages large lefferts lexicon linguistics lower machine maeda management mapping mappings meeting meng metalexicography methods might milic mitra model models module monolingual more mutual myaeng named names nist notes nozue ntcir number office other ours pairs paper papers parallel park performance peters phones phonetic possible practice press printing probabilistic probabilities problem proceedings process processing publication publishers queries query radical rather recognition recognize reduce reducing references related relations relied rely remove reporting representations research researchers resolve resolving results retain retrieval reviewers revised revision sadat search second semitic sequences setting sigir significance significantly similar simple simply sixth smart sounds space special speech spoken springer sshy sshyu stalls string such summary superconcepts support syntactic systems table tang target tchi technical techniques term terms testing text than thank that their they third this thus tokyo tomany tong translating translation transliterated transliteration transliterations trec typically uemura understanding usage used using validate validation version viiith voorhees walz washington weighting with within word words work working workshop 
writing yoshikawa zhai zhou http://doi.acm.org/10.1145/860435.860486 37 ReCoM: Reinforcement Clustering of Multi-Type Interrelated Data Objects academic accrue acmsigkdd agent agents algorithms analysis anatomy applications artificial autonomous based berkhin boston breese brin browsing chakrabarti chen classification clustering cohn collaborative collections combining communities comparison conf conference connectivity content correlation cover craven data dhillon directory dmoz document domains efficient elements empirical engine engineering explorations filtering foster framework from gibson hawaii heterogeneous hicss hofman html http hypermedia hypertext hypertextual ijcai inferring information intelligence international intl joint kleinberg kluwer large link logs methods microsoft mining missing model national neural objects open page pages personal predictive probabilistic proc proceedings processing products project publishers query raghavan recommendation references relational report research researchpapers scale sciences scientific search searching sigkdd singapore slattery statistical steinbach survey sycara system systems taskar technical techniques text theory thomas topology transactions tutorial ungar unified user using very webmate wide wiley workshop world zeng zhang http://doi.acm.org/10.1145/860435.860475 29 Re-examining the Potential Effectiveness of Interactive Query Expansion american based beaulieu binding blocks chang cirillo collection computer conference control cunliffe digital documentation effectiveness european evaluation expansion experiments feedback freezing groups harman information interaction interfaces journal lecture libraries management modified notes overview proceedings processing qualitative query razon references residual retrieval rome saracevic science search selection sixth smart society spink support terms test text thesaurus trec tudhope using voorhees with http://doi.acm.org/10.1145/860435.860514 61 When Query 
Expansion Fails allan annual approach automatic available beaulieu berkeley bigi buckley california cambridge cantly carpineto chosen clear collections combining comparative conclusions conf conference croft cronen development ehaviour eight erent erformance evidence expansion experiments exploration explored exploring feedback filtering flexible from gaithersburg guiding gull hancock harman have identi improved improvement information jones keenbow louisiana management mandala mano method metric microsoft model mori multiple ninth nist ogawa okapi optimization options orleans ound overview pages parameters parts performance predicting preliminarily probabilistic proc processing provides pseudo quanti queries query range reasearch references relevance research retrieval robertson romano sakai salton second selecting should showed sigir signi singhal smart sparck special successful tables tanaka technique terms test text that theoretic thesaurus this tois tokunaga townsend track trec types using varied walker what zhou http://doi.acm.org/10.1145/860435.860448 9 Implicit Link Analysis for Small Web Search about above access according achieve addison adjacent affect agrawal algorithm algorithms also altavista american among amsterdam analysis analyze anatomy anchor another appendix applied apply appropriate april arbitrary arrived asked assume assumption august australia authoritative authorities average baeza bailey based beaulieu because becher behavior being berkhin between bias bibliographic borodin boulder brin bringing brisbane broder browsing bulletin caching calculated california cannot chakrabarti chen citation cocoon combination commonly compared completely computer computing conclusion conduct connected connectivity construct containing contribution contributions conventional cooley could coupling craswell current data database dekker designers difference different directhit directly discovering discovery distinct documentation documents dramatically each 
easily effective efficiently elaborating else encyclopedia engine engineering enterprise environment episodes event every existence expectations experiment experiments explanation explicit exponentiation feburary figure finding following forrester francisco frequent from furthermore future gaithersburg gatford generally generation gibson given good google graph group hagen have having hawking hearst henzinger here high highest hits hong however http hubs hyperlinked hyperlinks hypertext hypertextual icde ieee ignore implicit improve included increases information insight interaction interactive interesting internet into intranet intuitively invalid item joint journal june kato kessler kleinberg knowledge kong kumar kunar kyoto large larger last length less levene linear link links little logs loizou maghoul mannila manning marcel march matrix measure measurements mechanisms method methods microcomputers miller minimum mining mobasher models modern modifications modified mortazavi most motwani must nakayama navigation neto networks never next nist noticed november number obtained occurs october okapi once only order organization organizing orleans other outperforms overview page pagerank pages pair pairs pakdd paper papers path paths pattern patterns paul payne practice precision prediction prefetching preparation probabilistic probabilities probability problem proc proceedings process propose publication queries query raghavan rajagopalan randall random rank ranking recommendation records redundant references relatively report reranking result results retrieval ribeiro roberts robertson rosenthal same satisfactory scale schaefer scientific score scores search section seen select selecting separated september sequences sequential session setting sigir sigkdd significant similarity simplicity site sites small social some sources special srikant srivastava stanford start stata step still stink stop strict structure structures such support system systems table taiwan 
target technical test than that then there therefor therefore this thus toivonen tokyo tomkins total track traffic trec true tsaparas university usage users using usits values variance verkamo visit volunteers voorhees walker webkdd webs website weighting well wesley wheeldon when where whose wide wiener will winograd with words work world yamane yang yates zhang http://doi.acm.org/10.1145/860435.860461 19 Experimental Result Analysis for a Generative Probabilistic Image Retrieval Model advanced applied approach ballegooij bayesian centre computer conference costello croft data development digital doermann ecdl editors eleventh eurasip european evaluation from harman hauptmann hiemstra information institut issue italy jong journal language lecture libraries management massachusetts model modelling models multimedia notes over pages ponte probabilistic proceedings processing references report research retrieval rome rorvig science sept september sigir signal smeaton smith sources special springer technology telematics text thesis track trec twente university unstructured using vasconcelos video visual volume voorhees vries westerveld http://doi.acm.org/10.1145/860435.860509 56 Exploiting Query History for Document Ranking in Interactive Information Retrieval analysis applied automatic based buckley callan chien cikm conference croft delos digital divergence document eighth expansion experiments feedback global harman huang improving information interactive lafferty language larvrenko lemur libraries local mandar methods mitra model modeling models ogilvie overview oyang personalisation personalization perspective proceedings query recommender references relevance retrieval search session sigir smoothing study suggestion systems term text toolkit townsend trec using voorhees workshop zhai http://doi.acm.org/10.1145/860435.860477 30 Latent Concepts and the Number Orthogonal Factors in Latent Semantic Analysis aaai access account advanced algebra algorithm algorithms
american amsterdam analysis analyzed annual aono apprentice april armstrong attribute august automatic available baeza bartholomew berkeley berlin berry billsus brien buckley cahiers california caruana categorization category cation cient citeseer city claim climbing cluster clustering collection commerce communications comp comprehensive computer conclusion conference context correlation crystal data database databases david dayne decesions deerwester department development distributed distribution document documents drift dublin dumais dupret dyadic dynamic ecial ective edition eger eiro endence environments ervised estimation etitive examined explicitly factor factors feature features fifth figure following freitag from furnas gaithersburg gathering gavin george greedy harman heidelb heterogeneous hill histogram histograms hofmann html http human identi identifying incremental indexing information institute intelligent interesting international invited irrelevant joachims john journal june karl kendall keywords knott kobayashi kohavi kohonen labs landauer large latent learning lecture leibniz lewis library linear lled lsus machine malassis maps martin memmi menlo method meunier michael mining mitchell models modern muramatsu mutation national neto network networks ninth nisc notes numb organizing orthogonal outlier pages park patent pazzani planning porter press probabilistic problem proc proceedings program prototyp provides publication published puzicha random rank ranking ranks references research retrieval retrievial reuters revising rich rocchio salton sampling samukawa scaling science sciences search searching second selection self semantic series siam sigir singular sites skalak society space spring springer standards statistics stripping structure structuring subset survey susan symposium synonymy syskill take tate technical technology test text tfidf thomas thorsten trec unsup user using validity values variable vector vectors version very vocabulary 
volume voorhees webwatcher which wide with workshop world yang yates york http://doi.acm.org/10.1145/860435.860452 12 Search Strategies in Content-Based Image Retrieval academic analysis annual application applications arnheim ashford assessing assist association available basalaj based behaviour beijing belongie berkeley bimbo blobworld boston browsing california carson chang china clusters commission conference conniss content cooper creative data database databases design development diego documentation does eakins edition empirical engineering enser evaluating evaluation example expanded expectationmaximization exploring fidel final francisco graham grail greenspan hartley hewitt holy html http hypermedia ieee iidr image implications information institute integrated intelligence interfaces issue jain jisc journal kaufmann kluwer library louisiana machine magazine malik mcdonald media metrics morgan multimedia navigation newcastle newspaper nielsen northumbria organisation original orleans ornager pacific pattern perception perceptual pictorial press proceedings programme progress psychology publication publishers query querying references report republic research retrieval review revised rodden rubner santini seattle second seeking segmentation sigchi sigir similarity sinclair supported system tait task technology topology transactions university usability user users using venters version visor visual washington wood word http://doi.acm.org/10.1145/860435.860524 71 Error Analysis of Difficult TREC Topics analysis another average calculate callan comparisons conference connell database differ difficult distributed documents easy eighth french harman hypothesis impact jones methods most overview pages performance powell proceedings query ratio references relevant rely retrieval searching selection sigir since statistics summary terms test text that this through topics trec value viles voorhees weighting well which whole http://doi.acm.org/10.1145/860435.860441 4 
Empirical Development of an Exponential Probabilistic Model for Text Retrieval aaai able accurate adaptive advantages allow also although analysis analytic application applications apply approach approximations arti ascent assumption automatic barzilay based bayes bayesian bernardo butterworths cambridge cant cases cation center chan church cial city classi classify closed closely compared computational computer conclusions conjecture content corp corpus could croft currently data dent describ developement development discussed distribution distributions document documentation documents ecause ecomes ective eled elieve elieved emcl enable encoding engineering erent erformance eriments erties erty eskin estimation etter even example exploratory exponential eyheramendy families family feedback figure fitzpatrick forms forty found framework from fuhr functions further future gain gains gale general generalized global good gous gradient graphical greatly grei hand handle hatzivassiloglou hauptmann heckerman help hyper impact improve improvements improving incorp independence information intel intelligent interdocument interesting interpretation into investigate jasis john jones journal kalt katz klavans kwok labeled laboratory language learn learned learning length lewis ligence like likelihood likely literary lookup luckhurst luhn machine madigan making many massachusetts matched maximum mccallum mckeown mechanized method microsoft mitchell mixtures model modeling modelling models modify more multidocument multinomial naive natural networks nigam occur only optima optimization orated orating ortant ossible other over particularly past phrases poisson ponte precision principled probabilistic progress prop prospects quality queries ratio recall references reformulation relevance relevant replace report research results retrieval revised robertson same searching seem short should sigir signi similarity simple situations smarter smith social some sons speci stage 
statistical statistics status steps study such summarization systems table tailor technical techniques term terms text than that then theory this thrun times title towards trec tutorial understaning university unlab unlabeled used using vanrijsbergen varies walker weighted weighting were while wiley will willett with words work working would zhai http://doi.acm.org/10.1145/860435.860539 86 Assessing The Effectiveness of Pen-Based Input Queries also alternates ambiguity arpa balasubramanian ballesteros based croft cross default dictionary each easy edition effects filtering found from hardest home http human information kimber kupiec language least lists microsoft occurrence page participant pirkola proc proceedings queries query recognise recognised recogniser references resolving retrieval semantic setups shows sigir site speech structure table tablet tabletpc technology terms that topic using visited were which windows windowsxp words workshop http://doi.acm.org/10.1145/860435.860440 3 Beyond Independent Relevance: Methods and Evaluation Metrics for Subtopic Retrieval adaptive allan applied approximating based callan carbonell cikm conceptions conference cover detection divergence diversity document documentation documents economics editors evaluation experiments feedback feige forum goldstein gupta harman hersh highly information interactive international invited jarvelin journal july kekalainen khandelwal knowledge lafferty language lemur library ltering management methods minimization minka model models news nist novelty ogilvie over overview pages principle probability proceedings producing publication query ranking reconsidered redundancy references relevance relevant reordering report reranking retrieval retrieving risk robertson saracevic science search sept seventh sigir sixth smoothing special study summaries talk temporal tenth text threshold toolkit topics track trec using varian voorhees zhai zhang http://doi.acm.org/10.1145/860435.860497 45 
Probabilistic Structured Query Methods algorithm approximate arabic average baeza baseline baselne better black cells clean clir combination combinatorial compared crosslanguage cumulative darwish dependence effectiveness english eshold evidence experiments faster figure fine french grams gray hold information kwok lncs maryland matching mean navarro oard pages pattern pirkola precision print printed probability proceedings queries query references represent results retrieval searching selection sigir springer statistically string table term threshold title track trec using verlag worse yates http://doi.acm.org/10.1145/860435.860444 6 Structured Use of External Knowledge for Event-based Open Domain Question Answering aaai amardeep annotation annual answer answering answers applicability approach association badulescu banko based bases bell bolohan brill brown building carroll chapter chua clarke coden coling computational conference cormack czuba dataintensive detection development dictionary domain dumais eacl effectiveness eleventh european exact exploiting external fabric factoid ferrucci form from gigabytes girju grewal harabagiu harris hermjakob high hong hovy information integration international kaufmann kemkes knowledge lacatusu laszlo lexical likely linguistics logic lynam managing meeting mining moffat moldovan morarescu morgan multi notebook novischi open overview pasca patterns perceptron performance prager predictive probabilistic proceedings qualifier question radev redundancy references reranking research resources retrieval selection semantic series sigir soubbotin source spotting spring statistical strategy strings symposium systematic tenth terra text tilker tools topic track transformation trec voorhees weiguo welty wide witten wordnet workshop world yang http://doi.acm.org/10.1145/860435.860482 34 An Empirical Study on Retrieval Models for Different Document Genres: Patents and Newspaper Articles annual answering approximations automatic buckley 
comparisons conference cross development distillation document ective evaluation experimental experiments exploring forum fujii hall hull information international itoh iwayama kando keen length little management mano mitra model normalization ntcir ogawa overview pages patent pivoted poisson prentice presenting probabilistic proceedings processing question references relevance research results retrieval robertson salton sigir similarity simple singhal smart some space statistical summarization system takano task term testing text third using walker weighted weights with workshop zobel http://doi.acm.org/10.1145/860435.860471 26 Robustness of Regularized Linear Classification Methods in Text Categorization acknowledgments addison algorithms amount analysis analyzed annual anonymous arsenin authors available award badly baltimore based berlin case categorization catin cation chute cikm classi comments comparative compared comparison computations concluding conclusions conditions conference connection controlled corn crude data development discussed distributions document dokl dortmund dumais earn ecml edition editor elements erences erform erformance erforming erforms erger eriments error establishing etter etween european even examination example examples explanations feature features figure fisher formulated foundation fourteenth friedman function functions generalization give golub grain grants hastie have heckerman helpful hopkins however hull icml incorrectly indicated inductive inference information interest international investigated joachims john johns kaufmann learning lewis linear loan logistic loss lsviii luenb machine machines many mapping math matrix method methods mining money morgan national nature necessarily nevada noisy nonlinear numb oavg oles only onsored onsors opinions optimization ositive othesis ounds pages part pedersen platt posed positive practically prediction presented press problem problems proceedings programming rare references
regression regularization regularized related relevant remarks report representations research retrieval reviewers ridge ringuette robustness routing sahami schutze science score sdair selected selection settings ship show shrinking sigir similar size small solutions sons soviet space springer statistical study supp supply symposium systems targets text than thank that their theoretically theory they third this those three thus tibshirani tikhonov tois trade transaction treated under universit university vapnik vector vegas very volume washington well wesley wheat when wiley winston with worst yang york zhang http://doi.acm.org/10.1145/860435.860447 8 Building a Web Thesaurus from Web Link Structure acquiring adaptation although analysis anchor approaches association authoritative authorities automatic based basic beckwith been bolelli borodin brin bring broad broadwater buckley cambridge categorization chakrabarti chen church citation classifying clustering coded coling communities computational computer concepts conclusions cover craswell creating croft database davison describing design development devoted distributional dolan durand effective effort elements engine english enhanced environment ertekin expansion fellbaum filtering finding flake format from function future gibson global glover graph gross group growth guide hand hanks hawaii hawking henzinger heydon hierarchy http hubs hyperlinked hyperlinks hypertext ichikawa implications inducing indyk inferring information international isdn joshi journal kahn keep kleinberg language large lawrence lexical lexicography liechti line linguistics link local locality management mapa metadata microsoft miller mindnet mining mitzenmacher model motwani much mutual najork natural near networks norms object order orleans page pagerank pages paghavan parasite parser patrick pennock pereira press principles proc processing punera queries query raghavan ranking references report research retrieval richardson roberts 
robertson rosenthal salton sampling sarah scale search searching semantic shape sifer sigir sigmod site sites sources speed spertus ssgrr stanford structural structure structured structures structuring style system systems technical term terms text theory thesaurus thomas tishby tools topical topics topology towards track trec tsaparas tsioutsiouliklis uniform university user using vanderwende visualizing walker website websites weighting wide wiley winograd with word wordhoard wordnet words works world wrdhrd yale york zhang zhou http://doi.acm.org/10.1145/860435.860457 16 Using Asymmetric Distributions to Improve Text Classifier Probability Estimates aaai active additional advances algorithm algorithms american applications assessing assessments assessors association asymmetric bartlett basic bayes bayesian bennett beyond binary birkhauser bourlard brier brown calibrated calibration callan carnegie categorization cation chen cikm class classi combining communications comparing comparison comparisons computer concepts conditions content continuous corrigendum coupling data daviddlewis decision decisions degroot distribution distributions domingos duda dumais ecml economics editors elkan elsevier embedding engineering engines estimates estimation evaluation event examination expressed extensions fall features feng fienberg finance forecasters forecasts forum freund from gale generalizations goel good hart heckerman hierarchical http icml ijcai improve independence inductive inference into january joachims john journal kotz kozubowski laplace large learning lewis likelihood lindley linear machine machines manmatha many margin mccallum mellon methods modeling models monthly morgan multiclass multivariate naive nigam nips obtaining optimality outputs papka parametric pattern pazzani perceptron platt podgorski posterior press probabilistic probabilities probability provost publishers ranking rath rational recognition reconciliation reducing references regularized 
relevant report representations resources reuters review revisit royal rules saar sahami schapire scholkopf school schuurmans science score scoring search sequential series sigir simple smola society sons speech standard statistical statistician stork support system technical techniques terms testcollections text training trees tsechansky tversky using vector veri weather wiley winkler with workshop yang zadrozny zellner http://doi.acm.org/10.1145/860435.860519 66 Enhancing Cross-language Information Retrieval by an Automatic Acquisition of Bilingual Terminology from Comparable Corpora acquisition adjectives adverbs also analyzers approach articles automatic based bilingual chasen clir coling collection combination combinations comparable completed computational considered content cope corpora daily dejean described dictionaries different documents during eacl edict english evaluate evaluation evaluations extraction features focused foreign from fung gaussier german graehl have identification japanese jean knight koehn learning lexical lexicon linguistic linguistics machine mainichi model models monolingual morphological multilingual news newspapers nouns ntcir order parallel performances proceedings processing proposed rapp references results retrieval ronis sadat smart special statistical strategies system table test text texts thesauri translation translations transliteration unrelated unsupervised used verbs view were with word words workshop http://doi.acm.org/10.1145/860435.860494 43 Domain-independent Text Segmentation Using Anisotropic Diffusion and Dynamic Programming adaptive advanced advances agglomerative algorithm algorithms allan analysis anisotropic annual approaches asia aspect association august automatic based beeferman benjamins berger between bipartite blei bogurae brants broadcast buckley callan cance carbonell cation chen choi cient clustering coling computational conduction conference corp corpora correlating critique croft cues darpa data 
decomposition dectection detection development digital discourse discovery document documents doddington domain dynamic edge enabled erty european evaluation evidence expository extracting extraction full generic graph hartigan hawaii hearst heinonen hidden hierarchical hypertext identi ieee improvement independent information intelligence international into isahara iterations jime john journal keyphrase klavans knowledge kozima lambda language large latent learning level libraries linear linguistic linguistics location machine malik manabu management mani markov martin mckeown meeting metric mining mitra model modeling models moreno mult multi multilingual multiple mutual naacl natural news optimal paci pages paragraph parameters passage passages pattern perona pevzner pilot ponte porter porterstemmer principle probabilistic proc proceedings proceessings programming recent recognition references reinforcement report research resources retrieval reynar salton scale sciences segment segmentation segmenting segments semantic sentence setting shared sigir signi similarity singhal sons space statistical statitical stemming stop study subtopic successful summarization surface system systems table takeo tartarus technology text texts texttiling themes topic topics tracking transactions transcription tsochantarides understanding using usion utiyama very wayne wiley with word words workshop yamron yang york http://doi.acm.org/10.1145/860435.860553 99 DefScriber: A Hybrid System for Definitional QA anlp answering approach aquaint arda arlington automated based blair budzikowska centroid chal columbia concepts cucs documents extraction goldensohn grishman homme hovy hybrid information intel jing lenges ligent mckeown model month multiple naacl nist nition nitional pages program questions radev references report sager scalable schlaikjer springer summarist summarization technical techniques terminology text university verlag workshop http://doi.acm.org/10.1145/860435.860530 
77 A Comparison of Various Approaches for Using Probabilistic Dependencies in Language Modeling above allan approaches based bruce bruza burgess butterworths calculate capturing charniak choquetee cikm collection computing context croft cross dependencies discourse divergence document documents each employed estimate explorations flow foundations from inference inferring information jasist kullback language lavrenko learning leibler lingual livesay lund manning measure model models nallapati natural networks over press proceedings processes processing query references relevance relevant retrieval rijsbergen sensitive sentence sentences sigir smoothed song space statistical term terms then this towards trees turtle used using vocabulary words http://doi.acm.org/10.1145/860435.860451 11 Stuff I’ve Seen: A System for Personal Information Retrieval and Re-Use access adar addison after ahlberg alta alternative american analysis anderson appear architecture asist assessment assistant atlantic august available back barreau based behavior behaviors behaviour belkin bell bharat bookmarks broder browsers browsing bruce bulletin burrell bush cambridge came carballo catledge characterizing cikm cockburn computer conference cool cooperative creating cutrell date death deployed design designed desks desktop document documentation dourish drucker dumais dynamic easily edwards effort electronic email empirical environment environments erickson evaluated evaluation experimental exploration exploring faceted fertig file filters find finding findings fluid found freeman frequently from frontier fulfilling future gelernter gemmell georgia graph greenberg have haystack hearst henzinger hicss history hone horvitz http human huynh image implementation implications individuals information initial installation integrating interact interaction interactive international isdn issues items iterative jansen jones journal kaasten karger keeping kelly kumar lamarca landmarks large lawfulness 
least less library life lifestreams loans long lueder maghoul malone management marais mckenzie memex memory metadata metaphor milestones model monthly moricz multimedia mylifebits nardi network networks next note notebook october office once organization organize organizing other overload ozchi page pages papers park people perez person personal pirolli pitkow platform positive predicting press presto previously principle proceedings processing provides psychological public quan queries query quite raghavan rajagopolan recker recognize references reflections reflective reformulation regardless relevant reminding report repositories representation retrieval retrieving revisit revisitation ringel rreuse salisbury saracevic schooler science search searching seen shneiderman show sidner sigchi sikora silverstein simple society spaces spink stata stein stochastic stores strategies structure studies study such summary support supporting surfing swearingen system systems tauscher tech technical technology term that their they think thomas thumbnails time titles tomkins tools toolsmith transaction transactions type uist unified university urls used user using value very vision vista visualizing wesley what where whitaker wide wiener williamson with wolfram wong work world zipf http://doi.acm.org/10.1145/860435.860463 20 Combining Document Representations for Known-Item Search academic advances analyses anchor annual approaches aslam callan calv center chapter chowdhury collins combinations combining conf conference cottrell craswell croft detection development effective evidence filtering finding frieder from grossman hawking information intelligent international jiang kluwer lafferty link mccabe metasearch models montague multiple named notebook novelty ogilvie page pages proc proceedings publishers recent references research retrieval robertson savoy sigir site song strategies text thompson trec using version vogt vrajitoru zhai zhang zhao 
http://doi.acm.org/10.1145/860435.860478 31 A Frequency-based and a Poisson-based Definition of the Probability of Being Informative about absence account aizawa also amati american annual anthology applies applying approaches approximations assumption assumptions automatic based belew bibliometric binary bookstein bronstein cambridge church city classical collection comp conference constant construction containment corpora could coverage derived deutsch development deviations discussed disjoint disjointness distinguishing distribution distributions document documents does duality ective ectively ehaviour eing emphasises endence engineering ensated entropy erent ergen ersp ertson establish etween european events explained explicit explicitly finding formulating formulation frankfurt frequency from gain gale glasgow harri indep indexing inference information informative international interpretation into inverse irsg january journal lafouge language large leverage link links london loquium main management margulis mathematical mathematically mathematik measure measures michel mixtures model modeling modelling models multidimensional natural nature nition noise noisy normalization occurrence oisson ortance osition pages parameters pareto poisson press probabilistic probabilities probability proceedings processing quality radical rather references relatively research resp result retrieval rijsb science scotland showed sigir simple size smooth smoothen society some space springer structure such summary swanson systems take taschenbuch term terms that theoretic theoric theory third this those three thun thus transactions transforms university verlag very walker weighted whereas with within wong workshop http://doi.acm.org/10.1145/860435.860483 35 Collaborative Filtering via Gaussian Probabilistic Latent Semantic Analysis aaai algorithmic algorithms analysis application applying architecture arti automating based basu bayesian bergstrom borchers brackets breese canny case 
cation chickering chien cial class classi clustering cohen collab commerce communications communities computer computing conference content cooperative data denote development dimensionality eachmovie ecml empirical endency eople erforming ervised estry european extracted factors figure foster framework from gaussian george goldb gordon grouplens heckerman herlocker hirsh hofmann human iacovou ieee indexing inference information intel international joint journal kadie kardie karypis konstan latent learning ligence ltering machine maes maltz means meek methods miller mining model models mouth netnews networks news nichols numbers online orative pages plsa predictive press privacy probabilistic proceedings puzicha recommendation recommender reduction references research resnik retrieval riedl rounthwaite sarwar security semantic seventh shardanand sigir social statistics study suchak supported symposium system systems terry uncertainty ungar unsup usenet user using visualization volume want weave webkdd what with word work workshop http://doi.acm.org/10.1145/860435.860466 22 Word Sense Disambiguation in Information Retrieval Revisited agirre allan altavista ambiguity analysis annual arpa association automatic based basque boston broader california center cikm collocation computational conference country decision digital disambiguate disambiguation document donostia engines exploring finland freetext good henzinger human indexing information informational infornortics interaction internal international knowledge language large linguistics lists management martinez massive meeting network note pacific paraphrase part patterns pederson pittsburgh princeton proceedings quality query raghaven reasearch reduce references relevance report retrieval retrieving sanderson search semantic sense senses sigir silverstein sources speech stevenson sussna symposium systems tampere technical technology text travis using vancouver vegas very vooehees wallis washington wilks with word 
wordnet workshop yarowsky http://doi.acm.org/10.1145/860435.860470 25 Text Categorization by Boosting Automatically Extracted Concepts advances algebra algorithm algorithms analysis appear application approach baker bekkerman berry brien buckley categorization cation cient classi clustering collection collections combining computer conference cristianini data dataset david decision dence development discriminative distributional document documents dumais editor eriments ervised evaluation exploiting feature features freund generalization generative harman haussler hersh hickam hofmann improved information intel intelligent interactive international iyer jaakkola journal kandola kaufmann kernels large latent learning leone lewis ligent line linear lodhi ltering machine mapping mccallum models morgan neural nist noise ohsumed oosting pages predictions preferences press probabilistic probmap proceedings processing publication publishers rated reduction relative research retrieval reuters review schapire sciences semantic shawe siam sigir similarity singer special statistical system systems taylor test text theoretic third tishby trec unsup using volume weight winter words yang yaniv http://doi.acm.org/10.1145/860435.860493 42 Retrieval and Novelty Detection at the Sentence Level academic adaptive allan annual available based bolivar broadcast callan caputo carb carthy ceedings chapter classi clsp combining conference connell cross cument cuments darpa dels detection development diversity doddington ducting eleventh eriments erman erty estimation event exact expansion final gildea goldstein gupta harman http icml improve inference information international introduction jiang jman khandelwal khandewal kluwer kraaij krovetz language larkey lavrenko learning likeliho ltering machine maximum minka mixtures model modeling models morphology nding news notebook novelty onell online organization overview pages pilot press proceedings process publishers redundancy references 
relevant reordering report reranking research retrieval semantic sigir song spitters stage stokes story study summaries summarizing summer syntactic technologies temporal text topic topics track tracking tracks transcription trec umass understanding viewing wade wayne word workshop yamron yang zhai zhang zhao http://doi.acm.org/10.1145/860435.860510 57 Automatic Ranking of Retrieval Systems in Imperfect Environments according automatic based cahan chowdhury conference correlation editors evaluation evaluations fifth figure gaithersburg graphically harman human institute judgments method national nicholas nist november overview performance proceedings publication ranking references relevance retrieval runs search services shows sigir soboroff sorted special standards systems technology text their trec values voorhees wide without world http://doi.acm.org/10.1145/860435.860541 88 Speech-Based and Video-Supported Indexing of Multimedia Broadcast News audio based bessho biased boundaries boundary broadcast browsing candidate change choose commercials comparing computational conceptual condition conditions conference continuous corresponding defined depending designated determine document documents each eight employed equal evaluated exact exceeds experimental extract extracting extraction fifth first frame garofolo hayashi hearst identified ieice included information integrated integration into ipsj japanese kinds length linguistics local maximizes minutes multimedia multiparagraph news ohtsuki packaging panoramaexcerpts panoramas parameter particular passeges peak points preference proc process programs ranking recognition references relevance results retrieval rich running scene score searching segmentation segmenting separately sigir significance simply single speech spoken stamp step story subtopic success taniguchi text texttiling that them third three threshold time topic track trans trec type under using vectors video were with word workshop 
http://doi.acm.org/10.1145/860435.860455 14 A Scalability Analysis of Classifiers in Text Categorization academic acquisition algorithm algorithms analysis anderson approach approaches associates baltimore based berlin berry boston callan cambridge categorization catin cation chapter cient classi cognitive collection comparative computations conference corpus cullum decisions development dortmund ecml ective edition editor eigenvalue erlbaum ertext ertson european evaluation examination faloutsos fast feature features fisher fourteenth from function ghani girosi golub harmon hillsdale hopkins human icml ieee improved information intel international internet joachims john johns journal kaufmann kluwer lanczos large lawrence learning lewis ligent linear loan loss machine machines many margin matrix maximum mechanisms methods microsoft minimal morgan network networks neural newell nineth noise oles ology optimization osuna ower page pages papka pedersen platt practice press proceedings processing reduction references regularized relationships relevant report research retrieval reuters rose rosenbloom scale schapire selection sequetial sigcomm sigir signal singular skil skill slattery sons springer statistical study submitted supp symmetric systems technical test text their theory thesis training trec university value vapnik vector volume walker wiley willoughby with workshop yang york zhang http://doi.acm.org/10.1145/860435.860549 96 Generating Hierarchical Summaries for Web Searches american amherst annual appear association barnett concept conference croft deriving development finding foundations from google headings hierarchical hierarchies information international journal language lawrie literature lowe manning massachusetts medical mesh models natural pages perform press proceedings processing references research retrieval sanderson schutze searches sigir statistical subject summarization text thesis topic understanding university using vocabulary words yahoo 
http://doi.acm.org/10.1145/860435.860485 36 Document Clustering Based On Non-negative Matrix Factorization advances aided akad algorithm algorithms allow american analysis appendix approach baker based binary browsing budap capabilities cation center chan classi cluster clustering coding collections computer consists corresp criterion cuts cutting data deerwester derive design diagonal ding distributional document does dumais each ectral eigenproblem element endix equal equation equivalent error exactly fact factorization finland following form found furnas gather gong graph harshman have holland hoyer icdm identities ieee image indexing indicator information insisting instead intel inverted jects journal karger kiad landauer large latent learning ligence lovasz machine malik martigny matching mathematical matrix maximizing mccallum means miai minimizing model nature negative nement networks neural normalized normalizedcut north notation oint onding optimal otherwise oting pages partition partitioning parts pattern pederson plummer proc proceedings processing prove ratio real references relaxation same scatter scheme schlag science second section segmentation selection semantic seung sigir signal simon society solution solving sparse squared switzerland systems tamp term text that them then theory this trans transactions tukey used using uvit values vector volume weight weighted weighting where willett with words workshop written zien http://doi.acm.org/10.1145/860435.860544 91 HAT: A Hardware Assisted TOP-DOC Inverted Index Component addition also application bits chip corporation count document documents exploiting fast fastsearch figure frieder from html identifier index information inserts into leaving list location maintain matching order paracel parallelism pattern posting products proper provides pruning references retrieval scanning search sept sigir similar soffer sorted static structures systems term textfinder trans unit units using weight 
http://doi.acm.org/10.1145/860435.860528 75 Single N-gram Stemming accuracy across adequate affixes agency algorithm ameliorated analysis annual approach average based belkin both calculating campaign carefully center chosen church clef collection comparative compose concomitant conference conflation cooccurrence corpora corpus croft cross development different document each eight european evaluated evaluation examples exhibit exists experiments figure forum found frequencies frequency from given gram grams guessing haircut have hence hiemstra highest http hypothesize index indicating information internal international invariant inverse jacquemin january jones juggling kaufmann knowledge language languages like many mayfield mcnamee mean measured might models monolingual morgan morphological morphologically morphology national netherlands neutral ninth nist number over paid particular paying pearce penalty performance piatko porter portion postings precision priori proceedings program pseudo pseudostem publication publishers queries readings reason references repeated report reprinted requires research results retrieval same security seen select selecting serve shows sigir simplistic simulate single site snowball sparck special stem stemming stems straightforward stripping substitute such suffix system systems table target tartarus technical technique technology telematics term terms text that their thesis thus tokenization transactions trec under using variants variation visited want weighting will willett with without word words would http://doi.acm.org/10.1145/860435.860473 27 Building and Applying a Concept Hierarchy Representation of a User Profile abstracting acquisition activation adaptive agent american analysis anick annual application applied approach arti assistant associative automatic based basis bayesian belew bhatia browsing cambridge chapter choi chung cial collections college collocation columbus communications comparative comparing computational 
computer computing concept conference connectionist construction corpus cosine crestani croft data deriving development digital disambiguation discovering document documentation documents domingue doyle dublin edge editors edmundson elligott ellis ergen ersonal evaluation evolutionary expert feedback finding forsyth from furnas generated geometric hearst hierarchical hierarchies higuchi hingston horwood http ieice indexing information informer institue intel internation international ireland irish iterative jacm ject jennings jones journal know knowledge kwok language large lawrie learn learning ledge lexical lexically libraries ligence ling linguistics literature london louisiana ltering machine management manning maps massachusetts measure measures media meeting methods model nanas network networks neural nevill news occurrence ohio orleans pages paraphrase park paynter pictures pottenger press probabilistic proceedings processing publications putting rada recommendations references relevance representation research resources retrieval retrieve reuters review riao rijsb riordan road roeck rose rosenb sanderson schatz science search searchers seeking selection semantic sense service sigapp sigir similarity society sorensen spreading state states stevenson study summarization survey symposium systems technical techniques term terminological terms text theoretical thesaurus third tipirneri tomorrow tong topic transactions united universal university uren user using veronis volume weighting whitehead wilkinson with witten word words workshop wyllys yesterday http://doi.acm.org/10.1145/860435.860449 10 Query Type Classification for Web Document Retrieval academic advances analyses anatomy anchors annual approaches baeza bailey based beaulieu books brin bringing broder callan center citation cmis collection collections combination combining computer conference content craswell croft csiro development digital editor eiro engine engineering eriments ertextual ertson 
evidence feedback forum foundations from gatford hancock hawking hiemstra http information intel international isdn jaynes ject jones kluwer kraaij language large lemur library ligent link links management manning mechanics methods models modern motwani multi multiple natural neto networks ogilvie okapi order overview page pagerank pages physics ponte press proceedings processing publishers purp ranking recent references relevance research retrieval retrieving review scale schutze search searches shaw sigir stanford statistical systems taxonomy technical technologies test text theory toolkit track trec trecweb urls using walker westerveld winograd yang yates http://doi.acm.org/10.1145/860435.860535 82 Evaluating Retrieval Performance for Japanese Question Answering: What Are Best Passages? answering argue based challenge claims clarke clir concluding consecutive contradict described does dynamic dynamically experience experiments exploiting expository finding generated hearst however http inherently input japanese kids limited matsushita more multi nomoto notes ntcir onlineproceedings overlapping paper paragraph part participants passage passages prior proceedings question questionspeci redundancy references remarks research sakai sakait segmentation selected selection sentences sigir siteq some static suitable system text that their this three toshiba trec used were with working workshop http://doi.acm.org/10.1145/860435.860464 21 Searching XML Documents via XML Fragments allan amitay approaches azagury baeza broder buckley carmel cases chamberlin documents download downloadable draft edition effect efraty error evaluation experiment experiments extension factor fall fankhauser finland flexible formulation forum fragments from fuhr full generating grabs grossjohann herscovici hill http index inex information initiative introduction issue jasist juru landau language maarek mandler marchiori mass mcgill mcgraw model modern navigation nist novel number orleans 
paradigm passage petruschka pittsburgh proceedings projects pruning qmir qmul query querying references report repositories results retrieval robie salton schek search second sigir size soffer software space spaces special systems tampere taxonomy text topic trec unidortmund vector visualization volume voorhees wdxmlquery with working workshop worskhop xirql xquery yates york http://doi.acm.org/10.1145/860435.860495 44 A System for New Event Detection adaptive allan analysis approaches australia based brants brown callan canada carb chen cikm combining computational conditioned conference croft cronen data deguzman delos detection detections diego digital discovery document ective edmonton ersonalization ersp evaluation event feedback finland first franz hearst information institute international into ittycheriah know laflamme language larvrenko latent lavrenko ledge libraries line linguistics ltering malin management mccarley mclean meeting melb mining minka modeling models multi national nist nition novelty onell ounds ourne pages papka paragraph passages personalisation pierce plan pollard probabilistic proceedings recommender redundancy references relevance retrosp segmentation segmenting semantic sigir similarity slides standards story study subtopic swan systems tamp task technical technology text texttiling thomas timelines topic townsend tracking tsochantaridis umass version vienna ward with workshop yang zhang http://doi.acm.org/10.1145/860435.860548 95 Stemming in the Language Modeling Framework algorithms allan american amherst analysis approach arabic ballesteros case chance choosing chosen ciir class collection connell croft cross detailed details divided ective equivalence estimate evaluating evaluation framework frequencies frequency harman hull improving information kumaran language larkey latter light lingual model modeling nguyen occurrence pages particular ponte probabilistic probability proceedings references report represents retrieval science 
sigir society stemming study tech term terms that umass weischdel where word would xing http://doi.acm.org/10.1145/860435.860479 32 Table Extraction Using Conditional Random Fields aaai advances agreement algorithm algorithms annual answering applications association based bayes block branstein byrd categorization cation chen cient classi cognitive coleman coling comparison computational conditional conference conll croft data detection digital document edinburgh editors eech elds eling entropy erty estimates estimation event evidence exact extraction free freitag gaussian hidden http human hurst iccl icml informatics information integrating international interpretation jaakkola jcdl joint july kaufmann king knowledge language layout learning libraries limited linguistic linguistics machine mallet malouf markov mathematical matrices maximum mccallum meeting memory methods models morgan naacl naive nasukawa neural newton nigam nips nocedal pages parameter parsing pereira pinto prior probabilistic proc proceeding proceedings processing programming pyreddy quasi quasm question rabiner random readings recognition recognize references representations retrieval rosenfeld schnab school science second segmentation segmenting selected semi sequence shallow sixth smoothing spatial speech structured system tables tasks technical technology text texts their thesis tintin toolkit tree tutorial umass understanding university using viii wainwright weib willsky with workshop http://doi.acm.org/10.1145/860435.860537 84 On an Equivalence between PLSI and LDA allocation annual approach approximated arti august berkeley blei california cial conference croft development dirichlet document ectation eighteenth erty from generating generative hofmann indexing information intel jordan journal language latent learning ligence machine minka model modeling pages ponte probabilistic probability proceedings propagation query references research retrieval semantic sigir therefore uncertainty 
under http://doi.acm.org/10.1145/860435.860547 94 The TREC-Like Evaluation of Music IR Systems about annual appears asking audio basic bopp byrd communities conference crawford developed dewdney downie elements encounter englewood ensure expect expert following from futrelle high humirs hummers include information interdisciplinary international interview introduction issues justification kinds libraries library like made management metadata michell might minimal music must need needs participants presented problems processing pronounced proxies quality quarterly query questions real realistic record reference references representation research retrieval review science services smith stream suggestions symbolic synthesizing systems tasks technology test that theoretical third trec unlimited user uses verbose world http://doi.acm.org/10.1145/860435.860534 81 Passage Retrieval vs. Document Retrieval for Factoid Question Answering always annotation answering answers applied around august automatic banko based beaulieu being better bookman both brill brown cant cases chua clarke coden compare compares conference consistently cormack data dence differences document documents dumais echihabi entire equal erences except exploitation exploiting extending external extracting fernandes footing from full green gure hermjakob hotspot houston improve include information integration interactive july kaszkiel katz knowledge kuhns language level levels lexical linguistic ltering lynam marcu martin marton matchedpairs method methods mining more multitext natural none okapi outperforms overview pages passage passages performance prager precision predictive processing provide question radev ranks redundancy references reformulation reporting resource resources retrieval revisited robertson september sigir signed signi similar table techniques tellex test text them thus track treat treated trec using values voorhees walker wilcoxen window with woods yang zobel 
http://doi.acm.org/10.1145/860435.860490 40 Relevant Document Distribution Estimation Method for Resource Selection academic advances algorithms annual approximations australasian australian based broglio callan calv chang cikm collections comparing comparison conference connell craswell croft data database databases development discovering distributed editor effective eleventh emmitt engine experiments french garcia gravano hidden hierarchical http impact information inquery international internet invisibleweb ipeirotis kluwer knowledge large lemur logistic management meng merge merging metasearching methods model molina nation over paepcke performance poisson powell prey probabilistic proceedings processing proposal publishers references regression representative research results retrieval robertson sampled sampling santos savoy search searching selecting selection sigir sigmod simple some souza stanford starts strategy techniques text thesis thom tipster toolkit trec university using very viles vldb walker weighted with zhang zobel http://doi.acm.org/10.1145/860435.860439 2 Bayesian Extension to the Language Model for Ad Hoc Information Retrieval analysis annual appendix applied approach august average baeza based bauman bayes bayesian bayessmoothing beaulieu berger carlin center chapman chen combiation combinations computing conference croft cross data david development dirichlet distribution ecial editors empirical engineering ergen ertson erty following gaithersburg gelman goodman grace hall harman harp harvard hearst hidden hiemstra hierarchical information international interpolation jarvelin joshua kraaij kraft language last lavrenko leek linda line linear markov mckay methods miller model modeling models multinomial myaeng natural nist observational pages pair peto ponte precision predictive press probability proceedings products prop publication references relevance research results retrieval rijsb rubin schwartz score seventh siam sigir smoothing spline 
stage stanley statistical stern study table technical techniques technology text tong track translation trec twenty university used using volume voorhees wahba wilkinson wise workshop yates zhai http://doi.acm.org/10.1145/860435.860474 28 Query Length in Interactive Information Retrieval about affects allowed along answers appendix asia been belkin bring brooks bystr causes central complexity concerned conference congress contains cool craswell cycling disease display document documentation drinking dwarfism electronic elicitation engineered expedition experience family find food foods four friend friends from gathering genetically gifts given good government guidance guidelines harman have hawking health identify india information interactive interested interfaces into investigate issues jeng journal keller kelly kinds know laws learning like locate looking maintenance making management material measures meet mode more multiple muresan name need netherlands northeast oddy only overview part passed personal planning plans policy potential precautions press prevent privacy private problems processing products programs project projects publication query question raised references regarding regulation regulations related report research restrictions results retrieval retrieved road rutgers rvelin safe safety same screen scrollable seeking short shots should silk since some source specified standards statement subject such sure take taken tang task tasks tenth territories text that there these this three title topic topics tourists track travel traveling treatment trec type types typical voorhees want washington water website websites well what which with wonder would your yuan http://doi.acm.org/10.1145/860435.860456 15 A Repetition Based Measure for Verification of Text Collections and for Text Categorization access accomplished across advantage afterwards aiia alamitos algorithm algorithmica allows allwein also alto analysis appendix application applications applied 
apply approach array arrays article articles assigned assume assuming assumption attribution authorship available average bangor based binary block burrows calculate calculated call canaris cant cantly careful carnegie case catch categories categorization cation cations cells certain chains character check chronological chui ciency cient class classi classify codes coincides coinciding collection collections comes common communication compare comparing comparision complexity compressed compression compressors comput computation computational compute computed computer computers computing conf conference consider consistent consistently construct construction consumes containing contributing corollary corpus correct counting course crammer crauser cross currently data databases deal degree department describ design detail details detect detection determine diederich digital directly discuss discussion disjoint disk disputed document documents does duplicate duplicated duplicates each easily easy eated ecially ective ectively ects efore ehavior ehaviour either elds eliminated elsewhere empirical emptied empty english ensive ensures ensuring entropy eople equal equality ercentage erent erformance ergodic eriments erroneous error ersonal essentially estimate etter etween evaluation even exactly example exhibits existing exists experimental external extremely ferragina finally first firstly following foreign form formally formula found france francisco frank from function further given glasgow grammatical gran great half hard harper have here high highlight highlighted highly holds hour hours html http human humans identify identity ieee implication implications improving inconsistent indexes inexp inform informatics information initial inside integers interested interesting interestingly internal international into introduce jersey joachims journal jrennie just kaufmann keeping khmelev kindermann kluwer known kukushkina lang language large learnability learning lengths 
leopold less letter letters lexicographic lexicographically line linguistics list locate locating long longest lossless machine machines made main maintain manber many manzini margin markov match mathematical measure measures mellon memory merge method methods minimum minor modeling models modi modules more morgan most much multi multiclass myers naive natural need needs news newsgroups next normalized nowadays numb only opportunistic order ordering ossibility ossible other otherwise outlined output over paass pages pairs palmas palo paris parts people pieces plagiarised plagiarisms plagiarized plexity polikarpov porter practical practice precedence preceeds precision presence preserving press prev probably problems proc proceedings process processing produced program programming programs prop provide quantitative quicksort quite random rate rather read reader reads realization reasonable recall recent recently redirected reducing references relation relative relies removed replaced report representation require required requires research researchers resolution resources restored result results retrieval return reuters riao rose same sanderson satisfy schapire school science search searches second secondly section segmentation sentinel sequence sequential sequentially sets several shall shannon share shkarin should show shows siam signi simple simplicity simply since singer single size small snext snowbird society some sort sorted sorting source split splitting sprev stack statistics stevenson still stop storage stored straightforward strcmp string stripping structure structures studies study subsequent such suggested suitable supp support symb symbols symp system tags takes teahan technical technique test testing text than that them then theoretical theory there these third this through time together tomorrow topic topics trade training transform transmission typical ultrasparc unde undertaken undesirable unifying univ upgraded urls used useful uses using usually 
valid vector veri verify verifying violating volume wales watkins well weston wheeler when where which whitehead whole wish with witten workshop worst yesterday http://doi.acm.org/10.1145/860435.860491 41 SETS : Search Enhanced by Topic Segmentation access acquired active actual addressable advances algorithmic algorithms american among analysis applications approach architecture associates associative automatic balakrishnan barbara based bases bawa bowman brokers butterworths caching cahoon callan cambridge catalogs chord cient cikm citeseer clifton cluster clustering cohen collection collections comparing computer computing conf conrad content corpus crespo croft customizable danzig data database databases descriptive determinants dexa digital discovery dissemination distributed document domain duda ective edbt elicitation emmitt engine engineering environment evaluation expansion expert exploits extending feld fiat framework francis frans french fusion garcia gauch gloss granovetter gravano gupta handley hardy harnessing harvest hash hashing heterogeneous hierarchical hotnets http hypertext hypursuit icdcs ieee improving indexing indices inference infocom inform information internet intl isdn jackson jections johnson june kaashoek kaplan karger karp kleinberg know knowledge laird landscaping language large latent learning ledge library linguistic link literature logical lookup mahalingam management manber manku massive mckinley merging meta meziou milgram miller milliner model modeling models molina morris motwani multi name natural nemprempre network networks nldb noll obraczka ogilvie operational ordille over pages papazoglou peer peersearch performance perspective phenomenon physical podc poster powell press prey problem proc processing proper psychology query raghavan randomized ratnasamy references resource resources results retrieval review revisited rework rijsbergen routing scalable schutze schwartz scienti search searching selection semantics service 
services sharing sheldon sigcomm sigir silverstein similarity simpson small social sociological source space speci stoc stoica strategies strength structural symphony symposium system systems szilagyi tables tang technique technologies technology text that theory ties today tomasic tool toole transactions university usenix using usits velez very viles vldb voorhees wang weak weigand weiss wessels wide with workshop world yang http://doi.acm.org/10.1145/860435.860467 23 Probabilistic Term Variant Generator for Biomedical Terms abbreviation abstract academic acquisition algorithm algorithms american annotated annual approach approximate articles ation automated automatic baroni based between biocomputing bioinformatics biological biology biomedical blast bosch case collier computer computing conference conll cooccurrence corp corpus croft data derivational detailed development discovery domain ective eech entity ervised evaluation extended extraction friedman from gene genia grabar group guessing guided hearst hishigaki hull human identifying inference information interactions interest international ismb jacquemin journal kazama klavans kluwer knowledge krauthammer krovetz language learning lecture lexicon literature machines makino matching matiasek medical meeting mining molecular morozov morphological morphologically morphology multi named names natural navarro nitions notes ohta orthographic paci pages parsing part phonological proceedings process processing protein publishers recognition references related research retrieval role roth rzhetsky schwartz science second semantic shallow sigir similarity simple sixth society special stemming string study supp surveys symposium synergy syntax systems tagging takagi takeuchi tanigami tateisi techniques technology term terms text texts tour transactions trost tsujii tuning tzoukermann unsup using variant variants vector viewing word words workshop zweigenbaum http://doi.acm.org/10.1145/860435.860505 52 Investigating 
the Relationship between Language Model Perplexity and IR Precision-Recall Measures abstract allocation annual approach august average based berger berkeley best between blei cacm california cisi conference cran croft development dirichlet distributions documents equivalence erty general girolami hofmann indexing information jordan journal language latent lavrenko learning machine management margulis model modeling models multiple pages performance plsi poisson ponte poster precision probabilistic proceedings processing references relevance research results retrieval semantic sigir song statistical table translation with http://doi.acm.org/10.1145/860435.860489 39 Evaluating Different Methods of Estimating Retrieval Quality for Resource Selection advanced algorithms altos annual applications approach automatic based beaulieu broker california callan cambridge classical combining comparing complex computer conference conferenve connell crestani croft database databases dayal decision development distributed distribution distributions document ecir editor editors effects emmitt engines estimation european feng fidel flannery french from fuhr gaithersburg garcia generalizing gioss gravano gray grossman gull hancock harman harper hierarchies inference information institute international ipmu isbn journal kaufman knowledge kraft large logic management manmatha method modeling molina morgan multi national nested networked nishio nottelmann objective objects ogilvie okapi outputs pages performance powell press prey probabilistic probability proceedings processing publication query rath references relations relevance relevant research resource retrieval rijsbergen robertson sampling score search sebastiani second selection sigir space springer standards submitted systems technology teukolsky text theoretic transactions trec uncertain uncertainty university vector very vetterling viles walker with wong york zobel http://doi.acm.org/10.1145/860435.860525 72 XML Retrieval: 
What to Retrieve? abolhassani additional allan amsterdam approaches appropriate articles assessments based baseline buckley cial complete conclusions content discussion document documents ective editors element evaluation exible expected from fuhr full graded hatano hiemstra hyrex inen inex information initially invested jang jasist johann just kamps kazai kinutani lalmas language literature marx model models myaeng oriented orts pages passage performance proceedings proper references relevance required results retrieval retrieved return returning rijke runs rvelin salton sgml sigir smaller structured system systems text than that thesis topics traditional treating twente unit units university using vert watanabe were wilkinson with workshop zhoo http://doi.acm.org/10.1145/860435.860542 89 Summary Evaluation and Text Categorization absence abstracts advances ahmad analysis anlp applic arpa assigned automatic based benbrahim berlin better budzikowska cambridge case centroid chief choosing clasp classify cohesion comparison comprises comput computer computing constructing coordinates dimensional dimensions distribution document documents each eacl evaluating evaluation extracted extraction feature figure francisco from full given goodness hand heidelburg human indeed information jing jones just kohonen language ledford lexical literature management mani maps maybury multiple naacl neural occupied organising organizing other over paice parent phase points position presence presented press processing proposal prospects radev recognise recognises recognition references represented respectively reuters review role salient same seattle self sentence sets shows similar sofm springer studies study summaries summarisation summarising summarization summary system systems task techniques technology terms tested testing text texts that their then therein thesis this towards trained training tucker used user using utility vector vectors verlag vrusias word workshop york 
http://doi.acm.org/10.1145/860435.860503 50 Transliteration of Proper Names in Cross-Language Applications approaches arabic asru character chinese clsp cognates coling comparable computational cross document english entities error extended further generating graehl groups handle http information investigating investigation jung knight korean language languages machine mandarin markov meng merits model named names other performance phonetic pinyin rates references report retrieval satisfactory semitic speech spoken stalls systems table technical terms text this translating translingual transliteration very while window workshop http://doi.acm.org/10.1145/860435.860538 85 Query Word Deletion Prediction altavista analysis analyzing changes clickthrough commerce computer conference croft data digital discovery document editor engines expansion from global henzinger horvitz ieee international jansen joachims know large ledge local marais mining modeling moricz nement optimizing pages patterns press proceedings query references saracevic search seventh sigir silverstein spink technical user using very wolfram http://doi.acm.org/10.1145/860435.860527 74 eBizSearch: A Niche Search Engine for e-Business aaai academic achieve additional algorithm algorithms amsterdam approach author automated automatic automatically available base berg better bollacker business categories categorization chakrabarti citation citeseer classification comparison compliance conference correction could crawling currently customized data date defined digital discovery documents ebizsearch ecml engine enhancing european example exist expand experimental expressions extend extraction extracts fact features field focused focuses follow free from functionality furthermore future generate giles glossaries goal harvesting header hidden homepage http ideas improve increase indexed indexing information inquirus international items joachims knowledge latest lawrence learning less libraries machine
machinelearning machines maintained major manual manually many markov mccallum meta metadata method model netherlands niche number often only openarchives openarchivesprotocol oriented other papers performance poor portals proceedings propose protocol provides publications quality references regular related relevant repec reported requires research resource results rosenfeld search services seymore similar some specific standards structure such supervised support supports system table text than that them those topic training turns used using usually vector where wide wish with work workshop world http://doi.acm.org/10.1145/860435.860443 5 Question Classification using Support Vector Machines aaai advances alignment ambiguities answer answering approach athens available bartlett based bayes brown cambridge carlson categorization chang charniak chen chih chung cjlin class classification classifiers coling collins comparison computational computer conference content convolution cristianini cruz csie cumby danr darpa data department development diego discrete domain duffy dumais dynamic editors entropy event examination flach france francisco from gaertner gaithersburg gerber grewal guide haussler hawaii hermjakob hierarchical hill hovy html http human ieee inductive information inspired international introduction july kaufmann kernels language large learning library libsvm linguistics lloyd logic machine machines madison margin maximum mccallum mcgraw methods mitchell models morgan multi natural networks neural nigam nips nist open overview parser parsing pinpointing press probabilistic proceedings processing programming programs question quinlan radev ravichandran references report research resolve retrieval rosen roth santa schlkopf schuurmans science semantics shawe sigir smola snow software structured structures support systems taylor technical technology text toulouse towards track transactions trec uiuc uiucdcs unified university user vector voorhees watkins 
wide workshop world yang york http://doi.acm.org/10.1145/860435.860506 53 Topic Distillation using Hierarchy Concept Tree about algorithm algorithms assign auth authoritative authority based between bharat categorization chakrabarti compared computation computed computer concept concepts conference converge corpus database development difference discrete distance distillation done engine enhanced environment envrionment eventually experiments follows found givson henzinger hierarchy higher hyperlinked hyperlinks ieee improved information intend international iterations joshi kleinberg kumar level link lower manuscript markup meng metasearch mining nearer path practice precision preparation proved raghavan rajagopalan ravi references research retrieval shortest siam sigir sources structure symposium tags tawde text that tomkins topic topics trec under used using vectors wang weight when would http://doi.acm.org/10.1145/860435.860459 17 Automatic Image Annotation and Retrieval using Cross-Media Relevance Models academic advances analysis annotated annotation annual appear approach approaches barnard based belongie berger blei blobworld boston brown carson choquette color combining computational computer conference croft cross cuts data dissertation dividing document domain driven duygulu enschede erty estimation european first flowers forsyth freitas hall hellerstein hiemstra ieee image images indexing information intel international ject jordan journal kluwer knowledge language lavrenko learning lecture lexicon ligence ligent lingual linguistics machine malik management manmatha matching mathematics mercer michael minimization minka misrm modeling models modern mori multimedia names netherlands normalized notes pages parameter pattern picard pictures pietra ponce ponte prentice proceedings publishers quantizing query recognition references region relevance research retrieval riseman risk science segmentation semantics seventh sigir statistical storage system systems 
takahashi texture third thomas transactions transformation translation twente university using vector vision visual vocabulary with word words workshop zhai http://doi.acm.org/10.1145/860435.860516 63 Fractal Summarization: Summarization Based on Fractal Theory aaai abstracting abstracts access adesina aldine applying approaches automatic based classification coding controlling creation discovery display document documents edmundson empirically endres evaluation extraction feder feedback flexible founded fractal fractals freeman full geometry glaser goldstein grounded gruyter hearst ieee image implement info information jacquin koike kupiec length literature luhn management mandelbrot method metrics model naturalistic nature niggemeyer plenum proc processing qualitative references relevance research retrieval review rhetorical salton seattle selection sentence sigir simsum simulation spring stanford strategies structuring subtopic summarization summarizer summarizing techniques term teufel text theory trainable tran views weighting http://doi.acm.org/10.1145/860435.860518 65 Statistical Visual Feature Indexes in Video Retrieval accordance allowing automatically available bars based been browsing california combination communication content control cybernetics default desired developed digital digitized dynamic eighteen feature figure formed four francisco from gray histograms hour hundred idris ieee image indexes indexing integrated interface journal lengths level library method middle more multimedia nearly open otsu panchanathan parsing provided query references representation retrieval review right sample screenshot seconds segments selection several slider smoliar solution statistical system systems techniques than these threshold transactions user users value values video visual which with zhang http://doi.acm.org/10.1145/860435.860445 7 Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering alicante allan analyses analysis annual 
answer answering answers approaches association bakshi beaulieu boole breck buckley bunescu callan catu clarke clef clues coling complex computational computer conference context coop cormack cross current development document domain eacl eighth eleventh elucidating engineering erformance eriments error ertson evaluation evidence expressions external extraction eyond factoid fall ferr forum franz from full gaizauskas gatford getting good hancock harabagiu here hermjakob high hirschman hovy human huynh ifip improve information interact interaction international issue issues ittycheriah jones journal jung karger katz kisman knowledge kwak language level lexico light linguistics list llopis lynam maiorano makes management mann matching meeting mihalcea moldovan multitext natural ndez ninth okapi orted otential otin overview passage pattern patterns payne performance precision proceedings processing quan queries question questions ranking ratnaparkhi references relations relevance rescu research retrieval right rilo role roukos salton second selection selectively semantic server shallow sigir sinha siteq soubb special srihari statistical supp surdeanu system systems technology tenth term text textual three tice track trec tudhop university using vicedo view voorhees walker what winter with workshop http://doi.acm.org/10.1145/860435.860523 70 Incorporating Query Term Dependencies in Language Models for Document Retrieval adhoc allan applied approach approximation average based baseline biased bigram bigrams biterm cambridge canada cant capturing charniak chelba cikm coling collection comparable concept corpus corresp count croft culm decision dirichlet document each ectively empirical endencies erform erformance erty estimated exploiting general given have here higher improvement improvements information interp jelinek language larger lavrenko learning made massachusetts maximum meeting method methods minimization model modeling models montreal nallapatti occurrence 
olates olation onds order ordered over pages pair parameter ponte precision press prior probabilities probability proceedings query references relevance resp retrieval risk sentence show sigir signi smle smoothed smoothing song srihari srikanth statistical structures study syntactic table term test towards trec tree unigram uses using values weighted weighting where which while with word york zhai http://doi.acm.org/10.1145/860435.860460 18 Modeling Annotated Data advances allocation american analysis annotation applications approach association attias automatic barnard bayes bayesian blei clustering cohn connectivity content croft cross current cuts data dirichlet document duygulu empirical forsyth framework freitas ghahramani goodrum graphical hofmann huang hypertext ieee ijcai image indexing inference information informing intel introduction jaakkola jacm january jeon jordan journal july koller language latent lavrenko learning ligence link ltering machine malik manmatha march matching media meghini methods missing model modeling models morris multimedia naphade neural normalized overview pages parametric pattern pictures ponte probabilistic processing references relational relevance research retrieval saul science sebastiani segal segmentation semantic sigir statistical straccia systems taskar theory transactions using variational video words http://doi.acm.org/10.1145/860435.860453 13 Using Terminological Feedback for Web Search Refinement - A Log-based Study alex analysis anick assistant automatic based beaulieu belkin biased brandeis british bruza colleen compared construction context cool data delima dennis devices diane directory dissertation empirical engine enquire erika expansion experience faceted feedback head hyperindex information innovation interactive internet iterative jeng jones kelly keyword knepshield library lima lobash local mcarthur mechanisms micheline nicholas okapi pamela paraphrase park payne pedersen peter phrase precision proceedings 
project queries query recognition references reformulation relevance report research retrieval riao rutgers savage search seeking shin short sigir sikora soyeon suggestion suresh susan term terminological thien tipirneni track trec versus http://doi.acm.org/10.1145/860435.860487 38 A Comparative Study on Content-Based Music Genre Classification academic acoustic acoustical acoustics allwein america analysis applications approach archiving arti audio audiovisual available bakiri ballard based binary blum broadcast brownian cation cepstral chang cial cients cjlin classi codes cognition color companies computer conf conference construction content cook correcting cost csie cviu data daub david deshpande dial dietterich digital discrimination discriminator domain domains dowling echies ectrum ects edition eech electronic eller error estimating estimation evaluation exploration explorations expo face factors fast features flandrin foote fractional framework francisco frequency fukunaga fundamentals fung genre germany gjerdigen goto hall harwood hill histogram histograms http icassp identi ieee image imaging indexing information intel international introduction journal juang july kaufmann keislar khokhar kudumakis lambrou laroche learning lectures library libsvm ligence linney locations logan machine machines madison managing mandal mangasarian margin marsyas matias mcgraw method methods mining mitchell modeling morgan motion multicategory multiclass multifeature multimedia munich muraoka music musical ogihara online organized oulnasr output page pages panchanathan pattern perception perrot philadelphia prentice press problems proc proceeding proceedings processing proximal rabiner real recognition recordings reducing references research retrieval retrievalismir rhythm robust sandler saunders scanning schapire scheirer schultz search segmentation selectivity siam sigkdd sigmod signal signals singer singh slaney society software soltau solving sound speech spie 
statistical storage style supp support survey swain swing symposium synthesis system systems technical techniques temp theory time tracking tranform transactions tzanetakis uchihashi understanding unifying university using vapnik vector vision visual vitter volume wang waspaa wavelet wavelets westphal wheaton wiley wisconsin wold workshop york zhang http://doi.acm.org/10.1145/860435.860531 78 Topic Hierarchy Generation via Linear Discriminant Projection alessio algorithms boley called categories categorization cation cjlin classi classifying clustering csie data datasets dept documents dubes effect evaluation examination frequent hall have hierarchical hierarchically html http icdm icml jain kershenbaum koller libsvm methods most murray pddpdata prentice references reuters riao sahami schiaf sigir subsets text textlearning that topics unique used users using very words yang http://doi.acm.org/10.1145/860435.860512 59 Syntactic Features in Question Answering according achieves answer answering baseline boosting chinese clarke classifier cnrs cognition comparing comparison correct criteria croft determined documents engine engines entities evaluating evaluation expected experimental experiments falcon ferret first found from given grau group harabagiu hull identifinder illouz incorporating information inquery jacquwin knowledge language lasso left limsi litkowski masson mean measure measures moldovan named number outperforms over passage proc program qalc question questions rank reciprocal recognized references relation report results returned search second selected selection semantic surfing syntactic system tale techniques that their there tool track trec triples types used using whose with xerox