http://www.informatik.uni-trier.de/~ley/db/conf/kdd/kdd2005.html KDD 2005 http://doi.acm.org/10.1145/1081870.1081916 43 Finding Similar Files in Large Document Repositories algorithms analyzing australasian australia bandwidth banff based brin broder cambridge canada center chen chunking clustering compare computer computing conference content copy data davis detection digital documents eshghi extraction file files finding fingerprinting finkel forthcoming framework francisco garcia glassman guidelines harvard hash henderson henson hewlett html http improving infohost international isdn january jose labs large management manasse manber mass mazieres mechanisms melbourne molina monostori muthitacharoen network networks october operating overlap packard pages polynomials principles proceedings rabin random references report research review schmidt science sigmod signature similar sosp symposium syntactic system systems tang tech technical technology univ usenix using winter zaslavsky zweig http://doi.acm.org/10.1145/1081870.1081961 87 Combining Proactive and Reactive Predictions for Data Streams able accordingly accuracy achieve acquires adaptive again aggarwal algorithm along also always among amount ansitions appropriate artificial attributes austin because besides cases change changes changing cikm classification classifiers clustering college coming committee comp compact computer concept concepts conclusion conduct context contexts continuality current cvfdt data dealing decision definitions demon demonstration department detecting deterministic different does domingos drift drifting drifts dublin dwce dynamic eactiv ears easier ecause efficient eings empirical encer enchmarks ensemble erformance erplane error eseeing essential etitive even evolving excels exist february figure filtering foresee framework from future ganti gehrke given helps hence hidden histor historical history hulten human hyperplane icdm ieee incremental information instances intel into 
intractable intrusion ireland itself jority kasetty kaufmann keogh kolter kubat lanquillon large lazy learn learned learning ligence line live lowest machine making maloof mechanism method mining model modifying monitoring more morgan most much necessarily need network noisy novel oactiv offers often oncoming onent only optimal organize ortant osed ospecting over pages patterns pkdd predict predicting prediction predictive presence proactive problem proc programs prop publishers quinlan ramakrishnan random rate reactive recent references related renz repro result results retain reuse review salganicoff sampling scale scenarios science sciences scratch series shift sigkdd since solved sometimes sources stamp stanley starts still stream streaming streams street summary survey switching system takes task technical texas text than that there this time tkde tolerating tracking transitions trees trinity tsymbal university unpredictable using vermont very vldb wang weighted when which while whose widmer with witness work world yang http://doi.acm.org/10.1145/1081870.1081918 45 Price Prediction and Insurance for Online Auctions agent airfare alexander annual appear article artificial attributes auction auctions auctionsoftwarereview automated bajari bakiri based blum boosting bryan chawla class classification codes combining competition computational conditional conference correcting craig csirik curse data david density determinants diettrich documents ebay ebaystatistics economics editors empirical endogenous entry error estimation etzioni explorations fano fourteenth from ghani hortacsu http icdm icml ieee imbalances infer insights intelligence international issue january japkowicz journal june knoblock kolcz labeled learning littman lochner lucking machine mackie market mason mcallester mccallum michael minimize mining mitchell modeling multiclass nigam nineteenth online oren osepayshvili output pages pennies peter planning prasad prediction price prices problems 
proceedings purchase rand rattapoom reeves references reily research reserve retail review robert schapire scheduling semantic sigkdd software solving special stone strategies text theory thrun ticket trading training tuchinda uncertainty unlabeled using volume vorobeychik wellman winner with yates http://doi.acm.org/10.1145/1081870.1081978 103 Pattern-based Similarity Search for Microarray Data academy algorithm algorithms analysis applications approximate april architecture arep arrays beyond biclustering biology campbell captured cdna chang checking cheng church clustering commerce complexity computer conclusions conference constructing construction data databases delineating developmental dimensionality economical enriched essential euclidean expression finding foundations full given haixun harvard http hughes icde identify ieee including indexing indyk information intelligent international jiong journal large least length line linear manhattan manifest matrix mccreight metabolic micro microarray miki molecular mouse national near nearest need neighbors objects pages park pathways pattern perng philip piotr proc proceedings processing profiling range references riken sanghyun science sciences sequences sets shing sigmod similarity software space spaces subspace suffix symposium system tarthe tavazoie threshold time tree trees type ukkonen under using vivo wang weighted where whether wide yang yeast http://doi.acm.org/10.1145/1081870.1081963 89 Building Connected Neighborhood Graphs for Isometric Data Embedding alanysis algorithm algorithmic amer analysis annals annual artificial balasubramanian based belgium blocks bruges building cambridge chapman classical combinatorial comments comp company completeness computers computing conf connected connectivity continuum converging curvilinear data demartines dimensional dimensionality disjoint distance distances donckers edding edge edition edmonds efficiency esann estimated estimation european even fast figure flow 
flows ford framework freeman fulkerson garay geodesic geometric global goodness graham graph graphs guide hall hell herault high history holt icml icpr ieee improvements intel inter intractability isomap isometric jection jections johnson journal karp kruskal langford lawler learning lendasse length letters ligence machine mapping math matroids matula method minimum multidimensional neighb network networks neural niemann nonlinear nonmetric ological onent onto optimization optimizing organizing orhood othesis pages pattern plane poster preserving press princeton problem problems proc psychometrika recognition reduction references research rinehart robust roll salesman sammon scaling schwartz science self sept series sets shortest siam silva spanning stability stoc structure subtree swiss symp tarjan tenenbaum testing theoretical theory total track trans travelling tree trees ultrablocks university using verleysen volume washington weiss winston yang york zhang http://doi.acm.org/10.1145/1081870.1081965 91 CLICKS: An Effective Algorithm for Mining Subspace Clusters in Categorical Datasets agrawal algorithm algorithms analysis andritsos applications approach assigning asso attributes august automatic barbara based biclustering bioinformatics biological biology bron cactus candidate categorical ciation cikm click clicks cliques clustering clusters computational computer conf configured confusion couto data database dataset dimensional discovery dmkd dynamical each efficiently entropy ergraphs extending extensions frequent full ganti gehrke gibson gouda graph guha gunopulos high huang icde icdm ieee information international items itemsets johnston journal karypis kerb kleinb know kumar large ledge limbo madeira many march matrix maximal means miller mining mobasher mushro none novemb olcat oliveira osch overlapping partite peters raghavan ramakrishnan rastogi references robust rule scalable sciences sets sevcik shim sigkdd sigmod space sparse subspace summaries 
surprisingly survey systems table technology trans transactions tsaparas using value values variations very vldb wang were with workshop zaki http://doi.acm.org/10.1145/1081870.1081926 53 Email Data Cleaning academic adaptive advertisement agents ajax algorithms annual applying approach approaches artificial association attachment august autonomous based basenp bases bellotti berger black blocks boundary brill bulletin cambridge canada capitalized caravel channel chen chieu church clark clean cleaned cleaning cleansing clsp coference colingacl collins comm committee computational computing conditional conference content context contextsensitive corpora corpus correction cortes croft current damerau data database december decisions decker declaratively default della detection development dimensional dirty disambiguation discovering discovery discrimination discriminative dnsql documents donnees ducheneaut eclean eighteenth eleventh eliminating emailcleaner embedded empirical engineering english entities entity entropy error experiments exploration extraction factorial fields final florescu focused free freitag from fuzzy fzdtssql gale galharda galhardas ghahramani golding government grouping habitat hawaii hearst hern hidden hingham hong honolulu honour html http huang icml identification ieee improved index industry information informative inria integration intelligence interactions international internet issue issues ittycheriah january japan jdsoftware jordan journal journees july june kambhatla kluwer knowledge kong kumar kushmerick labeling lafferty lancaster language large learning library linguistics listcleaner lita lookup machine mail management markov maximum mays mccallum measure meeting mercer merge method methods microsoft mikheev mining model models montreal moore msdn multilingual named national natural ndez networks noisy normal normalization october ostendorf pages palmer paper parent perceptron pereira performances periods personal phrase pietra 
pinto precision prepositional press probabilistic probability problem problems proc processing projects publisher publishers purge rahm random ratnaparkhi real recall recognising recognition recognize references remove report research retrieval richards roth roukos sapporo scoring seattle segmentation segmenting semi sentence sequence server services shallow shasha sigir sigkdd simon spaccapietra spaces spelling springer sproat standard statistical statistics stolfo structured support table tables technical term text theory theroy third track training transformation truecasing unified unsupervised using vapnik vector verlage very walker wang washington wide winnow winpure with words workshop world yarowsky york your zhang zhou http://doi.acm.org/10.1145/1081870.1081879 7 Consistent Bipartite Graph Co-Partitioning for Star-Structured High-Order Heterogeneous Data Co-Clustering above abstract academic accepted according after aided algorithm algorithms analysis appendix applications applied aspects assigned bach basri belongs benson between biclustering bipartite boyd built bydocument cambridge canada cancer categories categorization category chang chen cheng cikm classification clustering column columns comparative computations computed computer conditions conf conference consistent convex corpus correlation correspond cuts data dataset december denotes described desgin dhillon dimensional ding document documents duda each easily edition element elements engineering erim evaluation feature feng fourteenth fractional framework frenk frequency from fujisawa fukuda furthermore genes genome gerstein global golub graph hagen hart heterogeneous hierarchical high hopkins http icml ieee illuminate image indicates info information informationtheoretic intelligence interior international interrelated isbn issue january john johns jordan journal kahng klerk kluger kluwer knowledge kojima label learning like loan machine malik mallela march matrix methods microarray mining modha
multi nakata neural nips nonlinear normalized number numerical objects obtained optimization other partitioning pattern pedersen performance point preparation press problem proc proceedings processing programming publication publishers ratio ratios recom reference references reinforcement report represented rows schaible sdpa second section segmentation selected selection semidefinite series sigir sigkdd simon singapore society sons special spectral ssrn stork study suppose systems taxonomy term terms text then theory this toronto total trans transactions type unified university using vandenberghe vector volume wang weights where wiley wise words would yang zeng zero http://doi.acm.org/10.1145/1081870.1081949 75 Co-clustering by Block Value Decomposition acknowledgments advances afrl algorithm algorithms also american analysis annual apply applying approach approximation association award banerjee based bialek biclustering block bregman called case chan cheng church cluster clustering clusters column communication computing conclusions conference consequently considered control cuts data decomp deerwester demonstrate dempster denotes dhillon diagonal different ding direct distance distribution document double dumais dyadic each ecial ecific ecml ectral effectiveness efficient empirical enforce entropy erfect erty ervised etter etween evaluations expression extensive factorization filter finally focus framework from furnas gene generalized generic ghosh gong grant graph great guan harshman hartigan have icdm icmb icml identical ieee image improves incomplete indexing indicates information institute intel interesting interpretation intuitive iterative jects journal landauer lang latent learning lerton ligence likelihood machine make malik mallela march matrix maximum means merged merugu method minimum modha more multi nature nbvd negative netnews neural newsweeder nice nips normalized novel numb obtain only opular orted osed osition otential other ottleneck over 
pages pair part partitioning parts pattern pereira precision presented press proceedings processing prop provides proximity quality questions ratio references relaxation represent research residue resulting robust royal rubin schlag science segmentation semantic semi seung shown sigir similar similarity simon since slonim society some souroujon squared statistical structure substantially supp symmetric systems table than that theoretic they this through thus tishby transactions true under unsup using value values very well when with within word yaniv zien http://doi.acm.org/10.1145/1081870.1081912 40 Reasoning about Sets using Redescription Mining algorithm algorithms along alternating analysis approach april asso august bastide bioinformatics biol biologist biologists botstein brown building burdick calimlim carmel cartwheels causton cells changes charm chromosomal ciation closed closet closure cluster computational concept conf conference create data databases datasets demonstrated described descriptors designed discovery discussion efficient eisen ellman employing empowered endent engineering environmental example examples expression expressiveness facility fast first form formal formulation foundations frequent from future ganter gasch gehrke gene generating genes genesis genomic greater grunstein harel have helm here holstege homology hsiao ieee increase information instance interactively international intl itemset itemsets jamison janoski jennings july just know kumar lakhal lander landscap large lattice lattices ledge letters like logic mafia many mathematical maximal microarray minimal mining mishra mouse nature nourine nucleosome onse pages pasquier pflatz plan potts predicate proc processing programs propositional quackenbush ramakrishnan raynaud reason redescribe redescription redescriptions redundant references relate relation resp rules scale sciences searching shore siam sigkdd silencing springer storz strategies structure stumme sturn such systems 
taouil their them there this transactional transactions turning understanding using verlag vocabularies vocabulary wang ways will wille with work would wyrick yeast young zaki http://doi.acm.org/10.1145/1081870.1081878 6 Rule Extraction from Linear Support Vector Machines addison advances analysis andreu andrews angulo applications applied artificial asscociation assisted athena august austria bartlett based belmont bertsekas beyer biomedical both bradley breast buchanan california cambridge cancer catal cecilio cherkassky classes classified classifiers cleveland computer computing concave concepts concurrent conference congress constraints control correctly coverage covered critique data datasets department detection diagnosis diederich digital dimitri discovery discriminant disputed diversity douglas edition editor editors esann european execution experiments expert extracting extraction feature features federalist fifteenth fisher formulation foundations francisco from fung generalized glenn haydemar heindel heuristic html icml ieee images institute intel international ionosphere issue jected john journal july kaufmann kernels know knowledge kurfes large larsen learning least ledge ledley letters ligence linear lkopf lung lusted machine machinery machines madison mammography mangasarian march margin math maximal mdct medical methods mika minimization mining mlearn mlrep morgan mulier muller murphy mycin nature networks neural newton nips nodules nonlinear novemb number nunez operations optimization orts ository pages paper points press preston prob problems proceeding proceedings processing prog prognosis programming project promise provost proximal pulmonary radiology reader reading reasoning references refined research results roehrig rule rules schuurmans science sciences scientific second seconds selection shavlik shortliffe shown siam side signal simple smola solved sons special springer squares srikant stanford statistical stoeckel street structured supp 
survey suykens symposium systems table tapia tech technical techniques techrep theory tickle time total towell track trained tsch university used using vandewalle vapnik vector viena wesley weston wiley wilson wisc wisconsin with wolb wormanns wpbc york zierott http://doi.acm.org/10.1145/1081870.1081907 35 Summarizing Itemset Patterns: A Profile-Based Approach afrati agrawal algorithm approaches approximating association baker based bayardo calders chemical classification classifying clustering collection comp compounds conf data database databases dehaspe derivable deshpande development dhillon discovery distributional divisive efficiently engineering ergraph etween european feature finding frequent from gionis goethals gunopulos icde icdm imielinski information items itemsets karypis khardon king know kumar kuramochi large learning ledge long machine mallela management mannila mccallum mining ounds pages patterns pkdd pods principles proc references research retrieval rules sequential sets sigact sigart sigir sigmod srikant structure substructures swami symposium systems text theoretic toivonen transversals words http://doi.acm.org/10.1145/1081870.1081892 20 Simple and Effective Visual Models for Gene Expression Cancer Diagnostics about abst academic acute adenocarcinoma aliferis american analysis analytic armstrong artificial based behavior betensky better bhattacharjee bioinformatics blood boue brunsdon cancer cancers carcinomas case cell central charlton class classification clinical cognition comprehensive computer contextual correlates curk cutting dasarathy data datasets decision demar deoxynucleotidyl depth diagnosis diagnostic diego different diffuse discovery distances distinct distinguishes embryonal evaluation experimental expression faculty febbo febs fotheringham from gaasenbeek gene genetics gliomas golub grinstein hallmarks hanahan handbook highly histological hoffman human ieee induction information integration intel interactive investigation 
journal khan knowing komura kononenko large layout learning leban lebien letters leukemia ligence ljubljana lovell lung lymphoblastic lymphoma machine machines malignant mani marx mathematical mcgavran medicine methods microarray mining molecular monitoring mrna multi multicategory multidimensional multivariate nakamura nature nearest neighbor nervous networks neural nonpositive norms nutt orange outcome pages paper pathology pattern pediatric perceiving perception pkdd pnas pomeroy potency precursor prediction press profile profiling prostate references relative reliability relieff research reveals richards ringnr ross science sciences shipp signatures silverman simec singh slonim social society specify springer statistical statnikov staunton structure studies subclasses subtype supervised support survival system tamayo techniques terminal than that track transferase translocations trees tsamardinos tsutsumi tumour unique using vector verlag vishton visual visualising visualization weinberg white with zupan http://doi.acm.org/10.1145/1081870.1081927 54 Dynamic Syslog Mining for Network Failure Monitoring administration agrawal alarm algorithm algorithms analysis annals applications approach areas asymptotically atkins automated baum belief bounds burns cambridge chains change classification clustering codes coding communication communications completion compression computer conference construction convolutional correlation cybernetics data decoding detecting detection discounting discovering discovery driven dynamic editor eleventh empirically encoding engineering episodes error estimation event fault finite framework frequent from functions future government grabarnik graphical hansen hellerstein high hinton icde ieee ifip incremental individual industry inform informatics information integrated intel international ipom issue jakobson jordan journal jsac justifies klemettinen kliger know krichevsky learning ledge lempel ligent lightweight line lisa localization 
logs lonvick magazine management mannila markov maruyama maximization milne mining mixtures model models monitoring mozes multi neal need network networking networks ninth notification observed occurring ohsie operations optimum other oultlier outliers pages paper patterns performance perng petrie points prediction present press probabilistic proc processing protocol rabenhorst rate references relationship rissanen robust rule rules security selected selection sequences sequential series service sethi seventh sigkdd signal sixth smyth soules sparse special speed srikant states stationary statistical statistics steinder swatch syslog sysmposium system systematic systemics systems takeuchi taylor technique telecommunication that theory thoenen time toivonen tool track trans trofimov unifying universal unknown unsupervised usenix using vaarandi validation variable variants vernamo view viterbi weiss weissman williams with workshop world yamanishi yemini http://doi.acm.org/10.1145/1081870.1081888 16 Local Sparsity Control for Naive Bayes with Extreme Misclassification Costs accurate adjusted adjusting adro articial assumptions ation australian calibrated categori cedure classier classiers comparative computation conference data decaestecker decision discover eighteenth eighth elkan esian estimates estimation feature fourteenth from hang icml induction intel international into joint karger know latinne learning ledge ligence logistic paper machine mining multiclass naive neural references research obtaining outputs pages pedersen piecewise priori probabilistic probabilit probabilities proceedings track regression rennie saerens score scores selection shih simple stud tackling teevan transforming trees twentieth webb with http://doi.acm.org/10.1145/1081870.1081947 73 Determining an Author's Native Language by Mining a Text for Errors aaai acapulco across addison american analysis annotation anonymous application applied approaches arbor argamon asialex assoc attribution
augmenting author authorship automatically avneri baayen based bayes biometrika cases categorization categorizing cave chapter characteristic chodorow classifiers computational computer computeraided computers computing congress corder corpus cultures dagneaux denness detecting detection disputed educational engelson english enhance error errors exploiting fakotakis federalist foster from gender grammatical granger halteren henry here holt humanities idiosyncrasies ijcai implications inference interlanguage international isahara izumi japanese jones journal kaneko kokkinakis koppel korea lado language leacock learners learning length lexical lexicography linguistic linguistics literary mass measures meeting method mexico michigan million models mosteller naacl naive newspaper nonnative outside oxford peng press proc proceedings prose reading references retr round saiga schler schuurmans second sentence shadows shimony speaking spoken stamatatos standard statistical style stylebased stylistic syntactic synthesis system technology test text texts tomokiyo tono trail tweedie university unknown unsupervised using utterance wallace wang wesley what with without word workshop written york yule http://doi.acm.org/10.1145/1081870.1081970 95 Data Mining in the Chemical Industry aaai academy advances analysis anderson bass basu berry breyfogle business chain comprehensive conceptual control corporate customer data dhar dick discovery edition education fayyad forecasting foundation framework francisco free from gale gustafsson hall hand haykin implementing improving integrated intelligence interscience into jenkins jersey john johnson jossey journal knowledge linoff loyalty management managing mannila marketing measurement methods mining mittal networks neural overview pearson piatesky prentice press principles profit references reinsel research sales satisfaction science second series service seven shapiro sigma smarter smyth solutions sons statistical stein strengthening 
support system techniques third time toward transforming using value volume wiley york http://doi.acm.org/10.1145/1081870.1081885 13 Combining Email Models for False Positive Reduction aaai access agents algorithm alkoot alspector analysis androutsopolous annotated anti application approach architectures arizona artificial assistant association automatic automation autonomous available baker based bayes bayesian behavior between bibliography budapest budrevich california cascon categorization cation ceas chapter chung classi classification classifier classifiers classify clemen clustering cohen columbia combination combining comparison computational computer conf conference content contents continuous cost costs croft crossroads damashek dept development dietterich distributions drucker duda duin dumais eacl ecir eleventh email emails emnlp empirical ensemble environments estimating european event fawcett file files filtering finding forecasting forecasts foundations fourth francisco fransisco freely fusion gauging genetic graham gram grams hallam hart hatef heckerman hershkop heuristics hidalgo hierarchical hill horvitz hungary huynh identification identifying ieee ifile imprecise incremental informatics information intelligence intelligent international itskevitch john jose journal junk karkaletsis katirai kaufmann kephart kiritchenko kittler kolcz kwei langley language languageindependant large larkey learning lecture linguistics lisbon littlestone logic long machine machines mail mailcat majority manber manual massey matas mathematical matwin mcgraw message messages methods mining misclassification mitchel model models morgan mountain multiple naive natural networks neural nimeskern notes office organizing padmanabhan paliouras pattern patterndiscovery patterns peeking peng performance plan pollock press processing profiling programming provost rennie research retrieval revew rigoutsos robust rule rules sahami sakkis sanz scene schneider schuurmans science 
seattle second securing security segal sensitive sigir similar similarity simple software sons sorting spam specific spring spyropoulos stacking stamatopoulos stolfo support swiftfile symposium system systems technical techniques text textdm that thomure toolkit training trans transactions tucson uncertainty university unsolicited usenix using vapnik vector verisign versus view vote wang warmuth weighted wiley winter with without workshop york zheng zurich http://doi.acm.org/10.1145/1081870.1081929 56 Learning to Predict Train Wheel Failures aaai academic accuracy additive advanced aircraft albert algorithm alternative american analysis approach artificial association attribute august authors bagging based best better blahavas boosting categorization chowicz class classifier classifiers combination combinations combining comparison complex component concepts condition conference consistent consortium constructing content continuous conversano correlation cost cross data december decision derailment detection dietterich discovery discrete discriminating distributions doctoral drummond duin dzeroski effective efron empirical engineering ensemble ensembles equipment error estimating european events expected experimental experiments explicitly extracting famili fawcett feature features francisco frank frieder generalized generation globally government grossman hall harries health heterogeneous heuristics hidden hill holte horn http huang hunt identification ieee ijcai images implementations imprecise improvement industry information initiative instance intelligence intelligent interaction international issue issues italy java joachima journal july kamber katakis kaufmann kibler kira kluwer knowledge kubat learning lechowicz ling loading machine maclin madison maintenance management managing many matwin mcgraw mcml measure methods mining mitchell mola monitoring more morgan multimodel multiple national ntsb numeric opitz ottawa paper performance pisa popular practical 
prediction preventing proceeding proceedings provost publishers radar railroads randomization rate references relevant rendell replacement report representation representing research retrieval rules safety salient salientsystems sammut satellite school selecting selection siciliano sigart sigkdd special spills stacking statistical statistically study supervised support symp system systems technical techniques technology text than theoretical thesis three through tools tourneau track train trees tsoumakas under university validation vector visualization voting wheel wisconsin with witten workshop york zenko zhang http://doi.acm.org/10.1145/1081870.1081875 3 A Bayesian Network Classifier with Inverse Tree Structure for Voxelwise Magnetic Resonance Image Analysis aaai academic accelerated accepted accuracy acknowledgements adaptive adopting advanced advances aging algorithm algorithms analysis application approach artifact artificial assisted attribute bacchus baltimore based bayes bayesian besag binder blsa bncit bouckaert brain branches bryan buntine callosum chen cheng chickering classification classifier classifiers comp compare comparing comput computational computer computerized conference construction coop corpus could cowell data databases davatzikos designed difference dimensionality discovery distributions domingos duda editor editors elastic elief elieve elisseeff empirical engineering erformance erties explained fayyad feature figure first frequenc frequency friedman from funded geiger geman generation gibbs goldszal goldszmidt grant graphical greedy greiner guide guyon hammer hart health heckerman herskovits hidden hierarchical high home http human identification ieee ijcai image images imaging implement incomplete induction inference institute institutes intel interaction international introduction ject jensen john joint jordan journal june kanazawa kaufmann kluwer know knowledge kohavi koller langley lattice learning lecture ledge letovsky ligence 
ligent link literature longitudinal loss machine mamdani mantaras mapp mapping matching mechanism medical menlo method methods mining model models morgam morgan morphological morphometric morphometry naive national nato need network networks ninth notes ootstrap optimal optimality orted ossible other overall page pages pami paper park particularly pattern pazzani pearl peng pham plan poole press prince principle probabilistic proceedings processing prop protocol proved publishers quanlitative quantification quantitative random real reasoning references regardless registration relaxation resample resampling research resnick restoration result results rish royal russel sahami scheme science search selection series sets shen significant simple smyth spaces spatial springer stanford statistical statistically stereotaxic stochastic structure study subset such supp suzuki system systems tenth that thesis thiesson this thompson tomogr tomography towards track trans tutorial uncertainty under university uthurasamy vaillant validation variable variables volumetric whether which wiley with work workshop wrapp york zero http://doi.acm.org/10.1145/1081870.1081945 71 A Maximum Entropy Web Recommendation System: Combining Collaborative and Content Features aaai allocation annotation approach automatic based blei civr conditional conference content dirichlet entropy ersonalization generalized goodman image international iterative jelinek jeon jordan jose journal latent learning machine manmatha maximum methods mobasher models naacl personalization press probabilistic proceedings recognition references research retrieval scaling semantic sequential speech statistical unified usage using video workshop zhou http://doi.acm.org/10.1145/1081870.1081938 64 Creating Social Networks to Improve Peer-to-Peer Networking ability able account advances allocation alternative amount analyzing annual apply approach approaches associative august availability based beal because benefits berkeley 
blei california center central chord chose claims classical clustering clusters cohen collective combination combines come compared conference connections content correct could create creating crespo customers data degree demonstrate department described designed desirable detection determine development dhts difficult difficulties dirichlet discovery distributed documents does domingos dynamics easily efficient enhancing entire evaluating existing experienced exponential family fiat file files find first flooding formal found from function garcia generative graph harnessing hash have hdps hellerstein hierarchical hofmann however http huebsch ieee improvements increase indexing influence influences infocom information insight interest international internet into january jordan journal kaplan kempe kernels kleinberg know labonte large latent lavrenko learning ledge levine libraries locality location lump machine maggs making massacusetts maximize maximizing mining model molina more musical nature negligible neighbors network networks neural newman nodes number only outliers outperformed overlay overlays overview pages particular pdos peer people performance placed placing plsi popular popularity presents priori probabilistic proceedings processes processing processor project provide provides queries query querying random rare rather real recently recommendations references related relevance removed report requirement research results retrieval review richardson sacrificing scale search seek semantic semantics september seventh shared sharing shenker should siam sigkdd significant single small social soft some specifying spread sripanidkulchai stanford statistical statistics stoica strogatz structure styles sufficiently summarized surrounding system systems table tardos technical techniques tend than that their theory these thesis they this through topic topics traces traffic trends typical understanding university used user users using utilize value version very 
vldb volume watts were when where while with without work world zhang http://doi.acm.org/10.1145/1081870.1081909 37 Anonymity-Preserving Data Collection abort aborted aborts abstract accountability accuracy achieving active adding address advances aggarwal agrawal algorithms allows along also amer andrew annual anonymity anonymization anonymous answer approach april asiacrypt association attacks august available aviel barbara based bayesian beach behavior bias boneh booth bortz breaches california cambridge census channel chaum ciphertexts classification cleartext clifton clustering collect collection comm communications communities computation computed computer computes computing concerning conclusion conference corresponding council cranor crowds crypto cryptographers cryptography cryptology customer dalenius data database databases datamining datta december decryption decryptions design detected digital dining dinur directive disclosing discovery distributed during dwork each editor efficient eighteenth eighth election electronic elgamal eliminating encryption environments eurocrypt european evasive evfimievski exchange exit extends finding flash follow force forward forwards foundations free from fuzziness game gehrke generalization generalizing generate generates gilburd goldreich goldwasser golle haritsa haystack health heterogeneous hipaa honest hopper hordes identifying ieee implementation implications individuals information insurance international internet invalid issue issues itoh jakobsson juels kargupta kilian know knows kurosawa large leader lecture ledge levine limiting lindell lncs loss luis mail maintaining malicious management marks markus means mental message micali michael miner mining missing mixing model movement multicast needle network newport nicholas ninth nissim notes nothing number october official optimistic other over page pages paper park parliament partial partitioned permuted personal perturbation phase piece pinkas play polls 
portability potentially practical present preserving press prevents principles privacy private probabilistic problem proc proceedings processing proof proofs properties propose protect protecting protection protocol protocols provable provide proving pseudonyms public quantification rais random randomized receipt recipient records references regard reiter rerandomization research respondent respondents response return revealing revisited rizvi round rubin rule rules sako samarati santa scale scheme schemes schuster science sciences secrets sector security semi sender sends sent shields siam sigact sigart sigkdd sigmod sivakumar solution special spite springer srikant stat statistics structure submission submitted such summary suppression survey sweeney symposium syst system systems technique techniques telecommunications tenth their them then theory they third this three track transactions transmission tsiounis type uncertain unconditional university untraceability untraceable using vaidya verifies verify verlag vertically vldb volume voting waidner wang warner washington when which while wigderson will with without wolff workshop wright yang yung zhan zhong zkps http://doi.acm.org/10.1145/1081870.1081881 9 Mining Tree Queries in a Graph abiteb addison advances advantages agrawal algorithms already also alternatives analysis animal annual another appendix applications approach argument around arxiv association august balachandran bases beera berry bharat brings california canonical case cases centrality cercone chain chakravarthy chandra chang chawathe children citation cited city clearly cohen comes complex computer computing conducted conference conjunctive consisting consists contain containment contains contradicting croft cross current data database databases datasets dayal december deep describ design desired direction discovered discovery discrete distinguished done dumouchel each easy ecies ecology edges edinburgh editor editors either eliminable elled 
encouraging energy entirely erformance ermutation ermutations erties etween existential extraction fast fayyad figure finding finite first flocks following food forms forward foundations fourth free frequency frequent frequently from function gehrke generality generalization generating ghazizadeh graph graphs have height hence henzinger high homomorphism hull icdm identity ieee implementation implications indeed induces induction information integrating intel international iteration itself jagopalan japan jeong jose journal kamath karypis know knowledge kohavi kumar kuramochi lange large least lecture ledge lemma lethality levelwise lexicographic ligence linear linkage links machine maebashi management mannila mapp mapping mappings martinez mason maximal maximality measurements memmott merlin michie mining models moreover muntz must natural nature need networks newman nodes note notes november numb number online only optimal order orders ossible pages pakdd paper parasites parent path pathogens patterns performance physics piatetsky placing predators preliminary press proceedings proof prop protein prototyp prove published queries query raghavan random record redundancy redundant refer references relational research results review richness root rooted ruhl rule rules ruskey sarawagi satoh science scions search second sense seus shapiro showing shows siam sigkdd sigmod since sites sivakumar sizes skillicorn smith smyth society some space sparse springer srikant standard still structure subtree summaries symposium synthetic systems taken tenth terms that their then theories theory there thinking this thomas thus time together toivonen tomkins track tree trees trophic tsur tuned ullman university upfal used using uthurusamy variables verkamo vianu volume well wesley while whom widom with would yang zhang http://doi.acm.org/10.1145/1081870.1081934 60 Scalable Discovery of Hidden Emails from Large Folders acsac algorithm amit amount anchoring another anti applied 
applying automatic bagga based basic bryan carenini carvalho case centers centry chignell chris classification cohen commission computers computing conclusion conference conferences consideration contact context contribution corpus cscw dataset datasets deal dealing decisions derek develop differences digital directions discovery discussion document down ecml edded effective effort electric elsevier email emailfiltering emails enefit energy enron erimentation etter etween european evaluation even exploiting exploring extract federal ferc find first florida folders found fragment fragments framework from further future generalize giusepp given global goal gwizdka help hidden hiddenemailfinder hierarchy http identifying ieee improve include indexing individual indus industries info initial integral intelligent interacting interface italy ject joint july kinds klimt kulesh language large larger last learned learning least lessons libraries lines lists long louisiana machine mail make manage many march matched melb memon messages mexico mining minneap minnesota models more mountain nasir natural need nenkova newman novemb numb observe olis optimizations optimize oregon orleans other ourne part paula pending piecing plan plans portland poster prevalence prevalent problem proceedings protected protection providing quotations quoting raymond real reassembly reconstructing redesigning reduce references regeneration regulatory reinventing release reply representing required research robust rohall santa scalability schmandt science second session shanmugasundaram show side signature spam states statistical steps stern steven strengthen structure studies study summarization symposium task tasks techniques text that their them these this thread together tools under understanding united used user users using valuable various view vitor well will william with word work workshop xiaodong yang yiming zhou zwart http://doi.acm.org/10.1145/1081870.1081908 36 Mining Closed Relational 
Graphs with Connectivity Constraints aaai academy algorithm algorithms analysis annual apriori bandyopadhyay based berthold between biological biology borgelt botstein brown burdick butte calimlim carpenter chekuri chemotherapeutic closed cluster communities computational conf cong constraints cook data databases datasets discovering discovery discrete display djoko efficient eighth eisen engineering european experimental expression fimi finding first flake fragments frequent from functional gehrke genome giles goldberg golub graph graphs holder huan icde icdm identification implementation inokuchi intersecting itemset karger karypis know kohane kuramochi lawrence ledge levine long mafia maximal mielikainen minimum mining molecular molecules motifs motoda national pages patterns pkdd principle prins proc protein recomb references relationships relevant research science sets siam slonim snoeyink soda spatial spellman stein structure study subdue subgraph substructure substructures susceptibility symp symposium system tamayo transactional tropsha tung volume wang washio wide with workshop yang zaki http://doi.acm.org/10.1145/1081870.1081891 19 Feature Bagging for Outlier Detection adaptive advanced advances aerial agarwal aggarwal agrawal agreement algorithm algorithms amsterdam analysis anomaly applications approaches april arnold artificial august australia automatic bacon bagging bangalore barbara bari barnett based baxter beyer bias billor binford blake blocked boosting boston bowyer breiman breunig brighton brisbane buena canada case cavtat challenge chance chawla city class classes classification classifying cluster clustering clusters coding coil combining company comparative computational computationally computer conference content correcting corrects correlation credos croatia dallas data database databases datasets december densitybased designing detecting detection dietterich dimensional discovery dissertation distance distancebased distributions domains 
down dreilinger edmonton efficient eight eleventh engine engineering engines ensembles environment environments error ertoz eskin european exception experiments expert faloutsos fast fawcett fifth finding findout fourth framework france francisco freund from functions gehrke generalized geometric ghosh gibbons giles goldstein gunopulos hadi hall hawaii hawkins haystack high hong honolulu howe html icdm icdt identifying ieee images imprecise improved improving incremental india induction information inquirus institute insurance integral intelligence international intrusion intrusions israel italy jajodia jerusalem john joshi journal july june kais karypis kingdom kitagawa kluwer knorr knowledge kong kriegel kumar lake langley large lavrac lawrence lazarevic learned learner learning learns lecture leiden lewis liacs linear local loci lyon machine magazine make maloof management march markou mcburney meaningful mearf measures medical merz meta michalski mining minnesota minority mlearn mlrepository mozetic multiple multipurpose national near nearest neci needles neighbor network networks neural nevatia ninth noisy nominators notes novelty november october ohsawa outlier outliers output over ozgur oztekin papadimitriou paper part partitions phase philadelphia pkdd pnrule portnoy predicting prediction predictors prerau principles probability proceedings processing progress projected provence provost pruning putten query raghavan ramakrishnan ramaswamy randomization rare rastogi record references replicator report repository reranking research reuse review ripple robust rooftop rule rules sage sander santa savvysearch schapire schemes schwabacher science search seattle security sentient september sets seventh shaft sheikholeslami shim siam sigkdd sigmod signal similarity simple singh smoteboost someren sons spaces spatial springer srivastava stanford statist statistical stolfo strehl strong structure study subspace suzuki system systems technical testing that theory 
third three time track undirected unified united university unlabeled unsupervised using variance velleman very vista vldb warehousing washington weak when which wide wiley williams with world york zhang zytkow http://doi.acm.org/10.1145/1081870.1081948 74 A Fast Kernel-based Multilevel Algorithm for Graph Clustering algorithm analysis augmented august austin bounds chan circuits clustering comput computer computing conference cuts data development dhillon dimensional donath factorization fast fill graph graphs guan hendrickson heuristic high hoffman ieee image integrated intel international irregular iterative jones karypis kernel kogan kulis kumar laboratories leland ligence local lower machine malik matrix means mining multiclass multilevel national normalized pages paral partitioning pattern proc proceedings processing quality ratio reducing references report sand sandia scheme schlag scientific search segmentation siam sparse spectral systems technical texas text trans unified university view vision zien http://doi.acm.org/10.1145/1081870.1081884 12 Wavelet Synopsis for Data Streams: Minimizing Non-Euclidean Error adaptive aggregate algorithms also annual approximate banks based callaghan cambridge chakrabarti chakrabati clustering computer conference construction data databases daubechies deligiannakis deterministic dimacs dimensionality dynamic efficiency error estimation extended fast filter finkelstein focs garofalakis gibbons gilbert graphics guha histogram histogramming histograms icalp icdt image indexing indyk item jacobs karras keogh kotadis kotidis kumar large lectures locally maintenance mamoulis manuscript martin matias maximum measures mehrotra metric metrics mishra motwani multiple multiresolution muthukrishnan nguyen optimal pass pazzani pods press probabilistic problems proc processing queries query querying rastogi reduction references roussopoulos salesin scott selectivity series shim siam sigmod small space stoc strang strauss streaming 
streams summaries surfing synopis synopses synopsis thresholding time tods urieli using vitter vldb wang wavelet wavelets wellesley with workload xwave http://doi.acm.org/10.1145/1081870.1081973 98 Automated Detection of Frontal Systems from Numerical Model-Generated Data academic acknowledgement across adam addison addition additional adjacent africa after algorithm also ames analyses annual applied area automated automatically available based bayes because belongs boundary calculated canada causing center characterize classifier climate close cloud clustering clusters coast comm company compared comparisons components computers conditions conference corresponding counted create cumulus cyber cybernetics data defining denver derived descriptors detected detection detects determine developed development digital discovery does domain each early east effects engineering engineers evaluation even experiments expert experts false fast feature field fields fifth figure figures filter filtering flight from front frontal fronts funded gaussian geosciences goddard goldgof gonzalez government grant graves greenland grid ground hall help hierarchical hopkins however hydrology iceland identified identifies identify ieee image imagery images index indexed indices industry information instrumentation interactive international into investigative ipps journal kelleher kennedy klooster koutroumbas kramer kumar labeled lakshmivarahan levit likely limitation located major means meeting meteorology method methodology mining missed model most moves nair nasa needed networks neural nexrad noise north northeast northwest number ocean oceanography once optical orographical other over parallel parameter part particle particular partition pattern patterns peak persistently plankton point poster potential potter precisely press probability probably proc processing profiling proper publishing radar ramachandran rate recognition recognizing recorder references regions remsen research 
respectively results rushing satellite science scientific scientists seattle segmentation sets shadow shape show shown siam signatures since single some southwest space spain special speed stage steinbach step steps study such suen symposium system systems technique techniques temperature that theodoridis there these they thinning this though three threshold thresholding time toolkit total track trained trans treatment truth used using utilizes value weak weather welch were wesley west which wind with woods work workshop zhang http://doi.acm.org/10.1145/1081870.1081876 4 Variable Latent Semantic Indexing adfocs advances aggregation analysis another applications approximation approximations automatic baeza based bases berry bishop brief brien caching chapter collaborative component computation computer conference data daviddlewis decomposition deerwester development dumais eckart edition editor efficiency eigenspaces eighth engineering engines face factorization feature filtering first fonseca furnas generalized golub group guide harshman hoffmann hofmann hopkins http human humans icml idea identification ieee improve indexing information inspection institute interaction international jaakkola jects john jolliffe journal keenbow kernel krishna landauer large latent learning lempel letin level lewis lkopf loan lower machine mahabhashyam matrices matrix meets meira mika mining models modern modular moghaddam moran moura muller national nature negative neto neural nips noising okapi overview pages paper parts pentland predictive prefetching preserving press principal probabilistic proceedings processing psychometrika quality query raghavan rank recognition references report research resources results retrieval reuters riberio robertson royal saraiva scalable scholz science scime score search second semantic series seung sigir singhal singitham slides smola society spaces spie springer srebro standards statistical status svdpackc systems technical techniques technology 
tennessee testcollections text third tipping track tradeoffs transactions trec tsch university usage user using varadhan vector version very view vldb volume walker weighted wide world yates young ziviani http://doi.acm.org/10.1145/1081870.1081976 101 Short Term Performance Forecasting in Enterprise Systems american analysis anomaly application applied artificial automated autonomic barham based bayesian block brewer building characterizing chase chen chess classifier classifiers clusters cohen company computer computers computing conf control correlating critical data dependable detection determination diagnosis domingos dynamic edition eigenspace engineering ensembles event extraction feature first forecasting fratkin friedman geiger goldszmidt group gupta hall hellerstein hewlett hill http httperf identification inference instrumentation intel internet intl isaacs jenkins john june kashima kaufmann kelly kephart kiciman kohavi kutner labs large learning ligence ligent linear ljung loss machine magpie management managementsoftware mcgraw measurement measuring modeling models moreira morgan mortier mosberger nachtshein neter network networks normal oliner openview operation optimality osdi packard pages patterson pazzani pearl performance pinpoint plausible prediction prentice proactive probabilistic proble problem problems proc proceedings products provan reasoning references reinsel repairing report request rish sahoo scale scientific selection self series server shahabuddin simple software states statistical subset symons system systems theory time tivoli tool under user using vision wasserman workload workshop wrappers zero zhang http://doi.acm.org/10.1145/1081870.1081957 83 A Hybrid Unsupervised Approach for Document Clustering aaai algorithm algorithms analysis annals approach artificial automatically calinski classification cluster clustering clusters communications comparison comparisons conclusions conference cooper corpus criterion data datasets dempster 
dendrite dendrogram determining dimension document documents duda empirical estimating evaluation examination experimental extracted extraction extracts from functions generated generating harabasz hart heckerman hierarchical hybrid incomplete information initial initialization intel international introduces john journal karypis know labeled laird language learning ledge ligence likelihood machine management maximum mccallum meila method methods microsoft milligan mitchell model national news nigam number paper pattern patterns procedures proceedings process psychometrika references report resources reuters riloff rose royal rubin scene schwartz selected series several society sons statistical statistics step stevenson technical text that theoretical third thirteenth this thrun tomorrow unlabeled untagged using volume whitehead wiley yesterday york zhao http://doi.acm.org/10.1145/1081870.1081975 100 Mining Rare and Frequent Events in Multi-camera Surveillance Video using Self-organizing Maps academic advances algorithms alhoniemi amir analysis applications artificial audio ayers bandi based behavior between bouldin browsing brumitt camera cluster clustering collins complex computer computing conference content cooperative copenhagen darpa data databases davies dementhon denmark design displays distances djeraba doermann dublin easyliving eccv efficient elsevier environment european event events figure finding framework from furht fusion gong government hale handbook harris helsinki himberg human ieee image industry intelligence international ireland july kanade kaski kaufmann kluwer kohonen koskela kote krumm laakso laaksonen lecture letters lipton machine maps marques matlab maybank measure meyers mining monitoring morgan multi multimedia multiple network neural notes office organizing pami parhankangas pattern people person picsom ponceleon poster press proc program publishers rare recognition references report results retrieval robust rosenfeld searching self 
selforganizing sensor separation sequences shafer shah siebel simoff springer srinivasan summarization surveillance synchronized taken tech technology tool toolbox track tracking trans understanding units university using usual verlag vesanto video views vision visual visualization with workshop zaiane zhong http://doi.acm.org/10.1145/1081870.1081919 46 Deriving Marketing Intelligence from Online Discussion abney agent agents agrawal algorithms analysis arising artificial association automatic autonomous based bases baumgartner behavior block boards boardviewer bocca bollacker cascades chen citeseer cohen community computer conf conference construct craven crawling data declarative dipasquo documents editors eleventh european extraction fast finite flesca flexible freitag from geometric giles glance gottlob hawaii honolulu html hurst identification information integrating integration intel intelliseek interesting international jagopalan jarke jensen joins kaufmann knowledge language large lawrence learning lecture ligence linguistic lists lixto logic mail mapping mccallum message meta mining mitchell morgan networks newsgroups nigam notes over pages parsing partial proc proceedings publications recursive references report representation retrieval robust rules school science search signature similarity slattery social sproat srikant state summer system systems tables technical transactions twelfth using very vldb wide with word workshop world wrapping zaniolo http://doi.acm.org/10.1145/1081870.1081902 30 Sampling-Based Sequential Subgroup Mining aaai adaptivit additive advances algorithm algorithms annals another application artificial assistant august background barcelona based basket bayesian berlin better blake boosting boulicaut breiman brin california carlo carney chapter classification classifiers computer conference confidence considerations construction continuous cost counting covering cunningham data database databases decision detecting discovering 
discovery distributions diversity dynamic ecml editors eleventh engineering ensembles environment eriments erschatz estimating european evaluation example exception explora fawcett fayyad feature finding first fischer flach flexible forests frank frequent freund friedman furnkranz generalization geometry graphical graphs gupta hastie hill html http icdm icml ieee implementations implication improved incorp inductive intel interesting interestingness international into introduction isometrics itemset itemsets jaroszewicz java john journal kaufman kaufmann kavsek klinkberg klosgen know knowledge komorowski langford langley lavrac learnability learning lecture ledge lehren lernen ligence line llwa local logic logistic machine machines mackay makes management margin market mcgraw measures menlo merz methods metrics mierswa mining mitchell mlearn mlrep models monte morgan morgen morik most motwani multi multipattern multistrategy naoki networks notes oosting orating order ository pages pair park patterns piatetsky pkdd platform practical predictions press principles prior proc proceedings programming proportionate quality quickly rahim random rated references regression relational research researchers ritthoff rochery rule rules sampling schapire scheffer scholz science sciences selection sensitive sequential shapiro sieb sigmod silb simovici singer smyth space spain springer srihari statistical statistics strength subgroup subgroups submitted supp suzuki symposium system systems tagungsband techniques theoretic through tibshirani todorovski tools towards transactions tsur tucson tuzhilin ullman uncertainty understanding unifying using uthurusamy vector verlag versus view weak weighted weighting what wissen with witten woche workshop wrob yale york zadrozny zelezny zupan zytkow http://doi.acm.org/10.1145/1081870.1081974 99 Disease Progression Modeling from Historical Clinical Databases aggressive algorithms analytical applications approach april assign balanced 
baltimore banerjee basis biliary blei cambridge chapman chapter characteristics characterize classification clinical clustering combining computational conf considered controls corp data database detre develop different disease distinguish each edition evaluation everhart evolution feature finally finding first fitting from generalized ghosh gonye gordon greville groups guide hall hardy have inequalities inferred insightful into inverses involves israel jective kaufman krieger laboratory littlewood liver main measurable medicine meet mining model models more negative niddk oorer order outcomes pages parameters patient patients pearson plus polya predict press problem proc procedure prognoses progression publishing quantitative references resulting results rousseeuw scaling schwab seattle siam statistics summary surgery survival that then theory this time tract transplant transplantation treatments university using variables vectors wiley williams with york zylkin http://doi.acm.org/10.1145/1081870.1081904 32 Finding Partial Orders from Unordered 0-1 Data advances agrawal agusti algorithm algorithms alizadeh alroy andrews annual armour association atkins begun bernor biochronologic biochronology biogeographic boman bonis booth bruijn cameron chelu chromosomes chronologic columbia comp computing conference consecutive damuth data databases daxner diachrony discrete earance ectral endium erty etween eurasian event events evolution fahlbusch faunas fejfar fessaha fifth fortelius franzen gasparik gentry geology graph graphs heissig hendrickson hernyak hoeck hungary imielinski implications interval italica items janossy journal kaiser karp kordos koufos krolopp large late llenas lueker macroevolutionary mammal mammalian management mapping meszaros method methods mining miocene mueller multidisciplinary neogene ones ordination pages palaeontographia paleobiology paleoenvironmental patterns physical planarity press prob problem proceedings processes prop quantifying recent 
references renne research rocek rook rool rudabanya rules scott seriation sets siam sigmod steininger swami symposium syndlar synthesis syst systematic testing tree tupal ungar unique university using weisser werdelin western ziegler zweig http://doi.acm.org/10.1145/1081870.1081897 25 Detection of Emerging Space-Time Clusters account accounting accurately activity advances aerosol aggregated aggregation agrawal alamos alarms allow also american analysis anthrax applications applied approach assume assuncao athas attack automatic balakrishnan baseline baselines bayes bayesian behavior besag between binomial biometrics birkhauser building bump calculations cancer carnegie cell changes class clayton clearly cluster clustering clusters combination communications computations computing conclusions conf cooper correct correlation counts current currently data days demonstrated depending designed detecting detection detector different difficulty dimensional directions discovery disease distinction distributions each early editors emerging empirical epidemics estimate estimated estimates evaluating expected extend extension extensions factors false fawcett feuer fisher framework friedman from gaussian gehrke generalized generated geographical glaz gunopulus hartman have health heffernan high highly hogan hunting hypothesis image important imprecision independent independently infer inference information inst interactions interesting intl iyengar journal kaldor knowledge knox kulldorff laboratory large less level location make mantel mapping math mean medicine mellon methods mgmt miller mining mitchell model models mollie monitoring moore more most mostashari multidimensional nagarwalla negative neighboring neill neural noticing null number numbers observe only outbreaks overdispersion pages pereira perhaps periodic permutation plos positives predict presented press previous prior problems proc processing prospective provost public pvalue raghavan randomization rapid 
rapidly references region regression related relative release relevant report research restoration result results risks rods royal scan second section separately series should sigkdd sigmod significant since society space spatial standard standardized statist statistic statistical statistics still subspace successful such surveillance system systems task tavares technical testing than that theory these this thus time university using values variance very wagner wallstrom warning ways when which will with work working york http://doi.acm.org/10.1145/1081870.1081933 59 Integration of Profile Hidden Markov Model Output into Association Rule Mining acids agrawal asso ayres bateman biology bitmap ciation coin computational conference data database databases discovery durbin eddy eleventh engineering etween families finn flannick gehrke griffiths hollich imielinski incremental international issue items jones khanna know large ledge management marshall mining motif moxon nucleic ostolico pages paradigms parida pattern patterns pfam proceedings protein references representation research rules sequential sets sonnhammer srikant studholme swami using yeats http://doi.acm.org/10.1145/1081870.1081917 44 An Approach to Spacecraft Anomaly Detection Problem Using Kernel Feature Space advantage alexander analysis artificial automated automation banerjee based behavioral berlin bernhard brian ceedings clustering component computation conference data decoste dhillon diagnosing diagnosis directional discovery doktorarb eigenvalue failure functions generative ghosh great ijcai intelligence international johann kernel kleer know knowledge learning like limit lkopf method mining model modes monitoring muller munchen neural ninth nonlinear occurs oldenbourg otics pages press problem proceedings purpose references sigkdd smola space support symposium technische that this though universitat vector verlag what williams with would http://doi.acm.org/10.1145/1081870.1081966 92 Fast Window 
Correlations Over Uncooperative Time Series abbadi achlioptas adaptive aggregate agrawal alexander algorithm algorithms amnesic analysis annual approach approximate approximation arbitrary archive august automatic barbara based bootstrap california case castelli certain chakrabarti chan chapman chicago china cikm cohen cole combinatorial comparison computation computations computer conference contemp cormode correlation correlations cover dalal data database databases datasets decomp department design dictionaries dimensional dimensionality dimensions discovery distance distributions drinea drineas dynamic eamonn ecause eddings efficient efficiently efron elements engineering erative exact exhibit extensions faloutsos fast financial fodo folias forms foundations frequency friendly future generation generators gilb gionis greenwald guha guide gunopulos hadjieleftheriou hall hashing hierarchical hierarchyscan high hilb histograms hong html http huggins icde identifying ieee illinois image index indexing indicate indicates indyk informatics information international into introduction jagadish jagopalan jections john johnson keogh khanna knowledge kong korn kotidis koudas kushilevitz large lindenstrauss lindsay lipschitz locally long madison mallat management manku manolop mapping market massive matching mathematics measures mehrotra mendelzon miller mining minneap models monitoring motwani multi multidimensional multiple muthukrishnan nearest neighb olis online orary order organization orting osition osium ostrovsky oulos over palpanas panhellenic parelius pass patton pazzani performance pods popivanov poster proceedings processing pseudorandom pursuit quantile queries rafier ranbani random randomized ranganathan real reduction references representative research riverside sampling santa science search searching sequence sequences series services sets shasha sigkdd sigmod signal similar similarity singular sketches society software sons space spaces springer stable 
statistical statistics statstream stoc strauss stream streaming streams studies subsequence summaries summary supp surfing swami symp systems tabular technical techniques test thap that theory thomas thousands tibshirani time together track transactions trend trends trupp tsdma uncoop university useful using value verlag vlachos vldb warping wavelets wharton wiley window windowed wisconsin with work wrds york zhang zhao http://doi.acm.org/10.1145/1081870.1081901 29 On the Use of Linear Programming for Unsupervised Text Classification allocation american analysis approx attachment azar baker bansal blei blum chawla chung classification clustering computations correlation dasgupta data deerwester degree degrees demaine dhillon dirichlet distributional distributions dumais edition eigenvalues enhanced expected fenner fiat flaxman focs frieze furnas given golub graphs harshman hierarchical high hill hofmann hopcroft hopkins immorlica indexing information introduction johns jordan journal karlin kumar landauer latent learning loan machine mallela matrix mccallum mcgill mcgraw mcsherry modern pages partial pnas preferential press probabilistic proc random references research retrieval saia salton science semantic sigir skewed society spectra spectral stoc text university vertices with word words york http://doi.acm.org/10.1145/1081870.1081928 55 Enhancing the Lift under Budget Constraints: An Application in the Mutual Fund Industry account accuracy achieved addition advances against algorithm algorithms appendix applicable applications applied approximation april around assessment assume bans based berson bfgs both budget building called case change choose chose class classification classifier colagrosso company comparing conclusions conf constrained constraint construction conventional cortes curve curves customer data decision defection denote derive detection determined different differentiable direction discussed distinct dodier does each ecific effect egins ehavior 
eing either england enhance environments equal erform erformance error estimation etter etween exists explore expressing factb false fawcett figure finite form free function functions fund funds future government gradient green guerra hand have hidden hill html http ieee imprecise improved index induction industry information institute intel interesting intl investing investment january jective jects john kohavi lane large largest latest learning lift ligent like limited machine mann march mathematical mcgraw memory method methods minimization mining model models mohri more mozer mutual negative network neural next nips nocedal note numb observe obtained obvious optimization optimizing ositive ositives other otherwise outp outputs over overfitting paper parameters predetermined predicting prediction prior problem proc processing prodding programming prop provost psychophysics pull rate reaches redemption references regression relationship resulted robust rules salcedo sample samples scale scenarios sets signal simply since smith sons stair statistic stats step substantially swets systems telecommunications than that thearling then theory there these this threshold thus track train trained training trends true typically units until used valuable values when where which whitney wilcoxon wiley with wolniewicz work worse worst york zero http://doi.acm.org/10.1145/1081870.1081931 57 Towards Exploratory Test Instance Specific Algorithms for High Dimensional Classification advantage agrawal algorithms analysis ankerst approach association based chapman classification combines computer conclusions conference construction cooperation data databases decision density dimensional discussed effective effectively ester estimation exploratory fast good hall high human instance integrating interaction kriegel large method mining order paper path process provide references rule rules sets silverman srikant statistics subspace summary that this towards understanding user vldb with 
http://doi.acm.org/10.1145/1081870.1081964 90 Pattern Lattice Traversal by Selective Jumps advantage agrawal algorithm algorithms also association bastide bayardo burdick bytes calimlim candidate cases charm closed closet consumes context dallas databases datasets diffsets discovering discovery efficient efficiently fast figure fimi finding first fpmax frequent from gehrke generation gouda grahne growth have hfpleap hsiao html http icde icdm icdt interactive inverted items itemset itemsets lakhal large leap learning less long machine mafia magnitude many matrix maximal maximals memory mining mlearn mlrep noticed order ository pages pasquier patterns prefix references rules scalability searching seconds siam sigkdd sigmod srikant strategies suppor taouil test tested than that time transactional transactions trees usage using vertical vldb wang without zaki http://doi.acm.org/10.1145/1081870.1081939 65 Unweaving a Web of Documents aggarwal agrawal allan arising augmented automatic behavior broadcast carbonell construction cornell darpa datar detection doddington final focs from gupta hypertext jagopalan khandelwal mining model networks news newsgroups pages pilot primitive references report ruhl sigir social sorting srikant streaming study summaries temporal thesis topic topics tracking transcription understanding using with workshop yamron yang http://doi.acm.org/10.1145/1081870.1081942 68 Privacy-Preserving Distributed k-Means Clustering over Arbitrarily Partitioned Data adaptive agrawal algorithm algorithms american analysis annual anything anywhere applications applied arbitrarily artech assignments astronomical bart based beaver benninga berkeley bioinformatics biometrics brazilian cambridge carvalho centers cherkauer classification clear clifton cluster clustering computation computer computing conclusion conf constant contributions cryptographic cryptography cryptology cuts czaczkes data databases decemb describ desirable digitized direction discovery 
discussed distributed djorgovski efficiency efficient eighth ervised etween even except exchange expression fault fayyad fifth financial find first forgey foundations frank further gene general generalization generate gert goethals goldreich gray gusfield hill horizontally idea ieee ilan image information interaction intermediate international interpretability introduce janick kathleen know laur leak leaks learning least lecture ledge letin lindell ling lipmaa lloyd machine macqueen main management marchal mathematical mathys mattison mcgraw means methods mielikainen mining mitchell model moor more moreau multivariate nested notes numb observations obtain obviously oliveira otentially other over pages partitioned pattern pinkas poss preserving press previously principles privacy probability proc product profiles protocols provide quality quantization rare recognition references revealed roden rounds scalar scenarios science second secrets section secure security segmentation sigkdd sigmod smet society some squares srikant statistics strings such symp symposium telecommunication that themselves theory these thijs this though through tolerant transactions transformation trees university unsup useful vaidya values veksler vertically vision volume warehousing weir well which work would yield yves http://doi.acm.org/10.1145/1081870.1081932 58 Model-based Overlapping Clustering acknowledgements adaboost algorithm algorithms alizadeh allows also alternating alternative analysis applicability applications applied applying approach approximation areas artificial aspects banerjee based basu battle belong benefit bezdek biclustering bilenko bioinformatics biology bjorck both botstein bregman broad brown categorization cause cellular censor chan chapman cheng church chvatal classes clustering clusters coclustering collins colt combination component concept conclusions conf contrast convex dasgupta data decomposing decomposition decompositions dempster desirable dhillon 
different discover discovery distances distinct divergence divergences effective effectively efficiently eisen empirical entropy experimental exponential expression extended factorization family features fellowship finally fitting framework friedman from further fuzzy gene general generalization generalized generalizing generative generic genes genome getoor ghosh gordon grants hall hard hastie have hearst icml identifying ieee ijcai important incomplete individual industrial information interesting into introduced investigation ismb issues items journal kleinberg knapsack knowledge koller laird large lazzeroni learning least levy likelihood linear logistic long machine management math matrix maximum maybe mccormick mccullagh merugu method methods minimization mixture model modeling models modha mooney more movie multiple naive natural needs negative nelder newsgroup nips numerical often ones operations optimization other overlapping owen oxford papadimitriou paper parallel part partitional pattern patterns pfeffer plaid potentially practical presented press principal private probabilistic problem problems proc processes provided provides raghavan rationality real recognition recomb references regarding regression regulation relational reorganization research results royal rubin sahami saund schapire schweitzer segal semi sets seung several shaving siam similar singer sinica society sparse squares statistica statistical staudt subsets such supervised supervision supported technique text than their theoretical theory this thresholding tibshirani total traditional university useful using value well which white with zenios http://doi.acm.org/10.1145/1081870.1081893 21 Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations academy algorithms annual approach average barabasi bollobas broder buchsbaum chakrabarti chung combinatorica conference coop data degrees diameter distances distribution ected ello emergence eriments european external 
faloutsos free frieze functional general given graph graphs handbook internet jagopalan jeong kluwer korn kumar maghoul massive mining model models national nature networks ology ower pages pardalos proceedings raghavan random recursive references relationships resende riordan scale scaling science sciences septemb sets sigcomm skewed springer stata struct structure symposium tomkins verlag westbrook wide wiener with world zhan http://doi.acm.org/10.1145/1081870.1081954 80 Key Semantics Extraction by Dependency Tree Mining akamine algorithm also analysis annual answers applications appropriately apriori arikawa arimura asai asso based blood building case cessing challenge characteristic chastic ciation city classification classified complexity conclusion conducted confidence constrained contents corpus customer data decision defined deletion demonstrated direction discovering discovery documents duct dynamics easily ective eech effcient efficiency efficient efficiently emnlp enables endencies endency english equivalent eriments etween event events exact explorations extension extract extracted extracting extracts factored fast field fields filters final finite forest free frequent from fukushima furthermore future gain generation graph growth gspan have hearing higher highest http icdm ieee ikeda improves including indexing inference information inokuchi intelligent into irresp james japanese judge judgment karypis kawaso klein kudo kuramo lance language large largest learning lnai management manner manning manually matching matsumoto meeting mgts middleware mining mixture mode morinaga moto much muntz murrah nakano nakata namely natural nichols nijssen nips nist note numb occurrence odies oklahoma ombing only order ordered osada osed osting output parser parsing pattern pcfg phrase picked pilot pkdd plane processes prop prosecution questionnaire readability readable reconstruction reduced reduction redundancy references relationship reputations result results 
sakamoto sakao samples satoh selected semantics semi sentence sept shows shtml siam sigkdd simpson software springer stanford structured structures study subgraph substructure substructures summary syntactic system systems table tateishi terry testing tests text than that theoretic theory this through todayfs topic topicscop tracking trans tree trees trends unit unordered used using verlag view washio well were which while with words works yamanishi yang zaki http://doi.acm.org/10.1145/1081870.1081959 85 Regression Error Characteristic Surfaces abreu academic accuracy against algae algorithms analysis austria bennett case characteristic ckeel cognition comparing computing conf conference considerations core curves data detection development discovery dorovski editors egan eiro environment epia erger error estimation fawcett foundation francisco gamb graphs harmful hewlett induction international isbn kaufmann know kohavi language lavrac learning ledge lnai machine mining morgan notes numb outliers packard pages perception pires pkdd portuguese practical predicting press principles proc proceedings provost references regression researchers series signal springer statistical team technical theory torgo vienna http://doi.acm.org/10.1145/1081870.1081898 26 On Mining Cross-Graph Quasi-Cliques abello acids advances albert algorithm among analysis analytical analyzing application approach apriori automated bader barabasi based bayada biclustering bioinformatics biology budding cell chem chemical cheng church classifying click clique closed closegraph clustering common comp company completeness complexes comput computers conf connection connectivity cycle data databases detection ding discovering discovery disk dongen dzeroski efficient emergence enright enumeration explorations expression faloutsos families fast finding flow fragment freeman frequent from garey gene genome graph graphs gspan guide hartuv hogue holder icdm indexing info information inokuchi interaction 
intractability ismb johnson jose july karypis kasturi kuramochi large largest latin letters linkage massive matsuda mccurley method minimum mining mitotic molecular mrdm multi netherlands network networks neural nucleic nultiple ouzounis pages pairwise palmer partitioning pathways pattern patterns pkdd problem proc processing properties protein quasi raedt random recognition references relational research reserach rymon saso scalable scale scaling science sciences search segal sequences sets shamir sharan sigkdd sigmod similarities similarity simulation space spanning spectral structure structures subdue subgraph subgraphs substructure substructures system systematic systems takahashi temporal their theor theoretic theory thesis through tomkins tool topological transcriptional trees university using utrecht variety wang wide widom with yeast york http://doi.acm.org/10.1145/1081870.1081899 27 Query Chains: Learning to Rank from Implicit Feedback aaai adaptive agglomerative algorithm annual architecture artificial august automatic bartell based beeferman belew berger boosting boyan brill broder clustering cohen collective combination combining computing conference connectionism correction cottrell crammer cucerzan data development discovery efficient emnlp empirical engine engines experience exploits factors forum freitag freund furnas human icml indexing information intel international internet iterative iyer joachims journal know knowledge language learning ledge ligence machine methods mining multiple natural neural nips optimizing order pages pranking preferences proceedings process processing query ranked ranking research retrieval retrieve schapire scheme search shapire sigir singer spelling swedish systems taxonomy that things users with workshop http://doi.acm.org/10.1145/1081870.1081952 78 Efficient Computations via Scalable Sparse Kernel Partial Least Squares and Boosted Latent Features academic advanced algorithmic algorithms alternetive analysis annual 
applications approach arbitrary australian bartlett based baxter bengio bennett blake cambridge chemometrics choudhury classification collob communications comp computation computations computing conference constructing cristianini data databases dels delve denham descent directed ducing ective editors efficient efficiently egaerts embrechts eriments ersp escu estimation evaluating extraction feature features foundations frean friedman function girolami gradient greedy gunn guyon heidelb helland hilb html http icml idiap ieee ijcnn implementing information institute intel international introduction iterative joint jong journal keane kernel kernels laboratory large latent learning least ligent linear loss machine machines mason mercer merz methods metric mlearn mlrepository momma mommam multivariate nair national nato networks neural nikravesh nonlinear onents optimization orthogonal ository osted osting other pages partial practice press primal principal problems proceedings references regression regularization related repro research rosipal rsise scalable scale shawe side simpls simulation sixteenth some space sparse springer squares stanford statics statistics structure study supp support suykens systems taylor technical theory toronto trejo twentieth university valid vandewalle vector verlag with wold york zadeh http://doi.acm.org/10.1145/1081870.1081887 15 Fast Discovery of Unexpected Patterns in Data, Relative to a Bayesian Network abbreviates abbreviations above absolute accelerating account accumulated adaptive added addition additional address advances after again against agrawal algorithm algorithms almost also alter altered always analogously analysis antees appear appendix applies apply approxima approximate approximated approximately approximating approximation apriori aprioribns argue aribtrarily arrive artificial assertion assigned assistant assisting asso assume assumption assumptions assured assures attribute background base based bayardo bayesian 
because become becomes been before begin below better between binomial both bound bounded bounds brevity bution calculate call candidate candidates cannot carlo case causal chain chapman chernoff choose cial ciation clarity classification climbing close collection combined combining completeness completes complex compute computer concept conclude condition conference confide confidence consideration constant contain contains correct corresponds course criteria criterion current dagstuhl data database databases datasets decision defd define defined defines definition dels denotes detail details deviation difference differences different directly disbn discovering discovery discussed disregard distinct distinguish distribution domingo domingos drawn during each early easy eautiful ected ectedness ectively editor editors effding efore either element elements empty endent enough entered entire equal equation equations equivalent erations error estimate estimated estimates european ever every exact exceed execution exist existence exited expand expands expected exploit expression fact fast fayyad final finding finite finitely first focus follows former fraction framework frequent from function furthermore gavalda general generated generation gilks given giving good governs graphical graphs greater greatest greiner guar guaranteed guarantees hall have heavily helpful highest hill himax hoeffding hold holds hoschka hulten hypergeometric hypotheses hypothesis identical identically idom imax immediately implies important independent indp induction inductive inequality information inition inspection intel interesting interestingness international interval intervals into introduce intuitively invariant invoked itemset itemsets iteration iterations jaroszewicz jective jensen journal july know knowledge kruse large leads learning least lection ledge lemma lies ligence logical loop loose lower lowing machine main mannila many markov maron maxaz maxi maxidom maximum mean measure 
measures mechanism members ment metho method minimal minimum mining models monte more most must myllym needs negative network networks neural never nimax normal notation note notice number nwrite observations occur omit only operations optimally order otherwise output outside over padmanabhan pages palo paper paragraph parameter parts patterns piatetski pimax point polynomially possibilities possibility possible pport practical practice prei previous primarily principles print probabilistic probabilities probability problem procedure proceedings processing proof prove proven proves pruned prunes pruning pthroughout quantify quantities quickly races random refer references refers rejected rejection relational relies remain remove removed replace replaced requires research reserved resp respect respectively returned richardson rimax risk rispcan romig rule rules sample samples sampling satisfied satisfy says scaling scheffer search selection seminar sentation sequential sets sgen shapiro show shown shows sided sigkdd silander silberschatz similarly simovici since sixth size sizes small smaller smyth solves some special sped spiegelhalter spite springer srikant standard start state states step steps still stop stopping student study substitute substituting such suffices summing superset supersets supp support symmetric system systems table take taking technical terminate terminates termination than that their them then theorem theorems there therefore these this those throughout time tion tirri together toivonen tools towards track trade trivially true tuzhilin uimax under understand unexp unifying union unseen until upper uronen used useful using valid value values vanishes variables verkamo verlag versions very violated violating violation watanabe ways well when where which whichever will with within would wrob zero http://doi.acm.org/10.1145/1081870.1081882 10 Non-Redundant Clustering with Conditional Ensembles aculty advances algorithms almaden american analysis 
annual application artificial ashington association averaging basu bengio bialek bidgoli bilenko bootstrap bottleneck bottou cardie categories category center charv chechik classification cluster clustering clusterings cognitive combining communication comparing comparisons computational computing concept conference consensus constrained constraints control convergence corter course craven cuments data davidson dipasquo distance ensemble ensembles entropy exas extended extension external extract extracting fourth framework freitag from function ghosh gluck gondek gordon havrda hofmann ieee image information informative instance integrating intel international isconsin jain joachims jordan journal kamvar klein knowledge kybernetika large learning lerton level ligence machine making manning mccallum means measure meil method metric minaei mining mirkin mitchell mixture model mooney most multiple neural nigam only ornell pages paper partition partitionings partitions pereira press prior priors proceedings processes processing proj properties punch quantification redundant references reinterpreting relative relevant report resampling research reuse russell satyanarayana schultz science semi sets seventh shows siam side slattery society space speeding staf statistics step strehl structural structures student supervised survey symbolic systems table technical technology theoretic theory third those through tishby topchy topic track uncertainty university utility vaithyanathan validity variation volume wagstaff walk weak wide with workshop world xing http://doi.acm.org/10.1145/1081870.1081956 82 Evaluating Similarity Measures: A Large-Scale Study in the Orkut Social Network addison advances algorithms analysis articles atlantic automatic available baeza bart based boston chapter cial clickthrough collab collaborative combining communications computer conditionally conference cross customer data discovery edition ellen endent evaluating fast filtering formal frakes freed 
hall harman heidelb hong html http indep information intel international invited item january joachims karypis kautz kitts know kong konstan ledge lehmann ligence march mathematical media method methods mining much networks online orative orkut otheses performance prentice press probabilities proceedings processing promotion raghavan ranking reading recommendation references referral reidl retrieval river saddle salton sarwar second selections sell selman shah sigir sigkdd sixth spertus springer statistical structures talk tenth testing text transformation tunable upper using verlag vrieze wesley wide workshop world yates york http://doi.acm.org/10.1145/1081870.1081940 66 Maximal Boasting advances algorithm analysis application april asia association attributes australia basket buchter bunke california cheng cikm comput computer conference configurations data database databases development discovery editors error faster fifteenth fukuda guillaume identification ieee indexing international jose kandel keogh know korb last lecture ledge leeuwen maintenance maletic marcus melbourne mining morimoto morishita notes november numeric optimized ordinal over overmars pacific pages pakdd plane press principles proceedings publishing ramamohanarao references research rules science scientific second series sets sigact sigart sigmod springer symposium syst systems taipei taiwan time tokuyama volume wirth world http://doi.acm.org/10.1145/1081870.1081883 11 The Predictive Power of Online Chatter adamic adar admati ageon allan america analysis analysts analytics annotation antweiler arbesman architecture automated blackshaw blogosphere blogspace boards book bootstrapping broadcast browsing build bursty business carbonell carma chapman chatfield chavet clustering comes communications community company complex computer conf conference consumer contagious content control darpa data datasheet desch descriptions detecting detection development diffusion dill disclosing discovery 
discussions doddington dynamics ecosystem eiron empirical endogenous environment event events evolution exogenous final finance financial forecasting frank from gain generated gibson gilbert graduate gruhl guha halavais hall hierarchical http implicit industry inferring information initial intelliseek international internet intl jagopalan ject jenkins jhingran journal just kanungo kleinberg know kumar large lawrence ledge letters liben line lukose mapping market massachusetts media memespread message metadata meyer mining music nature nazzaro networks news noise novak nowell online opinions pages paper papka pattanayak pfleiderer physical pierce pilot postings prentice prices proc product raghavan rankings references reinsel report research retrieval retrospective review sale scale school scrutiny seeker semantic semtag series shocks sigkdd similarity sims smith sornette stanford stock streams structure study systems talk technical test text that through time tomkins tomlin tong topic track tracking transcription tres tumarkin under understanding university unstructured using versus very webfountain weblogging whitelaw whitman wide workshop world yamron yang zhang zien http://doi.acm.org/10.1145/1081870.1081950 76 Adversarial Learning aaai acre additional adversarial adversary algorithm algorithms alto analyze angluin anti approach attacker attacks bayesian case categorization china classification classifier classifiers communications concept concepts conference cost course cover dalvi data defeating defender designed determines discovery distribution domain domingos dumais easily ecially ecific efficient efficiently email enemy enough entropy erformed etter exceeding filtering filters framework from future good heckerman heuristics horvitz iccpol international junk just know knowledge learn learnable learned learning ledge linear lowd machine madison mail mausam maximum meek minimize mining model more much natural numb oneself only ossible ounds pages palo papers 
practice preliminary presented press proceedings queries quite reduce references relative results sahami sanghai second shenyang sigkdd significantly simple spam statistical studying technical text than that theoretical theory using valiant verma well whether while will wisconsin with word work workshop worst zhang http://doi.acm.org/10.1145/1081870.1081903 31 Probabilistic Workflow Mining able according adjacent after again agree agrees algorithm algorithmic allowed also among analogous analogously ancestor ancestors ancestralset another applicable argument arguments arise arrows artificial assume assumption assumptions avoid base based basis because before between blanket both call candidates cannot case cases chain childless children choose cluttering common compatible complement complemented complete computing concern concurrent conditional conditioned conditioning connected connects consider considerably constructed contain contradiction contradicts contrary converse corresp corresponding could created currentblanket cycle define definition demonstrate depicted depicts descendant descendants determining different direct directed directionality disconnected does drop each easy ecause ecial ecific ecifically edge edges efore either element elements eliminate elong endence endencies endent enforcement entails eration erform etween even exactly example exclusive exist exists explicitly figure final finally first fixed follow followed follows form four frequent from full furthermore generality generative given going graph graphs hand happ have hence hereafter hidden holds however iddens identical immediate immediately implied implies including indep induction inductive initial initiated insert inserting insertion instance instead intersect into introducing involve iodi iteration join joined joins just know last latent latents layers lead lear least lemma lemmas lies longer loss main means measurements memb might mining model modification moment more much must 
mutually naive naturally need nesting next nextblanket node nodes note notice nser numb observable obviously oint onding onds only oracle order ordering orders original osite osition ositive ossible other otherwise othesis othsis over paper parallel parent parentless partial partially particular plits plitss predecessor previous proceed proof prop prove random reaching reasonable reconstruct recorded recursion recursive references rehearsing relation relationship remaining remove renaming render represent represented require research return rhombus rule ruled same scenarios seems select separate separating session sessions sets should show shown siblings similar simple simply since singletons some split splitjoin splits start starting step steps subgraph subgraphs subscripts such supp suppose symbolized symmetry synchronize synchronizes take task tasks test than that them then there therefore these they this those though thread threads three through thus time times tlatents total totally track trivially true unconnected under unique unlabeled versa very vice when where which while will with within without work workflow works write http://doi.acm.org/10.1145/1081870.1081877 5 Mining Images on Semantics via Statistical Learning abstraction academic achieved adaptive addition ages agrawal algorithm among annotation application approach approximate aslandogan atkins automatic barnard based bayes belongie bennett between blei blobworld bouman breen bristol carson categories cbsa chakrabarti chang chee cheng chiang civr ckform classes classification classifier classifiers classifying cluster clustering cohen collections comm compounds computation concept concepts conclusions content convincing cook correlation cozman cross csvt cvpr data database databases degrade density dexa different discovery discriminants discriminative djeraba documents domain dominant dugulu duygulu electronic eled elephant english ercentage erformance error etween expectation experts explorations 
extensions faloutsos fayyad features fellbaum figueiredo figure finite flat flowerview forsyth framework freitas from future gaussian geiger generative ghahramani greenspan have heterogeneous hierarchical hierarchically hierarchies hierarchy highway hinton hofmann huang icml ieee ijcai image images improving indexing information integrating intel interactive jaakkola jacobs jain john jordan journal keywords khan khanzole kluwer knowledge koller krishnan kumar labeled learning level lexical ligent machine machines malik matching maximization mccallum mclachlan meets miller mining mitchell mixture mixtures model modeling models multi multimedia multimediaminer multimodal nakano naphade natsev natural navigating nearest negative neighbor networks neural nigam nips novel ontologies outdoor pami paper partially pattern performance photos pictures point ponnusany precision problems proc promises proposed prototype publishers purdue purp queries querying raghavan rate ratio references region regularization relationships report representation research ress results retrieval rishe rosenfeld sahami saliency same scenes scheme search segmentation semantic shapiro sheikholeslami shrinkage sigir sigkdd sigmod signatures simoff smem smith song sons specific spie sunset support sychay system szummer taxonomy technical text their thrun topic track training trans ueda university unlab unlabeled unsupervised using very visual vldb wang waterfall when wiley with wordnet words works workshop york zabih zaiane zhang zhou http://doi.acm.org/10.1145/1081870.1081920 47 Making Holistic Schema Matching Robust: An Ensemble Approach academie analysis anderson bagging batini bergman borda breiman brightplanet business champaign chang comparative computer computing database databases deep department economics edition elections hidden histoire http illinois implications integration learning lenzerini machine memoire metaquerier methodologies navathe observations patel predictors record 
references report repository royale schema science sciences scrutin second sigmod statistics structured surfacing surveys sweeney technical uiuc university urbana value west williams zhang http://doi.acm.org/10.1145/1081870.1081951 77 Estimating Missed Actual Positives Using Independent Classifiers academic approaches biometrics biometrika blum capture census challenges classification classifiers closed colt combining computational computer cover cyber darroch data definite detection elements epidemiol epidemiology estimation false francisco frank george goldberg good hall hook icdm icpr ieee implementations independence information intrusion issues java kaufmann kluwer kuncheva labeled large lazarevic learning limitations machine managing mane mathematics medical methods mining mitchell morgan multiple negatives pages population positive practical prentice proc publishers recapture references regal screening series solution sparse survey systems techniques theory thomas threats tools training unlabeled wiley with witten wittes york http://doi.acm.org/10.1145/1081870.1081890 18 Combining Partitions by Probabilistic Label Aggregation algorithms analysis approximation bagging basu bilenko breiman cluster clustering clusterings colt combining conference consensus constant data david december ensemble ensembles framework fred icdm ieee international jain journal learning machine median mining mixture model mooney multiple pages paper partition partitions predictors probabilistic proc punch references research semi siam statistical supervised time topchy track weak with http://doi.acm.org/10.1145/1081870.1081923 50 A Hit-Miss Model for Duplicate Detection in the WHO Drug Safety Database action active adaptive adverse algorithm american analysis applied approximately associated association assurance attributes bate bayesian beitz belin bhamidipaty bilenko bortnichak brinker british calibrating cleaning clinical computational computer conference confidence consolidation 
construction copas data database databases deduplication design detecting detection discovery domain drug duplicate edwards efficient eighth elkan estimations european evaluation event false family fellegi finding freitas generation genetics haystack hematology hernandez hilton histories human independent individual interactive international into issues journal know lansner large learnable learning ledge lindquist linkage linking management match matching measures medical merge method mining models monge mooney needle network networks neural newcombe ninth nkanza object olsson orre pages pharmacoepidemiology pharmacology pharmacovigilance press proactive problem proceedings purge quality quinine rates rawlins reaction reactions record records references regulatory related reporting reports research royal rubin safety salive sarawagi series sigkdd sigmod signal similarity society spontaneous statistical statistics stolfo string sunter surveillance systems theory thrombocytopenia tilson timing training uses using vaccine vaees walop wise with workshop http://doi.acm.org/10.1145/1081870.1081922 49 Using Relational Knowledge Discovery to Prevent Securities Fraud abandon abbeel about account accuracy acknowledge acknowledgments adaptive additional affect aggregating agustin algorithms allow allowing amherst among analysis apparent applying approach artificial assistance associated attempt authorized authors autocorrelation available avoiding based been behavior believed bias blau brodley broker brokers called captured carl case categorization cause caused chakrabarti changes cindy class classification classifier cogswell cohen collective collectively communities comparison complaints complex computer computing concept conclusions conducted conference connections consensus contained continue contract conveyed copyright cornell cortes cost could currently customers darpa data databases date david decline degree dept described detection develop different direction 
directions directly disclosure disclosures discovery discriminative disparity distribute distributions domingos drift duce dynamic effort either elements employment encourages endorsements enhance enhanced evaluation examinations examined examiners exploit explore expressed faithfully fall fawcett fayyad feature features finally firms first focus fourth fraud friedland friedman from future gallagher george getoor government governmental greater hannah hastie have help here herein hereon higher higherrisk hope hyperlinks hypertext ideal immerman implied imprecise improve improvement improves indepth indicates induction indyk inference inferences inform information initial intel interest international interpreted investigations ject jensen john joint judgment judgments know knowledge koller label labels laboratory language larger learning lective ledge ligence ligent linkage loiselle machine made magazine mahadwar management market massachusetts matthew mcgovern medium memb mining model models months nasd necessarily networks neville normalizing notation notwithstanding numbers obtain obtained official only orted others pages paper part particular pawelski performance perhaps period pfeffer piatetsky policies possible potential practice pragnya precipitous prediction pregibon preliminary probabilistic probability proc professional profile promising provost purposes query range rankings rash rates rather rattigan recently references relational report representing reprints repro reproduce research researchers resulting results risk rpts rubin sanghai schapira science screening second seems selection senator serious several shapiro should show shown sigkdd sigmod significantly similar sized small smyth social some springer staff started statistical statistics steven stocks strongly subsequent suggested suggests supp surrogate suspect symposium task taskar tech technical temporal than that them they third this those tibshirani tree trees troutner types umass uncertainty 
under university using variety verlag victoria views violations visual visualization volinsky walz weld were when wide wider with work would years http://doi.acm.org/10.1145/1081870.1081910 38 Cross-Relational Clustering with User's Guidance access according across actually after aggarwal algorithms along also analysis application approaches autoclass background based bayesian benchmark berkeley between brodley cactus cannot cardie categorical cheeseman cikm class classfication classification cluster clustering clusters components compute computed computing constrained correlation crossclus crossmine data database databases different dimensional discover discrete discuss discussion disk distance distances document done each effective efficient elisseeff emde existing expected expensive extending fast feature features figure fills finding first flach ganti gehrke general generalized generates generating generator groups guyon hall here high hill hristidis http icde icml information instance introduction jected john join joining jordan kaufman kernels keyword kirsten knowledge learning lloyd machine macqueen mcgraw means memory methods metric mining mitchell mitra moreover multiple multivariate murthy need needed nips number numeric numerical observations only order organization other pair pami papakonstantinou park path pertinent proclus procopiuc propagate propagation query ramakrishnan random randomly rdbc references relation relational relations representations research resident results rogers rousseeuw rtner rules runtime russell scalability scalable scan schemas schroedl search seconds section seen select selection semi sequential shown side sigmod similar similarities similarity simultaneously solutions some sons sort sorting spaces spatial stored structured summaries supervised symposium system technique test that tpch tuple tuples unsupervised used using values variable vldb wagstaff wettschereck when which wiley will with wolf wrobel xing yang 
http://doi.acm.org/10.1145/1081870.1081886 14 Nomograms for Visualizing Support Vector Machines additive ailab alberta altun analysis applications archive artificial available banff blood brier california canada chang chapman chickering cjlin clean computer conditional conference csie data demand demar department dirt economics editors environ experimental exponential expressed faculty families fields forecasts from generalized graphs gunn hall halpern hankins harrell harrison hastie hedonic hettich history hofmann http information intel interactive irvine isis journal july kandola kernels learning library libsvm ligence linear ljubljana logistic london machine machines management mining modeling modelling models nomograms orange pages paper particular prices probability proc random references regression rubinfeld science slovenia smola society software sparse springer strategies structural support survival terms tibshirani uncertainty university vector verification weather white with york zupan http://doi.acm.org/10.1145/1081870.1081880 8 Dimension Induced Clustering able about access accurate address agarwal aggarwal agrawal algorithm algorithms alone analysis ankerst appear applications approach approximate arbitrarily argue arises arya automatic barbar based becomes being belussi berchtold beyond black breunig caetano careful carlo case cases chaotic chen ciaccia classification cluster clustering clusterings clusteris clusters coefficient compare complexity compute concept concepts conclusion conclusions considerably consisting copiuc correlation could create curse data databases dataset datasets definition deflating degrades density detection differ difficult dimension dimensional dimensionality dimensions discover discovering dynamics each easy efficient elements endence enough equally error errors ester estimation example experiment extend fails faloutsos family fast figure filho finally find finding fine fixed flats found fractal friedman from furthermore 
gehrke generalized gibb gionis good growth gunopulos hashing hastie high highdimensional hinneburg however icalp icde identify identifying increases indep independent index indyk information integral interscience jacm jected jective join jones kamel katayama keim kitagawa knee knees korn krauthgamer kriegel large last learning linear little local location loci longer main manifolds mapped matrices meaningful means metho method metric mining model monte more motwani mount multimedia multiple murali mustafa nearest necessarily neighb netanyahu noise nonlinear note oints omni only opposed optics optimal order ordering outlier pagel papadimitriou parameters park patella pattern pods point points preserves press problem proc produce purp queries raghavan rank rasband references representation requires resulting results reveal same sander satoh search searching seems selectivity self sensitive sets shown sigmod silverman similar similarity since single spaces spacial spatial springer statistical structure subspace such systems table tears that therefore they this those tibshirani tois traina tree trees trying tuning underlying uniformity using valley value verlag visibility visible visited vldb wang well when which wiley will with without wolf yang zezula http://doi.acm.org/10.1145/1081870.1081953 79 Optimizing Time Series Discretization for Knowledge Discovery accuracy achieves acknowledgements activation added advances against algorithm algorithms alonso also analysis annals anomaly application applications approach archive asso austria axis based berkeley bilmes binning bins biology bostr canada cases chakrabarti chiu ciation classifiers clustering coinciding comp company comprehensible computational computing conf contain context continuous converges corresp cottbus created dash data database databases defined degraded dels demonstration density deogun department describing descriptions detection devices differentiation difficult dimensionality discovered discovering 
discovery discretization discretizing dmkd dortmund dougherty driguez each eamonn ecause ecialized editors edmonton eech empirical enabling enchmarks energy engineering entropy erforms erimental error ersistence ersisting ervised estimate estimation etter evolution existing exploring extended extracting fast features figure filters finding finney first garibaldi gaussian generation genetic gentle germany gfkl gionis good guess handle harder hardware harms hart hetland hidden higher himb however html http hussain huuskonen icsi ieee ignore implications incorp index inductive information initial initialization instruments intel interpret interpretable interpretation intervals jiis john journal july kadous kangas kasetty kaufmann keogh know knowledge knowlegde kohavi kullback lags large leads learning ledge leibler ligent like likelihood likely linear linz local logic lonardi long lotfi machine mannila many marburg markov mathematical maximum meaningful medicine mehrotra method methods mining mixture mobile models molecular more morgan much multivariate muscle need needed noise novel ntyj numb olaf olic onent only oral orating oration order orts outliers outp pages parameter parameters pareto patterns pazzani persist persisting pervasive philipps presented probability proc produce programming publishing quality rabiner rapidly rchen recent recognition recurrent recurring reduction references representation require required research result results review robust role rough rule saetrom sahami scientific score search searching segmenting segments selected selecting sensitive sensor sequence sequences sequential series should sigkdd sigmod signal similarity simple soft sources space springer states static statistics streaming sufficiency suitability summary surprising survey symb systems tasks technical technique temp than thank that they this time tracy trended trends tsdma tuomela tutorial ultsch university unsup using value values very vienna were when will with 
without workshop world http://doi.acm.org/10.1145/1081870.1081944 70 Discovering Frequent Topological Structures from Graph Datasets additional advances agrawal akihiro algorithm also analysis application atoms based being believe berthold between binding bonds borgelt cardiolipin cases chains challenge chao chemical christian close coatney common comparison complete compound compounds condition connected constraints containing contains correspond data databases dataset datasets dictive diestel difference different dimitrii discovered discovering discovery each edges efficient essentially evaluation expect expected factor figure figures fragments frequent from functionality fuzzy gagan generated george given graph graphs gspan high hiroshi hofer huan hunte icdm ijcai impact implementation increase increases independent inokuchi intelligent interested international ismb jiawei jiong joost karypis keep kept king kuramochi labeled large learn length level levels linear lipids mach macromolecules make mapped marsolo maximal meinl membrane mgts michael michihiro mining molecular motifs motoda muggleton nijssen note number observe offer ohio optimize page pages palsdottir parameters parthasarathy paths pattern patterns philippsen phisical polshakov potential predictive prins protein quickstart real reasonable reasonably reduces refer references reinhard relax rely report respect result results running ruoming same scalability scale scales second sequences show siegfried simplicity sites size slower specifically spin springer srinivasan state sternberg structure structures study subgraph subgraphs substructure substructures such support synthetic takashi technical that theory this thorsen threshold time tool topological total toxicology trees tsminer university using vary varying verlag vertex vertices wang washio well wildcards with workshop would xifeng yang http://doi.acm.org/10.1145/1081870.1081913 41 A New Scheme on Privacy-Preserving Data Classification about 
acceptable accountability ackerman addressing advanced advances aggarwal agrawal algorithms annual appendix applications association attitudes attribute attributes australian available bayes beyond blake building class classes classifier clifton computation computer computing concepts concern concerns conference cover cranor cryptology current data database databases datta decision design discovery distinct efficiency eigenvalue element elements european golub guidance haritsa health held hipaa hopkins horizontally http ieee indices induction information insurance international interscience john kamber kantarcioglu kargupta kaufmann knowledge krishnan label labs largest learning level lindell loan machine management matrix maximum merz mining morgan notions number online pages paper parameter partitioned perturbation perturbed pinkas portability practice predetermined preserving press principles privacy private privrulepd proceedings properties provider providers quantification quinlan random randomized ranspose reagle references report repository research response rule scheme security siam sigact sigart sigkdd sigmod singular sivakumar society springer srikant superscript symposium system systems table technical techniques theory thomas track tree trees tuple tuples understanding university users using vaidya value values variable vector verlag version vertically wang wiley with workshop zhan zhang zhao http://doi.acm.org/10.1145/1081870.1081925 52 Modeling and Predicting Personal Information Dissemination Behavior aaai academic academy acknowledgments alberta algorithm algorithms allan allocation also america among analysis analyze anonymous april artificial aspx association august author authoritative authors automated automatically based bayesian behavior better blei both breese brin bringing cambridge capital chapter citation coetzee cohen collaborative combining comm comments communication communities community communitynet computer computing conclusions conf 
conference contact contactmap content corrada creech culture current curved data demo detection different digital dirichlet discovering discovery discrete discussions disseminating documents each edinburgh edmonton email emails emmanuel empirical employees enron entrepreneurial environment establish event events evolutionary experiments expert expertisenet explored exponential family figure filtering finding firms first flake formal friendster from funds further future giles government graph graphs griffiths group hainsworth handcock heckerman hierarchical hofmann home http https huberman human hunter hyperlinked identification ieee important includes incorporates incorporating individual informal information integrating intelligence interesting interests intl introduction isaacs january johnson jordan journal july kadie kautz kilduff kleinberg know knowledge krackhardt kubica labs latent lavrenko lawrence learning libraries like line link linkedin login logistic logit logo longitudinal machine machinery madison management many march markov mccallum methods milgram modality model modeling models moore most motwani multi nardi national network networks nonparametric novel nowell ongoing order organization organizations orkut page pagerank panel paper papka pattison people performs personal predict predicting prediction predictions predictive press price probabilistic problem proc proceedings profiles propose psychology psychometrika ranking receiver receivers receiving recipienttopic recommendation references referral regression relation relational report research response results reviewers role scandal schneider schwartz schwarz sciences scientific self selman semantic senders shah shared show siam sigir simmelian simultaneously small smyth snijders social song sources spectroscopy stanford steyvers stochastic stock structure studying such supported symposium system technical technologies tell than thank that this through ties time timeline today topic topics 
tracking tseng tyler uncertainty univ university user using valuable visualizing volume wang wasserman well what which whittaker wiki wikipedia wilkinson winograd with within wood work working world would yang york http://doi.acm.org/10.1145/1081870.1081935 61 Web Mining from Competitors' Websites aaai about account acquisition acts adaptive adding additional agent agrawal algorithm algorithms allowing also among amount analysis annual application applications applied approach association associations august automatic available background based basu before better between both calculated calculation cambridge centrum children comparable competitors computational concept concepts concluding conference considered correlating correlation correlations could croft data database databases datasets define delete demonstrated demonstrates deriving development discovered discovering discovery discussed distance documents during edge effective efficient efficiently electronic eliminate ellis especially evaluating evaluation even example expect expert exploitation extracted fact fails feldman fellbaum ferguson first form formal format forsyth francisco frawley from generates ghosh groenendijk hampered hard have hearst hierarchies hierarchy hirsh horwood however huge human identifying imilienski impossible improvement information input interact interaction interesting interestingness international into items janssen judges judgments june kamp keywords knowledge language large learner learning lexical linguistics list machine management manually maryland math mathematische mean measure measured measures meeting method methods mined mining modeled mooney nearly novel novelty number objective observe occurrence often organization organizing other paper pasupuleti pearson performs personal piatesky piatetsky poca point potential presence presentation presented press probability problem proceedings process proposed provide pruning rada rank rating recommender references refinement 
remarks representation requires research results retrieval rules running sanderson score seek selected semantic sets seventh shapiro short show shown sigir sigkdd sigmod silberschatz simple sites slightly snow spearman stokhof strong study subjective such swami system systems table take talking technique terms text textual than that theory there thesis they though through truth tuzhilin unexpected uninteresting university unknown untangling useful usefulness user users using value volume websites well what when which with wordnet words your http://doi.acm.org/10.1145/1081870.1081894 22 A General Model for Clustering Binary Data absence adaptive additive adptive advances agent agents aggarwal aggregates agrawal algebra algorithm algorithms allerton almost analysis annual appendix application applications approaches approximation arabie arrows authoritative automatic automatica autonomous axiom baier based baulieu benefit between bialek binary bmdp bock boeck boley boolean borra bottleneck california carlotta carroll castillo categorical categorization chain classification close cluster clustering clusters code coding coefficients cofd columns combinations communication computations computer computing concept conditional conditioned conference connections consider control corresponding coupled criterion cybernetics data datasets dawak decomposition decompositions denote department desarbo description deuflhard development dhillon dimension dimensional ding discovery discrete dissimilarity distance document domeniconi double dubes each elements engelman entropy environment environmetrics equality equivalent error estimates eucildean evaluation experssion explicitly exploration factor factorization fast feature figure find fischer fourth friedman gaul gehrke gene general generated generation gennclus gini golub gong govaert gross guan gunopulos hall hartigan hastie hastings have here hierarchical high holding hopkins http huisinga hyperlinked icdm icml identification 
ieee inference information intelligence international invariant iteration iterative jain jajuga johns journal karypis kato kingdom klar kleinberg knowledge kumar language large learning length linear loan locally machine machines mallela management manual maris market markov matrix maurizio mccallum means mechelen method methods metric mickey minimize minimizing minimum mining minnesota mismatches mobasher mode model modeling models modha moore multiplicative mundle natural nearest nearly negative neighbor neural nips nonegative nonhierarchical note number objects ogihara only operators opitz optimal organization original overlapping paatero paper park partitioning pattern peng pereira perturbation positive possible prediction prentice presence press probability proc proceedings processing procopiuc programming projected proof properties proposition psuchometrika psychological psychometrika quadratic query raghavan reduction references related relaxation report representation require research residue retrieval reversible review rissanen rocci rows saul schader schutte science search segmentation seung sheffield shepard shortest show siam sigir sigkdd sigmod similarities simon simultaneous since slonim society soete software sokolowski some sources space spaces sparse spectral springer squared statistical structure structuring subspace summary support systems tabu tapper technical text that then theoretic theory this those tibshirani tishby toolkit track transactions trejos united university updates upon using utilization value values variables variant vector vectors verlag vichi viewed warehousing webace where which wide wiley wish with wolf word work world zhao http://doi.acm.org/10.1145/1081870.1081962 88 A Generalized Framework for Mining Spatio-temporal Patterns in Scientific Data able acad accelerated account across aggregates algorithm algorithms also american among analysis anamolous appropriate approximately association associations atallah automatically 
bailey based between biokdd boundary capable capture capturing case centroid christian cisrc class classification clique cohn collocations commonly compare compared complex computational computer conclusion constraint convex cressie critical data databases dataset datasets defect define demonstrating derived detection detroit developed different discover discovered discovering discovery distance dynamic dynamics each earlier economic ellipse episodes equivalent event evolving extend extended extent extracting fast feature features fernyhough figure flows four framework frequent from fundam general generalised generalized generated geography geometric graph growth gspan hausdorff have hazarika henze however huang hydrodynamic hydrodynamics icdm ieee ijcai impact importance inferences influence information input interactions into jiang john journal justify kellogg kenneth lack landmark lecture letters linear linked location make mannila many maps massive mehta mentioned method metric metrics mindist minimal mining modeled models molecular more morimoto most movie mtng multiple multiresolution munro natl neighborhood neighboring notes number numbers numerical object objects occurences ohio only order other overview pages paper paradigms pattern patterns physics point points polygons present presenting proc processing qualitative rather real reasoning recognition references region relationships representation represented results richie robust rules running sadarjoen science scientific section selective sense sets shape shekhar shown sigkdd silver similar simulating simulation simulations soaps sons space spaces spatial spatio spirit star state statistical statistics structural structures structurs substructure summary suryawanshi table take taking temporal than that this thus time tobler toivonen total tracking twice type types university urban used using very visual visualization volume vortex vortices wang were where wiley with workshop xiong yang zaki zhang zhao 
http://doi.acm.org/10.1145/1081870.1081977 102 A Multinomial Clustering Model for Fast Simulation of Computer Architecture Designs aaai academic accelerating achlioptas adve algorithm analysis annals annual application applications architectural architecture architectures artificial assumption atlantic august austin automatically based basic bayes behavior behaviour benchmark bingham block boca bosschere brodley burger bzip calder change characterization characterizing choosing clapp classification clustering cohen column commercial comp comparing comparison compilation computer computers conference conjunction conte crafty cycle dasgupta data database dempster dept design detection dhodapkar dimension dimensionality discovery dist distribution document each ecml editors eeckhout effective ehavior eighth emer emerging enchmark engineering equake error estimating european evaluation executed execution experimental experiments extensions falsafi fast feature february fifth figure find finite florida forty forwarding fradkin francisco friendly from full generative ghosh girbal give gives government gzip hamerly heckerman heidelberg held hierarchical high hilbert hill hirsch hpca iccd ieee image impact incomplete independence indicates industry information input instructions intel international into isca jection jections johnson journal july june karypis kaufmann keeton kluwer know kumar labeled lafage laird languages large learn learning ledge length lewis ligence likelihood lindenstrauss lipschitz lists loss lucas mach machine madigan magazine magnusson mannila mappings maximum mccallum mcgraw mclachlan means measurement measures meil menezes method methods micro microarchitecture million mining mitchell mixing mixture model modeling models modern mooney morgan mouchard mukherjee multinomial multinomials multiple naive nanda national nigam ninth numb number obtained oint oints onents operating ortion overall page pages paral parentheses peel perelman performance 
periodic phase points poster prediction preliminary press principles probability proceedings processor processors program programming programs projection prop publishers random raton reduce reducing reduction references reliable report representative research retrieval rigorous royal rubin sair sampling scalable scale schwarz science search second selection september series sets seventh seznec sherwood should sigact sigart sigkdd sigmetrics sigmod significantly similarity simp simple simplescalar simpoints simulate simulated simulation simulations slices smarts smith society source space spec springer start starting state statistical statistics steinbach stream strehl study suite superscalar support symposium systems table technical techniques temam texas text that this thrun time tool tools total trace track tracking twentieth uncertainty university unlabelled unsupervised until used using vaithyanathan value values vandierendonck verlag version vlsi weight wenisch were wiley with workload workloads workshop wunderlich york zhong http://doi.acm.org/10.1145/1081870.1081889 17 A Multiple Tree Algorithm for the Efficient Association of Asteroid Observations above accelerations accelerators actual advances algorithm algorithmics algorithms also american analysis apple approach approximated approximately arnold artech ascension association associative asteroid asteroids astronomy automated automatic belt benefit bentley between binary blackman body both bubble cern chamber code comm compatible components conclusions conference consisted context control correspondence covering data databases december declination degree dependent describe design dietterich differences different distance distribution domain each earth editors efficient energy equally examined example exhaustive experiment fast feature fifth finding from fully further fusion gave generate generated given gray hall handbook hendriks high hjaltason hough house ieee image implementation important included 
incremental independent information initiation instrumentation intelligence international introduced introduction jain january join lead learning leen linkage linkages llinas location machine main methodology model modern monocular moore motion motivated multidimensional multiple multisensor multitarget near needed neural nights note number numbers object objects observation observations occlusion october optimizations orbits order over pages pami pattern performance period pictures point points popoli presence presented press principles problems proceedings processing processor projected proportional provides pruning quadratic queries radar real reduction references region reid reinders relative required results right robust running runs salari samet scientist search searching seconds sequences sethi sets shalom shaw show shown sigmod significant simulated spaced spacing spatial specifically speedups square statistical steps structure system systems table target targets techniques test that then these this time times track tracking tractably trajectories trans transactions tree trees type uhlmann used uses varying veenman were whether while with yaakov http://doi.acm.org/10.1145/1081870.1081914 42 Streaming Feature Selection using Alpha-investing abramovich across adaptation adapting adaptive added adding addition advance akad akaike algorithms amer american analysis annals approach approaches article assoc association bankruptcy based bayes benjamini bickel biometrika budap building calculates calibration cases citation clopinet cluster code combinations common comp comparable computation computer computing concept conference considers controlling controls criterion data database decide decreasing dept develop dimension discovery does doksum done donoho dynamically dzeroski each ecial editors eing empirical enalties enalty erts escul estimates estimating etes examine example extension extraction false feature features find flexibility foster found fraction from 
generate generating george getoor given gives guarantee guyon hall have haystacks hochb however http ideal ijcai increasing incrementally inflation information international invention investing isabel issue jacod jasa jensen jmlr johnstone journal know known large lavrac learning ledge less likelihood limit link logistic looked machine make mathematical maximum methods mining model models more much multi multiple necessary needles nips numb observations online otential otentially other othesis over owerful pages paper petrov possibly practical predicting prediction predictive predictors prentice presented principle problem problems proc processes projects provides publication published querying raedt range rate references regression regularization relational research results risk rissanen roughly royal rule savings scales schwartz selected selecting selection sense sequences sequentially series sets shiryaev show shrinkage significant silverman small smaller smooth smoothing society some sparse sparsity spatial springer stanford stat statist statistical statistics stepwise stine stochastic straw stream streaming stronger structural submitted such summary symposium takes technical testing tests than that their them theorems theory this threshold thus time track transformations trying ungar university unknown unlike used using value values variable variety verlag wavelet well what where which will with work works workshop wrob years zhou http://doi.acm.org/10.1145/1081870.1081896 24 A Distributed Learning Framework for Heterogeneous Data Sources account achieve aggarwal agrawal algorithms also annals applicable applications approach asso assumptions austin axiomatic based both bradley censor ciation class clifton cluster clustering collective combining computation conclusion conditional constraints continuous cover cryptology csiszar data databases datasets david dels design developed discrete disparate distributed diverse domains dutta ecial efficient elements 
emission ensembles entropy erties erturbation estimation evaluation evfimievski exchange experimental explorations fayyad focs formulated framework from gehrke general generate ghosh green heterogeneous hierarchical hierarchically high html http icdm icml ieee imaging independence indicates inference information initialization integration into inverse involving issue iterative iteratively jmlr johnson journal kargupta kivinen knowledge lang large learning least letters likeliho likelihood lindell linear lncs local loss machine making maximum medical merugu mining model models much multidimensional newsgroup newsgroups optimization order ordered ounds oxford pages paral partitioned partitionings pattern performed pinkas pods preserving press principles privacy probabilistic problem problems prop proposed quality quantification random recognition reconstruction references refinement regression reina relative require reuse royal rules scalable scale scenarios schema schuller secrets security sensitive sets settings shepp sigkdd sigmod sivakumar size society solutions sources specialized squares srikant stat statistical strehl systems takes tasks technical techniques texas that theoretical theory this thomas thus time tomography transactions types univ university using vaidya vardi various vertically wang warmuth weighted wiley without zenios http://doi.acm.org/10.1145/1081870.1081921 48 Using Retrieval Measures to Assess Similarity in Mining Dynamic Web Clickstreams aaai abbass abidi access adaptative adaptive advances alamitos alberta algorithm algorithms also american analysis appear applications applying approach arlington artificial association associations automatically axis babu bands barbara based bases beach biosystems birch birmingham borges bradley breaking browsing callaghan canada cardona chen china chrono chronological clustering clusters competitive computation computer conf conference congress constrained context continuous cooke cooley coronel cosine 
coverage cybernetics dasgupta data databases dayal days depicting depicts deshpande digital dimensional discovering discovery distributed distribution dong dynamic each editors edmonton efficient eighth emerge engineering england estimator etzioni evolutionary evolving explorations extracting fayyad figure florida focs foundations france frigui from fuzzy garcia gecco genetic gonzalez government guha hits hong horizental hsinchu hunt hypertext ieee immune industry information input intelligence international into issues jacobsen jerne joshi journal knowledge kong korfhage krishnapuram large last late learning lecture levene libraries linking livny logical logs management melbourne method mining minpc mishra mobasher model molina motwani multi nasraoui natural navigation neal network newsletter newton noisier noisy notes number olap only order over page pages paper papers paris parthasarathy patterns perkowitz precision preparation presented press proceedings profile profiles profiling queries ramakrishnan record reddy redondo references regression reina relational requirements research retrieval reverse robust rojas rules santa sarker scalable scaling science scientific sensitive series session sessions several shah shahabi siam sigkdd sigmod similarity sites split springer srivastava stamp step storage streams symposium synthesizing system systems taiwan technology tecno time timmis tools toronto track tracking trend trends uribe usage used user users using verlag versus vertical very vldb wang weakens weaker webkdd when while wide widom wiley with workshop world yang york zaiane zarkesh zhang