http://www.informatik.uni-trier.de/~ley/db/conf/kdd/kdd2006.html KDD 2006 http://doi.acm.org/10.1145/1150402.1150436 31 Very Sparse Random Projections abbadi ability achieving achlioptas addition after against agrawal algebra algorithmic algorithms alon also alto always american anal analogous analysis annals anupam appear appendix applications applied approach approaches approximate approximating approximation arbor arriaga assume assuming assumption assumptions asymptotic asymptotical automatic automatical available aziz back based bayes beach bebis behavior bellingham belmont bernhard berry berryp binding bing bingham biology boon bound bounded brian brodley buckley buhler called cambridge canada carla case casella categorization central charikar check chew chiba chin chistyakov choose chris christos chunqiang church cient citations classi clement cluster clustering colt column combinatorial common comparative competitors complexity comprehensive computation computational compute computers concepts concerns condition consequence conservatively considerable consists constant contemporary convenience convergence copies corresp could course cubic cumulative curse dallas daniel dasgupta data david deal deepak delve denote denoted denotes density derive dimensional dimensionality dimitrios dimitris discovering distances distribu distributed distribution distributions divyakant dmitriy dumais durrett duxbury dwarkadas eaking ected edda edition eduard element elementary ella embeddings emek empirical endix engineering ensemble entries equal equation erich error esseen estima estimation estimator ethernet even exactly example examples exist exists expansions expected experiments express extensions external face fact factor fairly fast fatih feller fern finding fisher focs following foundations fourth fradkin francis francisco frank frankl frequency from function general generality generators george gerard goel goemans gotze graphs great grows guaranteed gulbeden 
gunopulos gupta hash hastie have heavy heikki henry high hilbert hillol hinrich hisao hovy hwee icml ieee ijcnn illustrate image immediately improved improving indexing indicates indyk information input instruments insu interested into introduction involve jaime japan jason jected jection jections jeev jeremy jessica john johnson joint jorg journal karger kargupta kaski keep kenneth kernel kindermann know known language large latent lawrence laws layout leads learning ledge lehmann leland lemma lemmas leopold leung likelihood limit lindeberg lindenstrauss lipschitz lkopf locality logan longer loss lows machine machines madigan maehara mallows manage mannila manning many mapping marginal margins mario martin mathematical mathematics matias matrix maximum mcsherry meaningful means measurements method methods michel mining model modern moment moments montreal more moses most motifs motwani much multi multimedia multiplicative multivariate murad naive natural nature navin nearest need needed neighbors nement networking networks newman next nips nite noga normal normality normalized normalizing notation notational note noun numb number onds only onverg ooks ounded owledge ozgur pages palo pantel papadimitriou paper parameter pareto patrick peer perturbation philadelphia philip physics ping piotr piscataway pittsburgh pods point poor power prabhakar precision preserving press prior prism privacy probabilistic probability problems proc process processing programming projection projections proof proofs provided providence proving pseudorandom quebec raghavan random randomized rate ratio ravichandran really recognition redondo reduction reference references relative remains removing rennie represent research retrieval rice richard ripley robust roni rosa rosenfeld rounding rows ryan sahin salton sample sampling samuel sandhya sanjoy santosh satis says scaling schemes schutze sciences seattle second seen self semantic semide sensitive severely shepp shiganov shih should show 
shown sigir sign similar similarity simpli simplicity since singapore sites society sold solution some sons sources space sparse speed sphericity spie springer stable standard statistical statistics still stoc stream strictly strong structures studentized study sums sung supp support susan systems szegedy tackling tailed tamaki tang taqqu techniques teevan term terms text texts than that their then theorem theorems theory therefore third this three thus tion tompa topic towards track trans transactions treat trevor trivial tsang unbiased unexpected upper useful using vancouver variable variance variances vector vectors vempala venables veri verlag very viii vlsi volume walter washington weaker weighting what when where which whose wiley will william williamson willinger wilson with without word would write xiaoli xmin yields yiming york yossi your ysis yuan zero zhang zhichen zipf http://doi.acm.org/10.1145/1150402.1150469 63 Mining Relational Data through Correlation-based Multiple View Validation able accuracies accuracy achieves across active adviser aggregation algorithm also analysis approach approaches arti assessment assisted automatically ballard based bene berka berlin blum both bristol burges cameron capable category cation chal challenge chosen cial cient ciently classi classifying combining commonly company comparing complex computation computational computing conclusion conclusions conference consistently correlation coursac craig crossmine dality dasgupta data database databases datasets dels discovery diss domain duteil dzeroski ecml editors eled elmasri employed engineering entities erform eriment evaluate evaluated execution facets feature figure flannery foil framework frank friedman fundamentals furthermore future generalization geto ghiselli guide hall hand holte ijcai impact implementations implies imply improve improved indicate indicates intel international java jects john join jones kaufmann king knoblo know kohavi koller krogel lavrac learn 
learners learning ledge left lenge length ligence littman lucas mach machine machines maximum mcallester mcgrawhill measurement medical meta method midterm mining mitchell models moderate more morgan most multi multiple muslea nancial navathe needed neural nips number numerical obtained order ositionalization other ounds overall pages paper path pattern performance performances pkdd practical predictive present presented press probabilistic proc proceedings programs progress promising promisingly prop proposes psychological publishers quinlan recipes recognition reduced references regardless relational relations required respectively results right robust rules schema scienti selection sensing show shown shows side sides siebes simple since slightly springer srinivasan strategy subset supp systems tables techniques terms teukolsky that theory these thesis this thoroughly through time tools toxicology track training tutorial uncorrelated unlab used using validation value values varied vector very vetterling view views viktor waikato well when which with without witten work workshop wrapp wrob yang yields http://doi.acm.org/10.1145/1150402.1150449 44 Anonymizing Sequential Releases aggarwal agrawal alternative analysis anonymity anonymization anonymized anonymizing april attacker based bayardo bell beyond biomed blake bottom cation chakraborty checking chicago cient classi clifton communication complexity computer constraints data database databases datasets dence deutsch dewitt disclosing disclosure distributed diversity domain dong down enforcement enhanced exchange exposure feder formal full fung fuzziness gehrke generalization genomic handicapping hettich html http icde icdm icdt identities ieee incognito info information injecting integrating international into iyengar jajodia journal june kaufmann kenthapadi kifer knowledge learning lefevre limit machanavajjhala machine malin mathematical merz meyerson microdata miklau mining mlearn mlrepository model mondrian 
morgan motwani multidimensional network newman november optimal pages panigrahy papakonstantinou personalized pods preservation preserving privacy private problems programs protect protecting protection publishing quinlan ramakrishnan references release repository research respondents samarati sample satisfy security shannon sigkdd sigmod size solution specialization suciu suppression sweeney symposium system systems tables technical template theory thomas through tkde transforming uncertanty using utility views violation vldb wang when williams wong xiao http://doi.acm.org/10.1145/1150402.1150468 62 Algorithms for Discovering Bucket Orders from Data academic aggregating aggregation agrawal agrees ailon algebra algorithm algorithms alon also analysis anatomy appear applications approach approximates approximation approximations arti assignment attempts authoritative automated aware based basic been bernor best better between biology borodin both bounded brin bucket buckets carlo cation cene chain charikar chaudhuri cial cidr cient class classi clustering cohen collab columbia combine combining community compaq comparing complete computational computer concept conclusions conditional consider considered context corresp crammer cranking cruse data database declare dels demonstrated denoted described devices direction discrete discuss discussed diversity does downldgf dwork each eachmovie ecml ecology edia editors elements encyclop engine environment erent ermutations erts erty etween eurasia eurasian even evolution example examples experiments explicitly extending fagin fahlbusch fast faunas figure finding formulations fortelius fossils found fragments from function furnkranz future gave generatingfunctionology gionis given global good grades have haveliwala here himb however html http hullermeier hyperlinked hypertextual icdm icml ilyas inconsistent information input integer intel interesting internet into intuitive isdn item items jective journal just kleinberg 
kujala kumar labels land large later learning lebanon life ligence line linear link lists ltering machine mammal mannila markov math mathematics matrix mcjones meek metho method methods mittmann mixture mnorder mobile model modeling monte more much natural nding neogene networks newman nips njas numb onding optimization orative order orders ours page pagerank pages pair pairs pairwise paleo paleontological paper partial permutations pivot plos pods points polytope possible practice pranking preference press probabilistic probability problem problems provide provinciality puolam queries query rank ranking rankings ratio real recognition references related relational removing research results returned same scalable scale schapire score search segmentation sensitive sequences sequential seriation series several show showed shown siam sigmod similarity simple singer single situations sivakumar slone soda solution sources stoc studied suggest suitable systems table task technology test than that their theory they things ties time times topic total tournaments transactions turnover ukkonen under university unordered upenn usefulness using variations vertex very vldb western where which wilf with work works yield york http://doi.acm.org/10.1145/1150402.1150457 52 Simultaneous Record Detection and Attribute Labeling in Web Data Extraction analysis applications arasu automated automatic based bayesian block boundary bunescu buttler carnegie carreira causal chang chen classification cohen collective combining computation computational conditional cowell crescenzi cvpr data dawid dictionaries discovery documents ecml embley entity entropy expert exploiting extracting extraction fields fine finn from fully garcia gaussian hidden hierarchical icdcs ieee iepad image information integration jensen jiang kushmerick labeling large lauritzen learning level local machine markov maximum mecca mellon merialdo methods model models molina mooney multi multiscale named networks object 
olesen pages pattern perpi prior probabilistic proc processes quarterly random record references relational report roadrunner rosenfeld sarawagi search semi sigir sigkdd sigmod singer sites smoothing spiegelhalter springer statistics structured system systems technical tishby towards university updating vldb wide with world zemel http://doi.acm.org/10.1145/1150402.1150418 13 NeMoFinder: Dissecting genome-wide protein-protein interactions with meso-scale network motifs acad acids alberta algorithm allow alon analysis apriori assess assessment based bioinformatics biology blocks botstein brown building cagney cerevisiae chen chklovskii cient city cluster comparative complex complexes comprehensive computational computing concentrations concepts conference conserved construction data database databases department detecting detection discovering display eisen ellman ermeyer estimating etween exploiting expression finding fortin frequency frequent frishman from functional gene generality genome genomes giot graph grigoriev gspan guldener hayashizaki huan icdm inokuchi interaction interactions interactomes international isomorphism itzkovitz karypis kashtan krause kuramochi large largescale lnbi maslov maximal measure measurement mering meso mewes milo mining mips mirny modules molecular motifs motoda national natl nature network networks nucleic ology pages pattern patterns pkdd pnas prediction presence prins problem proc protein proteome references relationship reliability reliable report saccharomyces saito sampling scale schreib schwobb science sequences sets shen siam sigkdd simple singap snel snepp sparse spin spirin stability subgraph subgraphs substructure substructures suzuki systems technical tkde transactions uetz university wang washio wide with yang http://doi.acm.org/10.1145/1150402.1150452 47 Extracting Redundancy-Aware Top-K Patterns access active afrati afshar agrawal algorithms analysis annual approach approximating approximation association automated 
background baptie based bases bastide bayesian biennial bioinformatics block brief brown calders carb case chaudhuri chen cheng cidr class closed clospan clustering cold coldstein collection comm compressed comput computational conf correlations data database databases datasets della derivable desouza development discovering discovery discrete disk disp diversity documents ecial engineering epkut erschatz ersion european evaluating evolutionary exploration fast feedback file finding francisco frequent from general genome gionis goethals gram gravano hall harbor hassin heuristic hohenbalken holldorsson icdm icdt ieee inen information innovative interesting interestingness itemset itemsets iwano jain jaroszewicz jennifer katoh know knowledge kumar lakhal language large ledge linguistics localtion makes mannila maxian maximizing maximum measure mercer mielik minimum mining models modern mount natural networks onell operations opns oral ordering overview paegs pages pasquier pattern patterns peter pietra pkdd prentice principles problem problems proc producing queries query ranking ravi references reordering reranking research results retrieval right rosenkrantz rubinstein ruemmler rule rules selecting selection sequence sequential sets shen siam sigir silb simovici singhal sixth soda spring srinivasan srivastava storage structures subsets summaries summarizing supp symposium systems tamir taouil tayi tech technologies temp text theme theory third tokuyama trans tuzhilin tzvetkov unix usenix using very vincent vldb wang what wilkes winter without zhai zhou http://doi.acm.org/10.1145/1150402.1150528 119 A Component-based Framework for Knowledge Discovery in Bioinformatics adaptive annual based berzal blanco cannataro carnahan classifiers communications component componentbased conference cubero data development ding distributed engineering enterprises feedback framework frameworks france grid hernandez iasse intelligent international isasi july knowledge larsen layered 
learned lessons linear malakhov marin mining multi nice perera perrizo predict proceedings ramos references registers security serazi shift sierra software sparling systems talia through using years http://doi.acm.org/10.1145/1150402.1150497 91 A Large-Scale Analysis of Query Logs for Assessing Personalization Opportunities about adaptive after agenda agent american analysis appears assessing automated available based become beitzel billsus broder brown categories categorization categorized cation chains changes chen chowhury cikm classes claypool clicked clicks clustering commerce comparison computer computing conference conferenceon contextual converge convergence data decoste different direction discovery distinct distribution distributions engine explored exploring extent feedback forum found framework frieder from future gauch generation grossman helps henzinger histories history hourly hundred identi identify ieee implicit includes indicators information informational intelligence intelligent inter interest interesting interfaces international interntional jansen jensen joachims journal knowledge large larger learning logs ltering machine madani main management mapping marais meng mining moricz navigational nine only opportunities pattern patterns pazzani personal personalization personalized personalizing pooch population poster problems proceedings processing queries query questions radlinski rank rarely recommender references repeated report research review revising saracevic scale science search searching sebastiani self sets shoval showed sigir silverstein site sites society special speretta spink stationary stickiness studies study such summary survers system taxonomy technical technology text that this three topical topically trends types user users uses using utility versus very view wased webology wedig well wicacm wolfram work workshop yahoo http://doi.acm.org/10.1145/1150402.1150494 88 Efficient Kernel Feature Extraction for Massive Data Sets 
accuracies accuracy advantages also always appears average besides both bottom cannot cation cess cient classi combination compares computa computational computationally conclusions converge cost data dominant during each erent evaluations even expected experiments explains expressed extract extracted extraction extractor extracts face faster feature features figure generalization good have high hours however hundred ideally implementation improvement including investigated involved kernel kpca large larger largest lead leads letters linear magnitude manner mentioned mmda mnist much needed numb numbers often only optdigits orders original others paper pendigits performance poor poster problem process produce random recall references reminded required research same sampling satimage scheme seconds section seen sets several should shows slower small sometimes sparser successfully such table tasks test testing than that this thousand three thus time track training types used useful usps various while http://doi.acm.org/10.1145/1150402.1150451 46 Discovering Significant Rules aaai against agrawal agresti algorithms analysis annals approach archive asia associates association associations assortment august aust based baskets bastide bayardo bayes benjamini berlin beyond binary blake bonferroni boulicaut brijs brin business bykowski calders california cant case closed closures computational computer concise condensed conf constraint contingency contrast control controlling correlations data database databases datamining decisions dence dense department derivable detecting discovered discovering discovery dumouchel editor editors empirical endency erences erformance etween european exact false february fifth finding first fourth frawley frequent generalizing goethals group guide gunopulos hard hettich hochb holland holm http imielinski improved inference information intel intelligent interesting international irvine item items itemsets japan jermaine journal know kohavi 
kyoto lakhal large learning ledge letin ligent logic machine machines magnum management mannila market mason massive megiddo melb menlo merz miner minimal mining most motwani multi multiple newman optimal optimally opus ository othesis ourne owerful paci padmanabhan pages pakdd park pasquier pazzani peckham piatetsky pkdd portland practice predictive pregib presentation press principles proc procedure procedures product pruning psychological psychology quantitative rate readable real redundant references rejective release representation representations royal rule rules scandinavian sche science screening sequentially series sets seventh shapiro sigkdd sigmod signi silverstein simple society software springer srikant statistical statistics strong study stumme summarizing supp survey swami swinnen systems tables taouil tenth test testing that toivonen trade tuzhilin under university user uses using vanhoof verlag version washington webb wets world yekutieli york zaki zhang zheng http://doi.acm.org/10.1145/1150402.1150461 55 CCCS: A Top-down Associative Classifier for Imbalanced Class Distribution agrawal algorithms antonie archive arunasalam association associative based bases baskets beyond blake brin cccs chawla class classi conference cong correlations data discovery distribution dmkd down farmer fast finiding generalizing groups http imbalanced interesting international issues know large ledge management market merz microarray mining motwani negative ositive pages proceedings references research rule rules sigmod silverstein srikant sydney technical tung university very vldb workshop yang zaiane http://doi.acm.org/10.1145/1150402.1150429 24 Training Linear SVMs in Linear Time advances algorithms align altun analysis applied approach area arti arxiv astro august available award backstrom bahamonde bartlett bengio berlin burges cambridge caruana categorization ccat centre chang chapter cheung cial cikm cjlin classi clickthrough collection collob computation 
conference convex core covertype csie curve cutting data decomp decoste discovery dumais editors enchmark endent engines erformance ermayer european fast features ferris figure from function fung gift gijn google graep heckerman herbrich hofmann http hush icml inductive industrial insa intelligence interdep interior international jmlr joachims journal kddcup keerthi kelley kernel kernels know kwok lagrangian large learning ledge left lewis library libsvm light linear lkopf machine machines making mangasarian manuscript many margin massive mathematics maximum measures method methods minimal minimizing mining modi multivariate munson musicant neural newsletter newton nite novemb numb oint online optimization optimizing ordinal orted osition oundaries output oviedo pages pairs paper perf physics plane platt polynomial practical press problems proceedings programs proximal rakotomamonjy rank ranking references regression relevant representations research results reuters right rose rouen sahami scale schoelkopf scovel search seconds septemb sequences sequential sets siam sigkdd smola society software solution solving springer structured supp support svms svmtorch swapp technical text this through time track training tsang tsochantaridis under universidad using variables vector very williamson with yang http://doi.acm.org/10.1145/1150402.1150532 123 Camouflaged Fraud Detection in Domains with Complex Relationships accounting alexander algorithms analysis authors balaji based berkeley breiman business california carnegie center chang characterization chen chicago classification classifier columbia comiskey computer conference contract creative crsp crystaliz daah darpa data database december dechang deng detecting detection deviants discovery dissertation dynamical earnings employees engineering february financial florida game general geometric getpub gsbwww http icdm ieee information institute international introduction isbn jacob jeffrey john june knowledge line 
macqueen management mathematical melbourne mellon melo memory methods minimal mining modeling mulford multivariate muthukrishnan numbers observations omega outlier padmanabhan palis paper part patterns penman performed practices press prices probability proceedings purpose rahul ratios references regression reported reports research robotics rule sbir school science scientific scott security series shah society some sons spatial sponsored springer ssdbm statement statistical statistics stephen streams sustainable symposium system systems that theory this tien time transactions trees tuzhilin uchicago unexpected university using verlag vitter wellington were while whoswho wiley work working xiao yufeng zhang http://doi.acm.org/10.1145/1150402.1150518 110 Computer Aided Detection via Asymmetric Cascade of Sparse Hyperplane Classifiers additive advances aided algorithm algorithms altman annals annual approach armato assisted automated bamberger bazaraa bennett bogoni boosting breast buchbinder building burges cambridge cars cascade category cation classi column combining computational computer conference convex data demiriz detection discovery dumouchel dundar editors example false ferencz fields following francisco freedman freund friedman from fung game gehrke generation ghosh giger hasegawa hastie huang identi international john kaufmann kernel kernels know kohavi learned learning lederman ledge leichter lesions levy line lkopf logistic lung macari machine machines macmahon malik manifolds mathematical medical megibow methodology methods miller mining mixture morgan ninth nodule nodules nonlinear novak pages pattern physics pitfalls positives potential prediction preliminary press proceedings programming radiology rads recognition reduction references regression results scans schapire serial shawe sherali shetty sigkdd sklair smola sone sons statistical statistics support surgery system taylor theory tibshirani training validating vector view vision visual wang 
wiley wilkie with workshop yarmish york zhang http://doi.acm.org/10.1145/1150402.1150511 105 BLOSOM: A Framework for Mining Arbitrary Boolean Expressions aaai advances agrawal akutsu algorithm algorithms also alternating analysis antonie approach asso bastide been cambridge cartwheels cation ciation ciations cient cikm clauses closed closure concept conf conjunctions context counting customer data database davey discovery discrete disjunctive disruptions engineering ethals explorations expression expressions extensively extracting fast figure fimi formal formulas foundations frequent ganter gene generalised generators have hsiao icde identi ieee implementations inference information introduced introduction itemset itemsets jamison june know large lattice lattices ledge many mathematical mine minimal mining monotone nanavati navathe negative network networks notion olean omiecinski operator order ositive overexpressions pages past patterns pfaltz pkdd press priestley proposed pure ramakrishnan reasoning redescription redescriptions references regulatory related rule rules savasere science sciences sets shima sigkdd springer strategic strong structure studied symposium systems task their trans transactions turning university using verlag wille with within work zaiane zaki zhang http://doi.acm.org/10.1145/1150402.1150490 84 Naïve Filterbots for Robust Cold-Start Recommendations aaai algorithms application approach architecture automating based bergstorm case choices cial clustering cold collab community content cosley cscw data decoste dels diagnosis dimensionality environments ersonality escul evaluating factorization fast filtering foster furnas getting giles grouplens hill horvitz hybrid iacovou icml information item karypis know konstan lawrence learning lterb ltering madani maes margin matrix maximum mcnee memory metho mouth naive netnews orative pages park penno prediction preferences probabilistic rashid recommendation recommendations recommender recommending 
reduction reidl rennie resnick riedl robust sarwar shardanand sparse srebro start stead study suchak systems technical ungar user virtual webkdd word workshop http://doi.acm.org/10.1145/1150402.1150460 54 On Privacy Preservation against Adversarial Data Mining adult adversarial after ages aggarwal agrawal algorithms also anonymity anonymization applications approach association attributes based bayardo bureau cardinality cation census classi clifton conceptual condensation conference containing continuous dalvi data database databases design dimensions disclosing discretization discretized distortion distribution dmkd edbt education effectiveness enforcement entries enumeration examine experimental experiments extracted figure fnlwgt from generalization haritsa have here hiding http icde ieee implications incomplete information leakage learning liew machine maintaining marks massively mievski mining minnesotta missing mlearn most multi nominal number optimal parthasarathy partitioned pattern pods preserving privacy probability proc proceedings protecting quanti real reconstruction references relational removal removed report repository research respectively results rizvi rule rules rymon samarati same search security semi sensitive sets shows sigmod size srikant still supervised suppression sweeney symposium systematic technical those through tkde tods tuples univ vaidya value values vertically verykios vldb when which workshop xiong year http://doi.acm.org/10.1145/1150402.1150448 43 Center-Piece Subgraphs: Problem Definition and Fast Solutions actually aditya advances algorithm also always among analysis anomaly appendix applications april asymmetric attributes august authority automatic balmin banks barabasi based bhalotia biological bipartite bousquet brin bringing browsing calculate calculation case categorical cation chakrabarti cheng cient cikm citation clustering coetzee communities community complex compressed computed computer conference connection 
consistency context correlation cross data databases deal design desirable detection dhillon diameter digital discovering discovery discuss does dorogovtsev duygulu ectively ectral electricity eriments erson erty every evolution example explorations external faloutsos fast flake formally formation frequent from function geerts gibson giles girvan global good goodness graph graphs handbooks haveliwala here high holland however hristidis hulgeri hypermedia hypertext icdm identi ieee image individual inferring information informative international internet interpretation irregular issue ject jectrank jeong jordan karypis kemp keyword kleinb know kumar laplacian largest lawrence learning least ledge library link local longer mallela management manifold mannila matrix maximizing mccurley mean measure mendes might milnor minimum mining modal models modha monma motwani multi multilevel multimedia nakhe nally nature necessarily neighb network networks newman ninth nips node north note nowell ology only operations order organization orhood ortant otential other ower page pagerank pages pakdd palmer papakonstantinou paper parag parallel particle partitioning pattern perry physics prediction probabilistic probability problem proc prop provide queries query quite raghavan ramakrishnan ranking references relational relationships replacing research resp resulting retrieval review scholkopf science score search searching self sensitive sept sets sheth siam sidl sigcomm sigkdd similar similarity similarly simrank since social softand some source special spread stanford state statistic statsitic stay steady stoer structural structure subgraphs such survivable symmetric symmetry tardos technical technologies terzi that then theoretic there this through tomkins tong topic track transition tschel uence value variant variants version vldb washington weiss weston where wide widom winograd with world yang york zhang zhou http://doi.acm.org/10.1145/1150402.1150428 23 Adaptive Event 
Detection with Time–Varying Poisson Processes accumulates acknowledgments additional allows along also alternative amer among analysis anomalous answer anton applications applied approaches areas arising assistance assoc asymptotics attendance authors axis baseball based baum bayesian building bursty calculating calculations calit campus carlo cascade chain chains characterization chib chiu choice chris collection combined comm compare comparison computed conclusion conference correlated correlation counting counts data database davison days degree densities dept describ detected detecting detection different direction discovery dissertation distributions doing doors duration during dynamic each eecs ehavior ehavioral eighth entering eople erent erformance erformed erforming erhaps eriodic erkeley estimated estimating event events exact example examples exists exiting extensions february feedback figure finally finding form forward foundation framework freeway from functions future game gelfand geman gibbs gives grants ground guralnik harvard have having hierarchical http ieee images increases inference information interesting international intrusion keogh kleinb know ledge like likelihood linear list logistics lonardi loop lucantoni main mannila marginal markov math maximization mcmc measurement methods mining mmnhpp model models modulated monte months more multiple multiplexer national natural naturally nazarenus network nonhomogeneous numb number observe observed obtained occupancy occurring opularity orted osition otentially other output over packetized pami parameter parameters part patterns pems performance petrie poisson predictable presence press principled probabilistic proceedings process processes programming provide providing quantities question questions real references regular related relaxation restoration richer salmenkivi sampling scheduled science scott sensors sequence series several shared shellie should showed shows sigkdd simpli simultaneously 
smith smyth soules source space srivastava stat state statistical statistics stochastic streams structure such supp surprising system systems technique test thank that their them then these this those though through time times trans truth uncertainty under university used using varies varying vehicles virtue voice weiss well what which while with work would xing york http://doi.acm.org/10.1145/1150402.1150509 103 Identifying Bridging Rules Between Conceptual Clusters academy acknowledgment actions agrawal algorithm algorithms analysis annual antecedents applications approach arning association august australian barnett based basic baskets belief between beyond both breunig bridging brin cannot case category chameleon changes chengqi china chinese clustering clusters cognitive comput computation computer concepts conclusion conf confirm contours contrast correlations cubes data database databases datasets dawak demonstrated density depth describing designed detecting detection deviation difference differences dimensional discovered discovering discovery discrete distance driven dynamic edwin efficient employ engineering eppstein evaluated example exceptions existing expectations experimentally experiments exploration extending farther fast feng finding first fraud frequent future generalizing geom grant grants graphs group hans have hierarchical hussain icml identify identifying ieee include influence insightful insurance interesting interestingness intuitively itemset john johnson jorg kamber karypis kaufmann knowledge korn kriegel kumar kwok large lewis life linear local major makes many march market markus measuring meeting megiddo method metrics mining modeling more morgan motwani muthukrishnan national nearest need negative neighbor olap oregon outlier outliers outstanding overseas padmanabhan pakdd paper partially particular paterson patterns pazzani peter portland positive possibility proc proceedings program promising proposed pruning queries raghavan 
raymond real references reliable research reverse riven rule rules sander sarawagi scalability science sciences search second sets shichao sigmod silverstein simply sixth society sons spain specifically statistical still strategies supported talent techniques technology that then theoretical this transactions tuzhilin unexpected useful user using valencia vldb what wiley will work xindong zhang http://doi.acm.org/10.1145/1150402.1150453 48 Regularized Discriminant Analysis for High Dimensional, Low Sample Size Data academic academy accuracy acids algorithms allaml american analysis annals applications applied arsenin association axis based belhumeour bergel biased bioinformatics biology biomedicine burges cambridge cancer categorization cation characterization chen class classi classification coding comparison computational computations computer conference costs covariance cristianini data denotes different dimension discovery discriminant discrimination distribution diversity duda dudoit ediatric edition eigenfaces eigenfeatures elements ercentage erent estimation eugenics expression extraction face family feature figure fisher fisherfaces fridlyand friedman fukunaga gene generalized genome geometric golub grouven hall hart hastie hespanha high hoerl hopkins http human icml identi ieee image implementation incorporating inference intel into introduction janardan jection john johns jonathan journal kennard kernel know kriegman krzanowski learning lection ledge leukemia lewis liao ligence linear ling loan lymphoblastic machine machines marron matrices matrix mccarthy measure measurements methods mining misclassi monitoring multiple national neeman nonorthogonal nucleic olecular other outcome paper park partitions pattern posed prediction press problem problems proceedings programs protein quadratic ratio recognition references regions regression regularized representation research retrieval reuters ridge royal sample schultz science sciences series singular site 
size small society solutions solve sons speci spectroscopic speed splice splittings springer statistical statistics stork subtype support swets system taxonomic taylor technometrics test testing text theory third thirty thomas tibshirani tikhonov tissue track training trans transformation tumors tutorial ulda uncorrelated undersampled university using vapnik vector washington weng where which wiley with xiong yang yeoh zhang http://doi.acm.org/10.1145/1150402.1150456 51 Event Detection from Evolution of Click-through Data account addison agglomerative agrawal allan analyzing approach approaches author based beeferman berger bhowmick bolivar candan carb centric chang chen cikm classifying click clickthrough clustering collection conclusions conditioned context cuts cvpr data detect detecting detection document driven dynamic ective elieve endent engine engines ersonalized esvd etween event events evolution exact existing expansion extraction fact feature from graph grouping gunaratna harary historical ieee ignoring image incorp indexing information into investigation joachims keogh level line logs malik measure measures mining model motivated nature nding normalized novel novelty onell only optimizing orated organizing ortant osed page pages pairs partitioning pattern pierce plays probabilistic process prop queries query real references retrieval retrieving retrosp rich role sarkar search segmentation selection semantic sentence sigir sigkdd similarities similarity simrank soundarara structural study subgraphs take task terrorism that then theory this through time topic tpami tracking unit user using vector visitor visitorcentric vldb wade wang warping wesley while widom work world yamron yang zeng zhang zhao http://doi.acm.org/10.1145/1150402.1150431 26 Maximally Informative k-Itemsets and their Efficient Discovery aaai agrawal algorithms almuallim analysis approach association between bingham candidate comon component concept cormen cover data databases
dietterich elements elisseeff feature features frequent generation guyon imielinski independent information introduction irrelevant items john journal karhunen kira large learning leiserson machine mannila many mining patterns practical press proceedings processing references rendell research rinen rivest rules selection sepp sets sigkdd sigmod signal sons swami theory thomas topics variable wiley with without http://doi.acm.org/10.1145/1150402.1150504 98 Utility-Based Anonymization Using Local Recoding achieving aggarwal agrawal algorithms annealing anonymity anonymization anonymizing approximation based bayardo bureau census cient complexity constraints control curse data dewaal dimensionality disclosing disclosure division domain elements enforcement full fuzziness generalization generalizing icde icdt identities ieee incognito information international iyengar journal knowledge lecture lefevre meyerson microdata model mondrian multidimensional notes optimal pods privacy protecting protection provide references release report research respondents samarati satisfy sigmod simulated springer statistical statistics suppression sweeney systems tables technical technology through tkde transforming uncertainty using verlag vldb when willenborg williams winkler http://doi.acm.org/10.1145/1150402.1150425 20 A New Efficient Probabilistic Model for Mining Labeled Ordered Trees abiteb acetylgalactosamine acetylglucosamine acids actual aggarwal akutsu algorithm algorithms analysis analyzing aoki application applications araki area assp baum bottom buneman capturing carb cation chains chakrabarti characteristic chemical cient ciently class classi cold complex computations computer concluding context cummings curve damerau data datasets dempster develop developments diligenti discussion document ective ectiveness editors eled empirically erating esko essentials estimation evaluated extensions figure forest found foundations fourth framework frasconi free freeze frequent from
fucose functions galactose general generalisation genomics glycans glycobiology glycome gori goto grammars graphical hama hand hanley harb hart hashimoto hattori have hidden high hirakawa hybrid hypertext icml ieee image incomplete indurkhya inference informatics information inside intel interscience introduction itoh jima juang kanehisa kashima katayama kaufmann kawano kawasaki kawashima kegg kernels kinoshita know koyanagi krishnan laird language lari lauritzen learn learning ligent likelihood likely local mach magazine mamitsuka managing manning mannose markov marth math maximum mclachlan mcneil meaning methods mining model models morgan most multiple natural networks ninth nite nucl ohydrate oratory ordered otmm outside pages pami paper paths patterns pearl petrie plausible predictive press probabilistic probabilities problems proc processing rabiner radiology real reasoning receiver record references relations remarks research resource royal rubin scheme schutze semi semistructured sigmod simple speech spiegelhalter spring springer stat state statistical stochastic structural structured structures suciu sugar summary synthetic systems text their till track trans tree trees ueda under unstructured using varki weiss wiley with xrules yamaguchi york young zaki zhang http://doi.acm.org/10.1145/1150402.1150481 75 Visual Data Mining using Principled Projection Algorithms and Information Visualization Techniques advanced aids algorithm algorithms also alternative aminergic analysis ankerst approach areas arti astrono aurenhammer basis belkin billb bischof bishop calized cation characterize chem chemical chemists chemoinformatics cial class clearly closely cluster clusters comp computation computing conclusions conf consider constructing convincingly coordinates could cover curvature curvatures data dataset datasets dels demonstrated detail details deterministic develop diagrams dimensional dimensionality dimsdale directional discovery divergence domain drug during 
dvms early ective ectiveness edition eigenmaps elements engineering eptidergic erent erts ervised etter exact exceeds exible exploration explore exploring extraction eyes factor factors feature folding framework from function functions fundamental generative geometric geometry germany gpcrs grap graphics group grouping guiding have help hgtm hierarchical high hornik icann ieee info information inselb integrates intel interactive interesting interface jection jections keim kinases know kreuseler languages laplacian large latent learning ledge levkowitz levy life ligence linear ling lncs local lowe machine magni manifold manifolds maniyar manual mapping matrix methods mining model models more multi multiplications nabney ncrg networks neural neuroscale niyogi normally note novel numb oarding obtain obtained ographic oints oliveira only opulations ordinates organizing oriented ortant osed ounds output overlap pages parallel particle partiview pattern patterns pixel plots principled proc processing projection prop proved provide provided radial real reduction references regions regression representation research results scales schumann scientists screening self semisup separation sewing shneiderman show since space springer stages stat strong structure structures such survey surveys svens svensen symp symposium systems targets task taxonomy technical technique techniques terms than that theory this thomas tino tipping tool traditional trait tran trans transactions tting understand understanding union useful user using verlag very visual visualisation visualization visualizations visualize visualizing vornoi which wiley williams with work york http://doi.acm.org/10.1145/1150402.1150434 29 New EM Derived from Kullback-Leibler Divergence accuracy adaptive after algorithm analyzed appl approach approximation automatic based bayesian beal before begin biometrica blackwellized burgard camp canny carlo chain composed compression computation computers data dempster detector 
determination diagonals during edge edges eriments figure from generated ghahramani green grid grisetti ground hardle healy icra illustrate image implementation improving incomplete initial input iteration iterations journal jump laird likelihood line lters markov maximum merge missing model monte noise obtained only original osals outdoor output oxford paper particle pixels points polygonal press process prop rate references resampling rescue research reversible robot rome royal rubin scan second segments selective series shown shows signal simulated slam smoothing society split springer stachniss statist statistical statistics techniques that track truth univ values variational verlag wesmacott with http://doi.acm.org/10.1145/1150402.1150440 35 Tensor-CUR Decompositions For Tensor-Based Data advances algebra algorithm algorithms alternating amazon analysis annual appear application applications applied approximating approximation approximations april arithmetic array arrays arti baltimore based basic benign best bolasco boosting callan carcinoma case cial cient cmsc coifman collaborative college colon combining communications complete complexity component computation computations computer computing concept conference constant contemporary coppi data davis decomposition decompositions decoupled deverse device digital dimensionality discriminating drineas editors eigentaste elesvier error factor foundations four freund from generalization geshwind goldberg golub goreinov gram graphic gupta harshman hattie higher hopkins hyperspectral ieee improved information intel international internet item iyer jagopalan johns journal karypis kernel kleinberg know konstan kroonenberg kruskal kumar lathauwer learning least ledge leeuw leibovici ligence linden linear loan ltering lundy machine machines maggioni mahoney malignant management manuscript maryland mathematical mathematics mathematik matrices matrix maximum mcdonald means method methods microarray microscopic mirror 
mixture mode model models moor multidimensional multilinear multimode multiway neural normal notes novel numerical numerische nystr order pages parafac park perkins praeger preference preferences press principal probabilistic proceedings processing pseudoskeleton psychometrika publishers quasi raghavan randomized rank ranks ratings recommendation recommendations recommender reduction report research resnick retrieval riedl roeder sabatier sandler sarwar scaling sccm schapire schmidt science sections seeger siam signal singer singular smith snyder some sparse speed squares stad stanford statistics stewart study symposium system systems technical tensor tensors theory three time tissue tomkins trilinear truncated tucker tyrtyshnikov umiacs uncertainty uniqueness university using value vandewalle varian volume warner webkdd williams with workshop york zamarashkin zhai http://doi.acm.org/10.1145/1150402.1150420 15 Orthogonal Nonnegative Matrix Tri-factorizations for Clustering aaai academy adaptive agent agents aggregation algebra algorithm algorithms analysis applications approximation assoc autonomous base based behav berkeley berry binary bipartite block boley brien brunet california categorization cation cheng chessell cikm classi clsi cluster clustered clustering common comparability comparisons component computer computing concept conf conference constraints convex cooper criteria criterion data decomposition developerworks dhillon dimension dimensionality ding direction discovery discriminant divisive document documents dumais empirical environmetrics equivalence error estimates evaluation event exible exploration external factor factorization factorizations fast foote framework from functions gallopoulos gavin general generalized generalizing gini golub gong graph gross hastings hierarchical hofmann howland hoyer http hybrid icml ieee indexing informatic information integrated intelligence intelligent international iteration jordan karypis knowledge kumar 
laboratory language large latent lawrence lbnl learning liang library linear localized long machine machines management matrices matrix mccallum means mesirov metagenes method methods milligan mining mobasher model modeling modha molecular moore multimedia multiplicative multivar national natl nature negative negatvie nips nonnegative objective objects ogihara optimal paatero pages paper park partitioning parts pattern peng pkdd positive principal probabilistic proc proceedings processing programming quadratic rand recognition reduction references relaxation report representation research retrieval review saul scaled scheme sciences selected self semantic semi seung siam sigir signal similarity simon singular space sparse sparseness spatially speci spectral square stat statistic statistical stearley study subspace summarizing support syslogs system tamayo tapper technical term text theoretical toolkit toward track trans university unsupervised updates using utilization value values vector video vision webace webservices with words workshop zeimpekis zhang zhao http://doi.acm.org/10.1145/1150402.1150523 115 Understandable Models Of Music Collections Based On Exhaustive Feature Generation With Temporal Statistics according acoustic acoustical acta acustica advanced advances aggarwal ahrendt algorithms america analysis anchor another applications applied architecture architectures archives aucouturier audio automatic automatically bandwidths based bayesian beat behavior bello benchmark benelux berenzweig berry breebaart burges cambridge cano case categorization cation celma center cessie chapter chorus cker classi classifying clustering coding collaborative collections comparative compute computer conf content contents cook critical data database databionic dataset dept descriptors detecting dietterich digital dimacs dimensional dimitrova distance distances dixon dortmund dynamical ects editors efthymiou elements ellis ensemble environment essl estimators european 
evaluation evolving expo extraction fast feature features finding first fischer framework friedman from function fundamentals general generalization genkin genre german germany glasberg goto government hall hastie herrera high hinneburg hken homburg houwelingen human icassp icme ieee implementations improving indexing industrial inference integration interaction international intl ismir journal juang kamps kantz kaufmann keim kernel kittler klinkenberg know kummerer large larsen lasso lawrence learning lecture ledge letters level lewis logan logistic loudness machine machines maddage madigan maps marburg marsyas mateo mathematics matlab mcgee mckinney measurement measures media meng merkl method methods metrics mierswa minimal mining model modeling modelling moeller moore morgan morik multimedia multimodal multiple music musical musicminer negative networks neural nonlinear notes ogihara optimization organised organization organizing pachet page pages pampalk paper pattern pauws perception perceptual philipps platt polyphonic prediction prentice press proc processing programs quinlan rabiner rand rauber rchen recognition references regression report research results retrieval review revision ridge rittho roli royal salomon same sandler scale scholkopf schreiber sciences section selection self semantic sequential series serra sethi shao short shrinkage sigir signal signals simac similarity smola snoek society songs sound space speech springer stacked stamm state statistical statistics stenzel stevens strength study summation support surprising survey symposium systems takens technical technologies temporal text that theory thies tibshirani timbre time toolbox tools topograpical track training transaction tsap turbulencs tutorial tzanetakis ultsch university using vector video vignoli viii visualization visualizing volume west widmer with wolpert workshop worring wurst yale young zhang zils zwicker zwickers http://doi.acm.org/10.1145/1150402.1150472 66 Polynomial 
Association Rules with Applications to Logistic Regression aaai accuracy achieved additive after agrawal agresti allow analysis analyzing annual application arti association associations august austria based brazil canada cantly case categorical cation chicago cial classi comparison computational computing conclusions conference convergence core cpar data databases decision decreases dept development discovering discovery discretization duan dynamics dzeroski educational empirical environment equations erger erimental ernan etween european evaluation foundation frank frequent freund freyb friedman future generalization generalizing goethals guide hastie have http icml imielinski improve intel interaction international introduced introduction items itemsets john jose kaufmann keerthi kind know kumar language large learning ledge ligence ligent line logistic logs maceio machine manuscript method methods mining models montreal morgan multiclass multiple national nonlinear notion numerical olynomial oosting outcomes owicz pages pattern practical predictive proc project quantitative references regression relational relationships ruiz rules sampling schapire scholz search seaside seattle seen sequential sets siam sigmod signi sons srikant stanford statistical statistics steinbach student study subgroup supp survey swami systems tables team technical techniques that theoretic theory tibshirani todorovski tools transfer tting tutor tutoring undiscretized university using vienna view which wiley witten work workshop xiong york zemb zytkow http://doi.acm.org/10.1145/1150402.1150491 85 MONIC - Modeling and Monitoring Cluster Transitions accummulating adbis adjacent adjoint advances again aggarwal algorithm angra applied asso associated association bakiras baron bartolini based between borgelt brazil change changes characteristics chicago ciaccia ciated cient classes cluster clustering clusterings clusters collections comparing complex computation conclusion conf conference 
contains cuments currently data database databases dataset detected detection diagnosis digital disapp disappearances discovering discovery discussed document does during dynamics each earances early east ecml edition eing eled elled elow emerging encompasses enhancements enriching environments erating erent erger erience eriments eriods etween european event evolutionary evolving exceed exible explain exploration figure finite framework frank fransisco from ganti gehrke generalized germany give greece gunther have hence heuristic history ieee image includes information insights interaction italy kalnis kaufmann knowledge later learning least less library lifetime lncs machine mamoulis many matrix measuring mehta mining mixture model monic monitoring morgan moringa moving ntoutsi numb number nurnb oint oints only opulation oral other oulou outlook over overhead pages parthasarathy patella patterns pennsylvania periods philadelphia pisa pkdd practical presented press principles proc proceedings ramakrishnan reduce references reis remarkable retrieval rules schult scienti seattle section sept several shift shown sigact sigart sigmod signalled simple size some spatial spatio spiliop split splits springer sstd starting streams subtopics such summary symposium systems techniques temp temporal term text than that their theme theodoridis there thessaloniki they this timep timepoints tkde tools topic topics tracking transition transitions trends unlab until uploaded using values washington weights which with witten work workshop workshops yamanichi yang year zhai http://doi.acm.org/10.1145/1150402.1150483 77 A New Multi-View Regression Approach with an Application to Customer Wallet Estimation additional advances algorithm also among analysis analyzing applicability application applied approach area arising assume assumes assumptions available bartholomew based bayesian bickel bishop blum cant cation causal certain cikm class classi coem combining compatibility 
computational conclusion conditional conforms consider consistency constraints converge core coupled customer dasgupta data deals dempster desired directly disagreement discriminative distributions ecial ecml ectiveness eick eing eled embased endence endent erence erent ervised estimation etween existing explicit explicitly exploiting factor features financial focus from garland gaussian generalization ghani graphical heckerman hinton idan idea incomplete incremental indep inference information instead interest intl into involve involves journal justi knott laird large latent lawrence learn learned learning least lifetime likelihood linear littman make marketing maximizing maximum mcallester meets merugu method methodology microsoft minimize minimizing mining mitchell mixture modeling models most much multi multiple neal network networks neumann neural nigam nips observed obtain onential osed other ounds oxford pages parametric particular perlich planning prediction predictions predictors press probabilistic problem proc proceedings processing prop provide rate recently reduces references regression relations research retention role rosset royal rubin satisfy sche semi series services sets setting settings share shown signi similar simpli single society sparse spirit squares statistical structural structure subsets such systems tability target task technical techniques term terms that then theory there these this those training transform tutorial university unlab unobserved unsup using usually value variable variants vatnik view views wallet watson weisb weiss which while wiley with workshop zadrozny http://doi.acm.org/10.1145/1150402.1150506 100 Coherent Closed Quasi-Clique Discovery from Large Dense Graph Databases across agrawal algorithm algorithms association bioinformatics biological boginski butenko clan cliques closed coherent connectivity constraints cross databases dense discovery economic editor edward elgar fast financial from functional graph graphs 
hang icde innovations jiang large market massive mining nagurney network networks pardalos properties publishers quasi references relational rules sigkdd srikant structural subgraphs suppl vldb wang with zeng zhang zhou http://doi.acm.org/10.1145/1150402.1150464 58 Model Compression able acapulco acquisition advances airborne algorithm ames analysis applied approach approximate arti available averaging aviris bagging base bayes bayesian blake bonn breiman buntine called caruana cation center chapman chettri cial classi compact complex compression computational conclusions conf conference constructing containing craven crew cromp currently data databases dels dempster density distribution diverse domingos duction editors eighth eled eling ensemble ensembles erformance erforming estimation etter examples excellent extracting extremely fast forests francisco frank from generalization generating geoscience germany gualtieri hall hasselmo have highest huebner hundreds icml ijcai implementations incomplete infeasible information international intro java joachims johnson journal kaufmann kernel knowledge korb ksikes laird large learning letters level libraries likeliho limited little loss loyd mache machine making martinez maximum melville memory merz method methods mexico mimic mining mizil model models morgan mozer msri multiple munge naive nasa needed network networks neural newly niculescu nonlinear oney ository osting over overview ower pages partitioning practical predictions predictors present press probability problem proc proceedings processing pseudo random real recursive references representations research royal rubin scale schapire schmalzl selection sets shavlik silverman similar simulator slow snns society some sommer stacked statistical statistics structured stuttgart supp systems target technical techniques test that them then these thousand time tools touretzky train trained training tree tting university unlab using vector volume when where with witten 
wolp works workshop zell zeng http://doi.acm.org/10.1145/1150402.1150466 60 Single-Pass Online Learning: Performance, Voting Schemes and Online Feature Selection abound acknowledgement acts advanced against agency aggressive airoldi aistats algorithm algorithms amherst analysis annual anti aspects attributes author available averaging balanced based batch bayesian been bekkerman benchmark better blum brain business calendar carnegie carvalho case categorization cation center challenges chang cient ciir cjlin classi classify cohen comparable comparative comparison computer conclusions conference considerably contagion corpora crammer csie csna dagan darpa data dataset defense dekel delta department detection document domain driven electronic email emnlp empirical enron eriments error evaluated even experiments expressed extensive extract feature features fienberg figure finally folders forman fraudulent frequent freund from gain have hill http huang icml ieee ijcai improves information institute intent interface interior into irrelevant jects jority journal karov language large learners learning lewis library libsvm linear lines littlestone long machine machines mails malin mansour margin massachusetts material maximum mccallum mcgraw meetings mellon method methods metrics mining minkov mistake mitchell model models modi moviereviews national naturally ndings necessarily nips nonnlp notes number online opinions organization other over pages pang pass passive past pedersen perceptron performance performs privacy probabilistic proc proceedings proposed protect psychological quickly rate recommendations references relaxed reply report requests research results retrieval review ringuette romma rosenblatt roth safety schapire scheduling scheme sdair security selected selection sentiment setting several shalev showed shwartz signature simple singer single site society software sometimes spam speech square statistic storage study such suited support supported surprisingly 
symposium tasks technical techniques technologies terms test text than that this those threshold thumbs tomasic traditional training tting understand university update upon using vaithyanathan vector vegas views vitor volume voting weighted well when winnow with work workshop yang http://doi.acm.org/10.1145/1150402.1150500 94 Incremental Approximate Matrix Factorization for Speeding up Support Vector Machines aizerman august automation bach bagging boyd braverman breiman cambridge cient classi conference control convex cortes courant december decomposition ferris fine foundations francisco function fung hilbert interior international interscience jordan journal kernel learning machine machines mangasarian massive mathematical method methods munson networks optim optimization pattern physics point potential predictive predictors press proceedings proximal publishers rank recognition references remote representations research rozonoer scheinberg siam support theoretical training university using vapnik vector volume http://doi.acm.org/10.1145/1150402.1150413 8 Detecting Outliers using Transduction and Statistical Testing abraho active aelst album algorithm algorithmic algorithms almeida analysis angiulli antip applications approach architecture artists associative athos aviles barbar based bases basic benchmarks bentley binary breunig brodatz cation chapman characterization chatterjee chen cheung cient cluster clustering clusters communications complexity computational computer conf conference covariance cruz cure curves data databases datasets dekker deliverable dence density designers detection determinant dice dimensional discovering discovery distance dover dugue ective ectiveness edgington edition editors eiro elena elsevier engineering enhanced enhancing esprit ester estimator european evolutive filling fractal france friedland from gammerman geoinformatica guerin guha hall handbook hardin hawaii hawkins high honolulu html http identi identifying ieee images 
international intl introduction ject jensen john joint knorr know knowledge kolmogorov kriegel large learning ledge leutenegger lewis liao linear lling local london machine management marcel measures menasce methods minimum mining mlearn mlrep multi multidimensional multiple multivariate natural near nervesi networks neural neuralnets neville ninth noise nouretdinov numb olis order ository outlier outliers pakdd paper pattern patterns photographic pizzuti portland prediction probability proc proceedings proedru pruning publications ramaswamy randomization randomness rastogi recognition references relational research resolution robustness rocke rousseeuw rule sagan sander saunders schwabacher science search searching sets setting sheikholesami shekhar shim sigkdd sigmod similarity simple solka sophia space spatial springer statistical statistics system tang task technical tests textured textures theoretical theory time track transactions transductive trees tung used using vapnik verlag very vitanyi vovk washington wavecluster wechsler wegman wiley with workloads workshop york zhang http://doi.acm.org/10.1145/1150402.1150489 83 Automatic Mining of Fruit Fly Embryo Images academic also analysis annual anterior applications approach ashburner asso base based bdgp beaton best binary bioinformatics biol biology biomedical burges cation celniker chang ciates classi cluster clustering clusters comparing component computational conf content data database davidson decemb determination developing development developmental digital discovery displayed drosophila during each early edition editor eisen embryo embryogenesis embryos emden endoderm expression expressions feature femine figure from fruit gene genetics genome genomic gilb gurunathan hartenstein hill hits identifying ieee image images imaging impressive independent indexing intensity international invariant isbi january jayaraman john jolli karhunen know kumar kwan learning ledge lewis library libsvm long machine
machines marti mcgraw melanogaster mining mitchell molecular moment montalta mrna myers newfeld novel number original pages panchanathan pattern patterns peng poster press principal proc proceedings processed processing pyramid query recognition recomb references registration regulatory reichert representations research retrieval richards rinen rubin ruttimann seventh shown similar sinauer situ sons spatially spie springer stage stages subirana subpixel supp support symposium systematic systems tescher thvenaz tomancak total track transactions tutorial unser using vector versus volume weiszmann wiley with xxiv http://doi.acm.org/10.1145/1150402.1150467 61 Evolutionary Clustering acquisition advances aggarwal agglomerative allan analysis annual applications auer bases blei both broadcast bursts callaghan carb case cation cess change chapman chat chien chinese classi clustering clusterings clusters comparing computational computer concept conceptual conference correlation cument darpa data database ddington decomp decrease dels detection development dhillon dicities disjunction distance duda ective editors engine erio etween event evolving extending figure final fisher foundations framework frank from ghahramani guha gunopulos hall hart herbster hidden hierarchical history identifying ieee immorlica increases incremental information internal international interscience iterative jordan journal kaufmann keogh knowledge large learning line linear lkopf machine management markov means meek meila mining mishra morgan motwani nested neural news novelty onell online oral ositions over page pages parameter pattern performance pierce pilot plots poster practical predictor probabilistic proceedings processing quality queries references research restaurant retrieval retrosp saul science search semantic sequences series sigmod similarities similarity smoother smyth snapshot sparse stork streams study symposium systems techniques technology temp tenenbaum text than theory those 
thrun time tools topic track tracking transcription understanding using vagena variation versus very vlachos volume wang warmuth wide wiley with witten workshop world yamron yang zhang http://doi.acm.org/10.1145/1150402.1150442 37 Aggregating Time Partitions accumulation adaboost adaptive aggregating aggregation algorithm algorithms american analysis annual anomaly appear approximation arts august bagging based berlin biocomputing biology block blocks boosting boundaries bregman bursty chicago chromosome cient cluster clustering clusterings combining communications comparing complex computational computer computing conference constant constructing context curves data description detection devices discovery discrete distance distances distributions diversity dynamic earth engineering enhancing ensemble ensembles episodes estimating european event evidence features finding framework frequent genetics genome germany granularities haplotype haplotyping hierarchical high histograms human ictai ieee illinois image inconsistent information institute intelligence intensity international intrusion japan john journal jump knowledge label learning length limited line logistic machine massachusetts mcmc media method methods metric minimum mining mobile model models molecular mover multiple multivariate nature nding online overview paci partitioning partitions pattern patterns perception perfect phylogeny piecewise pnas pods predictors preferences principle probabilistic proceedings process program programming programs rank ranking rankings recognition recurrent reduction regression reliable research resolution retrieval reuse revealed reversible scanning science sciences secur security segmentation segmenting segments selection sequence sequences sequential series social sons sources stoc streams strength structure symposium syst systems technology temporal theory thesis ties time tokyo trans transactions unsing using vision wiley with 
http://doi.acm.org/10.1145/1150402.1150421 16 A General Framework for Accurate and Fast Regression by Data Summarization in Random Decision Trees accuracy additive american amit analysis applied approaches association based better breiman cambridge cation ciency classi computation conference data decision discovery drummey effective ensemble estimation explaining fokianos forests friedman geman generalized goodness greengrass hardle hastie huang icdm ieee indurkhya international journal kedem knowledge learning machine mccloskey mining model models neural nonparametric olshen pages parametric posterior press probabilities problems proceedings quantization random randomized recognition references regression rule science series shape solving statistical stone tests third tibshirani time tree trees university wadsworth wang weiss with http://doi.acm.org/10.1145/1150402.1150465 59 Classification Features for Attack Detection in Collaborative Recommender Systems against alto analysis annual architecture attack attacks august based bergstrom beyond bhaumik burke chen chicago chirita cient classi collab commerce computer computing conference cooperative cscw data decemb detecting detection diego ective edinburgh edition electronic enterprise figure finding focused formation francisco frank generation group grouplens houston hurley iacovou icdm identifying ieee ijcai information injection intel international internet item january joint june kaufmann knowledge kushmerick learning lecture ligent limited ltering machine mahony management mining mobasher models morgan neighb nejdl netnews next notes nuke omahony online orative ourhood pages palo personalization practical press preventing proc proceedings recall recommendation recommender references reidl resnick riedl robust robustness science scotland secure segment services shilling silvestre springer suchak supported system systems techniques technology tools transactions utility webkdd wide widm williams witten work 
workshop world york zabicki zeng http://doi.acm.org/10.1145/1150402.1150409 4 Learning to Rank Networked Entities agarwal anatomy anywanwu argonne authority balmin banks based benson bhalotia bound brin browsing chakrabarti chang chiba cohn complex conference constraint cortes creating customized databases editors engine faloutsos graph herbrich hristidis hulgeri hypertextual icde icdm icml ieee japan jectrank keyword laboratory large learning limited lists maduko mccallum memory method metric minimization mining model nakhe national nips page pages papakonstantinou queries rank ranking recursive references relationship report results scale search searching semantic semrank sheth siam sudarshan technical toronto using variable vldb workshop zhan http://doi.acm.org/10.1145/1150402.1150487 81 Statistical Entity-Topic Models abdul abdullah above academy adaptive advances algorithms allocation also america american analysis andrei annotated annual anwar appeared applications arafat ariel arti article articles aslan association author ayman babitsky barak based berezovsky binalshibh blei bombings boris brill brosnahan buntine businessman cambridge certain cial close cohn computer conditional conference connected connection connectivity content convicted coreference corrada correlated data department development dirichlet discovery document does domain edmond eduard eech ehud embassy emmanuel engine entity erosheva ership ertext erty espionage event exist explorations fienb figure finding frank goncalves google gorbachev gusinsky hage hamid hamza haouari have hazmi heather hezbollah hijazi hofmann hussein hutchins identity ieee igor indicates information integrating intel international islam islamiyah issue ivanov james jemaah john jordan jose journal karimov karzai khidhir khrushchev king know kuchma laden larry latent learning ledge lenin leonid ligence lindh link links lnai lofstrm lotfi machine madadhain maintenance management maskhadov massachusetts massoud mccallum 
memb mercer mikhail mining missing mixed model modeling models mohamed mokhtar moussaoui named national network networks neural never news nikita nity noun obvious ontology padilla pages pair part people perki perttu physician pope poroshin poster prediction press probabilistic proceedings processing professional publications qaeda raed raissi ramzi ranking recipient recognition referenced references research retrieval ritter role rosen russia russian sadat saleh scalable science sciences scienti scott search sergei sharon shas shevardnadze shirzai sigir sigkdd silander single smyth social some source special springer states steyvers syntax systems tagging technical tenenbaum tenth these this thompson threshold times tirri together topic topics track transformation tuominen tuulos uncertainty united university uren vladimir volume wadih walker wang wellner where while with yasser year zacarias zawahiri zogby zubaydah http://doi.acm.org/10.1145/1150402.1150424 19 Assessing Data Mining Results via Swap Randomization acknowledgments algorithm algorithmic algorithms also always amount apart apparently assessing availability besson boulicaut cance certain chain cient clearly clustering column columns computation conclusions conducted correlations data dataset datasets degree describ discoveries discussions distribution dramatic drop erences eriments error erties evidenced example extensive found fran frequency frequent from gave generate hand hastings have having helsinki hiit http immediately interesting jean keeping large laws less line little loop lots maintains margins markov method metropolis mining more moving numb only open order original ossible other ower pairs practice present problem problems prop question random randomization randomized reachable real references results retail robardet rows same second self sets several should show showed showing shows signi skewed slightly software some started states statistics strong structure studied study swap swaps 
technique than thank that these they this thus treatment used variable version versions very when whether while work yields http://doi.acm.org/10.1145/1150402.1150496 90 Suppressing Model Overﬁtting in Mining Concept-Drifting Data Streams able accuracy accurate advocates algorithm algorithms also analysis applications applied approximate area attention attracted average babu bagging baile barbara based bauer bias boat boosting boston boulos callaghan cation change changing chen china ciency cient classes classi classifying closed clustering cock combining comparison computation concept conclusion construction continually continuously cormode cult current data datar decision derive determine dimensional does domingos dong drifting ecause eech eliminating empirical encer ensemble ensembles erent erently eric eriments error euclidean evaluating exploiting explosive extensive fast figure focs forgotten formulate francisco frequent freund ganti gehrke generally graham granularity greenwald growth guha haixun harb hidden high hongkong however hulten icdm icml improve inappropriate including incorp incremental issues itemsets jiawei june kaufmann khanna kohavi large lawrence learning leverage load loadstar machine madison maintaining many markov milshra minimizing mining model modeling models moment morgan motawani motwani much multi multiple muntz muthukrishnan nick ones online oost oosting optimal optimistic optimization orating osed over pages pattern patterns peng philip pods predications predictions press previous proactive problem processing prop publishers quantile queries querying rabiner ramakrishnan rate reactive readings real recently recognition reducing references regression related require results richard samples sampling santa scale scans schapire scheme select selected series several shedding should siam sigkdd sigmod similarity skewed sliding solution space speech stochastic stream streaming streams street studies such sudipto summaries summarizing 
synopsis systems techniques that their then theoretical there they this thus time traditional trained training tree tutorial uence unlike using variants vldb voting wang wavelet when widom window windows wisconsin with work xiaochen xindong xingquan yang ying yoav yongseog http://doi.acm.org/10.1145/1150402.1150408 3 Deriving Quantitative Models for Correlation Clusters achtert aggarwal agrawal algorithm algorithms analyzing applications association automatic based biclustering bioinformatics biology brighton cheng church cient clustering clusters complete computing conference connected correlation data databases dempster density diego dimensional discovering discovery ester expression fast finding france from gehrke generalized georgii gunopulos high icdm incomplete intel international ismb jected jects journal kailing know kramer kriegel laird large ledge ligent likelihood local maximum microarray mining molecular noise paris park philadelphia portland preferences proceedings procopiuc quantitative raghavan richter robust royal rubin ruckert rules sander seattle series sigmod society space spatial statistical submitted subspace suppl systems using with wolf zimek http://doi.acm.org/10.1145/1150402.1150476 70 Structure and Evolution of Online Social Networks adamic adar advances albert algorithmic algorithms amenable analyses analysis analyzed analyzes applications arbitrary arrives asymptotic barabasi based believe biological blogspace bollobas broder bursty cacm cambridge cation characteristics collective combinatorics common communities complex components computer conclusions congress critical cyber debrecen decentralized degree densi diameter diameters different distributions dodds dorogovtsev dynamics each ects edge emergence emerging empirical engine erdos european evolution evolutionary evolving experience experimental explanations exploring fairly faithfully faloutsos faust feature fetterly flickr focs formula free from function geographic given global 
graph graphs internet intl intriguing jeong journal keeping kleinberg kumar labeled large laws leskovec liben maghoul manasse many mathematical mathematicians mathematics mechanics mendes methods model models modern molloy moments muhamad najork namely nature navigation nets network networks newman node novak nowell ntoulas number observations observed olston online over oxford pages paper particular perspective phenomenon physics pnas point popular possible postulated power practice precise press prevalence probabilistic proof properties publications qualitative quantitatively raghavan rajagopalan random reed references regular relationships renyi results review reviews riordan routing scale scaling science search sequence share show showed shrinking siam sigcomm simple simulation since sivakumar sized small social software stars stata statistical stoc stochastic strogatz structure structures studied study that their these this time tomkins topology track trawling university upfal very view wasserman watts what when wide wiener wiley with world yahoo http://doi.acm.org/10.1145/1150402.1150423 18 Quantifying Trends Accurately Despite Classifier Error and Class Imbalance about acceptance accurate accurately adjusting advanced alignment amounts analysis answer application applications apply area based been before benchmark biased boundary business call centroid challenging chang changes characteristics chemists chicago class classification classifier classifiers collections comes company compared compensate computation computer concept conclusion conclusions conf conference considerations considering constraints cost costs counting currently data dataset datasets decaestecker depend despite develop directions discovering discovery distribution distributions document down drift easily ecml edition effective effort empirical engineering estimates european evolutionary experienced experimental experimentation exploration extensive factors family fawcett feature finally 
find flach forman fortunate francisco frank freiburg from further future graphics graphs great greater have having havre hetzler hewlett hierarchies high highly hopefully human ideally ieee imbalance imbalanced important improvement inaccuracy inaccurate include including individual international introduces involve issues karypis kaufmann kernel kirshenbaum knowledge labels labor labs large last latinne lead learning least less logs machine made many median methods metrics million minimizing mining mistakes models more morgan most motivated multi naturally near neural never noise notes nowell nowhere obradovic ones only other others outputs over packard parts patterns performance philadelphia pkdd popular porto positives possibly practical pragmatic precise predict prediction principles prior priori probabilities proc procedure publishable pull pushing quantification quantify reduce references repeatability report require requires research researchers response results saerens seems selection sensitive show sigkdd simple since size small some strongly study studying subclass substantially such suermondt suit support surely surface surprise sweep tail talk task tech techniques technology techreports temporal text that thematic theme themeriver these they this though time ting tolerance tools toward train training trans transactions trending under underestimating various varying very visualization visualizing vucetic webb which whitney will wish with within witten word work would years yielding zhai http://doi.acm.org/10.1145/1150402.1150470 64 Recommendation Method for Extending Subscription Periods account aiming algorithm also although analysis applications approach architecture attributes automatically based basic bergstrom berry betz bfgs campaign categorization chan charged chen churn cleves cmucs collab combine combining commerce computation computational computer conclusions conference considering content continuous cooperative correlation customer cybernetics 
data date datta decision digital directions discovery dissatisfaction distance divided drew eager easy ecause ecome edition effect efore ehavior eick eing encourages encouraging endencies enhanced entropy eriments eriod eriods estimate estimated estimating estimation etween even evolutionary example extend extended feature features fifth finally first framework from fukuda further gaussian gould grimes grouplens gutierrez hand have hazards hiramatsu histories iacovou ieee improve improving industry informative interests into introduction items john johnson kaushansky know komoda konstan large learning ledge libraries lifetime lift limited lino little long ltering make management mani marketing masand math maximum measured memory method mining mobasher mobile model modeling models modules monthly mooney mozer must nding need netnews networks neumann neural nocedal novel novelty obtained oiso online only optimization orative ortant ortional osed other pages patterns piatetsky predicting prediction press prevent prior probability proceedings programming prolonging prop providing purchase purchased recommend recommendation recommendations recommending references relationship remain resnick results retention revised riedl rosenfeld rosset sales scale schafer second selection service services shapiro shono sigkdd since slightly smaller smoothing some stata statistics store stores subscrib subscription such suchak suggests supp supported survival switching system takada taking technical techniques telecommunications text than that their therefore this tool trans uence understand used useful user users using value vatnik which wiley wireless with without wolniewicz work zhou http://doi.acm.org/10.1145/1150402.1150422 17 ReverseTesting: An Efﬁcient Framework to Select Amongst Classiﬁers under Sample Selection Bias advances analysis annual anthony available bartlett bayesian bias both categorization cation characterizing classi clustering computational conference costs data 
davidson decisions dept dimension discovery econometrica edition elkan error evaluating framework from hastie heckman icdm ieee improved inference inferring information international knowledge label language learning little machine making mccallum mechanisms method minimisation mining missing modeling moore nature network neural newsgroups pages press probabilities proceedings processing references reject rennie report retrieval risk rosset rubin sample sampling selection semi sensitivity seventh shawe sigkdd smith speci springer statistical structural supervised systems taylor technical tenth text theory toolkit tutorial tutorials under unknown vapnik website when wiley williamson with zadrozny http://doi.acm.org/10.1145/1150402.1150412 7 Group Formation in Large Social Networks: Membership, Growth, and Evolution aaai aasiigirjcai acad adamic adar addison altruism amer american analysis annual appear approaches arxiv association authorlines availability baeza bagging bases basic behavior biological blogspace boorman borgs both breiman bursty burt buyukkokten cade cambridge capital cascade cation caught centola challenging chayes clustering coetzee coevolution coleman colt comm commerce communication communities community competition complex computer conclusions conf conference conferences considered contagion content continuous corn creation crowds crypto customers dagm data datasets decision decisions deerwester detecting different diffusion dill dimensional direction directions discovery dodds domingos dumais dynamic dynamics ecoop edition eguiluz elaboration electronic empirical even evolution evolve evolving explorations exploring eytan faust figure first flake focs formulate foundations frans free from fstcs furnas further future gary generalized genetics geographic giles girvan global granovetter graph grow hampton handcock harshman harvard have hawaii here hierarchical hoff holes holme hopcroft huberman human hybrid icde icml icpp identi ieee ifip ijcai 
indexing individuals induction infocom information innovations interesting international internet intl into ipps join journal june kempe khan kleinberg knowledge kossinets kulis kumar lada landauer large latent lawrence leads learning leskovec lett level levitt liben lics link linked machine macy mahdian march marketing markov math maximizing mccallum mccurley membership minimum mining model models modern monday moore more movements multiplex natl natural naturally network networks newman newsgroup newsgroups nips nonequilibrium novak nowell opinions organization organizations orkut over paper personal phase phys physica physics pills podc pods poison popl predictors press proc projection projections propagation questions quinlan raftery raghavan rajagopalan references research retrieval review ribeiro rich richardson rogers routing rtss saberi sarkar science search self selman semantic sharing sigcomm sigcse sigir sigkdd sigmetrics sigmod siguccs similarity sites sivakumar smith social sociology soda sosp soule spaa space special spread srds stacs statistical steve stoc strang streams strength structural structure subset supplement tardos tarjan their theory three through ties time tomkins topical topics track transition trees trends tsioutsiouliklis uence uisenix universal university usenix using valente value very viegas view viral vldb vlsib wang wasserman watts ways wdag weak wesley which with work yates year years http://doi.acm.org/10.1145/1150402.1150521 113 Mining for Proposal Reviewers: Lessons Learned at the National Science Foundation about abstracts acad access addresses against agents algorithm analysis annual application applications approach assigning assignment assistant australia author automated automatic automating autonomous banerjee based basu because benefits bennett best bibliometric bollacker bradley browsing carbonell chair challenge chinatsu citation citeseer clustering cohen college commun communication communications communities 
competitive comprehensibility conference consider constrained contribution could critical cyberchair data decided demiriz development digital dimensional ding discovery diversity document documents doesn dramatic dumais easily effective engine established evolving experiments fast feedback fifth filtering frequency friend full furnas geller giles goldstein gomez government griffiths hall hand have hierarchic high hill hirsh hopcroft hosh human hypersphere ijcai ijcnn impact implementation indeed independent indexing industrial information intelligent international introduction jcdl joint just kephart keywords khan knowledge kulis lack landauer large larsen lawrence learning less leveraging libraries linear linked longer machine mail mailcat main management mann manning manuscripts marchionini maryland mccallum mcgill mcgraw means measure measures melbourne microsoft might mimno mining model models modern natl networks neural nevill nielsen online only organizing paper papers park particularly people porter prentice press probabilistic problem proc proceedings processing produce producing program proposals prove queries recent recommending references related relevance reordering report reranking research residual retrieval revaide review reviewer reviewers reviewing rocchio salton scherl search seattle segal selected selection selman sensitive sigir sigkdd simple smart smyth stadt steyvers stockholm strategies stripping study submission submitted such suffix summaries suppl sweden system technical technology term terms text that third this time topic track tracking trends umiacs understood university used useful users using video vocabulary washington weight weights well willett with words work workshops worth http://doi.acm.org/10.1145/1150402.1150493 87 Mining Long-Term Search History to Improve Search Accuracy accuracy activites adaptive allan also although analysis applied automated based because believe bibliography bottom broder challenges cikm clearly 
clickthrough comes constructed context contributes cutoff data days demo difference different divergence dominant dumais editors effects effort engines especially feedback figure forum fresh from hatano history horvitz hour hours implicit important improvement increase inferring information interests joachims jones kaufmann kelly lafferty language lengths methods model modeling models more morgan most optimizing page pages personalized personalizing preference proceedings publishers queries readings recent recurring remote retrieval search sensitive sept shen sigir sigkdd smoothing sparck study sugiyama taxonomy teevan that time toolbar ucair user users using volume while willett within without yoshikawa zhai http://doi.acm.org/10.1145/1150402.1150439 34 Unsupervised Learning on K-partite Graphs aggarwal algorithms analysis approach approximation banerjee based bipartite bregman chan clustering collins comp conference copiuc cuments dasgupta data dhillon directional divergences ectral entropy factorization family fast generalizaionof generalized generative ghosh graph heuristic jected jones matrix maximum merugu nips onent onential pages park partitioning ppsc principal ratio reducing references reina schlag sigmod sparse using with wolf words zien http://doi.acm.org/10.1145/1150402.1150411 6 Global Distance-Based Segmentation of Trajectories abarbanel adaptive agrawal algomax algorithm analyses anomaly another appendix archive arnason array assigning assignment assignments assume athitsos backtrack based being boeing bounding brooks capture cardle case cases chakrabarti chan cient clade cohen collide collision complete complexity compute computes consider contains contradiction coordinates corresponding data databases decomposed describe detail detection devries dimensional dimensionality distance dynamic each eamonn edbt eddings editing elding entries entry environments evenwe every exact exactly examines faloutsos fast finally first flythru focus fodo found 
genomes give given gives good graphics gunopulos hadjieleftheriou have here higher hippopotamus http human hypothesis icdm icpr including increasing indexing induction interactive jectories jectory jects jepson keogh kersten know kollios large least locally london look mahoney maintains mamoulis mann manocha maraghi matching matrix maximum mcneely measures mehrotra mining mitochondrial model modeling most motion multi multiple nascimento need notice operations optimal order pages palpanas pazzani perform ponamgi proc produced programming prove query range ranges rasetic rectangle rectangles reduction references replicated rightmost rithm royal sander scale sclaro search second section segmentation sensitive sequence series show siggraph sigkdd sigmod similarity similarly since size society solution some space spatio spatiotemporal splitting step strongly supp support swami symposium system table temporal that theorem there therefore this those time tsdma tsotras tually typical ursing uses using valid value values vertically vlachos vldb volume want whale when where while with would zordan http://doi.acm.org/10.1145/1150402.1150486 80 Clustering Based Large Margin Classiﬁcation: A Scalable Approach using SOCP Formulation advances algebra algorithms annals applications applied approach available bennett bertsimas bhattacharyya birch boyd bredensteiner cambridge cation chang chebychev cient cjlin classi classifying clustering clusters cone cones conference connection convex csie data databases duality equations erghe fast feature formulation formulations functions geometry ghaoui handbook hierarchical http inequalities information input integral interior international joachims john jordan journal kernel lanckriet large learning lebret library libsvm linear livny london machine machines making marshall mathematical mathematics matlab mercer method methods minimal minimax missing moment multivariate negative nemirovskii nesterov neural nips nite numb olkin optimization 
order ositive over pages philosophical platt point practical press problems proceedings processing programming ramakrishnan references research robust royal scale second sedumi selection semide sequential sethuraman sets shivaswamy siam sigkdd sigmod smola society software sons statistical statistics studies sturm supp support symmetric systems their theory toolb training transactions uncertainty using vandenb vapnik vector very wiley with yang york zhang http://doi.acm.org/10.1145/1150402.1150410 5 Spatial Scan Statistics: Approximations and Performance Study acad agarwal algorithm alon approach approximate approximating april autonlab autonweb bump cambridge cant center chakrabarti clusters comm communication comp compl complexity comput computing conf conroy daniel data detecting detection dimensional disc discrepancy disease disjointness dwass enron epsilon erence fast feigenbaum fisher foun frequency friedman genes geom graphs haussler henzinger high html http hunting ieee info information jagopalan jayram kannan khot kulldor kumar kushilevitz lower marchette markers massive math matias maximizing meth metho mitchell moments multi multidimensional natl near neill nets neur nisan nonparametric optimal otheses ounds pages park party pereira phillips press prieb proc queries raghavan random randomization range rapid references reservoir resolution sabhnani sampling saul scan siam signi simplex sivakumar softw software space spatial stat statistic statistical statistics strauss stream streams susceptibility symp syst szegedy tests theory trans university venkatasubramanian viswanathan vitter welzl with yossef http://doi.acm.org/10.1145/1150402.1150444 39 Learning Sparse Metrics via Linear Programming advances algorithm alon analysis application approximate arti athitsos background baltimore barbados bennett blitzer blumer boostmap bradley california cardie cation chapman cial cient classi clustering cohen column comparisons component computations computer concave 
conf conference constrained data datasets demiriz dimensionality distance edding edition editor ehrenfeucht erger ervised faloutsos fast fastmap feature fifteenth framework francisco from fung generation geometric global golub hall haussler hopkins icml image indexing information institute intel international january joachims john jolli jordan journal kaufmann kernel knowledge kollios langford large learning letters ligence linear loan locally london machine machines mangasarian manifolds margin maryland math matlab matrix means method methods metric minimal minimization mining morgan multidimensional multimedia nearest neighb neural nite nonlinear novemb occam oosting optimization order orts package packer pages pattern press principal proc proceedings processing prog programming rankings razor recognition reduction references relative research rogers roweis russell saul scaling schapire schroedl schultz science sclaro sdpt selection semide shavlik shawe side sigmod silva similarity singer smola software springer statistics supp systems taylor tech technical tenenbaum tenth things todd traditional tutuncu university unsup vector verlag vision visualization wagsta warmuth weinb wisc wisconsin with workshop xing york http://doi.acm.org/10.1145/1150402.1150484 78 Efﬁcient Multidimensional Data Representations Based on Multiple Correspondence Analysis according added always analyse analysis applied around benzecri bring cant conference consideration constant corresp correspondence curve data dekker dense dition does drops dunold ecial economic edition employment esco etude evolution exploratoire fact factorielle figure france gain general handbook hardcover have homogeneity increasing institut january john josas jouy kimball lebart leroux less malinvaud marcel marketing means method morineau multidimensionnel naturally negative nevertheless note null ondence oscillating paris piron probl publication quality rather ratios references representation science should signi 
socio sometimes sons sparsity stabilit statistics statistique than that this toolkit trend trois universit value values variation warehouse when wiley with http://doi.acm.org/10.1145/1150402.1150416 11 Out-of-Core Frequent Pattern Mining on a Commodity PC address advances afopt agrawal algorithm algorithms anderson applicable approach approximate asso available based bases basket beyond blocking brin buehrer burdick cache calimlim candidate capabilities causal centers cessor cessors chase chen ciation cient ciently cieslewicz cisrc commo commodity computing conclusion conference conscious considered context core correlations data database databases demonstrate dern directly discovering discovery dity dong doyle ective elieve emerging empirically energy engineering episo erences erformance ermitting ethals etween event existing eyond fast fpgrowth frequent from gehrke generalizing generation ghoting gouda grahne growth hash have highly hosting icde icdm illustrate imielinski implementation implementations improved improving information international items itemset itemsets journal know large ledge leverage limitation literature locality mafia management managing mannila market maximal memory methodology mining motwani multhithreading navathe nguyen ogihara omiecinski oper optimizations oral osed other pages parallel park parthasarathy pattern patterns placement present presented previously proceedings processor prop references report resources ross rules savasere scalable scale scales scaling secondary sequences server sets shah shared show sigkdd sigmod sigops silverstein simultaneous sizes solution solutions sorting spatial srikant structures such swami syst systems tasks technical techniques temp thakar that this those today toivonen transactional trees trends truly ullman using vahdat verkamo very vldb volume well with without workshop xiao zaki zhou http://doi.acm.org/10.1145/1150402.1150443 38 Using Structure Indices for Efficient Approximation of Network 
Properties acad according actors adamic addition algorithm algorithmic algorithms among annotation annual appear appeared appendix applications approaches approximation april arithmetic artificial average backward bacon bang bartal based beacons been between betweenness bhatti brandes burn burning centrality chow citation citations cite clarification collective communications component computer computing conceptual conference connect connected connection construct contains contributors coordinates coppersmith coreference costarred crovella crowcroft data database dataset datasets decentralized degree degrees densification dependency depicted described diameters discovery discrete disparity distance distances distributed dodds drawn dynamics each edges elsewhere embedding encompassing encyclopedia errors estimation euclidean evaluated evolution examples experiments explanations fact faloutsos fast faster fewer figure filtered finally fire first floyd forest foundations free freeman friedman from furthermore generated getoor gibbons given goldberg graph graphs harrelson harris have heuristic homophily http huberman hungar identity ieee imdb index indices infocom inst intelligence international internet jensen joint journal jrnl kevin kleinberg knowledge koller krauthgamer laboratory landmarks largest lattice lawrence laws learning leskovec lighthouses lightly lightweight linked livermore location lukose many massive math mathematical matrix mccurley measurement meets meridian metric mining model models movie movies muliplication national nature navigating navigation neighbors nets network networks neville newman nineteenth node nodes number object oldid over palmer paper papers particular path paths peer performance period perspective pfeffer phenomenon physical pias portion possible power predicting principles probabilistic probabilities probability proceedings progressions proposed proximity publ puniyani random randomly ranging real references relational report 
representing research resulting retrieved review rewire rewired rexa rings scalable science scientific search second select service sets shavitt shortest shrinking siam sigcomm sigkdd simple simulation sirer slivkins small smallworld social societies sociology some space spaces specifying strogatz style subgraphs symbolic symposium synthetic systems tang tankel technical tested than that their then theory they this three thus time title together tomkins tool track triangulation type types typically ucrl undirected used using version virtual watts were wexler when which wikipedia wilbur winograd with without wong workshop world zhang http://doi.acm.org/10.1145/1150402.1150492 86 Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents abbreviations acquisition agichtein american among analysis annotation annual answering aseltine aslam automatic automatically based birthdate boosted brin buitelaar bunescu cafarella cali cation change cherkassky cient cimiano classi classify coling collections colt combining competitors complexity computational computer concept concepts conceptual conf context contextdriven corp corpus correct cpankow crystal data decatur deep dence dictionary dirt discovering discovery distribution documents domain dortmund downey driven ecai ecml edbt emnlp endency english entity erent erley ernym ervised escu estimation esws etzioni example examples exploiting extract extracting extraction famousbirthdates finn fisher formal framework free freitag from general generalized generate generating germany gimme googlec googlecomposers grammar graupmann gravano grishman hand headquarters hearst hovy http huttunen iccl icml ideal ifrim ijcai inducing induction inference information instanceof institute integrating interval iria joachims jurafsky kearns kernel knowitall knowledge krieger kurz kushmerick ladwig language large learning lehnert leila letters level linguistic link lker lncs lonely lonelyplanet machine machines 
maedche markup match metric mined mining montague mooney multi named names nancial natural needed networks neural nips nldb noise numb olejnik onto ontology onyms ortion other oundary ounds output pages pairs pantel papers parameters parsing path pattern patterns philosophy piskorski plain planck planet plug poster practical precision preliminary press probabilistic processed processing prop protege question ramaka ranlp ravichandran recall references regression relation relational relations relaxed report research results richard rilo rules saarbrucken sample scale schapire schmeier science search selected selection semantic semi sentences shaked shallow shortest simon sintek sleator snow snowbal snowball soderland staab standard statistical structured suchanek support surface synonymy syntactic system systems table tagging tapanainen technical technologies temp term text texttoonto theory thesis tolerant towards track universal university unsup untagged using vector weikum weld wide wikic wikicomposers wikigeneral wikigeography with workshop workshops world wrapp yale yangarb yates yunqian zhang http://doi.acm.org/10.1145/1150402.1150502 96 Discovering Interesting Patterns Through User’s Interactive Feedback able accepts active algorithm algorithms also analysis annual application applications approaches arti background based bases bayardo bayesian biased bishop cambridge cause cheng cial clickthough clustering cohen compressed comput conclusions conf conference constrained cristianini culty current data databases designed development discover discovery discrete discuss distance ected elief engines erformance eriment eriments erschatz ersonalized exploit explore extended fast feedback fienb fimi finally first formulations framework frequent from further gill goethals gonzalez hall holland icdm ieee implementations information intel interactive interacts intercluster interesting international introduces introduction itemset jain jaroszewicz joachims journal kernel 
know knowledge large learn learning ledge ligence linear machines makes many maximize maximum measure methods minimize mining model models moreover multivariate murray network optimization optimizing order osed other paegs pages particular passively pattern patterns prentice press prior problem proc prop ranking references relative research results retrieval review sample sampling saunders scale schapire sche search second seen select selective sequential sets setting several shawe shen show siam sigir silb singer snopt sparse sparsity strategies strategy structural study such summarization support system systems taylor tested that theoretical things this trans tuzhilin unexp univ user using various vector very vldb ways what where will with work works workshop zaki zhai http://doi.acm.org/10.1145/1150402.1150435 30 Workload-Aware Anonymization accuracy achieving acknowledgments additionally aggarwal agrawal algorithm algorithms also although always anonymity anonymization anonymizing anonymous approach arti association attribute attributes based bayardo because behaves belmont benjamin better blake bottom bravo breiman brought cation chakrab characterize chawla chen chung cial cient class classi comments common complexity conclusion condensation conditionally consisting constraints construction continuous control conversations corrada currently data database databases datasets davis decision described dewitt disclosure diversity domain domingo down draft dwork earlier ecialization ectiveness edbt edition engineering erent erformance erspective expect experimental expressed expressive extend extensive eyond fast fayyad feature feder fellowship ferrer finally form foundation framework francisco frank freidman from full fung future fuzziness ganti gehrke generalization generating generation ghosh grant granularity group handling haritsa hector histograms honavar icde icdm icdt icml identities ieee imielinski implementation implications incognito incorporating 
indicative individual information insightful intel interesting international irani iyengar jesse jjhala journal jude kaufmann kenthapadi kifer know large leads learning ledge lefevre levels ligence light machanava machine maintaining mateo mcsherry measures merz meyerson microaggregation microdata mievski mining models mondrian more morgan most motwani multidimensional national naturally olshen only opportunities optimal oriented orty ository panigrahy paper partially particular pods possible practical predicates prediction predictive preservation preserving privacy problem protecting protection provided providing quality queries rainforest ramakrishnan recoding rectangular references regions regression reiss release remain requirement research respect respondents results retained rizvi rule rules samarati sanz satisfy science section selection selections several shavlik should show sigkdd sigmod simple situations solution some sometimes speci srikant statistical steps stone studied study subject subsets supported supports suppression swami swapping sweeney systems tables talwar target taxonomies techniques than thanks that these this thomas through tools topic traditional trans transactions transforming tree trees ultimately uncertainty under used useful using utility validated valuable value valued variety varying venkitasubramaniam vldb volume wadsworth wang well when where which widely wild will williams with witten work working workload workloads zhang http://doi.acm.org/10.1145/1150402.1150533 124 Beyond Classification and Ranking: Constrained Optimization of the ROI acainternational advances application approximation april arti aspx association baldasare baluja bfgs budget burges caruana cial classi collection company conf constraints cost credit customer data decision descent discovery dodier ehavior elkan enhancing erformance evaluation factb fassion foundations framework fund future gradient http ieee industry information institute intel intl investment 
joint know large latest learning ledge lift ligence ligent limited machine mann march marketing mathematical medical meets memory method mining mitchell mozer multitask mutual neural nocedal optimization optimizing overview prediction present proc processing professionals programming rank rankprop references risk scale sensitive shaked sigkdd sort statistic stats systems telecommunications theoretic under using value whitney wilcoxon wolniewicz workshop york http://doi.acm.org/10.1145/1150402.1150462 56 A Framework for Analysis of Dynamic Social Networks acad african aizen alpine american analysis animal animals artif asso automated barab barabasi based bass baumes behaviour bernstein biosci blogspace breiger bursts bursty business cambridge carley cating cation cessing chapter chen cher chicago cial ciation cieties classifying clearwater comm communicating communication community complex complexity computational computer concurrency conf countermeasures croft cross culture cyber czkai data davis deadline deep delineation delling dels densi diameters discovering discovery disease disentangling distribution doreian draft dynamic dynamics ecker economics ectroscopy ects editor editors electronic email enron erlbaum erman eubank evolution example explanations extracted faloutsos feedback finding forthcoming francisco freeman from function fusion gardner garofalakis gehrke getz goldb granovetter graphs group groups guclu guppy heavy hidden high hill hillsdale hudson human huttenlo ideas identifying ieee impact individual infectious inform information intel internet intl ismail jaccard james joint jossey journal kaufman kemp kiesler kleinb knowledge kolata krause kretzschmar kumar lawrence laws lecture leskovec line lloyd london lusseau magdon mahwah mail management marathe markov material math maximizing measures mechanics meta mining modeling morgan morris multi natl nature network networks newman news notes novak optimizing oral organization organizations origin 
origins ossible otential outbreaks over pages papadimitriou patterns pattison perlich phenomena phys phytologist play poecilia press prietula proc propagation provost raghavan rastogi realistic references relational reticulata review role royal schreib science security series servan shrinking siam sieb sigkdd small smith social society sociology south southern spread springer srinivasan ssion statistical stream streams strength structure suppl supplement surveillance symp tails tardos tech technology telecommunication temp that their theory through ties time times tomkins toro trans trends tyler uence university unlikely urban using virtually viruses wallace wang wash washington wasserman weak wellman wilkinson within women workshop world york zone http://doi.acm.org/10.1145/1150402.1150475 69 Algorithms for Storytelling aaai able adler aggarwal algorithm algorithms alternating approach architecture around arti axiomatization babaria based basket biological cached cartwheels case chains challenge cial cient clusterings colt comparing complementary creech data datasets demonstrated design discovery discussion documents entropy explorations finding getoor graham guha have helm icml ieee indexing information intel interactive jair jaroszewicz jensen joins journal june kirpal knowledge kuchinsky kumar large learned learning lessons letin ligence link literatures machine manolop market meila method mining mishra moore nanop narrative neville ning organization orting otential oulos outputs overlaps pages parida partition potts predicates proc ramakrishnan redescription redescriptions references relational sarawagi scalability scienti search showcased siggroup sigkdd sigmod similarity simovici sivakumar smalheiser software statistics stimulus stories storytelling structure study sundaram supp swanson system theory this tool transactions turning unweaving using variation vldb with wolf word workshop http://doi.acm.org/10.1145/1150402.1150454 49 Supervised Probabilistic 
Principal Component Analysis account achieve after algorithm algorithms allows also american analysis annual anova appendix applied applying approach approximate bach bair bartlett basics basis bayesian benchmark between beyond biometrika bishop blocks building categorization cation centered cient classi clearly collecting collection column columns component components computaion conclusion conference corresponding cves data deerwester dempster denote denotes derivation derived dimensional dimensionality discriminant discussion done dual duda dumais each easily eigenvalue elements embedding empirical entry equation esearch evgeniou extend extraction family feature focus following form friedman from fukumizu function functions furnas gaussian generalized ghahramani give gives good harshman hart hastie have here hilbert honour hotelling icml incomplete indexing inference information informed inner innerproduct input institute interesting international into iterative janardan john jolli jordan journal kernel kernels label laird landauer large latent leads learning least length lewis likelihood linear lkopf locally machine mapping mappings massachusetts mathematics matrices matrix maximum miller minka mixtures model modeling models modifying multi nding need neural nice nition nonlinear notation obtain obtained obtains only pair paper papers park partial pattern paul performance performing perspectives plugging point points pontil ppca prediction press principal probabilistic probability problem proceedings product proof property proposed recovers reduction references regularized related relations report reproducing research results review rewrite rewritten rose roweis royal rubin satis saul scales science scoiety section seen semantic semi series sets show sigir sigkdd similarly simpli since singular sketch smola society soft solve solving some space spaces sppca springer squares statisitical statistical statistics step stork supervised take task technical technology 
terms test text that them then theorem therefore thesis this tibshirani tipping track transposing trick uncorrelated unifying unlabeled update updates using value variables variances vector veri verlag volker well where which wiley with wold working written yang yields http://doi.acm.org/10.1145/1150402.1150520 112 Pragmatic Text Mining: Minimizing Human Effort to Quantify Many Issues in Call Logs accurately active american analysis annual answer application applications banerjee banff based basu beil berkeley berlin california categorization changes chicago class classification classifier classifiers clustering cohen collections commercial comparison computer conf conference counting cross current data databases deerwester despite development discovering discovery distributions diverse document domainspecific dumais ecml empirical ensembles error ester european evolutionary exploration extensive fawcett feature features flach forman forthcoming frequent from furnas ghosh given graphics harshman havre hetzler icip icml ieee image imbalance inaccurate indexing information international joachims journal keys kirshenbaum knowledge krumpelman labs landauer large latent learning lingual little machine machines macqueen many mathematical melville methods metrics mining model mooney multilabel multivariate nowell observations overlapping patterns performance philadelphia pisa pkdd porto positives practical practice predict press principles probability proc processing quantifying references relevant research resource response retrieval rogati science second selection semantic sheffield sigir sigkdd society software some state statistics stinger study success suermondt sung support symposium tech temporal term text thearling thematic theme themeriver thoughts ting topic training transactions trends under univ varying vector visualization visualizing wang webb whitney with workshop yang york zhai http://doi.acm.org/10.1145/1150402.1150507 101 Mining Progressive Confident 
Rules agrawal association between cient cikm computer constraints databases dent dept edbt enumeration expression frequent garofalakis generalizations growth icde ieee imielinski improvements items june large mining mortazavi national pattern patterns performance progressive projected rastogi references regular report rules science sequences sequential sets shim sigmod singapore spirit srikant swami technical university vldb wang with xspan zaki zhang http://doi.acm.org/10.1145/1150402.1150432 27 Measuring and Extracting Proximity in Networks albert alenex algorithm algorithms america arrangement arxiv aspects association august balanced barabasi barbra based berry between bhattacharya bibliography bollobas boston brandes bureau captures cation census centrality cfec christo cient cikm clustering cohen communities community company completeness computer computers conference connecting connection convey cormen current cuts data database dblp directed discovery division doris doyle drawing electrical elizabeth emergence engineering entity experimentation faloutsos fast figure finding flake fleischer frank freeman from garey getoor gibson giles good graph graphs guide hadjiconstantinou halle hershberger hill hope http human hypermedia hypertext ibaraki identi ijcai imdb implementation incremental inferring interaction international internet intractability introduction jackson john johnson june katoh kidman kleinberg knowledge labs lang lawrence learning leiserson liben link linkage lncs madonna math mathematical maxsiz mccurley mcgraw measures method michael mine mining minsiz models modern movie multi nding nearly networks nicole nowell october pages paper paths pennsylvania pittsburgh popescul power prediction press problem problems proc proceedings proximity raghavan random record relational report research resolution rivest scaling science shortest showing siam sigkdd simple sinatra sixth snell social springer stacs state statistical streisand subgraph subgraphs 
suri symp taylor technical them theoretical theory tomkins topology track transactions trier type ungar verlag walks winkler with workshop yahoo york http://doi.acm.org/10.1145/1150402.1150505 99 Integration of Semantic-based Bipartite Graph Representation and Mutual Refinement Strategy for Biomedical Literature Clustering accepted acknowledgements adaptive aggarwal alberta algorithms analysis andrade aone applications approach association based beil biocomput bioinformatics biomedical bork bottleneck browsing buckley butte buttersworth california cambridge canada career chapel church classifying cluster clustering clusters cluto collections combining comparative comparison comprehensive computer concept conference connections conrad criterion critical cutting data databases definition demonstration department dept diego different digital dimensional discovering discovery diseases document documents download edition edmonton effective engineering enriched entropy erlbaum ester etzioni experiments expression extending extraction factorization fano fast feasibility feature fifteenth free frequent frias from functional functions garcia gather gene generative generic genes genet genetically genomic ghosh gong grant graph handbook hanks health hearst hierarchical hierarchically high hill hristovski html http human hypothesis icml ieee inferred information inherited intelligent international inverted iratxeta jenssen joint journal july june karger karypis keyphrase knowledge kohane koller kumar large larsen lawrence learning lewit lexicography libraries library linear literature london machine management mdeline measure measurements medicine medinfo medline meeting meta method methods mining minnesota models mutual network networks norms novel online ontologies ontology optimization pairwise park part pedersen perez press principle proceedings processing procopiuc projected rank recent reexamining references reinforcement relationships relevance report representation 
research results retrieval review rijsbergen rule sahami scalable scale scatter science searches sentence sept siam sigir sigkdd sigmod similarity slonim society song steinbach study summarization supported supporting symp system systems tang technical techniques term text theoretic theory this throughput time tishby transmission trends tukey umls university users using vector very willett wolf word words work workshop wren zamir zeng zhao zhong http://doi.acm.org/10.1145/1150402.1150446 41 Acclimatizing Taxonomic Semantics for Hierarchical Content Classification aggarwal agrawal america approach based building byron categorization cation chakrabarti charu chien chuang cikm classi clustering databases document ervised feature feng gates generating generation hierarchical hierarchy hofmann http into journal large lijuan lung machines merits online organizing pages philip prabhakar practical raghavan rakesh references scalable segments selection shui signature soumen stephen supp systems taxonomies text thomas topic vector vldb with http://doi.acm.org/10.1145/1150402.1150499 93 (α, k)-Anonymity: An Enhanced k-Anonymity Model for Privacy-Preserving Data Publishing achieving aggarwal agrawal american anonymity anonymizing appear association based bertino beyond blake bottom cation chakraborty cient classi completeness complexity computing conference control dasseni data databases dewitt disclosure diversity domain down edge elmagarmid engineering feder full fung fuzziness gehrke generalization hiding holyer html http icde icdm icdt identities ieee incognito information international journal kenthapadi kifer knowldege knowledge learning lefevre machanavajjhala machine management merz methodology meyerson microdata mining mlearn mlrepository motwani optimal pages panigrahy partition pods preservation preserving press privacy problems proc protecting protection ramakrishnan references release repository respondents rule samarati saygin siam sigmod solution some specialization 
srikant statistical suppression sweeney systems tables template thomas transactions uncertainty using verykios wang williams http://doi.acm.org/10.1145/1150402.1150530 121 Maximum Profit Mining and Its Application in Software Development advances analytics bagging basili berry boehm book breiman bruckhaus business california challenges chapter chulani computer computing customer data davidson defect discovery economics editors engineering escalation forthcoming group hall hershey idea impact introduction john knowledge learning ling linoff list machine madhavji marketing mining modeling models practice prediction predictive predictors prentice publishing real realities reduction references sales science series sheng software sons step support symposium techniques technology wiley with workshop world http://doi.acm.org/10.1145/1150402.1150455 50 Extracting Key-Substring-Group Features for Text Classification acquisition adaptive addison aizawa algorithmica algorithms american amnesia analysis annual applications applied approach apweb arrays asia association assumption attribution augmenting australia author authorship automata automated automatic baeza bangalore based bayes behaviour bejerano bell bender benjamins berkeley bilingual biocomputing bioinformatics biology boostexter boosting bratko bray brazil butterworths callan cambridge carnegie categorisation categorization cation chapter chemnitz chen chien china chinese church cient ciss class classi classify cleary cluseq clustering colton comparative comparison compostela compression computational compute computer computing conference construction corpus cristianini croft data decades dels detection development directions discriminative document dong dynamic ecacl ecir ecml ective edition editors email engineering english erty eskin estimation etzioni european examination extended extraction fakotakis farach fast feature features federalist filipi fine finland forsyth forty foundations frequency from frontier 
genre germany goodman grouper hall hangzhou harper hellerstein herbrich here hierarchical hill holmes howard human icde icml ieee improve independence india informatics information intel interface international introduction jackson japan jebara ject jersey jmlr joachims john journal jurafsky kernel kernels kessler kluwer knowledge knuth kokkinakis kondor language latin learning least lebanon length leslie levene lewis ligence lihue line linguistic linguistics literary lodhi london lossless ltering machine machines madrid mahoui manifold manifolds manning many markov martin marton matching mcgraw meeting melb melbourne mellon memory methods microsoft minimization mining mirkin mitchell model modeling models modern moulinier msri multi multiclass multinomial myaeng naive nashville natural nature neto networks neural nlprs noble nonlinear nunberg online orleans ourne overview paci pages pampapathi paper papoulis pattern pedersen peng performance phrasal ponte power prentice press pricai princeton principle probabilistic probability problem proceedings processes processing product programming progress protein publishing query random recognition references relevant report research results retrieval revealing revisited ribeiro rijsbergen risk rosenfeld salvador santiago saunders schapire scholkopf schtze schutze schuurmans science sciences search sebastiani selection semantic sequence sequences shawe sigir singer slonim smola smoothing snowbird spain spam spectrum speech springer stamatatos statistical stochastic string strings study substrings support surveys sydney symposim symposium system systems tampere taylor teahan technical techniques term terms text theoretical theory thesis tishby tois tokyo track transactions tree trees tsuda ukkonen university using usion utah vapnik variable variables vector verlag version vert vishwanathan wang watkins wesley where with witten workshop yamamoto yang yates york zamir zhai zhang zipf http://doi.acm.org/10.1145/1150402.1150473 
67 CFI-Stream: Mining Closed Frequent Itemsets in Data Streams aaai adaptively agrawal algorithm algorithms almost also amount approximate april association august based because best between challenges chang changes charm chen chiu closed closet computing conf consume consumes corresponding counts data databases dataset datasets decreases defined dense directions discovery dramatically drops efficient engineering entire especially fast figure finding frequency frequent from future gateway generation giannella granularities guha histograms history hsiao ieee includes increases independently information infrequent intermediate international itemsets january journal july keeps knowledge koudas large less lucchese maintaining maintains manku maximum memory method minimum mining moment motwani much multiple muntz next nodes november number online orlando output over overall patterns perego performance proportional ratio real ream recent recently references remains rules runtime same science searching sensitive sept september shan shim shows siam sigkdd sigmod sliding slightly small smaller space srikant stores strategies stream streams support symposium takes terms than that theory these this threshold thresholds time track transactions tree unpromising usage user userspecified very wang when which window with workshop zaki zhou http://doi.acm.org/10.1145/1150402.1150437 32 Rule Interestingness Analysis Using OLAP Operations active adomavicius agrawal algorithms analysis analyze association autonomous azevedo background based bayardo bayesian bendat book browsing categorical cercone ceri chen cikm classification comput concepts contents convenient coordinates crystalclear data database databases decision design deviations discovered discovery dmkd dmql dong dynamics enhanced environment evaluation exception expectations exploring fast finding forthcoming framework frequent from general graph hamilton handling hellerstein hilderman hofmann hussain hyperlinks icdm 
identification ieee imielinski impressions improve information infovis integrating inter interactive interesting interestingness itemsets jaroszewicz jiang jorge journal july kamber kaufmann keim klemetinen knowledge koperski kumar lakshmanan language large learning like machine makes mannila matheus measure measurement measures microarray mine miner minimum mining model morgan mosaic most multiple nature neighborhood networks numbers operator ordering padmanabhan pakdd parallel patterns persol perspective piatesky pkdd plots pocas post procedures process processing product programs psaila pushing query querying quinlan random ranking references refinement relational relative reliable ronkainen rule rules ruleviz schaller science sets shapiro siebes sigmod silberschatz simovici srikant statistical support supports suzuki systems techniques terms test theory tirpak tkde toivonen trans tuzhilin unexpected unexpectedness usage useful user using vapnik verkamo very virmani visual visualization visualizing vldb wang what wiley wilhelm with workshop xiao zaiane zhao http://doi.acm.org/10.1145/1150402.1150510 104 Linear Prediction Models with Graph Regularization for Web-page Categorization aaai accuracy advantage algorithmic analysis ando applicable applied approaches appropriately argyriou arti automating based belkin better bioinformatics bousquet business case categorization cation ceder chakrabarti cial ciently classi cluster clustering cocit collective combining compared computed concept conclusion conference consistency construction convex cora correctly craven data design dipasquo direct directly document earlier ecial ectral elds eled enhanced equivalent ercentage erformance erimental erlinks ertext erty ervised escul evaluation existing expression extract features formed formulation freitag friedman from functions gallagher gasch gaussian gene getoor ghahramani global graph harmonic herbster icml improved improves incorrectly indyk inference information intel 
internet introduces invention issue iterative jaakkola jensen jmlr journal kernel knowledge koller laplacians leads learning ligence linear link literature local machine macskassy manifolds markov mccallum method methods mitchell models moreover national networked neville nigam nips niyogi novel oles olic only optimal optimization ortals osed pages partially poceedings pontil predicate predicted predictive probabilistic problem proceedings prop provost random recently references regularization regularized relational rennie research results retrieval rich riemannian schlkopf school segal semi seymore show shows sigkdd sigmod slattery solution standard statistical stern structure structures studied study submitted symb szummer table taskar tasks technical test text that theoretical therefore this toolkit ungar univariate university using versus walks webkb well weston where whizbang wide with workshop world yahoo yang york zhang zhou http://doi.acm.org/10.1145/1150402.1150441 36 Generating Semantic Annotations for Frequent Patterns with Context Analysis afrati agrawal american analysis approaches approximating association based baskets beyond brin collection conference correlations data databases deerwester deshpande discovery dumais eleventh engineering etween frequent furnas generalizing gionis harshman imieliski indexing information international items journal karypis know kuramochi landauer large latent ledge management mannila market mining motwani pages patterns proceedings references rules science semantic sequential sets sigkdd sigmod silverstein society srikant structure swami tenth http://doi.acm.org/10.1145/1150402.1150426 21 Learning the Unified Kernel Machines for Classification aaai accuracies active additional advanced advances algorithm alignment american analysis andd ando applications arti association bachrach bartlett based batch belkin bhattacharyya bousquet categorization cation chang chap chung cial classi clustering cohn colt committee committees 
comp comput computation computer conf conference consistency cristianini data datasets design discrete ectral edinburg eeding eigenvalue elds eled elissee elle emerging erent error errors ersonalized erty ervised estimation examples factorization fine four framework freund functions gaussian ghahramani ghaoui gilad global graph graphs harmonic hastie heart horvath icml image infonation information initial intel intl introduction ionosphere jaakkola jmlr john jordan kandola kernel kernels koller kondor kronecker lanckriet large learn learning lectures leveraging liere ligence linear literature lnai local logistic mach machine machines madison management manifolds markov mathematical matrix mccallum mean meir methods minimax mining mode models muller multimedia nato neural nips nite niyogi nonlinear nonparametric onent oosting optimal other pages paper paradigm partially performance press probability problem proc proceedings processing programming query rand random ratsch reduction references regression regularization represents research retrieval riemannian sampling scale schemes schlkopf scholkopf science sciences selective semi separation series seung shamir shawe shown siam size smola snowbird soceity sonar sons spectral standard statistical structures supervised supp survey suykens systems szummer table tadepalli taip target taylor technical technology text theor theory three through tishby tong toward track train trained transforms uklr university using usion vandewalle vapnik vector volume walks wang weiss weston wiley wine wisconsin with workshop york zhang zhou zien http://doi.acm.org/10.1145/1150402.1150471 65 Dynamic, Real-time Forecasting of Online Auctions via Functional Models about adaptive advanced allow also analysis apple approach arizona armonk arrivals auction auctions available average barista based beginning bene bidders both brokerage bryan business businesses buyers cambridge carroll change chicago cial cient clustering comment components 
conclusions conference correctly corresp could coupled curse curve dashed data decisions determinants develop discovery done double driven during dynamic dynamically dynamics earlier ebay ecommerce economic economics edition either empirical endogenous entry ercentage erence erent error especially estimating even exible expected features figure forecast forecasted forecasting forecasts forthcoming from function functional ghani given guarantees hand have help hortacsu house hundred ideas important information insights instance insurance insure international ipods italy item items itnow itself jank jari journal know last lead least ledge like line linear ling live lowest lucking make makes management mape maryland mean method methods microsoft minimum mining model modeling modern more natural nature offering onds onential online only open operates operations option order other othing over paper participate parties pennies pisa point popular potentially practice prasad precedes predicted predicting prediction premium press price prices proceedings provide purchase ramsay rand ranking reeves references regression reiley related report research reserve ruppert russo same scalable school score scoring section seems seller sellers selling semiparametric series services several sharpe shmueli sigkdd silverman similar simmons smith smoothing solid spline springer static statisitcs statistical statistics subsequently such suggest systems technical that their there this thousand time tting university used using varying velocity verlag very wand wang want ways which while winner with work workshop would xboxes york http://doi.acm.org/10.1145/1150402.1150485 79 Algorithms For Time Series Knowledge Mining activation afshar algorithms allen based beschreibung bide braunschweig california casas cation cient closed clospan cohen communications constraints conversion cybernetics data databases datasets dawak ddick dept discovering discovery discretization editors eine elucidating 
engineering english episo ersonal etween event events extracting fayyad fluent frequent from garriga generating german germany grammar grammatik gricean guimar hsiao hunter icde ieee information intel interacting international interpretable interpretation interval intervals irvine journal kandel kationsbasierte klein know knowledge komplexen large last learning ledge ligent linear maintaining mannila marburg maulik maxims metho midp mining multivariaten muscle mustern notes oint oney optimizing oral orders pages partial patterns philipps ppner proc rchen references reiter relationships schwalb sequences sequential series sripada structure summaries summarizing survey systems technical temp their thesis time tkde toivonen tran transactions ultsch university using uthurusamy verkamo vila villafane wang with zaki zeitreihen http://doi.acm.org/10.1145/1150402.1150480 74 Clustering Pair-wise Dissimilarity Data into Partially Ordered Sets accrue across algorithm algorithms american among analysis anders annotation application applications applied ashburner association azua ball based berkhin bertrand bioinformatics biological biology blake bodenreider botstein brass bron brown budanitsky butler cation cell cerevisiae chapter chem cherry classi cliques clustering clusters combinatorial commun complexity comprehensive comput computational computations computer computerh corrales correlation cruz cycle data davis diday discrete distance dolinski driven dswo duda dwight eisen ellman eppig erimental erkhinsurvey etween evaluation expressio expression finding five futcher gene genes genet goble graph guruceaga hall harris hart hierarchies hill hirst https hybidization identi ieee international introduction investigating issel issues jain janowitz jects john june karp kasarskis kerb leeuw leiden lewis lexical linguistics lord lyer martinez matese mathematics mato measures medical memphis method microaray mining model molecular multidimensional naacl north numb octob ontology 
orders ordinal oriented orting osch osium other overlapping pages pattern pittsburgh plenum podhorski prentice press problems proceedings public pyramidal pyramids reducibility references regulated relationship resources richardson ringwald rubin rubio sacccharomyces segura semantic sequence sevilla sherlock similarity software sons springer stevens stork supp survey symp systems tarver techniques theory tool transactions umdrive undirected verlag volume vphan wang weak wiley wordnet workshop yeast york zhang http://doi.acm.org/10.1145/1150402.1150433 28 Hierarchical Topic Segmentation of Websites agrawal analysing analysis anchor annotated applications arising aumueller banerjee bayesian behavior behavioral bharat blei broder cation chakrabarti cient classi cluster clustering collins communications comparison competitors craswell customers data databases dean denoyer dent dhillon diligenti disjoint distributions document documents ective enhanced ester exporting fagin fine fisher formulation from gallinari gathering general ghosh gibson gori guha hawking henzinger hidden hierarchical hosts html hyperlinks hypersphere hypertext icdar implementation index indyk information institute jagopalan jasis jmlr jordan kolaitis kriegel kumar large learning leeds link machine maggini management markov master mining mirrored mises model modeling models multi multivariate nding network networks newsgroups novak omega pages pods processing punera rand recovery references research roberston scale scarselli schubert semi sigir sigmod singer site sivakumar social solutions spot srikant structural structure structured studies suitable suppliers techniques thesis tishby tomkins tool tree unit university using visualizing vldb volume website wide world http://doi.acm.org/10.1145/1150402.1150450 45 Topics over Time: A Non-Markov Continuous-Time Model of Topical Trends above academy advances advantage after allocation analysis andrieu annual appears arti author authors balancing 
bayesian beal begin behavior belonging berkeley beta biased blei bounded burstiness bursty chain cial conditional conference conjugate constructing continuous conveniently corrada data density detailed dirichlet discovery dissemination distribution distributions documents double doucet drawn dynamic each elkan emanuel erosheva erty estimation exponential fienberg finding follows freitas from function generalized gibbs grams group hierarchical hydrology hyper indicate information integrals integrating intel international introduction jensen joint jordan journal kauchak kleinberg know koller kumaraswamy last latent learning ledge ligence machine madsen mccallum mcmc mdzdi mean membership method mining mixed model modeling models mohanty moments moore national network networks neural nips nodelman note nzdi obtain often pages parameter personal power practice predicting priors probability problem proceedings processes processing publications random references relations report research respectively role rosen rule sample sarkar sciences scienti section shelton sigkdd simplicity simplify since smyth social song space sparsity statistical statistics steyvers streams structure suppl swan symbols syntax systems take technical tenenbaum term text time timelines timemines timestamps topic topical topics tseng umass uncertainty update usage using variance wang where with word workshop http://doi.acm.org/10.1145/1150402.1150498 92 Semi-Supervised Time Series Classification about alcock algorithms amaral analysis andconquer annual answer application applications april architecture artificial atomic august based benchmarks blum breakeven categorization chambery chen circulation cirelo classification classifier classifiers cohen combining complex components computational computer conference conjunction cozman data database december demonstration discovery document documents dynamic efficient empirical european everything evolution existence experimental feature features filtering 
finding france from glass goldberger greenbelt ground handwritten hausdorff herle histograms historical http huang human icdm ieee image indexing intelligence interaction international ivanov joachims joint journal kasetty keogh kibler know knowledge knowledgebased labeled landford langley launch learning literature machine machines madison mafra management manmatha manolopoulos many mark match matching mccallum methods mietus mining mitchell moody most multi multiple nanopoulos need neto nigam oria pattern peng physiobank physiologic physionet physiotoolkit point precision proc proceedings progress pruning quan queries query ratanamahatana rath recall recent recognition references relevant report research resource rule scale science sciences scientific sdiut sebe segundo selftraining semi semisupervised separate sequential series session shape sigkdd signals statistical streaming subsequence supervised support survey symposium system systems technical technology temporal tenth text their theory third thrun time training transaction understanding university unlabeled unusual using vector vision warping wedgie wisconsin with word working workshop wrong http://doi.acm.org/10.1145/1150402.1150415 10 Efficient Anonymity-Preserving Data Collection able access achieves achieving actual actually adaptive additional advances advantage advantageous against aggarwal agrawal algorithms allow allows also annual anonymity anonymitypreserving anonymization anonymous another appendix applications applying argue asiacrypt asked associate associations assume attack attacker attacking attacks back based basic bayesian because bellare bits boneh boyen break broadcast called cambridge cannot carried cation cations caused chosen cient ciphertext ciphertexts classnotes clear clever clifton collected collection collude colludes collusion communication communications community comp compatible complete completely complexity computation computational computationally computer computes 
computing conclude conclusions concurrent conference connections consists constant context contrast convincing correct corresp corrupt cramer critical crypto cryptography cryptology cryptosystem cult current data database decrypt decrypting decryption decryptions dern describ description design designated determine deviating dingledine direct discovery dishonest distinct distributed documentation does duction during each easily easy ecause ectively efficiency eginning egshuf ehave elgamal elow enables encrypt encryption encryptions encrypts endix enough ensive entities equality erations erforms ermutation ermute ermuted ermutes erties erty etween even exactly exchange executing exit explain exploited explorations fact fails feasible finally fixing focs following follows forwards foundations four from function fuzziness game general generalization generate generates generation generator given goldreich goldschlag golle groth group guarantees have having heterogeneous homomor homomorphic honest honestly however html http identity ieee implementation implies impractical imprecise include includes incorrect indcca individual initial input insight instead institute integer integers intended international intro invalid involve iteration jakobsson journal juels kantarcioglou keys know knowing knowledge known knows large last leader leaders learn learns ledge likely lindell long longer makes malicious management mathewson mental message micali mihir miner mining missing misused mixing model modi modulo more moreover multiplicative network networks note numb number olls ondent ondents onding onent onion only onse onses optimistic order organized original originally ortant orted osed osition ossible otentially other ounded output pages pair pairs paper parallel parallelizable parallelized parameters partial participant participants party pass phase phases phism picked pinkas plaintext plaintexts play pods practical practice precisely present presented presents preserve 
preserving press prevent previous prime principles privacy private proc produce produces product products proof proofs prop protecting protection protocol provably prove prover proves provide provided providers proving public quanti quantify random randomized reason reed refer references relies rely required requires requiring rerandomization rerandomize rerandomized rerandomizes rerandomizing research resistance resistant resp restate results rmation rogaway round rounds router routing same scenario scheme science sciences second secret secrets section secure security sends sent sequentially server servers sets setting shoup show sics sigact sigart sigkdd sigmod signature signatures signing signs simplify simply since single size slight some srikant statement still stoc structure submission substitutes substituting subtle such suggested suppression swedish sweeney symposium systems syverson taking technical techniques tells terms than that their them themselves then theory there they this though three thus tools total track traditional transmission transmit transmits transmitted true turn ucsd uncertainty under unfortunately uniformly university unlike until usenix users using vaidya validity values veri version very vhti volume votehere waters were when where whether which while whom wigderson wikstr will with without works workshop would wright yang zero zeroknowledge zhan zhong http://doi.acm.org/10.1145/1150402.1150527 118 Mining Citizen Science Data to Predict Prevalence of Wild Bird Species aaai absence accurate algorithm algorithms almost also american amia anal analysis analyzed analyzing annals anova applied approach approaches approximation arti attribute available bagging based bauer berkeley breiman buntine california caruana cases cation certain challenging chapman chapter cial cient citizen classi collected compare comparison computed conclusions conf conference containing contrast correlated data decision demands department determined diagnostics 
dimensional directions duction earlier ecies ecologists ecology ected elissee empirical endent ensembles ensive environments erent erpg ervised escu etter evaluating even example fast faster fawcett feature features finding forests friedman frontiers function functional functions further furthermore future gain generalized generated giles goals gradient greedy growth guyon hall heavily heuristics high highly hill http icml identi identical imprecise include informatics inherent intel interesting intro issues ject jects john journal kira kohavi large learning lehmann ligence like limited linear machine make mccullagh mcgill mcgraw measure measuring medical metho methods mining mizil model models more much nelder niculescu node nonparametrics numb observation ointed oker once onse option ortance ortant oses osting outcome pairs physician plots practical practice practices predicting predictions predictive predictors presence presented probabilities probability problem proc produce provost pstdc psych quality random rankings ranks rapid rate references relies rendell research resp resulting results robust roughly rule scale science section selection sensitivity simms single size small split standard stanford statistical statistics structure study such technical techniques that these this those times together traditional training trees trend truly ultimately understanding university using variable variables variants very voting want where will with work wrapp http://doi.acm.org/10.1145/1150402.1150445 40 Beyond Streams and Graphs: Dynamic Tensor Analysis able academic adaptive agrawal alexandros algebra algorithm algorithms alternating amnon analysis anat anatomy applications approach approximations arun association authoritative bader based belgium between brett brin cacm carroll cation chang changing chris christos classi clustering clusterings clusters coding coherent comm component computer computing concurrent conditions cvpr data databases decomposition delivery 
deng dhillon dimensional ding discovery distributed domingos dong drineas dsvd dumais dyrby eccv eckart ective elsevier engelsen engine ensembles environment erences explanatory factor faloutsos filter flip focs foltz foundations generalization generalized george good graphs hall harshman haykin heung higher hisao hong http hulten hyperlinked hypertextual icdm identifying image images imielinski indexing individual indyk informatik information irregular items jiang jieping jimeng jolli joseph journal kannan kapteyn karypis katholieke kenny kleinberg kolda korn kotidis koudas kroonenberg kumar labrinidis large latent lathauwer laurie learning least leeuw leuven levin linear link lizhuang ltering machine mahoney mallela maps massive mathematical matrices means methods michael microarray mining mode model modha mohammed monitoring multi multidimensional multilevel multilinear multiple muthukrishnan neudecker nick nips niyogi notes number order page pages papadimitriou papers parafac paral partha partitioning pattern pedro personalized peter petros phonetics piotr pods prabhakar prentice press principal principle probabilistic procedure processes processing psychometrika quanti raghavan rakesh randomized rank ratio references regression report representative rules santosh scale scaling scheme science search semantic series sets shashua shuicheng shum sigmod signal singular sketches soda some sources spectral spencer spiros springer square streaming streams subspace subspaces susan swami tamaki tamara technical technology tensor tensorfaces terzopoulos theoretic theoretical theory thermal thesis three time tomasz trends tricluster trier tucker ucla university using value vasilescu vempala vetta viereck vipin vldb volume wansbeek working xiaofei yannis yeung young zaki zhang zhao zhengkai http://doi.acm.org/10.1145/1150402.1150447 42 Mining Distance-based Outliers from Large Databases in Any Metric Space accomplished achieved acknowledgements address aggarwal algorithm 
algorithms also another applications applied avoid bagging barnett based bounds breunig callaghan card catch cations cerg certain cient city cityu clustering commercial compared computation concerns conclusions continuously contours correlation cost council credit data database databases dataset datasets dbms density depth deriving detection developed dimensional direction directions distance easily edit edition effective euclidean exact exciting faloutsos fast feature finding from function future general gibbons goal government grant grants guha have high hksar hong icde identify identifying implemented indicates inequality integral intensional investigation itself jamming john johnson journal justi kitagawa knorr knowledge kong kriegel kumar kwok large lazarevic least lewis linear local loci long lower mining mishra most motwani must near never novel numbered objective objects olken operate order outlier outliers pages papadimitriou paper points probabilistic probability promising pruning ramaswamy random randomization rastogi rate received references regardless relational reports research results retrieved rotem rule sampling sander satis scan scanning scenario schwabacher sets several shim shuigeng sigkdd sigmod simple snif solid solution statistical streams strict strings subsequent supported system that theoretical this time traf transactions triangle tucakov tung twice types university using vldb were when where which wiley with work xiao xiaokui yufei zhou http://doi.acm.org/10.1145/1150402.1150427 22 Frequent Subgraph Mining in Outerplanar Graphs aaai addison advances agrawal algorithm algorithms annals annual applications approaches arti association background backtrack based biconnected bounds bringmann canonical chemical cial classifying comb comp complete completeness comput computer computers computing conf cook croft cubic cycles data depth description deshpande discovery discrete elled endent ergraph exive fast faster feder forms francisco freeman 
frequent from fundamenta garey generating graph graphs gspan guide harary hell holder homomorphisms horv ieee indep informaticae information inokuchi intel intractability isomorphism johnson journal karypis know knowledge kuramochi learning ledge length letters ligence linear lingas list listing machine mannila manuscript massachusetts mathematics matula maximal minimum mining mitchell motoda muntz networks nijssen ounds outerplanar overview papadimitriou paths pattern patterns planar press problem proc processing raedt read reading recognize references research rules science search sets shamir siam sixth society spanning srikant subgraph substructure subtree symposium syslo systems tarjan their theoretical theory time toivonen trans trees tsur unpublished using verkamo wale washio wesley wong yang yannakakis york http://doi.acm.org/10.1145/1150402.1150459 53 Outlier Detection by Active Learning active algorithms annu anomalies anomaly application approach apte arti august available bagging bars based bhattacharjee boosting breiman breunig cation chan chow cial cient classi clresults committee comput computer conference constrained contest cost costs curves data databases david decision density detect detection detectors deviation discovery distance distributions diverse elkan ensemble environments error false feature february fifteenth figure first framework freund from function generalization goldman html http hush icdm icpr identifying ieee images international intrusion intrusions iterations journal knorr know known kriegel kumar langford large lazarevic learning ledge levels light lindenbaum line local machine mamitsuka mammography management melville miller mining misclassi mooney multispectral network number oosting opper outlier outliers pages paradigm parzen pattern plotted positive poster predictors press proc proceedings query ramaswamy rastogi rate recognition references resampling research resource results sampling sander schapire sciences scovel sets 
seung seventeenth shim siam sigkdd sigmod sompolinsky spie standard steinwart stolfo strategies system teacher test theiler their theoretic theory track true ucsd unknown users using very vldb window with without workshop yeung york zadrozny http://doi.acm.org/10.1145/1150402.1150529 120 Discovering Significant OPSM Subspace Clusters in Massive Gene Expression Data accepted acids across agrawal analyses analysis applicable application applications approach ashburner assessment automatic available average barrett biased biclustering binding bioinformatics biologically biologists biology breast call cancer candidates cannot cant capture case cation cdna cell cheng church ciency clear cluster clustering clusters columns common complexity computational concept conclusions conditions consistent consisting consortium costs cvid data database datasets demonstrated dense dimensional discover discovering does edbt enumeration erent erform erformance eriments existing exploits explosive expressed expression expressions extensive factor fail figure focusing framework fraser free gene general generalizations genes genet genomics global gominer growth hedenfalk hereditary high highly human icde icdm identi immune improvements industrial integration integrative interested interpretation introduced ismb journal keeping kiwi large lead likely limited many massive matrixes meaningful methods microarray millions mined mining model most mouse multiple ncbi nejm networks nicely nucleic numb number numerous oligonucleotide only ontology opossum opsm opsms order ounded over pages parameters particularly pattern patterns pnas preserving problem promising publicly random real references report represented resp results rows runtime rymon sage scalability scale scales science search seeds sequential sets sharing sherlo show shown sigmod signi similarity simon sites sizes small space srikant stanford strength structure studies submatrix submatrixes subset subspace synthetic systematic 
technical tendency testing that this those through throughput tkde together transcription transcriptomes trend twig university used using variable varying wang well when which will with xspan yang zeeb http://doi.acm.org/10.1145/1150402.1150479 73 Sampling from Large Graphs adler agrawal airoldi algorithm algorithms answer application back best bias burning canada cantly cares carley cation chakrabarti chosen chrobak citation clique collective combination communication compressing compression computer conference constant contributions could creating criteria curial current data densi deviation diamaters diameter dimitrop direction domingos down dynamics earlier ectively edges edmonton eeding eled empirical ending enough erent erfect erforming erience erset erties erty estimate estimators etter evaluate ever explanations explor exploration explore exploring faloutsos fast faster feder fire focus following forest free from future gibb gilb give goal graph graphs help ieee inspired international internet ject journal kleinb krishnamurthy large larger laws leskovec less levchenko linkkdd list lists little long management mascots massive match mean means measure measurements methods mining mitzenmacher model models more motwani nature network networking networks node numb obtained ologies ology oregon original osed ossible oulos over ower pages palmer particular partitions patterns percus pnas precise probability prop provide pure random reach real realism realistic recognize recursive reducing references rejaie relations relationships results richardson riley route rules sample samples sampling scalable scale scaling sciences second seems selection semantic show shrinking sigcomm sigkdd signi simulations single small smaller social strogatz study stumpf stutzbach subnets suitable system table techniques than that their there this through time tool towards triads trust university usually variance variety vicinity views visualization visualizing volume watts weighted 
weights well which while will willinger with wiuf work world would zero zhan