http://www.informatik.uni-trier.de/~ley/db/conf/kdd/kdd2002.html KDD 2002 http://doi.acm.org/10.1145/775047.775126 62 SimRank: A Measure of Structural-Context Similarity addison algorithms american annual applying appropriate authoritative automating back baeza being berthier between bolyai bradley brian brin bringing california ceedings citation citeseer close collaborative colorado common communications computing conference context database david dbpubs december denver discrete distance documents domains douglas edges environment equation expected factors filtering following francisco glen goldberg google gordon graphs group grouplens have henry herlocker high html http human hyperlinked identical information inneighbors instead janum jennifer john jonathan joseph journal kleinberg konstan lawrence literature lovfisz maes maltz march massachusetts mathematical measure meeting miller modem motwani mouth necessary neighbors neto news nichols ninth nodes order page pagerank pages parameter proceedings rajeev random ranking reading references relationship report retrieval ribeiro ricardo riedl science scientific score sergey shardanand siam similarity simrank since small social society solution some source sources stanford step structural studies survey swapped symposium systems tapestry technical terms terry their theorem this thought thus traveling unique university upendra usenet using volume walks weave wesley widom winograd with word would written yates http://doi.acm.org/10.1145/775047.775125 61 A Model for Discovering Customer Value for E-Content access adaptive addison after agent agents algorithm algorithms allocation almeroth also amount analysis analytical analyzed application applications april arnold arrival artificial auctions august automatica available bakos balance bandwidth barbara based baseline basu been behavior believe best between breaks breslau burtonsville buyer buying caching california centric changed changes
channel chavez chicago communications comparison competitive concerned conclusions conference congestion considerations constraints content control corporation costs customer customers december delivering demand detailed developed digital discovering distributions down dynamic dynamics ecology economically economics economies econtentprc efficient effort electronic even every evidence evolution exchanges fall february figure finally first fixed fixedprice focus focussed framework frice future generate gibbens goldberg goods gorithn governing greenwaid gupta hartline have hofmann hour house however http human ieee ient ignoring illinois implications increase independent infocom intelligence intelligent interactive international internet intertrust introduction issues jagannathan joint kasbah kelly kephart knowledge known lack lead least lgorlthm like little lnfocom lnternet long mackie maes main management managing market marketplace marketplaces maryland mason mean models most motivated multi multimedia nayak network networks nificantly nmsl november observed october ondemand online operative optimal pages papers parameters pareto peformance performance phillips practical presence present presented presl price pricebots prices pricing primary principle proceedings prospects provider public publishing randomly rate reasons reducing references rejected related relative report request research resource resources results revenue revenues santa scale scarce scheme schemes science search selling server services shenker shopbots shown sigecom significant simple simulated simulation simulations sitaram situation sixteenth socially spaces sresults stahl star strategies strikes such suggests system systems technical technologies technology term tetzlaff that there this thrust time ucsb university unrealistic used using utilization value varian venkatesh version very video well wesley when where which whinston willingness with work workload wright zipf 
http://doi.acm.org/10.1145/775047.775124 60 Mining Complex Models from Arbitrarily Large Databases in Constant Time able abundant accelerating account active adaptive additional adjust advances algorithm algorithms allows american amounts analysis anderson anytime application applications applied approximation arbitrarily arbitrary artificial association atlas author available averaging awards based bayesian being better blue boston bounded bounds brodley cambridge candidate career challenge changing chickering classification climbing clustering cohn combination combining common complexity computer conclusion conf constant constraints constructing continuous controlling corin curve data database datasets decision department depend dependencies develop developed developing diego directions discovery discrete domingo domingos donors each efficient engineering essentially experiments explorations extending facilitate faculty faster finite ford form framework francisco frasca friedman from function functions fundamental further future gavalda geiger general generalization gift give greiner handle have heckerman high hill hoeffding hulten improving include includes increasingly incremental incrementally inequalities infinite information instance intelligence interleaved international into jensen journal kaufmann knowledge kohavi ladner large learn learned learning library local machine magnitude maintainers maron martini mason massive mateo maximization mechanisms meek memory method methods mining missing model models moore morgan motor nachman network networks neural node oates ones onion only orders organizers over palo paper partly pedr peeling performs portland present press previous probabilistic probability problems proc processing programming progressive proposes providing provost quality races random references report repository research results same sample sampling scaling scheffer science search seattle second selection semiautomatically sequential sets should 
show sigkdd size space spaces sparse speed spencer statistical stockholm streams structure sums support supported sweden systems taking technical than thank that thiesson this time timechanging training transforming tree uncertainty university used using utility values variables vfbn wald washington watanabe when while wiley williamstown with within work wrobel yakhini york zheng http://doi.acm.org/10.1145/775047.775115 51 CVS: A Correlation-Verification Based Smoothing Technique on Information Retrieval and Term Clustering aaai across addison allc american amsterdam analysis andi anmysis annual applied approach august austrailia australasian australia automatic baeza based bayes beijing berkeley between biometrika british brown buckley carpineto categorization categorizing center cikm classification coast collections colloquium combining common comparing comparison computing concept conference contribution corn corpora corpus correlated croft data december deriving development different document documents does dublin effective empirical english espoo estimation event evidence expansion experiments features finding finland fowler franz frequencies frequency from fuhr garside gauch global good group hashimoto hierarchical hierarchies hoashi hofland holland homogeneity hongkong honkela html http humanities iformation improved improving information inoue institute integrating international interpolated ireland irsg jelinek johausson july kaski kilgarny kilgarriff klas knowledge kohonen labeling labelsom lafferty lagus language large lawrie learning lexicm leximancer lists local louisiana machine management mandala mapping maps markov matsumoto maxlison mccalhtm measure measures melbourne mercer method methods mitra models modern mori multiple naive national natural neto nevada nigam nist north norway norwegian november number organizing orleans pages parameters pattern pool population practice press proceedings processing profiling pubs query rauber rayson recogition 
references relations representation research researchandstandards retrieval retuers reuters ribeiro romano rose rosenberg roukos sanderson scoring selection self semantic september seventh sigir similarity singhal sixth smith smoothing source sparse species springer square standards study summarization sunshine symposium tanaka technology term text thesaurus through tokunaga topic trec trect tuwien types using vegas very visual voorhees wang websom wesley wilson word words work workshop wsom yates zhai http://doi.acm.org/10.1145/775047.775073 17 Shrinkage Estimator Generalizations of Proximal Support Vector Machines additive adult advances agarwal albert alternatives american amsterdam analysis annals appear applications applied approaches april arthur association associations august available bagging bartlett based basis bayes bayesian biased binary boosting brieman bureau burges calculating cambridge campbell census cessie chib classes classifiers collinearity comments communications company computation computational computations conference connecticut critique dagarwal data databases dataset datasets decoste dempster densities department design discovery discussion dissertation drucker duan dumouchel editors efron eilers elseviers embeddings empirical engineering estimation estimator estimators evaluation feature francis francisco friedman from functions fung garcke gelfand generalizations generalized genton gibbons gilbert grids griebel hastie heijden holland houwelingen hsalng html hyperparameters implementation information international item jensen jordan journal kaufman keerthi kernel kernelized kernels knowledge large learning least leyk likelihood linear locally logistic machine machines mackinnon mangasarian margin marginal marx matlab matrices matrix measures mechanical mercer meschach methods michael mining mlearn mlrepository modeling models moler morris mozer multi murphy national netlib neural nonorthogonal north optimization ordinary pages part 
performance perspective petsche polychotomous predictors pregibon press problems proceedings processing proximal publicly publishing puterman references regression report repository research response ridge robert sampling schatzoff schreiber schslkopf schuurmans screening seeber segerstedt seventh siam sigkdd silvapulle simple simplicial simulation singapore smith smola smooth some spaces sparse spatial squares ssvm statistical statistician statistics stein stewart study support systems technical technology technometrics theory thisted thomas tibshirani tuning university using vapnik vector view visualizing wermuth with http://doi.acm.org/10.1145/775047.775050 1 Scalable Robust Covariance and Correlation Estimates for Data Mining abdullah algorithm american analysis annals appear applications approach association asymptotic based behavior behaviour bias biometrics birthday breakdown classical coefficient communication components computation computer conference correlation covariance data davies dept detection determinant devlin dimensional dispersion distribution donoho donvho driessen efficient estimates estimation estimator estimators fachgruppe fast functions gamma gnanadesikan hampel harvard high honor huber identification implementing influence intensive international john journal kettenring large least leroy lindsay location manku marazzi maronna mathematical matrices median methods minimum modified multiresponse multivariate online order outlier outliers paper parameters personal peter point principal projections properties publishing qualifying rajagopalan random record references regression reidel report research residuals robust rocke ronchetti rousseeuw ruffieux sampling scatter sets sigmod sons space springer squares stahel statistic statistical statistician statistics statistik techniques technometrics university verlag wiley with woodruff yohai zamar zurich http://doi.acm.org/10.1145/775047.775134 70 Discovering Informative Content Blocks from Web 
Documents addison algorithm algorithms analysis anatomy association august authoritative automatic baeza based bear bell best blahut blockeel brin cardie carnegie cdrom chakrabarti chidlovskii chien chinese communication communications computer conference cowie data department discovering dissertation distillation document dung empirical engine engineering enhanced environment explorations extensible extraction finite frakes freitag from generating generation google grammar hall html http hyperlinked hyperlinks hypertext hypertextual ieee improve induction information integrating international israel journal july keyword kleinberg knowledge kosala kushmerick language large learning lehnert machine magazine markup martin match mathematical mellon methods mining model object october office page papers petit pittsburgh porter porterstemmer practice prentice principles proceedings processing references research retrieval reversible salton scale science search semi semistructured seventh shannon shasha sigir sigkdd sixth sources state stemming structural structured structures survey system systems tartarus technical techniques tenth text theory topic transactions transducers transformation trec tree university using wang washington wesley wide with workshop world wrapper yates http://doi.acm.org/10.1145/775047.775121 57 SyMP: An Efficient Clustering Approach to Identify Clusters of Arbitrary Shapes in Large Data Sets addison agrawal algorithm algorithms analysis aplications application approach approximation athens automatic based birch boundary bradley chatterjee classification cluster clustering clusters complexity conf coupled cure data databases datasets dempster density detection dimensional discovering distribution duda effective efficient elkan ellipsoidal engineering ester explorations farnstrom fayyad finding frigui from future fuzzy gehrke greece grid groups guha gunopulos hart high hinneburg ieee incomplete infromation inquiry introduction john journal 
kanfman kaufman keim kriegel krishnapuram laird large lewis likelihood livny maximum method methods microsoft mining multi multimedia muntz nasraoui noise notes organization oscillators pami past pattern population possibilistic proc proe raghavan ramakrishnan rastogi references reina report research resolution revisited rhouma rissanen rousseeuw royal rubin sander scalability scaling scene scientific self series sets sheikholeslami shell shim sigkdd sigmod society sons spatial statistical sting stochastic subspace surface synchronization technical techniques their trans tutorial very vldb wavecluster wesley wiley with world xiaowei yang york zhang http://doi.acm.org/10.1145/775047.775070 15 Pattern Discovery in Sequences under a Markov Assumption abdre acad academy alignment analysis annual assumption bailey baldi biocomputing biological biology biopolymers bounds buhler california canada chauvin chow cliudova collado comput computational conference contextual dependence discovery duda eddy elect elkan error expectation extracting fifth finding frequencies from gelfand genes hart helden hidden hunkapiller ieee information intelligent international irvine john journal learning machine markov maximization mcclure method models molecular montreal motifs multiple national neighbor oligonucleotide pacific pattern pevzner pool primary procedure proceedings recognition recomb references region regulatory report scene science sequence sequences sites smyth statistical symposium systems technical theory tompa trans under university unsupervised upstream using vapnik vides weak wiley yeast york http://doi.acm.org/10.1145/775047.775083 24 PEBL: Positive Example Based Learning for Web Page aaal algorithm algorithmic allwein applications approach automatic available binary building categorization category characteristics chen cikm citeseer class classification classifier classifiers communications conference content cortes coupled craven data deadliner decomite dempster denis 
dipasquo discovery document documents duin dumais ecml engine exaanination examples experiments extract finding flake freitag from generation giles gilleron glover help hierarchical html hypertext icann icml image improving incomplete incrementally inference information internet interuet issues joachims journal knowledge kruger labeled laird lawrence learning letouzey likelihood links machine machines management manevitz margin mase maximum mayoraz method methods mining modifications multiclass myaeng networks neural niche nigam object optimizating orwig page pairwise positive practical queries query reducing references report representation research royal rubin saint schapire schuffels search semi series sigir sigmod singer society specific stanford statistical structure structured sundaresan support svms symbolic symposium system technical text theory transductive uniform unifying university unlabeled using vapnik vector visual wide with wong workshop world yang yousef http://doi.acm.org/10.1145/775047.775067 13 Optimizing Search Engines using Clickthrough Data aaai addison advances again agent agglomerative ahead algorithm algorithms altavista american analysis annual appendix approach approaches architecture artificial assists august automatic average avgprec back baeza bartell based beeferman belew berger between boes boosting boser bound boundaries boyan brin browsing buckley burges cambridge canada center chapter chichester classifiers classify clickthrough clustering cohen combination combining computational computed computer computes conf conference constraints convex cornell correlation cortes cottrell crammer data decrease department derivative derivatives desired development digital discordant discovery documents easy edition editor editors effectiveness efficient eigenvector engine engines equal evaluation fields fifteenth fixed following follows freitag freund fuhr functions ginn given graepel graybill guide guyon hafner harlow hartmann haussler have 
henzinger herbrich hill horn hsffgen http hypertext icml ijcai increasing indexing information integer intelligence international internet into introduction iyer joachims joint journal kaufmann kemeny kendall kernel kluwer knorz knowledge known kopf lagrange lagrangian large last leads learned learning least letizia lieberman longman lower lustig machine machines making management marais margin mathematical mcgraw measuring methods minimize minimum mining mitchell models modern montreal mood morgan moricz multiple multipliers multistage neto networks neural neurons nips number obermayer only optimal optimization optimizing optimum order ordinal page pagerank pages pairs partial partim placing polynomial possible practical pranking precision precison preference preferences press principle probability problem proceedings processing proof quality query rank ranked ranking ranks regression related relaxing relevant remaining removing report research respect retrieval riao ribeiro robust rsus rsvs rule salton scale schwantner science sciences search sets shapire sigir sigkdd silverstein simon singer single smola snell social society solution solved solving sorted starting statistical statistics still subject substituting support system systems taking technical term text that then theorem theory there things tour trainability traininig transactions tzeras unbiased university user using value vapnik vector very volume webwatcher weighting wesley what wide wiley with without workshop world worst write yates zero http://doi.acm.org/10.1145/775047.775061 9 Bursty and Hierarchical Structure in Streams aaai access acoustics actions adelson advances agent agents agrawal allan alterego america american analysis appear application applications applied approach april archives artificial aspects assistant association assumption attention attorney august ault automatic automation autonomous bandwidths barber bark baru based bayesian becker beeferman berger berghel birrell blanton 
boone brace brazilian brewster broadcast carbonell cardoso case categorization changes chapman chatfield chatman chronicle chudova civil classification classify cohen collaborative collection communication communications compaq computational computer computers computing concept concurrent conf conjunction constructing cornell corporation cover cowley cryptography curves cybernetics darpa data databases davison defigueiredo descriptions detection digital discourse discovering discovery doddington dumais education edulafslsipb edwards effect effective effects ehrich electronic eliot email emvis engineering english enterprise environments episodes essay estimated estimation event example explaining exploring extracting facing factors fast features fiction film filtering final finding fine first firstmonday fitting flood foith foote forster frequent from future general generation genette gong good grace grosz gruber gupta guralnik hall hamann hand happy harcourt hart have havre hawkins heckel heckerman helfman help hembrooke hetzler hidden hierarchical higher hmms horvitz house http hudson human hypermail identification ieee ifile immediate important improving incremental inference info information initiative integration intelligence intelligent intensity intentions interaction interface interfaces intl introduction investigation isbell ishmail islands issues jensen join journal judgment junk jurrus kandel kelly keogh kephart klein knowledge knowles labs lafferty large last lattimer lavrenko lawfie learn learning lewin lewis life lime line linear linguistics llweb loss ludaescher lukesh machine maes magazine mail mailcat mall management managing manipulation mannila marciano markov markus martin matching medium memorandum message messages method methods microsoft miller mimng mining mixed model models monday moore movements murphy narrative nearest negative neighbor networks neural news nips notes novel nowell office ogilvie olsen organizing over overload overview 
oxford pachyderm papka paradigms parameters part paskin pattern patterns payne periodic perl persistent personal piecewise pierce pilot plaintiffs point points pollock potential preliminary press principles probabilistic proc processing proe projectltrnlsrcltrn proposed query rabiner rajasekar random reasoning recognition redmond reduce references regression related relational rennie report representation research researchers retrieval retrospective revisited rule rules sahami salmenkivi schmill schneier scholarship schroeder seeks segal segment segmentation segmented selected september sequence sequences sequential series shaw sidner sigchi sigir signal significant simple singer smyth social speech spitzer spring srikant srivastava state states statistical statistics stefanone stochastic story structural structure study supercomputer support swan swiftfile symposium syrup system systems technical technologies temporal text that theme themeriver theory thomas threading time timelines timemines tishby toivonen tool topic tracking trans transactions transcription translation trees tutorial ugly under understanding united univ unsupervised unusual usage user varying verkamo visual visualization visualizing volumes wall waveforms wavelet white whittaker whose wiley wireless wise with wkshp wobber wong word work workshop world yang yarnron yohai york zachary ziedins http://doi.acm.org/10.1145/775047.775116 52 Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration aaai acknowledgments american andrew artificial authors cohen conference craven dipasquo extract felligi fifteenth formal freitag from helpful information intelligence journal knowledge learning linkage madison mathematical mccallum methods mitchell national nigam numerous order orleans proceedings record references research retrieval robert schapire sigir singer slattery society statistical suggestions sunter symbolic thank theory things wide william workshop world yoram 
http://doi.acm.org/10.1145/775047.775102 39 Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks alfonso anderson anomalous anomaly antonio anup april attacks automaton barbar based bayes beetle behavior behaviors bell bendre berkeley bollineni california chan cigital clara cleary component compression computer computing conference darpa data database date debra defense denial detecting detection dhurjati difficulties eluding emerald estimators evaluation evasion experience expert failure fast first florida floyd forrest ghosh handley harold header hofmeyr hostile html http icir identifying ieee ieefjacm insertion international internet intruders intrusion intrusions jajodia january javitz john kendall kreibich kristopher laboratory lawrence learning library lightweight line lippmann lisa longstaff lunt mahoney martin masters meetings method mining mirror model modeling monitoring national network networking networks neumann newsham newshamevasion nextgeneration nides normalization novel packet papers paxson phad phrack poisson porras privacy proc proceedings processes profiles program protocol ptacek publications real references report robertgraham roesch sally santa sasha schatz schwartzbard science sdrn seattle security sekar self semantics sense service siam silicon silicondefense simulating snort software somayaji spade spice statistical strict surveys symposium system systems tamaru tech technical teresa text thesis thomas time timothy traffic transactions unix unusual usenix using valdes vern with witten workshop http://doi.acm.org/10.1145/775047.775062 10 On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration aaai abbadi about abstraction academic accurate adaptive adaptivet advances again agrawal aieee algorithm algorithms aligned allan allows alnt american andconquer anddata andr angnelov appear applied approach approaches approximate approximation arbitrary arima arms artd artificial 
atlanta attribute australia author badal bailey barbara based bases beach because being bemdt berlin best bethesda bias boston bottom bozkaya cairo california call caraea cassis castelli chakrabarti chambery chan chavarrias chicago chun chung cikm cikmint cited claim claimed classification clifford clustering cohen community compare comparison complexity comprehensive compressed compression computer computers computing conclude conclusions concrete concurrent conducted conference constraint content criticism current curves czech dallas danger data database databases datasets debregeas decay deformable deliberately demonstrated department detection diego different dimensional dimensionalityreduction discovering discovery distance down dynamic echo effective efficiency efficient efficiently effort egypt engineering enhanced entire european evaluation event ever except exchange experimental experiments extraction falcon faloutsos fast feature feedback ferhatosmanoglu field files filtering finding fink flaws fool foundations fourrnilab framework france freiburg from gada gavrilov generalizability generated genuine germany geurts giving goldin graphics greater gunopulos guralnik gurel haar have hebrail heidelberg here high hint histories hlnt hotbits http huang huhtala identifying ieee image implementation improve included index indexing indyk inen information inproceedings intelligence intelligent intended interactive international interpolation interpretation invadant ironically irvine italy jagadish japan jensen joint jose journal kahveci kalpakis kanellakis karkk keogh kibler kinds knowledge kobe kohonen koudas langley large later lattice lavrenko lawrie learning length lengths level levels like literature little lnproceedings locally london long loop lopez lssues luery machine made malm management mannila manolopoulos many maps market markov masses massive matching mclean meaningfully measure measures mehrotra mendelzon menlo methods miller milo mining minneapolis 
mislead model most motwani movement multi multidimensionaldata multilevel multimedia multiple multiples muthukrishnan nearest neighbor network networks neural newport noise normalization norms norway note noted nsson numbers ogilvie once organization original orlando orleans outperformed over ozsoyoglu paper papers paradigm parallel park particular pattern patterns payne pazzani performance philadelphia polly popivanov positano practice practices prague pratt prechelt presence press principles probabilistic problems proceedings processing programming projection pruning psaila puttagunta puzzling quantitative queries querying radioactive rafiei random ranganathan rdlnt real recent recommendations references refiei relevance renganathan repository representation representations representative republic resampling research researchers results retrieval review rhint rhlnt rival robust rule salemo santa sawhney scale scales scaling schmill science scientific search searches searching segment seiies sentiment separate sept sequence sequences seres series session sets several shahabi shapes shatkay shieee shift shifting shim shown siebes sigkdd sigmod signature similar similarities similarity simon simulation singh sketches sliding smyth some specification spie spirit srivastava statistical statistician stock structure struzik study subsefies subsequence subsequences suggestions supercomputing supporting surprise survey suspicion swami switzerland sycara sydney symposium systems teach technique technology templates text thacm that theory this thlnt tian time titled toivonen tools transform translation tree trend trends trondheim tucson tuncel twelve under undertaken university unless using valente variable vegas very verylarge view viewed walker wang warehousing warping wavelet wavelets ways whang what when which wimmers windows with wonderful wong work working workshop world would yazdani yoon york zait zdonik zhao zurich http://doi.acm.org/10.1145/775047.775058 7 
Efficiently Mining Frequent Trees in a Forest aaai abitebou abiteboul able across advances against agrawal algebra algorithm algorithms allow allows also ancestor application applications approach april apriori arikawa arimura arise artificial asal association assume august background base based bases biosciences called candid change chemical chen clip cole compact comparing comparisons compounds computer computing concept conclusions conf conference confirmed constraints containment cook cooley counting customer data database definition dehaspe dense dept depth description deterministic develop different disconnected discover discovering discovery discrese discrete distributed documents domains each easily easy editors efficient efficiently embedded encoding encompass engg engineering enough environment european expensive expressions extension factor fast faster fayyad february fernandez finally find finding first flexible focus forest forests formalization formulation framework frequencies frequent from future general generated generation give given graph hariharan holder horizontal ieee inclusion incorporate indexing indyk inference information informative inokuchi instance intelligence interactivity international interval intl introduced isomorphism january joins journal july kaplan karypis kawasoe kilpelalnen king knowledge krishnamoorthy kuramochi label labeling language large learning length likely linearly list lists logml loth management managment mannila markup matches matching maximal menlo milo mine mines minimal mining mobasher modifying moon more most motoda multiple node notion novel november once optimizing ordered outperforms pages park path patt pattern patternmatcher patterns performance performing plan press principles problem procedure produce proposes punin queries querying quickly real references regular relational relatively relevant report representation research retrieval road rules same satamoto scales schemas schemes scope search 
secondary semi september sequential several shamir shapiro siam sigir sigkdd simple single space specified srikant srivastava store string structured structures strutures studled subgraph subset substructure substructures subtree subtrees suciu summarize supporting symp symposium synthetic systematic systems task tech techniques test this time toivonen tools touchpoints traversal tree treeminer trees tsur twig typical unlabeled unordered usage user uses using utilized variants variations vertical very vianu wang washio which wide with work workshop world yoshida zaki zhang http://doi.acm.org/10.1145/775047.775089 28 Transforming Data to Satisfy Privacy Constraints adaptation addison agrawal algorithm algorithms allocation american anonymity argus artificial assessment association based basis best blake bureau california census chen comparing computer conference confidentiality contextual continuous control creecy data database databases datafly decision dept disclosing disclosure discretization dissemination division domingo done dougherty duncan elements eleventh enforcement engineering eralizations estimation feature features ferrer figure files generalization genetic genitor goldberg harm holland hong html http hundepool identification identities ieee induction information internation international irvine iterations journal kaufmann keller keogh knowledge kohavi lambert learning limited loss machine management march masking mateo mcnulty measures medical merz methods michigan microdata mining mlearn mlrespository morgan natural neerlandica ntts number numbero official optimization optimizing pages performance perturbative practice prediction preserving press pressure privacy proceedings prospects protecting protection providing quesdufjons quinlan rank ranking references release report repository reproductive research respondents risk sahami samarati sanz science search section security selective seminar sigmod skinner software springer srikant statistica 
statistical statistics status supervised suppression survey sweeney system systems technical test third through torra transactions trees trials twelfth uniquesolutions university unsupervised verlag waal wesley when whitley willenborg winkler yancey http://doi.acm.org/10.1145/775047.775074 18 Hierarchical Model-Based Clustering of Large Datasets Through Fractionation and Refractionation. academic advances algebra algorithm algorithms allan american analysis ankerst annals answers approach association axis banfield based behavior berry better between biometrics bradley breuning brien browsing carbonell chaining chose chosen classification cluster clustered clustering clusterings clusters cole collections comparing components computer computers conf conference covariance criterion cutting data databases datasets dayan density detection diagonal digits dimension discovering discovery distribution document doddington domingos dumais each editor effects elements ester estimating estimation example except expectation experiment external fayyad figure final finite fowlkes fraley from gather gaussian generalization generated group groups handwritten hartigan hierarchical hinton however hulten identify ieee images improving index infinite information instruments intelligent international john journal karger knowledge kriegel large learning linear mallows management manifolds many matrix maximization mclachlan mean method methods microsoft mining mixture mode model modeling models multivariate nearest neighbor networks neural noise number numerical observations obtained only optics ordering pages parallel pedersen peel pilot points press proceedings processing proe raftery reduces references reina replaced replications report research retrieval review revow same sample sander scaling scatter schwartz scott setting siam sigmod sigr similar simulated size slightly sons sources spatial statistical statistics structure study systems taxonomy technical than that them theory this 
time topic tracking transactions tukey using values very which wiley wishart with yamron yang zero http://doi.acm.org/10.1145/775047.775071 16 On Effective Classification of Strings with Wavelets accurate adaptive additional after agrawai agrawal algorithm algorithms allows analysis applications approach association average based because between biological boat calculation cambridge case chakrabarti classification classifier classify classifying clustering complex complexity compute conference construction cost cumbersome data database databases decision dependent deshpande detection deviants different dimensionality distance distinctive dong duda dynamic each edit effective efficiency efficient efficiently enhanced evaluation event extremely fast feedback finding framework from functions furthermore ganti gehrke guralnik gusfield hart have heczko however icde icdm identifying illustrated incurred indexes indexing individual instances integrating jagadish jagarlish james karypis kaufmann keim keogh koudas lack landmarks large leads learning length locally machine magnitude manganaris march matching mehrotra mining minnesota model morgan most multidimensional multivariate muthukrishnan nearest needs neighbor noise note number oates once only optimistic orders over parker partial pattern patterns pazzini periodic perng presence press probabilistic process programming progrttrns quadratic querying quinlan ramakrishnan reduction references relevance report representation required requires rules running sawhney scalable scaling scene scratch search sensor sequences sequential sequentially series sets several shim sigkdd sigmod similarity since size smyth somewhat srikant srivastava strings subsequences such suitable syndicate table tainforest technical techniques test testing than that their this time tradeoffs training translation tree trees tutorial university used value vanderbilt vldb wang wavelets wavrule which while wiley with worse zhang 
http://doi.acm.org/10.1145/775047.775136 72 Incremental Context Mining for Adaptive Document Classification above acclassifier acclmufler acknowledgement acmsigir active adaptive also although analysis analyzing applications applied apte attributed automated automatic base based basis been better boosting case categorization category caused changing chen china chute clarkson classes classification classifications classifiers classifying cluster clustering cohen combining comparative compare comparison complexity computing conclusions conference confirms conjecture content context contexts contextual contributed contributions could council creation critical croft cumulative damerau data decision demonstrates design detection discovery distributed document documents dumais dynamic efficiency efficient enough errors essential even ever evolutionary evolve examination example excluding extensions extraction fanlt faster feature figure filtering frequently from further generalized giles grants grobelnik hierarchical hierarchically hierarchy high higherprecision ieee iito impreciseness improved improving include incorrect incremental information inforrnation instance intelligent interesting internet invoked iwayama judge keller knowledge lain larkey lawrence lcml learning lehnert mainly maintenance management mapping mecallum method methods mining misclassifieations mitchell mladenic more much national nicholas noise nonaka oflcml often operation organization organizational over page paper part particular pedersen perceptron perfect possibility practice prec precisely precision presents previous proc processing real recent recovery references refinements refining reprocessing republic requirements research result results retrieval review riloff rocchio rosenfeld rules sahami sampling science search seconds section seeking selection sensitive shcapire show showed shows shrinkage sigir significance singer singhal some specified spent strategies study support supported 
systems technique testing text than that their theory there this those thus time times tminin together tokunaga tolerance tolerant training transactions treated trends typical under usability using very vocabulary weiss whether which whose willett with without words world yang http://doi.acm.org/10.1145/775047.775096 33 Exploiting Response Models -Optimizing Cross-Sell and Up-Sell Opportunities in Banking allocation assignment bibelnieks bullock campell channel choice cohen combinatorial conference constraints couple cross crowder customer december developed diamondsug effort erdahl event examples expressed fingerhut framework francisco functions haydock integer interfaces internal john johnson linear mail marketing must nemhauser offer optimization optimizing paper parks perspective plausible proceedings programming references selling shortcomings solution sons streams targeted technical this value variables wiley wolsey york http://doi.acm.org/10.1145/775047.775064 11 Query, Analysis, and Visualization of Hierarchically Structured Data using Polaris about academy aggregation aiken alternative analysis andp applications approach april area artd august automatic available baker based bayesian becker becket beyer bnsworth brunk brushes building buja burks chandhuri chart chen cited classifier classifiers computational computer computing conference cook cross cube cues data database databases datasets datasplash december decision decoste derthick design devise digital dimensional dimeusional direct discovery donjerkovic dunmire edge edited effective eick engineering engineers environment ereegovac exploration exploring fayyad february framework from frontiers future generalizing goldstein gomberg graphic graphical graphics gray grinstein group hanrahan healey hgmis hierarchies high himl hitp http humangenome ieee information integrated integrating integration interactive interface international invited issues jagadish january journal kaufman keim kelly kelojejchick 
knowledge kohavi kolojejchick kriegel lakshmanan languages large lawande layman leading livny local lucas manipulation mattis mawby methods mineset mining mobile models morgan multi multidimensional multiple myuymaki national navigating network networking observatories october olap olston online operator ornl pellow perceptual perlin pilote pirabesh polaris presentation press proc proceedings programming project publishing query querying ramakrishnan references relational reports roth rrdinternational rundensteiner scientific sdss semantic senn sept sets sigchi sigmod simple sloan solutions sommerfield spalding spotfire srivastava statistics stolte stonebraker stroffolino structure survey swayne symposium system systems table tabular talk tang techniques thearl thomsen totals transactions user using venkatrao very virtual visage visdb visual visualization visualizations visualizing vldb ward warehouses welling wenger what wierse wiley wireless with woodruff workshop xmdvtool york zoom http://doi.acm.org/10.1145/775047.775122 58 Scaling multi-class Support Vector Machines using inter-class confusion accuracy advances agrawal algorithm algorithms allwein approach arrange associate automatic bakiri base bayesian benchmarks binary bombay categorization chakrabarti chosen cikm class classification classifier classifiers classifying classitiers closest codes coding collection column columns compared conditions confusion correcting cristianini dags databases dataset denotes diagonal dietterieh different documents does dumais during each ecoc ecocs element ennie entries equivalence error evaluations experiment experiments exploiting feature found framework general generation ghani have heckerman hierarchical hierarchically hierarchies hierproof http hztp icml improvement improving inductive industry into jair jason joachims journal kernel koller kressel large learning litb machine machines margin matrices matrix maximally memo method methods mitchell model modification 
multi multiclass nature negative nips note number online organizing other others outcome output over pairwise participate platt porter positive predicted present press problems procedure program proposed provide qodbole raghavan rayid reduces reducing references related rennie repeated report reports representations rest returned rifkin ritkin rows ryan sahami scalable scaling schapire section selection separated serving shantanu shawe sigir signature significant similar simple singer single solve solving springer standard statistical stripping structure such suffix support svms taxonomies taylor technical testing text that theory they topic treated twoclass unifying used using vapnik various vector verlag very vldb volume where which will wins with words work http://doi.acm.org/10.1145/775047.775146 82 Making every bit count: Fast nonlinear axis scaling abbadi advances agency algorithm analysis applications approach artificial august babu backer banco based bases bell belussi blake brasileiro breiman brodie cart central chakrabarti chakravarthy chang chaos chapman classification communcation component components compression computation computing conference continuous correlation cottrell cowan curves dados data databases datasets dayal december demers deterministic dimension dimensional dimensionality discovery distance distributions edition editor editors eigenvalue embedding estimating extraction factbook faloutses faloutsos fast fastmap feature fractal framework freidman garofalakis geometric germany ghosh giles global government gray hall hanson high houghton http hyvfixinen iebe image independent indexing information intelligence international introduction ixinen john johnson jouiffe journal kainel kanfmann karunen kaufmann kernel knorr knowledge kohonen kotz langford large learning letters linear local locally machine mates mathematical mehrotra merz mfiller mifflin mining model morgan multimedia naud network networks neural nishio nonlinear office olshen 
operations organizing pages pattern prac press principal printing problem proc proceedings processing publications queries rastogi recognition reduction references regression repository robust roweis saul scheunders schlageter schslkopf schuster science selection selectivity self semantic september shannon sigkdd sigmod silberschatz silva simp smola sons space spaces spatial spie springer stone survey surveys systems tables technical techniques tenenbanm theory traditional traina transformations trees univariate unsupervised using verlag verlagsgesellschaft very visualization volume weinheim whang wiley world york zamar zlth http://doi.acm.org/10.1145/775047.775109 45 Sequential PAttern Mining using A Bitmap Representation agrawal algorithm algorithms apers association australia avignon bayardo bettini bouzeghoub bulletin burdick calimlim canada chen chile conference constraints data database databases dayal discovering dong edbt editors efficient efficiently engineering episodes expression fast france francisco frequent from gardarin garofalakis gehrke generalizations germany granularities growth hart heidelberg icde improvements international itemset jajodia kaufmann large learning long machine mafia mannila march maximal mining montreal morgan mortazavi multiple pages partial pattern patterns performance periodic pinto prefix prefixspan proceedings projected quebec rastogi references regular relationships rules santiago sept sequences sequential series shim sigmod spade spirit srikant sydney taipei taiwan temporal time toivonen transactional twentieth verkamo very vldb wang with zaki http://doi.acm.org/10.1145/775047.775135 71 Collusion in The U.S. 
Crop Insurance Program: Applied Data Mining acadrnic advanced allen american analysis andlnsurance annals applic application applications april artificial association august automatic automobile belhadji berlin best better bias bibliography blaxton bodily bootstrap brockett cabena cambridge canadian casualty categorical chair challenge chapman claim claims classification classified collect computational confidence containment contrast cost counsel crime cross data databases davison defense dekker derrig detecting detection development diego differences digging dionne dirt disaster discovering discovery dispatch distributed edition estimation expert fail farmers feature federal fertile fienberg filing francisco frank fraud fraudulent freivogel fuzzy generalized geneva gentleman george germany getting gilbert graco graphical graphics gray grossman group hadjinian hall hawkins hinkley hotspots http huang idurkhva ihaka implementations information injury insurance insuring intelligence introduction jackknife january java johnson journal jump june kasif kaurman knowledge kohonen language large learning linear lnsurance loglinear london louis machine management manchur marcel mass massive mathematical mccullagh medical methodology methods mine mining model modelling models montreal moore morgan ncdm nded nelder networks neural october ofrisk opportunities organizing ostazewski owen panko paper papers pattern payments pazzani post practical predictive prentice press prone property publishers quantitative quit real recognition reduction report research researchers risk river rocke rose routinely royal saddle samples sattar schucany self sept social society solutions springer stadler statistic statistical statistics system tarkhani techniques technology their three tools topics tukey ullman uncover underwriter university unwin upper using verhees verlag wang weapon wedderburn weisberg weiss westphal wiley williams with witten working workshops world york zanasi 
http://doi.acm.org/10.1145/775047.775123 59 Visualization Support for a User-Centered KDD Process animated artificial baseddata bell belmont breiman browser browsing card cercone classes classification computer computing cone conf context data database decision discovery drawings dynamic engineering exploiting factors fayyad files first fisheye focus friedman from furnas graphic grinstein hierarchical hierarchies human hyperbolic ieee induction information intelligence inter interactive japanese journal kaufmann kawasaki knowledge kumar laboratories lamping languages large learning level lnter look mackinlay maekinlay mannila memorandum methods mining minority model morgan multi nguyen olshen plaisant prediction problems process pruning quek queries readings references regression reingold robertson rule rules ruleviz shimodaira shneiderman sixth society software springer stone structured studies system systems technical techniques theory tidier tilford tokyo tool tools transactions tree trees twel view visual visualization visualizing wadsworth wierse with workshop http://doi.acm.org/10.1145/775047.775142 78 Privacy Preserving Association Rule Mining in Vertically Partitioned Data accuracy accurate administration advances aggarwal agrawal algorithms applications approach approaches associated association atallah barbara bases bayesian between borth buildingrelationships buneman california canada chan chapter chen cheung chile citizenship class clifton cloudcroft clustered columbia communityandcult completeness computation computational computer computing conference corporate corporation crypto cryptology cucs dallas data database databases department design discovery distributed distribution dmkd ecml editors efficient engineering environments european exchange extensible fast firestone firestonetirerecall ford foundations freiburg from game generate geometry germany goldreich grama haritsa highway hipp honest horizontally html http ieee imielinski index inductive 
information intelligent international ioannidis island issues items jajodia joint journal june kantarcioglu kargupta knowledge large learning lindell machine majority management mental meta mexico micali mining mobile motor multi multiple national networks nhtsa open ourcompany pages paradigms parallel part partitioned party pinkas pkdd play practice preserving press principles privacy problem problems proceedings processing prodromidis products protocol protocols providence quantification recall references report research review rhode rizvi rule rules safety santa santiago scalable science secrets secure security semantics sept sets seventh siga sigart sigmod sivakumar springer srikant stolfo strategicissues streams structures swami symposium systems technical their theorem theory thesis tire traffic transactions twentieth ubiquitous university using vancouver verlag very vldb washington when wigderson wirth with workshop york http://doi.acm.org/10.1145/775047.775137 73 Evaluating Classifiers' Performance In A Constrained Environment able above account acknowledgements along also alue analysis answer appendix applications areas artificial attrition bazaraa been being benefit benefits binding bindinli boros broad called campaign capacity case ceil cell change choose classes classification classifier classifiers clll coil combined competing computational conference consider constrained constrains constraint constraints convex cost costs could create created data deal decades decision deterministic develop developments difficulties dimensional dimensions direct discipline draw each employed entire environment error errors example examples excel except expense experience explicit express fast fawcett feasible feedback field fifteenth figure flnll flows follows formula formulate formulation found fprate framed frequently from function future gain generously goes gratitude grow growing guide hall have help helped helraan hull importantly imprecise incentives include 
indirectly input inspire intelligence international intersect intersection into involving issues john katta know knowledge learning legacy limited limiting linear losing ltoro machine mail main maintain maintained mathematical matwin methods minimizing missed mmvabl models moktar more multi murty name need network nlalbie note odginli odgln once operations opportunity optimal optimization optimizations optimizer optimizetion other overcome part particular past performance perhaps piece points powerful prentice problem problems proceed proceedings process processing produced programming provided provost pursuit quadratic rate ratep references report research responders response restrictions resulting rich rkskb rnal robust rocch rocchi rocchl rooo ropol same sdsg selection sensitivity several sfslo shared sheet shown size slack slides slopes solution solutions solve some sons space stableceils stages stan standard status stochastic such systems taken target techniques terms than thanks that their theory this thoughts throughout time tool trints uncertainty unit used value variables various vduo vera vertices vilul visualization vllue vlluo well where wiley wise with work worksh would http://doi.acm.org/10.1145/775047.775080 22 Privacy Preserving Mining of Association Rules about access according ackerman actually adam additional advances after against aggarwal agrawal algorithms almaden also among annual answer anti appendix appropriate april assoc associates association assodation assume assumption attitudes august available balancing barbara bayardo becomes beginning belmont between beyond bias both bounds brankovic breiman business butions california canada care case castro center chapter characteristics chasm check choice choices choose city claim classification clear clifton combinatorics comics commerce commissioner compute computer computing concern concerns conf conference conflict consider consumer contain contains contribute contributes contributing 
control conway coord coordinate coordinates corporation counted covariance covariances cranor crossing crypto dallas data database databases dealing death decision definition denote denotes depends design diag diagonal diego different directive discard discovery distri distribution dividing each easy economist editor editors edmonton efficiently elements eliminating equals equiprobal estivill esupp european evasive exactly expectation expectations fast fayyad find first follow follows formula formulae freebies friedman from given group groups harris have hill identity imielinski immediately implications inates independent independently induction information interact intersection intersections into invariance issue issues item items itemset january jose july june just know knowledge labs large last learning lecture left likely linden lira logic long louis lower machine make making management many march marks marmila matrix mcgraw methods mexico mining mitchell mohania moreover mula multinomial multinomials multinornial namely nonzero notes notice number october office olshen online only ontario opinion original other pages papers part partial partitioned patterns piatetsky pinkas possible power precision preserving press principles privacy probabilities probability problems proc proof proofs protection prove proven quantification quest quinlan randomization randomized reagle references regression replacing report research response restrict retaaned retain retained right rule rules same santa science seattle security selected selective september sequence sets shapiro shoshani should show sigkdd sigmod simple since size sized smyth solutions some special split springer srikant staking stat statement statistical stone strip subsets such summation supp suppir supports suppose suppr suppt survey surveys swami swapping symposium systems takes technical technique terms texas that thearling their then there therefore think this thus time times tjoa together toivonen 
transaction transactions transformation trees triangle triangular understanding union users using uthurusamy vaidya values various vector verkamo verlag vertically vldb wadsworth want warehousing warner washington ways week well westin what whenever where which with words workshop wortman written your zero zeros http://doi.acm.org/10.1145/775047.775059 8 ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs Christopher R. Palmer Computer Science Dept Carnegie Mellon University Pittsburgh, PA crpalmer@cs.cmu.edu algorithm algorithms applications apriori base based bases bell blocking broder ches closure closures cohen computer conceptual conference connectivity cook cora counting custome data database december domingos engine estimating estimation external faloutsos fault flajolet framework frequent from generalized generating gibbons globecom goodrich graph hill holder html http ieee imdb information inokuchi intelligent international internet introduction ista journal kumar labs large laws lipton maghoul management mape martin mcgraw megill mereator mining model modern motoda naughton network nodine nrdm obey pages palmer pdkk pods power prec principles probabilistic proceedings raghavan rajagopalan reachability references related relationships retrieval richardson salton scan sciences search searching sigact siganos sigart sigcomm sigmod simple sivakumar size stata steffan structure substructures symposium system systems tauro that their tolerance tomkins toplogies topology transitive upfal value very vitter washio whizbang wide with workshop world http://doi.acm.org/10.1145/775047.775138 74 Discovering Word Senses from Text academic achine acknowledgements acquisition affair after agreed agriculture algorithm algorithms allows also ambiguity anagement analysis anguage answering approach argument assigning attributes atural australia authors automatic automatically aviation avoid based bass because belongs blockbuster broad browsing building burgin 
called canada cascade categorical centroid centroids chameleon citadel city classes classification cluster clustering clusters collections comments committees communication company comparison completely computer computing concepts conclusion concordances consortium construct context copenhagen correction council coverage custom cutting dart data database decisions delta denmark department departments dependency discover discovered discovering discovers distributional document does domain dumais duplicate dynamic each echnical edited education efficient element engineenng engineering environment eport esources etrieval evaluation evaluations eview example experiments features fellbaum fellbaurn final finally finance find first fishery flynn following forestry forte foundations francisco frequent from gather good government grant guha haft hallmark harabagiu harris have health helpful hierarchical higher hill hilosophy hindle hooker howell hutchins hybrid ical icography induction information informative inguistics ining inspected intelligence international introduction inunigration japan karger karypis katz knowledge known kumar kurnar kynseok kyoto labor lain landauer landes language large latent leacock lectronic less lexical local looking lottery lssue made madrid manning manual mcgill mcgraw measuring mental methodology methods metropolitan miller minnesota missing modeling montreal most mostly murty nada nalysis names natural nestle noun odern odyssey online open orkshop ournal outperforms outputs overlapping oxford pantel partitional partly partnership pasca pedersen performance pgsb pittsburgh planning plato precision predicate presented preservation press principar principlebasedparser problem proceed procurement public question ranslatio rastogi realized reasonably recall recessing recreation references remove representation represents research resolve retrieval review reviewers robust roceedings rock role rosewood salton sample scatter scattered scholarship 
schtltze science sciences scouting semantic sense senses service several shamrock shaw showed sight sigi sigkdd similar social solution some somewhat sommers spain special standards statistical steinbach structure structures subsumes supported surprisingly surveys sychological sydney synset syntactic techniques telecommunication tengi test text thank that their them theory ther these this thought tight tourism transport transportation tukey university urban using voorhees welfare well whirlpool wish with word wordnet words works wurdnet york http://doi.acm.org/10.1145/775047.775130 66 Construct robust rule sets for classification aggregating agrawal algorithms artificial association associations bagging bender between blake boswell breiman caep chile clark classification computer conference data databases dong emerging ewsl fast ieee imieliuski improvements induction intelligence international items large learning machine management massive mathematical merz methods mining mleaxn mlrepository pages patterns predictors press proc proceedings recent references repository rule rules santiago sets sigmod society some srikant swami twentieth very with wong zhang http://doi.acm.org/10.1145/775047.775119 55 Tumor Cell Identification using Features Rules academic accuracy accurate acknowledgement activity adaptive algorithm algorithms alone also among anal analysis analyze anoraganingrum antonie application arbitrary association august australian automated automatic average awasthi background barba base based behavior berns better biddell bioimaging biology blood bmes boston building cell cells characterizing chen christopher classification classifier classifiers coman combine combining complex comput computer conclusion conf configurations counting cybern cytological data dellas demand describe detection distributed doolitle duin dynamic electr embs encoding error examined experiment experimental explanations extract extracted features february filter florida francisco 
freeman from fukunaga functional fundamental gauthier genome genomics geometric give grigoriev hatef help histological identification identify ieee image images imaging immersion implemented improve increasing indicate information institute integrated integrating intell intelligent interesting intl into introduction iris jeacocke johnson kaufmann kittler kovalev laboratory leads learning levine life like living local lovell lowest mach machine majority matas mathematical meaningful median medical meta method mining morgan morphology multi multimedia myshkin nally noble november number object operation optical optimal paper part parulkar pattern patterns performance plenary presentation press principles problem proc process processing program propose providing pure quantitative quinlan rate real recognition references regions related relue research resolution results retinal robust rule rules scene segment segmentation segmented sigmod signal singapore spie statistical strategies studies suen system systems technique techniques texas than thank that there therefore this thresholding tracking trans tumor tumour unable using valuable vikas voting water weight with wong workshop would york zealand http://doi.acm.org/10.1145/775047.775105 42 On the potential of domain literature for clustering and Bayesian network learning aaai about accumulation adnexal advanced algorithms analyis analysis annot annotated annotation antal antibody antigen appendix application arrays assessment associated baeza baeze based bast bayesian behavorial belongs benign bethesda bioinformatics biological biology blaschke boguski bool bourne cancer cancers cbms cell cerevisiae chromosome cirrhosis classification cluster clustering coefficient collins columrm combination comp comparability composed computational computer conditions conference consensus control cooper corr criteria cuckle cycle cytoplasmic data database definitions describe description detect disease dissertation dobrowiecki 
domain dubes eath edwards eighth encountered endometrial endometriosis entry epithelial euckle expressed expression external false family features fibroids finding frakes freq frequently from funct function functional gene genes genetics genomics glycogen glycoprotein group groups grundy gynecol hall heckerman herskovits heterogeneous hierarchical hierarchies high highest hovig human ieee illustrated induction inflammatory information integr integrate intelligent international interpret interscience involved iota ipll jacobs jain jenssen kaufman kaufmann keyword kinaae knapp knowledge komorowski korfhage laegreid language large learning leuven level linking literature liver local location machine maggino maribor marker masys measurements medical meigs meiosis menlo menstruating meszaros metabolism method microarray microarrays milligan mining modern molecular monoclonal moor morgan most multivariate name nature nearly network networks obstet ofphosphatases oliveros only ontology opinion opposition organism ovarian pages pancreatitis park pathology patients patterns pavlidis pearl pelvic percent phosphatase polarity positive pregnant premenopausal prentice press probabilistic probabilistie probability proc proceedings process progression prot protein quakenbush rail ranks reasoning recognized recomb references regulating repr research restr results retrieval reviews rousseeuw rvaths saccharomyces scale scores segregation sensitivity serum shatkay similarity since slovenia sonographic specificity stage statistical stemming storage study subceuular subfamily such swiss syndrome syrup systems table tamoxifen terms textual tfidf that themes this throughput timmerman tool trace translation tumor tumors type typical ultrasonography ultrasound university using uterine valencia valentin variable variety vergote verrelst wald weston wide wilbur wiley will with women yates yeast yeastcard york http://doi.acm.org/10.1145/775047.775112 48 Topics in 0-1 Data aaai advances 
agrawal algorithm algorithms american analysis applications applied approach approaches approximation association attributes bacteria behavior berger berkeley bernoulli between binary both cadez carreira case chapter comon compared component computation computational computers concept conclusions conjecture connected considered contrast data databases datasets deerwester della description detailed discovery distributions dumais editors entropy extending external factorization fast fayyad features fields find finding finite formal fransisco from furnas gave general generated gyllenberg harshman have hofmann identifiability identifiable identification ieee imielinski independent indexing inducing information intelligence irinen issues items john journal june karhunen knowledge koski lafferty landauer language large latent learning linguistics lots machine mannila mapping matrix maximum mining mixtures model modeling models multivariate natural nature negative neural nonlinear nonnegative numerical objects open other pages papadimitriou parts pattern pavlov perpinan piatetsky pietra pods pool practical practice prediction press probabilistic probability probe probes problem processing profiling query raghavan random reasonably references reilink relationship relatively remain renals ronkainen rules sammon science seeing semantic sets seung shapiro showed side sigir sigmod signal similarity similarly simple smyth society sons sparse srikant still structure studied such swami tamaki that theoretical toivonen topic topics transaction transactions understanding uniqueness used using uthurusamy vempala verkamo verlaan visualization ways well wiley with works http://doi.acm.org/10.1145/775047.775081 23 Mining Frequent Item Sets by Opportunistic Projection acmsigmod agarwal aggarwal agrawal algorithm algorithms almaden almost analysis apfiori approach april apriori around array ashnk association based bases basket bayardo because between bombay borgelt breadth brin burdick 
california calimlim candidate chen chile chooses combines complete computing conclusions conf conference consumed counting dallas data database databases dense depth discovering discovery distributed dlmn dynamic edward effective efficient efficiently engineering example except execution extending fails fast features figure find finishes first fpgrowr francisco frequent from fuzzy gehrke generation germany growth hart hash heidelberg high html htrnl http hyper icdm imielinski implication impressive increase india internation intl ions issue item items itemset itemsets iternset jeffrey jose journal june kedem klsk knowledge kohavi large length less level linearly llew long mafia magdebarg march market mason maximal maximum memory mine mining miue mlenrn mlrepository moderate motweni multiple navathe nishio number oppoauneproject opportuneproject opportunistically ornieeinsky paper parallel park pattern patterns performance pincer prasad proc proceedings projection propose quest rajeev ransact real references rreaehes rule rules sampling santiago sarasere scales search seattle seconds sept sergey sets sfikant shalom sharnkant shows sigkdd sigmod size sizes sparse special sttructure suppor support swami switzerland syndata tang technology than that they this threshold threshool thresohbl through time toivouen transactional tree tsur tucson ullman very vldb washington when where while with without world yang zheng zijian zuich zurich http://doi.acm.org/10.1145/775047.775084 25 Web Site Mining : A new way to spot Competitors, Customers and Suppliers in the World Wide Web aaai algorithm artificial august bases bayes biological branching categorization chains chakrabarti characteristics classification classifying comparison comt conference construct craven data deshpande diego dipasquo directory dmoz efficient eibe elsevier enhanced european evaluation event examination features fields freitag frequent from http hyperlinks hypertext implementations indyk intelligence java 
joachims journal karypis kaufmann knowledge learning lesh machine machines many markov mateo mccalhim mecallum menshikov methods mining mitchell mitsunori models morgan naive nigam ogihara open pakdd practical proceedings processes programs project qualitative quinlan references relat relevant sequence sequences service sigir sigkdd sigmod slattery spade support techniques text tools using vector volkov waikato weka wide with witten workshop world yahoo yang zaki http://doi.acm.org/10.1145/775047.775151 87 Transforming Classifier Scores into Accurate Multiclass Probability Estimates advances allwein annals appear applied approach archive artificial assessing available ayer bayes bayesian bennett binary blake both brank business calibrated calibration california carnegie cder chapter classification classifiers coding comparison computer computing conf conference correcting costs coupling data databases decision decisions department dietterich discovery distribution domingos dykstra eighteenth elkan empirical error estimates estimation ewing filter five forecasts from function hastie heml hetp http improving incomplete inference information intelligence international irony irvine john journal kaufmann knowledge kong lang large learning likelihood lrvine machine making margin mathematical mellon merz methods mining mlearn mlrepos mniticlass morgan mulficlass multiclass murphy naive nation netnews neural newsweeder nips obtaining order output outputs pages pairwise paper percent pets platt posterior precipitation press probabilistic probabilities probability proceedings processing provost publishers reducing references regularized reid reliability rennie report repository research restricted rifkin rnachines robertson sampling schapire school sciences seventh silverman singer soft sons statistical statistics stem subjective support svmfu systems technical temperature text tibshirani trained trees twelfth unifying university unknown using vector volume well when wiley 
winkler with working wright york zadrozny http://doi.acm.org/10.1145/775047.775099 36 Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification aaai administrative agents algorithm algorithms ambite american application applications approach approximately arens ariadne artificial ashish association automatically autonomous baeza bagging based bibliographic bitton bollacker boosting bureau census chee church citation classify cleaning cleansing clustering cohen common computer computing conference cooperative correcting correction data database databases detecting dewitt dimensional dirty discovery dissemination division documents domain domains duplicate efficient elimination elkan entity extraction fellegi fifteenth files fourth frakes from gale ganesh garcia giles hall hernandez heterogeneous high hylton identification identifying ijcis independent information integrating integration intelligence intelligent international issue journal june knoblock knowledge kukich labeled laboratory large lawrence learning linkage linking lists machine mamitsuka manangement massive matching mccallum merge merging methods mining minton mitchell molina monge multiple muslea national nigam object pages philpot pinheiro portland prentice probability problem proceedings purge queries query real reconciliation record records reference references related removal report research retrieval retrieving richardson rules science scoring seattle second sets sigkdd sigmod similarity sirvastava sixth software sources special spelling statistical statistics stolfo strategies structures sunter surveys switzerland systems technical techniques tejada text textual theory thesis third thrun transactions tuczon ungar unlabeled using vldb washington wiederhold winkler with without words workshop world yates york zurich http://doi.acm.org/10.1145/775047.775108 44 Collaborative Crawling: Mining User Experiences for Topical Resource Discovery aggarwal aggaxwal 
approach arbitrary available berg building cache caching categorization chakrabarti clustering collaborative conference crawling discovery dora drive empirical experiences focussed from garawi gates http implemented intelligent ircache main memory merits mining ndsu nodak papers performance predicates profiling proxies proxy references report research resource results rousskov scsi solviev specific squid supervised system systems tested topic topical traces ulaur user using were wide with world http://doi.acm.org/10.1145/775047.775104 41 Handling Very Large Numbers of Association Rules in the Analysis of Microarray Data aaai ability able about above abovementioned academy acceptable accepted according acid acknowledgments activity adapt adapted adding addition addressed adjacent adomavicius advances advise affect affymetrix after against aggregate aggregated aggregation agrawal algorithm allow allowable allowed allowing allows already also although altogether always amino among analysis analyze anddata annual another answer answers antecedent antecedents applicable applications applied applies apply applying approach approaches appropriate approximately april arbitrary ares arising artificial assessment association associations associative assume assuming attribute attributes august authors autonomous available barash base based bayardo bayesian because been before behavior being belief beliefs believe believes bello belong belonged belonging belongs below belyavsky benjamin berrar besides between bhatnagar bicciato bioinformatics biokdd biological biologically biologist biologists biology biomolecular bodies body boolean both botstein bottom bowtell brackets broad brown browsing buooh butyl california camda capability cart case categorical categories categorized category ceil ceils cells cellular cerevisiae certain cfistiani challenges chapter characteristics check chemical chip chips choose church class classes classification classifier clear cluster clustering 
clusters combinations combined combining common community compared comparison composite computational concept concise concisely conference confidence confirmed consequent consider consideration considerations considered consistent constraint constraintbased constraints constructs contain contained containing contains context contradict contrast contribute contributes control cores corresponding corresponds could count covers criteria critical customer customers damaged damaging data database databases dataset dealing deals decide decided deckard defined degradation demonstrate demonstrated denoted denotes dense depending depth describe described describing descriptive determine developed development didone differ different differentiate differentiation differently dimensional direct direction directly discard discovered discovering discovery discrete disjoint display distance distinguishing dmitrovsky does doing domain down downregulated driven drug dubitzky duda dynamics each earlier easier ecml eerevisiae efficient effort eighth eils eisen either elaborate element elements empowers enclose engineering enhanced environmental equivalence especially estep evaluation even every exactly examine examined example examples exception excision experiments expert explevel explicit exploration explorations exploratory explore explored expressed expression expressions exprlevel extended extends extension extracting fact factors families fashion fast fayyad features fewer fifth figure filter filtered filtering filters finally find finding finds finish first fixed flexibility floreny fluence focus following follows form formal formulated foundation fourth fragment frequent friedman from full function functionai functional functions furey further gather gene genechip geneexprset general generate generated generation genes geneset genesetl genetics genome genomic genomics geometric ggrpbi given golub granularity granzow grids group grouped grouping groups grpa grpb grpc grundy 
gunopulos hand handle handling hart hash hatonen haussler have having head help hematopoietic here heuristic heuristics hierarchies hierarchy high higher holds however http hybridization hydroperoxide hypothesis identification identified identifies idiosyncrasies idna imidinski imielinski implementation implemented implies imply important impose improved includes including incorporate indicate indicates individual individually induced induces influence influences information initial inspection instance instead integrated intelligence interactive interest interested interesting interestingness international interpreting interval intervals into introduce introduced intuitive involved involving irrelevant issue item items itemsets iterative january jelinsky john journal july keyword keywords khrapko kitareewan klemettinen know knowing knowledge knowledgebased known knows kotala kurra lander language languages large leaf learned learning least leaves lent leona level levels lewfl lewin lies lifts like limitations limited linear linial links list literature logic logical look lookup lower lysov machine machines macro macros main major make makes managed management manner mannila many mapped maps march match matched matches matching maximizing means measure measured measurements measures medium memory menlo mentioned merger merging mesirov metabolism metaquefies method methods methylmethane microarray midas millions miners mining minutes mirzabekov mitbander models moderately molecular more moreover most msql much mudivarthy multiple must nachman name names national nature ndsu needs network networks neural newly next nitroquinoline noble nodak node nodes none note number numbers numeric numerical obtained obtaining october ofpredefined often omfigure online only operations operator operators optional options order ordered organize organizing originally other others outlined over overview oxford oxide padmanabhan pairs paladin paper parent parentheses park part 
participating particular path patiem pattern patterns pavlidis peano perera performed performing perrizo personalization pertaining pertinent pevsner pharmacogenomic piatetsky placed point possibilities possible possibly posslble post postprocessing potentially practical precedence predictable present presented presenting presents press previous previously primary prior probabilistic problem problems proceedings process processed processing produce produced profile profiles profiling properties property propose proposed proteasomes protein proteinsynthesis proteomic provide provided provides providing pruning psaila purely purpose push quantifier query querying question questions quickly radiation range ranges rather reality realize reasons recomb recorded records reduce reduced references refers refine regardless regulate regulated regulation regulatory related relationships relatively relevant reliable removed repair repairi replaced represent representative represented represents requirement requiring research respect respectively respond response rest restricted restriction result resulting returns revealed right role roles ronkainen root rule rulebase rulebases rulegrouping rulepart rules runs saeeharomyces said same sample samson scalability scale scales schemes sciences second section select selected selecting selects self semantics sense sequencing sets setting several shapiro shen should sigkdd sigmod significant similar similarity simple simply since single sixth size sized sizes slonim smaller smyth solely some somewhat sons space specific specification specified specifies specify specifying spellman square srikant standard standardized start started statement statistics still storing stork stress structure structures studied study subsequently subset substantially such sugnet sulfonate summarizing summary suppl support suzuki swami syntactic synthesis system systematic systematically table tables takes tamayo tang tautology team techniques template 
templates tend tens terminology tert tested than thank that their them theme then there therefore these they third thirteenth this those thousands three time times together toivonen took toolbox tools total traditional transactions transcfiption transcription transcriptional transformation transformations treated treatment tree trees trends trivial turns tuzhilin twodimensional type types unallowable unchanged unexpected unexpectedness union uniquely university unknown unlike unmanageable upregulated used useful user users using usually uthurusamy utilize validation value values variant various vector verkamo versa very vice view virrnani virtual wang want wanted wants ways well were what when whenever where whether which while whole wide widom wiley will with without work works workshop worth would yeast york zaniolo zhou http://doi.acm.org/10.1145/775047.775054 4 DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints aaai able advances advantage agrawal algorithm algorithms also anonymous antimonotone application areas artificial association august based bases bayardo beneficial between bocca both buneman burdick calimlim calls case chapter chile classes clear comments computer conclusions conference consideration considered constrained constraint constraints convertible cornell data databases delia dense discovery dist dualminer editors efficiently experiments exploiting exploratory faloutsos fast fayyad figure finding fragment frequent from future gehrke generous ghandeharizadeh gifts given good grants gunopulos haas have helpful however hypergraph icde ieee ijcai imielinski implications incorporated information institute intel intelligence intelligent interested interesting international into item items itemset itemsets jajodia jarke joint june kaufmann khardon kinds knowledge kramer lakshmanan large leads learning levelwise long machine mafia management mannila maximal microsoft mining molecular monotone more morgan nice none optimizations oracle 
other oualminer pages pang pattern patterns pennsylvania philadephia piatetsky pods predicate predicates press previous proc proceedings prune pruning push queries raedt references remove results reviewers rule rules santiago seattle seem selectivity september sequential sets seventeenth several shapiro sigkdd sigmod similar simultaneously smyth society soooo space sponsored srikant structures success such swami systems taking thank that theoretical this threshold tiwary toivonen transactional transversals treatment uninteresting used using uthurusamy variance verkamo version very vldb washington well what which with work yields zaniolo http://doi.acm.org/10.1145/775047.775092 30 Predicting Rare Classes: Can Boosting Make Any Weak Learner Strong? aaai abilities ability accuracy achieving acst adacost adenocarcinoma agarwal algorithm algorithms alone also american analysis analyze annals annual anoxia arcing arguments artificial association attempt balance barbara base benchmark best better blake bled boosting breiman buckley butterworths california card case cases chan choice city claim claimed class classes classification classifiers classify classifying cohen collection colon comparable comparative comparing comparison component computational conclusion condition conf conference confidence context corpus cost credit critically data databases datasets decide demonstrate dependent depends detection different disappears discovery distributions dramatic each effective eight eleven emerged empirical eport evaluate evaluating evaluation evidence extracts factor fast fifteen first five fourteen fourth fraud from given haystack heart hersh hickam holte html http icdm icml ieee imbalanced important improve improved improvement improvements induction information infrequently intelligence interactive international internship introduction japkowicz jose joshi just kearns king knowledge krkopt kumar lake large learner learning leone less ling london machine mathematical 
mechanism medline menlo merz mining misclassification mlearn mlrepository molecular needles nine obtained occurring ohsu ohsumed order outcome pages paper papers park performance phase practical precision predictions press problem problems proc proceedings process proportion pvoc qualitative rare rarity rated real recall recent references repository research residency results retrieval right rijsbergen rook rule santa scalable scenarios schapire sensitive sequence sets show sigir sigmod simple singer sixteen sixth slovenia some stand stanford statistics stolfo strong study such support synthetic talaoe target technical technique test than that their theory thirteen this ting topics towards twelfth uniform university update using vazirani very voting weak weight when which with workshop years york zhang http://doi.acm.org/10.1145/775047.775098 35 Mining Product Reputations on the Web aaai able academic access adelberg agent agents agrawal algorithms also analysis analyzing anderberg annotation annual answering answers application applications applied area artificial ashish association attribute automatically autonomous based bases become benzecri beyond both brown building business chandhri change characteristic clark classification closely cluster clusters coden coling collecting collection combining companies comparison compete complexity computational computer concluding conduct conducts conf conference consists construct correspondence cost could course craven criterion ctrec data database decision dekker demonstrate development develp dipasquo discover documents domain doorenbos drastically each easy economy empirically engine etzioni events example experiments extension extracted extracting extraction fall fast february fikes first fisher five florescu form former four framework freitag from fujitsu fukushima fundamental gathers generation governments handbook harabagiu have help html http ieee imagine implies important index individuals industrial information 
informtion infortmation intelligence interest international internet ishiguro isps japan japanese katz kaufmann knoblock knowledge komatsnsoft komatsu labeling language large latter learning levy linguistics lists litkowski located machine maiorano makes management mccallum meeting mendelzon mercel might mining mitchell moldovan morgan natural nigam nodose november occurring open opinion opinions other paper parts pasca people performance portion positioning possible prager prayer predictive press proc processing product products programs proposed providers qninlan quesdtion question questionnaire radev real record reduce references regarding relation remarks report reputation reputations research retrieval rissanen rules saran scalable search security seen semantic semi semistructured sense sentence separated service services share shopbots shopping shows sigir sigmod significantly similar single slattery society soft sources specific spring srihari srikant stochastic structure structured supervised supported survey symfoware symposium system systems tasks tateishi technical techniques terms test text textual that theoretic theory these they this three tice tool trans transaction trec triples typical used users using very vextsearch vldb voorhees vxtsc weld well which wide with within word words works world wrapper yamanishi yaraanishi http://doi.acm.org/10.1145/775047.775118 54 Statistical Modeling of Large-Scale Simulation Data abdulla about access acharya adaptive addition agostino also andersen answer answering approach approximate aqsim aqua article assumptions athens august baldwin brooks cairo chakrabarti cliffs cole company components computational computes conculsion conducting consists constrained course creating critchlow current darling data decreases dekker describe developing devore disk distributed distribution easily edition effective efficiently egypt engine engineering englewood error evaluate evaluating experiments finally finding first freitag 
from future garofalakis gather generator gibbsons good goodness greece grid grove hall help index information infrastructure instead interested investigating jcdl june kamimura kaufmann knowledge large larger layout lessons llnl lozares makes marcel mathematical mean memory minimize mining model modeler modeling models morgan multi multiple multiresolution muntz musick nonrange november octree only optimal original other outliers pacific parallelized parallelizing part particular partitions poosala portland practical prentice press probability proceedings processing processor publishers publishing queries query ramaswamy rastogi record reduces references report require requirements resolution resolutions roanoke root ross scale science sciences scientific scientists seek september sets shim sigmod simple simulation snapp spatial specifically square statistical statistics stephens sting storage storing supporting sweep system systematic tang technical techniques that their time times tree unbiased uses using utilizes variant visualization vldb wang wavelets which will work yang http://doi.acm.org/10.1145/775047.775057 6 Mining Knowledge-Sharing Sites for Viral Marketing aaai account advertising advice agent albert algorithms alls also analysis anatomy annual appendix application approach artificial assume austin australia authoritative automotive baltimore barab barcelona bayesian brin brisbane calculation case characteristics chicago chickering classifier cobot collaborative combining communications complete completes conference convenience customer customers cybernetics data database decision define definition direct discovery discrete dolningos domingos editor effect eleventh elsevier engine environment environmentfor equation estimate european exactly filtering first fixed flkeo francisco frauenfelder free from gelbrich give have heckerman herring howard hughes hyperlinked hypertex hypothesis iacobucci identify ieee induction industry information inside 
intelligence international into irwin isbell iterated iteration iterative iwta jackson jong journal jurvetson kaufmann kautz kearns kleinberg know knowledge korman lambdamoo large learning lifetime liwa loss machine marketer marketing measurement miner mining morgan nakhaeizadeh national network networks ninth notational note oaks obtain optimality page pages pazanni physica point power press probability proceedings proof prove purchasing random reaches recursion references referralweb renaming respectively resulting revenge revolution richardson sage satisfy scale science script search secondgeneration selman seventeenth seventh shah show shown siam simple since singh sixteenth social sources spain springer stanford statistics stone strategic strategies super symposium systems taken tapping targeted targeting techniques that theoretic theory this thousand thus topology transactions true tual uncertainty under unrolling until update value values viral wali waliwa what where which wide will wired with wjiaj wkan wkil wkitxk wkizx wkmrm world wtkiwtatk wtman xwkmm zero zwplo zzwhn http://doi.acm.org/10.1145/775047.775051 2 MARK: A Boosting Algorithm for Heterogeneous Kernel Models additive advances algorithm amer application approximation august bartlett based baxter behind bengio blake boosting bousquet chapelle choosing classifiers collobert colt combining computer conference conjugate cristianini data databases decision department discovery duffy editors elements february frean freund friedman function functional fung generalization generalized gradient greedy hastie helmbold html http hypotheses idea idiap international introduction ipsen journal kernel knowledge krylov large learning leveraging line machine machines mangasarian mason math merz methods meyer mining mlearn mlrepository models monthly msller mukherjee multiple other pages parameters press problems proc proceedings proximal references regression report repository scale scaled schslkopf science 
sciences seventh shapire shawe sigkdd smola springer stanford statistical statistics support system taylor technical techniques theoretic tibshairani tibshirani university vapnik vector http://doi.acm.org/10.1145/775047.775093 31 Efficient Handling of High-Dimensional Feature Spaces by Randomized Classifier Ensembles access accuracy adaboost adaptive additive aleksander algorithm alpha amit annals arcing artificial asia automated averaged bagging based blanchard bledsoe block blocks boostexter boosting bowden breiman browning canada categorization chicago classifier classifiers comparative comparison computer computers computing conference constructing continuous damerau data databases decision depertment development dietterich digital discovery domingo each ecognition enhancement ensembles examination experimental experiments feature forests forward fourteenth framework freund friedman guide harris hastie having icml iiii ijcai image information intelligence international john joint kingston knowledge learner learning logistic machine machines macro memories methods mining morciniec mrcl multiple networka neural online oriented pacific pages pakdd parallel parallelizing pattern pavlov pedersen perform performance predictors proceeded proceedings queen radical random randomization randomized reading recognition references regression reichler report research retrieval rohwer rules sampling scaling schapire science selection sensor sigir singer skillicorn statistical statistics status step stonham study support system systems technical techniques test text theoretical theory thirteenth thomas three tibshirani transactions trees tuple university using vapnik vector view watanabe weiss wilder wiley wisard with within workshop wrappers wrapping yang york http://doi.acm.org/10.1145/775047.775117 53 SECRET: A Scalable Linear Regression Tree Algorithm abalone about academic accuracy adaptive alexander algorithm algorithms also american annals apparent approach approaches 
artificial association assurance attributes august australian automatic available baseball based believe belmont bled boston both bradley breiman cart case chaudhuri chen chose classes classification clustering comparative comparison computation computational computations conclusions conf conference confirmed considerably considered constant construction continuous contribution cornell data databases dataset datasets decision decrease denoted dependent depicted designed detection directions disciplinary discovery discussion down dsin edition efficacy efficiently error estimators european exhaustive experimental experiments expert extensions fact fair family fast faster fayyad figure figures first framework fried friedman from fukanaga functional further ganti gehrke generous gifts ginde golub grants graphical grimshaw grow guide here hessian hopkins huang iberoamerican ignoring immediate impact implemented indistinguishable information institute intel intelligence interaction interactive international into introduced introduction investigate ional italics johns joint journal kanfmarm karalic kaufman kaufmann kernel knowledge large learning leaves linear lnternat loan look looooo machine magnitude make makrishnan matrix measured medium methods microsoft minimum mining models morgan most mostly mples much multi multivariate mumps murthy necessary nevertheless node novel number oblique observations october olshen only order orders other otherwise outperforms overcome pages paper pattern performance piecewise plan point pointing points polynomial possible poster precision press previous principal problem problems proc proceedings programs pruning quadratic quality quiulan rainforest real reason recognition reduces references regression regressors reina reliable report reporting respectively results running same sample sampled sampling scalability scalable scaling school search second seconds secret selection sets shih show significant significantly since sinica size 
sized sizes slightly slovenia small some spent splines split splits sponsered springer starting statistica statistical statistics step stock stone structured study survey synthesis synthetic teeator testing than that their these this time times torgo towards training tree treed trees truly tuples uide unbiased used using variable verlag version versions very virtually wadsworth went will with without work worth would http://doi.acm.org/10.1145/775047.775087 27 Interactive Deduplication using Active Learning aaai active adaptive adding algorithm algorithms allan almost american analysis aoodec application applications approximate april apte argamon artificial association atlas automatic autonomous available barabara based bayardo bengio bibliographic bollacker borkar both buckley burges business categorization census chaudhuri citation classification classifiers cleaning cleansing cleanup cluster clustering cohn collobert comltechlmlcl committee committees computational computer computing conf conference cora costs current dagan data database databases debull december decisions declarative deduplication deshmukh digital dimensional dirty discovery dougherty editor editors effect efficient elkan employing engelson engine engineering entity environment evaluation extracting feedback field florescu formal francisco free freund from function galhardas generalization giles gravano guided hellerstein hernandez high hill html http hylton icde icml identifying idiap ieee improving indexing information intelligence interactive international issue italy iyengar jacob jagadish joins jose journal kanfmann kaufman kaufmann kaufmarm knowledge kohavi koller ladner language large lawrence learing learning less libraries library liere linkage machine machines madison making management master match matching mathematical mccallum mcgraw menlo merge merging methods microsoft mining mitchell model monge more morgan names narasayya navarro nigam oles opper pages panagiotis paper papers 
park parsa pattern pool potters predicates press probabilistic probabilities probability problem problems proc proceedings proceedinrags programs pros providence publishers purge queries query quinlan ramakrishnan raman real recognition record records reed reference references regression related relevance rennie resampling research retrieval rome saita salton sample sampling santa sarawagi scale schohn science search second segmentation selection selective sets seung seventh seymore shamir shasha shavlik sigir sigkdd sigmod simon society software sommerfield sompolinsky special state stolfo string structured support survey surveys svmtorch system tadepalli technology text theory thesis tishby toney tong tools tour tutorial ungar unknown unlabeled unsw using value vector very vldb wheel when whizbang wiley winkler with workshop world york zadrozny zhang http://doi.acm.org/10.1145/775047.775095 32 From Run-time Behavior to Usage Scenarios: An Interaction-Pattern Mining Approach acids aghai aging agrawal algorithmic applications approach artificial assisting associates automated bairoch baixeries balcazar based bergen biological biology biosequences biuk bookshelf brejova bucher cambridge canada carlini casas case catalunya cellest chikofsky cnica code comp comprehension computer computing conf constraint cross cypher data demonstration departament department dept design developments dialog dimarco discovering discovery driven efficient engineering episodes erlbaum event evolution experiments fall fasolino feasibility finding finnigan floratos france fraser frequent from hawaii hidalgo holguin holt human icse identification ieee iglinski inform informatics intelligence interaction interactiorl interface interfaces ireland italy iwpc jonassen journal kalas kapoor kerr knowledge kong kontogiannis lawrence laws legacy lehman line llenguatges llth lucca maintenance mannila matching matichuk methods metrics migration mining modeling models mortazavi motifs mtiller muller 
multiple mylopoulos nineties nucleic number object orgun oriented parnas patten pattern patterns penner perelgut perry plan platforms polit practice press proc processes program programming project proposals prosite quilici ramil rarely recent recognition recovering recovery references related report requirements research reverse school science sciences scient seke sept sequences sequential sets short simoff simon simultaneous sistemes software sorenson spain srikant stanley stroulia structure subsystem symposium system systems taxonomies taxonomy theory thesis threadbased tics tilley toivonen traces traversal univ universitat university unpublished user using verkamo view vinar virtual watch waterloo wcre webpage wernick what with wius wong woods work working workshop york zhang http://doi.acm.org/10.1145/775047.775106 43 Mining Heterogeneous Gene Expression Data with Time Lagged Recurrent Neural Networks abstract acad acids akaike alcoholism alexandrov altman analysis anders anderson applications approach artificial assessing assessment associative attributes automatic available azuaje backpropagation banavar based bauer bayesian bello berkeley beucklaer bicciato biochemical biocomp biocomputing bioinfomaatics bioinformatics biokdd biol bishop botstein bronberg brown budding calculation carmel categories cdna cell cells cenet cerevisiae changes characterizing choice cieplak cleaver cluster clustering complex comprehensive computation conference control criteria cross cycle data department derisi design didone discovery discussion diseases disparate display does dopazo dudoit dynamic dynamical dynamically echols editors eisen environmental epidemiol expressed expression factors falk fedoroff folds forgetting functional futcher gasch gene genes genet genetic genome genornic george gerstein gradient growing haeseleer haghighi hall harel hawaii haykin heidelberg herrero hierarchical highly hogg holter huberman human hybridization icnn identification ieee including 
information interaction international issue iyer jansen jersey joint journal kelemen krebs kummer learning liang look lymphoma manual maritan mathworks matlab memory microarray miller mining mitra model modeling mulholland natick natl network networks neural normalization nucleic online oxford pacific pandin park partslist pathways pattern patterns pearlmutter perspective pnas predictions prentice press proc proceeding proceedings processing program programs protein proteins publicly qian qualitative ranking raychaudhuri recognition recurrent references regulated release report research response risk river rost royal saccharomyces saddle sanfrancisco science series sherlock sherriff site socci society somogyi speed spellman sporulation statistical statistics stenger stometta stone storz structural stuart supervised suppl sutphin symposium syrup system systems technical teichmann temporal thieffry thomas through time transaction transactions transcfiptome transcriptional university unsupervised upper user valencia validatory werbos what whole wide wilson with workshop yang yeast zhang zurada http://doi.acm.org/10.1145/775047.775068 14 Relational Markov Models and their Application to Adaptive Web Navigation academic adaptive administrivia advances agents agrawal aiicoursework aiiexams aiilectures amsterdam analysis anderson answering appear appendix applications apply artificial association assortment assortmentdefaulto autonomous based bases between billsus boutique cadez case cation chains challenge clustering collection comprising computation computer conceptual conference contain context course courseoccurence courseoccurenceother courses coursesamplecode coursesite coursesiteother coursewebso coursework courseworkcode courseworkgeneralother courseworkother data databases departments derived described devices domain domingos each editors eenth engineering etzioni evaluation exam existing explicitly flat framework from gazelle graduate ground grouping haddawy 
heckerman hidden hierarchies home identifying ieee imielinski independence induction information intelligence interesting interior international items joint jordan journal kddcup knowledge large leaf learning lecture lectureother lectureothergeneral legcare level lifestyles link machine maillndex mailmessage management many markov meek mining mobile models most mozer muramatsu national navigation netherlands networks neural ninth node nodes number only oourse oourseoeeurence other pages parameters particular path patterns pazzani perkowitz personalizing petsche prediction press probabilistic probability proceedings processing product productdetaillegcare productdetaillegcaredefaulto productdetaillegwear productdetaillegwearassortment productdetaillegweardefault productdetaillegwearprodassort productdetaillegwearproduct productdetaulegwearprodcollect provost queries rabiner ranking recognition references relational relations relative reverse root rules sarukkai schemata science section sectionother selected sensitive sequences sets seventeenth sigmod site sites smyth sortby speech structure study swami syskill take tenth term terms theoretical these they third thirteenth three towards tree turnin tutorial undergraduate urls users using values variables vendor vendoro visualization washington webert weld which white wide wireless with world year http://doi.acm.org/10.1145/775047.775110 46 Frequent Term-Based Text Clustering achieves advanced agent agents agrawal algorithm algorithms already analysis antonie aone applications approach apriori associating association australasian automatically average based better bisecting boly browse browsing categories categorization chakrabarti chile classification classifying clustefings cluster clustering clusterings clusters collection comparable comparison competitors comprehendible conclusions conference could cutting data database databases demonstrated demonstration description directions directory discovered discovers 
document documents dustefings duster easier effective efficient etzioni evaluation experimental exploration explorations fast favors feasability finally finding flat fmeasure forms frequent furthermore future gather general generate generated generates generation gini greedy gross groups guntzer hastings hftc hierarchical hierarchies hipp however http hypertext implementation integrating integration introduced introduction itemsets java john karger karypis kaufrnan kumar kurnar large larsen like linear many means measure mentioned mining mobasher moore more nakhaeizadeh natural naturally note notes novel observe occur outline overlapping paper partners pedersen precision presented proc quality real recall references research rousseeuw rule rules santiago scatter secting sets setting sigir sigkdd significantly similar some sons speed srikant state steinbach such survey techniques term terms test text textmining than that their this time tukey tutorial using values variants vldb webace which wiley with workshop would yahoo yibin yields zaiane zamir http://doi.acm.org/10.1145/775047.775086 26 Sequential Cost-Sensitive Decision Making with Reinforcement Learning adacost advanced analysis appear approximation apte archive arlington artificial athena automatic barto based bertsekas bibelnieks bibliographies bibliography boosting bootstrap both california cambridge campbell canada chan classifiers computer conf conference connectionist control cost costs council cued data dayan decisions delayed departement department dietterich difference discovery domingos dynamic efficient elkan engineering estimators evaluation extractor fifth foundations francisco from function general http ieee ijcai infeng information institute intelligence international introduction irvine joint journal kaelbling kaufmann knowledge large learning line littman machine making margineantu marketing massive metacost method methods mining misclassification modeling moore morgan natarajan national 
nelson neuro niranjan optimization ottawa pages pednault press probabilities proc proceedings programming regression reinforcement report research rewards rummery scale sciences scientific second segmentation segmented sensitive sets seventeenth seventh siam sigkdd statistical stolfo survey sutton systems targeted technical technology temporal thesis tipu transactions trees tsitsiklis turney university unknown using value virginia wang watkins when with workshop zadrozny zhang http://doi.acm.org/10.1145/775047.775101 38 Mining Intrusion Detection Alarms for Actionable Knowledge aaai academic accumulating acknowledgments acquisition actionable adam adaptive addition advances agencies aggregation alarm alarms alert alerts algorithm allen also among analysis annales annual anomaly applications approach artificial assessing assurance attack attacker attribute attributes author automate automated bace barbar barbara based bayes bellovin benefit biased bisson bloedorn boston broader broderick cactus card carnegie carofalakis case categorical causes cercone cerz chan chapman choose christensen christiansen christie class classic classification classifications clifton clustering clusters commission communications computer computing concentrate concept concepts conceptual conclusion conf conference constructing construction correlation cost couto created credit criteria critique cunningham custom dacier dain data databases debar december demonstrated derived desirable detecting detection detector developing dimacs discovered discovering discovery discrete displaytc distributions down driven dynamic editor editors education efficiency engineering entries environment episode episodes estimators european evaluation event evidenced experiments explained exploration extensive facilitates fawcett fayyad features federal find finding finland first fisher fithen flynn found framework fraud frequent from frramework funded fusing future ganti gehrke general generality generalization 
generalized generally generation gengo gordon group guha hall handling hansen have heinonen hellerstein helsinki helsinky herein hermiz heterogeneous heuristic hierarchies hill historical hoagland homogeneous host hzzp ides ieee ilgun improve improving incremental induction information insights intelligence interest interesting international internet into introduced intruders intrusion intrusions investigate investigated investigation issues jain jajodia january jaumard javitz julisch kemmerer klemettinen kluwer knowledge laboratory large learning likely lincoln lncs logic loth lters machine macmillan maftia make management manganaris mannila mathematics mcalerney mchugh measurement meets mellon methodology michalski milcom military mine minimal mining mladenovic models modified moreover murty necessarily nemesis network networks nonetheless novel number numerical oakland objectively observation occasionally occurences occurs october office operate order oriented outsourced outweighs over oworld packets pages paper papers partially pattern patterns paxson piatetsky pickel pitt point polynomial popyack porras portscans power practical practice press prevent privacy probabilistic problem process processing produce project properties provost publisher publishing purposes pursued quantitative raid ramakrishnan rastogi real recent references refinement reflect reinke relational relevant repetitive report representation research resulting results review revised robust rock root rtid rule rules scalable scenarios science second security seen selecting sequences series several shapiro shim shown siam sigkdd sigmod skinner skorupka smyth software solution source specific specifically springer staniford state statistical stealthy stepp steps stolfo stoner stream streams study subsequently suitability suitable summaries support supported supporting surveys swiss symposium system systems talavera talboot taxonomy technical technique techniques technologies telecommunication 
telligenee tends tglgcommunications them theoretical there these thesis third this those three time tivel toivonen toward transactions transition triggering undo uniform university used using uthurusamy valdes validity value verkamo verlag versus very viability views ways well wespi where which will with work workshop world zerkle http://doi.acm.org/10.1145/775047.775049 0 Bayesian analysis of massive datasets via particle filters aaai academic adaptive advances altschuler american analysis applications association asymptotics based basic bayes bayesian berger bernardo berzuini besag biometrika boca cadez calculations carlin carlo chain chains chapman chapter chemical clarendon clustering cohen computation computing concepts construction cowell data databases dawid decisions degroot discovery discrete discussion distributions doucet dumouchel dynamic dynamics edition editors elder empirical engineering equations expert fast fayyad figueiredo finite following freitas gelman geman gibbs gilks glymour gordon green hall hastings heckerman higdon hill ieee images imputation inference information instance intelligence jeffreys journal kluwer knowledge kong learning lessons likelihood louis machine madigan march markov mcgraw meek mengersen methods metropolis microsoft mining missing model models monte motoda moving nason navigation neural nips optimal oxford pages pattern patterns perspective physical physics piatetsky posse practice pregibon press prior probabilistic probability problems proceedings process processing publishers raghavan ramoni raton references relaxation report research restoration richardson ridgeway rosenbluth ross royal rubin sampling science sciences sebastiani section selection september sequential shapiro site smith smyth society some sparseness spiegelhalter springer squashing state statistical statistics stern stochastic systems target technical teller their themes transactions using uthurusamy verlag visualization volume white with wong yang 
york http://doi.acm.org/10.1145/775047.775139 75 Combining Clustering and Co-training to Enhance Text Classification Using Unlabelled Data able accuracy addition advances another artificial assessment augmentation australian available averaged background bartlett based beyond binary blum breakeven cambridge case category cfco class classification classifier classifiers cluster clusters colt combination combining computational concepts conf conference cotrained could creating crisfianini data derived document documents each early ecmlo editors enhancing error european examples experiments fact feature features ferni ferr figure figures filter final fourteenth francisco from full gain generated given goldman have hence high higher hirsh icml improve improvements improves indeed indicate inference information intelligence international investigation joachims joint kaufmann kernel kernels knowledge kowalczyk labeled labelled labels lang large learner learning level lower machine machines major management margin marginal matrix maximal maxirnising mccallum membership methods micro mitchell morgan netnews news newsweeder nigam nineteenth number obtained obtains only optimization order original other over pages parameters perceptron performance point presence press proceedings projectin provide publishers quality raskutti references regularization results retrieval reuters scholkopf schuurmans scores second self sets seventeenth shawe shown significant similarity sixteenth small smola space statistical study such suggests summary supervised support table taylor tenth text than that themselves theory these they this thought through thrun thus trained training transductive twelfth ultimate university unlabeled unlabelled used uses using vapnik vector very webka webkb when which wiley with word workshop wpco zelikovitz zhou http://doi.acm.org/10.1145/775047.775076 19 Enhanced Word Clustering for Hierarchical Text Classification aaai about agrawal american analysis annual 
ansi applications arithmetic august baker based bayes bayesian becher bekkerman bell berkhin bias binary biometrics book cald categorization chakrabarti classification classifications classifier classifiers classifying cluster clustering clusters colloquium communication company comparative comparison complexity computer computing concept conditions conference cover curse data databases decompositions deerwester dhillon dimensionality discovery discriminants distributional divergence documents domingos duda dumais ecir ecml edition efficiency elements english entropy equivalence european event every feature features floating forgy friedman furnas garey generalized goldberg gray harshman hart hierarchical hierarchically hill hofmann icml ieee indexing inform information interpretability introduction joachims john johnson journal know knowledge koller kullback landauer language large latent learnin learning leibler lloyd loss machine machines many math mathematical mccallum mcgill mcgraw measures meeting mining mitchell modeling models modern modha multivariate naive nature navigating neuhoff nigam optimality pages pattern pazzani pedersen pereira point power press probabilistic problem proc proceedings quantization raghavan references relations relevant report research retrieval sahami salton science scientist second selection semantic shannon should siam sigir signatures simple slonim society sons sparse springer standard star statistical stork study sufficiency support surveys system taxonomy technical text theory thomas tishby toolkit trans under using vapnik variance vector verlag very vldb what wiley winter with witsenhausen word words workshop yang yaniv york zero zjrd http://doi.acm.org/10.1145/775047.775147 83 B-EM: A Classifier Incorporating Bootstrap with EM Approach for Data Mining agrawal algorithms applied august babu binderberger bootstrap bradley castelli chapman chapmann classification classifier clustering conference continuous cord corelfeatures 
cover data database databases decision discovery documents effect exponential extending fayyad features fifth fourth from geoscience hall html http hughes ieee image induction information international introduction knowledge labeled landgrebe large learning letters machine march mccallum mehta mining mitchel mitigating mixing monograph nigam ortega over pages parallel parameters pattern phenomenon probability problem proceedings queries quinlan recognition record reducing references reina relative remote sample samples scalable scaling sensing sept shanhshanhani sharer sigmod size small sprint statistics streams technology text theory thrun tibshirani transaction transactions trees unknown unlabeled value vldb widom with http://doi.acm.org/10.1145/775047.775053 3 Selecting the Right Interestingness Measure for Association Patterns aggarwal agrawal agresti american analysis april army artificial asia association associations attributes august barber baskets between beyond biases brin cambridge canada categorical center characteristic cikm computational computer computing conf conference contingency data database databases definite dependence diego discovered discovery editors estimating estimation evaluating evaluation fifth finding flairs florida fourteenth framework frawley from gaithersburg generalizing generation george hall hamilton hand high hilderman hong ieee ijcai imielinski information intelligence interesting interestingness international items itemset john joint journal june kamber klemettinen knowledge kong kononenko kumax large makes management mannila market maryland mathematics measure measures mining montreal mosteller motwani multi november oregon orlando pacific pages pakdd patterns performance piatetsky portland positive prentice presentation press principles proc proe pruning ranking references report research right ronkainen rules seattle second selecting series sets shapiro shinghal sigmod silberschatz silverstein smyth solution sons sparse 
srivastava statistical strong summaries summarizing swami symposium systems tables technical toivonen transactions tuzhilin valued verkamo washington what wiley http://doi.acm.org/10.1145/775047.775133 69 A Robust and Efficient Clustering Algorithm based on Cohesion Self-Merging able acknowledgments addison advantage algorithm algorithms also applications apply april attains attributes authors average based basket bennett best between beyer biased birch bradley capability categorical chain chen china chuang cluster clustering clusters cohesion combining compexity complexity compsa computation computer concentric conclusion conducted conf conference constrained constraint council cure data database databases december demiriz density discovery distances dubes effective efficiency efficient engineering european execution experiment experiments faloutsos faster features figure first flynn from goldstein gravity guha hertz hierachical hierarchical hybrid icdt ieee improved incurring information input institute inter internation international introduction jain journal knowledge krishna krogh lakshmanan large larger lead lecture like linear linearly link livny longman management many market meaningful means measure measurement merit method methods microcosm microsoft mining moreover much murty national nearest neighbor neural note notes only optimal orders other outliers over overview oyang pages palmer paper part partitional pattern perspective phase powerful practice principles prior proc procedure proceedings project proposed publ ramakrishnan rastogi ratios reading recognition references relationship republic research resist results review robust rock running runs same sampled sampling santa science sciences seconds seen sept sequential series sets shaft shim shorter show shown shows siam sigmod similar similarity similarly single size sizes small smaller software spatial studies study subsets supported surveys taiwan than that theory these this time times trans tung 
using value values various very vlaues vldb wesley when whereas while whose with works yang zhang http://doi.acm.org/10.1145/775047.775113 49 Extracting Decision Trees From Trained Neural Networks acquire addressed advances algorithms also analysis appears applicable applications arto atlas back backpropagation barnard basic beginning behind believe best between binary black boxes buntine chapman chapter cial classdiff classical classification cole comparison comparisons comprehensible computationally computer concepts conclusion conference connor constructive converting craven criteria critical data dataset deal decision dectext delity density denver department detroit developed dietterich discriminators editors eighth eleventh empfical empirical engineering estimation etxractor evaluating evanston expemive experimental extracting extraction extracts fidelity fisher from further general generalization generalizing global good hall harder heart here housing however humans important induction inductive information initial inputs intelligence international joint kapouleas kaufman kaufmann knowledge learning lehigh like local london machine madison marks mateo mckusick method methoddisc methods modelinstances models mooney more morgan morn most murphy muthusamy nets network networks neural niblett ones output pages pattern pazzani performance press problem problems proceedings processing propagation random readings real reasoning recognition references regular report reported represent representations required researchers results rules safety science sciences setzero several sharkawi shavlik silverman some special splitting splittingmethod statistics structured suitable symbolic systems table technical technique than these thesis they three towell trained training tree trees type understand understandability understanding university unseen used very volume vote weiss were wisconsin with workshop world http://doi.acm.org/10.1145/775047.775129 65 Clustering Seasonality 
Patterns in the Presence of Errors about accomodate accurate agrawal algorithms allows also although american analysis applications approach associated association assumption based basic both bucklin calendars carl case classical classification clustering come common computing concept conference consecutive correlated correlations data databases decision demonstrated dependence developed development different difficult discounting discovery discussionand distance distribution distributions donald dubes duxbury dynamic eamonn edition effect empirical encounter enhanced error errors estimates estimation euclidean fast feedback finally flynn fourth from function future gafney gaussian general generalization generalized give grouping hall harpreet have hierarchical implications improve improving inadequate incorporates incorporating independent industry information international introduced invariant issue jain john jorge journal keogh king knowledge kopalle kyuseok lawrence less made manufacturers many marketing marsh mathematical measure mela method methods michael mining mixtures models more morrison murty negative noise normative objective observations often optimize other padhraic paper pazzani planning points positive praveen prentice presence press pricing problem proceedings promotion property rakesh randolph real references regression relevance representation research retail review rice risso sales sample samples sawhney scales scaling science scott search seasonality second section series sets shim showed sigkdd silva similarity simulated smyth statistic statistical statistics subsequently support surveys system test that theme therefore this time timeseries traditional trajectory translation under used utility values very viewed vldb volume ward when which while with working http://doi.acm.org/10.1145/775047.775148 84 A Unifying Framework for Detecting Outliers and Change Points from Non-Stationary Time Series Data able accurately activity adaptive akalke 
algorithm algorithms amer among analysis applied approaches asakura awaji axis barnett based basis beginning behavior black bubble burge cellular change changes characterized complexity concluding conference consists corresponding data datasets deal dealt decay demonstrates denote density detect detected detecting detection dimensional discounted discounting discover discovering distance each earthquake economy effect employed enabled estimation event example fawcett filtering finite fisher framework fraud from function gave gersch give gradually great guralnik guthery hanshin have hawkins here high hinton horizontal html huskova identifying ieee incremental incrementally index information interesting introduced japanese john kddgo kitagawa knott large learn learned learning lecture letting lewis line linear logarithmic loss management meaningful method milne mining mixtures modds model modeling models monday monitoring murad neal nonparametric notes noticing observe occurred order original other outlier outliers ozaki paper parameter parameters part partition parts past piecewise pinkas pkdd point points practices press priors probability problem problems proc procedures proe profiling property proposed prototypes provost publications radford real reduced references regression remarks risk rissanen royal rules score scores scoring sdar sequence series shaw shoten show shows significant simple smoothness society sons sources sparse specifically springer srivastava stationary statist statistical statistics stochastic strategy superimposed takeuchi taylor that thatjustifies theory there this time toronto trans unlabeled unsupervised used using variants verlag vertical view vldb where which while wiley williams with yamanishi http://doi.acm.org/10.1145/775047.775128 64 Finding Surprising Patterns in a Time Series Database in Linear Time and Space aaai aberrant abstract accurate activity adaptive algorithm algorithmica algorithms allows alphabets analysis annual 
anomaly apostolico application applications applied approach approximation april artificial assoc association august automata automation based bases behavior behaviour billari biology biosequences blockeel bock calendar cambridge chakrabarti chakrabaxti chandhuri change changes chapman classification cluster clustering comput computational computer conf conference construction dasgupta data database databases deformable demographic department description design detecting detection detectors deviants different dimensionality discovery economical editors efficiency efficient engineering enginesring enhanced event experimental expert extended faloutsos farach fast fault fawcett feedback feller fifth finney forrest foundations from function furnkranz global group gusfield hall hannenhaln hawkins hori house hoyt huang ideas identification identifying ieee immunology implementation improve indexing industrial information instruments intelligence intelligent interest interesting internat international introduction istrail jagadish joint june kato keeping keogh knowledge kotsakis koudas large length lengths level line linear locally lonardi london ltracy mach madigan management mannila manolopoulos maps markov matching mccreight mehrotra method mining model molecular monitoring monographs monotony monterey multi muthukrishnan myers noticing novelty optimal outliers overview pages park pattern patterns pazzani pevzner prec preceedinmgs predicting press principles probabilistic probability proc processing properties propulsion provost prskawetz purdue queries query quest ranganathan recomb record reduction references reinert relevance renganathan representation research review robotics rule rules sarawagi scale schbath science sciences scientific searches selow sequence sequences series shahab sigkdd sigmod similar smyth space special statistical statistics strings subsequence subsequences suffix surprise surprising switching symbolic symposium system systems templates 
temporal test theory thesis tian time tree trees trend ukkonen university unusual using very visualization washington waterman wavelet weiner which whitehead wijk wiley with wolski words yairi yoon york zhao http://doi.acm.org/10.1145/775047.775143 79 Non-Linear Dimensionality Reduction Techniques for Classification and Visualization access adaptive advances agrawal algorithm analysis animating approximation beckmann bellman bentley blum chakrabarti chan classification classifier college company component comprehensive computation computer computers confference control coordinates cover data databases datasets dept dimensional dimensionality dimsdale discriminant domeniconi efficient embedding faloutsos fast fastmap flexible fodo foundation framework friedman from geometric geometry girosi global grouping gunopulos hart hastie haykin html http icde ieee indexing information inselberg intelligence jolliffe kaufmann keogh kernel kriegel large learning lecture linear lnfovis locally lowe machine macmillan manifold manolopoulos mapping matching mclachlan mehrotra merz method metric mining mlearn mlrepository morgan multidimensional multimedia murphy nearest neighbor neighbors networks neural nips nonlinear notes pages parallel pattern pazzani pdints peng perception perona poggio points polito press princeton principal proc proceedings processes processing programs publishers publishing quinlan random ranganathan recognition rectangles reduction references report repository robust saul scaling schnei science search sequence sequential series sets sigmod silva similarity slagle space springer stanford statistical statistics subsequence subspaces swami systems tech tenenbaum theory tibshirani time tool traditional trans transactions tree triangulation univ university variable verlag visualization visualize visualizing ward wavelets ways wiley york http://doi.acm.org/10.1145/775047.775090 29 Exploiting Unlabeled Data in Ensemble Methods aaai advances alch algorithm 
algorithms ambroise approach arnbroise artificial atkinson available bagging bartlett bauer baxter becker benchmark benchmarks bennett berlin bischof blake blum boosting cambridge caruana classification classifiers classify clustering colt combining comparison competition computational conference data databases datasets demiriz design dietterich discovery documents dorffner editor editors elements embrechts empirical engineering ensemble ensembles experiments february first francis frean freund friedman from functional fung genetic ghahramani gradient graepel grandvalet grove hastie herbrich hornik html http hypotheses iaai icann information integrating intelligence international journal kaufmann kearns knowledge kohavi kremer labeled large learned learning limit lncs machine machines maclin mangasarian margin marginboost mason maximizing mayo mccallum mccauum merz method methods mining mitchell mixture mlearn mlrepository models morgan neural nigam nips obermayer opitz optimization pages partitioning popular press proceedings processing publishers raetsch recursive references repository research rpart scale schapire schslkopf schuurmans semi seventh sfunc sigkdd skremer smart smola software solla springer stacey stat statistical street study supervised support supunsup system systems taylor techniques test theory therneau thrum tibshirani training tsch unlabeled unsupervised uoguelph using vapnik variants vector verlag voting wiley with workshop york