http://www.informatik.uni-trier.de/~ley/db/conf/icdm/icdm2005.html ICDM 2005 http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.4 100 A Framework for Semi-Supervised Learning based on Subjective and Objective Clustering Criteria able academic accuracy achieves acknowledgments algorithm anderson application applied approach approaches assess august average averaged award background based basu below bilenko bioinformatics blum both cambridge cannotlink cardie career caruana class classification cluster clustering code cohn combining comments commission comparative compare comparison competing complete computational computing conf conference conjuction considered constrained constraints costly data dataset datasets december default described diabetes discovering discussion distance documents domeniconi each eissen equivalence euclidean evaluate evaluated evaluation experimental experimented expression external feedback fifth figure finding flannery framework from function funded gene given gives grant gunopulos halkidi have hertz higher hillel html http icdm icml ieee implement independent information integrating interaction international involves ionosphere iris jordan july keogh knowledge koller koutroubas labeled labels lack learned learning like link machine matlab mccallum means measure ment methods metric mining mitchell mlearn mlrepository modifications moif molecular mooney moore mpck mpckmeans must naive need nevertheless nigam nips noisy nonparametric number numerical observe obtained only optimal optimization order other outperforms over overview pages part partition partitioning pathways pattern percentage performance points press probabilistic proceedings prof proposed protein provided providing randomly recipes recognition references related relations report repository respectively results rogers runs russell same satisfy schroedl scientific seeding segal selected semi semisupervised september shental side similar soybean space spam stein structure 
study subspace supervised supported synthetic technical teukolsky text than thank that their theodoridis theory these three thrun thus training true university unlabeled unsupervised used user users using validity valuable varied various vazirgiannis vetterling wagstaff wang weinshall wibrock with work would xing http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.130 87 Sequential Pattern Mining in Multiple Streams Gong Chen Xindong Wu Xingquan Zhu about across algorithm although appended appending approach associated available average avoided avoiding balancing become becomes benefits better both built case cases change chen close cohen collected compared comparison computer conclusion conference consistent consistently containing contains data dayal dealing depth described different discovery discussion distinct distributed distribution distributions does each efficiency efficient empirical encoding even expensive explains extensive faster fifth figure first formed frequently from further furthermore gains generated greater growth happen happens have however icdm ieee improve improvement incorporated incorporating increase increased indeed indicates international into intuitively kind knowl knowledge larger learning length lengths longer lowest machine made manage mannila mechanism memory mile mined minimized mining more mortazavi much mult multinomial multiple note number oates optimization order others outperforms over overlapping pages paper paragraphs parts pattern patterns performance pinto prefixspan previous prior probability procedure proceedings process provided ratio redundant references renganathan repeatedly report result results rule runs scalability scanning science search searching section sequential series sets show shown shows significant significantly smyth some sometimes speeding statistics still stream streams structure such suffix suffixes technical techniques than that these this three time token tokens trans unif uniformly unique 
university upon varying vermont wang were when which will with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.133 125 Spatial Clustering Of Chimpanzee Locations For Neighborhood Identification Sandeep Mane, Carson Murray, Shashi Shekhar, Jaideep Srivastava and Anne Pusey across algorithms analysis animal approach behavior behavioral bivariate cambridge carlis chimapnzees chimpanzee chimpanzees clustering competition computational computer conference cressie data dataset definite dept different directions discovery dissimilarity distance distances domain ecological ecologists editors effect enable experimental extending farm female fifth francis function future geographic george gives gombe goodall hall harvard hence icdm identification ieee include influence inhomogeneous international kamber kfunction knowledge large latter locations main male mane marked mathematics measure methods miller mining minneapolis minnesota more murray neighborhood overall paper patterns point positive prentice press proceedings proposed pusey question ranging references report requires research results science series shekhar show solution space sparse spatial spatio srivastava stable statistics studied study survey systems taylor technical techniques temporal territorial that these this tung university using were wiley williams york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.37 81 Blocking Anonymity Threats Raised by Frequent Itemset Mining Maurizio Atzori Francesco Bonchi Fosca Giannotti Dino Pedreschi agrawal anonymity association atzori based bastide between bonchi calders closed conference data databases derivable discovering fifth fimi frequent fuzziness giannotti goethals helsinki http icdm icdt ieee imielinski international items itemsets journal kanonymous knowledge lakhal large mining model pasquier patterns pedreschi pkdd privacy proc proceedings protecting references rules sets sigmod supd supok swami sweeney systems taouil uncertainty 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.34 136 Bias Analysis in Text Classification for Highly Skewed Data Lei Tang and Huan Liu Department of Computer Science & Engineering Arizona State University Tempe, AZ 85287-8809, USA {l.tang, hliu}@asu.edu after analysis arizona balancing batista bayes before behavior bias biased categorization cause change class classification classifiers comparative conference costly data decision decreases discrimination distribution effect effective empirical experimental explains explor extensive feature features fifth figure forman found full good grobelnik highly icdm icml ieee improve improves induction insensitive international jair kaufmann learn learning mach machine measure methods metrics mining mladenic monard more morgan naive negative newsl over pages paper pedersen performance prati proc proceedings provost publishers ratio references report results sampling select selection several sigkdd skewed state study such suggest svms tang technical text than that this training tree trees unbalanced university version weiss when which yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.150 146 Visualizing Global Manifold Based on Distributed Local Data Abstractions abstractions aforementioned applications artificial august baptist based basford been bishop blue bottom cheung circles clustering compared components computation conference curve data dekker density different directly distributed edinburgh ence estimates failed fifth figure found from generative ghosh global grant hkbu hong icdm ieee ijcai infer intelligence international joint klusch kong learned learning local lodi manifold mapping marcel mclachlan means melbourne merugu mexico mining mixture models more moro neural november numbers observed obtained ones only original others pages part particular preserving privacy proceedings proposed references region research reveals sampling seen shown similar situation source svensen tangled that third those 
tion topographic unfold unfolded university using well when which williams with worst york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.55 137 Efficient mining of high branching factor attribute trees Alexandre Termier, Marie-Christine Rousset, Michèle Sebag, Kouzou Ohara, Takashi Washio & Hiroshi Motoda account acknowledgments advances agrawal algorithm algorithms almeroth also analyzes application approach april arikawa arimura arlington asai asia association attribute available avril based behavior benchmarks both branching brighton chalmers characteristics chile closed cmtreeminer community computation conclusion conference corpus data databases dataset depends detailed directions discovering discovery documents dryade dryadeparent efficiency efficient efficiently eighth embedded england especially experiments extend extraction factor fast faster fifth figure fimi find first forest frequent from fundamenta future gains general giving global grant graphs have heterogeneous hooking http icdm ichi ieee implementation implementations improving inductive infocom informaticae international into issue itemset itemsets july kawasoe kind kiyomi know knowledge large logic making march maximal mgts mining modeling more most motoda multicast muntz nakano nasa nearly nijssen nodes ohara output pacific pages pakdd paper parameter partly patterns performances perspectives phdtermieren plan polynomial presented proc proceedings programming proposed publis random reconstructs references report research robust rousset rules sakamoto santiago science scientific sebag second semi sequences settings shown siam sigkdd some special srikant strategy strongly structure structured structures substructure substructures subtrees such supported takeaki taking technical termier test tests than thank that these third this thorough tiles time tree trees unaffected unordered vldb washio wish with work workshop yang zaki 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.8 144 A Join-less Approach for Co-location Pattern Mining: A Summary of Results Jin Soung Yoo, Shashi Shekhar, Mete Celik Computer Science Department, University of Minnesota, Minneapolis, MN, USA jyoo,shekhar,mcelik@cs.umn.edu advances agarwal algorithm algorithms answer approach association based bases berg changes chawla class computational conference data database databases datasets dense discovery does efficient evaluation expensive experimental explore fast fifth frequent future geographic geometry hall huang icdm identify identifying ieee information instance instances international join joins knowledge koperski kreveld large less location maine materialization method methods mining morimoto moving neighborhood neighboring objects outperforms over partial patterns plan prentice proc proceedings questions references require results rules scalable schwarzkopf sets shekhar showing shows sigkdd similar since spatial spatio springer srikant sstd such summary symposium systems temporal time tour very vldb well with work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.138 109 Supervised Ordering -- An Empirical Survey absolute accordance accordingly accuracies accurate acknowledgments acquired adaptively advantageous affect affects after against agresti akaho algorithm algorithms also always analysis analyzing applied apply arnold articles artificial attribute attributed attributes balakrishnan based because become beginning bollmann boosting cannot categorical categorization cause central chapman clickthrough cohen combining comparison complete complexity computation computers condition conf conference constitute content convergence conversely correct correctly could course critchlow data demerit depression discovery does drastic drop efficient empirical engines error estimating estimation european examples experimental fifth figure filtering first fitting fligner freund from functions furthermore 
generalized graepel grant hall hence herbrich hirao homepage however http icdm icml ieee improve improved increase increasing indicates inferior information intelligence intended interchanged international iterations iyer japan joachims john journal kamishima kazawa kernel knowledge large larger learn learners learning learns less levels like lnai machine maeda mainichi majority many marden mathematical measure method methods mining mixtures modeling models monographs more most much nagaraja newspapers next noise noises noted number numerical obermayer objects observed optimizing options order ordered ordering orders ordinal osvm other over pages pairs part performance permission permutation perturbation practical practically prediction preferable preference preferences probability problems proc proceedings promotion property psychology rank rankings rather references relations relative represented research respectively results retrieval rich robust robustness rounds same sample schapire science sdorra search serious should show singer slow slower slowness society sons spearman stage standard statistics stop study such suitable support supported svor systems takes tend tested text than thanks that their them these they things this though thus time types unique until upon used using values various vectors verducci volume weak were while whose wiley with work workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.27 70 Approximate Inverse Frequent Itemset Mining: Privacy, Complexity, and Approximation Yongge Wang Xintao Wu UNC Charlotte, {yonwang, xwu}@uncc.edu about according agrawal algorithms analysis antwerpen appl application approach approximation april asked association aware axiomatization based bases basket been benchmark between both business calders complexity computation computational computer computers conf conference confidence constraint constraints continuous data database databases deduction dexa digital discovery discrete distributions 
efficient fagin feasible fifth frequency frequent frequently generally generate generation georgakopoulos guide halpern hard have having heuristic html http hundreds icdm ieee imilienski include information international inverse items itemset itemsets kavvadias knowledge kohavi large leakage level linear lncs logic machines management maniatty market mason math megiddo mielikai mining more necessary needed order pages papadimitriou parallel particular pentium performance pods potts powerful ppdm preserving press privacy probabilistic probabilities problem problems proc proceedings program programming proposed purpose questions ramesh real reasoning references regular regularly rule rules satisfiability scheduling sets showed siam sigkdd sigmod size society solved solving springer support swami synthetic testing that then theory these thesis this thousands thus together transaction trust trustbus universiteit unix unrelated variable variables verlag wang with workshop world zaki zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.121 88 Privacy Preserving Data Classification with Rotation Perturbation Keke Chen Ling Liu College of Computing, Georgia Institute of Technology {kekechen, lingliu}@cc.gatech.edu accuracy adjust aggarwal agrawal algorithms analysis approach attacks based basic breaches chen classification classifiers column component condensation conf conference conflict cryptology data database datta deriving design edbt evfimievski experiments extending factors fifth find from gehrke greatly guarantees huang hyvarinen icdm ieee improve independent information international interscience intl journal kargupta karhunen limiting lindell locally loss many meanwhile measured metric mining multi optimal optimality paper perturbation pinkas pods possible preserving privacy private proc proceedings properties propose quality quantification random randomized references report rotation sacrificing show sigmod sivakumar srikant technical technique techniques 
technology terms that this treated wang where wiley without zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.36 124 Bit Reduction Support Vector Machine accurate active adaboost adaptive advances algorithms also analysis approach association august available based been better boley brsvm bundling chang choice chudova classification clustering cochran complete compress compressed compression conclude conclusion conference cormen cybernetics data database dept desired dice dimensional dimensionality discovery discussion done edition elena empirical eschrich evaluation examples experiments fast faster feature fifth florida from further fuzzy goldgof groups hall happen help high hopkins html http icdm ieee image images international introduction john karger kernel knowledge kramer learning leiserson library libsvm likelihood limitations lower machine machines margins merz method methods michie might minimal mining mlearn mlrepository more most muller multiple murphy nets neural noted obtain onoda optimization over overall owen pages paper part particle pattern pavlov plankton platt potentially prediction press proceedings process profiling proposed random ratio ratsch recognition recognize recognizing recorder reduce reduces reducing reduction references relatively remsen rennie report repository required resolution rivest sampling samson scalable selection sequential sets shadow shih should shows siam sigkdd significant significantly similar simple sixth smyth soft sons south space speed speedup speedups spiegelhalter squashing statistical statistics stein such support svms system systems taylor technical techniques tends text than that their therefore this through time together towards training transactions twentieth types university used using vector version very volume well when which wiley with work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.128 114 Semi-Supervised Clustering with Metric Learning using Relative Comparisons Nimit Kumar, Krishna 
Kummamuru, Deepa Paranjpe IBM India Research Lab, Block-1, Indian Institute of Technology, New Delhi-110016, INDIA. nimitk, kkummamu, dparanjp@in.ibm.com accuracies accuracy advances agrawal algorithm algorithms also amount application augmented basu best bilenko binary bottleneck case centers centroids classes closure cluster clustering clusters code compared comparisons conclusions conference confusing considers constraint constraints could data datasets deterministic different dissimilarity distance document effect expected explore falls fifth figure figures greater have hmrf http icdm icml identical ieee improve improvement improves increases increasing information initialization initialize integrating international investigate joachims jordon kmeans krishnapuram kummamuru labeled labelled learning like means measures method metric mining misleading mooney mpck multi neural news note number observed often only page pages pairwise part percentage performance performs proceedings processing project real references relative risc rmpck russell samples schultz seems semi show shown side sigir sigkdd significantly size slonim spatially sssvad success supervised supervision svad systems textual than that theo this through thus tishby triplet used users uses using utexas variant various ways when which with word would xing http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.17 82 Adaptive Clustering: Obtaining Better Clusters Using Feedback and Past Experience Abraham Bagherjeiran, Christoph F. 
Eick, Chun-Sheng Chen, Ricardo Vilalta accuracy achieve achieves adaptive advances against analysis application applied approach artif artificial assess assessment assign atkeson average bagherjeiran based berkeley best better case classification classifier classifiers classify cluster clustering clusters comparing conf conference costly data datasets dayan decision defined discrete discussed distance domain dynamic edition eick enhance existing externally fifth figure find function functions furthermore gain granularity hall helsinki however hynninen icdm ieee improved improvement improvements incoming information informative instance instances intell intelligence intelligent international john jordan kaelbling kangas kohonen laaksonen learn learned learning less level littman machine markov mathematical maximize mcqueen methodologies methods metric mining modern moore more most multivariate neural norvig objective objectives observations only package pages particular prentice press prioritized probability proc proceedings processes processing program programming puterman quality quantization references reinforcement report representatives require results reward russell same saratoga several side significant similarity some sons springs statistically statistics stochastic study sufficient supervised survey sweeping symp systems technical technique techniques technology than that this those time torkkola university used user uses using vector very vilalta watkins weights whole wiley with worse xing york zeidat http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.21 36 An Algorithm for In-Core Frequent Itemset Mining on Streaming Data aaai aaaociation adlrnnrrx agrawal algorithm algorithms aoftwaindx aoying apecd approximate april apriori assoc association ationrules august available babu bags bahcock bart based borgelt burg canada candidate ccbcrg chcn chile chong chriatoe chrirtos christan closed cnnfrrrnc cnnfrwnre cnnfrwnrr cnsf conf conference confrrrnrr conj 
conrrrmnry context core counts covfrworr ctime ctsrc data database databasra datar datasets discovery distributed dnscovery dntn dntnbna dntnbr dnto doft dola ducauen duto editors effecitive efficient element enqmecnnq ernt ethals false fasf fast fayyad fifth fimi finding frcqucncy frequent from fuzzy gagan generatton gfor giannella gocthals goethals granularities haixun haji hash helsinki herkeieyy hidbcr high hongjun html http icdm ieee implementation interactive international inti intl intrmottonnl issues itcmsctr items itemsct itemsets iverkamo ivldb jeffrey jiawei june karp knolrliedaed knowlegbe knowler ksrypis kumar large lhneoctxonr lmplcmentstions lntrrnntronni lnvcrted lrtrqr lset ltcmect magd maintaining manku mason matrix menlo mining mininu minlng minlnq minxnq mlmng mnnnqrmrnt mnnoqpment models mohammad mohammed moment motwani mtnmg munta mzmng mznll natn navathc ncenrn negative nezt noft november npei ofntheg ogihara ohlo omiecinski ondotu online orred osmar over pages paper parallel park parthasarathy patterns performance philip pmrcrdrnqr pmresdmq pnges pods positive press proceedings proceedxnqa pspadimitrtoua qpdntn qrofs real references renr report richard rmrprrhr rnchard rnrlrryand rulcs rule rules ruoming sampling santiago savasere scalable scott september shanker sigkdd sigmod sliding slmple software srikant state stream streaming streams survey svatrn symporntn systems technical toivonen toronto transactional tree university updated vcreion version vldd vldm vldr wang widom window without wnnd workshop world xlfcng xnnrrlrrlqv zaiane zaki zheng zhihong zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.91 24 Making Subsequence Time Series Clustering Meaningful Jason R. 
Chen Department of Information Engineering Research School of Information Science and Engineering The Australian National University Canberra, ACT, 0200, Australia Jason.Chen@anu.edu.au accrue achieved addition adopted algorithm aligned along analysis approach area areas attractors average avoided babcock based beach because been bellcylinder berkhin both broken callaghan cambridge causes choosing class close closely cluster clustering clusters complex computer conclusion conclusions conference contrast correct correctly corresponding curved cycle cyclic data database datar date definition delay detecting determining deterministic diego different discovery distance distances distinct distinctive dynamical dynamics engineering equal euclidean even experiments feng fifth figure final finally findings first flawed foremost formal forming foundations funnel further future fuzzy gave general generally guha here hope huang icdm idea identification identifying ieee implications including indeed international into intrinsically introduced invariant involves jose kantz keogh knowledge lead lecture length maintaining making math meaningful meaningless means measure measuring medians members method methods minimum mining mishra more motwani multivariate nature nonlinear notes number oates obtained other outcome outcomes over pages paradigm paradigms patterns phase pods point presented press previous principles problems proceedings produced producing promising really recommended reconstructed redondo references regard regions report represent representation representative research results return rise roddick schreiber science second sectrix sensible sequences sequential series should showed similar similarity since sliding software soon space spiliopoulou springer step strange stream streams suboptimal subsequences survey symposium system systems takens taking technical technique techniques temporal than that there these this those time together total transactions truppel 
turbulence underlying univ values variance very volume ways were when where windows with work york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.145 48 Training Support Vector Machines using Gilbert's Algorithm accelerate accurate acknowledgements adatron advances advantages ailing algorithm algorithms almeida almost also america annual applied areas artificial association automation available averages barros based because been bench bennek bennett between bhattacharyya bias blake boser both breast bredensteiner burges california cambridge campbell cancer carnegie certain chine christianini classifier classifiers close coded colt company complexity computation computational computer computing conclusions conference considered contract control convergence convex corporation cortes cpdee cristianini data databases department dependences dept design development difficult directed discovery discrimination discussion dismissed distant dodusaf editor editors effect efficient either empirical energy engineering england english estimation existing explorations fact fast faster fewer fifth figure final finds first form fortran france freund frie from furthest generally geometry gilbert gile girosi gorini grant guppy guyon hallelujah hand harrison haussler have heuristic heuristics home hospital html http hype icdm ieee improved information inseparable institute intelligence international interscience into introduction irvine iterating iterative itory joachims john journal jplatt kantrowitz keerthi kernel knowledge kopf laboratories laboratory large largest last learning likely linear linearly litc little locating lockheed machine machines made madison making mangasarian marcelo margin mark martin mathematical matlab measures mechanical mellon merz method methods microsoft might minimal minimum mining mlearn mlrepository modification more morgan moscow most motivated moves mpessk much multiprogram murthy musicant national nauka nearest networks neural nice number 
office operated optimal optimization origin osuna other over page pages passed pattern pittsburgh platt point polytechnic polytopes possible practical press principe proc procedure proceedings processing produced production program programming project proximity quadratic reaches recognition references related relatively relaxation rensselaer report repos repository representations research resulting robotics robust runs russian same sancheti sandia sathiya scale schol school science sciences scientific second sequence sequential sets shawe sheffield shevade shtml siam sigkdd signal similar similarly simple simpler since singapore situation slow slower smobr smola soft software specifically springer stage stages states statistical subset successfully successive support supported svms taylor technical technology than that theory these third this those thus time training transactions translation triangle tutorial ufmg under united university using vapnik vector vectors verlag viewpoint washington well when where which wiley will william wilson wisconsin with wolberg work workshop york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.113 39 Orthogonal Neighborhood Preserving Projections E. Kokiopoulou and Y. Saad Computer Science and Engineering Department University of Minnesota Minneapolis, MN 55455, USA. 
{kokiopou, saad}@cs.umn.edu academy advances alizadeh alon also american analysis armitage arrays aspects association barkai belkin benavente bengio bernhard bioinformatics biology boldrick botstein bottom broad brown byrd cambridge canada cell chan classification clustering colon comparison comput computational conference data database davis delalleau diffuse dimensional dimensionality discriminant discrimination discussions distinct dlcl dudoit edition editors eigenmaps eigenvalue eisen embedding expression extensions face fifth figure fridlyand from gene gish globally grant greiner grever halstead help icdm identified ieee information insightful institute international isomap janardan journal laplacian large lawrence learning left leukemia levine levy lewis like linear locality locally lossos lymphoma machine mack manifolds marti martinez methods mining minnesota moore national nature neural nips niyogi nonlinear normal notterman numerical oligonucleotide onpp ouimet paiement panels paper pattern patterns powell preserving press probed problems proceedings processing profiling projections providing recognition reduction references report representation research revealed right rosenwald roux roweis saad sabet sample samples saul scholkopf science sebastian sherlock spectral speed statistical staudt supercomputer supported systems technical thank theory think this thrun tibshirani tissue tissues tran transactions tumor tumors type types uncorrelated unsupervised using valuable vancouver vapnik various vincent warnke webb weisenburger wiley wilson with work would xiong yang ybarra york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.96 34 Mining Minimal Distinguishing Subsequence Patterns with Gap Constraints Xiaonan Ji James Bailey Guozhu Dong ability afshar algorithm allow also analysis answer approach appropriate arikawa ayres based before being believe best between bible bide biology bitmap bitset brinkman building butler cakes capture casas categorical 
certainly chan checking chen chief choosing christophe cikm classes classification classifiers closed clospan comput computational concluding conference consgapminer constraint constraints contrast contrasts contributions dasfaa data databases datasets dayal detecting differences different difficult dimensional discovering discovery discussion distinguishing domains dong each efficient emerging employ employing episode episodes essential ester eternal examination experiment experiments expressive factor feature fifth find first fishes flannick focused following forgiveness form four framework frequency frequent from future gardy garriga gehrke gemma good group groups growth hand have high hirao hoshino however human icde icdm ieee important incorporating information insignificant intelligent interesting international introduced intuitive item items jianyong jiawei journal jumping kingdom knowl knowledge large last learning length lesh lexicographic life limiting machine made maintain maintained major making mathee maximum mdss meger membrane mentioned michael mined minimal minimization mining more mortazavi most motifs much narasimhan newlands news nicolas node number ogihara only operation operations optimal other outer output overall pages paper particularly pattern patterns pazzani performance performed pinto pkdd pleasing post practical prediction prefix prefixspan present presenting previous priests problem proceedings processing protein proteins pruning quality question ramamohanarao reducing references related remarks representation requires restricting results rigotti rules sarah saying scalable seated section sequence sequences sequential sets shinohara show shrinks similar size sizes snapshot some spade spirit straightforward strong studied studying subsequence subsequences substrings such support syst systems table takeda taken tang technique techniques tested that their theor there these this threshold time tkde tradeoffs tree trends truly unbounded 
unclean understandable used useful using utilising versus very wang webb well will window with within work works would zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.85 23 Labeling Unclustered Categorical Data into Clusters Based on the Important Attribute Values Hung-Leng Chen, Kun-Ta Chuang and Ming-Syan Chen Department of Electrical Engineering National Taiwan University Taipei, Taiwan, ROC E-mail: {kidd,doug}@arbor.ee.ntu.edu.tw, mschen@cc.ee.ntu.edu.tw accrue accuracy acknowledgements addison addition adjusted adopted algorithm algorithms allocate allocation also apply appropriate approximate attributes average baeza bell berkhin best between blake bradley cactusclustering calculated categorical characteristics claim clarans close cluster clustering clusters communication comparison complexity compute computing conclusions conf conference contracts council cure data database databases dempster developed difference different discov discovery discrete distribution dubes each efficient efficientthan engineering entire evaluation even experimental experiments extensions fayyad fifth fifty flynn from ganti garey gehrke generalized guha hall hettich high hill html http huang icde icdm ieee incomplete indicating information international into intra jain johnson journal just knowl knowledge label labeling laird large learning likelihood linear lloyd lowinter machine management mardl mathematical maximum mcgraw means merit merz method mining mishra mlearn mlrepository modern more mostly murty named national neto nodes number objects oblinger obtained only other paper parameter part perform phase pitt point points prentiche preserves prior problem proc proceedings proposed quality ramakrishnan rastogi real references reina report repository represent representative respect result results retrieval review riberiro right robust rock royal rubin sampled sampling scaling science sets shannon shim shown shows siam sigkdd sigmod significantly similarity size 
society software spatial statistical study sublinear summaries supported survey surveys symposium system systems table taiwan techical technical technique techniques than that theory these this three time total trans transactions under unlabeled using utilized validates value values very wesley when which whileattaining with witsenhausen work works yates http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.3 89 1 above abramovich adams adapting also analysis annals appear approach argue arises automatic autonoma avoid balderas bart based benjamini bennett biodiversity body bookstein breneman carpio carpiodes carpsucker characters children classification classifiers colorado computational conclusion conclusions conference conflicting constraining contradicts contreras controlling could cyprinus data describe desirable diagnostic different dimensionality discovery distinct donoho drawn ecology effective embrechts even false feature features fication fifth following four framework from generated geometric grande honor icdm identification identified identify ieee indispensable institute interesting international ital johnstone joint journal jubilar known lawton learned learning leon libro louisiana lozano machine machines majority mexico michigan mining monterrey morphometrics most museum nuevo number observation obtained only overfitting pimm planning portion preliminary previously procedure proceedings progress promising proposed publ question random range rate reason reasons recognized reduction references relatives removed representative research results revolution river rohlf salvador science selected selection separated shape should significantly similar slice small song southern sparse sparsity special species specimens statistics support supported suspicious suttkus table taxonomic test texas that this those three trust universidad university unknown using vector viewpoint well were which without work workshop years zool zoology 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.77 40 Higher-Order Web Link Analysis Using Multilinear Algebra Tamara G. Kolda, Brett W. Bader, and Joseph P. Kenny Sandia National Laboratories Livermore, CA and Albuquerque, NM {tgkolda,bwbader,jpkenny}@sandia.gov acar accelerations access acknowledgments acrobat additional administration adobe advances advice algebra algorithm algorithms alqaeda already also alternative although amtepe anal analogous analysis analyzed analyzing anatomy anchor annu another apparent applications applying approach approaches appropriate approximation articles artificial associated attribute authoritative authority automatic available aviationweather bader based basic bauer been being benefits berry between beyond bharat blackledge blondel both brian brien brin bringing called carroll center chakrabarti chang chatroom chemical chen child citation classes cleal climate cluster cognitive cohn combination company compilation comput computation computational computations computer computing conditions conf conference connectivity content contract control convergence copy corporation cost created credit cubesvd currently dashoptimization data david davison decision decomposition decompositions deeper deerwester department dependent developed differences digital diligenti ding directed directions discoweb distillation document documents domingos done dooren dumais eccv eckart efficient efficiently eigenvectors eiron elsevier energy engine ensembles environment even example exist explanatory exploiting explor extend extending extraction factor fast fifth figure find fire first focused foundations framework free friedland friedman from funded furnas further future gajardo gallagher geladi generalization gerasoulis getoor gibson gleich golub golubuhou google gori govbenefits grabitech graph graphs guide handle harshman have haveliwala henzinger here heymans high higher hits hofmann home hopkins horizontal host http husbands hyperlink 
hyperlinked hypertext hypertextual icdm icml identify ieee ijcai ilog image importance improve improved improvement income indebted indexing individual inform information inside instance intelligence intelligent international internet isdn jensen johns joint jordan kaufmann kleinberg kleisouris known kolda koller krishnamoorthy kumar laboratories laboratory labs landauer langville large latent learn learning lecture lehigh lempel library linear link lncs loan lockheed looking mach maggini many martin math mathematical mathematics matlab matrix mccurley mcgovern measure mendelzon meyer million mining missing modal mode model modeling models moran more morgan mosek motwani much multi multidimensional multilinear multiple multiprogram multiway national natl ncdc ncep need neos network networks neural neville news newsl noaa nohrsc nonzeros notes novel nuclear office operated optimization order organization orthogonal overall page pagerank pages palisade papers parafac part patterns personalized phonetics physics plato pointers policy poster potentially prediction premise press principal probabilistic probabilistically proc procedure proceedings processing program project proposed prototyping provides providing psychometrika publication publishers pubs qaeda queensu query rafiei raghavan rajagopalan rank ranking reader reduce referees references relational relevant remains represents reputations research resource restricted results retirement retrieval revenue richardson salsa same sand sandia scale scaling science sciences score scoring search searching security semantic senellart sensitive service sets severe shang should siam sigir sigkdd similarity simon size skill skillicorn smilde social soft software solver some sources sparse sparsified spider springer spyderopts stability stable stanford stanley state states step stochastic storage stored structure study studying subgraphs submitted such suggestion surfer synonym syst system systems taskar taxes tech 
techniques technologies tensor tensorfaces tensors term terms terzopoulos text textual than thank that their them then there these they this thomas three tomkins toolbox tophits topic topics trans travel travis tree tucker ucla under understand unified united univ univie usdoj used users using vasilescu vector vectors verlag vertical vertices volume wang weather well what which while wiley will winograd with work working yener young zeng zhang zheng zhukov http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.98 55 Mining Patterns of Change in Remote Sensing Image Databases Marcelino Pereira S. Silva, Gilberto Câmara, Ricardo Cartaxo M. Souza, Dalton M. Valeriano, Maria Isabel S. Escada abandonment accurate acknowledged acknowledgements actor adam adolescence advanced aksoy alberta alves amaz amazon analysis annual application applications approach archives area aspects available barbara based been bins boundary brazilian breen campos canada capes categorization chang change chen childhood classification classifiers clue clusterbased cnpq communication comparison computer computers concepts conference content contentbased contributions cosit cover current data datcu definitions deforestation dependent description detection directions discovery domains dynamics early ecology editor editors edmonton effect enabling encyclopedia engineers england enhance environment environmentrics erthal evaluate fapesp faster fifth fonseca forest forested forestry foundations fraction fragstats francisco frank geist geographic geoscience geosciences germany gilberto graphics gratefully growing hermes herzog high http huang icdm identify ieee image imagery images implementations including information inpe institute integrated integrating intelligence intensification interactive international issues january java john journal kaufmann khan knowledge krovetz lambin land landsat landscape landscapes learning lepers libraries like logics machine mapper mara marcelino marks matching 
mcgarigal meinel method metrics miguel mining modelling monitoring monteiro montello more morgan multimedia must national neubert object online ontologies ontology open oriented paris part partially pattern patterns pekkarinen perform perimeter photogrammetry picture piegorsch practical press probabilistic proceedings process processes processing prodes program programs project promising quantifying quinlan reasoning recognition references region regions relevant remote report representation research resolution resources retrieval review rond rushing santa satellite satellites schober schr science scientists seattle segmentation semantics sensing sensitive service seventh shaarawi shade shimabukuro signal silva simplicity smeulders sons space spatial specialists specialized specific spring springer structure such supported supporting sussex symposium system systematics systems tasks technical techniques thank their thematic theory third toolkit tools training trans transactions tropical turner uern unsupervised usda using very viii visual wang washington what wiederhold wiley with witten work workshop would xxxv years zucker http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.69 52 Finding Representative Set from Massive Data acknowledgement algorithm analysis andritsos annual approach attains based basu better between bilenko bottleneck build captures categorical ceedings characteristics classes classical classification clustered clustering clusterings clusters compared computer conclusion conference constraints cover coverage data database dataset datasets defined design desired dhillon difficult divisive document documents efficiently elements experiments extract faster feature fifth finding first foundations framework friedman from generate good grant graphtheoretic greedy hastie have hellenic hochbaum however icdm ieee information instead international jounal kannan karypis kumar learning limbo machine mahadevan maintains mallela manually massive maximum 
method methods miller mining mooney more most mutual naval original other paper partially pathria performance proach probabilistic problems proceedings quarterly redundancy references replaces report representative require research results same scalable science search selection semi sevcik show sigir simplified sivakumar size slonim small special spectral springer statistical storylines straints subset subsets summarizing summary supervised supported symposium technical tends text than that then theoretic theory this thomas through tibshirani tishby transactions tsaparas tung using verlag version veta wang which wiley word words yang york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.30 128 Automatically Mining Result Records from Search Engine Response Pages accuracy achieved advanced also approximate artificial august automatically based boolean building canada celebi center chang charan china compared completely complex computer computing conclusions conf conference content could creating data demand demo detecting discoverer discovery doorenbos dowling editorial effective effectiveness efficient efficiently engine engines evaluation experimental explorations extend extract extraction failed fifth first from future general grossman hall having high highly hongkong http icdm ieee iepad immediate improve induction ineffective information intelligence international intl issue japan joint july katukuri kushmerick lafayette language lego louisiana mainly matching measures meng metasearch mine mining mostly multiple mundluru noise operations outperformed page pages paper pattern perfectly plan precision primary proc proceedings proposed purpose query raghavan reason recall record records references reflects region regions remover report response result results return search showed showing shows sigir sigkdd significantly solution sources special state step string studies support supported surveys system systems technical that these this thus towards university 
used weld when whereas which while will work workshop wrapper zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.51 74 Discriminant Analysis: A Unified Approach academic addison advances algebra algorithm algorithms american analysis answers application approximation arsenin aspects association august bartlett based belhumeur berlin best bias biased brabanter bulletin cambridge chen cheng choice choices class classification classifiers complete comput computation computational conference confidence cucker data dealing direct discovery discriminant discriminants duda each edition editors eigenfaces eigenfeatures elementary error estimation eugenics evgeniou extraction face feature fifth fisher fisherfaces found foundations framework francisco frangi friedman fukunaga function fung further generalized gestel given good hart hertz hespanha highdimension hoerl icdm ieee image intelligence international introduction john journal kennard kernel knowledge kpca kriegman learning least liao linear machine machines makes mangasarian math mathematical mathematics measurements mika minimize minimizer minimizing mining moor multiple networks neural nonorthogonal norm notices number october optimal pages palmer parameter parameters pattern peng philosophical plataniotis plus poggio pontil posed press problem problems proceedings projection provost proximal question questions recognition references regression regularization regularized retrieval ridge sample scene scientific simple singapore size smale small society solutions solve solved sons specific squares srikant state statistical such support suykens swets system taxonomic technology technometrics term that then theory there these thesis they this tikhonov trans transactions unique university using vandewalle vapnik variance vector vectors venetsanopoulos volume washington well weng wesley what where which wiley with world yang yield york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.87 35 Learning 
Instance Greedily Cloning Naive Bayes for Ranking accuracy accurate aiming algorithm algorithms anneal arff attributes audiology autos balance bayes believe breast called cancer classes cloning colic compare comparisons complexity conference correction credit data datasets decision description desirable deviation diabetes directions downloaded effective efficiency enhancing especially experimental experimentally experiments extending fact fifth find format from future glass greedily heart hepatitis higher http hypothyroid icdm ieee igcnb improving induction instance instances international ionosphere iris labor laplace lazy learning letter locally lwnb lymph machine main mean measured meet method mining missing more motivated mushroom naive name needs neighborhood numeric orig other outperforms performance potential prdownloads present primary problem proceedings provides pruning ranking recommended references relatively research results scale segment selective sets show sick significantly simply snnb sonar sourceforge soybean splice standard statlog summary table test that these they this time tree tumor used using value vehicle vote vowel waveform weighted weka when whole with without word work yielding http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.80 15 Improving Automatic Query Classification via Semi-supervised Learning according acquired adjectives agglomerative algorityms american approach appropriate automatic automatically autonomous beeferman berger butterworths carroll categorizing cikm classbased classification clustering cognitive communications computational conference content cover data disambiguating document elements engine evaluating experiment experimentation feedback fifth formulas foundations geographical gravano greiff hafner hatzivassiloglou hill icdm ieee indexing induction information international jones journal kang krauth kupper language learning lewis lexical lichtenstein light linguistics locality logs london machine management 
manning mccarthy mcgraw mezard mining mitchell model models natural networks neural nouns optimal optimizing pennsylvania physics popular pragmatics preferences presented press proceedings processing queries query relationships resnik retrieval rijsbergen salton sample schutze science search seatlle selection selectional sigir sigmod size stability statistical statistician statitical systems tague text theory thomas transactions type university user using vectorspace verbs wiley with wong words yang york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.46 57 Compound Classification Models for Recommender Systems Lars Schmidt-Thieme Institute of Computer Science, University of Freiburg, Germany lst@informatik.uni-freiburg.de adaptation adapted addison algorithms american analysis antonio architecture artificial asia association autocorrelation automatic automating balabanovic based bayesian bergstrom bias billsus breese burke cacm callan cause chapel chen cikm class classifiers cliffs collaborative comp comparison computer computing condliff conf conference content contentboosted coop cscw data decoupled department descriptions deshpande development discovery document dynamic editor effectiveness effects eighteenth empirical englewood evaluation experiments exploiting factors feature feedback fifteenth fifth filtering filters fourteenth francisco furuta goldberg grouplens hall heckerman hill hofmann human hybrid hypertext iacovou icdm icml ieee improved information intelligence interaction international intl item iyengar jensen july kadie karypis kaufmann knowledge latent lausen learning lewis linear linkage london machine machines madigan madison maes management melville menlo methods mining mixed modeling models mooney morgan mouth multi nagarajan national netnews networks neural neville nichols open optimization order pacific pages pakdd park pazzani posse predictive preferences prentice press proc proceedings processing product publishers publishing 
ratings recommendation recommendations recommender relational relevance report research resnick retrieval riedl salton schmidt science second selection semantic shardanand sheffield shoham sigchi sigir smart social springer stotts structure study suchak support survey swir syst system systems taiwan tapestry technical terry thieme training trans transactions twelfth uncertainty university user using vector verlag weave wesley with word work working workshop york zhai zhang ziegler http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.140 102 Suppressing Data Sets to Prevent Discovery of Association Rules actual address adversary aims algorithm algorithms approaches association average avoids aycah azgin bertino besides better blocking both caused clifton conclusion conference confidential containing dasseni data december different discover discovery distances drawbacks effects elmagarmid existing experiments fact fifth figure forward guess heuristic hiding hintoglu http icdm ieee improves inan increase increasing information international keskinoz knowledge kullback leibler loss major margin mining modified moreover number oliveira original overhead paper percentage perform performance preserved prevent proc proceedings proposed protecting real reconstruction record reducing references respect rule rules sabanciuniv safety sanitization saygin sensitive sets showed shown side sigmod slight strategies students suppressing suppression tends than that this time tkde unknowns usefulness using values verykios which with zaiane http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.62 93 Fast Frequent String Mining Using Suffix Arrays abouelhoda about acids adapted algo algorithm algorithmica algorithms allows alphabets also always amino answering application applications approach approaches arikawa arimura array arrays athlon available bairoch based bears because before behavior better between biological boost both builds burkhardt cambridge case chosen combined common 
compared comparison compiler complete comput computation computer conference constraint constraints construct construction consuming containing contains contrast could data database databases dataset deal dementiev depend design different directed discrete discussion disjoint efficient efficiently embl enables engineering enhanced entries environment evaluation even evolution exhibit existing experiments extensions external extracts extremely factors fair fakult fast faster fastest favst ferragina fifth figure file first fischer fixed forced frequency frequent from further future group groups gusfield handled held heun high higher highly icdm ieee inductive inen informatik information instead interesting international into kasai kdid kramer kurtz large larger less lightweight line linear lncs locality longest ludwig manber mannila manzini maxfreqquery mehnert memory method methods microarray mined minfreq minimum mining minutes motivating much myers nchen nice nucleic nucleotide ohlebusch only options other outlook outperforms output pages paper parison park partitions performance phylogenetic pick piled prefix presented preserved press proc proceedings progress project proposed protein query raedt random real reasons recent references release relevance replacing report resource respectively results revealed rithm rrna running same sanders saving scalability scaling searches second secondary seen selected sequence sequences sequential several show shows siam similar similarity size small society software some space species springer storage striking string strings structures subset subsets substrings such suffix swat technical test tests than that their theory this three thus time times towards tree trees tricks trie tryout twice under uniprot unit universal university used using value values varied varying very volume were which whole will with work workshop worst write xanthomonas http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.1 67 A Bernoulli Relational 
Model for Nonlinear Embedding Gang Wang, Hui Zhang, Zhihua Zhang, Frederick H. Lochovsky Hong Kong University of Science and Technology, Hong Kong according across advances algorithms ality along also although american analysis applicable application applied artificial avoids belkin bernoulli between block blocks blue brand brendan brighter busby center chung circles class classes clustering components computation computational conference consequently considering construct coordination cost crosses data dataset decomposition dekel describe develop different digit digits dimension dimensional dimensionality discrete distance distribution distributions each edition efficient eigenmaps eigenvalue either embedded embedding embeddings enlarged especially example existing experimental explicit explored face faces fifth figure figures florida following four fourth framework from future gaussian geometric global goldberger gplvm graph green hall hand have high highdimensional hinton icdm identities ieee ijcai images information inspired intelligence international introduces inverting isomap iterative joint jordan kernel kolman kpca labels langford laplacian large latent lawrence lead learning left like likely line linear local locally lower magenta maintain mathematical matrix meaning memisevic method methods metric mining model models motivated motivates much muller multiclass multinomial multiple nabney naturally nature need neighbor neighborhood neighbourhood netlab neural niyogi nonlinear note number only optimization order orientation over overlap pages pairwise part pattern physical plot plotted plus points ponent prentice prior probabilistic problem problems procedure proceedings process processing properties quite random readily recognition reduction references relation relational relations represent required respectively ressell results right ross roweis running salakhutdinov same saul scholk science separated show shown side signs silva since singer small 
smiles smola smoothly society some space spaces spectral springer squares stars statistics sticks stochastic structures systems techniques tend tenenbaum than that theorem theory there these thickness this those tongue turn turns twin types unifying upper used uses usps utilized variable variables vary very visualisation visualized volume well west when which with within would xing yellow http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.151 16 ViVo: Visual Vocabulary Construction for Mining Biomedical Images above acad accuracies acknowledgements aerial allow american analysis annals annotation another applicable applications applied apply approach assessment automated automatic automatically available avery baek barnard bartlett based believe between beveridge biological biologists biomedical boland browsing burges cancer captures categorizing cells cellular characteristic charteris civr class classes classification classifying cognitive collection color comp comparisons component computer concepts conclusions conference confidence contents contribution corporation correspond cytometry data databases dced deerwester delayed derive describe describing description detachment detection differences different discovery distinguish diverse domain donations draper dumais duygulu early eccv efficacy eigenfaces electron entropy essential evaluation experimental faces features fifth first fisher fixed fluorescence focus follows forsyth framework freitas from furnas ganapathy general generality genomic geoffrey gift gives google grammar grants grumman harshman highlights hyvarinen iccv icdm ieee image images immunofluorescence immunological implications important independent indexing information intel interesting international interpretation introduction january jeon john jolliffe journal june karhunen keywords klockars knowledge kovace laboratory landauer large latent leading learning leigh lewis lexicon like linberg localization location machine machines main mammalian 
manjunath manmatha markey matching maximum meaningful method methods microarrays micrographs microscope microscopy might mines mining mojsilovic mpeg multiple murphy neurobiol neuroscience northrop number numerical object ophthalmol oren other otherwise oxygen pages papageorgiou paper part partnership parts pattern patterns pentland perspective photographs pita poggio porreca principal problem proc proceedings process processes processing progress propose proposed protein proteins providing publications quantitative readily recognition recognizing references remodeling research respectively resulting results retina retinal retrieval reveal robert robust safranek sage salembier science sciences screen second section semantic sensor sequences series sethi showing signal significance sikora sivic sixth social society sons sources spatio springer steven structures studies subcellular successful successfully succinctly such summarization support supported talaga technique techniques technol temporal terms text texture thank that therapy thesaurus these this tiles trans translation treatment turk tutorial ucsb understanding univ unnoticed unsupervised using validate vector velliste verardo video videos vision visual vivo vivos vlsi vocabulary volume when wiley will with work would zisserman http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.59 148 Example-Based Robust Outlier Detection in High Dimensional Datasets ability acknowledgements adaptive addison aggarwal algorithms analysis arbor barnett based been behavior best beyer breunig charu class clustering comp computation conf conference contours correlation curse data database datasets deal demonstrated density depth designed detection dimensional dimensionality dissertation distance especially example examples experiments faloutsos fast fifth find flynn from genetic gibbons goldberg goldstein grant high html http icde icdm identifying ieee incorrect indicated integral international jain john johnson jong jsps 
kitagawa knorr kriegel kwok large learning lewis local loci machine meaningful method mext michigan mining mlearn mlrepository murty nearest neighbors optimization outlier outliers pakdd papadimitriou part philip proc proceedings ramakrishnan reading references relevant research review sander scientific search shaft sigmod sons statistical subspaces supported surveys systems theories this tolerance university user using vldb wesley when wiley with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.114 105 Pairwise Symmetry Decomposition Method for Generalized Covariance Analysis Tsuyoshi Idé IBM Research, Tokyo Research Laboratory 1623-14 Shimo-tsuruma, Yamato, Kanagawa 242-8502, Japan goodidea@jp.ibm.com according addison advanced analysis analytically anomaly application applications arnold author based beside best between combinations conclusion conference considering correction covariance covariances cross cumulant cumulants data defined demonstrated detection detects distribution each effectively elsewhere existing expansion exploits extending fifth figure finally first five found framework from functional gaussian generalized geometric group have icdm idea ieee international inui irreducible japan jebara journal kendall kernel knowledge known kondor kubo learning limitations lissajous london machine mechanics method methods mining model modern next nonlinearities nontrivial only onodera other pairwise parameters pattern patterns physical physics proc proceedings proposed published publishers quantum real recognition references relationships remarkable representations sakurai series sets several showed shown shows simplest society solvable specific springer statistics stuart summarize takes tanabe task telos terms that their theoretical theory think this through time tool traditional trajectories using utility value vectors verlag viewed volume well wesley where will work world zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.132 19 Shortest-path 
kernels on graphs Karsten M. Borgwardt and Hans-Peter Kriegel Institute for Computer Science Ludwig-Maximilians-University Munich Oettingenstr. 67, 80538 Munich, Germany {kb|kriegel}@dbs.ifi.lmu.de about accuracy acids acknowledgements advances akutsu algorithm algorithmen algorithms alternatively alternatives analysis annual application apply approach area attributes bank beneficial berman best between bfam bhat bioinformatics biwiss bmbf boolean borgwardt bourne brenda burges cambridge challenge chang characteristics classification clustering colt comm comments comparable complexity component computable computation computational computations computer computing conference connection connections considered constructive convolutional could creates cruz cyclic data database defined department designing developments dietterich different diffusion dijkstra direct discovery discrete domain drucker ebeling ecml editors education efficiency efficient engineering enrichment enzyme even expensive exploited exponential expressivity extensions faster feng fibonacci fifth finding first flach floyd fredman free from function functional functions further future gart genome genomes geometric german germany gilliland giving good grant graph graphen graphs gremse hard hardness haussler have heaps held heldt hierarchical higher horn horvath huhn icdm icml ieee improved improvement include information inokuchi interesting intermediate international into issue jacm jordan jungnickel kashima kaufman kernel kernels knowledge kondor kopf kriegel label labeled lafferty large largescale lawler lead learning leen limitation ller look loopless lower machine machines maha major mammalian management mannheim marginalized mathematics matrices matrix method methods might mining ministry modeling most mozer multiplication necessity network networks netzwerke neural ngfn nips node nodes nonvectorial note nucleic numerische obvious offering only optimally optimization order other over oversimplify 
pages part particular path paths pattern performances perret petsche pkdd polynomial powers practice prediction predictive preserves press principal principles problem problems procedure proceedings processing product project promise protein question ramon random references regression remains report represented research results reviewers rtner runtime santa scale schoenauer schol schomburg science sciences search sequences sets seventh shindyalov shortest shortestpath siegelmann sixteenth smola solutions sparse special speed springer start states statistical still structures studies support supported systems tarjan technical techniques technology than thank that their them theorem theory these think this time transformation trees tresp tsuda twenty type ucsc ueda under united unreal updates uses vapnik vector verlag versus vert very vishwanathan volume walk warmuth warshall washington weights weissig westbrook which while wiley will with within work workshop wrobel york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.112 135 Optimizing Constraint-Based Mining by Automatically Relaxing Constraints about acknowledgements agrawal algebra algorithm algorithms allow analogously anti antunes association automatically based bases been belonging benefit bingo bonchi boolean borders both bucila claudia closed cnrs combinations combined comm completely conference constrained constraint constraints context convertible correction correspond cremilleux data database deals defined detect discovery donnees dual dualminer efficient efficiently equal evaluation exploratory expression extraction false fast fifth finally find flexible framework frequent from funded garofalakis gehrke give icde icdm identify ieee imielinski implement important improve improves inductive inductives instance international item itemsets justifies kdid kiefer kifer kinds knowledge lakshmanan latter levelwise lucchese mannila masse method mines mining mique mono monotone most music named nevertheless 
note operator operators otherwise pages pakdd partially particular parts pattern patterns perspective pods post pour practical primitive proc proceedings process properties propose proposed pruning pushed queries query quickly raedt rastogi references regard regular relaxation relaxations relaxed relaxing relevant requiring resp respectively result rules search section sequential sets shim show sigkdd sigmod solver soulet soundly spirit srikant stage stemming such superclass that then theorem theoretical theories these they this toivonen tone true under user using vldb which white with without witness work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.156 64 X-mHMM: An Efficient Algorithm for Training Mixtures of HMMs when the Number of Mixtures is Unknown according acoustics advanced advances algorithm algorithms also alternative alternatives appear applications approach approaches associated authors avoiding based baum bayesian better bicego biswas bits capable cases categorization characteristics classification clustering clusters codes competitive compression conference correct cost could critical data datasets discovery dubes editors efficient errors estimate estimates estimating estimation even excellent experiments extending extension faster fifth finding found framework francisco functions future ghosh give groups hall have heskes hidden hmms however huffman icdm identifying ieee important include incorporate inequalities inequality information initialization inspired international jain john jordan journal juang kaufman kaufmann knowledge large lead learning least like machine make many markov maximization means mhmm mining mixture mixtures modelbased models modifications moore more morgan mozer murino necessary neural number observation only order original other overestimation page pages parameters particular pattern pelleg performance petsche points possible prentice press probabilistic problems procedure proceedings processes processing proposed 
rabiner real recognition reference references reliable research rousseeuw seems segmential selected sequences series sets seventeenth shannon should signal similar similarity simpler simultaneous small smyth some sons space speech states statistical still study successful synthetic systems technique temporal than that think thorough tolerating training trans tutorial underlying unified user using vector volume were where wiley with without work workshop would ypma zero zhong http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.125 134 Pruning Social Networks Using Structural Properties and Descriptive Attributes accuracies accuracy accurate achieve achieved achieves achieving acknowledgments actor actors aggregates aggregation agrawal allowing also analysis approaches arising attributes based baseline behavior believe best better between beyond both brokers built case cases chairs choosing claim classes classifier classifiers common communications compact complex compressing compression compressions concept conf conference contain corner created customers data datasets descriptive different discovering discovery distinct domingos each event exchange executives exploring feature fifth figure following from full function graph graphs haas higher hubs icdm identifying ieee important influence information interests international intl invention keeping kempe kept kleinberg knobbe knowledge liben link maintaining management many maximizing methods mining most network networks newman newsgroups nowell older ones only other outperformed paper pared percentage perlich position predicting prediction predictive preserved principles problem proceedings properties propositionalisation provost prune pruned pruning quite rajagopalan random rather reducing references relate relational represents resulting results review richadson right sample samples schwartz sector sets shared show showed shown shows siebes significantly similar simply size sizes social spread springer srikant 
still strategies structural structure supported supports tardos target techniques tenure terms than that these this those through together tradeoff types under understand understanding upper useful using value various verlag versus viable were while wide with wood work world younger http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.70 101 Focused Community Discovery addition aiello algorithm analysis august boston both clustering communities community conference cortes data discovery edges efficient extraneous fifth flake giles graph graphs have icdm identification ieee interest international internet kalmanek knowledge kudo lawrence link mathematics mcdaniel merwe minimum mining missing nakamura networks nity only pages partitioning planted pregibon proceedings real references return returns robust search shown sigkdd sixth spatscheck structure study synthetic tarjan that topology trees tsioutsiouliklis user using volinsky wide world http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.44 95 CoLe: A Cooperative Data Mining Approach and Its Application to Early Diabetes Detection able accurate after agent agents agrawal aimed almost alone already also andthere apatient appears application approach architecture areas aspects average between born butitseemsinterestingtothem chan cole combination complex conclusion conditions conf conference confirms contain cooperative could data databases defined denzinger describe diabetes diabetesrelated diagnoses diagnosing diagnosis discovered disease diseases distributed earlier efficient enhancing example experts explanations fact feedbacks feelings fifth figure firstly fitness focusing following foradiscovered framework frank frequently from future general generating generation generations given global goal good hamzaoglu have hints however hybrid hypertension hypertensive icdm identifying ieee immediately improve indicate indication indicator indicators individuals interest international intl james java kargupta kaufmann 
knowledge known learning like likely lnai machine main mainly many medical meta miner mining model more morgan multi multiagent multidatabase multiple need number only opinion optimization order other ourresultsurgethepublichealthservicestoimprove over patient patterns phenomena positive potential presented proc proceedings prodromidis produce promising proposed prove quality quite recall references relatedtodiabetes relation relevant repeatedly research researchers respiratory results reveal rule ruleisintable rules scalable secondly seem sequential sets should showing significance signs skin some specific srikant stafford stolfo strategies strong subcutaneous symptoms system temporal that their them there therefore they thirdly this tight tissue toward trans tselepis uncomfortable using valid very which will with without witten work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.12 65 A Random Walk through Human Associations Raz Tamir The Hebrew University of Jerusalem raz.tamir@gmail.com acmsiam actual addressed akman alan algebra algorithm algorithmic algorithms ambiguous analysis analyzing anatomy answer answered answering apis applications approach approaches area army artificial association associations asymmetrical australia authoritative base based baton been behavioral believe better between both brace brain brains brin brisbane caching cambouropolos capable categorisation categorization center certain challenges chapter chelsea cicekli clark cognitive combinatorics common compared complexity computer computers computing concepts conceptual concluding conclusions conference confidence confronts congressus conscious creating current cuts data database deals define descriptors determine development dictionary diego different discover discrete discussion edinburgh edition efficiently eigenvalue eigenvectors employed engine engines environment even example exemplar expansion experimental exponentiation express extract extracted extracts farahat fifth 
finally first florida flow forecast forthcoming four fragment free freeassociation freitas french from furthest future gain gantmacher generation germany giving good google graph gupta hahn harcourt haveliwala help henzinger high highly history hits html http human humans hyperlinked hypertextual icdm identify ieee iimenau image imitate imitating information informative innovative instruments intelligence intend interdisciplinary interest interesting interestingness international internet intro ircache journal jovanovich june kamvar kleinberg knowledge kumar large later latter lecture legacy lempel like limit linear link links lofaro logs maccluer machine machinery machines malik many mathematics matrices matrix mcevoy meanings measure measures melbourne memory methods michie miller millican mind minds mining model modeling modification more morran most much near nelson normalized normsnorms notes november numerantium only orleans other overview oxford page pages pain palmeri part partitioning pattern patterns people perception perceptual performance performed perrons phrases potential prediction presented press probability proc proceedings processes profile profiling programs project promising proofs properties proxy psycholinguistic publ publishers qualitative quantitative query question questionnaires questions raises ramscar random rapp rather reach readable real recognition records references remarks reply report research retrieval reveal review rhyme right robust rouge rule rules salsa samples sampling saygin scale schreiber science scoring scotland search searl second seed segmentation selecting september session shiffrin show shown siam sigir similar similarity similarly smaller solar solid sources south southeastern srivastava stanford star statistical still stochastic strang structure subcognition subcognitive submitted success supply surfing symposium system systems tamir technical technique terms test than that theorem theoretic theoretical theory 
thinking third this thought thoughts three thus time topics transactions translated trends turing turney unger university usages used user users using validation valuable various vision visits vldb volume waiting walk walks were what whether while wide wilson with within word work workshop world years york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.149 18 Using Information-theoretic Measures to Assess Association Rule Interestingness Julien Blanchard, Fabrice Guillet, Régis Gras, Henri Briand LINA (FRE 2729 CNRS), Polytechnic School of Nantes University, La Chantrerie BP 50609, 44306 Nantes cedex 3, France {julien.blanchard,fabrice.guillet,henri.briand,regis.gras}@polytech.univ-nantes.fr aaai ability about absolute account actes agrawal algorithm allow allows also american amount analysis analyzing applied approach approaches arqat asmda assessing assessment association avec bases basic basket baskets bayardo beyond blachman blanchard breakdowns briand brin certainty chapman chapter chen clark classification clustering communication complementary complete conference confidence construction contraposees coordinates correlations counting data databases decision definition deviation deviations discovery distributions dynamic editor ekaw engineering entropic equilibrium estimation evaluation example exploratory factors fast fifth figure foundations frequential from general generalizing generally generation gives goodman gras guillet hall have holland holyoak huynh icdm ieee illinois implication incoherent incomplete independence index induction inference information informationnel intelligent intensity interesting interestingness international into involving ipee itemset jaroszewicz journal kaufmann knowledge kumar kuntz lallich learning lenca lerman leurs likelihood linkage loevinger machine mannila market mathematical means measure measures mesurer method mining models monographs more morgan most motwani niblett nisbett nouvelles objective only order 
padmanabhan pages parallel piatetsky pkdd presenta press probabilistic proceedings processes profiles programs psychological qualitative qualite quinlan record redundant references regles relationships relative revue right rule rules sample samples schoenauer science sebag selecting sets shannon shapiro sigmod significance silverstein simovici size smyth sociology springer srikant srivastava statistical stochastic strong subjective support sure symposium systematic systems takes taux technologies tests thagard that theil theoretic theory these this tion toivonen tool transactions transcations tsur tuzhilin ullman unexpectedness university vaillant variables verkamo verlag version weaver while whole with zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.61 132 Face Recognition Using Landmark-based Bidimensional Regression akamatsu algorithms analysis anona automatic based benavente better brunelli component compute conference coordinates costen craw data database distance docmainpage dreden effective evaluation face faces features feret fifth goldstein guide harmon have http human ically icdm identification ieee impacted institute intelligence international john jolliffe june kato landmark landmarks lesk machine manually mardia martinez meaningful measure methodology minimally mining mixed model moon obtained pattern phillips poggio principal proceedings procrustes rauss readers recognition references refined repetitions report represent rizvi shape should shown similarity sons springer stat statistical support tech technical templates that transactions user using verlag versus while wiley works york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.25 118 An Optimal Linear Time Algorithm for Quasi-Monotonic Segmentation about accurate addition aggregate algorithm amnesic analysis applications approx approximation arrhythmia based baseline best between both brooks cascon channel circulation computed conference consecutive considerably consideration 
constant continuous correction counting cubes data database decrease defined description domain down each easily error estimate experiments factor faster fifth figure finding first flat from given goldberger gunopulos having hence icde icdm identifiable ieee ijcai implementation implemented international isotone keep keogh label lemire length level linear markers match matching maximal methods mining modelling moments monotonic monotonicity more norm number observed october omafe online only optimal optimization other over palpanas pattern physiobank physionet physiotoolkit picking point points precomputed predicted prefix preprocessing proceedings pulse pulses python qualitative queries range rate recordings reference references regression relative repeated resolution respect results same samples sampling scale second seconds segment segmentation segments series should sign significant some source specific spline stack starting starts streaming such than that theory there these through time times total truppel typical ubhaya under used useful version vlachos wavelet where which while with worse http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.19 47 Alternate Representation of Distance Matrices for Characterization of Protein Structure Keith Marsolo and Srinivasan Parthasarathy The Ohio State University Department of Computer Science and Engineering Contact: srini@cse.ohio-state.edu aaai about academic accuracy accurate acid addition additional algorithm algorithms alignment alignments allow almost also amino analysis analyzing andrew annu appl application applications approach approximation april augmented aung automated automatic avenues based bayesian bibe bioinformatics biol biomed biomedical boosting bourne brenner bullimore calif case cases cath chemical chinnasamy chothia cibcb class classification classifier clinical coefficients coherent collins combinatorial combined combining compact comparision comparison compute computed conf conference corneal 
could creating data database dataset datasets daubechies davis decomposition described descrip description descriptors deville dimensional ding direct distance domain domains dubchak dynamic edition effective efficiently ensemble evolutionary exist existing experiments explored extend extension fain families family fast feature fifth final fingerprinting fold folding fourth francisco freund further future gauss genome gerstein gilbert global greinvenkamp height helpful here hierarchic high holbrook holm hope huan hubbard icdm icml ieee images improve improving increase incremental indexing indust informatics instance instead integrals intell intelligent international investigation involve iskander issues iterative jones kaufmann keratoconus laine large learning lectures level levitt lotan machine machines mallat marsolo mateo math matrices measure menlo method methods michie might miller mining mittal modeling modern morgan most muchnik multi multilevel multiple multiresolution murzin naive need needs networks neural number numbers objective obtain optimal orengo other over pages pair park part parthasarathy path performance philadelphia physio pnas points polynomials potential prediction press prins proc proceedings process processing programming programs proposed protein proteins provide published publishers quickly quinlan raasch rather recognition recomb references representation representations research results same sander schapire scheme schwarzer schwiegerling scientific scop selection september sequence sequences shindyalov siam signal similarity some sort spatial specific stage stanford still strategies strategy structural structure structures strutcture study subgraph such suganthan sung support surfaces swindells tbme techniques tein temporal terms than that there this thornton those through tion tour treat tree tropsha tuned used useful using vary vector very videokeratoscopic wang washington wavelet wavelets well when where while wise with world would 
yields york zernike http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.89 50 Leveraging Relational Autocorrelation with Latent Group Models Jennifer Neville, David Jensen Department of Computer Science, University of Massachusetts Amherst, MA 01003 {jneville|jensen}@cs.umass.edu aaai abbeel advances aistats algorithms allocation american analysis approach approaches approximate association authoritative authorized authors autocorrelation based bayesian bias blei blockstructures building business case categorization cause ceder chakrabarti classes classifers classification cluster clustering collective concept conclusions conference contained contingent copyright craven cuts darpa data dependency detection dipasquo dirichlet discovering discrete discriminative distribute domain domains either endorsements engines enhanced environment estimation estimators expressed extract feature fifth freitag friedland friedman from gallagher getoor government governmental griffiths group handcock herein hereon hoff hyperlinked hyperlinks hypertext icdm icml ieee ijcai image implied improves inductive indyk inference infinite institute intelligence international interpreted invention jensen jordan journal kemp kleinberg knowledge koller kolobov kubica latent learning link linkage logic machine macskassy malik marthi massachusetts mccallum memo milch mining mitchell models moore multiple necessarily network networked networks neville nigam normalized notation notwithstanding nowicki official overlapping pages pattern pfeffer playing policies popescul predicate prediction probabilistic probability proceedings programming provost purposes raftery references regularities relational rennie report representing reprints reproduce research roles russell schneider school search segal segmentation selection seymore should siam sigkdd sigmod simple slattery snijders social sontag sources space specific springer stahl statistical stern stochastic study symbolic symposium taskar technical 
technology tenenbaum test those toolkit transactions trees ungar univariate using verlag views wide wolfe workshop world yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.107 115 On Learning Asymmetric Dissimilarity Measures aaai accuracy addressed advances after aggarwal agrawal algorithm algorithms amounts analyzing applications approach artificial asymmetrically baseline basu because becomes better bilenko bottleneck burges caruana case cikm clad classification clickthrough cloucester clustering clusters cohn compare comparisons completing comprehensive conclusions conference consider considered consistently constraints context cornell corre corresponding craven curves data design desired details different dipasquo dissimilarity distance document each editors engines exact experiment experiments explore extract features feedback fifth figure fixed foundation freitag from functions fundamental future generalize hall haykin hence here http icdm icml identical ieee information integrating intelligence international iterations joachims kernel knowledge krishna krishnapuram kullback kummamuru large learning like making matrices mccallum measure measures mentioned method methods metric mining mitchell mooney multi national nature networks neural news nigam number observed obtained omit optimizing pages parameter parameters percentage perceptron performed peter posed practical prentice present press problem proceedings processing product project properties proposed recommendation refer references regularization related relative report respec resulting results retrieval river saddle satisfied scale schlkopf schultz search section semi sensitive sets show shows sigir sigkdd slattery slonim smith smola space spatially sponding statistics stopped study supervised support svad svms symbolic systematic systems technical testing text than that theo theory these thousands tishby tively towards training uniform uniformly university upper used user using value variant 
varied various vector vectors want webkb weighting weights were when which wide with word work world would zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.95 20 Mining Frequent Spatio-temporal Sequential Patterns Huiping Cao, Nikos Mamoulis, and David W. Cheung Department of Computer Science The University of Hong Kong Pokfulam Road, Hong Kong {hpcao, nikos, dcheung}@cs.hku.hk accelerate acknowledgements addition agrawal algorithm algorithms also american assessment authors available both canadian cannot caricature cartographer characteristic cheung chiu closed closeness clustering conf conference connectivity considering copyrighted data database databases dataset dieter digitized discovery douglas douglaspeucker effectively efficient employed engineering environment episodes event fifth finding found frequent from generalization grant grouping gunopulos hadjieleftheriou handling hart hershberger historical hong icdm ieee indexing information international intl keogh knowledge kollios kong like line linear lonardi long longer made mamoulis mannila meaningless minimum mining newly noisy number online only pages patterns pazzani peucker pfoser points previous problem proc proceedings properties proposed providing publicly querying real reduction references renganathan represent required rule search segmenting segments sequences sequential series shape sigmod similar simplification singular smyth snoeyink space spatial spatio spatiotemporal special speeding srikant streaming substring support supported surprising symp symposium temporal terns thank tial time toivonen tree truppel tsoukatos tzvetkov unfortunately using verkamo wang white with without work workshop would yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.117 53 Parameter-Free Spatial Data Mining Using MDL advances affinity agrawal algorithm algorithms also analysis andritsos anefficientapproachtoclustering anomaly application applications approach approaches approximation arithmetic 
association associations attraction automatic automatically available based basu been better bilenko bipartite birch block boosting brought camb candidate cannot categorical cell cells chakrabarti chameleon cheung city classic classification clustering clusters coding coherent collocation collocations colt communities compress compression computer concept concepts conclusion conference confidence confident conjunctive correlation country county cover cross cultural cure cuts cvpr data databases dayal decompositions description detection dhillon directly discov discover discovery disorder dynamic easily edbt efficient elements elkan employ entries estimation evaluating exploit extended extending extends faloutsos fast feature features field fields fifth finally finding follow framework free frequent friedman from fully generalized generation gersho graph graphs group groupings groups grunw guha hamerly handle harmonic hastie have help hierarchical hierarchies hinneburgandd history however huang icassp icdm ieee image incorporate inference information informationtheoretic inicml international interscience introduction isalsoparameter itemset jacm jordan kamber karypis kaufmann keim keogh kitsuregawa kleinberg knowl kolmogorov kumar labeling langdon large learning leino length limbo linear livny location lonardi mach mallela mamoulis mannila markov means method methods metric miller minimum mining mishra modeling modha mooney moore morgan multiconstraint multilevel multimedia natural neighbourhood nips noise number numberof occurrence occurrencepatterns onomastic order other pages pairwise papadimitriou parameter parameters particular partitioning pattern patterns pelleg phil pitkanen pkdd point potts practical practically prediction press principle principled probabilistic problems proc proceedings propose ramakrishnan random rastogi ratanamahatana recentworkoncross reddy references relate relationships require revolution rissanen rule rules salmenkivi scalable semi 
sevcik shekhar shim shou sigmod significance simultaneously size some sparse spatial spatially specify spectral springer srikant standard state statistical supervised support swaminathan tardos tasks techniques text that theory these thomas threshold through tibshirani tools towards transformations tree tsaparas tsdm tutorial user using vaisey values variable versa very vice vldb weiss well when wiley wise with withefficient without work xiong zabih zero zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.72 60 Generalizing the Notion of Confidence Michael Steinbach and Vipin Kumar Department of Computer Science and Engineering, University of Minnesota 4-192 EE/CSci Building, 200 Union Street SE Minneapolis, MN 55455 {steinbach, kumar}@cs.umn.edu aaai access accompany addison agrawal agreement agrees ahpcrc algebra algorithm algorithms also applications applied appropriate april area areas arlington army association associations attributes august aumann auspices banerjee based baskets been between beyond binary bollmann boolean bradley bregman brin buena bump center clustering combine comparison computer computing conclusions conference confidence constrained constraints content continuous converting cooperative correlations daad data databases dawak defined definitions demmel department derive described dhillon dimensional dimensions discovering discovery discrete divergences dmkd does efficient elements endorsement error evaluate evaluation example exclusion experiments exploratory explore exploring express facilities fast fayyad fifth finding fisher foundations framework frequent friedman from functions future gancarz gawrysiak general generalized generalizing germany ghosh government grant hafez hastie have high highconfidence hunting icdm ieee imielinski inap include includes including inclusion inferred institute integrated interestingness international into introduction investigation item items itemsets january jarosewicz karypis kumar lake lakshmanan 
large lcns learning lindell linear market mathematics measure measures merugu minapriori mining minneapolis minnesota model motwani multi munich necessarily need needed notion number numeric numerical obtained october official ogihara okoniewski optimizations ozgur pages pang pattern patterns pearson performance policy polynomials position presenting press proceedings provided pruning quantitative raghavan references reflect regression relational report research result results retrieval right rosenberg rules science sdorra selecting sets should shown siam sigmod silverstein simovici springer srikant srivastava statistical statistics steinbach such supercomputing support swami tables technical that theoretical theory there this tibshirani tion tolerant traditional transaction under university used useful usefulness using variables various vectors verlag vista vldb volume washington webb wesley with work workshop worthwhile would xiong yang york zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.123 91 Process Diagnosis via Electrical-Wafer-Sorting Maps Classification acid advance advanced agrate algorithms analysis approach automatic based building case cause chatterjee chen classification commonality comparison computer conference core create croley currently data defect denicolao design diagnosis dipalma donzelli each electrical enhancement experimenters fabrication failures fifth history hoffmeistern hunter huunter icdm ieee international introduction issm johin kong letters likely list malinaric manufact manufactoring manufacturing maps methodology microelectronics mining miraglia model more network neural pages part pasquinetti pattern piccinini problem problems proceedings process recognition recognize references responsible root semi semiconduct semiconductor semiconductors simposium site software sons spatial statistics step study that this tool trans unsupervised used wafer which wiley yiel yield york 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.32 147 Bagging with Adaptive Costs active added adds advances algorithm algorithms annals applied arcing bacing bagging base bauer better bias blake boosting bootstrap breiman classification classifiers comparison complexity computational computer conference conjecture correction cost cross current data databases dietterich editors effective empirical ensemble ensembles estimate expected experiments fair fifth forests freund generalization generally helps hold html http icdm ieee implement information international itself kohavi krogh learner learning lecture leen like little machine mechanism merz method methods mining mlearn mlrepository natural nature network networks neural notes pages performance predictors press proceedings processing provides random references reposi sampling schapire science sensitive simple springer stacked stacking standard statistical statistics supported systems tesauro than that theory therefore tory touretzky using validation vapnik variants vedelsby verlag very volume voting will with wolpert york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.142 68 Template-Based Privacy Preservation in Classification Problems according achieve achieving adult affect agrawal algorithm alternative amazing american anonymity anonymization approach approximately archive association attribute attributes averaged based bayardo because bertino better between bottom candidates cases chakraborty class classification clifton combined computer conclusions conference confidence consider considering constraints consuming contains control created dasseni data database databases datafly datasets depicts design detection difference disclosure discovery disk does domain down each eliminated eliminating elmagarmid errors evaluated even evfimievski expanded expansion experiment experiments expert explorations exposure farkas fifth figure files first follows frequent from fung fuzziness gehrke generalization 
giving goal group hettich hiding high however http hundreds icde icdm ieee imielinski implementation implying including individuals inference inferences information inherited international items iteration iterations iteratively iyengar jajodia japan journal kantarcioglu kaufmann kloesgen knowledge large larger largest learning least less letting level levitt life limit linking lower machine made masking maximum medical method methodology methods microdata minimum mining modeling more morgan most number operations optimal original other over pages picked possible presented preservation preserves preserving privacy problem proc proceedings programs progressive proposed protection providing prunes purpose quinlan random randomly reach real record records references remaining removed removing requirements requires research rest restrictive results rule rules runtime sample satisfy saygin scalability scalable scale search searches seconds section security selected sensitive sets settings showed shows sigkdd sigmod size slightly solution some specialization specify spent srikant statistical statistics studied such summarize suppress suppressing suppression survey swami sweeney symposium system systems table templates testing than that then theory these this through time tkde together tokyo tools topn training transforming uncertainty uniformly used user using vaidya value values variation variations version very verykios violate wang were when whenever where which while winkler with working http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.84 25 Kernel-Density-Based Clustering of Time Series Subsequences Using a Continuous Random-Walk Noise Model Anne Denton Department of Computer Science North Dakota State University Fargo, North Dakota 58105-5164, USA anne.denton@ndsu.edu aaai about abstract academic accessed according achieved acknowlegements advances alan algorithm also amaral analysis anguelov another application applied approach archive artificial assigned 
assignments associative balloon based being bentley berndt best better beyond binary boston brazil buoy chapter cheng circulation city clifford cluster clustering clusters comaniciu communications compared compensating compensation complex components concept concepts conclusions conf conference conjunction considers constraint continuous coordinate correct data databases density denton despite deterministic discovering discovery discrete distinguishing dynamic elimination engineering eral estimation evaluation extended feature fifth finding folias from furthermore future gavrilov general generalized glass goldberger goldin gunopoulos hausdorff hinneburg hold icde icdm ieee ihler implementation implications important improved incorrect indication indyk input intelligence international introduced ivanov janeiro japan john jose june kanellakis keim keogh kernel knowl knowledge kohavi kollios large lead leads less likely linear listed lncs lonardi machine maebashi magnitude makes mannila many mark market massive matches matlab mean meaningless means measure measurements meer melbourne members menlo mietus mining mode model moody more motifs motwani much multidimensional noise number ones orders origin other others pages paper park particular patel pattern patterns peng physical physiobank physiologic physionet physiotoolkit picked potential practice presented press previous priestley principles proceedings productively programming proofreading quality queries random rather recognition references reliably renganathan research resource result results robust rule runs search searching seattle seeking seen selection sequences sequential series sets settings shift sigkdd signals similar similarity since sixth smyth some sources space specific specifically specification speech springer stanley stationary still stock subsequence subsequences subset success suggest syst temporal tenth tests than thanks that then these third this threshold time timeseries toolbox toward 
trajectories transactions transformation trees truppel uniform used using valid vlachos walk walks were which with workshop wrappers http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.102 58 Multi-Stage Classification Ted E. Senator DARPA/IPTO* tsenator@darpa.mil aaai academy acapulco accessed acknowledgements adibi advanced algorithns also american analyses analysis angeles applications approaches architectures armour artificial assessment assumption august availability awareness behavior between blau break browne business cagliari cash chalupsky classification classifier classifiers colleagues collected combining communications company complete complex comprehensive computer conditional conference consideration considering constraints cottini could countering course current data david decision deploying described detecting detection develop developing discovery distribution domains donoho dvzeroski dybala dynamic dzeroski each editorial encouraging ensembles evaluated events existence explicit explicitly explorations expressed extension extremely fall fifteenth fifth finally fincen first former foster fourth fraud frayling friedman from frontiers george getoor gmbh goldberg grobelnik group handbook hayden helping hoboken iaai icdm ideas identifying ieee independence information innovative intelligence interest international italy january jensen john july june khan kirkland klinger klosgen knowledge koller kuncheva large laundering lavrac learning lecture likelihood link linkkdd llamas ludmila macmillan magazine major making management many march marrone math menlo methods milic mining mladenic model modeling models money more multi multiple multistage nasd national natural network networks neural neville news newsletter ninth notes november numrych observability observation only organizations over overload oxford paper papers park pattern paulos pfeffer picton popp positive potential prepare press previous probabilistic probabilities proceedings processes 
prospective provost pursue raedt raghu ramakrishnan rare rattigan reality recognition references regulation relational report reports research resource risk rooting science sciences scientific seattle sebestyen second securities senator shyr sidkdd sigkdd sizes social some sonar sons sophisticated sources spring springer springerverlag stage stages statistical step suggest suggestions symposium system systems technical technology terrorism terrorists thakker than thank that their third this through times total towards transactions tricky understanding university using verlag washington weakened when wiley winter with wong wooton workshop would yang years york zytkow http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.40 14 Classifier Fusion Using Shared Sampling Distribution For Boosting Costin Barbu, Raja Iqbal and Jing Peng Department of Electrical Engineering and Computer Science Tulane University New Orleans LA-70118 {barbu,iqbal,jp}@eecs.tulane.edu according accumulate accuracy achieve agree algorithm algorithms among analysis annals application applications approach arching artificial august bagging based become been behavior benchmark best better between bezdek biocomputing blue boosting both breiman bssd chang classification classifier classifiers classify color combination combined combining comparable compared comparing comparison component computer conference confidence consensus continues cybernetics cyclic data database dataset decision deemed degrees deng design difficult dimensions discussion distance distribution duin each effectiveness emerge empirically error established evaluated eventually example examples experimental experts explanation february fifth first fourteenth framework freedom freund from func function fusers fusing fusion generalization generated give given good green handwritten hashem hatef have high http icdm ieee image imagery images improved independence information inherently intelligence international introduction iteration 
iterations january japanese journal kernel kernelbased kittler knowledge known kuncheva lanckriet learner learners learning lecture less likely line linear lose lowest machine majority management margin matas media method methods mining more most multiple multiview needed networks neural noble nonlinear notes november number numerals obtained october only oommen opinion optimal optimize other over pacific pages paired pattern perform performance performed performs points prediction predictions predictors previous previously proceedings progress proposed protein prototype provide rate rated recognition reduction reference references result results sample sampled sampling schapire schemes science selection sensor separability separable september sequence sets shared short shown sided significance singer size society some statistical statistics still strategies strong study subsequent subspace suen summary super superiority symposium systems table taken techniques templates test texture than that theoretic theoretical theoretically there thirteenth this those tion training transactions unconstrained until update updated used using values vectors view views visiontexture vismod vistex vote voting weak weight weighted weights well were what which whitaker will winning with without yeast http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.53 42 Effective and Efficient Distributed Model-based Clustering Hans-Peter Kriegel, Peer Kröger, Alexey Pryakhin, Matthias Schubert Institute for Computer Science, University of Munich, Germany {kriegel,kroegerp,pryakhin,schubert}@dbs.ifi.lmu.de academic access algorithm algorithms also applicability applications applies approaches arbitrary associates based batistakis both bradley called clustering clusters compared computer concepts conf conference containing costs covariance covariances data databases demonstrates dempster density densitybased dhillon dimensions discovering discovery distributed distribution distributions dmbc
dramatically each editor editors efficiency efficient enables erlbaum ester europ evaluation exact examine experimental explorations fast fayyad fifth forman from functions future gaussian gaussians generating halkidi handbook handle heterogeneous hierarchical high higher icdm ieee incomplete information initialization instead intelligent international iterative jager januzaj johnson journal kamber kargupta knowledge kriegel laird large lawrence lecture level likelihood lncs local matrix maximum mean memory merge method mining model modha multiprocessors noise notes other pages parallel park patterns performance pfeifle pkdd press privacy proc proceedings proposed publishers pure recent reduces references refinement reina represented respecting robustness royal rubin sander sayal scalable scale scheuermann science series sigkdd sites society spatial springer statistical step systems techniques that transfer validation variance variances variants vazirgiannis vector verlag volume well will with work zaki zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.14 97 A Scalable Collaborative Filtering Framework based on Co-clustering absolute accuracy algorithm algorithms analysis application approach approximately approximation architecture banerjee based below benefits bergstorm both brand breese bregman case choice classic clustering collab collaborative compared conf conference consists correlation cscw data dataset dept detailed dhillon dimensionality dynamic empirical entropy error evaluating existing experimental factorization fast fifth figure filtering fixed found framework generalized george ghosh grants grouplens heckerman herlocker highlight highly hofmann http iacovou icdm icml ieee implementation including incremental international intl involving jaakkola kadie karypis known konstan latent lightweight lower main matrix maximum mean merugu method methods mining models modha movielens movies negative netnews nips nnmf note observe observed obtained 
obtaining online open pages parallel performance prediction predictive presented proc proceedings provide rank ratings reasonable recommender reduction references report resnick results revisions riedl same sarwar scalable scaleable scenario scenarios semantic seung shows similar since size split srebro static students study suchak syst systems table tamu technical techniques terms terveen test texas that there those time tion training trans under univ users using variation variations various webkdd weighted were which with workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.105 90 OBTAINING BEST PARAMETER VALUES FOR ACCURATE CLASSIFICATION Frans Coenen and Paul Leng Department of Computer Science, The University of Liverpool, Liverpool, L69 3BX frans,phl @csc.liv.ac.uk aaai accuracy accurate algorithm algorithms almost also always analysis appears approach association august based better both carm choice classassociation classifica classification classifiers cmar coenen comparable conf conference confidence cost coverage cpar data demonstrate describe does effect efficient eliminate fast fifth finding francisco good have icdm ieee improved integrating international lead leng less lncs lower methods mining multiple need obtain obtained pakdd performed possible predictive priate proc procedure proceedings reduces references results rule rules selected sensitive shown siam significant simple springer such support tfpc than that these this threshold thresholds tion tuning using values well will without york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.74 10 Handling Generalized Cost Functions in the Partitioning Optimization Problem Through Sequential Binary Programming advanced alexandre algebra algorithm also amsterdam andrew applications arag artificial artin assignment available bayes belmont berlin bianca branch breiman building cairns calibrated california caravan carnegie case challenge charles class classification classifiers coil 
company computer computing conference core cost costs costsensitive data decision decisions demirsoy department departmento diego discovery distributed domingos dragos ecml eduardo eighteenth elkan empirical error estimates estimation european evaluation evolutionary example fifth finland fixed flach fotzeu foundations framework friedman from fully function general generalized genetic geoffrey giorgio granger group haixun hall heidelberg helsinki html http hybrid icdm icml ieee induce inform information ingargio ingargiola instance institute insurance intelligence international ipek jacques janeiro jean jerome jersey jianning john joint journal june kaufmann knowledge komarek kretowski kwedlo langford learning leiden ling lisa lnai logistic machine making management marcus marek margineantu mateo mellon metacost method michael minimal mining modeling models moore morgan naoki obtaining olshen operational operations pacific paper paul pedro pennsylvania peter philadelphia philip pigatti poggi polices policies policy prediction prentice price probability problem proceedings programs proportionate published publishers putten qiang quarterly quinlan raedt readings references regression report research richard ritr robotics ross rule salvatore science sensitive sentient sets seventeenth shichao someren specialization springer stabilized stolfo stone systems tech technical temple thesis third tica ting tool tree trees turney uchoa university verlag wadsworth wang webb weighting with wojciech worthington yang zadrozny zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.57 129 Efficiently Mining Frequent Closed Partial Orders (Extended Abstract) Jian Pei1 Jian Liu2 Haixun Wang3 Ke Wang1 Philip S. 
Yu3 Jianyong Wang4 according advantages aggressively agrawal also among appear application avoids both branches called cannot casas case checking children clearly closed collected conf conference consider construct contain copy data database depth deriving details detection determine dictionary directly discovering discovery distinct each edge edges efficient efficiently empty enumeration episodes event every expand expands experimental expression extensively extract extracts fact feasible fifth filling find first following forbidden form forms found four fragments frecpo frequent from full further furthermore futile garriga gene gionis global growth have having help here hyperlinks icde icdm identified ieee immediately implementation implemented including infrequent instead interestingly international into items knowledge large last less limited link local manner mannila matrix meaningful meek methods mined mining must namely number omit once ones only order orders other overhead paper partial partitioned pattern patterns physical pointers prefix prefixspan preserving problem proc proceedings producing progressively projected projection prune prunes pseudo real recomb recursive reduction reductions redundant references related remaining reported results scalable scanning search second sequences sequential sets shared should show siam since some space srikant stay string strings stronger structure submatrix subset subsets substantial such summarizing super superset support synthetic technique tested than that their them there these they this three thus time together transitive tree used version which will with without words http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.137 22 Summarization - Compressing Data into an Informative Representation access acknowledgements adam addison advances afrati agrawal ahpcrc algorithm algorithms along analysis analyst anomalous anomaly another approach approximating april arda army association attacks attributes authors 
automatic barbara based bases bastide behavior between boriah breunig calders cambridge candidates cannot capture captured categorical center challenges chan chandola chapter closed clustering clusters cluto collection columbia comments compressing computer computing concepts concise condensed conference connections content context contract corporation couto customer daad darpa data databases dataset datasets demonstrated density dept derivable detecting detection detectors directions discex discovered discovering does dokas dong dozen draft dubes earlier effectiveness eilertson enables endorsement ertoz eskin evaluating evaluation exploring extensive facilities fifth francisco frequent future gaurav generate generation gionis goethals government grant hall high highdimensional highly icdm icdt identification identifying ieee imieliski incorporate inferred informative institute international into introduction intrusion involves items itemsets jain jajodia just kamber karypis kaufmann knowledge kriegel kumar lakhal large lazarevic leads learning level line lippmann local mahoney mani manner mannila minds minimum mining minneapolis minnesota models morgan multi necessarily netflows network next normal novel number official often organization other outliers overview pages pandey paper part pasquier pattern patterns performance pkdd policy position possibility prentice press proceedings project proposed provided publishers ranked references reflect report representation research reviews routinely rules sander science scores sets several should shyam sigmod skaion snort software srivastava stationary steinbach step stolfo such summaries summarization summarize summarized summarizing support supported suspicious swami system systems taouil technical techniques testbed text thank that their this thousand tools traffic transactions tzvetkov undesirable university used using variant visualize volume wang well wesley which widely with without work 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.56 77 1 achieve acknowledgement addison additional adjusted advances advantage algorithm algorithms almost also annual applied applying assigns baeza balanced based batch because benchmark boosting both burges categorization classification classifiers collection comparable comparison computationally conclusion conditions conference consume data demonstrated determined development different directly discovery distribution each effectiveness efficient equations error estimation experiments fast fifth figure filtering formula fourth freund from fung future gaithersburg general grant have hkust hong however http icdm icml ieee importance information instead institute international jmlr joachims john journal kernel knowledge kong large learing learning least less lewis light linear lsqr lyrl machine machines making mangasarian memory method methods minimal mining model modern more morrison national neto only optimization other pages paige paper papers parameter pattern platt point pointing potential practical prior problem problems proc proceedings proposed proximal qiang quality readme recognition references regardless relative require required requirement requirements requires research retrieval ribeiro rose routing saunders scale schapire sequential seventh sherman showed shows sigir sigkdd simple size sizes slightly solving sons space sparse squares standard standards statistical store strategies study successfully supplemented support supported svmlight tasks techniques technology tenth test text than that theory thirteenth this thresholding toms training trec tutorial twenty unbalanced unbalancedness usage used using validated validation vapnik vector version volume weight weighted wesley when whether which wiley with woodbury work working worth wpsvm yang yates http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.13 78 1 academic according accuracies accurate achieved acknowledgement actual adaptive advances 
algorithms also amherest amount analysis annual applications apply artificial asia associated association attributes automatic based blake both burges california categorical ccaiia challenge class classification clinical clustering cognitive committee commonly comparison composition computer concepts conclusion conf conference conjunction construct culture data database databases datamining dataset datasets decision default department described discovery distributed each ecml education engineering enough estimate evaluate evaluation evaluations expert fast fifteenth fifth five frank from future generating global grant gray hamilton hatazawa have hepatitis hettich higher hilderman hinton holmes holte html http human icdm ieee implementations indexes indicate inductive information inglis interest interesting interestingness international into introduce investigated irvine japan java just kaufmann kernel kitaguchi kluwe knowledge kopf kumar kume learning less lnai machine machines measure measures meningoencephalitis merz method methods minimal mining ministry mlearn mlrepository model models more morgan morris most multi negishi objective ohsaki ohshima only optimization oriented orlowska other pacific pakdd paper patterns peculiarity perform performance pkdd platt practical predict predicting press proc proceedings programs proper publishers quantitative quinlan references repository representations reprinted research result results right rule rules samples schol science scientists selecting selection sequential sets shows simple situation smola society srivastava such support supported suyama techniques than that them then these this tool tools training trans tree trees tsumoto university used using vector very wang well which whole will with without witten works yamaguchi yokoi young zhong http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.106 86 On Feature Selection through Clustering accuracy agglomerative agnes agresti airborne algorithms american analysis 
antonescu apply aquatic ares artificial attribute attributes backbone barthelemy based benis berthold blake blum breathes brown california cambridge came cancers categorical catsize class classification classifier classifiers clement cluster clustering coefficient computer computing conference correlation courtine cristiani data databases dataset dendrogram dept diagnostic discrete domestic eggs elisseeff ensembles evaluation examples explorations expression feathers feature features fifth figure finding fins fore francisco frank furey gene general graphics groups grundy guyon hair hall hamilton hanczar hannegar haussler height html http icdm ieee implementations improving increased incremental information intelligence intend interest international interscience introduction involve irvine jain java john kaufman kaufmann khan knowledge kohavi ladanyi langley learning leclerc legs listed machine machines maindonald math mathematical mathematics median medicine meltzer merz method metric metrics metriques microarray milk miners mining mlearn mlrepository monjardet more morgan nature networks neural ordered ordonnes pages parially particular partitioning partitions pattern peterson pnas practical predator prediction press problems procedure proceedings profiling propriet prototype providence recognition references relatively relevant remarques repository research ringner rousseeuw saal schwab sciences selection sets sigkdd simovici singla society sugnet support survey tail techniques than that thesis this thousands tools toothed type university using variable vector venomous waikato ward westerman wiley with witten wrappers york zealand zongker zucker http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.82 73 Integrating Hidden Markov Models and Spectral Analysis for Sensory Time Series Clustering Jie Yin and Qiang Yang Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong, China {yinjie, qyang}@cs.ust.hk 
abound academy acapulco accords accurate acknowledgment addition advances affinity alberta algorithm algorithms alon america analysis aone applications apply approach argued arma artificial assumptions augest august austin based bayesian becker benchmarks bennewitz best bile biswas bootstrap bostein boston both british brown burgard cadez canada city cliffs cluster clustering clusters cohen columbia combination computer conference czielniak data dealing december demonstrated demonstration denver diego dietterich different difficulty dimensional discovering discovery display document dubes dynamic editors edmonton effective eigenspace eighth eisen empirical englewood equal example experiences experimental expession explored fast fifth final find firoiu framework further future gaffney general generative genome ghahramani grant hall have hidden hkbu hong human icdm ieee individuals information intelligence international into introduced jain japan joint jordan judgements july june kasetty keogh knowledge kong larsen learning leen length like linear machine madison maebashi markov matrix means meila method mexico might mining mixture mixtures mobile model models monitoring motion mozer multi national need network neural noise noisy oates objects obtain other pages paradigms pattern patterns pavlovic persons petsche plan prentice press probabilistic proceedings processing produce projected proposed purpose rabiner random real recognition references regression remove replace results robot robots robust schmill sciences sclaroff segmentation selected sensor sequence sequences series seventeenth sigkdd similarity simplicity sixth smyth society spectral speech spellman stanford states supporting survey synthetic system systems temporal test texas text thank that their then these this time tracking traditional trajectory transforming tresp turn tutorial types under united used using utility utilizing vancouver variable vectors vision walks warping weiss where which wide 
wisconsin with work would xiong yeung http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.10 46 A new algorithm for finding minimal sample uniques for use in statistical disclosure assessment acknowledgments agrawal also arning assessment association balancing between centre challenge computing conference council data databases detection deviation disclosure efficient engineering feedback fifth funded grant here icdm identification ieee imielinski individual international invaluable items japan journal large like linear load management manchester maximum members method microdata minimum mining novel numbers official pages physical presented proceedings produce providing raghavan rank record records references research risk rules safe sciences sets sigmod society statistical statistics swami takemura thank university unsafe variables will would http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.68 43 Finding Maximal Frequent Itemsets over Online Data Streams Adaptively accommodate accuracy accurately adaptively adjusting after agarwal aggarwal agrawal algorithms although approximate approximating ases association available average basket because becomes better between bound brin chang changed changes closer compared concluding conference confined continuously cope corresponding counting counts current customer data dataset datawarehousing date defined deogun depth different discovery domingos dong dynamic either embedded enough estdec estimation evaluation execute executed expected experiment experiments fails fast fifth figure find finding first fluctuated frequency frequent from furthermore future garofalakis gehrke generated generation gets given guarantee guha hafez hand higher however hulten icde icdm ieee illustrate illustrated illustrates implication impossible increased increment information initial international interpret inversely item itemset itemsets kept knowledge koudas lakshmanan lambert large larger less likely line long longer look main mainly 
maintained making management manku market maximal maximize memory merged merging method minimized mining moment more motwani much node nodes notes number online only other over pages paper patterns performance pinheiro prasad prefix preliminary problem problems proc proceedings processing proportional proposed provides querying raghavan rastogi ratio real reason recent references remarks represented required requirement research respectively result results rules same secondary should shown shows sigkdd sigmod significant since situation size slightly smaller smerge smin space spencer srikant ssig storage stream streams structure successfully than that them there this threshold time timechanging times total trace traced transaction transactions tree tsur tutorial ullman unpredictable upper usage used user utilization value values varied varying vldb wang weblog when while widely with without workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.54 29 Effective Estimation of Posterior Probabilities: Explaining the Accuracy of Randomized Decision Tree Approaches Wei Fan1 Ed Greengrass2 Joe McCloskey2 Philip S. 
Yu1 Kevin Drummey2 aaai ability accuracy accurate adrian alberta algorithms also although american amit analysis appear appears approaches artificial asian august avenue averaging bagging bauer bayesian behaviors better between bianca bias bienenstock boosting both breiman building buntine canada carefully cart charles chipman chris class classification classifier clear combination comparison complete computation computing conference constructing correlated cost data datasets david decision decomposition definition dietterich difference different difficult dilemma directly discovery diversity domingso donald dorsat edmonton effectively effectiveness efficiency eighth elkan empirical ensembles eric error established estimate estimated estimates estimating estimation existing expected experimental explore explores family fifth find forests foster found four from function future geman george given good haixun have hill hoeting hope however hypothesis icdm ideally ieee improved increase increasing induction intelligence interested international into jennifer jose july kind knowledge kohavi labels learning lindsay looking loss losses lower machine madigan maximizing mcculloch mcgraw measure measurements measures melbourne method methods ming minimize minimizing mining mitchell model monotonically more multiclass national networks neural nineteeth nineth optimality original outputs pacific pages pakdd paper pedro philip plots posterior predicted predicting predictiong predictors probabilities probability probabilitybased problem proceedings proposed provost quantization raftery random randomization randomized randomizing range ranking recognition reduce reduction references regarded relationship reliability reliable results rules same school science scores search sensitive september shape sheng shown shows significant significantly single some space sporadic statistical statistics straightforward structured studies such syndney systematic technology than that their 
theory thesis third this thomas though three ting tony traditionally training transforming tree trees trends true tutorial university value values variance variants very volinsky voting wang well with work wray yali zadronzy http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.131 75 Sharing Classifiers among Ensembles from Related Problem Domains aaai able about acknowledge acknowledgement active adaboost adaptive adopts advances algorithm algorithms among anal annals anomalies another application applied applying approximation arcing artificial bagging bauer been before bluntly boosting breiman burer called card carefully caruana chan class classification classifiers closeyrelated combined combining company compare comparison conference connection content costly counted credit crew cross customers data database december definite definitions detection dietterich different directly discovery distributed distribution domain domains each editors effect email emails empirical ensemble ensembles example existing experiments extending extremely factorization fifth filter filtering filters first flexible forests fraud freund from future gather generalization geomans good grant graph hansen herfindahl however icdm idea ieee improved index induction information input instance instead intell intelligence intelligent interests international journal kaufman knowledge kohavi krogh ksikes label large lazarevic learning leen libraries like limited line mach machine manteo margineantu marketing mathematical maximum mccallum method methodology methods might mining mixture mizil model models monteiro more morgan multi multitask nets network networks neural never niculescu nonlinear november obradovic obtained often only other output overlap pages particularly partition pattern peer peers people possibly potential practical predictors press problem problems proc proceedings processing prodromidis product products programming programs promote provost pruned pruning quinlan quite 
random rank references related relationships relaxation research review rounding salamon samuel satisfiability scalable scale schapire science screen selection selective selectively semi semidefinite sent series seventh sharing sharkey sigkdd similar simple skewed solving spam stacked stacking statistics step stolfo strategies strategy streaming street structure studying support systems technology tesauro text that there these thing this those tially touretzky towards trading trained training trans treated tree trying under useful users using usually validation valuable variables variants vedelsby volume votes voting wang ways weiss when wikipedia will williamson with without wolpert work workshop would your zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.127 85 Segment-Based Injection Attacks against Collaborative Filtering Recommender Systems Robin Burke, Bamshad Mobasher, Runa Bhaumik, Chad Williams Center for Web Intelligence, DePaul University School of Computer Science, Telecommunication, and Information Systems Chicago, Illinois, USA {rburke, mobasher, rbhaumik, cwilli43}@cs.depaul.edu against algorithm algorithmic algorithms also analysis associates attack attacks audience august average barely based been benefit berkeley between beyond bhaumik borchers both broad broader burke california case cases collaborative conclusions conference construction cost data degree development diego difference distribution does dramatic edinburgh effect effected effective ensuring evaluating examined fifth figure figures filtering focused found framework from general generation group herlocker homes hong horror however hurley icdm identifying ieee ijcai impact implementations information injected injection intelligent interesting international internet introduce item itembased items january karypis knowledge kong konstan kushmerick likely limited mahony manner market mining mobasher models more most mount movie next number ones overall paper performing 
personalization point popular population prediction previous proceedings profile profiles profit pronounced purchasers push pushed random ratings ratio recommendation recommended recommender references reidl remains require research results retrieval riedl robust robustness sarwar scotland secure segment segmented shift shilling show sigir significant silvestre similar sizable small somewhat specific stable such sults system systems target techniques technology tervin than that their there these they this those transactions type unnecessary user users view vulnerable wasteful well were which while whole wide will with works workshop world york zabicki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.144 122 Text Representation: from Vector to Tensor* Ning Liu1, Benyu Zhang2, Jun Yan3, Zheng Chen2, Wenyin Liu4, Fengshan Bai1, Leefeng Chien5 academic achieve addison agenda algebra amherst analysis annual application applications applied approaches aslam automatic baeza based belkin better callan cannot categorization cavnar center certain challenges chris component computer conclusion conference contrary cornell croft customization data dataset decomposition department design determines dimension disadvantages document documents done dumais experimental fifth figure filter found future gerard gram harper held hiemstra hofmann hosvd icdm icml ieee implies improves increase information instances intelligent international introduction investigation jolliffe journal keeps kernel kluwer lafferty lang language latent lathauwer learning limited machine many massachusetts matrix measurement merits mining model modeling modern moor more most multilinear netnews neto newsweeder number originally others outstanding paper performance principal problems proceedings propose proposed rank reduce reduced references report represent results retrieval ribeiro samples science sdair show shows siam similarity singular some space spriger structure such symposium tasks technical 
techniques tensor tensors term testing text than that theorems theoretical there this traditional trenkle under underlying university using value vandewalle vector vectors verlag weighting wesley while wiley work works workshop wrede yates york zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.92 141 Merging Interface Schemas on the Deep Web via Clustering Aggregation across aggregation ancestors application approach automatic average barbosa based book chang clustering commerce common conference data databases deep doan domain domains effectiveness estate expressions fifth four freire from further gionis gmax handling hidden high icde icdm ieee improves increase increases indicate indicates inferring integrating integrator interactive interface interfaces international irregular irregularities lmax lowest mannila matching measured meng mining observe optimization over overall performance persc presc prevalence proc proceedings query range real references relational sagiv schema schemas score scores search searching shows siam sigmod significantly source statistical szymanski table that their these this tree tsaparas ullman vldb webdb wise with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.45 45 Combining Multiple Clusterings by Soft Correspondence aaai accumulation accuracy acknowledge acknowledgments advice afrl aggarwal aiam alexander algorithm algorithms also anal analysis application award bagging based bioinformatics bipartite brodley buhmann classifier cliffs cluster clustering clusterings clusters combining comparing comput computes conclusions conference consensus consistent contrasting correctness correspondence cspa curves data dataset define demonstrate dempster dimitriadou distance domain dubes dudoit effectiveness empirical englewood ensemble ensembles evaluations evidence expression extensive factorization fast fern fifth finding fischer force framework fred fridlyand from function gene ghosh given grant graph graphs grouping hall have 
high hornik hypergraph icann icdm icml icpr ieee improve incomplete intell intelligence international iris irregular isolet iteratively jain john journal karypis kellam knowledge known kumar laboratory learning likelihood literature mach matrices matrix maximum mcla medicine merging method mining mixture mmec model multilevel multiple multiplicative nature negative nips novel number objects optimal page pages paired paper part partition partitioning partitionings partitions parts path pattern pendig pharmocology prentice press problems proc procedure proceedings propose proposed punch quality references reported research results reuse royal rubin rules salerno scec scheme segmentation seung several shekhar shown siam smooth society soft solution solving statistical strehl superior supported systems table tests texture that theoretically third this through topchy trans tucker under updating using values viral vlsi voting weak weingessel well with workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.63 96 Feature Selection for Building Cost-Effective Data Stream Classifiers (Extended Abstract) Like Gao, X. 
Sean Wang Department of Computer Science, University of Vermont, Vermont, USA {lgao, xywang}@cs.uvm.edu aggarwal algorithm analysis artificial choose classification classifier conference confirm cost count data decision demand detection ding effectiveness empirical ensemble evaluate evaluation experimental experiments feature features fifth genetic hybrid icdm icml ieee induction intelligence international journal kira kononenko large learning machine method methods mining motoda national networks pages peano performed perrizo problem proceedings references relieff rendell research results robnik rrelieff same sampling scale selection selective sensitive sensor sigkdd small source spatial streaming streamrelief streams street that theoretical this traditional tree trees turney using validated wang with yield http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.134 143 Speculative Markov Blanket Discovery for Optimal Feature Selection accuracy adversely affecting agresti algorithm algorithms aliferis analysis archive ascertain augustine bayesian blake blanket blankets called categorical compared conclusion conference contribution data databases dept difficult direction discovery ditional each editors empirical employs execution existence existing faithful fast faster fastiamb feature fifth figure florida from future hettich html http iamb icdm ieee improved indicates induction inference information intelligent inter international internationalflairsconference irvine john kaufmann kollerandm large learning leen local machine main maintaining margaritis markov merz mining mlearn mlrepository more morgan muller neighborhoods network networks neural nips novel often optimality pages paper pearl performs plausible potential practice press probabilistic proceedings processing publishers reasoning recover recovered references relaxing reliability reliably repository requirement research respect results sahami scale selection show solla sons speculation statnikov systems 
test tests than that theoretical this thrun times total towardoptimalfeatureselection tsamardinos underlying which while wiley with without york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.41 99 CloseMiner: Discovering Frequent Closed Itemsets using Frequent Closed Tidsets above adamo agrawal algorithm algorithms almost also apriori association august bases bastide because becomes berlin between both cambridge candidate case charm chess closed closeminer closet cluster clusters compared computer conclusions conference connect data database databases datasets davey department diffsets discovering discovery disjoint each efficient engineering evaluations example experimental explores fact fast fifth find frequent from gauda generate generating generation groups heidalberg hsiao icdm icdt identical identified ieee imielinski indata iner institute international into introduction issues items itemsets january june knowledge lakhal large larger lattices linearly longer magnitude management march mining more mushroom narrower navathe number october omiecinski only operations order orders outperforms overlaps pages paper pasquier pattern patterns perform performance polytechnic present press previous priestley proceeding proceedings reasons redundant references rensselaer report research rule rules same savasere scalable scales science sequential sets several sigkdd sigmod simultaneously smaller space sparse springer subsumption support swami synthetic taouil technical that theory this tidset time transaction transactions university unlike using values verlag vertical very which with without workshop york zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.50 54 Discovering Frequent Arrangements of Temporal Intervals about abraham actions additional afshar agrawal algorithm algorithms allen applicability approach arrangement arrangements association assumption athitsos ayres based bastide bayardo behavior bide bitmap boulicaut burdick calimlim candidate 
charm chen cikm closed closet clospan computer computers conference consecutive constraint constraints data databases dataset datasets dayal decreasing demonstrates dense density direction discover discovering dmkd efficient efficiently eliminating enumeration episodes euvrard evaluation event events examining experimental expression fast faster features ferguson fifth figure finding flannick freespan frequency frequent from further future garofalakis gehrke generation gestural gospade growth gunopulos hence hsiao icde icdm icdt ieee improves incorporate incremental instantaneously instruments interesting international interval itemset itemsets july karypis knowledge lakhal language large learning leleu length linguistic logic london long longer machine mafia make mannila maximal medium meta method methods mining mldm mortazavi need neidle novelty occur over pages partial pasquier pattern patterns performance petrounias pinto pkdd prefix prefixspan presented proc proceedings projected quent rastogi reaching references regular repetitions report representation research results rigotti rochester roddick rules runtime scalable sclaroff seconds seno sequences sequential sets shim siam sigkdd sigmod signstream slpminer smaller solve some spacesaving spade spam sparse spatiotemporal spirit springer srikant sstd subsets support synthetic taouil technical temporal that these they toivonen tool transactional tree tsoukatos university usefulness uses using verkamo verlag vision visual vldb wang warehousing with without work workshops zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.9 84 A Levelwise Search Algorithm for Interesting Subspace Clusters academy adding agrawal algorithm almost alon also analysis another application applications array arrays assciation attribute attributes automatic axis barkai based because becomes being below between binary blake both broad cancer case certain change cheng choudhary clustering clusters colon columns combinations concept 
conceptual conference connected contain contains contributions cpdc crisp data databases dataset dense density difference different dimensional dimensionality discovery discretized distinctive duplicates each effectiveness efficient engineering entropy evaluated every explanation exponentially expressed expression fast fifth figure finding fragment from gehrke gene genes gish goes goil gunopulos having high iccs icdm ieee ijcai imielinski increases increasing instances international into items kailing knowledge kramer krger kriegel large lattice lead learning least levelwise levine lindig logarithm machine mack mafia main management many means memory merz micro minimum mining minuscule missing molecular more mushroom nagesh national normal northwestern notterman number numerical object objects oligonucleotide only other pages patterns probed proceedings produces property pruned pruning raedt raghavan ratio reaches reduces references refers regions remains removed removing report repository representing respectively revealed rows rules samples scalable science sense sequential sets several show shows siam sigmod significant similar size slight small space srikant stable stays structures subspace subspaces succinct succinctness suggest swami technical tests that this three threshold thresholds tissues tried tumor university using values version very when where while with working ybarra zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.23 127 An Expected Utility Approach to Active Feature-value Acquisition accuracy accurate acknowledgments acquisition acquisitions active alternative among analysis approach artificial atlas based bayes blake budgeted builds candidate capture classifier classifiers cohn completion computationally concept conf conference consistently constraining cost costs could darpa data databases demonstrate direction economical error estimated estimation expected experiments faster feature features fifth foster found francisco frank 
generalization grant greiner html icdm icml identified ieee implementations improvement improving induction inductive informative instance instances intelligence intensive interactions international intl irrelevant java john kaufmann kohavi ladner learning little lizotte loss machine madani made maytal mccallum melville merz method methods mining missing mlearn mlrepository model models mooney more morgan most much naive number obtains only optimal padmanabhan pages paper performance pfleger picked potential practical preliminary prem problem proc proceedings propose proposed provost queries random raymond reduction references repository restrict results rubin saar same sample sampled sampling search selection sensitive setting show significantly sons statistical subset such supported technique techniques than that this through tools toward tsechansky turney types uncertainty uniform unit using utility value values were wiley with without witten workshop wrapper zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.108 69 On Reducing Classifier Granularity in Mining Concept-Drifting Data Streams accuracy accurate adapted adaptively addressed adjusts agrawal algorithm algorithms already analysis approach approximate associated association baile based because been bing blake boat boston break california callaghan center challenge chang change chen china class classification classifier classifiers closed clustering cmar component compromising computer concept concepts conclusion conference confidence consistent construction cost counts current cvfdt data databases dataset datasets datasize decision dept designed detection determine dimensional domingos dong down drifting drifts easy effects efficient embodied ensemble error even experiment experiments extensive fast faster fifth figure finding focs framework francisco frequency frequent from ganti gehrke granularity growing guha haixun hardly high hongkong http hulten hxwang icdm ieee important impossible 
incrementally information integrat international into issue itemsets jian jiawei keep large learn learning level life limited machine made maintainable maintaining major manku matched means mehta merz milshra mining model models moment more most motwani much multi multiple muntz nick nursery online only operation optimistic over overcome pages paper parallel peng philip pieces pose predicates press procedure proceedings publications quickly rainforest ramakrishnan rate real recent reducing references regression report reported repository research results revise richard rule rules runtime scalable scale science second semantically series shafer showed shown shows sigkdd sigmod sliding small smaller space speed spencer sprint static still stream streaming streams street structure support synthetic task technical than that they this those time timechanging tree triggered ucla univ unlike update uptodateness using very vldb wang wangtech watson wenmin when which while window with without wynne xiaochen yiming yongseog http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.97 107 Mining Ontological Knowledge from Domain-Specific Text Documents Xing Jiang and Ah-Hwee Tan School of Computer Engineering, Nanyang Technological University Nanyang Avenue, Singapore 639798 {jian0008,asahtan}@ntu.edu.sg acquisition america analyzing approach assessing august automatic auxiliary berners better building complex compterm computers conference containing content corpus crctol data demonstration difficulty discovery documents domain effective entries environment especially extracted extraction fifth form full gruber handing handle handling hendler however iccs icdm ieee include incorrect indicates international into iswc kavalec knowledge lassila learning lexical maedche meaningful method mining missikoff more mori much nakagawa navigli networks nevertheless obtained only onto ontology pages poorer portable possibilities powerful precision proceedings purpose rajaraman recall 
references relation relations revolution scientific score second semantic sentence sentences simple sofsem software specification staab still structure svateck table taxonomic technique term text than that there third this total translating translation unleash usable used velardi verbs were which will with works wrong http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.148 26 Usage-based PageRank for Web Personalization Magdalini Eirinaki, Michalis Vazirgiannis access accesses account accuracy adaptive aktas algorithm algorithms already also analysis analyzing anand anatomy answers applied approximate archive artificial aueb august available baraglia based baumgarten before behavior behaviour bhowmick borges boston both brin buchner cadez cambridge capitalizing case chains changes china claim clustering coinciding combination computational computer computing conceptual conclusions conference confirm content correlation data databases dataset datasets decision delaying described deshpande diego discovery distinct domain domingos dynamically efficient eirinaki engine enhance enhances entropy etzioni evaluate expanding experimental experiments faulstich february fifth first framework france frequent from future gaffney garofalakis general gibbons giles given graph graphics graphs greece hansen haveliwala hawaii heckerman high history however html http hughes hypertext hypertextual icdm ieee ignoring importance improve incorporation incremental individuals information intelligence intelligent international internet intl into ioannidis italy journal july june justified karypis kendall kingdom large learning less letters levene link lncs logs loizou machine making manavoglou markov masseglia meek menczer mentioned methods miner mining model modelling models more most motwani msnbc mulvenna nacar nature navigation navigational networks neural nevertheless next novel november number objects october online other outperformance oxford page pagerank pages pageviews paper 
papers paris path pattern patterns pavlov perkowitz personalization personalized personalizing pkdd plan point polyzotis poncelet possible predicting prediction predictions press previous probabilistic probability proc proceedings process processing profiles profiling promising properties propose proposed provide query quite raghavan randomized rank ranked ranking recommendation recommendations recommender references report results revised richardson roadmap ruiz rundensteiner sarukkai sayed scale search seattle seconds selective semantics sensitive september sequence sewep should siam sigmod sigweb silvestri site sites small smyth solely spiliopoulou statistics stress structural structure study surfer synopses synopsis system systems take takes taxonomy technical techniques technology teisseire than that there therefore this toit topic towards united university usage used user users using value varlamis vazirgiannis very visited visitors visualization vldb washington wealth webkdd which white whole widm wilkler without work workshop zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.129 83 Semi-supervised Mixture of Kernels via LPBoost Methods Jinbo Bi Glenn Fung Murat Dundar Bharat Rao Computer Aided Diagnosis and Therapy Solutions Siemens Medical Solutions, Malvern, PA 19355 jinbo.bi, glenn.fung, murat.dundar, bharat.rao@siemens.com adapt algorithm algorithms allowing averaged bartlett basis bazaraa bennett boosting capability clearly column conference considered construction cristianini data demiriz different discovery efficient embrechts enhance error examples exploitation fifth figure from functions generation geometric ghaoui heterogeneous hill icdm ieee inductive international involved john jordan journal kernel kernels knowledge labels lanckriet learning linear locate machine mark matrix mcgraw methods mining mixture model models momma nash nonlinear optimization pages performed ples plot points prediction present problems proceedings programming 
properties randomly rate rates references remaining research sample semidefinite settings shawe sherali shetty sigkdd size sofer solved sons ssmk statistical supervised take target tasks taylor techniques test then theory these through train training trials unlabel unlabeled used using vapnik various versus very were wiley with york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.152 27 WARP: Time Warping for Periodicity Detection aaai adaptive addison agrawal algorithms analysis applications approach aref based brendt brockwell bunke clifford conference convolution crochemore data databases datamining deepening detection discovery dynamic edbt editors elfeky elmagarmid event exact experimental faloutsos fast fifth find finney forecasting future gershenfeld handsoff hart hellerstein icde icdm identifying ieee index indexing indyk instruments international iterative july june kandel keogh knowledge koudas large last massachusetts massive mine mining muthukrishnan noise novel obscure oxford papadimitriou park partially pass past patterns pazzani periodic periodicity periods prediction presence press proceedings publishing reading references representative review rytter sawhney scaling scientific search segmenting sequence series sets shim similarity sketches stream supporting survey symbolic text time tkde tracy translation trends understanding university unknown using vldb warping weigend wesley with workshop world http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.79 38 HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence above acmsiam actually algorithm algorithms almost annual anomalies applications archive array artificial based benchmarks bentley bitmaps brute calls chiu coerman coincidence company compressed computation conclusions concretely conference consider current currently data databases datael dataset datasets defined demonstrated demonstration differences discord discords discovery discrete distance distinguishing diverse 
domains eamonn easy efficient efficiently empirical even every expect experiment experiments factor fast faster fifth figure figures find following force free from function future general ground happy have heuristic hill hours icdm ieee implementation implications increase information intelligence international introduced introduction isaac issues japanese jsai kasetty keogh knorr knowledge koski kumar lanctot lankford large larger least leiserson length lethargic lncs lolla lonardi made magnitude make massive matlab mcgraw mining monitoring most motif motifs motion need note numbeg number numbers nystrom obvious orders otsll outliers over parameter patterns pessimistic practical primitive probabilistic problems proc proceedings qtdbs query rando range ratanamahatana reasonably references relatively repeated representation representative required research result results right rivest sadakane search searching seconds sedgewick selection series setting several shown shows siam sigkdd sigmod single sizes society sorting speedup streaming string strings strongly such suffix suggest sure survey symbolic symposium take tanaka test tested text than that there these this thousand three tickwrp time times tool towards truly tucakov uehara unusual using values visualization visually vldb walk wang were while with work working workshop zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.60 33 Extracting Frequent Subsequences from a Single Long Data Sequence: A Novel Anti-Monotonic Measure and a Simple On-Line Algorithm Koji Iwanuma, Ryuichi Ishihara, Yo Takano, Hidetomo Nabeshima Department of Computer Science and Media Engineering University of Yamanashi 4-3-11 Takeda, Kofu-shi 400-8511, Japan {iwanuma,ishihara,takano,nabesima}@iw.media.yamanashi.ac.jp acknowledgments agrawal algorithm also anti applications approximate artif asynchronous bags between called chang chen chubu cikm coling computational comut conf conference correspondences counting counts culture data 
database databases datar dayal detecting developing differences discovery documents duplication education efficient efficiently electrotechnogly elements elsewhere engineering enumeration experiment extracting extraction fast feature fifth finding foundation freq frequency frequent from fukumoto generalizations gionis grant growth helsinki here html htoivone huang iasted icde icdm ieee improvements indyk infinite information intel inter international iwanuma japan karp knowledge line linguistics maintaining management manku mannila maximal method mining ministry monotonicity mortazavi motwani multiple nabeshima nakamura near news occurrences over papadimitriou paragraph partly pattern patterns performance periodic pinto predicting prefix prefixspan problem proc proceedings projected property proposed pruning references research results satisfies science search secondly sequence sequences sequential series shall shenker shifts show siam sigkdd simple single sliding sorts srikant statistics stories stream streams subsequences supported sure suzuki syst teaching temporal this time timus toivonen total trans using very verylong wang windowlength windows with without yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.76 32 Hierarchy-Regularized Latent Semantic Indexing Yi Huang1, Kai Yu2, Matthias Schubert1, Shipeng Yu1, Volker Tresp2, Hans-Peter Kriegel1 huang@cip.ifi.lmu.de {kai.yu,volker.tresp}@siemens.com {schubert,spyu,kriegel}@dbs.ifi.lmu.de acid additionally advances algorithm algorithms allfea almost always american analysis annual application applications approach apweiler authoritative averaged bairoch banff based become belonging better between bianchi biology biometrika blatter blockeel boeckmann both bruynooghe called canada canonical capable cases categorization cesa changed chemnitz chen choice cikm class classes classifcation classification classifying close comparative comparison completely component computer conclusions conf conference 
connection consortium content correlation curves cuts cvpr data deerwester degrade dekel demanding demonstrate depends derive described descriptive develop development different dimensional dimensionality dimensions direction directions document documents donovan dumais dzeroski each ecml editor edmonton egorization employing enables encyclopedia environment error especially estreicher european evaluations exactly examination examine experiments feature features fifth figure figures fisher found francisco from furnas future gasteiger gene general genetics gentile global graphs guide hardoon harshman heidelberg here hierarchical hierarchically hierarchies hierarchy highly hlsi hofmann holloway hotelling however human hyperlinked icdm icml ieee image implies improves improving incorporates incremental indexing induced information integrate interesting international into introduced joachims journal kaufmann kernel keshnet kleinberg knowledge knowledgebase koller kopf landauer large last latent learning least like linear little ller london machine machines macro madison malik manage many margin martin maximum mccallum mean measured ment method methods michoud micro mining mirror mitchell models morgan mrdm mulitlabel mult multirelational nashville nature need neural nips normalized nucleic number oles omit only ontology optimal outputs overlapped overview pages paper parameter partial pattern pedersen performance phan pilbout practice principal proc proceeding proceedings processing proposed prot protein proximity publishers quality quite ramon random realworld recognition references regularized relations relevant repeats report research retrieval robustness rosenfeld rousu royal sahami same saunder schneider schol science sciences section seems segmentation selection semantic sensibility sensitive sets setting shawe shop shows shrinkage sigir sigkdd since singer small smola society some sources spaces specificity springer squares statistical stay structure structured 
struyf study subplots suited supplement support swiss swissprot systems szedmak tasks taylor technical text than that there they this thus tironi tool transformation trembl underlying understandable unification university uses using valid values vancouver variables variety vector verlag very vision washington well which while whistler with within without wold words work workshop yang zaniboni zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.109 142 On the Stationarity of Multivariate Time Series for Correlation-Based Data Analysis Kiyoung Yang and Cyrus Shahabi Computer Science Department University of Southern California Los Angeles, CA 90089-0781 [kiyoungy,shahabi]@usc.edu acknowledgement adjusting advances algorithms american analysis application april association attribute auslan author autoregressive baeza bank based beach been before between biomed birbaumer bogdan bozkaya brain bulletin california cambridge cash channel classification clustering coefficients cointegrated cointegration collected comparison component components computer concisely conclusions conference control correction correlation crash cybern data dejong determination dickey diff distribution econometrica economic education elder elger empirically engle eros error estimation estimators expressed extend extending feature federal fifth figure findings foundation frakes from fsdm fuller funded gifts given granger grants great grouping groups hall have hill hinterberger human hypothesis icdm ieee improves imsc income incrementally indexing inference information integration intend interfaces international invasive itself jansen jasa johansen jolliffe journal june kadous kennedy knowledge kopf krzanowski large likelihood louis making march material measure methods metric microsoft mining mmdb models money more motion multi multivariate nankervis national necessarily necessary neural newport observations opinions original oxford ozsoyoglu pages paradigm part pecase performances performed 
performing perron prabhakaran precision prentice press price primer principal proc proceedings processes processing queries ranking recall recognition recommendations references reflect report represent representation research reserve retrieval root roots rosenstiel savin schol schroder science search seborg segmentation selection september sequences series shahabi shock should shown similarity singhal south southern spaces springer stably stationarity stationary statistical stream structures students subsequent subset such supervised support swift syst systems taught technical technique temporal terms testing that thesis this thornton those thus time timeseries tods towards trans trend tucker unit university unrestricted using variable vector versus views volume wales washington well weston what where whiteman widman with yang yates yoon york zhai zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.35 28 Bifold Constraint-Based Mining by Simultaneous Monotone and Anti-Monotone Checking Mohammad El-Hajj, Osmar R. 
Zaïane, Paul Nalos Department of Computing Science, University of Alberta Edmonton, AB, Canada {mohammad, zaiane, nalos}@cs.ualberta.ca address advantages agrawal alberta algoproceedings algorithm algorithms almaden along also anti approach approaches argue argued asia association august australia bailey based bases battery been before between bifoldleap bonchi bonsai both brighton bucila bulletin burdick calimlim candidate case chaudhuri checking chicago chile closed code cofi committee compared conducted conf conference consideration constrained constraint constraints convertible costs could crushing dallas data database databases dawak despite development discovered discovering discovery does dual dualminer early edmonton effective efficiency efficient eight engineering evaluate exact examiner exclude existing experiments expressing fast fifth fimi find florida focus fraction frequent from furthermore gehrke generation giannotti goethals growing hajj help helsinki herein http iane icde icdm identify ieee imielinski implementation implementations indeed index integrated integration internationa international intersection into intricately introduces issue items itemset itemsets jumping jumps kifer knowledge lack lakshmanan language languages large lattice launch leap level lucchese mafia management many march maximal mazzanti melbourne mining monotone more need node november number ones optimization optimized outperforms overhead overwhelming pacific pages pakdd pang paper paradualminer parallel pattern patterns pedreschi possible problem proc process propose pruning push pushing queried queries query quest ramamohanarao reasons recognized recursive reduce references regardless relevant report repository representations required research resources rithm rules santiago satisfy search selective september sets show showed shtml sift sigkdd sigmod simultaneously small software solve space srikant state structures studies suggests supports swami sydney synthetic 
systems technical techniques technologies that their them these this through ting transactional traversal traverses tree trees useful uses using usually variable very violations warehousing washington where white widely wise with without yields http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.52 72 Discriminatively Trained Markov Model for Sequence Classification aaai abbeel accuracy accurate acids acknowledgment additional advances aimed algorithm algorithms analysis andorf appel applications approach artificial assigning augmenting avoid bairoch baldi based bayes bayesian between bhasin biochem bioinformatics bioinfromatics biologists blast bouchard brunak built cambridge canada carbonell case cases categorization chain charniak cheng cikm classes classificatin classification classifier classifiers combine combines combining communication comparison complexity composition compstat computational computationally computer conditional conference counterpart cowell crfs data dawid dependency descent described development dimensional dipeptide discovery discriminative dobbs document domingos dumais edmonton efficiently eighteenth either eliminates equivalent error eskin eslpred estimated eukaryotic event examined example expasy expert exploit exploration extensions families features febs fields fifth finally form from function functional further future gene general generation generative generatve gradient gram grant graphical grossman growing grunw hastie have health heckerman higher hochstrasser honavar hubbard hybrid iasc icdm icml identification ieee images improve inducing inductive infeasible information informative initialized instead institutes intelligence interest interface international introduced jordan june kbcs kernels klein knowledge koller language lauritzen learning leslie letters likelihood limited localization location locations logistic lower machine management margin markov maximize maximizing mccallum method methods mining mismatch model 
modeling models momentum more mullymaki multinomial naive national nature need network networks neural nigam nips noble noted nucleic observed obtained ongoing ontology opposed optimization optimizing order over overfitting pages paper parameters part pass peng platt prediction presented press probabilistic problem proc proceedings processing proposed protein proteinprotein proteins qian raghava raina random rate recently recognition references regression regularization reinhardt related relational representations research residues resulting retrieval roos roose rubinstein sahami search seetharaman separation sequence sequences server seventh shen showed shown shuurmans silvescu single sophisticated special speech spiegelhalter springer springerverlag stage statistical statistics strengths string strings structure structures subcellular support supported symposium synergistically systems taskar techniques term text than that their theoretical theory there they this through tirri tools topological trade training trenchs triggs tunes uncertainty using vapnik vasant vector wang ways well weston wettig which with work workshop yielding york yuan http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.16 76 A Visual Data Mining Framework for Convenient Identification action actionable active adomavicius agrawal algorithms analysis analyze another applications approach association attribute attributes autonomous azevedo based been behave behavior belief blanchard briand browsing bruzzese categorical cell cercone chambers chapman chen cikm classes classification clearly cleveland cohen comparative comparison comput conclusions conference confirm convenient coordinates crystalclear customer data databases davino decision demonstrate demonstrates deployment design designers deviations discovered discovering discovery distribution dmkd dmql domain down drill driven ecml engineering enhanced environment exception exploratory fast fifth figure find first framework from function 
further general generated graph graphical guillet hall have hellerstein helps hierarchy hofmann house icdm identification identifying ieee implemented important impressions improve industrial information infovis inspired integrated integrating interactive interesting interestingness international jorge kaufmann keim kleiner know knowledge koperski language large leads learning life linked lucie machine make makes making managers many matheus matrix method methods mine miner minimum mining model morgan mosaic most multimedia multiple needs number ongoing opportunity ordering padmanabhan paper parallel part partner patterns piatesky pieces pkdd plots pocas possible post prentice press problem proceedings process processing product program proposed quality query querying quinlan real reasons references relational reliable represents results reviews rule rules ruleviz rummaging saint scale schaller sets shapiro shows siebes sigmod silberschatz similarly srikant step study summary support suzuki systematic systems team techniques technology terninko test text that their them this thomas thus tirpak trans transfer tukey tuzhilin unexpected useful user users using value values very visual visualization visualizes visualizing vldb wang what which while whitney wilhelm with wong working workshop xiao zaiane zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.71 106 F S3: A Random Walk based Free-Form Spatial Scan Statistic for Anomalous Window Detection Vandana P. 
Janeja and Vijayalakshmi Atluri MSIS Department and CIMIC, Rutgers University {vandana,atluri}@cimic.rutgers.edu Abstract about account advances alamos alarms algorithms american among analyzed anomalous application applications applied approach approaches area around association athas atluri attributes aurenhammer autocorrelation barber barnett based better between births boundaries brain breach cancer capital carolina cases center centers circle circles cluster clustering clusters communications compare compared comparison compiled comprises comput computation computational compute computed computing concepts conclusions conference considered consisted consists coordinates corresponding counties county creation critical cylinder cylindrical data dataset datasets death deaths delaunay denote department depending described detecting detection diagramsa discuss discussion disease disproves distribution drawn each earlier edition either enhancing ensure entirely equal evaluating experimental experiments extend falls feurer fifth first fixed form free from fundamental future generated generation geographic geometric geometry gordon graph harel have health heterogeneity higher human hypothesis icdm identified identify ieee inaccurate including indeed indicate indicates infant information intend international into janeja john jonathan journal keeping koren kulldorff large lewis likelihood limit limitations line linear list live mathematics maximum means mesh meth mexico miller mining model moore moreover most multidimensional multiple naus neill network neural next ninham node nodes north noted number obtained occurrence often only original other outliers pages paper parameters patil performed performs points poisson population position probability proceedings process processing proposed public random randomize ranges rectangle references refinement refute region regular repeat resources restricted result results scan scans science seat seats second section select 
semantic several shape sids significance significant similarly simulated size sons sort space spatial specific state states statistic statistical statistics step steps strategy structure sudden sufficiently surv surveillance survey symposium syndrome systems take tallie technique techniques terms test testing than that then theory therefore these third this those thus time times total town tract traditionally transition triangular unchanged until unusual uses using value values voronoi walk walks weight whether which while wiley will window windows with within work years york zones http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.99 108 Mining Patterns That Respond to Actions aaai above academic actgain action actionability actionable actions actionset address adomavicius adopted also among analysis applicable approach assume availability available based bottleneck bottomup capture certain check child classification compute computed computing conclusion concrete conference contribution cost criterion customers data databases decision derman details determine deviations discovery domain duda each edbt empirical entire error estimated estimation evaluation expanded extract feature features fifth finite fitting framework fraser from fully function future generates hart hierarchy higher icdm ieee ignoring immediately important incorporating increase independent interestingness international introduction issue jiang journal kaufmann keep kleinberg knowledge learning machine manner markov matheus maximize maximizes method microeconomic mining modeling modification morgan need node observed obtain obtaining opportunities over pact pages papadimitriou partition pattern patterns pessimistic piatesky population postprocessing potential practice presented press previous proceedings process profit programs projected prune pruning quinlan raghavan references report reported respond scalable scence school science select shapiro shows similar simon state studied subtree such 
technical than that then therefore this topic tree trees tuzhilin type under university user using utility view wang whether wiley will work works workshop yang york zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.90 112 Making Logistic Regression A Core Data Mining Tool With TR-IRLS accurate appears autonlab available chang cjlin computational computing conference csie data details elements faster fifth fitting friedman from generalized gentle hastie http icdm ieee implemented international irls komarix learning least library libsvm linear logistic machines mining model other paper procedures proceedings references regression simple software sources springer statistical statistics support svms this tibshirani tion used vector verlag very with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.38 44 CanTree: A Tree Structure for Efficient Incremental Mining of Frequent Patterns Carson Kai-Sang Leung Quamrul I. Khan abiteboul according acknowledgement adjusting adjustment afpim after agrawal algorithm algorithms also alternatives analysis approach arkun arranges association associations avoids ayan based baskets bayardo became because before between beyond block bonchi brin bucila canada candidate canonical cantree cantrees captures caused changed changes charm chen cheung clifton closed closet conclusions conference constrained constraint constraints construction content contribution convertible correlations counting cover dasfaa data database databases dawak deleted different dimensional discovered discovery distinct divided dmkd does domain dualminer dualpruning dynamic early easily efficient efficiently engineering entire experiment exploiting explorations exploratory fast fifth figure flocks form fourth frequency frequent from fssm fukuda gade gehrke general generalization generalizing generation grants hash hence hidber high higher however hsiao huang iane icde icdm ideas ieee imielinski implications incremental infrequent inserted integrating 
interactive international into introduced introduction item items itemset itemsets june karypis kifer knowledge lakshmanan large larger leads leung linear long longer lower lucchese maintained maintaining maintenance manitoba mannila market means merging method mining modified moreover morimoto morishita motwani needed nestorov nice node nodes novel nserc number online optimizations optimize optimized order ossm pang paper park partially pattern patterns portions powerful presence probability problem proc proceedings project properties proposed provide pruning query rebuild reconstruction references relational relue require required rescan research results rosenthal rule rules runtime sarawagi scalability scan scheme science segment segmentation sept sets several shieh show sigkdd sigmod silverstein simple some specifically splitting sponsored srikant structure structures studied succinct such support swami swapping system systems tansel technique tested that these third this thomas tkde tods tokuyama tough transaction transactions tree trees trimming tsur ullman unaffected university update updated updates updating used user using visualization vldb wang were when which white with without wong worsened zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.5 41 A Generic Framework for Efficient Subspace Clustering of High-Dimensional Data Hans-Peter Kriegel, Peer Kröger, Matthias Renz, Sebastian Wurst Institute for Computer Science, University of Munich {kriegel,kroegerp,renz,wurst}@dbs.ifi.lmu.de academic account adaptive addition agarwal aggarwal agrawal algorithm algorithms applications approximations automatic based baumgartner better biologically biolology both cancer capturing carlo cell cerevisiae cheng choi choudhary class classification clearly clique cluster clustering clusters comprehensive computes concepts conference connected correlation cycle dash data databases datasets deltaclusters densities density densityconnected different dimen 
dimensional dimensionality discovering discovery effectiveness efficiently entropybased ester evaluation existing experimental explorations expression fast feature fifth filter finding fires from gehrke gene genes global goil golub grids gunopulos haque high highdimensional hybridization icde icdm identification ieee inadequate including incooperated interesting international into jones kailing kamber known kriegel kumar large local massive maximum meaningfull methods microarray mining molecular monitoring monte nagesh noise noisy notion numerical obtain oger order outperforms overcomes parsons plant prediction preferences press problems proc proceedings procopiuc projected projective raghavan references refined regulated review runtime saccharomyces sander scalability scalabilty scheuermann selection sets several shapes shown siam sience sigkdd sigmod significantly sizes solution spatial spellman steinbach subclu subspace such techniques terms than that thorough threshold true varying wang well which with yang yeast zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.2 63 A Border-Based Approach for Hiding Sensitive Frequent Itemsets according agrawal algorithms also applied approach aspects association atallah balancing based bertino border borders canada carvalho clifton concept conclusions conference confidence considered context contributions controlling dasseni data database databases degradation disclosure discovery during effect effectiveness efficiency efficiently efforts elmagarmid evaluate evolving expense fast fifth first focus following fovino frequency frequent have hiding ibrahim icdm ideas ieee impact implications importantly include information international itemset itemsets kdex knowledge levelwise limitation maintained maintaining mannila marks meira minimal minimize mining modification modifications montreal most number oliveira only originality pages pakdd paper parthasarathy performance possas preserved preserving prevent previous 
privacy problem proc proceedings process proposed protecting provenza provided quality record references relative result results rule rules sanitization saygin search secure security select sensitive sharing side sigmod small srikant state studied superior support terms their theodoridis theories this thus tkde toivonen unknowns using veloso verykios vldb well were with work workshop worshop zaiane zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.42 117 CLUGO: A Clustering Algorithm for Automated Functional Annotations Based on Gene Ontology In-Yee Lee1,2, Jan-Ming Ho2, Ming-Syan Chen1 adaptive algorithm alignment altschul annotation annotations automated barrett based basic beissbarth bioengineering bioinformatics biol biological biology bussey categorizer chen clustering clusters cluto computational conference consortium contracts council creating data database datasets decker design dexa dimensional distributions feng fifth find finding fojo fulmer functional gene generating generic genes genome genomic gish gofigure gominer gomit gostat gotm gotree graph group groups heaton hierarchies high http icdm ieee implementation intelligence interesting international interpretation interpreting joslyn kane karypis kaufman khan kirov knowledgebase lababidi lipman local machine memetic miller mining mniszewski modeling myers narasimhan national ontologies ontology overrepresented package part partition platform proc proceedings prot protein proteomic references reinhold representative resource riss rousseeuw saccharomyces schmidt schmoyer science search sets situ snoddy software space speed speer spieth statistically sunshine supported swiss swissprot symposium taiwan term theoretic tool under uniprot using wang weinstein wiley within work workshops wwwusers yeastgenome york zeeberg zell zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.78 140 Hot Item Mining and Summarization from Multiple Auction Web Sites Tak-Lam Wong and Wai Lam Department of Systems 
Engineering and Engineering Management The Chinese University of Hong Kong, Shatin, Hong Kong wongtl,wlam @se.cuhk.edu.hk according adapting adaptive advances agentlink agents agichtein annotated aone appear applied approach attributes auction auctions automatic bidding blum cambridge camera chawla chose ciated collected collections conducted conference constitute containing contains context core coverage customer data demonstrate depicts described determine digital discovering discovery domain domains each ebay effectiveness eighteenth eleventh ends engineering europe evaluate evaluated evaluation examples experimental experiments extensive extract extracting extraction feature features fifth find framework from generated ghani gold graph gravano hour icde icdm icml ieee illustrates importance important information insurance intelligents international item items journal kernel knowledge kushmerick labeled large learning leverage libraries listed machine mani manually maybury methods mincuts mined mining model namely niblack online over page pages performance period perspective plain player prediction press price probabilistic proceedings produce product randomly realworld references relation relations remaining reported research respectively results reviews richardella satisfactory section sentiment served sets shows sigkdd sites snowball some standard summarization summarizing summary table technologies tenth testing text that their then these thomas three total train trained training ubid unlabeled useful using valuable value values very webfountain were with within wong wrappers yahoo zelenko http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.58 12 eMailSift: Email Classification Based on Structure and Content Manu Aery and Sharma Chakravarthy IT Laboratory and CSE Department The University of Texas at Arlington {aery, sharma} @ cse.uta.edu aaai ability able absence access accuracy adaptation adapting adaptive additional aery affect agent agents algorithm 
allow also analysis applicable application applied approach approaches artificial assistant associations australasian australia automatic based been believe boone boston both bunke challenge characteristics clark classification classifier classify coffs cohen company comparable compared complexity computing concept conclusions conference consistent cook crawford data desiderata detailed details direction discover discovery document documents during edwards effectiveness efficiency email emailsift enable enquiry established even exhibits experimental exploring extend factors feature features fifth figure filtering folder folders frequent future general good graph grouping gspan harbour have headers heflman holder human icdm identification ieee ifile immediate important improving increase induction information infosift innovative instances intelligence intelligent intend interface interfaces international investigating investigation isbell ishmail issues kandel karypis kephart kuramochi labs large larger last learn learning leeway machine mail management master mccreath means mining model niblett organizing over overload page pages parameters pattern payne performance performs personal please premise presented proc proceedings process proposed publishing recognition reduced refer references rennie results returned rissanen rules same schenker segal seventh sidner sixth size sized some spring statistical stochastic subgraph substructure summary swiftfile symposium systems technique techniques text that thesis this traditional training traits user using validate variety various viability well where while whittaker with work workshop world http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.18 17 Adaptive Product Normalization: Using Online Learning for Record Linkage in Comparison Shopping aaai access accuracy active adaptive advances agent agents algorithm algorithms alignment also ambiguous american application applied approximately artificial association 
attributemediated authors autonomous baeza bartlett batch baxter bayesian bhamidipaty bhattacharya bilenko blocking bollacker boosting bureau cambridge census chaudhuri christen churches citation classification cleaning cluster clustering cohen collins comments comparison computation computing conditional conference consolidation construction coreference cost crammer cristianini current data database databases decision deduplication dependences design detecting detection dimensional discovery discriminative discussions division dmkd domain domingos done doorenbos duplicate efficient elfeky elisseeff elkan emnlp etzioni evaluation experiments fast fellegi fifth first florida flynn foundations freund froogle fuzzy ganjam ganti getoor ghaoui ghosh giles google gusfield hardening hernandez hidden hierarchical high icdm icml iden identification identity ieee impact independent information infrastructure integration intelligence interactive international issues iterative jain jaro jordan journal kandola kautz kernel keshet knoblock knowledge lanckriet language large lawrence learnable learning like linkage machine manning margin markov marthi match matching matrix mcallester mccallum measures merge methodology methods metrics milch mining minton model models modern monge mooney morie motwani moustakides murty naacl names natural neto nigam nips noun object olson online optimal page parag parallel pasula perceptron pkdd press problem problems proc proceedings processing programming providing pseudo purge reading record records reference references report research retrieval review ribeiro richman robust roth russell sarawagi scalable schapire schutze search section semidefinite sequences sets shalev shawe shopping shpitser shwartz sigkdd sigmod similarity singer soft sources state statistical stolfo strehl string strings sunter survey surveys tampa target taylor team technical tejada thank their theory this tity tracing training transformation trees uncertainty ungar 
university using verykios vldb washington weight weights weld wellner were when wide winkler with work workshop world would yates http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.146 103 Triple Jump Acceleration for the EM Algorithm accelerating adaptive algorithm algorithms also analysis applied artificial autoclass bauer bayesian bound burden cheeseman classification cluster conclusion conference convergence converges cooper data dempster effective empirical estimation experimental faires faster fifth framework freeman from future ghahramani herskovits icdm ieee incomplete induction intelligence international journal jump kelly kent koller laird learning leslie likelihood luis machine maximum method methods mining models networks numerical ones optimization other overrelaxed pages paper parameter particularly performing presents previous probabilistic proc proceedings references results roweis royal rubin rules salakhutdinov self several show singer society statistical study stutz surface system taylor than that this triple uncertainty update work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.15 30 1 aaai abstract adaptive addison afopt agrawal algorithm algorithms almaden application applied approach approximation apriori arimura asai associainforcenter association barbara based bastide bayardo been beeri behavior bell better between bocca boolean borders borgelt boulicaut buneman burdick bykowski calders california calimlim candidate casali casanova cercone ceur ceurws changing characteristics charm chile cicchetti classification closed closet code computer computing concise condensed conference copenhagen counting cover curves data databases dataset datasets dawak december denmark dependencies design diego dimentional discovered discovering discovery disjunction distribution distributions during dynamically eclat edition editors efficient efficiently enumerating essential experimental experiments explorations expstudydatasets extended fagin fast 
feasible fifth fimi find first flannick florida flouvat former foundation france free frequency frequent functional gajek gehrke generalized generation generators germany goethals gouda grahne growth hajj have heidelberg helsinki high hsiao http iane icde icdm ieee imielinski impact implementation implementations important inclusion inference information intelligent interaction international isima issues item items itemset itemsets jajodia jarke jose journal june kantola kaufmann kdci kiyomi knowledge koeller kryszkiewicz lakhal large latter lattices lecture levelwise limos lucchese mafia main maniatty mannila marchi mation maximal melbourne minimal mining morgan most multi multiple negative notes november obtained orlando pages pakdd palmerini papadimitriou papers pasquier pattern patterns perego perfect performance performances perspectives petit pkdd pods portland positive poster prefix press procedure proceedings proof properties proposed queries quest raih ramesh references relational report repository representation representations research resources respect rigotti rules rundensteiner runtime santa santiago science sciences scrutinizing search second sequential session sets shown siam sigkdd sigmod siirtola silvestri society software space springer srikant stability statistical step strategy study stumme swami symposium synthetic system systems taipei taiwan taouil technical that their theoretical theories theory this tions toivonen toward transactional trees uchida understanding uses using vardi vldb volume washington wesley with without work workshop xiao york zaki zaniolo zigzag http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.49 149 CTC - Correlating Tree Patterns for Classification Albrecht Zimmermann Björn Bringmann Machine Learning Lab, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee 79, 79110 Freiburg, Germany E-mail: {azimmerm,bbringma}@informatik.uni-freiburg.de aaai about accurate acknowledgments actually adaptivitat aggarwal 
agrawal algorithm alleviates allows also ambiguous andreas appear applications applying approach approaches arbitrary arikawa artificial association australia australian base based basing being best better bringmann build cairns california cercone city class classification classifier classifiers cmar comments complex complexity comprehensible computer computing conference confidence consider consisting construction corclass correlated correlation criterion cslog ctcavgstr ctcdl ctcmv ctcw dallas data databases datasets decision definitions different direction discovery discussions domingos done editors effective efficient entailment error evaluate evaluated evaluating evaluation existing experiments fachgruppe faloutsos feature fifth figure finally forward founded four framework frank frequent from furthermore future geamsakul generation getoor give gives giving graphbased grieser guidance hall have having helma helsinki heuristically heuristics highest icdm ieee include inclusion incomprehensible induced induction integrating intelligence interesting international inverse italy itemset japan jose karwath kersting kilpelai kramer kristian large lattices learning lernen less like logical machine main maschinelles matching matsuda maximal meaning measure measures metric mining model models mohammed molecular morishita motoda muggleton multiple mutter namely offers opportunity optimal other ourselves output padova pages parameter part patterns performance piatetsky pkdd pods possible possibly prediction press problem problems proc proceedings process progol promising providing provost pruning quinlan raedt ranking rates rather references relations representation representations research restricted restricting results rule rules sapporo science search seems selecting senator separate sese sets setting shapiro show single size smiles society somewhat springer srikant statistical statistically still stolorz straight strategies structural structured subset success supply 
support suzuki table tanaka texas text than thank that them thesis they this thresholds traversing tree trees tuning uninformative university unlike user users uses using validation very washington washio webb well while whose will wissensentdeckung with work workshop would xrules yamamoto york yoshida zaki zimmermann http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.88 145 1 aaai accord accuracy achieves advantage algorithm algorithms among appl assessing assessors aulc average bagc bagful bagfullc bagged baggedc based beat been being benefits best better blake bradford brodley brunk building calibration campaign carried cfts class comparison complete conclusions conference considerations consistent consistently correcting correction costly costs cross curves data database decision decomposition deficiencies degroot depth distribution diversity domingos done each ecml effect efficiency empirical ensembles entropy error estimating estimation exception exhibits fawcett fienberg fifth figure focus following from full fullc future graphs half have hypotheses icdm icml ieee improve increases increment induction international iscer isomorphic jair kaufmann kohavi kunz laplace larger largest leads learning left lift ling machine masand maximizing merz metrics mining misclassification model modeling modification morgan murphy nearly nemife nine noitarbil noitulos notes optimality other outperform overall pakdd paper partition percentage performance performs pets plot points poorest practical precision probability proceedings programs proposed provost pruned pruning quinlan quite random rank ranking rankings rate rdtf rdth reach references refinement related repository researchers resolution respect responding right score section seven several shapiro single size some ssor statistica still summary superior supplement than theory these this through ting topics total training tree trees understood unpruned using value varying vector viewed wang weiss what when which will 
with work yportn http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.126 56 Ranking-Based Evaluation of Regression Models Saharon Rosset, Claudia Perlich, Bianca Zadrozny IBM T. J. Watson Research Center P. O. Box 218 Yorktown Heights, NY 10598 {srosset, reisz, zadrozny}@us.ibm.com about academic across advantages algorithms analysis approach appropriate area arnold based bennett between bootstrap both bradley case characteristic classification comparative conference confidence consistently contributions correlation curve curves customer cutoff cutoffs data definition detection differences different edward egan elements error evaluation fifth figure financial function functions further garland gibbons graph hampel icdm icml ieee include influence insights international intervals journal kendall learning local machine march marketing measures methods mining model models more nature noether nonparametric often other outliers outperforming outperforms pattern percent performance points position presentation presents press proceedings profitability property proposed provide range rank ranking recognition references regression residual robust robustness role ronchetti rousseeuw services share shown shows signal sons springer stahel statistical statistics study suggested tasks terms than that theory these this traditional under vapnik visualization wallet wiley within work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.75 113 Hierarchical Density-Based Clustering of Uncertain Data according after already ankerstm arethecorresponding assign assume been breunigm cessed clustering compute conference containing data database definition diby disn distance distfuzzyd each easily enough entries fifth figure following foptics fopticsalgorithm fopticsrun function functions furthermore fuzzy given have icdm identify ieee instance instances international inthismergingprocess kriegelh list listl lists materializing merge mining note object objects only optics order ordering 
other pfor points predecessor proceedings processed rate reach reachability reachabilityvalue reachpi reasoning references reflects representing requires sample sanderj seedlist sigmod stored structure tance that then throughout used value weassumethat wecreatethefinalobject where which without http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.118 116 Partial Elastic Matching of Time Series Longin Jan Latecki Vasileios Megalooikonomou Qiang Wang Rolf Lakaemper Computer and Information Science Dept., Temple University Philadelphia, PA 19094, USA { latecki, vasilis, lakamper, qwang }@temple.edu C. A. Ratanamahatana E. Keogh Computer Science and Engineering Dept., University of California Riverside, CA 92521 { ratana, eamonn }@cs.ucr.edu aaai aach about acoustics algorithm algorithms aligning automatically based behavior berndt best bioinformatics called cheapest chiba chiu church clifford computation compute computes conclusions conf conference corresponding data databases deepening demonstrating determines dimensional discovering discovery dissimilarities distance dynamic efficient elastic elements engineering european everything experimental expression faloutsos fifth find finding following free freiburg future gene gunopoulos hadjieleftheriou hart hoppner icde icdm ieee incremental indexing integrating international iterative jagadish jose keogh know knowledge kollios lakaemper latecki lcss learning lonardi mannila mapping matches matching measures megalooikonomou method minimizes mining motifs multi multidimensional multiple multiresolution obtain only optimal optimization outliers outperforms pages parameter part partial path patterns pazzani performs pkdd practice present presented principles probabilistic problem proc proceedings process processing programming proposed provide qualitative queries query rafiei ratanamahatana recognition references representation results retrieval rules sakoe scale seattle sequence sequences sequential series siam sigkdd 
signal similar similarity simultaneously skips solution speech spoken statistical support sydney symbolic symposium target tasks temporal that this time timie tokyo towards trajectories trans translation under update using values variance vlachos wang warping washington whether whole with word work wrong http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.20 31 AMIOT: Induced Ordered Tree Mining in Tree-structured Databases acknowledgment adopts advances agrawal aided algorithm algorithms alphaworks amiot analysis application applications applying approach apriori asia association avoid based bases basically because being boosting breadth candidate canonical chemistry chopper class classification comparison comparisons complemental computer conference consumption cost culture data database databases dataset dbsj depth derive devised discovery education effective efficiency efficient efficiently embedded emnlp empirical enumerates enumeration european evaluate expensive experiment experimental extension fast faster fifth first forest form free freqt frequent from further generates generation generator grant graph graphs growth hand have hido high higher hong http hybridtreeminer icdm ieee improve indicates infrequent inokuchi institute international intro issues japan japanese join joining kawano knowledge krishnamoorthy kudo language languages large leaf left letters liacs linear management matsumoto memory method methods mgts mining ministry more motoda much muntz mutagenesis natural need network never nijssen node number only ordered other pacific pages pakdd part pattern patterns performance performs pkdd polytechnic principles probability problem proceedings processing proposal propose punin real reduces reduction redundant references rensselaer report requires research respect results right rooted rules same scalability scan scans science scientific semi sequences serial show sigkdd sigmod similar since size sports srikant ssdbm statistical structure 
structured studies substructures subtree such sufficient supported synthetic tech technical technology tern text than that this time toxgene tree treeminer trees unordered usage using very vldb wang washio which with without workshop xspanner yang zaki zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.141 61 SVM Feature Selection for Classification of SPECT Images of Alzheimer's Disease using Spatial Information able academic advances advantage aging alder algorithm also always alzheimer amin among analysis approach arbitrary areas article artificial asada ashburner assessment assisted august auspices automatic available avenue ayache bartlett based best better boston braak bradley brain california cambridge cerebellar classification classifier classifiers clinical clinician clinicians coakley coffey cognitive colchester college communications compare computation computational computed computer computerassisted computing concave conf conference considered contiguous contributions control cooney correlation cplex criteria darcourt data datapoints dawson december decision delp demented dementias department diagnosing diagnosis diagnostic dierckx differences different differential direction discriminative disease division dobbs does dolan dougall drachman early ebmeier edition editor editors elderly emission especially european evaluation even evgeniou experts extend feature features february fifteenth fifth findings fisher flannery focus folstein force formulation frackowiak francisco freely freyne friston frith from function further future gives global goethals gool gosche grading group gunnar hamilton handle hare have health hippocampal hmpao holmes hooper however http human icdm icml identify ieee ilog image images imaging impact important incline index information instead institute interesting international intervention issue jacques journal july june katzman kaufmann kernel klaus knowledge kogure kopf koulibaly kunihiro large learning least lecture 
letters level lncs local london longitudinal look machine machines mahony malandain mangasarian mapping margin markesbery math mathematical mathematics matsuda mcewan mchugh mckhann means medical medicine mental method methods metric miccai might migneco mika mild mildly minimental minimization mining more morgan mortimer multimodal murphy nakano nature need netherlands network networks neural neurobiology neuroimage neurology neuropathologic neuropathology nevada niessen nincdsadrda nips nobili norm normal normalizing notes november nuclear numerical observers obtain october ohnishi only operating operations optimizer other others ourselin outperform outperforms pages parametric pathology patients pennec performances perfusion photon plane poggio points poline pontil possible potentially practical practice presented press price prima probable proceedings prog programming propose proposed provide providing psychiatric ratio recipes references registration regularization related relation report reports represents research retaining robert roche routine royen sagittal scheidhauer schol schuurmann science scientific sebastian selection sensitive separating september services shavlik should show shown significantly similarity single sitive slosman smith smola snowdon some soonawala specific spect spet springer stadlan state statistical steele still stoeckel study subject subjects sufficient support takasaki task tech teukolsky teunisse that their theory therefore this tomography tracer trained transaction triscott uller under university uptake useful using utrecht values vapnik vector versus vetterling viergever village volume voxel walsh walstra want weinstein welcome well wells where which while wiele will wisc with without work would york zant http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.86 59 Learning Functional Dependency Networks based on Genetic Programming acknowledgment alex algorithm algorithms analysis approach artificial aspects bacchus based 
bayesian belief blake bnpc cheng chickering chrisman ciobanu cognitive computation computational computer conference congress course cuai data databases david decomposable development directed discovery earmarked editor edwards efficient emphasizing evolutionary evolving extinction fahiem fifth fogel freitas friedman gaussian grammars grant greewood hettich hierarchical hong http hybrid icdm ieee incomplete induction intelligence interaction international jcheng knowledge kong kwong laskey learning leung levitt logic machine maxwell merz method microsoft mining models myers myllymaki nachman network networks other pages partially pattern powerconstructor principle probabilistic procces proceeding proceedings program programming psychology redmond references refinement report repository research roadmap satist school science search service siegler silander springer stochastic supported technical this three tirri toolkit tools transactions ualberta uncertainty uronen using winmine with wong work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.26 138 Anomaly Intrusion Detection using Multi-Objective Genetic Fuzzy System and Agent-based Evolutionary Computation Framework Chi-Ho Tsang1 Sam Kwong2 Hanli Wang1 Department of Computer Science, City University of Hong Kong 83 Tat Chee Avenue, Kowloon, Hong Kong acceptably accuracy accurate achieve achieves addition agarwal agent algorithm algorithms alleviate also anomaly application applies attacks base based baseline bases boston both case classes classification classifier classifiers classifying compared competitive complexity computation conclusions conf conference considering contradictory data demonstrate desirable detecting detection discovery distributed effect electronics elitist elkan evaluated evolutionary evolve experimental extracted extraction fact fast feature fifth first fitting fprs framework from further fuzzy genetic gfrbs hierarchical high icdm ieee indicating industrial international 
interpretability interpretable intrusion ishibuchi joshi knowledge known kwong large learned learning literature major meanwhile measure memberships meyarivan minimal mining models mogfids multiobjective nakashima network normal novel nozaki nsga obtain often optimization other outperforms over overall pattern performance pnrule pratap precision probe problem proc proceedings propose rare rate rates recall recognition recognizing reduce reduced references regarding relative relatively representation results robust rule rules samples search sets siam sigkdd small study subset such systems tanaka tang terms test that thus traffic training trans unseen using wang weights when winner with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.120 120 Predicting Software Escalations with Maximum ROI Charles X. Ling1, Shengli Sheng1, Tilmann Bruckhaus2, Nazim H. Madhavji1 aaai about above accurately actual actually addition advances after agarwal algorithm already also among amount applying appropriate artificial assigned available bagging based basili beats becomes been before beohm berry between boehm boosting breiman bruckhaus business calibrated california canadian candidates cannot caruana case catches chart chulani citing class classes classifying compared computational computer computing concept conclusions conference considered continue corrected corresponding cost costs costsensitive currently customer customers data datamining datasets date decision defect defects deployed diagonal direct discovery drummond during economics editor editors effectiveness elkan engineering enormous enterprise escalated escalation escalations evaluated evaluation evidence example expedited experiments fayyad feedback fields fifth final found foundations fourteenth fourth francisco frank freund from future good greatest group hall happen happened have haystack high higher holte icdm icml ieee ijcai imbalance imbalanced imbalances implementations improve indeed induction instance 
intelligence international introduction japkowicz java john joint joshi kaufmann knowledge known kumar langford lead learning lift likely line ling linoff list lower machine made madhavji maintenance make makes management many marketing maximum minimal mining mizil modeling models more morgan most must needles niculescu obtained obtaining order other over paper percentile performance period phase piatetsky plan positive practical practice predict predicted predicting prediction predictions predictive predictors preliminary prentice preparing presence press preventing priority proactively probabilities problems proceeding proceedings product programs proportionate proposed provide provided quarters quinlan quite random rare recent records reduction references release released removed report reports resolution resolve results risk risks rule sales sampling save schapire science sensitive sensitivity series shapiro sheng sigmod significant smyth society software solution solutions some sons specialist specific stats step studies submitted successfully such support symposium system techniques technologies technology test tested than that these they this thorough three time tools track trees under until updated upgraded used useful using uthurusamy value vendor vendors wait wang weeks weighting well were which wiley will with within witten work workshop would yang zadrozny zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.110 131 On the Tractability of Rule Discovery from Distributed Data Martin Scholz Artificial Intelligence Group Department of Computer Science University of Dortmund, Germany scholz@ls8.cs.uni-dortmund.de aaai acknowledgements advances agrawal algorithms allows applied approximating asia assistant association attribute based bases been behave behaviour best better between boosting both bound bounded case centre chapter cheung collaborative common completely complexity conclusion conditional conf conference contains covering data database 
databases deutsche difference different direction discovering discovery disjoint distributed distribution dortmund effect estimated estimates evaluate even example explora fast fifth finding first flach forschungsgemeinschaft from furnkranz future general generating geometry germany given global hard have higher homogeneously however icdm ieee indicate interesting international into investigated investigation isometrics journal klosgen knowledge lacking large lazarevic learning least literature local locally machine marginal measures metrics mining most multipattern multistrategy multivariate none obradovic ones other pacific pages paper parallel patterns precise press problem proc proceedings proven quickly real recent reduction references report requires research results rule rules rulesets same samples sampling scheffer scholz scope selection sequential share shown similarity since single sites skewness skews solution solved space srikant step structures subgroup supported synthetic target task technical than that their then these they this through tight towards translate uncertainty underlying understanding university using utilities utility various very well what which will with work world wrobel xiao years yield http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.122 94 Privacy-Preserving Frequent Pattern Mining Across Private Databases about across adopted agrawal association average bandwidths because bernstein change chiu clifton conference data database databases default device distributed does environments equal evfimievski execution fifth frequent from generated gilburd graphs hall http icdm ieee information international itemsets joins journal large list local ments message mining model more need noise ozsu pages parameters partitioned passing percentage prentice preserving principles privacy private proceedings queries references relational rule scale schuster semi sharing sigkdd sigmod smaller solve srikant stars systems than that there time using 
vaidya valduriez vertically wang wiki wikipedia with wolff http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.24 92 An Improved Categorization of Classifier's Sensitivity on Sample Selection Bias Wei Fan1 Ian Davidson2 Bianca Zadrozny1 Philip S. Yu1 analysis argue available being bianca bias cary chance class classifier classifiers common conclusion conference could datasets depends detailed directly earlier evaluating example examples experimental feature formally global including insensitive institute international into label learner learners learning local logistic longer machine made many most others paper proceedings references regression related report request results review sample selected selection sensitive some sumptions technical that theoretical this training tutorial type under upon vector verifying version where works ying zadrozny http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.139 66 Supervised Tensor Learning Dacheng Tao1, Xuelong Li1, Weiming Hu2, Stephen Maybank1, and Xindong Wu3 accept adaptive algebra also analysis analyzed annals anthony approach attention automatic based because been bellman better birkbeck boyd called cambridge challenge chang chebyshev city class classification coding cognitive college computational conclusion conference content control convergence converges convex curse cvpr data developed different dimension directly discriminant does duda each effectively efficiently eigenfaces error experimental experiments extracting face feature features fifth figure following framework from gait gelade generalize grail guided have holy however human icdm ieee image images indexing inequalities information initial input inputs insensitive integrated integration international into iterations itti jmlr kernel lanckriet landscapes lathauwer learn learning leuven levin libraries linear linguistic local london machine machines magazine makes marshall matching mathematical matrix meaningfully media method minimax minimum mining model 
modeling multilinear multimedia multivariate nature noitis novel number olkin optimization optimize order pami paper parameter parameters pattern pentland performance picture pictures press princeton principle probability problem proc procedure proceedings processes processing programming projections properties proposed proved psychology rank rapid rate recognition reduce reduces references regression representation robust rounds saliency samples sarkar scene scheme schemes semantics semidefinite sensitive series sets shashua signal simplicity slump springer stability stable statistical statistics studied supervised tech tensor tensors testing than that theoretically theory thesis they this tmpm tour training trans treisman true turk under univ universiteit university using vailaya values vandenberghe vapnik vector verlag visual wang which wiley with within york http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.7 49 A Heterogeneous Field Matching Method for Record Linkage Steven N. Minton and Claude Nanjo Fetch Technologies 2041 Rosecrans Ave., Suite 245 El Segundo, CA 90245 {sminton, cnanjo}@fetch.com Craig A. 
Knoblock, Martin Michalowski, and Matthew Michelson University of Southern California Information Sciences Institute, 4676 Admiralty Way Marina del Rey, CA 90292 USA {knoblock, martinm, michelso}@isi.edu able above accuracy acknowledgements active adaptive address advances alberta algorithms also application applications australasian authors award based bayesian bhamidipaty bilenko cambridge canada christen churches cleaning cluster cohen comparison complex conclusions conditional conference connected consolidation contained context contract coreference data databases deduplication detection dimensional domain domains duplicate edmonton either elkan endorsements experimentally expressed feinberg field fifth force foundation grant graphical herein hernandez hierarchical high huang huge icdm iden identification ieee ijcai implied important independent information integration interactive international interpreted into joachims kernel knoblock large learnable learning linkage machine machines making match matching matures mccallum measures merge methods metrics mining minton model models monge mooney more name names national necessarily neural nips noun number object office official organizations pages part perform person policies practical press prob probabilistic problem proceedings processing purge ravikumar real record records references representative representing research richman russell sarawagi scale science scientific sets should shown sigkdd sigmod similarity sizable standardization states stolfo string support supported systems task tejada them this those tity transformation uncertainty under united upon using varied vector views washington weights well wellner when where which with work workshop world http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.73 111 Gradual Model Generator for Single-pass Clustering academic algorithm analysis apparently applications archive based basic between bhattacharyya birch boolean boston bradley bull calcutta california 
chakrabati close cluster clustering compact components computation computer conclusions conf conference converts data databases defined dempster department desired different dimensional dimensionality discovery distributions divergence effect efficiently engineering exceeds extensions fast fastest fayyad features fifth find first frem from fukunaga gaussian gaussians generate hettich http huang icdm idea ieee incomplete information input international into introduction irvine ishii john journal knowledge krishnan laird large leung likelihood line livny lots management mars math maximum mclachlan measure mehrotra method might mining mixture model models more multivariate network neural normalized number omiecinski online only order ordonez original ortega paper pattern performance points populations porkaew post present press proc proceedings processed produce proposed queries quite ramakrishnan ranked reason recognition references reina result robust royal rubin running runs sato scalable science second select selected series sets should shows similar similarity size small smaller society sons stage statistical subsets supporting table that their then these this time times total transactions turn twice university usage using very when which wiley with wong york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.39 110 Categorization and Keyword Identification of Unlabeled Documents Ning Kang Carlotta Domeniconi Daniel Barbará ISE Department George Mason University aaai absence accuracy achieved acknowledgements across actually adaptive addition additional against agarwal aggarwal agrawal algorithm algorithms analysis applications approach approximate automatic award barbara based capable career carlo case castelli categories categorization category certain chakrabarti characteristic chip cikm circuit class classification classifying cluster clustering collections combination common computed conference consequence construction content continuous
corresponding cross data databases dataset decomposition demonstrate dependent derived detector device different dimensional dimensionality discarding discriminant dispersion documents domeniconi dumais each effect electronic electronicsmedical emails equipment evidence expected experimental fact fast feasibility feature fifth figure frequency frequent further gehrke general given global ground group gunopulos handle hardware have high icdm identification ieee indeed indexing information international interpreted inversely itemset jobs jones kang keogh keywords labels landauer language large larger largest latent letsche likewise littman local locally major management measure measured mechanism medical mehrotra messages method mining money monte moreover most multi murali nature newsgroup newsgroups normal obtain ones other papadopoulos park part pazzani people presence problem proceedings procopiuc projected projective proportional provides raghavan range receiving reduction references reflect reflecting relative relevance relevant representative result results retains retrieval same screen selected selection semantic series shared shown shows siam sifting sigmod signal since singular spaces spam speech spread spring spurious subclasses subspace successfully such supported symposium technique term terms text than that their therefore these this thomasian time tion topic trend tributions truth typical underlying union using value values variability very vldb weight weighting weights whereas which while wider wire within without wolf word words http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.28 71 Atomic Wedgie: Efficient Query Filtering for Streaming Time Series acknowledge acknowledgments algorithm america among annals annual aphid aphidae appendix approach arriving assistance associated atomic atrial audio automated bangkok based brute cares carson changing chicago chooses choosing circuits classic cole coler commonality compared comparing computing concept 
conclusions conference considerations contains continuously currently data dataset dictionary does drift dynamically each empirical engineering entomological errors every exploits fast fibrillation fifth figure filtering fisher force future given gottlieb graphed gratefully gynecol homoptera http icdm identification ieee illinois improvements index indexing insect international into introduce june knowledge large leave lewenstein management matching merging miller mining moon moore more moritor novel number numbers obstet optically oral over paper pattern patterns performance pest pheromone predefined pregnancy problem proceedings processing propose provides references reginald resulting scale scorza sensed series shared similar skyline society sorted sorting steps stock strategy streaming subsequence such support symposium systems table technical technologies terbutaline test thailand that theory they this time together transactions tremendous unsorted useful waveforms wedge wedgie well which will wingbeat wish with work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.93 121 Mining Approximate Frequent Itemsets from Noisy Data aaai according advances agrawal algorithm algorithmic analysis anti approximate apriori association australia australian becquet between biogeography bioinformatics biol blachon blackwell both boulicaut bradley breadth cannot case challenges chapel chapter classical cluster column common comparative compuater computational conclusion conference considerations constraints creates creighton criteria currently data databases defined dense department depth derive development dimensions discove discovery distribution done each editors efficient enjoyed ensures envelopes error exact existing exploring expression fast fault fayyad fifth figure first fishes focus found fraction frequent freshwater from future gandrillon gene generates generation genome grant hanash hard have high hill however humamining icdm ieee imielinski integrative 
intelligence international investigation issues items itemset itemsets jeudy journal knowledge kumar large main make mannila method mining model models monotone more nobel noise noisy pages paper partially pattern patterns paulsen piatetsky places present prins problems proceedings projectionbased property provinces proximate pruning reasonable references relatively remain report representing research resource rule rules sage scale science second seppanen sequential sets several shading shapiro sigkdd sigmod smyth srikant steinbach strong structure studies study substantial support supported swami technical technique than these this through toivonen tolerance tolerant traditional tung under unmack unsolved useful uthurusamy verkamo volume wang will with work workshop yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.135 37 Stability of Feature Selection Algorithms Alexandros Kalousis, Julien Prados, Melanie Hilario University of Geneva, Computer Science Department Rue General Dufour 24, 1211 Geneva 4, Switzerland {kalousis, prados, hilario}@cui.unige.ch accidents aggregating algorithm algorithms allard also among analogue analysis application applications appropriate ardekani artificial attributes bajcsy barnhill based believe better bias bienenstock bigger biomarker buck cancer carrette central cerebral classification clear combination coming computation conference confidence consistent continuous coupled created data databases datasets decomposition defined detection determine diagnosis differences different dilemma discovery discretization diverse domain domingos doursat duda editor elegantly emmert empirical empirically ensemble error estimate estimation estimations european examination examine examined exploited extraction fayyad feature features fifth finally fishman framework frank from fusaro future gained geman gene given goal graphs guyon hackett hart highly hilario hitt icdm identify ieee implementations important includes increases insights 
intelligence interesting international interval investigate irani java john joint journal kalousis kaufmann knots knowledge known kohn kononenko lancet langley large learning levine like linehan liotta machine machines marston mass measure measures method mills mining model models more morgan most multi networks neural note notion number order ornstein ovarian overlap pages pattern patterns paweletz perturbations petricoin picture possibility practical practice prados preferences preprocessing principles problems proceedings produce produced profiles proposed prostate proteomic proteomics provide quantification ranked real references refining reflect relevant relief relieff resampling results rexhepaj robnik robustnes rrelieff sanchez scene select selected selection sensitivity serum sets seventeenth sikonja simone some sons spectra stability steinberg stork submitted subsamples support svmrfe table technical technique techniques that their them theoretical these they three tools training traning trucco ture turney unified used users using vapnik variance vector velassco viewed visualized well weston what where wiegand willey with witten wood work world would http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.31 119 Average Number of Frequent (Closed) Patterns in Bernouilli and Markovian Databases Loïck LHOTE, François RIOULT, Arnaud SOULET GREYC, CNRS - UMR 6072, Université de Caen Basse-Normandie F-14032 Caen cedex, France {forename.surname}@info.unicaen.fr about admits advances agrawal algorithm algorithms almost also analysis another applied applies apprentissage apriori article artificial association average averagecase bases because behaviour bernoulli border boston bound bussche caen candidate case chile classical close closed commonly comparing complete complexity computer computing concept conceptual concerning conclusion confer conference confirm connect correlated correlations corresponds currently data database databases dataset devices diagnosis
discovering discovery does electro ence enough equal evaluate experimental experiments exponential fast fault field fifth first fixed follow formal frequency frequent from fruitful geerts generates generating give given goethals groth growth gucht gunopulos hand have hypergraph icdm icdt ieee information intelligence interested interesting international items journal kaufman khardon knowledge kuznetsov large largest last lattice lattices learning lhote life machine management mannila many markovian mathematic mechanical mephu mining model models morgan most need negative nguifo number obiedkov open order other pages particular pattern patterns performance performed pods polynomial positive press priori probabilistic problem problems proceedings progress proportional proved pumsb purdom quantitative quite random randomized real redundant refer references reflect relational remark report respect result results rioult rules saluja same sampling santiago sentences siam sigkdd sigmod since size soulet specialists specific srikant steps studied such synthetic systems tables technical testbeds that theorem theoretical these this three threshold tight time toivonen transactions transversals uncorrelated unexpected universite upper used verkamo very vldb well were where which wille with without work worst zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.111 11 Online Hierarchical Clustering in a Data Warehouse Environment Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger Institute for Computer Science, University of Munich {achtert,boehm,kriegel,kroegerp}@dbs.ifi.lmu.de academic algorithm ankerst approach arbitrary automatic barbara based berchtold block boosting breunig bubbles building chen cheng clink cluster clustering clusters complete computation computer concepts conference data dawak defays delta dimensional dynamic effective efficient evolution explorations extracting extraction fifth from ganti gehrke gotlibovich gravity hierarchical
hierarchy high hwang icde icdm identify ieee incremental index international ioerger journal kamber keim kovarsky kriegel link maintenance method metric mining nassar oger optics optimally ordering oyang pages pakdd performance points preserving press proc proceedings quality ramakrishnan references representations requirements sander sibson sigkdd sigmod single slink spaces speeding steams streams structure summerization techniques theory tree under updates vector vldb warehouse widyantoro zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.33 51 Balancing Exploration and Exploitation: A New Algorithm for Active Machine Learning Thomas Osugi, Deng Kun, and Stephen Scott Dept. of Computer Science 256 Avery Hall University of Nebraska Lincoln, NE 68588-0115 {tosugi,kdeng,sscott}@cse.unl.edu acoustics active adaptive algorithm algorithms also anal analysis annu annual application applications applied apte automatic baram behavior blake bowring brinker call campbell center choice classification classifiers cohn comb committee comput computing conf conference convergence cristianini data databases decision discovery diversity efficient error estimation exploration fifth figure foundation freund from goldgof gorin grant gupta hakkani hall harrold hopkins html http icdm idea ieee image incorporating intell intelligence international intl iyengar journal kalai keogh knowledge koller kramer labeled large learning less lindenbaum mach machine machines margin markovitch mccallum merz mining mitra mlearn mlrepository more multiple murthy national nearest neighbor nguyen noise number online opper optimal orthogonal pages pandey park part patt pattern pillar plankton points preclustering press probabilistic problems proc proceedings processing program query rahim random rate recognition recognize reduction references rehg remsen repository resampling research resources results riccardi rusakov sampling samson scheduling schohn segmentation selective seung shamir sigkdd 
signal simple sixth smeulders smola software sompolinsky speech stochastic support supported symp symposium test testing text theory through tishby tong toward trans types uniform unsupervised using vector vectors vempala volume with workshop yaniv york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.101 21 Modeling Multiple Time Series for Anomaly Detection Philip K. Chan and Matthew V. Mahoney Department of Computer Sciences Florida Institute of Technology Melbourne, FL 32901 pkc, mmahoney@cs.fit.edu analysis anomalies anonymous appear applied chan chris class classification classifier cohen comments comparison conf conference cost data detecting developed dimensional discovery distance distributions edbt effective efficient engineering fast fawcett ferrell fifth florida free gaussian gecko gunopulos hadjieleftheriou http icdm ieee imprecise indexing induction intelligence international intl johnson keogh knowledge kollios learning lindgren lonardi machine measures mining mixture models multi multiple nasa objects pages paper parameter performance phase povinelli proc proceedings project provided provost ratanamahatana reconstructed references results reviewers ripper rule rules salvador santuro schefele screen series shots shuttle sigkdd software spaces spatiotemporal states steve support tanner tech test thank their this time towards trans tsotras under used using valve visualization vlachos walter with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.116 123 Parallel algorithms for distance-based and density-based outliers academic after agrawal aistance algorithm algorithms almost analysis applications artificial assigns austin back based bases because between blake breuning california chapman cheung clustering communication compstat computation computational computer computes concurrency conf conference connections cutoff data database databases dataset datasets density department detection developing discovery distance distributed each 
effective efficient entire factor fifth first francisco from gives good hall hand hawkins heavy hodge html http hung icdm identification identifying ieee information intelligence international irvine iteration journal kauffmann kaufmann kdistance kluver knorr knowledge kriegel large learning linear local locally london machine management master mertz methodologies methods middleware mining mlearn mlrepository morgan multivariate near nearest neighborhoods neighbors next ninth observation once other outlier outliers pacheco pages parallel partial performance proc proceedings process processor programming proposed pruning publishers ramaswamy randomization rastogi reachability reaches receives references repository resides respective results review robust rocke rule runs ruoming same sander schwabacher science sends sets shim siam sigkdd sigmod simple skillicorn slave slaves spatial speedup statistics strategies survey task that their there this time tucakov university very vldb which with wolfgang woodruff work http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.100 139 Mining Quantitative Frequent Itemsets Using Adaptive Density-based Subspace Clustering Takashi Washio, Yuki Mitsunaga and Hiroshi Motoda The Institute for Scientific and Industrial Research, Osaka University 8-1, Mihogaoka, Ibaraki City, Osaka, 567-0047, Japan washio@ar.sanken.osaka-u.ac.jp acknowledgement agarwal aggarwal agrawal alexandre algorithm algorithms applications approach association august authors automatic based carlo cecilia center cheng choudhary clustering clusters comference computer computing conf conference connected construction cpdc data databases decision density densitybased dept dimensional discovering discovery distributed efficient electrical engineering entropy ester extensive fast fifth fourth gehrke goil grant gunopulos high highdimensional histograms hock html http icdm ieee information interesting interestingnessbased international interval isir japan jones jsps june 
kailing knowledge kriegel kroger large learning machine mafia management march merger mining mlearn mlrepository monte murali nagesh ninth noise northwestern november numeric numerical osaka pages paper parallel park partially proc proceedings procopiuc projected projective promotion quantitative raghavan redefining references relational report repository research rules sander scalable schism science scientific sequeira sets siam sigkdd sigmod society spatial srikant subspace support supported tables tech termier thank this through tkde transactions tree univ university very wang wish with wong write zaki zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.119 104 Partial Ensemble Classifiers Selection for Better Ranking Jin Huang Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 jhuang33, cling @csd.uwo.ca aaai achieves adaboost algorithm algorithms also analysis area average bagging based best boosting bradley breiman california caruana class classification classifier combining communications compared comparison computer conclusions conference conferenceon cost curve data dataset datasets dept different discovery distribution draw efficient empirical evaluation even experiments fawcett fifth first following francisco frank freund generalisation generally hand have higher html http icdm icml ieee implementations imprecise inferior information international irvine iyer java kauf kaufmann knowledge learning loses losses machine madison mann memory mining mizil mlearn mlrepository more morgan much multiple niculescu numbers only original other outperforms pages partition pattern pecs pecss perform performance practical predictors preferences press problems proceedings proceedingsof provost publishers reasoning recognition schapire sciences settings shows significantly simple singer stanfill statistics supervised than that third thirteenth ties till tools toward under university visualization waltz 
which wins with witten workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.81 133 Instability of Classifiers on Categorical Data algorithms american analysis analyzing application argue assessment association base behaviour blake boosting both breiman chapman classification classifier classifiers client conference congenital connectivity construction data databases decision diagnosis dimensional discover discovered disease duda edition effectiveness examples experiments fifth freund friedman generalization give hall hand hart have heart hettich high icdm ieee illustrated importantly insight instability instable international islands john journal learning line machine mathematical medical melnik merz method mining model more olshen pattern points potential proceedings references region regression repository rules schapire sons stability statistical stephenson stone stork that theoretic theory toronto trees used vapnik veasy warner well wiley with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.103 62 Neighborhood Formation and Anomaly Detection in Bipartite Graphs addresses aggarwal anatomy anomaly author bipartite brin computer conference confirm convergence data datasets detection easily effectiveness efficiency engine evaluate experiments fast fifth formation graph graphs highdimensional hypertextual icdm idea ieee implementation international interpreted isdn large lawrence main methods mining neighborhood networks outlier page pages paper partitioning problems proceedings properties proposed random real references relationships restarts results scalability scale search sergey several sigmod simplicity stock such systems that this trades walk well with http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.11 80 A Preference Model for Structured Supervised Learning Tasks Fabio Aiolli Dip. di Matematica Pura e Applicata, Università di Padova Via G.
Belzoni 7, 35131 Padova aiolli@math.unipd.it aiolli algorithms allows application approach classification codify comparing conference confirmed constraint cost crammer data dekel different experiments fifth framework functions furthermore gives have highlighted http icdm ieee important international into label learning line linear long loss manning methods mining model models multiclass naturally nips ordinal peled performed plug pranking preference preferences problem problems proceedings ranking references regression role roth same shell singer slpref sperduti supervised tasks them tool training unifying unipi used validity vised with zimak http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.43 130 CLUMP: A Scalable and Robust Framework for Structure Discovery about accumulation accuracy acknowledgments added agglomer agglomerative algorithm algorithms also andj austin average averaged banerjee based because best between birch both bradley bregman case certain chan change class classes clump cluster clustering clusterings clusteringusingevidence clusters combining compared components computer concept conference contains content controls correct cure data databases dataset datasets december decompositions details determining dhillon diff dimensional discover discovery distance divergences documents dubes each effectively efficient electrical enables engineering ensembles equal estimating every evident experiments fayyad fifth figure fortheseexperimentsweusedsubsetsofthe found framework fredanda from generated ghosh grant guha hall hastie helps here hierarchical high higher icdm icml icpr identify ieee increasing initial instead international invariance jain jmlr journal journalonmachinelearningresearch kmeans knowledge kunal lack lans large learning levels livny long lower machine made many means measure merugu method mining misc modha more most multiple negligible news newsgroup noise noisy number observations observe obtain obtained often omitted order original 
other over overlapping pages pair papers parameter parameters partitions pattern pendigits percentage plot points politics prentice presented proceedings prototypes provided punera ramakrishnan rastogi recognition reduced references refining remarkable report reported resilient result results reuse robust roughly royal runs salvador same scalable scored seen segmentation segments shim siam sigmod similar since size slightly society space sparse spherical spkmeans splitting statistic statistical strehl structure such summarized supported table talk technical terms texas text than that these this thisworkwas tibshirani times tools tough tried true unable university used using utexas value values very verydifferentintheircontent walther want which whichare while whiledocuments wise with zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.143 98 Text Classification with Evolving Label-sets about abstractions accuracy acknowledgements advance algorithm allan allocation also analysis appear approaches approximate associations author authors award away based basis benchmark better between bhattacharya blei bombay bridging called cannot captures carbonell categories categorization chakrabarti changes chasms chitrapura cikm class classes classification clustering collection compared concept concerned conclusions conference data dealing define details detect detecting detection developed different dirichlet discriminative distribution document documents drift dumais ecml entity event events evolving existing experiments explore features feed fellowship field fifth figure first follow followed from further future gabrilovich general generative godbole grouping handle handling hard harpale have helpful hiclass higher hofmann horvitz icdm icml ieee incoming incomplete india indicative information infosys integrate interactive international introduced involve joachims jordan journal klinkenberg known krishnapuram label labeling labels labelset latent lavrenko learning 
lewis like limited machine machines makes many meaningful methods mining model monitor more most multi multilabeled needs news newsfeeds newsjunkie nips noisy notion novelty observed occurrence online over papka pass pattern performed personalized pkdd popular possible presence presented probabilistic problem proc proceedings project prototypes providing quite ramakrishnan rates real references related relevant rely reported research rose runs sarawagi selection semantic sets setting sigir sigkdd similarities similarity single some space spawn stories story study summary supervision support supported systems tags tasks techniques technologies term text than that their them these thesis this threshold thresholds through time topic topicconditioned tracking triggering tuned understanding unsupervised user using vector well while with word words work workbenches working world would yang zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2005.94 79 Mining chains of relations Foto Afrati about abstract action algorithms algorithmscomplexity also analysis approach apriori area artificial association author authorities authority authors average based bayardo been borders called case ciently clare collection combinations committees communities complete complexity computation concepts conference conferences connections corresponding covered cubes data databases dataset datasets define dehaspe deletion described devices different discovery discrete dzeroski each edge editors effi empty example examples extraction fagin fantasy fictional fifth first focused forest formal frequent from general generalize genres gibson give given goethals graph graphs guha hand have http hypertext icdm ieee imdb implementations indexing indicating indicative inductive inferring intelligence intelligent interactive international intuition investigated investigaton itemset jacm karypis kleinberg knowledge kumar kuramochi lack languages layered least lester level levelwise like link logic 
logics mannila many match mathematical mathematics maximize meanings member members minimal mining models more most multi multidimensional multiple multirelational node novak number objective observed olap omit other overall papers pods portal potential precisely principles problem problems proceedings programcommittee programming programs raedt raghavan ranging ranking recent references relational relations requested requiring respectively results rules sarawagi sathe scalable schwarzenegger search searched second section select selecting sets sigmod single sivakumar size small space springer stoc structural structure structures subgraph subgraphs subset such taken tarjan task tends that theories these those three toivonen tomkins topic topics topology total trees used using values various very which wigderson williams wise with work workshop written yannakakis zaki