http://www.informatik.uni-trier.de/~ley/db/conf/icdm/icdm2006.html ICDM 2006 http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.11 40 Accelerating Newton Optimization for Log-Linear Models through Feature Redundancy accelerated accuracy achieved aggregation algorithm algorithms also applications approach argonne around available behind benefits benson berger between bfgs bound boyd broyden cambridge categorization chakrabarti citeseer class classification clusters collection complex computation computational compute conclusion conditional conference conferences constraint convex crfs cvxbook data databases document easy elements entropy equations even evolving exploiting exploring exponential extending factors faster feature fields filtering form friedman general godbole gradient graphical harpale hastie hauptmann have highdimensional however html http icdm icml ijcai ijcaiws implement inference information interactive interested international involved iterative knigam labeling labels laboratory lafferty language large larger learning lewis limited linear linguistics local loss machine math mathematics maxent maximum mccallum mechanical memory method methods metric minimization mining model models more murphy naacl national natural newton nigam nocedal nonlinear objective ongoing online optimization optimizer optimizers other papers parameters parsing pereira performance pietra pkdd press probabilistic proceedings processing programming proposal proposed quasi random references required reuters rewriting routines rsise sarawagi scale scaling schmidt schraudolph segmenting sequence shallow simple simultaneous sixth slightly solving spaces specific speed springer stanford state statistical step stochastic suffice supervision systematic tech term test text that there this through tibshirani training transformations understanding university upenn users using vandenberghe variable vector very vishwanathan vishy visschschmur well with work workshop wrapper zhang 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.150 138 Solution Path for Semi-Supervised Classification with Manifold Regularization Gang Wang Tao Chen Dit-Yan Yeung Frederick H. Lochovsky Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China advances angle belkin chicago clustering computer conference data department efron eigenmaps embedding entire examples framework from geometric hastie icdm icml information international johnstone journal laplacian learning least linear lochovsky machine machines manifold mining neural nips niyogi norm path paths piecewise proceedings processing references regression regularization regularized report research rosset science sindhwani sixth solution spectral stanford support systems technical techniques tibshirani twodimensional university vector wang yeung http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.80 6 Hierarchical Classification by Expected Utility Maximization Korinna Bade, Eyke Hullermeier, Andreas Nurnberger Otto-von-Guericke-University Magdeburg, D-39106 Magdeburg, Germany {kbade,nuernb}@iws.cs.uni-magdeburg.de huellerm@iti.cs.uni-magdeburg.de algorithms annales applications approach approaches artifical artificial auer available basis been behavior benchmark bianchi blythe borgelt brief catalogues categorization categorizing ceci cesa chang chen choi cjlin classes classification clustering collections colloquium computational computing concerns conf conference consider content continuation continue corne corresponds cost csie current data database dataset decide decision decisions design development direction directory dmoz document documents dumais dynamic each economic editors elkan entry essays essentially europ european evaluation every experiments extension fast finetti first foundations from frommholz frontiers fuhr fuzzy games gentile gldv gori granitzer have hence henri hierarchical hierarchy history hofmann 
hotho however html http iasted icdm icml ieee ijcai improving incremental independent inform information insitut intelligence international john kegan knowl language large learning library libsvm linguistics logical logiques lois london machine machines made magazine making malerba management mathematics mccallum mining mitchell modelling more morgenstern must nanni networked neumann neural node nodes nurnber online open other page pages paper particular paul peng perspective pkdd planning poincare prevision probability proc proceedings process processing project ramsey reaching references research respective retrieval review rosenfeld savage sawm search selection sensitive sequential several shrinkage sigir sinka sixth soft software sons sources statistical statistics stokholm studied subjectives succession support survey sweden systems technology text than that theoretic theory these tironi transactions truth user vector visited volume webclassii what whether which wiley with workshop york zaniboni http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.97 5 Learning to Use a Learned Model: A Two-Stage Approach to Classification accurate achieve agrawal analysis antonie applicable approaches apte association associative assumption automated automatic automatically backpropagation bagging based bayardo bayes benefit best better between blake boosting borgelt both breiman brute candidate categorization ceur challenging class classification classifier classifiers cmar coenen cohen collection combine commonly comparisons computational computing conclusions conf conference confidence considering contrast could cpar damerau data databases dataset datasets decision definitions demsar described developing dimensionality direction discovered document ecml editors effective efficient engineering equal error eurocolt european evaluation experimental fast feature features fimi first force forty frank frequent furnkranz further fuzzy generalization generalize generation goethals 
good handle high hirsch html http hull iane icdm icml imielinski implementations improving incorporating incremental independence indexing induction information input inputs integrating intelligence international interposed investigated involved iris items itemset itory joachims joins journal kaufmann knowledge label large latent layer learn learning leng lewis machine machinery machines magdeburg making many maron merz methods mining mlearn mlrepository model morgan most multi multiple networks neural number occurring other output over pages pakdd paper particularly patterns performance performs possible practical predefined predictions predictive predictors problem proc proceedings programs pruning quinlan quiry reduced references relevant repos research retrieval reuters rigorous routing rule rules sarc schapire scheme schemes scoring second seen selection semantic series sets shows sigir sigkdd sigmod sixth small software stach stacked stage statistical support swami system systems techniques term test testset text that theoretical theory this thus tools transactions tuning under used using vector very views volume voting weighted weiss well when where whirl widmer with without witten wolpert workshop would yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.37 7 COALA : A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity Eric Bae and James Bailey NICTA Victoria Laboratory Department of Computer Science and Software Engineering University of Melbourne, Australia {kheb,jbailey}@csse.unimelb.edu.au aaai advances agglomerative algorithm algorithms allerton analysis annep aone application applications approach aspects attikiouzel attributes background based best bezdek bialek bottleneck cardie categorical chechik chen classifier closest clus cluster clustering clusterings clusters communication compact comparing computer computing conditional conf conference consensus consistent constrained constraints 
control cybernetics data database davidson detecting dimensional discovery discrete distance distrib document dunn dynamic easy effective efficient eissen ensemble ensembles entropy eppstein expensive explor extracting fast feasibility finding fred fuzzy garg generating geometric gondek guha haque hierarchical high hofmann icdm identifying implementing indexes info information inter intern international isodata issues jain jordan journal knowledge larsen learning level linear lisuri machine mallows management martin master means method methods metric metrics mining mixtures model multiobjective multiple nanni neural newsl nips normal other page pages pairs pakdd parallel parsons partition partitions patrikainen pattern pereira practical presence press proceedings process processing rajasekaran rastogi ravi recognition redundant references relative relevant retrieval review robust rock rogers russell schroedl scien search separated sets shim siam side sideinformation sigkdd sixth society soda soft some speeding stat stein structures subspace symposium syst systems tering text thesis time tishby topchy using vaithyanathan validity vision voorhees wagstaff wanquing well windham with xing zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.121 150 Opening the Black Box of Feature Extraction: Incorporating Visualization into High-Dimensional Data Mining Processes* Jianting Zhang1, Le Gruenwald2 1 LTER Network Office, the University of New Mexico, Albuquerque, NM, 8713, USA 2 School of Computer Science, University of Oklahoma, Norman, OK 73071, USA and National Science Foundation, 4201 Blvd, Arlington, VA 22230, USA Contact Email: jzhang@lternet.edu, Phone: 1-505-277-0666 accessed airborne algorithms althouse analysis approach artificial aviris bajcsy band banddecorrelation barlow bases best candidate case cases centered chang classification clustering cluto conference construction crawford curran data datasets decision efficient engineering exceeds explor 
exploration extraction feature features flextree foci francisco frank geoscience ghosh greatly groves hierarchies highdimensional http hyperspectral icdm ictai ieee image imaging implementations information infovis infrared integrating intelligence interactive international james java jeffrey joint july karypis kaufmann knowledge kumar large last learn learning mach machine methodology mining morgan multiple nasa neville newsl nguyen number package paintingclass photogrammetric practical prefuse prioritization proceedings process redundancy references relevance remote richard selection sensing shimodaira sigchi sigkdd sixth software song spectrometer sterritt stuart study supervised support symposium teoh tool toolkit tools toward transactions tree trees user visible visualisation visualization when with witten wwwusers http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.129 112 Probabilistic Segmentation and Analysis of Horizontal Cells accuracy acknowledgements after algorithm algorithms allow allows also although amoglu analyses analysis answers apply approachable around assigns attention attribute automated automatic avoid axis based basis because belknap best between bhattacharya bioinformatics biological biologists biology biomed biomedical bohm bonnet brain brauner bridgehead building business byun cell cells centered certain cheng community component concentration conclusion conference confocal construction conveys correspond data database databases deal definite density derived detachment develop different digital directly disciplines discrete distributed door dowling each efficient electron ellisman environments even experimental explore expression faloutsos faradjian fariss fashion feature figure fisher focused foundation from function functions further future gadt gausstree gehrke geoffrey giving goldberg grant guess gupta have historical horizontal icde icdm identification ieee images imaging immersion immunofluorescent imprecise index indexing 
indicates informatics instance instead integrating intelligently international intertwined into invest jolliffe kalashnikov kinds laboratory lewis light like likely linberg ljosa localization location main manjunath many martone measurement measurements medical methods micrographs microscopy mining molecular more morphological most mostly moving much multi multiscale murphy national networks neurite neuroinformatics neuroscience novel object objects obtained only opens opthal other over pages pami panel part photoreceptors physical pixel plots pmask possibility prabhakar price principal probabilistic probability proc proceedings processes produce proof properties proposed protein provide providing pryakhin qian quantitative queries querying questions random readings received recent record references representing result retina retinal return right robust rrepresentation same santini scaled schubert science scientific search second segmentation segmenting sensor sept serve shah shape sharp shown sigmod similarity simulations singh sixth small soille sokal sorger source space spaces springer steven structural structures sumengen supplement supported survey swedlow symp techniques thank that there they thick thickness this threedimensional threshold tissue tkde together trans transactions turns ucsb uncertain useful value values variability variations vectors verardo vincent visual vitter vivo vldb vocabulary walk watersheds which will with work world would yang zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.95 54 Latent Friend Mining from Blog Data Dou Shen1, Jian-Tao Sun2, Qiang Yang1, Zheng Chen2 Hong Kong University of Science and Technology, Hong Kong {dshen, qyang}@cse.ust.hk aaai acknowledgments adamic adar advances algorithm allocation analysis analyzing annual anonymous applications approach approaches arlington articles aspect association assumption authenticity author authors avesani bartlett based bayes bipartite blei blog bloggers blogs 
blogspace bonus boostexter boostingbased bridges bridging building bursty calvert cambridge categorization chen cikm classication classification classifiers clustering collaborative combining comments commun communication communities comparative compare comparison comparisons computer concerns conference constructing contextualised corrada cova craven customers data different diffusion dirichlet discovering discovery division document documents domains domingos easily ecml ecosystem editors eigenrumor emmanuel equivalence eurasia evaluation event evolution expectation extracting factors faust feature features filtering forty friends from fujimura fukuhara gabrilovich gender generation generative genre grant graph graphs griffiths gruhl guha hara hayes herring hicss hino hofmann huffaker hypertext icdm icml identity ijcai important independence individuals information inoue interests international ishida joachims jordan journal karypis kautz kleinberg knowledge kumar lafferty language large latent learn learning lewis liben likelihood link lorrain mach machine machines manner many margin markovitch massa mathematical mccallum mediated method methods milgram mining minka minnesota model models more murayama naive nakajima neclc neighbors network networks nigam nishida novak nowell objective outputs page pages paper partitioning pattern pedersen people platt prediction presented press probabilistic problem proceedings propogation psychology qiang query raghavan ranking real references referral regularized relational relevant report retr retrieval reviewers riboni richardson role rosen schapire scheidt scholkopf schuurmans schwartz selection selman semantic shah shared shen sigir sigkdd singer sixth slattery small smola smyth social sociology spatiotemporal statistical steinbach steyvers structural study subjective sugisaki support supported system tanaka tatemura technical techniques technology teenage temporal text thank their theme this thompson threads through 
today tomkins topic topics university useful using value vector virginia visual wang wasserman weblog weblogging weblogs white with wood workshop world wright yang york zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.9 135 A Simple Yet Effective Data Clustering Algorithm Soujanya Vadapalli Satyanarayana R Valluri Kamalakar Karlapalem Center for Data Engineering, IIIT, Hyderabad, INDIA {soujanya, satya, kamal}@iiit.ac.in aggarwal algorithm algorithms ankerst based better breunig caters chameleon cliffs clustering clusters comad complicated components computer computers concept conference convincing data databases dataset datasets densities densitybased details detection different dimensional discovering dubes dynamic effective efficiency englewood ertoz ester evaluation experimental exploratory figure finding generate given gives hall have hierarchical high hyderabad icdm identify ieee improved influence information institute international intrusion jain jarvis karlapalem karypis korn kriegel kumar large letters life like literature mesure mining modeling more muthukrishnan nearest neighbor neighbors noise noisy optics ordering outlier patrick points prentice proceedings processing proposed published queries quick real record references report results reverse sander sets shapes shared show shown sigmod similarity simple sixth sizes solution spatial stability steinbach strong structure studies syndeca synthetic tarjan technical technology than that this tions tool transactions using vadapalli validity valluri vennam well with work http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.142 151 Semantic Smoothing for Model-based Document Clustering Xiaodan Zhang, Xiaohua Zhou, Xiaohua Hu College of Information Science & Technology, Drexel University xzhang@ischool.drexel.edu, xiaohua.zhou@drexel.edu, thu@ischool.drexel.edu acknowledgement agglomerative aggressive although among analysis annals applied approach background banerjee based berger bioinformatics 
both career classification cluster clustering coefficient collocations comparative comparing comparison competitive computational computer conclusions conference context cosine criterion data dataset datasets depends dept detail different dimensional discounting dmbio document documents dragon dragontool drexel effect effective effectiveness engineering equals evaluated experiment experiments figure finding findings five following formula frequency from functions general generative genomic ghosh grant groups health high hperspheres http icdm ieee information interesting international introduction ischool issue john joint karypis kaufman knowledge kullback kumar labeled lafferty language laplacian large latimes learning leibler less linguistics machine major march mathematical mccallum means methods mining minnesota model modelbased modeling models more much networks neural nigam ntfcosine obtained other otherwise paper part partitional problem proc proceedings promising quality references report retrieval retrieving rousseuw schemes science semantic sensitive settlement shows sigir sixth size smadja small smoothing song sons sparsity stage statistical statistics steinbach study sufficiency supported systems technical techniques tested text than that they this three tobacco toolkit translation univ university unlabeled using variance very volume weaken when where which wiley with words work worst xtract zhai zhang zhao zhong zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.108 89 Mining Generalized Graph Patterns based on User Examples1 accuracy acknowledgements algorithm alon also analysis applications applying arxiv asia asks assessment august australia austria authors baldwin based benefit berthold better biological biology borgelt both bouwmeester canada carolina casari case caused chains coexpression computer computing concept conclusion conference cores critical cybercommunities data databases datasets decomposition demonstrate dept detecting 
different discovery discussions division dmitriev documents does domains edge effect efficiency emerging evaluated even examples expect experiments exploration feature find first form formal foundation fragments francisco frequent from fuzzy gagneur gene generalization generalizations genome ginsparg given good grant graph graphbased graphs gspan gudes have having helma help hofer however hypermedia hypertext icdm ieee impact important improve induced inferring information intelligent interaction interestingly international isomorphic issue italy itzkovitz japan joachims karypis kashtan know knowledge known kramer krause kumar kuramochi kuznetsov labeling lagoze langston large larger learning less life like likely logical machine maebashi meinl microarray might milo mining minnesota miyahara modular molecular motifs motoda national negative neither network networks newsletter next north number often only other overall pacific paper patterns paul peng perceive perform performance performs pisa pointing practical practice precision presented problem problems proceedings properties protein raedt raghavan rajagopalan randomness real recall reference references report rules salzburg saxton scale science second selected semistructured september sequences shimony shoudai show sigkdd significant sixth snoody state step structured subgraph substructure suchkov supported suspect sydney synthetic systems taipei taiwan takahashi technical terms thank that there this thorsten through tomkins topological toronto trawling tree trees uchida ueda university used user using valuable vanetik vertex washio well what which wide wildcards will with work workshop world would http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.88 35 Incremental Mining of Frequent Query Patterns from XML Queries for Caching accordingly acharya acknowledgement adaptive afshar agrawal algorithm algorithms answerability approach archiv asai association ayres balmin based basic believe bettini beweis beyer 
bide bitmap bohannon bongki bulletin caching cathala cheap checking chemical chen child china closed clospan cochrane compounds conference confidence cyclic dasfaa data database databases dation dayal dehaspe discovering discovery document dong dynamic edbt effective effectively efficiency efficient efficiently eines engineering enumeration environment evolving existing experimental exploiting expressions fast filtering finding first fist flannick forest foun foundation framework freespan frequent from fxqpminer gehrke generalizations give grant granularities graph growth huiqing icde icdm icdt ieee importantly improvements include incremental incrementally index indexing information international introduce jajodia jcqn joonho karypis kaushik kawasoe king kuramochi kwon laboratory large lianghuai local long mandhani masseglia materialized mathematik method milo mine mining moon more mortazavi multiple national natural neuer noisy novel outperform ozcan ozden pages parent park partial path paths pattern patterns performance periodic permutationen philip physik pirahesh pkdd poncelet praveen prefix prefixspan prix proceedings processing program projected proposed prufer queris query querying ramaswamy references relationship relationships representation research results rich rousset rules satzes scalability scalable science sebag selection semi sequence sequences sequencing sequential series shenoy sigkdd sigmod silberschatz similarity sixth speed srikant step structural structured structures subgraph substructure substructures suciu summary supported techniques technology temporal termier terms that this thorough through time tkde tnlist toivonen towards transaction tree treefinder trees tsinghua twig uber under unique using valid validity view views vist vldb wang well which with work xpath yang zaki zhejiang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.14 148 Adaptive Kernel Principal Component Analysis With Unsupervised Learning of Kernels Daoqiang Zhang 
Zhi-Hua Zhou National Laboratory for Novel Software Technology Nanjing University, Nanjing 210093, China {zhangdq, zhouzh}@lamda.nju.edu.cn Songcan Chen Department of Computer Science and Engineering NUAA, Nanjing 210016, China s.chen@nuaa.edu.cn accuracies accuracy achieves acknowledgments adaptive advances again alberta algorithm algorithms alignment also always among analysis approximate approximations average averaged balance banff bartlett base based becker besides best better blake bold bottom bousquet california cambridge canada carried cave chapelle chen china choosing claim classification classifier clustering coil columbia compares comparison comparisons component computation computer conclusion cone conference cristianini criterion cucs data databases department diego dietterich different dimension dimensions discriminant distance distinguished duda editors efficiency efficient eigen elisseeff enabling ensembling european example experiments extend extraction feature figure find fisher five former foundation from fund future gaussian generalized ghahramani ghaoui glass hart heart highest html http hyperkernel hyperkernels icdm idealized ieee image improve information interesting international introduction investigate ionosphere irvine italy iterative jiangsu jordan journal kandola keogh kernel kernels kinds kpca kpcas kwok lanckriet large learning leaves left library lowest machine machines matrices matrix means merz methods middle mika minimum mining mlearn mlrepository moreover mukherjee muller multiple murase national nayar nene networks neural nonlinear note obermayer object only optimize order other outperforms pages paper parameters patter pattern performance performed pisa possible presented press principal problem proceedings processing programming proposed proposes rank ratsch recognition references report reported repository research right scene scholars scholk science second selection semidefinite shawe simultaneously single sixth smola society 
softlib sons space such superiority support supported systems table tang target taylor technical that this thus transactions tsang tsuda tuning under underlined university unsupervised used using validate value vapnik vector verifies vision washington when which while wiley will williamson with work yang york young zhang zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.132 17 Rapid Identification of Column Heterogeneity accuracy against akaike algorithm allerton annals annual applications approximate automatic banerjee batini bialek bottleneck bregman browser build catarci chaudhuri cidr circuit cleaning clustering communication computing concepts conference control convergence cooperative cover dasu data database databases dempster devroye dhillon dimension divergences efficient elements email estimating exploratory figure flexible from functional fuzzy ganti generahttp ghosh hebrew html icdm identification ieee incomplete information integrated international interscience issues iteration john johnson joins journal koudas laird large learning likelihood lineage look machine management marathe match matching maximum mcgill merugu meta metadata method mihaila mining model motwani muthukrishnan online overview pages pereira phone practice proc proceedings quality querying random raschid rate references research rnbookindex robust royal rubin scannapieco schwartz shkapenyuk sigmod sixth slonim society srivastava statistical statistics string structure survey system systems techniques theory thesis thomas tion tishby transactions trio tutorial uniform university variate vidal vldb widom wiley with york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.65 56 Entity Resolution with Markov Logic Parag Singla Pedro Domingos Department of Computer Science and Engineering University of Washington Seattle, WA 98195-2350, U.S.A. 
{parag, pedrod}@cs.washington.edu aaai accuracy acknowledgements active adaptive agresti alchemy algorithm algorithms ambiguous american analysis application applications approach approaches approximate approximately artificial association attribute authors automatic axford based bate bayesian bhamidipaty bhattacharya bibliography bilenko blockeel blog bonn boston build bulletin bureau carlo cate categorical census chain challenge chapman chicago cikm citation cleaning cliques cluster clustering cohen collins coloring comparison complex computer conclusions conditional conference consolidation constraint constraints construction contained contract coreference cornell costa culotta current darpa data database databases davis dbms deduplication della department dependences detecting detection dimacs dimensional discriminative division dmkd doan document domain domainindependent domains domingos dong drug dupli duplicate dutra dzeroski easily edinburgh editors efficient either elkan emnlp enables engineering entities entity equivalence establishing evaluation experiments expressed features fellegi fern fields fienberg foundations francisco from funded gehrke general genesereth germany getoor gilks ginsparg government grant gravano halevy hall hard hardening hardness hernandez hidden high http huang icdm icia icml identifiable identification identity ieee ijcai illustrate imls implementation implied independent inducing inference information integration intelligence intelligent interactive international interpreted ipeirotis iterative jagadish james jiang johnson joint kaufmann kautz kddcup kennedy kleinberg knoblock koudas lafferty large learnable learning linkage local logic logical london machine madhavan magazine markov marthi match matching mateo mathematical mcallester mccallum measures mediated memory merge methods metrics milch mining minton miss model models monge monte mooney morgan morie multi multiple muthukrishnan names nbchd necessarily networks newcombe 
nigam nilsson nips nocedal noren noun numerical object objects official online open optimization orre page pages pardalos partly pasula pattern pearl perceptron philadelphia pietarinen pietra pittsburgh pkdd plausible policies porto portugal practice press probabilistic problem problems proc proceedings processing projects purge qgrams quass random ravikumar reasoning reconciliation record records reference references relational report representing research resolution richardson richman roth rules russell safety sarawagi satisfiability science scotland search seattle second selman semantic sets shen should shpitser sigkdd sigmod similarity singla sixth society soft solving sontag sophisticated sources spaces spiegelhalter springer srivastava state states statistical stochastic stolfo strategies string structure sunter surveillance system systems technical tejada testing text theory this those traffic training transactions transformation trick types uncertainty ungar united university unknown using views vital washington weights wellner wiley winkler with wkshp workshop wright york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.17 47 An Efficient Reference-based Approach to Outlier Detection in Large Datasets able achieve acknowledgment adaptive advocate alberta algorithms also analysis anonymous anormaly applications approach approaches arnold august authors barnett based becomes better breunig british castro chapman chawla clustering columbia comments computing conference data databases datasets density department detecting detection discovered discovering distance distancebased distributed done each effectively efficient efficiently empirical eskin estivill find finding fourth framework from generator geometric global grid hall hawkins highly http iane icdm identification identifying ieee international intrusions john june knorr kriegel large lewis linear local master method mining near ninth nserc number okanagan only oultiers outlier outliers over pages 
part perrizo point points portnoy prerau proc proceedings pruning ramaswamy randomization ranking rastogi referees reference references relative report representation results rule sander scalable schwabacher science security sets seventh shim show sigkdd sigmod simple sixth small sons sorting spatial statistical stolfo studies support survey surveys synthetic technical than thank that their theoretical thesis time tung uniformly university unlabeled unsupervised used using vertical very vldb wang when while wiley with wood work yaling http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.140 32 Secure Distributed k-Anonymous Pattern Mining abstract achieving acknowledgment advances agrawal algorithms annual anonymity anonymization anonymous application applications association atzori author authors baltimore based bases bayardo berlin blake blocking bonchi calders cambridge cardinality chang chapter chee chile clifton clustering collaborative completeness complexity computation computer computing conference connecticut crypto cryptographic cryptography cryptology cryptosystem dallas data database databases decision derivable dewitt dino disclosing discovery distributed domain editors efficient encouragement engineering eurocrypt european exchange factoring fast first fosca foundation foundations framework france franco frequent friedman full funded fuzziness game general generalization generalizing generate geopkdd germany giannotti goethals goldreich grant hand honest houston icdm icisc identities ieee ifip incognito induction information international intersection issue itemset itemsets japan jiang journal june kanonymous kantarcioglu knowledge korea large laur learning lecture lefevre lindell lipmaa lncs machine majority management mannila matwin means mental merz meyerson micali microdata mielikainen mining model national ninth notes november october okamoto optimal over page pages paris park partitioned patterns pedreschi pinkas pkdd play pods porto portugal 
practice preserving press principles privacy proceedings product project protecting protection protocols provide public raised ramakrishnan references release repository respondents results reviewers rule rules samarati santiago scalar schuster science seattle second secrets secure security seoul sept september seventeenth sigact sigactsigmod sigart sigkdd sigmod sigmodsigart sixth smyh society special springer srikant storrs suggestions support supported suppression sweeney symposium systems texas thank their theorem theory threats through tkde tokyo transactions tree turini uchiyama uncertainty under university using vaidya valuable verlag vertically very violate vldb volume washington well when wigderson williams wish with wolff work working york zhan http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.113 107 Multi-Tier Granule Mining for Representations of Multidimensional Association Rules abstractions accurateness acknowledgments acmsigkdd acquiring addition adequate alberta algorithm algorithms also approach associate association associations atms australian austria authors automatic automatically based beginning beijing bell between bucila canada checked chen china cimca clear closed comments computing concentrate conclusions condition conference constrained constraint constraints constructions convertible cost council current data databases deal decision definition demonstrate describe describes different dimension dimensional disconnect discovery dual dualminer during efficiency efficient efficiently elegant endeavor engineering exatraction existing exploiting explorations exploratory extended firstly florida foundational framework frequent from further fuzzy gehrke generating germany grant granular granule granules grouped guan heidelberg hirano however icdm ideas ieee improve information institute intelligence inter international interpretations into item itemsets japan journal justified kifer kinds knowledge lacks lakshmanan large leung level like 
logical looked maebashi meaningless melbourne mining multi multidimensional multiple needs ning ontology optimal optimizations order other pang paper partially pattern patterns pawlak pham present presented price proceedings product products professor prove provide pruning pursuit question random reasoning redundant references replacement representations represented research researchers rough rule rules same scaling seattle semantics sequential sets sigmod significant similarity single sixth software some sorts structure succinct supported systems table tables taxonomy technology thank that theory this tier tiers tool transaction transactions tree trees trends tsumoto tzvetkov used useful user using valuable vienna view visualization wang want washington webb when white wish with zaki zhang zhong http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.16 104 Adding Semantics to Email Clustering Hua Li1, Dou Shen2, Benyu Zhang1, Zheng Chen1, Qiang Yang2 according acknowledgement additionally algorithm although analysis anonymous approach assessment assist assistance association authors automatic automatically average basic bastide bekkerman benchmark best both categorization celeux ciir class classification classifies clear closed closet cluster clustering comments compared comparison computa conclusions conducted conference corpora dale data dataset datasets dekker descriptors discovered discovering documents each email emails embedded employed enron evaluate evaluated experiment experimental experimenters experiments extract features folders frequent from generalized generate generated good govaert gsppcl gsps guimei halkidi handbook heidorn help http huang human icdm icdt improve improvement improvements intelligent international into introduction invited involved itemset itemsets journal kmeans knowledge labeled labeling labels lakhal language learning leveraged like likelihood machine marcel maximum mccallum means mining mitchell mixture moisl more name names 
natural nigam novel obtains other owner owners paper pasquier patterns performed pkdd proceedings processing proposed pseudo quality ranked readability readable references report results reviewers rules scores searching sentence serve sets seven short showed shown sigkdd significant simulation sixth somers statistical statistically strategies subjects suggestions supervised table taouil technical technique test text thank that their this thrun tion treats umass unaware unlabeled unreadable unsupervised used user using valuable vazirgiannis viewpoints wang well were which with word would writing york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.72 65 Finding "Who Is Talking to Whom" in VoIP Networks via Progressive Stream Clustering accuracy accurate achieve activities addition algorithm algorithms also analysis areas aspects attempted avenues awareness basu baugher being believe binary cannot careful carrara closing complementary compromised conclusions conference convergence conversational conversations conversing could coupled current currently data demonstrated different distance distributed easily effective efforts encryption examine execution exhibits exploring fast from fusion general have high icdm ietf independent indicating information institute intercepted intercepting interest international internet investigation like main malicious many march massachusetts mcgrew measures merely mining models multiple naslund norrman objective open packets pairing pairs paper parties point potentially presented privacy private problem proceedings protocol provide provision raise rates real reasons references results reveal robust scene schemes secure sixth srtp stages still streaming such suggest technology telephony that thesis this throttle time traffic transport ultimate voice voip ways well while with would http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.158 110 The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study Xu-Ying Liu 
Zhi-Hua Zhou National Laboratory for Novel Software Technology Nanjing University, Nanjing 210093, China {liuxy, zhouzh}@lamda.nju.edu.cn aaai addressing altering artificial austin based beats blake boston california card case chan chawla chicago ciraco class classifier classifiers computer conference cost costly costs costsensitive credit data databases department detection diego discovery distribution distributions does domingos drummond editorial effect elkan engineering example explorations foundations framework fraud from general holte html http icdm icml ieee imbalance imbalanced improving induce induction information instance intelligence international irvine issue japkowicz joint joural keogh knowledge kotcz langford learning machine making maloof margineantu melbourne merz metacost method methods mining misclassification mlearn mlrepository more multi national networks neural nonuniform notes over pages problem problems proceeding proceedings proportionate provost rarity ratio references repository require research rogalewski sampling scalable science seattle sensitive sensitivity senstive sets sigkdd sixth solutions special stolfo study than ting toward training transactions tree trees under unequal unifying university unknown utility washington weighting weiss when with working workshop york zadrozny zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.165 48 Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems Roberto Perdisci , Guofei Gu, Wenke Lee College of Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA absence abstract accurate acknowledgments activeperl advances algorithm allen analysis anomalous anomaly anonymous applications applied approach arnold attack attacks authors automated automating barbara based berg binary blending brunelli buffer bytes categorization chang chinchani christie class classification classifier classifiers clustering code combining command comments 
communication computation computer computing concept conference constructing contents counter cues data defending delft detect detecting detection detristan dhillon didaci dietterich dimensional distribution divisive duda duin editors engine ensemble eskin estimating examples execution exploit extension fabio falavigna fast feature firew flows fogla framework fusion generation geometric giacinto giorgio grant hart help high host icdm identification ieee information input inside insightful intelligence international intrusion intrusions isapi issue jajodia journal kindermann kirda kluwer kolesnikov kruegel kumar kuncheva learning leopold letters library libsvm like machine machines mahoney malcom mallela march mchugh media methods microsoft mimicry mining multiple mutz naval navy necessarily netherland network networks neural office official oneclass overflow packet pages part pattern payload perdisci perliis person phrack platt polymorphic portnoy prahlad prerau printer proceedings raid recent recognition references remote represent research responsibility reviewers rker robertson role roli scholk sebastiani security securityfocus sept service services sharif shawe shellcode signature sixth smola software solely soto space specific spectrum static statistical stolfo stork support supported surveys symposium systems taylor technology text texts thank their theoretic theory thesis this toth traffic trans ulenspiegel underduk university unlabeled unsupervised usenix using vapnik vector views vigna vulnerability wagner wang wiley williamson windows with work workshop worm would yourself http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.45 152 Corrective Classification: Classifier Ensembling with Corrective and Diverse Base Learners accuracies accuracy achieve acquired acru adopts advances aggressive algorithm algorithms also alternatives american among amount analysis approach approaches april artificial association automatic average back bagging base basic before 
beled benchmark best better blake blemishes bond book boosting bootstrap both breiman brodley build built care centered certain classification classifier classifiers classifiert cleansing close column comparison comprehensive computer concluded conclusions conditions conference consider consistently constructed convincing copies correcting correction corrective correspond cunningham data databases dataset datasets decision demonstrated denote department design detection dietterich diverse diversities diversity each edir edit edrp effective efforts either enhance enhanced enhancing ensemble ensembles ensembling error eventually experimental experiments explained feature fellegi figure final first focus four frank freund friedl from furthermore generate good hand have hettich holt icdm icml ictai identifying ieee impact imperfect imperfections implementations improved improvement imputation incurs individual induces inductive information initial instance intelligence intelligent intensive international isolate issues java journal kaufmann kolen kuncheva lead learner learners learning lecture left less levels lines loses loss lost machine made majority measures medians merz methodology methods michalski middle mining misla model monks more morgan multiple national neural nicely noise noisy noitci normally note notes observed often oitci onitc onitci only original other outperforms package pages paper pechenizkiy performance performances plot plots polishing pollack practical prediction predictors preprocessing procedure proceedings processing programs promising propagation proposed quality quinlan ranking rationale references relationship report repository represent represents requires research respectively result results right same sampling schapire science selection sensitive several show shows significant sixth size smaller software source sources statistical still study succeeding such summarizing superior systematic systems takes technical temporarily teng than 
that their them then theory they thirteenth this three tied time tools total traditional training tsymbal underlying university uses variations vermont very voting when which whitaker will with witten workshop yang ycar ycaru zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.33 96 Cluster Analysis of Time-series Medical Data Based on the Trajectory Representation and Multiscale Comparison Techniques Shoji Hirano Shusaku Tsumoto Department of Medical Informatics, Shimane University, School of Medicine 89-1 Enya-cho, Izumo, Shimane 693-8501, Japan hirano@ieee.org about active after algorithm alignment although among analysis application approach arnold artificial average babaud based baudin best between calculates cases chan changes chronic clinical close clsi cluster clustering clusters combination comp compares comparing comparison concave conclusions conf conference contained conventional convex corresponded correspondence could course courses curvature curves data dataset deepening deformed delivered demonstrate description detail determines development difference differences different difficult dimensional discovering discovery discrete discrimination dissimilarity distribution duda dudek dynamic each edition edits efficient eighth embed employed employs enabled enables evaluation everitt examination example existed experiments experts feature features fibrotic figure filtering find findings firstly fourier fourth from fundamental further future gaussian general global gross hart have hence hepatitis here hidden high hirano however human icdm ieee ieice ijcai implications implied imply include increase induced inflection information intelligence interesting international into items iterative japan journal kawagoe keogh kernel knowledge kruskal landau largely leese level lindeberg local machine mackworth macromolecules mannila matched matching maxima meaningless medical medicine method methodology mining mokhtarian monotonicity multiscale observed 
original outputs pairs pami paper parameters partial pattern patterns pazzani place planar plied plural points presented preserves previous proc proceedings produce proper proposed provide publications publishers receive recognition recognize references refinement relations renganathan represent representation requires research results rough rule sankoff scale scales search second segment segments sequence sequences series sets shape shapes should siam signals similar similarity sixth smyth society some space stage stages string structural structures subsequences such suzuki system systems takabayashi temporal that their there these they this time took tool tostsos total trajectories trans transactions transforms treatment truppel tsumoto ueda underlying understanding uniqueness used usefulness using validation value values various view views warping warps wavelet wavelets well which with witkin work would http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.126 128 1 abrams abuse access aiken algorithms annual applied arxiv association authorship brin broder check classes classification collberg communication complexity compression computer computers computing conference containment copy cornell culture data david davis degrees detect detection digital document documents editor electronic february finding fingerprinting first fourteen free garcia gehrke ginsparg halbert iadis icdm ieee information international internet irma july june kobourov koppel learning leong libraries library local louie machine management matchdetectreveal mechanism mechanisms mining molina monday monostori november october oneclass online open overlapping pages physics pinch plagiarism practice problem proceedings recommendation references report reputation research resemblance resources review ribler scam schleimer schler schmidt scholarly science self sequences shivakumar sigmod similar sixth slattery sorokina splat steps symposium system systems technical theory towards university 
using verification visualization warner wilkerson winnowing years zaslavsky http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.39 66 Comparison of Descriptor Spaces for Chemical Compound Retrieval and Classification above achieved adding advances aids algorithm algorithms also amount antiviral appendix applications apriori arbitrary baldi based been better bioactive bioinformatics bland bohm bound bounded bravi brown bruand calculation canonical case certain challenge changing chem chemaxon chemcomp chemical chemistry chen cliffs collapsed come comp compare comprehensive computation computational computations computing conference connected consider considerable constraint constraints contain contraction corresponds costa count cramer current cyclic data datasets daylight decomposition deletion denoted denotes derive description descriptor descriptors deshpande designed developed differ difference different discovered discovery does downloaded drug durant durst each easy edge edges embedding embeddings englewood ensure ensured equal equation evaluation every exactly exist existing explore faced fashion feature fewer first fischer following fragment fragments frasconi frequent from future gaston generate generated generates generating generation gillet given goal graph green gribskov group grtner gspan hall hand harper have helma hence henry higgs hipskind horvath http hussain icdm icml identify ieee ijcai implementation imposing improve incident info information initial inokuchi inspired interesting international introduction isomorphic joachims john july karypis kashima kernel kernels kier king kramer kuramochi labelling large leach learning least leland length light like lists make making martin medical meinl memory menchetti methods mining mitpress model modeling modifications modifying molecular molecules more moreover motoda muggleton must nature ncbi need needed nijssen note nothing nourse number objective observations obtained once only openbabel operation 
order original otherwise oxford pages paper particular pattern performance philippsen phung pickett pkdd possible practical predictive prentice press primary principles problems proceedings project pubchem quickstart raedt ralaivola reasons recurrence recursive references relation requirement results returned reviews richards robertson robinson runtime same satisfies savin scale schneider screen screening second selected selection siegel sigkdd significant significantly similar simply since sixth size sourceforge spaces spanning specifically speed spend srinivasan statistical statistics sternberg structure study subgraph subgraphs substructure substructures such support swamidass systems task tasks technique terminating than that then theory there these this three threshold thus time tkde together toxicology tree trees tried tsuda university upon upper used using vapnik various vector vertex vertices vieth virtual wale washio watson ways weighted west when where whittle wiley willett with work worlein would wrobel yield http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.106 113 Mining Correlation between Motifs and Gene Expression Yi Lu, Shiyong Lu, Adrian E. Platts, Stephen A. 
Krawetz Wayne State University able acad addition additional advanced agrawal algorithm algorithms always amon among analysis appeared applied approach approaches appropriate association austin bases baskets been between beyond biol brin campbell candidate cell central cerevisiae change collocations common complex computational concepts conclusion conditions conf conference constraint contain contains control controlling controls conway correlation correlations corresponding curr cycle data davis dependent discovered distract duplicate each early eukaryotes evert exactness examined example expression extra factors fast fate figure fishing found four fraenkel frequent function further futcher future gabrielian gene generalizing generate generation genes genome gerber gifford gordon group hannett harbison however http icdm identifying ignore important improve information international introduce introducing jennings joseph kamber kaufman know known krenn landsman large larger late liddell like literatures lockhart lowest management market method miner mining minute minutes mitotic morgan most motif motifs motifset motifsets motwani murray natl networks nine number odom opin other pages paper partition patterns pedersen performance phase phases post proc proceedings processing produced profiles prokaryotes promoters proposed publishers pushing recruits rediscovered references regulate regulatory related relationships reported required respectively rinaldi robert rule rules saccharomyces science seen selected show sigmod significant silverstein simon since sixth solved south specifically srikant state steinmetz step strategies such synergetic tagne technique techniques that then these this thompson time together transcription transcriptional university user users value very vldb volkert wayne well wide winzeler with without wodicka wolfsberg work wyrick yeast young zeitlinger http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.19 3 An Information Theoretic Approach to 
Detection of Minority Subsets in Database Shin Ando Graduate School of Engineering, Yokohama National University ando@ubicg.ynu.ac.jp Einoshin Suzuki Graduate School of Information Science and Electric Engineering, Kyushu University suzuki@i.kyushu-u.ac.jp addressed advanced agarwal aggarwal algorithm algorithms also analysis applications applied approach approaches approximations artificially assignment average banerjee based bases basis bayesian bekkerman berger berlin bernstein bialek blahut boston bottleneck boundary breunig brodley buhmann butterworths cambridge capacity categories categorization chance channel chapman chechik chen class classes classification classifying cliffs clustering clusters compression computation computer computing conclusion condition conditional conference consistently consisting contribution conventional convergence corpus corr cover crammer data database databases dense densities density densitybased derived description desirable detecting detection dhillon dimensional discovering discovery distance distortion distribution distributional divergence divisive document documents eccv edition editor editors effective elements englewood ester estimating estimation european evangelos even exception expectation expense experiment experiments exponential families favor feature first focuses formulating framework from function functions furthermore general generated ghosh good hall hard haystack hermes heyden high highdimensional hinneburg hinton hoboken icdm identifying ieee image inconsistent incremental induced induction inform information initial instances intell international irrelevant jiawei johansen joshi journal july justies keim knorr knowl knowledge kriegel kumar kurihara labeled large largely learning likelihood local london machine majority mallela management mathematical maximization maximum mcburney mccallum means meanwhile measure merugu method methods mining minority mitchell mixture model money monographs more nature 
naughton neal needle neural nielsen nigam noise objective ohsawa oilseed optimization other others outlier outliers overwhelming page pages paper parametric pereira phase physics platt precision prentice present press principle probabilities probability problem problematic procedure proceedings processing promising property proposed provides rare rate rected redner references relatively repository research respective respectively rest result results retrieval retrieved retrieving reuters review rijsbergen rule rules sample samples sander scale scholkopf second segmentation sensitivity setting shawe showed showing shows siam sigir sigmod significant silverman simoudis single sixth slonim small smola somewhat sparr sparse spatial specificity springer statistical statistics submerged subset subsets sugar supply support suzuki syst table task taylor text than that theoretic theory this thomas thrun tishby trans tucakov twenty undi unified unlabeled using vapnik variants variational very view vision vldb walker welling were which wiley williamson winter with word words yaniv york zoller zytkow http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.62 102 DSTree: A Tree Structure for the Mining of Frequent Sets from Data Streams Carson Kai-Sang Leung Quamrul I. 
Khan aaai about according acknowledgement additional agrawal algorithm algorithms also among analysis approach approximate arranges association associations based bayardo because between bonchi both bucila canada candidate canonical cantree captures cases challenges changes charm closed closet conclusions conference confirmed constrained constraints contents contrast contribution convertible corresponding cost count counting current data databases decreased decrement deleted deletion depend directions dmkd dstree dual dualminer dynamic easily effect efficient efficiently either exact example expected expensive experiment experiments exploiting explorations exploratory false fast fifth figure form former fourth frequency frequent from future gaber gave general generation giannella grants granularities hash high hsiao huang icde icdm ieee increased incremental international item items itemset itemsets june just keeps kept lakshmanan large latter leung linear list long lower lucchese maintained maintaining maintenance manitoba maximal method mining minsup moment moreover morevoer multiple needed negative next nice nodes novel nserc number only optimizations optimize order ossm other over paper park partially pattern patterns perform positive powerful press proc proceedings project properties proposed provide pruning record references remove require required research results review rules runtime scalability scan segmentation sept sets shifted show sigkdd sigmod simple sixth size sizes slides sliding smaller some specifically speed sponsored srikant stream streaming streams structure structures studied subset succinct tested than that this thus time tkde tods transaction transactional transactions tree trees trimming unaffected used user using usual vldb well were when whenever whereas while window with without zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.4 95 A Feature Selection and Evaluation Scheme for Computer Virus Detection acknowledgments also 
analysis arnold august automatic automatically bontchev bull bulletin classifier clean comparatives comparing computer conf conference cost crossley data dataset detection elkinbard eskin evaluation examples executables expert explore extraction ferrie files generated good graciously gryaznov heuristic heuristics higher hunting icdm ieee international involve kephart kerchen library like maintenance malicious marx matrix metamorphic methodology methods mining misclassifying more muttik national negative networks neural older olsson ones open orleans privacy problems proceedings really recent recognition references research retrospective scanners schultz security sept september setting sharing signatures sixth society sorkin static stolfo symposium systems tesauro test testing thanks their this tools training unix using virus viruses white with work would year zadok http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.124 82 Pattern Mining in Frequent Dynamic Subgraphs Karsten M. Borgwardt, Hans-Peter Kriegel, Peter Wackersreuther Institute of Computer Science Ludwig-Maximilians-Universitat Munich, Germany kb|kriegel|wackersr@dbs.ifi.lmu.de abundant algorithm algorithms also applications artif asynchronous attached background based biology both cambridge changing choices chosen classification color combinatorial common complete computational computer conclusions conference confirms constant cook corpus data dataset default deletions description desikan detected detecting discovered discovery discussion dynamic dynamics ecml edge edges email enhanced enron evolving examined expected experiment extended extreme figures finding first fixed frequency frequent from graph graphs grew gspan gusfield have hence higher holder icdm illustrated impact inokuchi insertions intell interaction international isbn jair june karypis klimt knowledge kuramochi large lead learning length long looking lower machine matching might minimum mining more motifs motoda networks next note 
number order pages paper parameter parameters particular pattern patterns point press problem proc proceedings protein quite rare references reported requiring research results same scalable science search searched selected sequences series shorter should show sixth size social sparse sparsity srivastava starting string strings study subgraph subgraphs substructure symp synchronous table technique telecommunication temporal temporally tested than that these they this those threshold thresholds time topologically trees types university unsurprisingly used using values volume washio webkdd well with within workshop yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.22 30 Applying Data Mining to Pseudo-Relevance Feedback for High Performance Text Retrieval Xiangji Huang1, Yan Rui Huang2, Miao Wen2, Aijun An2, Yang Liu2 and Josiah Poon3 aaai access acknowledgments advanced advances allan analysis annual anonymous applied applying artificial australasian automatic background based bases bayesian beaulieu before bellegarda biomedical blind blum breiman buckley burges canada centre cercone challenges chan cikm clarke class classification classifier classify clustering cohen collaborative combining comments communication comput computational concept conf conference construct content contextual cormack cotraining council craven croft data dipasquo discovery document documents dual email engineering enhance enhancing expansion experiments exploiting fast feature feedback ferra forum francisco frank freitag friedman from gatford genomics global goldman grant group hall hard hauptmann hirst huang icdm icml ieee image improving incremental index inference information intelligence international iterative iwayama jasis joachims journal judgements kaufmann kernel kiritchenko knowledge koprinska kowalczyk labeled language latent learnable learning local lynam machine machines maching matwin mccallum methods minimal mining mitchell mitra model modeling morgan multimedia 
multisystem natural negative nigam nist nserc number okapi olshen optimization overview pacific page pages paper pattern peng platform platt poon practical prentice presence probabilistic proc proceeding proceedings processing programs pseudo pseudorelevance publication query quinlan ranking raskutti recognition references regression relevance relevant research retrieval reviewers robertson robust rocchio rules schuurmans sciences search segmentation selection semantic sequential sigir simposium singhal single sixth slattery small smart sparck special spring statistical stone studies study supervised support supported svms symposium system techniques term terms text thank that their theory this three thrun tong tools track tracks training transductive trec trees tuning tutorial univeristy unlabeled unlabelled useful using valiant vector video wadsworth walker wang weighting wide williams with witten workshop world york zelikovitz zhang zhong zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.111 126 Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment Kelvin Sim, Jinyan Li Vivekanand Gopalkrishnan abello acids algorithm algorithms allow also analysis applications approximately arboricity audience balanced because bicliques bioinformatics bipartite budding burleigh capturing certain chen clique cluster clustering clusters common communities compared completeqb concept concepts conceptual conclusion conference considered contains data databases dataset dawak decrease definition degrees detection developed discovered discovery diversify dividend dodd efficiency efficient enumerate enumerated eppstein erroneous error errors eulenstein evaluated even evolution example figure financial find finding firstly found framework from gained general good graham graphs higher hill icdm identifying impractical incomplete increase increased increases index information interaction interesting interestingly international investors 
kamber kaufmann large latin learn letters life linear ling listing mach massive maximal mcgraw measurement mining mishra missing molecular more morgan motif murata network nucleic number optimal optimized pages pairs pattern performances performing phylogenetic phylogenetics portfolios present price problems proceedings processing professional proposed protein proteome quasi rate ratios real references reported research resende runs sales scale secondly security sequence sequences sets shown sites sixth small solving stocks structure subgraph subgraphs sudarsky suitable swaminathan symmetrical techniques that their thereby this thus tolerance tolerate topological types undirected unlike useful user when which wide wong year yeast yield zhang zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.68 109 Exploratory Under-Sampling for Class-Imbalance Learning academic acknowledgement adaboost addressing advances algorithms analysis area artificial asymmetric bagging beats berkeley blake bonn boosting boston bowyer bradley breiman brief california cambridge card cascade case chan chawla chen class classification classifier computer conference cost credit curve data databases department detection detector detectors distributions drummond edition editorial engineering evaluation explorations fast forest framework fraud from fukunaga generalization germany hall holte html http icdm icml ieee ijcai imbalance imbalanced information intelligence intelligent international introduction irvine issue japkowicz jiangsusf jones journal kegelmeyer keogh knowledge kotcz learn learning liaw linear machine merz methods mining minority mlearn mlrepository mullin networks neural nonuniform notes nsfc object over pages pattern predictors press problem problems proceeding proceedings processing random rarity real recognition references rehg report repository research robust sampling scalable schapire science sensitive sensitivity sets sigkdd sixth smote solutions special stacked 
statistical statistics stephen stockholm stolfo study supported sweden synthetic systematic systems technical technique time toward training trans under unifying university using viola vision washington weiss with wolpert working workshop york zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.122 25 Optimal Segmentation Using Tree Models Robert Gwadera, Aristides Gionis, and Heikki Mannila HIIT, Basic Research Unit Helsinki University of Technology and University of Helsinki Finland above annals annotated approach approximation average barron based basic bayesian bellman bernaola between bgrs bioinformatic bioinformatics biology biopolymer borders bottom buhl burge carpena chains coding communications complete complexities composition compression computational computer conclusions conference context correlations corresponding csiszar curves cytology data deciding denote depths description different dimension distinctive drosophila dynamic each encoding enst entropic estimating estimation exons experiments fast features feder fickett figure filipov finding finite flanking functional galvan gelfand gene genetics genic genomes genomic grosse guide guigo herzel homo homogeneous human icdm ieee inference information insititute intergenic international into journal karlin krichevsky lawrence lecture length letters line makeev mann markov maximizing melanogaster memory method middle minimum mining model modeling models molecular necessarily noncoding notes novosibirsk nucleotide number numbers obtained oliver optimal orlov overall pages performance physical plexity potapov prediction presented press principle proceedings process processes program programming properties proposed protein ramensky recognizing references regions research review rissanen roland role roman roytberg sapiens schwarz science sciences segment segmentation segmentations segmenting segments selection sequence sequences showed shtarkov sites sixth society source stanley statistical statistics 
structures system systems szpankowski talata that their theory tjalkens transactions tree trees trofimov tumanyan universal used usefulness uses using variable vlmc volume weighting weinberger willems with wyner zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.83 38 Chao Liu Dept. of Computer Science University of Illinois-UC Urbana, IL 61801 chaoliu@cs.uiuc.edu abnormality advances afshar aiken algorithm algorithms analysis appear appendix applied automated backtrace based bayesdebug bayesian because behavior being belief biennial boyapati bugs canadian cause chains cheng classifiers clearly closed clospan code common compared computational computer conf conference constitute control copy data databases debugging demonstrated design developed discovery dmkd dynamine effect effectiveness engineering error evaluation experiment fast fault finding flow following foundations from functions future ging gives graphs greiner have heckerman histories icdm ieee immediately implementation induction information integral intelligence interesting international isolating isolation issta jaroszewicz java jordan khurshid knowledge known korat language large leads learning lemma liblit livshits localization logic lozier marinov mathematics midkiff miner mining model myagmar naik nearest neighbor network networks neural nips noncrashing numerical olver operating osdi pages part parts paste patterns pldi positives predicates problem problems proc proceedings processing programming programs proof prove proved queries references reiss related relative remain renieris revision sampled scalable scheffer sequential siam sigkdd sigplan sigsoft sixth sober society softw software solve special statistical studies suppose symp symposia system systems testing that then theorem therefore this three tool trans unexpected well which with work zeller zheng zhou zimmermann http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.86 131 Improving Nearest Neighbor Classifier using Tabu Search 
and Ensemble Distance Metrics Muhammad Atif Tahir and James Smith School of Computer Science University of the West of England BS16 1QY, Bristol, United Kingdom {muhammad.tahir,james.smith@uwe.ac.uk} accuracy adaboost adaptive algorithm algorithms analysis annals australian automated average bagging barnes binary biopsies blake boosting bouridane breast breiman california cambridge canberra cancer chord classification classifier classifiers cobmining combinatorial combining comparison computation computer computing conference cover crawford data databases decision deviation diabetes different dimensionality distance domeniconi duda duin edition effects ellis engineering error euclidean european evolutionary exeter experiments feature features fifteenth figure forest forests francisco frank freund functions general genetic geoscience german ghosh glover guide gunopulos hart heart hierarchical horwood hyperspectral icdm ideal ieee igarss improving individual information intelligence intelligent international ionosphere irvine ishii iterative jain journal kaufmann keogh kohavi korycinski kudo kurugollu learning lecture lncs locally machine manhattan merz method metric metrics michie minimize mining morgan morgankaufmann multiple musk nbayes nearest needle neighbor neural notes operations optimization orsa paredes pattern peng power practical practitioners predictors proceedings programs proposed prostate quinlan random rate raudys raymer recognition recommendations reduction references remote repository research review sait sample scene schapire science search select selection sensing simulataneous sixth sklansky small society sonar spiegelhalter squared standard statistical subsets symposium table tables tabu tahir taillard taylor techniques that theory through tools transactions university user using variations various vehicle vidal wdbc weighted weigthing werra wiley with witten york youssef zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.24 130 
Automatic Single-Organ Segmentation in Computed Tomography Images analysis applications applied approach automatic based building comparison computer computeraided conclusion conference context correctly data detection diagnosis extended figure furthermore future graphics have icdm image images impact information intensity interaction interest international liver medical mining moreover multiple olabarriaga organ organs original other perform preliminary presented proceedings processing proposed references reporting results sahoo scans segmentation segmented sensitive show signification since sixth smeulders soltani specific survey techniques that thresholding tools vision volumetric well while will within wong work http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.163 21 Turning Clusters into Patterns: Rectangle-based Discriminative Data Description Byron J. Gao Martin Ester School of Computing Science, Simon Fraser University, Canada bgao@cs.sfu.ca ester@cs.sfu.ca addison agrawal algorithmica algorithms analysis appl application applications approach approximating approximation arbitrary associated automatic axis baeza based become berman blake blue boston breiman broad cambridge carr class classification cluster clustering clusters combinatorial communications company complete completeness comput computers concept concise conference cover covering dasgupta data database databases describing description descriptions dimensional discovery discriminative doddi eckstein ecml ester feige find first formats framework freeman friedman from garey gehrke generalization generalized guide gunopulos hammer hard harlow high hochbaum icdm imielinski inductive inference information integrating international into intractability johnson journal knowledge konjevod kumar labeled lakshmanan last learning least longman machine mannila manuscript marathe masek maximum mendelzon merz mining modern most motivated narrow nearest nediak neto objects olshen optim output pages paghavan 
paradigm parallel patterns perspective pods polygon polygons problem problems proceedings publishing queriable query ramesh rectangle rectangles rectilinear references regression relational repository represent require research retrieval revisited ribeiro right secondgeneration sense serve sets sigmod simeone sixth soda some step stoc stone stored structured study subsets subspace such summarization system systems table that theory therefore this threshold trees tuples used vldb volume wadsworth wang wesley which with yates york zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.143 139 Semi-Supervised Kernel Regression* Meng Wang, Xian-Sheng Hua, Yan Song, Li-Rong Dai, Hong-Jiang Zhang analysis annals applications artificial bandwidth belkin better blum both bousquet cald carnegie chawla choice classification colt combining computational computer conference consistency convergence could data dependent detetion devroye documents edinburgh ensembling equivalence errors estimate estimating estimation fields from functions gaussian ghahramani global graph graphics graphs hall harmonic heberg herrmann icdm icml ieee ijcai inference intelligence international jerryzhu journal kernel krzyzak label labeled lafferty lahiri large learning literature local longrange machine madison many matveeva mccallum mellon mincuts mining mitchell models nadaraya networks neural nigam nips niyogi nonparametric object oles planning polzehl probability problems proceedings propagation references regression regularization report rosenberg sankhya schneiderman scholkopf seeger self semi semisupervised series short sixth smooth statistical statistics supervised survey tang technical text than theorem theory thrun training university unlabeled using value vision watson weston wisc wisconsin with workshop zhang zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.40 132 Comparisons of K-Anonymization and Randomization Schemes Under Linking Attacks Zhouxuan Teng and Wenliang Du 
Department of Electrical Engineering and Computer Science Syracuse University, Syracuse, USA Email: zhteng@syr.edu,wedu@ecs.syr.edu agrawal alamos alberta american anonymity answer association august bias canada china conference confidentiality dallas data dewitt disclosing disclosure discovery domain duncan edmonton efficient eliminating enforcement evasive evfimievski full gehrke generalization haritsa hong icdm incognito information international journal july june keller knowledge kong laboratory lefevre maintaining management march mcnulty mining national optimal pages preserving privacy proceedings protecting ramakrishnan randomization randomized references report response risk rizvi rule rules samarati sigkdd sigmod sixth srikant statistical stoke suppression survey sweeney technical technique techniques through using utility vldb warner washington when zhan http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.5 88 A Framework for Regional Association Rule Mining in Spatial Datasets accomplishments addition advances agency agrawal algorithm algorithms allows already also analyze application approach aquifer arsenic assessment association assumed barbara based bases been belong between board bromide buneman california cancer capabilities capability case centers central chawla chloride class classes cluster clustering coast coldspots computer conclusions conference confirmed consortium contamination created critical currently data databases datasets department deserve design desirable development different discharge discovering discovery discussed divisive drinking editors egenhofer eick elevated employed environmental european evaluated evidence exploration factors finite first fitness framework franzosa from function fundamental further furthermore future generated geocomputation geographic geographical geological geoscience giscience given global globally goodchild granularities granularity grid ground groundwater gulf hall have health herbert hierarchical 
high home hotspots houston http hudak hypotheses icdm identified identify identifying imielinski implications important index information integrated interesting interestingness interior international introduce invited isbn issues items jajodia jiang journal kaufmann keynote knowledge koperski large laws levels literature management master maximizing measures metals method mexico mine mining moreover morgan move muntz named national natural needs nitrate novel observed obtained offer openshaw other pages paper parker particular particularly patterns perspectives peter pkdd plains pointset practice prentice presented principles proc proceedings program proposed protection provide provided providing quality real rediscovering references region regional regions regulation relations relationships relying report reported representative requirement research researchers resources results reward rhyolitic rich risk risks rule rules santa science scientific scientists scmrg searches sediments sets several shekhar sigmod sixth smith source south southern spatial speech state statistical step sting structure studies study subregions such supervised supply surface survey sushil swami symp systematically systems talk technical techniques tertiary texas that then thesis this tools topological tour toward trace transaction twdb twentythird unfortunately univeristy university unsatisfactory using vaezian versa very vice volume wang water waters what which with work world zeidat zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.101 146 Manifold Clustering of Shapes abbasi adaptively adiac adjusting again algorithm algorithms allows also analysis approach approaches approximation automatically balasubramanian based basic bayer better birds bisomap both bounded branches bridging building bunke called challenge chan christobal ciobanu class classes cluster clustering clusters combines coming compactness compared computing concept conclusions conference connected consistent 
content convexity current curvature data databases dataset decreases degree degreebounded demonstrated densities detect deviation diatom dimensionality double droop effect efficient efforts elements embedding envision examples existing extended features fischer forms framework from further geometric global graphs have head hybrid icdm iciap identification identified ieee image importantly intelligence international intrinsic introduced invariant invariantly isomap isometric juggins kittler langford learning linear local locally long machine manifold many marathe method metric minimum mining modification mokhtarian more most multi multimedia neighborhood noise nonlinear objective outperformed pacheco pages papadimitriou pattern pech presented preserve problem problems proc proceedings ravi reconstruct reduction references related reside result retrieval robust roerdink rosenkrantz rotation rotationally roweis salesman same saul scale scaling schwartz science search shahbazkia shape shifting should significantly silva single sixth smaller solution space spanning stability stone subsequently symposium targeted tenenbaum that theory this through time topological towards transactions transformations travelling tree trees twice using vazirani which with work workshop yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.2 91 A Balanced Ensemble Approach to Weighting Classifiers for Text Classification aaai accuracy active adaptive algorithm algorithms american analysis annual applied approach approaches apte artificial australia automated automatic baeza balance based bayes benchmarks berkeley boostexter boosting buckley california case categorization chemnitz classification classifier classifiers clustering cohen combination combining committees compare comparison components computer computing conference confidence context contributions croft damerau data debole decision development documents ecir ecml edition editors effective effectively effectiveness ensemble 
estimates european event examination existing feature features filtering florence formulate four frank freakes function germany giacinto global goetz hall hampp hardness heterogeneous icdm iciap ieee image indicated information intelligence intelligent international internationalacm island italy jain joachims johnson journal kaufmann kegelmeyer kevin kinds larkey learning liere local louisiana machine machines management many maximizing mccallum melbourne meta methods mining models morgan moschitti multiple naive national necessary nigam oles optimal order orleans other outperforms pages parameter pattern perception performance philadelphia philip pisa practical prentice proceedings processing providence rasmussen references relative relevant research results retrieval reuters rhode rocchio roli salton schapire science seabastiani sebastiani second selection sensitive sigir significant singer singhal sixth society structures study subsets support surveys switzerland system systems table tadepalli techniques technology term test text that three tois tools tpami transactions tuningfor usability using vector weight weighting weiss with witten woods workshop yang yates zurich http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.34 122 Cluster Based Core Vector Machine M Narasimha Murty S K Shevade abbreviations acknowledgments adult advances august available badou balls belonging binary birch boundaries buffer burges cambridge canada case cauwenberghs cbcvm cheung cjlin clarkson class classes classification classifiers classifying clustering clusters comparatively compared computational computer conclusions conference core cover csie data databases datasets decision decremental department depend dependency different dimacs distributions does done editors efficient experiment experiments extended fast feature february figure first forest francisco from gaussian gave generalization generates geometry grant handle have hence here hierarchical hong html http icdm ijcnn 
implementation including increase incremental indexing induced information institute international irrespective ivor james john journal june keeping kernel kong kwok large learning least letters libsvm libsvmtools like linear livny machine machines management mangasarian merits method methods ming minimal mining montreal musicant neural newyork nonlinear number observed optimal optimization orderings original page pages paper parame parameter partially pattern percentage performance platt plots poggio points press proc proceedings process processing proposed providing proximal ramakrishnan real reduced references relations report reported required research results rsvm sample samples satyam scalability scalable scaling scan scholkopf science seconds selected sequential services sets shown siam sigkdd sigmod single sixth size slower smola some sons space squares statistical structures study support supported suykens synthetic table taken technical technology ters test thank that their theory these this time tolerance trained training tsang type university used using usps values vandewalle vapnik vector vectors very washington when wiley with within work workshop world would yang york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.119 39 On the Use of Structure and Sequence-Based Features for Protein Classification and Retrieval Keith Marsolo and Srinivasan Parthasarathy Department of Computer Science and Engineering The Ohio State University Contact: srini@cse.ohio-state.edu able academic accuracy acid acids adaptive advances against aggarwal algorithm alignment alto altschul amino amoglu among amount analysis analyze anang anomalous appear applicability applications apply approach appropriate associate astral atoms attempts august aung barr barrett base based been behavior being bentley between bhattacharya binary biocomputing bioinformatics biol biology blast blocks bonn boon boston both brenner chapter characterize choice chothia choy cikm 
circumstances class classification cluster clustering codes combine combining come comm common comparable compared compendium computational computing conceptual conf conference could course create critical crucial data database databases datasets date defect defects describing descriptors detecting detection direct distributed domains dynamic dynamics easily edition editor effective efficiently either emphasizes employ engineered examining expected extend extraction fact fast find focusing fold folding frames francisco frank freund from further gapped general generate generation genome genomics germany given grant have henikoff hidden hinge home homologies homology hope horizon however hubbard hughley hybrid icde icdm icml ieee implies improve improvement include incomplete increase index indexing individual information instance interested intermediate international into intractable investigation jeong kahveci karplus karypis kaufmann kernel kernels knowl koehl kuang large larity larson lattice learning leslie leveraged levitt limited limits lipman machine machines machiraju madden mallat manner manufacture markov marsolo massively material matrices mehta method methods miller minimal mining model models molecular molecules more morgan most mote motif move multi multidimensional multiple murzin need nips noble nothing nucleic object online optimization order other over overall pacific page pages palo pande parthasarathy pathways pbpm performance platt pnas possible practical prediction presented press previously problems proc proceedings processing profile programs progress protein proteins provide psist querying ramamohanarao rangwala rapid recognition reconstruction references relatively remote represent representation research researcher results retrieval runs scale schaffer scientific scop search searching semi semiconductors sequence sequences sequential sets settings several shirts short siam siddiqi signal silicon simi simply simulation simulations 
simultaneous since singh single sixth small snow some spurious static string structural structure structurebased structures substitution substructure successfully such suffix suited summaries supervised support symp tackle technique techniques than that then there these they this those through throughout tools tour towards tracking training trans trees tremendous tung used useful using variant vector volume wang wavelet wavelets were weston where which while wilkins with witten work world would yang york zaki zhang zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.51 124 Detecting Link Spam using Temporal Information Guoyang Shen1,2*, Bin Gao1*, Tie-Yan Liu1, Guang Feng1,3*, Shiji Song2, Hang Li1 aaai able accurately acknowledgment active advances adversarial aggarwal along also amit among analysis annu annual apply artificial automatic axis bases basis been benczur better blogspace both bottom brazil buckets build burges calado carvalho changes checked chiba chirita claim classifier clickthrough collective combating comment comments computer conclusions conducted conf conference content correct costa csalogany damn data databases davison deal defined detect detected detecting detection detector development digital duplication dynamics easily edinburgh effective eight employ engine engines evaluation existing experimental expired fail farm features feedback fetterly figure first five found four france from fully further future gade garcia granka graphs guidance gyongyi have icdm idea identified identifying implicit including indeed indicates information integrate intelligence intend international interpreting involved japan joachims kazama kernel kimura krishna large larger learning lecture level library link links lkopf long looking majumder making manasse mapsfo method methods mining model molina more moura much najork nature need nejdl nepotistic network networks noise notes ntoulas ones other others pages paper paris pedersen phrase plan powerful 
practical press probability problem proc proceedings project proposed qualitative rangan recognizing references removal report research results retrieval saito salvador same sarlos sato scale scales science scotland search show sigir site sites sixth small smola spam spamrank specifically stanford statistics strogatz suggestions support suspicious taxonomy technical technologies temporal thank that their them this those through time trackback true trustrank uher updated used using vector very watts webpage websites which while wide with without work works workshop world ying ytilibaborp http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.96 63 a acker addison against aggregating algorithms allowed also anneal application approach artificial association associative assp attributes australian auto average based bayesian best between breast breiman caep cars cases class classification classificaton classifier classifiers cleve cohen combine compared comparisons complex computer computing concept concepts conf conference context continuous cpar dasarathy data dataset datasets decision diabetes discovery discretization disjuncts domingos dong dougherty effective efficient efficiently emerging empirical enhance error execution explore fast fayyad feature figure finally firmed framework friedman from future gain generalization genetic german glass goldberg harmony heart hepatitis holte horse hypo icdm ieee induction instance integrating intelligence international interval intl introduction ionosphere irani iris james joint kamber karypis kaufmann knowledge kohavi kumar labor lazy lazydt learner learners learning library lippmann lymph machine magazine mathematics maximum menlo mining more morgan most multi multiple national nearest neighbor nets neural olshen only optimization order other outperformed pages park pattern patterns physics pima porter predictive press problem proc proceedings programs proposed publishers quinlan rate rates realistic references regression 
relationship repository review ripper rule rules sake scenarios science search seconds selection sensitive siam sick simple singer sixth size small society sommerfield sonar statistical steinbach stone table techniques them three times tools tree trees unified using valued vehicle wadsworth wang waveform wesley wiley will wine with wolpert wong zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.54 101 Direct Marketing When There Are Voluntary Buyers applied approach area association characteristic class classifiers conference cost curve customer customers data database direct discovery domingos dougherty explorations from general giudici hanley html http icai icdm imbalance institute international japkowicz john journal kdnuggets knowledge kohavi learning library lift ling machine making marketing mcneil meaning meetings metacost method mining model modeling network novel operating problem problems proceedings radiology receiver references response richardson rules sensitive sigkdd significance sixth solutions sommerfield sons strategies system tech true under using value wang wiley yang yeung zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.117 120 Object Identification with Constraints Steffen Rendle, Lars Schmidt-Thieme Department of Computer Science University of Freiburg, Germany steffen@rendle.de, lst@informatik.uni-freiburg.de accuracy acknowledgements active adaptive additionally agglomerative alberta algorithm algorithms almost also although altogether application approach approaches attribute average basu because been best bhamidipaty bilenko boston bureau camera cannot census chapter chaudhuri class classic classical cluster clustering cohen collective commission comparison complete conclusions conference consequently considering constrained constraint constraints convergence could current data databases dataset davidson decision deduplication dependences detection dhillon different dimensional directly discovery division domain domingos 
drawn duplicate duplicates edmonton efficient effort empirical engineering erusae european evaluating evaluation even experiments extended faster figure framework funded fuzzy ganti generalization grant graph guiding handle have hierarchical high hope icde icdm icml identification ieee improve increasing independent information instances integration interactive international iterative japan kernel knoblock knowledge known kulis large learnable learning less link linkage machine match matching mccallum measure measures media mediated methods mining minton model models mooney more motwani much must nigam normalization notably number object objective online outperforms overall pages pairs pairwise pakdd part porto portugal practice presented press principles probabilistic problem problems proceedings process product programme project proportion proposed randomly ravi record reference references report research results richman robust sahami sarawagi satisfaction satisfies seattle semi semisupervised sets shopping show shown sigkdd similarity singla single sixth small society solve sponsored state statistical string structural sufficient suggest supervised supervisor task technical technologies tejada than that theoretical there this tokyo traditional training transformation under ungar unknown used using utilizes utilizing washington weights well when which winkler with work xmedia york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.52 92 Detection of Interdomain Routing Anomalies Based on Higher-Order Path Analysis ability able abnormal accomplish acknowledgments advances agrawal algorithm algorithmic algorithms amen analysis anderson announcement anomalous anomaly antc applications approach april artificial association attack august author authors based beacons behavior best bethesda between bipartite blackout blackouts both buneman bush caesar categorize cause characteristics chordless classification classifying communications computation computer computing 
conference confidence considering control counterterrorism cowie cycles data databases datasets degree department dependable detect detection determine diestel discovering disruptions document during dynamics editors employed enforcement enumerating evaluate event events explore extremely failures feigenbaum first fisher flapping forensics framework further galitsky ganiz grace grant graph graphs griffin high higher holzman hopa http icdm identifying imielinski impact important incremental information infrastructure instance institute integrated intelligence international internet intrusion invalid ipam isbn items iwdc jajodia japan jesus journal justice katz knowledge kontostathis kruegel large learning lecture likely linear link london lord mahajan major management mankin march massey matchings maximal maximum measurement messages messiah mining misconfigurations mutz nanavati national necessarily network networking networks notes number numbers observation observing official ogielski order oregon output pacific pages paragraph part path paths perfect performance period plan planes points policies position pottenger premore press proceeding proceedings processing project proposed purous recent references renesys report represent research results rexford robertson roughan route routing rules savior science security sept sets show siam sigal sigcomm sigmod sign since sixth slammer slides sliding smith society software sound spects springer starts storm stress subramanian supervised supported surge swami symposium system systems tech technique text textual thank that theory these this those through time tools topology trouble truth under understanding underwood university uoregon update updates useful using valeur verlag view views wang washington well wetherall what william window wishes with work workshop worms yang yeshua zhang zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.125 23 Personalization in Context: Does Context Matter When Building 
Personalized Customer Models? aaai aaltonen abowd accuracy active actually adaptations adaptive adomavicius advance affects although analysis appear application applications applied approach archaeological archaeology artificial assistant aware awareness banavar barnes based behavior believe berry bettman black bovey brand brown build byun carried century ceres chen cheverst choice collected college communications complexity computer computers computing conceptual conclusions conference constrained construct constructive consumer context contextual contextually contingent counter currently customer customers dartmouth data davies decision department deriving described design destributed devices dictionary different difficulty digital directions discovery disseminating does driven dull dynamic ebling effects effort elements engineers english enhanced enquiry environment environments especially examples experiments expert exploiting exxon fieldwork finally finding findings first flaschbart forever framework franklin friedman from further gadget gaffney generalizable generate generated given glasgow goal group hall hastie hill history hope hopper hosts however human huuskonen icdm ieee incorporating individuals industrial industry inference information intelligence intelligent interacting interaction interest international into jack jiang jones journal kannry keep kingdom klein knowledge kotler kotz laboratory lancaster language learning lehikoinen leusen lilien limited linoff location london long luce lussier makes making management maps marketing marketplace mcgraw media menlo merriamwebster mining mobile models moorthy more morse multidimensional multimedia network obtaining office olshavsky online other oxford paper park pascoe payne personal personalization personalized perspective perspectives popular population possible prediction prentice presented press prior proceedings processing prototyping provide publishing purchasing rapid reality recommender references 
report reported representation research rich rodden rule ryan salber sales sankaranarayanan schilit science sciences scientific scientists scotland section segmenting several should singapore sixth some spring springer springfield statistical stern studies subsequently such support supporting survey symposium systems tampere task technical technique techniques tedeschi term that theimer their thesis this tibshirani tkde today toolkit topic transactions tuzhilin twentieth type types ubiquitous unfortunately united university used user using utilizing validation value various verlag volume ward wearable webster when whenever which wiley will with workshop yadav york your http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.46 100 COSMIC: Conceptually Specified Multi-Instance Clusters Hans-Peter Kriegel Alexey Pryakhin Matthias Schubert Arthur Zimek Institute for Informatics, Ludwig-Maximilians-Universitat Munchen, Germany http://www.dbs.ifi.lmu.de {kriegel,pryakhin,schubert,zimek}@dbs.ifi.lmu.de about academic acta additional alignments along amount analysis ankerst another approach area attributes average behavior best between biol biotech blake breunig bruynooghe calculated called characterize china class cluster clustering clusterings common companies company comparable compared comparing comparison competitors complete computation computer concept concepts conclusion conference contact cosmic croft data databases demonstrate densitybased department deriving described describes descriptions disclaimers displayed distance dobson doig efficiency eiter elapsed emin employees employing employment enzyme even experiments extracts figure flach formal foundations from ganter gartner generates hettich hierarchical hierarchies hierarchy icdm icml identified identify informatica instance instances interesting international kamber kernels kowalczyk kriegel larger lattice lattices lazy leaf learning least like machine magnitude mannila marginal mathematical measure measures 
member menu merz method methods metric might mining more multi multiinstance multiple nanjing newman objects observed only optics ordering orders pages pakdd paper partners pattern plot point points polynomial precise predicting press problem proc proceedings proposed protein pryakhin putable quarter ramon reachability recruit reference references remaining report repository represent requires result results running runtime runtimes sander sanderson scales schubert science second sets several showed sigir sigmod sixth smola solving space spent springer step steps structure survey technical techniques technology text than that their this thus time took trying university used wang websites well were when which wille with without zhou zucker http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.98 147 Linear and Non-Linear Dimensional Reduction via Class Representatives for Text Classification achlioptas acknowledgments advances algorithm algorithmic algorithms also alto analysis annual antonellis application approach approximate approximation approximations association automated averaged background baltimore based beach berlin berry bodossaki boley brodley california cambridge candidates carlo castelli categorization center centroids chen chute cikm classification classifiers clsi clustered clustering collections collective comparison component compressed comput computation computational computations computer computing concept conclude conf conference context cristianini csvd danyluk data dataset decomposition decompositions dept dhillon dimension dimensional direction discussions distributed divisive document drineas during edition editors effective engin examination extraction fast feature february figure flexible formal foundation francisco from functions gallopoulos geist generating golub good grant graphs grouping held help helpful high hirsh hopkins huang hull icdm icmla ieee implementation improving included indexing information intelligent international 
introduction ipam jeon johns johnson kannan karatheodori kargupta kaufmann kernel kernels kind kisiel knowledge kogan language large latent learning least left linear linguistics littau llsf llsi loan local lodhi lower machine machines macro mahoney management mapping massive math mathematical matlab matrices matrix mcsherry measure method methodology methods mfir micro mining mmodapte model modha mohsumed monte morgan morristown multidimensional natural network neural newport nicholas noise nonlinear ostrouchov pages palo park part partial participation partitioning patras pattern pedersen performance powerful preprocessing presence press principal problem proc proceedings processing professor provides pulatova purposes quality rank recent recognition reduced reduction references regarding report representation representations representatives requires research retrieval right rosen routing runtimes samatova scalability scheme school schutze science sdair search sebastiani semantic sets shawe siam sigir similarity singular sivakumar sixth smodapte society softw spaces sparse spotting springer squares stanford statistical statistics stewart stoc strategies structure study substrate summer support supported surveys symposium systems taylor teboulle technical term text texts thank thanks that thesis thomasian thresholding time tool toolbox topic trans univ university upper using value vector vegas verlag versus very wang washington weigend when wiener with workshop yale yang york zeimpekis zelikovitz zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.74 42 Frequent Closed Itemset Mining Using Prefix Graphs with an Efficient Flow-Based Pruning Strategy H. D. K. 
Moonesinghe, Samah Fodeh, Pang-Ning Tan Department of Computer Science & Engineering Michigan State University East Lansing, MI 48824 (moonesin, fodehsam, ptan)@cse.msu.edu able achieve acknowledgements advantage algorithm algorithms almaden also analysis anonymized aspects association associations authors barry bastide best billions both burdick calimlim called candidate capable charm clearly close closed closedness closeminer closet code codes college conclusions conference current data database databases dataset derived detecting diffsets discover discovered discovering dmkd easily effective efficient efficiently either employs even example extend factors fast faster figure fimi flow fodeh fpclose frequent from future gehrke generation global globally gouda grahne graphs grateful grow henry hsiao html http human icde icdm icdt implemented international introduces itemset itemsets lakhal large larger leverages like local lucchese mafia mahanta many maximal measured medical medicine memory michael michigan mine mining moderate mohammed moonesinghe moreover network orlando other paper pasquier patterns percentage perego performance pgminer plan positive prefix prefixgraph proc proceedings projected proposed providing prune pruning quest quite rapidly references remaining report representation representations requirements rules searching secondary sequential several sharing shows sigkdd sigmod singh sixth size source state strategies structure such syndata synthetic table taouil technical techniques thank thanks that their theorem these this tidsets tkde total transactional transactions tree trees university usage using vector vertical wang webdocs were with without work would zaki zaroukian http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.61 85 Diverse Topic Phrase Extraction through Latent Semantic Analysis absence abstracts after algorithm algorithms also ambiguous american analysis another applying approach archive artificial average balance balanced based 
better bold called carbonell chiba clearly collection combination comparison complete compute conclusion conference contain croft crude data deerwester definition describing developed different difficult diverse diversification diversity document documents dumais each engineering equates evaluate every extract extracted extracting extraction find finding first from furnas generally generate goldstein harshman hierarchical higher however icdm illustration improve improving indexing indicates information intelligence interest international ishizuka japan journal judged keyphrase konstan label labels landauer language latent later lausen lawrie learn learning less letter list listed lists magerman marcus mcnee means measure method metric minimal mining mlsaf more must mutual national natural needs news number obvious only order over overlooked paper parsing phrase phrases precision present presented proceedings producing properly propose ratio read reader recall recommendation references relevance reordering reranking result results retrieval reuters rosenberg satisfying science semantic show shown shows sigir significantly simple since singapore single sixth society statistics successfully summaries summarization systems table techniques test testing than that their then they this through topic topics trade trivial turney unsupervised user using values watf weighting well where whether which whole wide without words world ziegler http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.79 94 GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space Huahai He Ambuj K. 
Singh Department of Computer Science University of California, Santa Barbara Santa Barbara, CA 93106, USA {huahai, ambuj}@cs.ucsb.edu acad academic acknowledgements addressed algorithm algorithms alon also ambuj analysis approaches apriori automatic barbara based bernstein berretti bimbo biokdd blocks building bystroff california chapter chemical chklovskii closed closedvect closegraph code complex compounds computer computing conference conserved contact content data databases department developed discovery edition efficient efficiently estimating experimental feature frequent from grants graph graphrank gspan gudes helma huahai huan icde icdm ideker ieee importance indexing inokuchi intelligence interaction international isomorphism itzkovitz jiawei karp karypis kashtan kelley knowledge koutroumbas kramer kuhn kuramochi labeled like machine maps matching maximal mccuine milo mining modeling models molecular motifs motoda multiple natl network networks october pages part partially pattern patterns performance practical presence presented press principles prins problem proc proceedings protein providing quality raedt rahm recognition references relative report results retrieval santa schema science searches second semistructured shao sharan shen shimony significant simple singh sittler sixth smyth space species spin statistical subgraph subgraphs substructure substructures supported survey suthram technical techniques thank that theodoridis trans uetz university usefulness validated vanetik vector vectors vicario vldb volume wang washio white work would xifeng yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.138 69 SAXually Explicit Images: Finding Unusual Shapes Li Wei Eamonn Keogh Xiaopeng Xi Department of Computer Science and Engineering University of California, Riverside {wli, eamonn, xxi}@cs.ucr.edu academic access adaptive affecting algorithm algorithms allows analysis andre annual anthropology anytime appear approach arbitrary archaeological 
archive arezzo artificial athitsos badal based bases benchmarks bentley bergstrom bioinformatics biol biology brady brien british buhler butterflies castrejon castrejonpita chakrabati chiu chronic cladistics clark cliffs columbia components computational computing conference contour csit dana daniel darwent data database databases dataset davies demonstration description development digital dimension dimensional dimensionality dincklage discords discovery discrete distance drawings drosophila efficiently empirical englewood euroconference european exact experimental extracting facility fast feature files finding finney fractal free from galan gapped garcia genetics gibson grass guarantees hall hashing hepatitis http huang icdm ieee implications indexing indyk information instruments intelligence international internet invariance invariant issues jakimoska japanese jill jonsson journal jsai karolina karp kasetty keogh kitaguchi knowledge landrum lankford large larson leslie line local locality loci lonardi loncarin lyman machine management maneva massive math measures mehrotra melanogaster microscopy mining modeling molecular monitoring morphbank most motif motifs motion motwani multidimensional narayanan need network novel nystrom oriented otterloo paleoindian palsson parameter pattern patterns pazzani philip phylogenies pita points practicalities prentice preserving press principles probabilistic proc proceedings processing projectile projections provable publications quantitative query querying quintero raghavan random ratanamahatana recognit recognition reconstructing reduction references representation representations research review rotation sarmiento science scientific search searching sedgewick series shape shapes siam sigart sigkdd sigmod signature similarity sixth slator society sorting southeastern spaces states streaming strings subsequence survey symbolic symposium systems tanaka techniques theory three time tompa tools towards tracy trait uehara under 
understanding united university unusual useful using vagena vast vempala very vision visually vlachos wabi wilke wing wings with workshop zhang zilberstein zimmerman http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.26 87 Belief Propagation in Large, Highly Connected Graphs for 3D Part-Based Object Recognition Frank DiMaio and Jude Shavlik Computer Sciences Dept. University of Wisconsin-Madison Madison, WI 53706 {dimaio,shavlik}@cs.wisc.edu academic accuracy accurate acknowledgements aggbp aggregation algorithm angelov appears application approach approximate approximation averaging avoid backbone based bayes bayesian belief beliefs better biological both branching cases clear codes coding communication comparison computational computer conclusions conference connected cryptography crystal crystallography cvpr dampening data density describe detailed digital dimaio efficiency efficient electron empirical equally errors even exactbp experiments factors feedback felzenszwalb figure finds framework freeman frey from fully gaussian general ghahramani good grant graph graphical graphs group hand handled here highest highly however huttenlocher hybrid icdm ignores ignoring ihler image images important improved improving inference inherent intelligent interesting international interpretations interpreting introduce introduction isard ismb iteration iterations jaakkola jordan kaufman koller large learning lerner lids likelihood loops loopy machine mackay made makes mandel maps matching mateo matrices memory message methods mining mixtures models moderate more morgan most multiscale murphy neal nets nips node nonparametric object objectrecognition over pampas paper part pearl perhaps phillips pictorial press probabilistic proc proceedings produce produces products propagating propagation protein radii random real reasoning recognition reduce references report requirements research result results rhodes runtime sampling saul savings scheme seems serves shavlik should show 
shows sixth small softness solution solutions some sometimes sparse standard structures study sudderth suited supported synthetic systems taking task technical technique term than that then these this though tracing tracking tractable unavoidable unclear under used using valued variance variational variations varied varying very vision visual weiss well where willsky with without work working zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.31 127 Boosting the Feature Space: Text Classification for Unstructured Data on the Web Yang Song1, Ding Zhou1, Jian Huang2, Isaac G. Councill2, aaai achieves adaboost advantage again agent agents algorithmic algorithms american analysis applied approach approachgenerally artificial association automatic automatically autonomous based baseline becomes bollacker boosting both breese bregman categorization chen chunk citeseer classification classifiers collaborative collection collins concepts conference conll crammer craven data dipasquo distances efficient empirical extract extracted feature filtering fourteenth freitag from giles haruno heckerman hofmann iaai icdm identification igsvm implementation improvement increase intelligence interesting international kadie kernel knowledge kudo larger lawrence learn learning logistic mach machines macro matsumoto mccallum menlo methods mining mitchell multiclass nearly nigam note outperforms over pages park performs predictive press proceedings proximal publications references regardless regression retrieval same schapire scores selection sigir significant singer sixth size slattery support symbolic taira text that uncertainty vector vsim webkb weighted well when whole wide with world worst yang york zhang zhuang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.152 51 Stability Region based Expectation Maximization for Model-based Clustering about academic algorithm also analysis analyzers annealing application applications applying approaches artificial automatic based 
basford bayesian been belongie biology blake blobworld bouchaffra boundary california carlo carson chain chiang circuits classification clearly clustering competitive component computation computational computer computing conference control convergence corresponding data databases datasets dekker demspter department dept deterministic documents dynamical easily editor efficient efficiently eighteenth elidan energy escaping estimation expectationmaximization exploited extend extended extensions factor figueiredo finding finite friedman from fundamental future gaussian general genetic ghahramani gibbs graphical greedy greenspan have hidden hinton html http husmeier icdm ieee image improvements incomplete incremental inference information intelligence international irvine itory jain john jordan journal justifies kluwer krishnan krose labeled laird learning like likelihood local machine malik manifested marcel markov maxima maximum mccallum mclachlan means merz method methodology methods mining mitchell mixture mixtures mlearn mlrepository modeling models monte multiple nakano national neal networks neural nigam ninio nonlinear obtaining optimal other pages parameter pattern penny performance pernkopf perturbation plan points popularly potential principal probabilistic problems proceedings programming properties publishers querying rate real recognition reddy references region regions related repos rezek roberts royal rubin saddle sampling schuurmans sciences search segmentation series sixth smem society solutions some sons sparse stability statistical statistics strategies successfully surface surfaces synthetic system systematic systematically systems technique tested text that theory these thesis thrun training trajectory transactions ueda university unlabeled unsupervised used using variants various verbeek view vlassis widely wiley work would yale york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.105 140 Mining Complex Time-Series Data by Learning 
Markovian Models acoustics algorithms also amnesia annals applications april aref asia automata based buhlmann cambridge cerg chains china circle cityu complex components computer computing concepts conference connected continuous corman data database databases dbgroup dense department depth detection directed dong efficient elfeky elmagarmid engine engineering even finding first five fohr foundation gabor graph graphs hellerstein hidden high hong http icde icdm ieee incremental information interesting international introduce introduction journal junqira kamber kaufmann kinds knowledge known kong learning leiserson length letters linear machine march mari markov markovian memory merge microsoft mine mining model models morgan motion national novel nuutila online order other pages part partial partially patterns performance periodic periodicity periods phoneme polynomial power press probabilistic proc proceedings processing project providing rabiner recognition recognizing references report research rivest science search second selected series sheng siam signal singer sixth soininen soisalon speech statistics strongly structures supowit supported tarjan technical techniques temporal than thank this time tishby training trans tsinghua tutorial university unknown used variable variety versatile vlhmm wang wangyi well which wider with word wyner http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.85 117 Improving Grouped-Entity Resolution using Quasi-Cliques aaai across active adaptive additional addresses advances algorithms alias along although american analysis ananthakrishna application approach approximate aspect assume authority automated automatic autonomous barbara based bath been before benjelloun best between beyond bhamidipaty bhattacharya bilenko blocking bollacker borkar boston brown calls capture chaudhuri chen citation citations cleaning clique clustering cohen comparative compares comparison completed computer conclusion conf conference conjunction 
connected connection constraint constraints constructs context contexts control correspondences cosine counterterrorism cross data databases deduplication derive deshmukh detect determine difference different digital dimensional direction disambiguation discovery distance distances distqc doan domain duplicate duplicates each ecdl efficient element eliminating entities entity especially european evaluation examines expectationmaximization experimental experiments exploit exploiting exploits extensive fellegi field fienberg figure focus framework fuzzy ganjam ganti garcia generic getoor giles graph graphs gravano grouped have held hernandez hidden high hong http hull icdm idea identification identity ieee iiweb ijcai improves independent indexing information integration intelligent interactive international into investigate ipeirotis issues iterative java jcdl jiang joins joint kalashnikov kang knowledge kothari koudas labeling large lawrence learning libraries limitations link linkage links main malin marthi match matching mccallum mehrotra merge methods metric metrics milch mining mitra molina mooney more most motwani movie much multitude name names needed network neural never nigam notion online open opendblp ours package pages paper partition partitioning pasula pittsburgh popular precision present presents press problem proceedings process processing proposal proposed proposes purge quasi quasiclique quasicliques ravikumar rdbms recent record records reference references relationship relationships relaxation reldc report research resolution results retrieval robust russell santa sarawagi scalable secondstring security segmentation semantic sets sharing shen shows shpitser siam sigir sigmod similar similarity sixth social society some source sourceforge spurious srivastava stanford statistical stolfo strengths string structure structured study such sunter superimposition support swoosh system systematic systems tasks technical technique techniques testing text 
textual that their theory they this through together toward traditional trend trying type uncertainty under ungar university unlike unsupervised used usefulness using vector verify vldb warehouses warnner when which wide widom with work workshop world worsens zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.156 29 Subjectivity Categorization of Weblog with Part-Of-Speech Based Smoothing Shen Huang1 Jian-Tao Sun1 Xuanhui Wang2 Hua-Jun Zeng1 Zheng Chen1 aaai according acoustics adamic adaptive adjectives analysing analysis annual applied approach approaches argamon association assp augmenting authorship automatic automatically avneri based bayes bell bigrams blog blogosphere brown bruce caaw capturing case categorization category chapter class classification classifications classifiers clustering coling comparative comparison component compression computational computer conference conll corpora correlates croft customer data deep dellapietra dependencies derived desouza development disambiguate divided documents domain down eacl ecir ecosystem election emnlp engineering enhance essen estimating estimation european evaluation event events exploring expressions extraction feature features filtering finn frequency from gamon genre glance global gold gram hara hatzivassiloglou icassp icdm icml ieee information innovative international internet interpolated jelinek katz kirsten kneser koppel kushmerick lafferty language learning levels linguistic linguistics management manual markov mccallum mckeown mercer methods mining mishne model modeling models mood naive narrative natural niesler nigam ninth novel orientation pages parameters part pattern patterns pedersen peng point political ponte posts potentially practice predicting probabilistic probabilities problem proc proceedings processing recognition recognizer recognizing references research retrieval reviews rijke riloff routing schuurmans selection semantic sigir signal similar sixth smoothing smyth source 
sparse speech spring standard statistical stochastic structuring study style subjective subjectivity summarizing symposium systems tagging text theory they thumbs tracking transaction transactions transfer turney unsupervised using view wang weblogging weblogs weibe whittaker wiebe wilson with witten wolters woodland words workshop yang zero zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.77 27 Global and Componentwise Extrapolation for Accelerating Data Mining from Large Incomplete Data Sets with the EM Algorithm Chun-Nan Hsu Han-Shen Huang Bo-Hou Yang Institute of Information Science Academia Sinica Nankang, Taipei, Taiwan {chunnan,hanshen,ericyang}@iis.sinica.edu.tw above accceleration accelerated accelerating acceleration accordingly acknowledgements adaptive advantage aiken aitken algebra algorithm also american analysis another appendix applications area artificial association august based bauer bayesian because becomes besides better between binder block bound burden cannot case chapman check chen circumstances claim class classifier classifiers close cluster comments complement componentwise computing conclusion condition conditions conference consider consideration considering containing converge convergence converging cooper corollary council ctjem data dempster denote dependent derivative derived described describes determine diagonal different direct does dseparated easily eijk eijkeijk elapsed elements equation estimation every exactly example examples experiments extensions extrapolation extreme fails faires feature features fifth figure finding first five fourteenth from further global grant guaranteed hall have here herskovits hesterberg hidden high highly however hsin huang icdm ieee incomplete independent induction inference influence information initialized intelligence intelligent international interscience iobs iteration iterations jacobian jamshidian jennrich joint journal jump kanazawa kaufmann know koller krishnan laird large 
learning left lemma less likelihood likely linear links local louis machine mapping matrix maximum mclachlan meng method methods mining minneapolis minnesota missing model more morgan most much multivariate national networks newton node number numerical observations observed obtain occurs only optimization order otherwise outperform outperformed outside overrelaxed pages paper parameter parameters part partial pearl performed plausible possible predicate preferred preparation previous probabilistic probabilistically probability proc proceedings produce proof property proved provide pwskent quasi randomly rate rates reasoning reducing references reflect research respect results rewritten right roweis royal rubin rules russell salakhutdinov satisfactory schafer science searle second section semi separated separates series served sets should show shows similarly simplified singer sixth society sparse speedup staggered start statistical statistics straightforward such suggest supervised supported suppose synthesized systems taiwan task term thank that then theorem there therefore third this three thus times tjem took train training triple true twentieth uncertainty under update useful using values vanilla variable variables variants verified verify when whether wijk wijkp wijkwijk wiley with yang york zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.49 78 Decision Trees for Functional Variables accuracy accurate acknowledgements addition algorithm algorithms alignment also alternative american analysis animal application applied approach archive association auxiliary base based bases basic bayesian boosting branches bunke case categorization class classification classifier classifiers clustering colored combine conference constructive currently curve curves data databases death decision defines describe designed diez dynamic early editors either example extensions extraction feature features figure finally financial foundation freiburg from functional further 
gaussian generated genkin geurts gonzalez grant group here hettich http icdm icml idea improved induction interest international interpretable intervalbased investigating james journal kadous kandel kernel kudo large last learning learnt length letters lewis library likely limited literals lnai local logisitic machine madigan madiregresurl many methods might mining model more multi multidimensional multivariate nakai national nips noma nonparametric october ones pages papers parametric particular particularly passing pattern pkdd powerful predicting predictive presented problems proceedings processes produce proposed raedt ramsay range recognition references regions representative rutgers sagayama saito sammut sampled scale science scientific segment september series shimbo shimodaira siebes silverman single sion sixth sophisticated sparsely split splits springer standard stat statistical structured studies sugar suggest support supported suzuki takabayashi test text thank that themselves these thesis this through time topic toyama tree trees university used using variable vector verlag view volume weeks wehenkel well work world would yale yamada yokoi york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.7 70 A Novel Scalable Algorithm for Supervised Subspace Learning Jun Yan1, Ning Liu1, Benyu Zhang1, Qiang Yang2, Shuicheng Yan3, Zheng Chen1 Microsoft Research Asia, 49 Zhichun Road, Beijing 100080, P.R. 
China {junyan, ningl, byzhang, zhengc}@microsoft.com Department of Computer Science, Hong Kong University of Science and Technology qyang@cs.ust.hk ECE department, University of Illinois at Urbana Champaign, USA advances algorithm algorithms analysis applications arikawa arimura artae asai automatically background balakrishnama benchmark black boston brief california canada cardie categorization city classification clustering collection college computer conference constrain constrained criterion data databases demonstrated department designing discovery discriminant document domingos efficient even extraction feature framework future ganapathiraju give harbi health high hulten icdm icml ieee incremental information institute intelligence international irvine japan jiang jogan journal kawasoe knowledge latest learn learning lecture leonardis lewis line linear machine maebashi margin martinez maximum means mining neural notes online only optimal outperforms parameter pattern press proceedings processing proof quebec rayward real recognition references repository research robust rogers rose satisfied schr science semiof sigkdd signal sixth speed stream streams structured subspace supervised systems tasks text that torkkola torre transactions tutorial university valued vancouver versus vision visual volume wagstaff weights which williams williamstown with work yang zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.57 50 Discovering Partial Orders in Binary Data academy algorithm algorithms alizadeh alroy analysis applications apply approach archaeology atkins bayesian better biochronology biology boman booth branch callable christof chromosomes combinatorial computational computer computing conference consecutive cplextm data database detail develop direction discrete domains effective engineering european exact experience finding formulations fortelius fossil fragments from gansner genome gionis global graph graphs halekoh hartl helsinki hendrickson http icdm 
ilog integer international interval jernvall john journal junger karp kececioglu kujala latter library like lueker macroevolutionary mammals mannila mapping martin meek methodology methods mielikai mining more mutzel national nemhauser neogene north ones open optimization optimized order ordering orders other pages palazzolo paleobiology partial particular pattern patterns physical pkdd planarity possibly practice probes problem problems proceedings processes property public quantifying references reinelt release saggedsite sawyer science sciences selection sequence sequential seriation siam simple sites sixth smoller software solving sons spectral statistics strategies symposium system systems techniques test testing tree ukkonen unique university unordered using vach visualization weisser wiley wish wolsey world would york zweig http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.94 53 Latent Dirichlet Co-Clustering M. Mahdi Shafiei and Evangelos E. Milios Faculty of Computer Science, Dalhousie University 6050 University Ave., Halifax, Canada shafiei@cs.dal.ca , eem@cs.dal.ca ability academy accessed acknowledgements advances advantage again algorithm algorithms allocation also american analysis annual applied approximate april articles artificial aspect association august automated banerjee based basu bayes bayesian beal been beginning bengio bethesda biclustering bioinformatics biological biology blei bootstrap bottou buntine cambridge canada carlo categorization chain chapman chipman clustering coclustering compare comparison computational computing conference conjugate corpus correlated correlations council dagstructured data datasets december deerwester defined demonstrate dennis denoyer derivations development dhillon dimension dirichlet discovery distribution document dumais each editors efficient empirical engineering erlbaum estimating estimation expectation financial finding following formulae forum fourth furnas gallinari generative ghosh gibbs gilks 
ginius graphical grateful griffiths hall harshman have hierarchical hofmann icdm icml idiap idiaprr ieee improved indexing inference information informationtheoretic institutions integrating intelligence interactive international interpretable itcc jair joint jordan journal june keller kintsch knowledge krumpelman laferty lafferty landauer last latent laurence ldcc learning likelihood machine madeira mallela markov maryland matching mccallum mcnamara meaning measure method milios mining minka mixture mocc model modeled modeling models modha moment monte mooney national natural neural newton ninth nips november nzdsnv nzdsnwdsn oliveira operations other over overlapping pachinko pages papers parameter pdyds pennsylvania performance perplexity pittsburgh platt practice precision press priors probabilistic proceedings processes processing promising propagation proposed quality raftery random recall reduc references report reported representation research results retrieval road royal rule sampling saul scholk science sciences scientific sebastiani section segments semantic series services several shafiei show siam sigir sigkdd simplify sixth society some statistical statistics steyvers subset suppl support survey surveys symbols syntax systems table take tcbb technical techniques tenenbaum terms text their theme these tion topic topics transactions using wdsn weighted weiss where wikipedia with word words workshop wzdsn ydsl ydszdsn york zdsn http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.92 28 Keyphrase Extraction using Semantic Networks Structure Analysis Chong Huang1, Yonghong Tian2, Zhi Zhou2, Charles X. 
Ling3 , Tiejun Huang1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China academy accuracy accurate acknowledgement affects aisb alexander algorithm algorithms analogy analyses analysis apply approach automated automatic automatically available based behavior both bringing cadal caldeira california cancho capable caropreso categorization cecchi centrality chang chengqing china chinese choices cikm cjlin classification classifier cognitive collective comm comments community comparative complex comput concepts conclusion conference csie data databases datasets decide definition develop dickerson different digital directed document dorogovtscv dynamics early eelc effectiveness efficiency efficient either emnlp empirical encouraging enhance entropy euro evaluation exactly exploring expressions extension extensive extraction extractor fast feature features fields fifth files find forman fortunato foundation fourth from function generalize generously giuffrida global graphs greatly grobelnik group growth hershey html http human humphreys hurst icdm icml icnlpke idea ieee independent indicators infomine information international into investigating jinhu june keyphrase keyphrases keywords knowledgebased lack language large latora leaner learning lett lexicon libraries library libsvm lobao london lyon machine machines management marchiori markowetz mcndes metadata method metrics might mihalcea mining mladenic model more moreover msra multiple multiword national nature negatively nehaniv network networks newman number ones order organization outperforms partially pedersen performance peter phrase phraserate phys pnas postscript powerful practical practice press probability proc proceedings process prof project propose publishing references relation report requires research researchers retrieval riverside robertson scale science sciences selection semantic sequences several sharing shek sigman similarity simple sixth small smallworld software 
sole solvable solved statistical steyvers still strogatz structural structure structures study summarization supervised support supported syntactic tarau task tasks technical tenenbaum text textrank texts thankful that their theory this though tomokiyo tool traditional traditionally turney university unpractical unsupervised usefulness using variables vector watts weighted weights well which with witten word wordnet work workshop world written yang yuanning zong http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.36 52 Co-clustering documents and words using Bipartite Isoperimetric Graph Partitioning advantages algebra algebraic algorithm american analysis anderson applications approach based bipartite block boley bottleneck bound buckley categorization centroid certain cheeger cikm circuits classification cluster clustering clusters collection combinatorial computations concepts conclusions conference connectivity control correlating cuts czechoslovak data decomposition demonstrate density development dhillon difference ding distribution document documents dodziuk dong efficiency eigenvalue eigenvalues equations evaluation evolving experimental experiments features fiedler fotouhi found from generation geometry gini global golub grady graph graphs gross guattery hagen hastings hersh hickam hierarchically hopkins http icdm icip icme ieee image inequality information informationtheoretic integrated interactive international isoperimetric john joshi journal kahng karypis kendall keyframe kuijlaars kumar kummamuru lanczos laplacian laplacians large leone lewis linear loan local long longman lower malik mallela mandhani martinus mathematical mathematics matrices matrix method methods miller mining mobasher modeling modha mohar moore morley multilinear multilingual news nijhoff normalized notes numbers numerical ohsumed over pami partitioning performed physics pitman pkdd press problems proc proceedings proposed publishers quality query random ratio references rege 
requires research results retrieval reuters review schwartz scientific segmentation semantic separators series siam sigir sigkdd simon sixth slonim smallest society solution sparse special spectral stability story system systems techical terms test text their theory time times tishby trans transcript transience using value walks webace which wide with word words world zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.110 81 Mining Maximal Generalized Frequent Geographic Patterns with Knowledge Constraints acknowledgment advances agrawal algorithm algorithms alvares amount appice applied approach ares association avoid aware bastide been berardi bogorny camargo capes ceci celik charm chawla closed cnpq computational compute concept conclusions conference considering constraints containing data database databases delivery dependence dependences different discovering discovery domain efficient eliminate eliminates elimination engel european evaluate examples experiments fast figure filtering foundations fourteenth frequent from funded future generalized generate generated generates generation geographic geopkdd granularities granularity hall hierarchical high however hsiao icdm icdt ieee increases indeed information intelligent international ismis itemset itemsets join knowledge known koperski lake lakhal large lattice less level location logic lower lvares malerba many maximal method minimal minimum mining multi number pairs paper partially pasquier pattern patterns prentice presented privacy proc proceedings process program project proposed published quality real reduce redundant references removed replication research result results river rule rules sacrificed saddle sets shekhar showed siam since single sixth solution spatial srikant step stumme summary support symposium systems taouil tends that theory this time tour towards union upper using very vldb water well when which will with without work zaki 
http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.8 34 A Parameterized Probabilistic Model of Network Evolution for Supervised Link Prediction adamic adar addison albert algorithms analysis annual applications approach approaches authors baeza barabasi bioinformatics biological cambridge chaoji chapman chemical collaborations collaborative colt comparison computational conducting conference craig cybernetics data diehl digital discovery discussions eighth enzyme estimation evolution experiments explorations faust field fifth filtering free friends from genomic getoor hall hasan helmbold helping hogg huang icdm ieeecs inference information institute integration international introduction issues jcdl jeong joint kanehisa laboratory learning libraries link linkkdd macmillan mathematical mean mining mixture modern multidimensional neda neighbors neto network networks pages physica planck prediction press problem proceedings random ravasz references research retrieval ribeiro salem scale scaling schapire scientific shubert sigkdd singer sixth social statistics supervised survey thank theory tokyo tsuboi university using valuable vert vicsek warmuth wasserman wesley with workshop yamanishi yates yuta zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.157 141 Temporal Data Mining in Dynamic Feature Spaces able accuracies accuracy adapt adaptive agrawal aguilar algorithm among analysis applied approaches april artificial asocs austin based bases better bookstore both brodley campbell carrier change chapman class classification classifier classifiers clinton college committee communications computer computing concept conceptdrifting concepts condition conf conference constantly construction contexts contextual could crisp data decision defined definitions demonstrate department descriptional desirable discovery distributions domingos drift drifting dublin dynamic edition efficient engine ensemble especially example explorations feature ferrer fluctuating found 
frank frasca from future gama generally giraud good granger groundwork guide half hand happens hidden hughes hulten icdm ieee incremental induction informatics instances intelligent international intl ireland journal katakis kaufmann kerber khabaza klinkenberg knowledge kohavi kolter kubat large learning lncs lower machine made majority maloof martinez mason maturity medas method mine miners minimizing mining modeling more morgan most nearness networks neural noisy note number numerical often onion organizers over pages panhellenic parameters past peeling perform performance perhaps practical predictive presence priority probationary problem proc proceedings promise readily real references reinartz related remains removed report research results riquelme rodrigues ruiz rule scenarios schlimmer science sciences selection september sets several shearer shown sigkdd sixth spencer spss stanley step streaming streamminer streams subject suggesting suspect symposium systematic task tasks technical techniques temporal tested texas textual than that these three threshold time timechanging tools tracking tree trees trinity troyano tsoumakas tsymbal university useful user using utgoff utility very vlahavas wang weighted weighting well were when which widmer wirth with witten work works world would zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.133 137 Fei Wang Department of Automation Tsinghua University Beijing, 100084, P.R.China feiwang03@gmail.com Sheng Ma, Liuzhong Yang Vivido Media (Beijing) Inc. 
Shangdi Development Zone Beijing 100085, China masheng@vividomedia.com.cn Tao Li School of Computer Science Florida International University Miami, FL 33199 taoli@cs.fiu.edu above abscissa accelerate acceleration accurate achieve active addressed advantage affect after algebraic algorith algorithm algorithms alleviate already also amazon analysis application applied architecture averaged bars based been being bergstrom berlin better between borchers both boundary breese burden called cannot case change clearly collaborative commerce compaq compare computation computational compute computed condition conference connected consider constant constrained constraint data datasets define defined definition deshpande details different dimensionality discrete discuss discussion drop dzeroski each eachmovie effectiveness eigentaste elements empirical equivalently error evaluation excos experiment experimental experiments explore fields fighre figure filtering final find fold formulated foundation framework free from fully function functions gaussian geometric geometry germany ghahramani given goal goldberg grants graph grouplens gupta hancock harmonic have heckerman here herlocker however http iacovou icdm identical importance impose impractical improve indeed influence information initial international introduction intuitions intuitively inverse issue item itembased items just kadie karypis keep konstan lafferty large learning linden line lose lululr machine make manually many marlin matrix means measure measures method methods metrics minimize minimizing mining missing more moreover movielens multi namely national natural nearest need needs neighborhoodsize neighbors netnews normalization novel number ones open optimization order ordinaterepresents over paper parameter part partially pattern performance performances performing perkins perspective plot potential predict predicted prediction predictive preferences presented previous problem procedure proceedings produce 
propose provided question rank rated rates rating ratings recognition recommend recommendation recommendations recommender reduction reflects regularization relational requirements research resnick results return returned riedl roeder same sarwar schafer scholk science section seems seen semi sensitive should show shown shows sigcscw significantly similarity since sixth smith smoothness some spaces sparser sparsification sparsified sparsify sparsifying springer start stated stating storage study subsection such sudden supervised supported sushak system systems task techniques test tested that them then theoretical theory there therefore these thesis think this three time toronto tradeoff traditional trans treat true unchanged under university unlike unrated user users using value values variances varies vary vector vectors very want weight what when where whether which whichare will wilson with work york zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.141 80 Semantic Kernels for Text Classification based on Topological Measures of Feature Similarity advances alche also analysis annual approach available background based basili bloehdorn boosting both brazdil budanitsky burges camacho cambridge cammisa cases categorization cepts classification classify combination computational computer concepts conducted conference considerations consistent corpus cristianini critically data databases deliberatly depends development different disambiguated disambiguation discovery editors effect effectiveness employed enns error especially estimate european evaluating examples experiments exploit exploiting extremely feature gama given have hierarchical hirst hotho icdm icml ieee ijcnn impact improvement improvements indicate informaion information inns input intelligent international introduced investigate joachims joint jorge journal kept kernel kernels knowledge kopf large latent learners learning lexical light likely linguistics little lodhi machine machines making 
mavroeidis measures methods mining moschitti negative networks neural other pattern perfectly performance pessimistic pkdd pointed potential practical practice press principles proc proceedings proved question references relatedness representations research results retrieval reuters scale scheme schol search seen semantic sense series shawe sigir simple siolas sixth sizes smola smoothing society sparse springer stable step strategy structure subsets success superconcept support syntactic systematic systems tasks taylor terms text texts that theobald thesauri these this those torgo trail training trec tsatsaronis types university using vazirgiannis vector very weak weighting weikum were where while will with word wordnet workshop zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.128 119 Probabilistic Enhanced Mapping with the Generative Tabular Model able additive advances advantage algorithm ality allow along also alternative anal analyse analysis application applied approach based been believe berkeley bidimensional bishop bivariate carlo choice clarendon class classical classification cloud clustering collections columns combined comparison complementary component computational computer computers computing conference connected constraints construction contents continuous could current curvilinear data datamining dataset datasets demands demartines dempster density descriptive difference dimension discrete discretization distributions document easier embedded embedding enhance equally exact exists extended extensions final finally finite from further future gaussian generative girolami graphics have herault histogram hofmann icdm icml ieee incomplete independent information initialize inside instead intell intelligence interesting international interval introduced john kaban keim kohonen laird large latent learning lebart like likelihood linear local locality locally locations loose machine macqueen mapping maps marginal massive math maximum mclachlan 
method methodology methods minimal mining mixture model models monte moreover morineau most multivariate near network networks neural neuronal nonlinear observations obtain organizing organizingmap original over pages paradigm parameters part particular pattern peel percentile perspective pixel plane plot plots possible preceding preserving press principles proba probabilistic probability probmap proceedings processing projected projection provide quicker random readable recognition reduction references risk rotation roughly roweis rows royal rubin sammon sampling saul science seems self sensitive sets several show similar sixth some sons spaced speaking springer stat statist statistical structure studied such svense symp system tabular take that their then this together tools trait trans transactions under used variables visual visualization visualizing volume wang warwick ways when which wiley williams with work zoom http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.15 11 Adaptive Parallel Graph Mining for CMP Architectures aaai abel academic acknowledgments across active adaptive adaptivitat adapts addison addition address affinity afford after agrawal algorithm algorithms allowing although anthony applicable approaches architectures area associated association available balancing bandwidth based bauer behavior beitrage believe between bogdan brandherm buehrer busch bustamante candidate chemical chen chip coatney common comparison compilers compounds computing conclusion conference constraints consumption context control cook core corporation could customer cycles daehyun data databases decrease dehaspe dependency designers desrosiers developed difference discovery discussions distributed does dubey during dynamic dynamically editors efficient embedding embeddings emerging employ european excellent excessive exhibited experimentation exponentially extended factor factors ffsm finally finding first fischer fragments frequent fritts fundamental galal gaston 
gastonel gastonre generation genetic glowczwskie graph greatly gspan guralnik have herein high holder however icdm ieee imielinski immunodeficiency improve improved increasing influence information inoculation insight intel internation international intravaginal issues items java journal karypis king knowledge krishnamoorthy kroner kuramochi language large leblanc lernen less lifson like limiting lists load logml longman loop macaques machines macromolecules maglothin main make management managing many markatos markup meinl memory miller miners mining minnesota modify mofa motifs multiprocessor multiprocessors naive needed nguyen nijssen obstacles often ogihara only optimizations orlando other outside pages parallel parallelism parallelization parallelizing parthasarathy partitioning pattern patterns perform performance philippsen piatak pkdd poor poorly populations porto portugal practice present presented press principles proceedings processor processors proposed punin quantitative quickstart recombination references relationships replication report require results retroviral rhesus rochester round rourke rousset rules runtimes scalability scale scaleup scheduling scientific sebag sequence sequential sethi sets setting shared shown siam sigkdd sigmod simian simultaneous since sivmac sixth sizes soon specifically stanton state statically step strategies structural structure subgraph substructure substructures such supercomputing swami system systems table task tasks technical techniques termier than thank that their these thesis this through tiling toivonen tools touchpoints towards traditional transactions treefinder ullman univeristy university usage used uses using viral virology virus vivo volume wang webkdd well wesley when where which while will wissensentdeckung with without wolinsky work workshop workshopwoche worlein would york zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.81 114 High Quality, Efficient Hierarchical Document Clustering using 
Closed Interesting Itemsets Hassan H. Malik and John R. Kender Department of Computer Science, Columbia University {hhm2104, jrk}@cs.columbia.edu accessing accuracy achieved agglomerative algorithms analysis application applications apply approach approaches association associations based because beil believe berzal bisecting blanco both brijs cikm closed clustering cluto comparison conclusions conference construction cross data databases datasets defining dimensionality directory discovered discovery document domains ester evaluation experiment figure fihc found framework frequent frequently full fung future generalize gkhome glaros global hierarchical http icdm importance information intelligent interesting interestingness international introduced itemsets journal karypis knowledge kumar large level linearly means measure measures measuring medir merging mining more nchez need nodes nonparametric notion november number ohsu ohsumed optimizing outperforms over paper parameter patterns performance plan possibly presents proc proceedings process proposed provide pruning reducing reduction refer references replacement requiring results reuters right rules runtime scalability scalable scaled searsmith selecting serve showed siam sigkdd significant sixth small srivastava standard state steinbach steps summarizing superior techniques term terms termset text than that theories these this time topic tuning used using validation vanhoof various vectors versions views vila wang well wets with without work workshop worse zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.136 134 Resource Management for Networked Classifiers in Distributed Stream Mining Systems Deepak S. Turaga Olivier Verscheure Upendra V. Chaudhari Lisa D. Amini IBM T.J. 
Watson Research Center Yorktown Heights, NY 10598 abadi acoustics across actual adaptive alarm alarms algorithmic allocation also amini analysis andrade application approach approaches april architecture august balakrishnan balazinska based bayesian because before believe between beyond biases binary boosting both brief careful carney cellular center cetintemel chain chains chandrasekaran chang change characteristics chaudhari cherniack cidr classification classifier classifiers classifying combination comparing complex conclusions conference conservative consideration constraints constructing continuous convey cooper core curve curves data dataflow depends deshpande determine determined directions distances distributed distributing duin dynamically earlier effective ensemble eskesen evaluation examine examining exclusive extend false figure figures filters formulated franklin furthermore future garg given heavily hellerstein highlighted hong icassp icdm icpr ideas ieee improve improving increasing individual interaction interested international introduction investigating jain jiang journal june kalman king krishnamurthy landgrebe leading learning like linear load loadstar lower machines madden making management memory metric mining model more moving multi multiple muntz nature navratil need networked networks nist november observe obvious october olston onto operating optimal optimising optimization optimized ordering over paclik paper park passes pattern pavlovic penalizes performance point points precise principled problem proceedings processing provided queries raman ramaswamy rate real recognition reduces references reiss related relative report required requirements research resource resources right rora scalable scenarios schapire scheme schemes selected selo senator several shah shapes shedding siam sigmod signal sixth solution solutions solve speaker speech stage standard stonebraker stream streams such system systems tatbul technical techniques 
telegraphcq telephony than that them theory there thereby these this through time topologies towards traffic trends uncertain under underlying used using utilization varying venkatramani verification very vldb wang watson where widom with workshop world would xing zdonik zilca http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.90 99 Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems Eamonn Keogh Li Wei Xiaopeng Xi Stefano Lonardi Jin Shieh Scott Sirowy University of California - Riverside {eamonn, wli, xxi, stelo, shiehj, ssirowy}@cs.ucr.edu able about abstract acknowledgements addition african alaska algorithm algorithms allowing almost also although american among analysis appear approximately arbitrary archive arranged artificial asian asiatic associated augmented author authors available aware background basalaj based bases bayesian bear became becoming been before being belongs bill birds black blue bookmarks both briefly browser built buried california cambridge canada capture centered challenging chimpanzee chiu choice choose chose christos classification classifiers click clicked closest clue cluster clusters collection coloring comments como computer conclusions conference confirmed consider consideration content current data database dataset datasets date david decision desktop different directions discovery distance distinct diverse donors download draft dunham dynamic eamonn early edward eighth einoshin emailing encouraging english enhance enough envisioning equally europe even exact exactly examination example except exclude explains extended extensive face faloutsos figure file files final find first folder folders found from further future geffen general geographical gives glyph google graphics greenland have hearst helga here herle hideto hong however html http hunter hybrid icdm iceland icon icons implications inbetween include increasingly index indexed indexing induction information intelligence 
intelligent interaction interesting international internet into introduced intuitive issues italian jessica journal katsuhiko keogh know knowledge kong laboratory languages lapses large latent learning like likewise location lonardi machine major make makes mammals margaret marti martin match matthew medicine medida metric michael mining more most mostly multidimensional name note novel obvious occasional operating option orangutan order organizing other outliers page pages paper papers passage pazzani perfect perhaps pixels place placement placing plug plugins polar porter portuguese potentially preliminary press primary proceedings produce program proof provides providing proximity query question rather reference references reflect reflected related relationship remaining remember report representation reproducible reptiles research results retrieved returns reveals right riverside russia same school scott screen search searching semantic sense sequenced series shieh shneiderman shoots shortly shown sigmod similar similarity simply since sirowy sixth size smart somewhat split standard statement stefano still strategies streaming stripping structure study submitted such suffix superioridade support suppose surprising suzuki symbolic systems takabayashi take task taxonomic taxonomy technique test tests text texts than thank that their thesis they thibetanus this those thus time timeseries told tool tools tree tsdma tufte twentieth types ucla university ursus useful user using utility verificar version very visualization volume ward warp warping when which wild wildcards with wojciech work workshop would written yamada yokoi http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.10 86 AC-Close: Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery Hong Cheng Philip S. 
Yu Jiawei Han agrawal algorithm algorithms analysis anen approximate approximation association avoiding base bases bayardo boulicaut bradley bykowski candidate close closed closet compact conclusions condensed conference core data databases demonstrates dense density derive dimensions discover discovering discovery dmkd dong effective efficient efficiently engineering envelope envelopes error exceeds experimental exploring false fast fayyad fraction free frequency frequent from generating generation given high icdm identifying ieee information international itemsets knowledge kumar large long mannila means mechanism method mine mining model multiple nobel noise pages paper pattern patterns paulsen pkdd positive potentially presence prins proc proceedings propose proposed queries random references representation representations rigotti rules scalable sepp sets sigmod sixth srikant steinbach structure study submatrix sufficiently support symmetric systems technique that this threshold toivonen tolerant trans transaction true uses vldb wang while with without yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.115 116 NewsCATS: A News Categorization And Trading System aachen about account achieve addressed agreements aktienkursen algorithms allan amherst analysis analyst annual approach articles asia assumptions australasian automated banking because becker behavior berkeley berlastung bern between boston california carnegie categorization chemnitz cikm classification comp compare computational concurrent conference cost costs currency daily data database depth describing development diego discovery distributed ecml einfluss einsatz electronic elkan engineering esswein european evolutionary examination except exchange exist falvi favorable features finance financial forecasting forthcoming from fung gecco genetic giampapa gross hawaii headlines heidelberg hochschule hong however icdm ieee illiquidity indices information informations institute institutional 
integrating intelligence intelligent international into intraday investors island issue january jensen joachims journal knolmayer knowledge kong kursentwicklung kursrelevanzprognose kurzfristiger language lavrenko lawrie learning leung lewis likely limited machine machines major management many market massachusetts melbourne meldungen mellon methods mining mitteilungen mittermayer mobile models more most movements nach nalyst news newscats obtained ogilvie pacific paper peramunetilleke performance permunetilleke physica pittsburgh portfolio predict predicting prediction previous price prices proceedings processing profit prognose project proposal prototypes publikation rate rather real realistic recommendation redlich references relevant report representation research response results retrieval return roundtrip rules sankaran schmill schoop schulz science sciences sebastiani sensitive series shaker sigir sigkdd simulations since sixth spiliopoulou statistik stock support survey surveys sycara system systems taipei take technical techniques technology text textual that these thesis this thomas thrich time trading trend trends types under univ unternehmensnachrichten using vector vegas volume wanted washington wider will winkler wirtschaft wirtschaftsinformatik with wong working workshop yang yield york zero zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.109 79 Mining Latent Associations of Objects Using a Typed Mixture Model -- A case study on expert/expertise mining Shenghua Bao, Yunbo Cao, Bing Liu, Yong Yu, and Hang Li able accuracy additional also always amore analysis application apply association associations ausweb based because better both changes clusters collaborative community conclude conclusion conducted conference contexts contributions cooccurrence cooccurrences craswell cvcp data detection devries different diverse documents effectively effectiveness enhance enterprise existing experiments expert expertexpert expertise expertnet experts 
exploit extend figure filtering framework from future hawking help hofmann http htttp icdm improvement indeed indexing information international issue january just knowledge latent learning machine main memo mining mixture model models noptic number object observe occurrence occurrences other outperforms over overview paper performed performs plan press probabilistic problem proc proceedings profnet proposal puzicha references relations report represent respectively results search searching semantic separable sets shows sigir significant single sixth skillview small soboroff statistical statistically studied studying such table tasks technical technique terms tests texts than that this thus tois track trec tsmm typed types unsupervised utilize utilized values various vercoustre verification volume well were when which wilkins with within work http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.137 149 Rule-Based Platform for Web User Profiling adaptive adomavicius aggarwal aizu algorithm algorithms annual applications approaches artificial association based behavior build building challenges chen clickstream clustering commerce competitive computer conclusions conference configured constructed content customer data databases derivation diego different discovery driven during effort eirinaki entire environment evaluating event evolving example expert extracting fast following from fuzzy generated generation hour icdm ieee ijcai immune implemented information integrating intelligence intelligent international internet introduced japan journal knowledge logs management means measuring methods metric metrics middleton mining models more mushtao nara nasraoui navigational networked news noisy obvious online ontological page paper passive pattern patterns period personalization platform proceedings profile profiles profiling proportion prototype proven publishing rafter read recommender recruitment references relational roure rule rules scalable search seattle 
semantics server services shadbolt shahabi sites sixth smyth steps sterne success sugiyama system systems techniques technology template tested that then third this time tkde tois tolle tools transactions tuzhilin user users using validation value vazirgiannis view visitors washington webkdd website werner where wide wiley with without workshop world york yoshikawa zicari http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.73 73 Forecasting Skewed Biased Stochastic Ozone Days: Analyses and Solutions Kun Zhang, Wei Fan, Xiaojing Yuan, Ian Davidson, and Xiangshang Li above accuracy accurate achieve achieved adaboost adopted affects airnet albany alert amongst analysis analyze annual another applied approaches appropriate approximate area assessment atmos atmospheric attempt averaging avoid bagging baggingc base based bayesian behaviors being beneito berlinerc bertuccod besides best better between bias biased boostedc boosting bortnicka breeze buckles cawleye center certain changes characteristics chattertonb chem chemical choice choose class classes classifiers clean commission comparison compromise computational computers concentration concentrations conclusion conference consume control could criteria cross curves daily dangerous data davidson days decision demonstrated desirable developed developing difficult discriminant discuss distribution domingos dorlingb dostalc doyleb dsic dynamic dynamics ebenc ecml effect effectively efficiency efficient eight empirical energy engineering ensemble environment environmental established estimate estimation estimators evaluating exhaustive existing experiences experiments expert experts exposure extended fallen feature features ferri figure figures first flach fold forecast forecasting forecasts formulate forswall foxallb framework from fuzzy gavin general generate ghiaus greigg ground guide guideline hand have health hernndez hierarchical higgins high higher historical hour houston html http human icdm icml imented 
implementation important improve inadversely incremental induction inductive institute intercomparison international iras irrelevant irwinb issue janssen knowledge known kolehmainenf lambeth learner learners learning level life limited line linear logic look machine maximum mcmillan method methods mining mintz model modeling more most national ncdc neither network neural noaa nonexhaustive nunnarid ortega other ozone paper parametric particular parts peforms pelikanc peng perspective phys physical pino pkdd plotted pollution portable posterior power practical precision prediction predictions prior probabilistic probability probabilitybased problem problems procedure proceedings process program provide provided provides provost pruning quality random ranking rankings rather recall recent references regime regression relies report research reversetesting rice richtera rigorous same sample sanderson scenario schlink select selection seven shell shown side significantly similar simple simulates single sixth smoothing soler solution solve space splitting still stochastic study subset success suny surface surprises sustainability svrcek systems technical technique test testing tests texas than that there these this though threshold through time traditional trained tree trees true under university unpruned used using validation vector vectors version vondracekc wang weak well when where with within work years young zadrozny zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.167 59 What is the dimension of your binary data? 
Taneli Mielikainen Aristides Gionis Heikki Mannila about academic accident achlioptas advances algorithms allocation always analysis annual anthony applications arikawa artificial association assortment august barnsley basis batista bayardo becker behavior bennett berlin beyond biclique biggs binary bishop blei board brijs brodley bulletin buntine cambridge case cefet chaos chazelle clique clustering coins columns complexity component components computational computer concept conference correlation cover coverings croatia curse data database databases dayal decisions deflating depends detection diego dietterich dimension dimensionality dimensions direction dirichlet discovery discrepancy discrete dynamical edition editors elements empirical equal estimation european everywhere explorations faceted factorization factorizations faloutsos fast feature fernandes fifth filter fortelius fossil fractal fractals framework frasca frequency frey friendly geerts general geometric germany geurts gibbons gionis global goethals goodness grossman haddad hand have helsinki high hinneburg http hypothesis icde icdm ieee independence independent induced inen information integral intelligence interesting international intrinsic introduction january john johnson jolliffe jordan journal kamel kitagawa knowledge kohavi kolmogorov korn kruskal lang langford latent learning lecture leen liebrock limited lindenstrauss local locations loci machine mammals mannila margins mason matrices matrix method mielika mielikain miettinen mining monson more multi multidimensional multinomial multiple needed negative neogene netnews neural newsweeder ninth nkranz nondeterminism nonlinear nonmetric normalized notes number numbers obermayer obviously omicini onion open optimizing organizers orlando other outlier packing pagel pages palmerini papadimitriou peeling perego perttu pkdd pods practice press principal principles problem problems proceedings processing product profiling projections promising 
properties psychometrica pullman ramamritham random reduction rees references remain report research results rules sbbd scaling scheffer science sciences seems selection september series seung several side sigkdd silva sixth society souza spiliopoulou springer statistical statistics study subsets survey suzuki swinnen system systems tenenbaum texts theoretical theory thomas tiling tracts traina transactional transportation trees tresp tsaparas twelfth uniformity university using vanhoof variables verlag vijayaraman vitan volume wainwright washington wets wiley with workshop world would yannakakis zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.120 115 On Trajectory Representation for Scientific Features access adair aerospace algorithms analysis animated aref authors award baseball body bulletin career chebyshev chen cisrc comparison conference continuously data datasets david delis edbt efficient eggert engineering estimating exploration faloutsos fast features fields fisher flow four framework generalized geometric ghanem gunopulos hadjieleftheriou harpercollins icdm ieee images indexing international jensen jiang kollios korea labs leutenegger like livermore lopez lorusso machiraju major mehta method methods ming mining mobile mokbel motion moving nocedal numerical object objects optimization optimized oria ozsu papadias parthasarathy patterns physics pods polynomials positions prediction predictive proceedings professor providing publsiher queries references relationships research results rigid robust saltenis samsung scientfic scientific search sigmod similarity sixth spatio spatiotemporal springer swirling temporal thank thompson tkde trajectories transformation tree tsotras unknown using vast verification visual vldb with would wright yang yootai york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.161 145 TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases addition agrawal alexander algorithm alternative analysis association 
bayardo between bound brute coefficient combined computational computing conference constraintbased correlated correlation data databases demonstrated dense designed diagonal discovery efficiently experiments exploration faster filter financial find force geometric guide gunopulos icdm imielinski international interpretation items john journal knowledge large magnitude market method mining models monotone orders pages pairs pearson proceedings property provide query references refine rule rules sets sigmod sixth sons strategy swami than that this traversal upper uses wiley with http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.38 106 CoMiner: An Effective Algorithm for Mining Competitors from the Web acquisition adapative additionally algorithm ambiguity ambiguous area arsenal automatic barcelona based beans between billion burst cafarella called champion changes chen chien chin chinese cimino cluster collection column cominer comparative compared competitive competitor competitors computational concepts conclusion conference content corpora cross data definitions differen disambiguate disambiguation discover distinguishing distribution does domain domains downey dynamically effective entity etzioni european evaluate evaluation evolution experimental extraction fact final finally find first fish focused food from fukushinna future generated give growing handschuh hearst here highly however hyponyms icdm improve information intelligent international keyphrase knowitall large league learning limitation linguistics list march match means michigan ming mining mixture model more morinaga name nearly news notebook observation order owing pages paper people periods plan played popcscu previous problem proceedings processdings product proposed provide ranking references reflect reputations result results retrieval returned rice scale search searching second section selfannotating shaked shen show shows sigir since sixth soderland specific staab studied system table 
tateishi teams term texas text that then this time topic towrds tree university unrestricted velivelli wang weld when without work yamanishi zeng zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.103 12 Meta Clustering Rich Caruana, Mohamed Elhawary, Nam Nguyen, Casey Smith Cornell University, Ithaca, New York 14853 {caruana, hawary, nhnguyen, casey}@cs.cornell.edu acknowledgments advances agglomerative aggregation aistats algorithm algorithms american analysis andrew anna anton applications applied applying approach arabie artificial artigas association azran background based basford before bengio bennet bergmark bertrand bias bioinformatics bipartite blake bottleneck bottou boulis bradley brodley cardie career caruana casey christensen class classification classifications classificatory cluster clustering clusters cohen combining comparing computer concensus conference consensus constrained contrasting convergence cornell criteria data database databases dekker demiriz determining dimensions discovery documents donna driven dror dubes duda early ecoregions efficiency elhawary embrechts engineering ensemble ensembles entropy environmental etzioni european evaluation experiments expression fast fayyad feature fellowship fern filkov forgy framework from furlanello gene general genetic geographic ghahramani ghosh gionis goldenberg golub grant graph groups hart helped hettich hexacorals hierarchical hodor hubert icdm impossibility inference information initial integrating intelligence intelligent international interpretability intuitive jain john joint journal jurman karp kellam kleinberg kmeans knowledge lance learning likhodedov machine madani mannila marcel martin mclachlan means measurement measures medicine meila merler merz mesirov meta method methods microarray mining mixture model models monti moubarki mufti multidimensional multiple multivariate natural networks neural newman nguyen nips number objective orengo ostendorf part partitioning partitions 
pattern paul pedro pharmacology pkdd points practice predictive principles problems proceedings processing project properties protein provided punch rand random ranking references refining report repository resampling research reuse rogers rosenberg ruppin scaling scene schroedl search segmentation selection semi serafini shapley siam sixth skiena slonim smith solving sorting spectral stability statistical stochastic strategies strehl suggested supervised supported swift symposium systems tamayo technical terrestrial thank theorem theory this tishby tools topchy toward tsaparas tucker university using value versus view viral visualization wagstaff walks weighting weightings weiss wild williams with without work workshop world york zamir http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.29 58 Boosting for Learning Multiple Classes with Imbalanced Class Distribution accomplish adaboost adac address adjust algorithm algorithms among applicable applications applied applying approach approaches area assume attracts base been best better bias boosting both bradley capable chawla class classes classification classifier combined complicated conclusion conducted conference consideration considering consuming contributions cost costs crucial curve data datasets developed different directly distributions editorial efficiency efficient error evaluation existing experimental explorations first focuses from genetic hardness however icdm imbalance imbalanced importance improve indicate interests international into involving issue japkowicz knowledge kolcz leaning learning line machine main methods might minimize mining misclassification more most multiple nature overall paper parameter pattern performances present problem problems procedure proceedings process real recognition reducing references research respectable results searching sensitive sets setting setups shows sigkdd significant situations sixth solving some special speed still study such systems tackle taking tests 
than that this three time training under update usually values vectors weight when with work world http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.159 55 The PDD Framework for Detecting Categories of Peculiar Data Mahesh Shrestha, Howard J. Hamilton, Yiyu Yao, Ken Konkel, and Liqiang Geng Department of Computer Science University of Regina {shresthm,hamilton,yyao,konkel1k,gengl}@cs.uregina.ca about absolute according additional advantages agarwal aggarwal aixen alberta algorithm algorithms almaden also amount amounts analyze anomaly application applications approach approaches archive arnold asia aspect attribute ature august australia available average bagging barnett based baxter beach because been between biggest bolton both breunig brodley california canada case categories category celcius celsius center change changes chen cheung chicago classification classified clustering columbia column columns combines comparative comparison composite computation computer conclusion conference confererence consecutive contributes converted could czech daily dallas data database databases datasets date days december degrees deltas demonstrated deng density department detect detected detecting detection detects determine developed difference differences different dimensional discovered discovering discovery discussed discussion distance distancebased diverse diversity domain drop during each eamonn edmonton efficient eighth eleventh empty engineering ensemble entire entirety episodes ertoz eskin estimation european example existing experiment experiments expert experts extended faloutsos fdon feature field find finding findout first five florida fluctuated fluctuation focus folias found four framework francisco fraud frequencies frequency frequent from future gave general generalizations generalized geometric global hand hansen have hawkins help high highest hockey html http icassp icdm identifying ieee illinois implement impressions improvements include index 
information input inspiration interanational interest interested interestingness international interval intervals into intrusion intrusions john jose journal july keogh kitagawa knoor knorr knowledge kriegel kumar lane large largest larsen lazarevic learning length lesion lewis likely linear local major management mannila measure measures mention method methods might mining months montreal most much multi near network networks neural newport next ninth noted notion observed occurred october often ohshima ohsuga only optics oriented orlando other outlier outliers over overall ozgur pacific papadimitriou particular patterns pdds peculiar peculiarities peculiarity performance performed period periods person perspectives philipsen points portnoy possibly practice prague precipitation precipitations prerau present previous principles proceedings properties provence provide provided providing pruning purely ramaswamy randomization rastogi record recorded records reduction references regina relatively remaining replicator report represent represented represents republic research results revealed revealing review riverside robust rule rulequest rules sander scheme schemes schwabacher science second section security selected september sequence sequences sequential series sets seven sheikholeslami shim shown siam sigkdd sigmod sigurdsson simple since sixth skin smallest sons specialist srikant srivastava start starting stationary statistical stolfo stores study subspace sydney system systems table taipei taiwan takeuchi tang tells temp temperature temporal texas than that there these thesis they third this three thus time toivonen total totaled transactions tsdma tucakov tung undergraduate unified unifies unifying unique university unlabeled unsupervised used useful usefulness user using value values variety vary verkamo very view views vldb waim warehousing washington weather week weeklong weeks were where which wiley williams winter with would wulf yamanishi years york 
zhang zhong http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.89 37 Integrating Features from Different Sources for Music Information Retrieval aaai abeny account acknowledgements acoustic acoustics adam adaptively advances afshin aggregation alexander algorithm allocation also analysis analyzing andrew annual applicability applications approach approaches argamon aristides artificial artist ashfaq assoc association audio author authorship automatic avrim award baayen ballard based beat becker behav berenzweig beth bill bimodal blei blum bootstrapping bounds brett brian brochu cambridge career categorization category chapter checkers cikm class classification cluster clustering cohen coherent collins colt combining comparability comparative comparisons computation computational computers concepts conference consensus constant content cook cooper corpora correctors cortical criteria criterion cultural dana daniel dasgupta data daubechies david dempster descriptive detection development dietterich different dirichlet discovery discrimination dmitry document during editors effectiveness efstathios eighth electronic eleventh ellis empirical enhancing ensembles entity eric estimating etrieval european evaluation experts expo expression extend external face fakotakis features february fifth fiona first foote framework freitas from functions fundamentals fusion gene generalization genre geoffrey george ghahramani ghani ghosh gionis gloub goldman grants ground guohui hall harald heikki hierarchical hinrich humanities iaaai icde icdm icml ieee imaging incomplete indexing individual inference information innovative integration intelligence intelligent interesting international into isca jean jill joint jonathan jordan journal joydeep juang july kamal karypis kaufmann kessler khokhar knowledge kokkinakis labeled laird language large laroche latent lawrence learn learning lectures lexical likelihood linguistic linguistics littman lizhong locations logan mach machine malcolm 
management mannila march marin maximization maximum mcallester meaning meanings measure meeting mesirov messages method methods michael microarray milligan mining misspellings mitchell mitsunori mitton modality models monti moreno morgan multi multilingual multimedia multimodal multiple multivar music musical mutual naacl name named nando national natural network neural nigam nikos ninth nunberg objective ogihara organization oviatt pablo pages panayiotis panchanathan parameter part partitions parts patrawadee pedro perry perspective philadelphia philip poor prasangsit prentice press probabilistic proceedings process processing publishers querying quest rabiner rand rayid recognition recordings references resampling research results retrieval reuse rhythm richness roger roth royal rubin sally sanjoy saric schutze second selected self semantic sensing sethurman seventeenth sharon shingo shlomo siam sigdat sigir sigkdd signal signals similarity singer sixth slaney smaragdis society some song sources spaces specch spectrum speech spellers spelling spoken stamatatos stat statistical stefano stein sterling steve steven strehl study style supervised supported suzanna swing synthetic systems tagging take tamayo techniques tempo terms text that theoretical theory thirty this through todd toward training transactions transformation truth tsaparas tweedie twelfth tzanetakis uchihashi unlabeled unsupervised users using variable vercoe very view views virginia visualization wahlster waspaa wavelethistogram wavelets weight whitman with wolfgang word wording work workshop would ying yoram zelenko zhao zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.3 44 A Data Mining Approach for Capacity Building of Stakeholders in Integrated Flood Management Peter Owotoki, Natasa Manojlovic, Friedrich Mayer-Lindenberg, Erik Pasche academic acquiring acquisition addison advances afit agency albert algorithm algorithms amsterdam applications april areas artificial back banks based 
bayes bayesian brain british buchanan building buntime cabena case classification computational concepts conf conference consistent cost cover craenen cybernetics damage data databases decision discovering discovery dounias edition eibe eiben engineering england environment errors exact examplars exemplars expert faculty features feigenbaum fema finn flood force francisco frank from geissler generalized guide hadjinian hall hart hayes hinton homeowners hydroinformatic icdm ieee implementation induction information initial instancebased institute integrated intellig intelligence intern international introduction ipswich ithaca january jensen joint june kaufmann kibler kluwer knowledge kraus laboratories learning lenat machine making management manojlovic march mateo methods mining model morgan nagy nature nearest neighbor neighbour nested networks nice organization pages pasche pattern perceptron perentice piatetsky practical practices preparation principles probabilistic proceedings products programs prone propagation psychological publishers quinlan reading real reduction references report representations representative research residential retrofitting review rosenblatt roth rule rules rumelhart salzberg santos sciences shapiro shortliffe sixth someren springer stadler storage strategies strategy structures studies study symbolic systems technical techniques technology telecom themes theory tools transactions trees tselentis universiteit urban using verhees verlag vrije waterman webbased weighted welbank wesley williams with witten workshop york zanasi zevenbergen zimmermann http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.13 10 Adaptive Blocking: Learning to Scale Up Record Linkage aaai accuracy active adaptive administration administrative advances agents algorithm algorithms ambiguous american analysis application applied approximate approximately approximation arlington association attribute automatic autonomous based basu bate baxter best between beyond 
bhamidipaty bhattacharya bilenko biomedical blocking blue bollacker bureau business carr census chaudhuri christen churches chvatal citation citeseer cleaning cluster clustering cohen comparator comparison complex computing conditional conference consolidation constraint coreference cover covering covermax current dasfaa data database databases datamining deduplication dependences detecting detection determining dimensional dirichlet division dmkd doan doddi domain domainindependent domingos dong drug duplicate effective efficient elfeky elkan elmagarmid entity extensible fast febrl fellegi field filtering florida flynn freely fuzzy ganjam ganti getoor giles greedy halevy handbook hardening health hernandez heterogeneous heuristic high html http icde icdm icml iden identification identity independent indexing induction information integration interactive international jain jaro jordan journal kautz kelley kernels knoblock konjevod label large latent lawrence learnable learning linkage lists lncs machines madhavan marathe marthi match matching mathematics mcallester mccallum measures mediated mehrotra merge method methodology methods michalowski michelson milch mining minton miss model models monge mooney morie motwani murty naacl names nanjo newcombe nigam nips noren normalization noun object online ontologies operations optimization orre oxford pages pasula peleg pkdd press problem problems proceedings product purge reading reconciliation record records reference references regularization report research resolution review richman robust roth rule russell safety sahami sarawagi schemes scholk search semantic sets shen shopping shpitser sigmod similarity singla sixth smola soda soft sources spaces spectral state statistical stolfo strategies strategy string studies sunter support surveys swat system tailor tampa tech techniques tejada theory tity tool tracing transformation uncertainty ungar unsupervised using vector very verykios washington weights weiss wellner 
winkler with workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.154 93 Star-Structured High-Order Heterogeneous Data Co-clustering based on Consistent Information Theory Bin Gao Tie-Yan Liu Wei-Ying Ma Microsoft Research Asia 4F, Sigma Center, No. 49, Zhichun Road Beijing, 100080, P. R. China {bingao, tyliu, wyma}@microsoft.com accuracy algorithm also analysis another approach approximations autoregressive average axis banerjee based better between bipartite both bottleneck bregman cases categories category cbgc chang chen cheng choice classification clustering clusters comparison conclusions conference consistency consistent cost cover data dataset definitely dhillon diagonal document documents double each ecml effectiveness efficiency efficient elements entropy experiments extension fall features figure generalized ghosh good graph heterogeneous high highorder horizontal icdm ieee image index indexes indicating information informationtheoretic inter international interrelated iterative jain john learning mallela matrix maximum merugu method mining models modha more most much multi multimedia multiresolution novel objects order outperforms pages pair pairs paper parameter partitioning pattern performance permutation plot point points possible problem proceedings processing proposed random realization recognition recom references regarded reinforcement related report represents sdnoces segmentation selected semi showed shows side sigir sigkdd simultaneous sixth slonim solution sons souroujon spectral starstructured structured supervised surrounding terms texts texture than that theory this thomas time tishby transactions transform tree type unsupervised upper using utilization visual wang wavelet were which wiley with word words yaniv ycarucc york zeng zheng http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.20 14 An Interactive Semantic Video Mining and Retrieval Platform - Application in Transportation Surveillance Video for Incident Detection Xin 
Chen and Chengcui Zhang Department of Computer and Information Sciences, University of Alabama at Birmingham Birmingham, Alabama, 35294 USA {chenxin, zhang}@cis.uab.edu accident accuracy acknowledgement adapted advance advancement agonizing algorithm amia application applications applied approach approximators august backpropagation based bayesian bengio berry carnegie chan characteristics chen chengcui collobert combination comparative computer computing conclusions conf conference conjugate connectionist constructed content corresponding currently customized dagli data database davey demonstrated detection different discovery dorffner each economics effectiveness engineering evaluating event events example experiments extend extensions extraction feature feedback feedbacks feedforward fessant finance financial forecasting framework frank from future general generation given gradient group horizon hornik however huang hunt icdm icip identification ieee image include incorporates incorporation increases indexing initialization intelligent interactive international intersections introduction iteration iterations john journal june kamijo katsushi kinouchi learning linear linoff live mars matsushita medical medium mehrotra mellon method mining modeled modeling models monitoring more multilayer multimedia multiple nakazato needs nets network networks neural objects office only pain paper part patterson peeta phase platform prediction proceedings processing program progressing proposed provides query recorded references refines regression relevance report results retrieval returned robotic science seen semantic september sequence sequences series shewchuk shows shyu signal significant singapore single sixth sketches small some sons spain spatio specific specified still stinchcombe studied study subspaces successfully supported supports surveillance symp system systems technical technique techniques telecoms temporal term tested that then this through time topic tracked 
tracking traffic trajectories transactions transportation tsien types universal university user using vehicle video videos weight well when which white wiley will with without women wong work workshop world zhang zhao http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.118 75 On the Lower Bound of Local Optimums in K-Means Algorithm abstract academic accelerate accelerating algorithm algorithms analysis applications approximation arthur based been bengio best birch both bottou bradley clustering comparison component comput computational computed concept conference convergence cormen cost current data databases diagrams dimensionality dimensions ding discovery efficiency efficient elkan especially exact experimental experiments extended fast fayyad focs future geom geometric geometry greatly heckerman high higher icdm icml ieee imai implementation improve inaba inequality information initial initialization international introduction kamber kanungo katoh kmeans knowledge kumar large least leiserson linear livny lloyd local means meila method methods mining moore most mount netanyahu nips optimum pages parameter peled pelleg piatko points press principal proceedings properties quantization ramakrishnan randomization real reasoning references refining reveal rivest sabharwal sadri search sets several sigmod silverman simple sixth slow socg soda solution squares stein such symposium synthetic techniques than that theory time transactions triangle using variance vassilvitskii version very voronoi weighted with zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.107 26 Mining for Tree-Query Associations in a Graph Eveline Hoekx and Jan Van den Bussche Hasselt University and transnational University of Limburg Agoralaan D, 3590 Diepenbeek, Belgium {eveline.hoekx, jan.vandenbussche}@uhasselt.be abiteboul adams addison advances advantages agrawal algorithm algorithms analysis animal annual appear applications apriori artificial association background base based bases 
bayardo bennett body bolton borders bussche canonical centrality cercone chandra chawathe city cohen computer computing conference congressus conjunctive cook data database databases datalog december dehaspe description design detection discovery discrete dumouchel dzeroski ecology edinburgh editor editors efficient efficiently eleventh engineering exploratory extraction fast fayyad finding flocks food forest forms forward foundations free frequent from gehrke generality generalization generating getting ghazizadeh goethals graph graphs grossman gspan gudes hand hoekx holder hopcroft horvath huan hull icdm ieee implementation inokuchi intelligence international isomorphism japan jeong journal karypis knowl knowledge kohavi komorowski kuramochi labelled lange large lavrac lecture length lethality levelwise lexicographic lncs machine maebashi management mannila martinez mason mckay memmott merlin metaqueries michie minimum mining mitbander motoda muntz natural nature networks notes numerantium optimal order outerplanar pages parasites pathogens pattern patterns piatetsky pkdd placing practical predators presence press principles prins proceedings properties protein queries query ramon record references relational research richness rooted rule rules ruskey satoh science scions search semistructured seus shapiro shen shimony siam sigkdd sigmod sixth sizes smith smyth society space sparse species springer srikant structure subgraph subgraphs substructure substructures subtree summaries symposium syst systems tenth their theories theory thinking toivonen transactions tree trees trophic tsur ullman university using uthurusamy vaidya vanetik verkamo verlag vianu volume wang warmer washio wesley widom workshop wrobel yang zaki zaniolo zighed zytkow http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.18 90 An Experimental Investigation of Graph Kernels on a Collaborative Recommendation Task Francois Fouss, Luh Yen, Alain Pirotte & Marco Saerens Information Systems Research 
Unit (ISYS) Université catholique de Louvain account across addition adjacency advances agreement algebraic algorithms allowing also always american among analysis another appealing appear applicable application applications applied artificial automation average based behavioral being belonging berlin best bests better between bipartite bold borg both brand cambridge case castellan catholique centering chapman characteristics chebotarev chung clustering coarse coifman collaborative colt commerce comparable compare comparison competitive components computation compute computed computer computing conclusion conference connected connectivity considering control correlated correlation corresponding cosine could cristianini cross crossvalidation data database defined degree derived determine deviation difference diffusion dimensionality direct directly discrete discussed displayed distance dupont dzeroski each ecml editors eigenfunctions eigenvector either electrical elements engineering entropy equation european except experiment experiments explorations fact fifth finally first fokker fold following forest formation fouss framework frequencies frequency from further general generally generic global good graining graph graphs groenen groups hall hancock have high hill however icdm ieee importance improve indicate indicates indirect information inner intelligence interesting international interpretation into introduced introduction investigated item itembased items karypis kernel kernels kevrekidis knowledge kondor konstan kudo lafferty lafon langville laplacian large learning least lecture link linked links list looking loops louvain machine maps markov mathematical matrices matrix matsumoto maxf maximizing mcgraw measure measures measuring method methods meyer mining modern more moreover movielens movies multi multidimensional nadler necessarily needs neighborhood neighbours networks neural newsletter nice node nodes nonparametric notes notice number observation 
observe obtained only operators optimal order original other pages paired paper parameter parameterization particular partitioning pattern penalized penalty percentile perform performance performed performing performs perspective pirotte pisa pkdd planck point preference preliminary press principal priori probably procedure procedures proceedings processing product profit propose provide provided provides purchases quantities random rank ranked recall recalls recommendation recommendations recommender reduction references regardless regularization related relational relations relationships rely remote renders report reported respect results retrieval review riedl runs saerens sarwar satisfaction scalable scale scaling scholk scholkopf sciences score scores scoring second seem sequel sequences shamis shawe shimbo show showed shows siam siegel sigkdd significant similarities similarity simply since sixth slightly small smaller smola social society some spectral springer standard statistics structures study summarized survey systematically systems table tables take takes task taylor technical technology terms test than that their them theorem theory there these they this those three through thus transactions trees tuned undirected unified universite university used user users using validation value varied various vectors verlag walk walks warmuth well were whatever when where whereas which while will wilson with words work workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.59 83 Discovery of Collocation Episodes in Spatiotemporal Data Huiping Cao, Nikos Mamoulis, and David W. 
Cheung Department of Computer Science The University of Hong Kong Pokfulam Road, Hong Kong {hpcao,nikos,dcheung}@cs.hku.hk addition agrawal algorithms andrienko approximations apriori bakalov based best both carefully cbms cikm close collocation combination complex computer convert corresponding data databases definition designed devised dewitt discov discover discovering discovery ecml edbt editors efficient episodes equijoin evaluation event experimentation features finding first framework frequent from generalizations generalized hadjieleftheriou hash herle huang ieee important improvements keogh knowl kollios location magic malerba mannila medical mehta methodology mining naughton novel object original parthasarathy pattern patterns performance phase phases pkdd problem provided quent queries references results scalable schneider scientific second sequences sequential series shekhar showed spatial spatio spatiotemporal srikant sstd summary symp systems technique techniques teisseire temporal that this time toivonen topological trajectories tsotras unusual used verkamo vldb wang workshop yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.75 64 Geometrically Inspired Itemset Mining Florian Verhein, Sanjay Chawla School of Information Technologies, University of Sydney, Australia {fverhein,chawla}@it.usyd.edu.au above abstraction achlioptas agrawal algorithm algorithms allows also association bases beats best biological candidate carpenter classification closed closet colocation computer conference cong constraints convertible data database datasets departs developed discovering discovery efficient efficiently existing experiments faloutsos fast fimi find finding flexibility focus framework frequent friendly from future generalizing generation geometric gives glimit great growth harmony helsinki http huang icdm implementations importantly impossible interesting interna international intl issues itemset itemsets itemvectors journal karypis kaufmann knowledge 
korn kotidis kumar labrinidis lakshmanan large larger lecture linear long management measures mines mining morgan most ninth notes notion novel number opens operating pages pass patterns potential presented press previously principles prior problem proceedings projections pushing quantifiable random ratio references research results rules science shekhar showed siam sigkdd sigmod significantly sixth small space spatial srikant steinbach summary support symposium systems tenth than that this thresholds time tional transaction transactionspace transformations tung used useful uses using very vldb wang were will without work workshop xiong yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.135 49 Relational Ensemble Classification aaai abilities academy accuracy acknowledgements aggregation algorithms alleviate analysis applying approach approaches artificial associative authoritative autocorrelation automating bagging based bayesian bernstein best better bias breiman business case categorization cause ceder chakrabarti chen classification classifier classifiers clearwater clustering collaborative collective combining commission compare compared complex compuscience computational conclusions conference considering considers construction cramer current data dataset defense dependency dietterich directions discrete does domingos emergent enhanced ensemble ensembles environment eprn estimation estimators european evaluate evaluation existing experimental experiments exploit fachinformationszentrum fairgrieve feature features filtering flach four frank friedland from functions funded further furthermore fusion future gallagher generalization generic getoor grant have hopfield huang hyperlink hyperlinked hyperlinks hypertext icdm ieee ijcai implementation implementations improve improves improving includes industry indyk inference information intelligence international internet iterative java jensen joint journal karlsruhe kaufmann klautau kleinberg kluwer 
koller koppa kramer kushmerick lavrac learning link linkage local loss machine macskassy magazine mathematical mccallum media method methods mining model models morgan most multi national networked networks neural neville nigam number optimality order other paper part pazzani performance performed pergamon pets physical plan portals practical predictors presented press princeton probabilistic probability problem proceedings programme project propositionalization providing provost publishers references relational relations rennie research results retrieval rifkin rnkranz robert rusnak school science second section segal selection semantic services several seymore shown siam sigkdd sigmod significantly simple sixth society sources space sparsity sponsored stacked statistical statistics stern study symposium systems taskar techniques technologies text than thank that this ting toolkit tools trained transactions trees under univariate university used using vector vsall well when will with witten wolpert work working works workshop xmedia york zeng zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.71 118 Fast Relevance Discovery in Time Series adaptive agrawal algo algorithm also although among arma association bagnall based behavior below better between brute chan chang change clipped clustering common computation computationally computer conference correlation cover data databases dennis derstand domains effcient effectiveness efficient elements enough entities errors exceed exhibit expensive experiments fall faloutsos fast faster find fits fodo force founda from func haixun have huang icde icdm information inspired international janacek john journal landmarks manolopoulos matching mead measure mendelzon method methods metric minimization mining model models monitored monitoring more much nelder noisy once organization pages paper parker pattern pearson perng phenomena point pointed problem proceedings processing proposed proposes queries query querying rafiei 
ranganathan real references relationship relevance resource reveal rithms robust search seeks sequence series shasha shing showed sigkdd sigmod significant significantly similarity simplex situations sixth some sons speed state statistical statstream stott streams subsequence swami sylvia task than that theory there this thomas thousands threshold time tion tions trade transition true values very vldb wang wavelets where wiley with without yunyue zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.153 20 STAGGER: Periodicity Mining of Data Streams using Expanding Sliding Windows aaaa accuracy adaptive addisonwesley addition against algorithm algorithms alphabet although analysis analyzing approach aref argue arrival asynchronous based bbbbc between brockwell bunke cambridge certain chiu clarity clearly computer conclusions conference convolution cormen corresponds coverage daily data databases daylight dealing demonstrate detect detected detection detects different discover discovered discovers discretization discretized dmkd dong each edbt editors efficient elfeky elmagarmid empirical enclosed essential event exactly expanding expected experiment experimental explained explanation faloutsos favoring feature fewer finney five frequency frequent gives handling handsoff hart have hellerstein high higher hour icde icdm ieee implications improve incremental infinite inspection instruments interesting international into introduction june kandel keogh knuth last least leiserson length lengths less level levels like lonardi lower maintains manual many massachusetts maximize medium merge mine mining months more much named notice novel number numeric obscure online order output outputs over papadimitriou paper partial partially pass pattern patterns pazzani period periodic periodicities periodicity periods plus potentially practicality press priori proceedings processing programming proposed proves publishing purpose quite range rate rates real references 
representation respect review rivest savings scientific segmenting sense series should shows single sixth size sliding slow some stagger stream streaming streams structure study survey symbol symbolic synthetic table techniques terms than that there this those though threshold thresholds through time tkde tracy tradeoff transactions tree unknown usefulness uses using validates value values verify very vldb volume wang wavelet weekly when which windows with within world yang zero http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.63 43 Efficient Clustering of Uncertain Data accelerate accurate achieve agglomerative algebra algorithm algorithms almost also amongst analysis anchor ankerst applications applied applying approach approximate argued asia based basic berkeley best bound breunig calculated calculations chamberlain chau cheng classification cluster clustering clusters combination combined compact computation computational computationally computations computing concluding conducted conference control cost could cybernetics data databases defined dempster density densitybased derived described detecting different discounted discovering discovery dist distance distances distributed does dominates dozen dunn each effective effectiveness efficient elkan environments especially ester estimation evaluating evaluation even example execution expected expensive experiment experimental experiments explained extensive factor fairly faster feasible feature figure follow four full further fuzzy gain generalized getoor give govaert hamdan have heidelberg hierarchical higher however hung icde icdm icfs ichino icml identify ieee imprecise improve incomplete independent indexing inequality information international introducing involved isodata jain journal kalashnikov kmeans knowledge kriegel laird large least likelihood location macqueen made major mathematical maximum means method methods metrics mining minkowski minmax mixed mixture mobile model models moderate more 
moving much multivariate nanni necessary nilesh noise number object objects observations only optics ordering other outperformed over overhead overheads pacific pacificasia pages paper parallel pdfs performance performed performs pfeifle physica points prabhakar precomputation presence probabilistic probability problem proc proceedings process pruning pxml queries query querying references regions registered relative remarks representation results royal rubin ruspini same sample sander sato save second semistructured separated series setting shah should showed shown sigkdd sigmod sistla sixth small society some spatial speeding statistical statistics still strategies strength structure studied study subrahmanian suciu symposium techniques terms than that therefore they this those thousands threshold thus times tkde together track trend triangle tsmc type ucslcs uncertain uncertainty units updating uprelpre used using various verlag very vitter vldb well when which while with wolfson work yaguchi yesha http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.134 67 Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning alternative applications approach behavior black breakdown computer conference data denoising early econometrica estimators finite giloni hamza icdm ieee image international journal jurechova koenker krim line linear mathematical methods mining modelling nonlinear optimization outlier padberg point points portnoy proceedings processes processing rangarajan references regression rejection robust sample siam signal sixth statistical statistics tail their transaction unification vision with http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.168 46 Who Thinks Who Knows Who? 
Socio-cognitive Analysis of Email Networks about acknowledgment actors adibi administrative agreement ahpcrc analysis anti application applications approaches army assessing auspices authors based behind brief business busschbach california cambridge carley case ceas center chart cognition cognitive comments communication communications company computing conference congruence conjunction contents contractor cooperative corpus counterterrorism coworkers daad data database dataset department dependent diesner different directions discovering duijn effects email emergent enron entropy exploration faust fields first formal from future grant graph hanson harvard helpful high homogeneity human icdm importance important include incorporating informal information international introducing kalmijn klimt knows krackhardt laboratory landscape like link linkkdd lyle mane marital methodology methods mining minneapolis minnesota models more multilevel network networks nishith nodes number organizations pathak perceptual performance personal political power predictors press proc proceedings quarterly random references report research review sandeep scalable schema science security semantic shetty siam sigkdd sixth snijder social socio sophisticated southern spam srivastava statistical status structure supported tech thank thinks this through tools under ungar university variables vermunt wasserman weights with workshop would yang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.58 72 Discovering Unrevealed Properties of Probability Estimation Trees: on Algorithm Selection and Performance Explanation aaai accuracy addition advantage aggregation algorithm algorithms also analyses analysis anomalies approaches appropriate assessing assessment assessors averaging bagging based bauer behaviors better blake boosting breiman buckles built calibration categorical changes chapman choose classification comes comparison concept conference considerations construction continuous cost 
creation curve data database dataset datasets davidson decision degroot detection distributed diversity does domingos drifting drummey dynamic ecml edbt effective efficiency efficient either empirical ennis ensembles estimation exhaustive expected explaining extensive extremely fawcett feature features ferri fienberg finally flach forests framework friedman from fully given graphs green greengrass guide gusto hall hand have high hinton however icdcs icdm icml improving incorporated induction inferior international into journal kaufmann kohavi leaf learning lies limits ling logistic machine main mainly master mccloskey mechanism medicine merz method methods mine mining model monash morgan naylor node notes olshen optimality orallo peng performance performs perlich pkdd posterior practical practitioners predictors probabilistic probabilities probability proceedings programs provided provost psychophysics purely quinlan random randomness ranking rankings reason references refinement regression related rely repository research researchers results revow rules running same selection sensitive separability seventeen shown signal simonoff singe sixth skewed statistica statistical statistics stolfo stone streams study swets systematic tests that theory thesis this through tibshirani topics trading tree trees trouble types unique university using utility variants voting wang when wiley with within without zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.21 61 Anytime Classification Using the Nearest Neighbor Algorithm with Applications to Stream Mining above acadamic academic accuracy achieve achieved achieves acknowledge acknowledgements active adamek address after agenor aggarwal agris algorithm algorithms alonso analysis annual anytime appear apple applications applied approach approximate arbitrarily arbitrary artificial asae atomic attempts autonomous aware based bayesian been bekel belgium best bias boosting boston both boughton bozma bradley brady brigham 
bulletin bypassing calyx camera chen circuits classification classifier classifiers classify classifying clayton close closed clustering comes complex computation computational computer computers computing conclusion concrete conference connor consider constant constrained constraints constructing contour contributions control converges convert conveyor cooperative data databases dataset decision defect deliberation demand dennis department design development dimensional discovery discrete discussed dissimilarity distance diverse domingos dual duin dynamic eager eamonn east efficient electrical engineering environments error esmeir estimation estimators even exact example exist experiments fast fayyad field filtering fish fourteenth fraction from funded further future gaber geoffrey geurts grant grass gratefully grumberg guez guttman hansen hardin have heidemann held help herle high highly however hulten icdm idea identification ieee image imaging imprecise improvement includes independently index indexing induction inductive information insect inspection instance instances integrated intelligence interactive international interrupted interruptible interval into introduced invariance investigation issue items iterative january jill journal just kearns keogh kluwer knowledge known korb kotenko krishnaswamy large learner learning length lethargy liege lindgren livne logic machine mafra management manufacturing many markovitch martinez matching mccallum means measures method metric metrology migration mining models monash monitoring monitors moving multiscale myers nagino nearest neighbor neto none nonrigid objects online only optics optimal order ordering other paclik partly pattern patterns pekalska pengcheng perception personal pittarelli populations press probabilistic problem problems proc proceedings processing programming prototype publishers query rapidly rate rates reasoning recognition reduction references reina report representation representations research 
resource respectively ritter robert robotics robots rodr rodriguez rotation russell same sampling scaling scheduling schoen schoenberger school science searching sections seeing selection selective series several shah shapes shiozawa shown siam sigart similarity simple singh single sixth small some sorting spatial special spie spotting stankevitch stem stream streaming streams structure study subdialogues such suggestion suggestions supports symposium system systems teams technical techniques technology their them there thereafter thesis they this three through time ting tools topic toward tradeoff transaction tree trees ubiquitous under university useful using utility variables variance verification very video vision visual vlachos vldb walker wang warping webb wedgie which while wilson winn wish with word work workshop xiaoqian yalcin yamada yang ying young zhan zhana zilberstein http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.82 121 High-Performance Unsupervised Relation Extraction from Large Corpora Binjamin Rozenfeld, Ronen Feldman Bar-Ilan University, Ramat Gan grurgrur@gmail.com, ronenf@gmail.com aaai acknowledgements agichtein among answering artificial association australia boosting bootstrapping brin center chen collections computational conference corpora data database demand dictionaries digital discovering discovery discussions edbt emnlp empirical entities entity etzioni experimental extending extracting extraction feature feldman from gravano grishman hasegawa helpful hovy icdm ijcnlp information intelligence international island jeju jones knowitall korea language large learning level libraries linguistics meeting methods mining multi naacl named natural oren patterns plain preemptive proceedings processing project provided question ravichandran references relation relations riloff rosenfeld sekine selection sets shinyama sixth snowball soderland some spain stephen study surface sydney system technology text thank turing university unrestricted 
unsupervised ures using valencia washington webdb were wide workshop world http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.131 154 Query-Sensitive Similarity Measure for Content-Based Image Retrieval Zhi-Hua Zhou Hong-Bin Dai National Laboratory for Novel Software Technology Nanjing University, Nanjing 210093, China {zhouzh, daihb}@lamda.nju.edu.cn according account acknowledgments advances advocates also analysis applying attempt barcelona base based bases bayesian been being belong bottou cambridge cbir chang chen choice ciocca circuits claim class classes collections comparison complicated comprehensive computer concept conference considered considering content contentdependent czech data databases degenerated designing different differently does early editors effective efficient enhancing european evident examples experiments exploiting exploits explore extent faloutsos fanedd feasible feedback first formulating functions future giacinto gupta huang icdm ieee illustrate image images influence information insensitive instance intelligence interactive interesting international into irrelevant ishikawa issue jain jiangsusf kinds knowledge labeled large learning left limit lippman longer machine management many measure measures mechanism mehrotra mindreader mining mistakes moreover multi multimedia multiple neural nsfc numerical often ortega other page pages panda paper pattern performance power prague present presented press proceedings process processing proposes qsim queried query recognition references regional regularize relevance relevant results retrieval review rijsbergen roli same santini saul schettini sensitive settings should show similarities similarity since singapore sixth smeulders some spain state stretching study subramanya such superior supported systems take techniques technology that these this through tombros tool transactions tried unifying unlabeled used user using usually validates variants vasconcelos version very video view vision 
weiss well which will work worring years york zhang zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.53 33 Dimension Reduction for Supervised Ordering Toshihiro Kamishima and Shotaro Akaho National Institute of Advanced Industrial Science and Technology (AIST) AIST Tsukuba Central 2, Umezono 1-1-1, Tsukuba, Ibaraki, 305-8568 Japan, mail@kamishima.net (http://www.kamishima.net/) and s.akaho@aist.go.jp akaho algorithm analysis analyzing annals application applied artificial bahamonde based bayon bollmann boosting categorization chains chapman clickthrough combining computers conf conference correlation data diaconis discovery edition efficient empirical engines european feature feedback fifth filling freund from generalization generalized gibbons graepel hall herbrich hirao homepage http icdm icml ieee implicit information international iyer japan joachims journal kamishima kazawa kendall kernel knowledge learning linear lnai luaces machine maeda marden mccullagh method methods mining missing modeling models monographs networks neural obermayer objects optimizing order ordering orders ordinal oxford pages preference preferences press probability proc proceedings query quevedo radlinski rank ranked regression relations research retrieval royal schapire sdorra search selection sensory singer sixth society spectral statistical statistics subset supervised support survey systems text university using vector volume with workshop http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.50 144 Deploying Approaches for Pattern Refinement in Text Mining Sheng-Tang Wu Yuefeng Li Yue Xu School of Software Engineering and Data Communications Queensland University of Technology, QLD 4001 Australia {s.wu, y2.li, yue.xu}@qut.edu.au ahonen applying automated behavior caropreso categorization collections computers conference data dell descriptive digital document dumais edda elaborazione external extraction from heinonen icdm improving information informazione input instituto 
instruments international jorg klemettinen learning machine machines matwin methods mining pages phrase phrases proceedings references report represent research retrieval sebastiani sixth sources space statistical support technical techniques text texts vector verkamo with http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.47 18 Data Mining Approaches to Criminal Career Analysis aaai above academies acknowledgment adderley addressed after also analysis analyzing applications artificial assaults assistance atabakhsh automatic based been behavior belgium best better between blumstein bnaic both broekens bruin buetow building career careers case casey centered chaboya challenges chau chen choice clustering cocx cohen collaboration commerce commit comparison competitive computative computing concerns conference coplink could crime crimes criminal criminals cushna customer dale data decision design determining digital dimensional discovery distance ecml eighteenth enforcement entities essence even ewart experiences expert experts extracting fall field financed fincen focus fourth framework from future fuzziness goldberg government grant have hierarchical hope huang hyperbolic icdm ieee incorporation industrial information infovis infrastructure intelligence intelligent interactive interest international investigations issues knowledge kosters laros link list lnai mainly maintaining matching meaningful measure mentioned methods mining modeling models more multi multidimensional musgrove muzner narrative national netherlands networks neural number oatley object offenders onto organization pages papers part petersen physica pkdd police possibilities practical predicting press proceedings program progressive project properly proposed provided reach reached references relationships reports research restructuring results roth same scaling schroeder scientific serious seventh sexual sgai sharing sigkdd similarity sixth social soft solving sources springer steerable studies 
study subject suited support symposium system systems task that their these this token tool toole topic transactional tree twenty under verlag visher visualization visualizing volume wezel while will williams wong workshop worlds xiang york zeleznikow zeng http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.64 123 Enhancing Text Clustering using Concept-based Mining Model Shady Shehata Fakhri Karray Mohamed Kamel Department of Electrical and Computer Engineering University of Waterloo Waterloo, Ontario, Canada N2L 3G1 {shady, karray, mkamel}@pami.uwaterloo.ca aaai academic accompany accuracy accurate achieved algorithm algorithms allows alyzes american among analysis analyzes another approaches artificial association august automatic based boston calculation calculations capture case chapter cios cliffs clustering comparison component computational computers concept concepts conference corpus cybernetics dagan data databases different digital direction discovery document documents dubes each edited employing englewood english extending feldman fillmore first francis future ghosh gildea hacioglu hall holt human icdm ieee impact importance improve information intelligence international jain july jurafsky karypis kingsbury kluwer knowledge kucera kumar labeling language level levels lexical linguistic linguistics link machines mans manual martin matching measure measures measuring methods mining model mooney naacl national next north number page pages palmer parsing pattern pedrycz performing porter possibilities pradhan prentice present presented procedure proceedings processing program propbank publishers quality references respect rinehart robust roles search second semantic semantics sentence shallow significantly similarity single sixth speech standard steinbach strategies strehl stripping structure suffix support surpasses swiniarski systems techniques technology term text textmining textual then theories theory there this topic traditional transactions 
treebank treebanks universals using vector very ward which winston with work workshop york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.112 13 Mixed-Drove Spatio-Temporal Co-occurrence Pattern Mining: A Summary of Results Mete Celik, Shashi Shekhar, James P. Rogers, James A. Shine, Jin Soung Yoo agarwal algorithms analysis analyzing angra approach association bakalov bakiras banerjee berlin brazil carlin celik cheung chorochronos clusters collocations colocation complex computer conf conference cressie data databases datasets detecting detection discovering discovery efficient fast finding framework frank from gelfrand general generalized geoinformatica geospatial giscience groups grumbach gudmundsson guting hadjieleftheriou handling hierarchical houston huang icdm ieee imfeld international isbn jensen join joinless kalnis kaufmans knowledge kollios koubarakis kreveld laube lecture less lifelines lncs location lorentzos mamoulis mehta mining mobility modeling morgan motion mouza moving notes number object objects parthasarathy partial pattern patterns point press proceedings queries references reis relative remo results rigaux rules schek schneider scholl science scientific seatle sellis sets shekhar shou sigkdd sixth sons spatial spatio spatiotemporal speckmann springer srikant sstd statistics summary symp temporal tkde trackable trans tsotras verlag vldb washington wiley within xiong yang zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.69 76 Fast On-line Kernel Learning for Trees algorithms alignment annotated approach arikawa arimura asai automatic barcelona based bayes bioinformatics biology boosting both california campbell closed collins computational computer conference confidence constructed convolution cortes cruz dags data databases definite detecting detection diekhans discovering discovery discrete discriminative dryade duffy dynamic efficient extraction forests framework frequent freund from gildea graepel haussler herbrich
heterogeneous holloway homologies homology icdm ieee improved international jaakkola january journal july jurasfky kawasoe kernel kernels kingsbury kivinen kuang labeling large learning leslie linguistic london lrec machine machines minimal mining moschitti motif networks noticed online over pages palmas palmer parsing perceptron point positive predictions proceedings processing profile propbank protein ranking rated references remote report research roles rousset royal sakamoto santa schapire sebag semantic semi shallow should siddiqi signal singer sixth smola society spain string structured structures study substructure support systems tagging technical termier that transactions tree treebank trees ucsc university using valid vapnik vector voted wang watkins whenever williamson with http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.149 108 Social Capital in Friendship-Event Networks Louis Licamele and Lise Getoor Computer Science Dept., University of Maryland College Park, MD 20742 USA {licamele,getoor}@cs.umd.edu able academy accuracy achieve acknowledgements advances algorithms allow alone also american analysis annual applications appropriate assigning author authors average based benefit best better bilgic both bourdieu burges cambridge capital chapter classifier coleman collaboration collier combined combining comes committees compared complex conclusion conf conference conferences construct constructing context correctly could creation current customers data defining definition definitions degenne denoted depending depth describing design determining difference different directed discovery distribution does domingos dynamic each earnings econometrica editor education evaluating evaluations event eventrank examined explorations explored fact family features figure forms formulated forse foundation four framework friendship from function general getoor given gives good greenwood group handbook have higher histories history hobbes human hutchins icdm 
ideally ijcai importance imum individual influence information interested intergenerational international intl introducing isolation issue jensen joachims journal july kempe kernel kleinberg knowledge large learning leviathan liben licamele link london loury lowest machine machines madadhain making management maryland maximizing maximum measure measures members methods metrics minimum mining models modern more much national network networking networks neville newman next note nowell obtain only optimize organizers origins over pages part participants participation past performance performing performs popescul portes practical prediction predictor presented press problem proceedings process program publication publications quantitative question ranking reader received references relational report representations research result results review reviewers richardson roussopoulos sage scale sccur schist scholkopf science sciences select series show shown shows sigkdd significant significantly sixth slightly smyth social sociology spread statistical statistically structure such support supported symposium table tardos tasks technical than that theory these this through time together total transfers under ungar university used useful using utilizing value variety varying vector very well which window windows with work workshop would years york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.102 133 MARGIN: Maximal Frequent Subgraph Mining about acknowledgements algorithm algorithms based been below better biokdd biological border both bystroff causes cheboli closed closegraph comparable comparison complex computes conference contact data databases detecting discovery edge efficient efficiently expected experimental faster figure finds frequent from good grama graph graphs gspan helping hence higher however huan hyderabad icdm iiit implementation include increase infrequent international ismb isomorphic isomorphism july karlapalem karypis kokkula koyuturk kuramochi 
lattice library makin maps margin maximal measure mining most much networks number observed operations other pages pandey parameters pattern patterns performs post prins proceedings processing protein reduce references report results seconds seen shao shen show shown shows since sixth smaller some space spin step subgraph subgraphs substructure subtree such supergraph support szpankowski taken technical template than thank that thomas three time times tree twenty valluri values varying wang which with yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.56 142 Discover Bayesian Networks from Incomplete Data Using a Hybrid Evolutionary Algorithm acknowledgments algorithm applied applies approach available bacchus based bayesian bayesware been belief benchmark better called chickering classes compare compared compbio completing computation computational conclusion conference data databases demonstrate dempster described direct discoverer efficient equivalence evolutionary experimental found friedman from frontpage generated grants have heam heckerman hidden html http huji hybrid icdm ieee incomplete intelligence international journal laird learn learning leung libb likelihood lingnan machine marketing maximum method methods microsoft mining missing network networks number obtained online other outperforms paper parameter performance presence principle procedure proceedings ramoni real redmond references research results royal rubin sebastinani sets sixth society statistical structures supported technol tested that this those trans tutorial university using values variables with wong work world http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.87 31 Improving Personalization Solutions through Optimal Segmentation of Customer Bases Tianyi Jiang, Alexander Tuzhilin New York University tjiang, atuzhili@stern.nyu.edu aartf addison adomavicius algorithm algorithms alternative amelia american applied approach ascent attributes automatic aykanat bantam based 
baskets bayesian beaver behavior best beyer blaci booklocker bounds brijs brucker bulletin cacm capri center classification classifiers cluster clustering combinatorial comm communication complexity computational computer conceptual conference congress continuous current customer customers data decomposition dell descent differentiation directions discovery discrete discretization distributions does doig doubleday dougherty down drilling duda econometrica effective effort enterprise estimating fayyad features forever foundations francisco frank from future genetic gomory group guignard hall hansen hart henn heuristic hlenbein hochbaum hoffman human hypergraph icdm icml ieee ijcai implementation implementations individual inns integer international interval into introduction irani island issue italy java jiang john joint journal kamakura kaufmann keep kernigham kluwer knowledge kohavi korte kotler lagrangian land langley learning least linear machine management market marketing mathematical mathematics mendenhall method methodological methods mildest mining model models morgan multi network networks neural nonconvex novo numerical oettli ogier operational operations optimization oriented outline ozdal padmanabhan parallel pattern patterns peppers pergamon personalization perspective plantation population possible practical prentice press principle probability problem problems proceedings process product profits programming programs publishers quinlan references research rogers sahami salesman science search segmentation segmenting shmoys shoppers shopping singapore sixth smith society solutions solving sons special spreadsheet springer statistics steepest stork strategies stronger successes supermarket supervised tabu techniques technologies thomson tkde tools trans traveling turning tuzhilin unsupervised using valued verlag wedel wesley wiley with witten yang yielding york your zipf http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.78 105 Gradual Cube: Customize 
Profile on Mobile OLAP Jun Li Haofeng Zhou Wei Wang Department of Computing and Information Technology Fudan University Shanghai 200433 042021117@fudan.edu.cn account acharya after aggregate aggregates alfredo algorithm algorithms answering antonios approach approximate based baxbaxa because best better both browsing case cikm client compress computation conclusion condensed conference contour cube cubes customize cuzzocrea daniel data databases deligiannakis delivering delivery devices difference down dwarf edbt efficient empirical environments error evaluation experiment extensive fact figure find focus future gibbons gradual hand handheld heuristic hongjun icde icdm illustrated improve indexing international into iosif isads isse jeffrey jian join laks lakshmanan lazaridis line maniatis mechanism mining mobile mohamed more multidimensional multiresolution observational olap only paper performance petacube phillip proceedings progressive propose proposed provide quality quasi queries query random real reduce reducing references report response result scott semantic server services sharaf shrinking sigmod sismanis sixth size space sparse strategy structure summary support swarup synopses system table tables take technical than that there this three trans transmitting tree trees tuples update users using various vastly vitter wang wavelets weather well when which will wireless with worsen yannis http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.99 45 Local Correlation Tracking in Time Series Spiros Papadimitriou Jimeng Sun Philip S. 
Yu about accurately addresses advanced aggarwal aggregate agile aided algorithm algorithms allen allows among analysis another appeared applications approach approximation arbitrary architecture archive assumptions autocovariance bagnall based beyond bingham braid burst bursty capture captures case castelli celka change changepoint changepoints changing chebyshev chen chiu classification climatic clustering coefficient colditz collection comparing comparison component computer conclusion conference context cormode correlation correlations cross daniel data databases datar datasets david demonstrate dependency desirable detect detection dettinger different dimensional dimensionality discovery dist dmkd does dong dynamic each eamonn edition efficiency efficient efficiently eigendecomposition emitter employed employs estimate estimates estimation evolving extensions extract faloutsos family fast field finally financial find finding folias fourier framework frequency from furthermore gehrke general generalization geophys ghil ghosh gionis given global golyandina group haiminen hamming have heterogeneous hiisila http icde icdm ideas ieee implication implications includes independently indexed indexing indyk infants inoue international interpretation intersections interval intervals itself johannes joint jolliffe kargupta keogh kifer knowledge kondrashov lends level linear list local location loco lonardi magnitudes make mann mannila matrices matrix measure megalooikonomou methods metric mining mobimine model monitoring multi multidimensional multiple multiresolution music muthukrishnan naturally nekrutkin norms novel optimal orthonormal pakdd papadimitriou parameter particular patterns performance periodic periodicity phase pkdd point polynomials preprocessing present press principal problem proc proceedings processing projection prop propose proposes qualitative query random ratanamahatana real recently reduction references related relationships representation respect 
robertson robust robustness sakurai saunders scale scales scaling schmidt score scores search segmentation seizures series setting several shai shasha shifts sigmod signal similar similarity simple single singular sivakumar sixth spatio special specifically spectral spectrum springer stationarity statistical statstream stream streamcube streaming streams structural structure subsequently subspace such symbolic systems techniques temporal terzi that their these this thousands through tian time timeevolving track tracking traditional trajectories trans transforms transients transitions tree tsdma used useful using varadi varying versus viewed visualize vlachos vldb wang wavelets well which while widely with work yang yiou zero zhigljavsky http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.123 41 P3C: A Robust Projected Clustering Algorithm accuracy acknowledgments actual addition address agarwal aggarwal agrawal alberta algorithm algorithms alon amer analysis anticipate applications appropriate apriori arrays assoc association attributes automatic barkai based being belongs between beyer biclustering biological both broad called carlo categorical cheung circle cluster clustering clusters cochran coefficient colon combined comparing computation computes conclusions conference consistently consists containing cores crucially data decreasing defined demonstrates dempster dense depend depends derived detected different differs difficult dimensional dimensionality dimensions discover discovery distribution drawbacks each entries epch evaluation excellence existing experimental explorations expression extension extremely fail fashion fast find finding forming from fulldimensional fund future gehrke gene generalized generated gish goldstein grouped gunopulos haque harp high higher histogram histograms icde icdm icore identified identifiers ieee ignored implementation including incomplete indeed ingenuity initialization international into investigate iowa irrelevant 
iteratively jones kevin laird large lastly leverage levine like likelihood lncs lowering mack madeira matching maximum meaningful measure measured method methods mining moderately monte most motivated multivariate murali nearest neighbor newsletter noise normal notterman novel number numerous object objects obtained often oligonucleotide oliveira only order orientation other outliers outperforms pairs paper parameter parameters park parsons particular patterns performance performed pnas points practical press probed proceedings procopiuc projected projections projective providing raghavan ramakrishnan real recover references refined regions relevance relevant removed required research respect revealed review robust rousseeuw rubin rules scales scores semi sensitive sets shaft sigkdd sigmod signature signatures similarity sixth snedecor space spaces specified srikan sspc stat state statistical subsequently subspace subspaces supervised supported survey synthetic tcbb terms thank that their these this threshold through tissues tkde tumor unable unexpectedly university unmasking until used user using values varying very vldb well when whereas which while whose will with wolf wong work would yale ybarra zero zomeren http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.30 57 Boosting Kernel Models for Regression Ping Sun and Xin Yao School of Computer Science University of Birmingham additive advances algorithm algorithms amari analysis annals annual appear approach approximation artificial based bayesian bengio berlin beyond boosting bottou burges cascade choudhury classifiers collobert committee computation computational computed conference cosatto data different discovery discriminants dourdanovic dual editor efficient efficiently eigenvalue ensembles esann evgeniou fast fisher florida forward francisco friedman function gammermann gaussian general generalized germany gradient graf graphical greedy hastie herbrich icdm icml icpr ijcnn import intelligence 
international inverse journal kaufmann keane keerthi kernel kernels knowledge large lawrence learning logistic machine machines margin matching matrices matrix mika mining mixture modeling morgan muller multiple multiplication nair nakahara networks neural ninth nips nonlinear october optimization pages parallel pattern pavlov poggio ponent pontil press problem problems proceedings process processes pursuit rasmussen ratsch recognition references regression regularization representer ridge saunders scale scaling schafer schok scholkopf seeger selection sets shavlik since sixth smola sonnenburg sparse speed statistical statistics support svms systems technology their theorem theory thesis tibshirani tresp triangular tutorial university using vapnik variables vector very view west williams with workshop yamana http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.1 15 δ-Tolerance Closed Frequent Itemsets able accuracy achieve acknowledgement actual agrawal algorithm algorithms allows almaden also amount another approach approximate approximation association attains authors aware bart base based bases basket bastide bayardo benefits better between beyond boolean border both boulicaut bound brin bykowski caching calders candidate cases cerg cfis cheng christophe close closed closeness comparison compressed compression concise conclusions condensed conference consumes correlations data databases deduce define defined definition derivable derived determine differences discovering discovery dmkd dong efficient efficiently either emerging enjoy episodes equal error estimated event evidenced except existing experimental experiments extra fair fast faster fimi flexible flexibly fpclose free freq frequency frequent from generalizing generation generator goethals gosta grahne grant graph great hkust however http icde icdm icdt ieee imielinski implementations inclusionexclusion indexing information international items itemset itemsets jiawei knowl lakhal large less like long
lossless lower mannila market measure memory mfis minetcfi minex minimum mining most motwani much ndis note notion number order other partially particular pasquier pattern patterns pkdd prefix principle proc proceedings prof project propose providing prune queries query quest rate recovered recursive recursively reduced redundant references relax relaxed removes represen representation representations representative required result results retain rigotti rplocal rules second section sequences sequential sets show shown sigmod significantly silverstein similar sixth slightly smaller software some srikant still structure subsets such supersets support supported swami syst tail taouil tation tcfi tcfis than thank that theoretical they this threshold thus toivonen tolerance transaction trees trends tune types under upper used user uses using utilize verify verkamo very viding vldb when whether which while with without work workshop would yang zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.55 62 Dirichlet Aspect Weighting: A Generalized EM algorithm for Integrating External Data Fields with Semantically Structured Queries by using Gradient Projection Method Atulya Velivelli and Thomas S. Huang Dept. of Electrical and Computer Engineering Beckman Institute for Advanced Science and Technology University of Illinois at Urbana-Champaign Urbana, IL 61801, U.S.A. 
{velivell, huang}@ifp.uiuc.edu acknowledgements advantage algorithm andrews arora askey aspect athena baseline bayesian belongs bertsekas better cambridge chengxiang comments comparison conclusion conference data database decrease dirichlet each editors estimation external facilitates feedback foundation framework from functions generalized governing gradient grant helpful himanshu icdm integration international introduce leading learning major method methods mining minor model modifications national nonlinear novel number other parameters part performance press prior proceedings programming projection pseudo query rather references relevance science scientific semantically shows sixth special structured subfield supported techniques term terms than thank that their this thus university using weight weighting which with work zhai http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.43 22 Converting Output Scores from Outlier Detection Algorithms into Probability Estimates about accurate advances after agreement algorithm algorithms alternative annual anomaly applications approach appropriate asymmetric bagging based bases bayes bayesian bennett bishop breunig buhmann calculated calibrated calibration cheng classifier classifiers close closer comparison computer conclusions conference constrained corresponding corresponds dashed data databases decision degroot demonstrate dempster density detection developed development diagonal diagram difference directly discovery discuss distance distancebased distributions done effectively efficient eighteenth eleventh elkan empirical ensemble estimate estimated estimates estimation evaluation examples existing experimental exponential feature fienberg figure finally first forecasters framework francisco from function further gaussian have help hidden however icdm icml identifying ieee improve improves improving incomplete incorporating informaion information international into jain journal kaufmann knorr knowledge kriegel kumar 
labeled labels laird lange large latter lazarevic learn learning likelihood line little local logistic machine machines management many margin maximum means methods mining mixture models more morgan most moved naive networks neural normalized note novel obtain obtained obtaining ones outlier outliers outputs oxford page pages paper parameters pattern performance piecewise platt plot points posterior potential predicted press probabilistic probabilities probability problem proceeding proceedings produce propose proposed quite recognition references regression regularized reliability require research results retrieval royal rubin rule sander score scores second selecting selection semi series seventh should show shows siam sigir sigkdd sigmod sigmoid since sixth society some statistical statistician still study suggest supervised support text that then this threshold transforming treat trees tucakov tung twenty university unlabelled unlike used using variables vector very vision vldb volume which with without yang york zadrozny zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.151 153 Speedup Clustering with Hierarchical Ranking accelerate access accuracy accurate achieves acknowledgment algorithm ankerst aone approach approximate approximately arbitrary artificial athitsos average baeza based bilenky binding bioinformatics birch boosting breunig bubbles canada centre chavez ciaccia cliffs close clustering clusters combination complexity comput concepts conclusions conference constant data databases density determine discovering discovery discussions document draft driven effective efficiency efficient efficiently elkan embeddings empirical englewood ester every fast figure finding fruitful funded genome gordon grateful hadjieleftheriou hall hierarchical high hjaltason icdm icml identify index inequality intelligence international introduced jersey kamber kaufmann kmeans kollios kriegel large larsen lett life linear linvy magnitude maintaining marroquin 
method metric michael mikhail mining modern more morgan much navarro nearest neighbors neighbours noise norvig novel nserc object obtaining optics order ordering orders over pages pairwise paper partially patella pattern perform performance points practice prentice preserving proceedings proposed protein publishers quadratic quality query ramakrishnan rank ranking real recogn references representation research respect result results reviewing robertson runtime russell samet sander scalability sciences sclaroff search searching sensitive series showed shows sigmod similarity sites sixth size smith spaces spatial speeding speedup stormo structure surv synthetic techniques text that this time tods tree triangle using vector very vidal vldb while with yates zezula zhang zhou http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.27 19 Biclustering Protein Complex Interactions with a Biclique Finding Algorithm able above acad academic acids acknowledgement acta actin activator activity adap adaptin adaptor alexe algorithm algorithms allows almost alpha also although among analysis analyzing annotated applied approximability approximate approximating arboricity artificial atpase bader based bateman bellare between biclique bicliques biclustering binding bioinformatics biological biology biosynthesis biotechnology biotin bipartite bits bomze bosche boston bromodomain bron budinich cagney canad case cerevisiae chain chandonia characterizations chiba chromatin clat clearly clique cliques closed coin combinatorial common communications comp comparative complete completeness complex complexes comprehensive computational compute computed computer computers computing conf conference consensus continuous contract correspondence crama cvpr cygd cytoplasmic data database databases dead dehydrogenase delta department detailed different ding directed discovery discrete domain domains durbin easily editors effectively efficiently eftu elements energy enumerating eppstein epsilon 
european examples exonucleases explore families farnesyltransferase favor feige finding flav flexibility foldes formalism found foundations free freeman friedman from ftase function functional functionally garey gavin gene generalized generation genome geranylgeranyltransferase ggtase gibbons gida goldreich goldwasser graph graphs group grouping gruhler guide guldener gyraseb hammer handbook hard hastad hastie hatpase hattori have hearn heilbut helicase hogue holbrook hybrid icdm identification ieee implemented indicates inference information interaction interactions interactome international intractability involves issue johnson journal kastenmuller kerbosch kluwer knowledge krause labeling large lasso learning letters level link linkage linkages linked lipoyl listing madeira maggie makino mansfield many mapping mass math mathematica mathematics maxima maximal maximum meraz method methods mining mitochondrial modeling more mostly motzkin mrna multi munsterkotter muts natl nature networks neural nucleic obtained october office oliveira ontology operations optimization organization other oxoacid oxoglutarate ozawa page pages paper pardalos part patterns pcps pelillo pfam pkdd polymerase popld porto portugal ppta practice prenyltrans primase principles problem proc proceedings processing program project proof propose protein proteinprotein proteins proteome provide provides publishers pyruvate ramana redox references regression relaxation remodel replication representation research respiration results ribosomal rnapol rnase royal rrna saccharomyces safra saga sakaki scandinavian science selection sequence shapes share shared show shrinkage simeone similar similarly simultaneously sixth small sources spectrometry splicing springer statist statistical straus structure study subgraph subgraphs subunit succ succinate sudan summary supported survey swirm symp synaptonemal synt synthase synthesis systematic szegedy table tafiis targeted tfiid than that their theorem theory 
these they this three through tibshirani tight together topoisoiv topoisomerases towards trans transcription transket transporting turan types uetz under undirected unified vacuolar verlag vision volume well with within wong work workshop yeast yeats yoshida zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.93 9 Large Scale Detection of Irregularities in Accounting Data Stephen Bay, Krishna Kumaraswamy, Markus G. Anderle, Rohit Kumar, David M. Steier Center for Advanced Research, PricewaterhouseCoopers LLP 10 Almaden Blvd, Suite 1600, San Jose, CA 95113 {firstname.initial.lastname}@us.pwc.com absolute abstract accountants accounting acknowledge acknowledgements actionable addition address addressed advanced advances algorithm algorithmic also alspector american analysis analysts ananthraman anna annual anspaugh applying approach approaches appropriate artificial assessing assignment assistance assumptions assurance audit auditing august available based bayes becker begun bell beneish berg bernoulli bias bolton both carcello card catch center certainty certified challenge challenges chan charis chen choi christos class classification classifier classifiers clayton coderre cogger cognitive colleagues collection comments commonly company conclusions conducted conference conll consider consideration constraint consultants contributed cook corporate cowan cpajournal credit cutshaw data dealing decision delisio denis desai deserve detailed detect detection developing directed discussed distributed distribution documents domain draft earlier earnings edition effect efficient either eleventh energy entities eric erick essentials estimates evidence examples experimented experts explicit faced fact faloutsos fanning fawcett features feedback finally finance financial firm firms first flags forensic fraud fraudulent from functions furtado fuzzy gaining galit general gerson ghahramani gilleron given glenn global gogtas golden grateful green group groups grove guide 
hakan hand handbook helped here hitzig hoboken houston http hwang icdm icfai identify identifying ieee important improve included incomplete independent indicate information insight insights institute intelligent international interns introduction investigation irregularities issue jaenicke jank jeff jessica jimeng john jordan journal juan kaskiris kendig knowledge known koskivaara krishnamurthy label large laube leaders learning ledger leena letouzey letters lever like likelihood linear list luis machine madhuri management managerial manipulation manner mansharamani many mave mcdonnell mcnamee meeting mehul members mention method methodologies mining mixture models montgomery more multiple naive nathan negatives network networks neural nineteenth numerous nysscpa occur online organizing other others paper parameter partially patel pattern pazzani people performance perspective philip points pollner poor positive positives practice presented prevention pricewaterhousecoopers primary proceedings process processing prodromidis provided public publications published quality quarterly ratio recognition references regression reilly reporting research restatements results retrieved review revisited ricart risk robin sampling santosh scale science sellers semi several shah sheldon shmueli short sign sixth skalak society solutions sons special specifically ssrn staff state statement statistical stolfo student sunny supervised support suspicious system systems target techniques technologies technology temporary tesauro text that their then theory they this those through tommasi tong training transactions treatment tsoi tsujii tsuruoka twelfth twenty uncertainty understand understandable unlabeled upton using vancouver vargas variations various venkataraman vidal wang watson wells wiley winograd with within wolfgang work worked york http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.66 136 Entropy-based Concept Shift Detection Peter Vorburger, Abraham Bernstein University 
of Zurich Department of Informatics Binzmuhlestrasse 14, 8050 Zurich, Switzerland {vorburger, bernstein}@ifi.unizh.ch addresses advances algorithm algorithms also ample analogon application applications approach approaches based basis believe bernstein better both capable central choice classifiers coarse communications complexity computational computing concept conceptdrifting concepts conference constam context could current data definitions department detect detecting detection devices different direct disagreements discovery domain drift drifting dynamic ecml effective efficiency egger ensemble entropy environments european experimental explicit find findings formulation further future gain generalizability given haym helmbold hirsh icdm important increasingly indeed induction informatics information initial insight instance international interruptablity into investigate iwan journal kaufmann knowledge kolter kubat learn learning like long mach machine majority maloof martin measure measuring method minimizing mining morgan neural ninth nips noise offs optimized other outperformed pages paper parameters patrice pervasive petsche power prediction predictive presented press problem proc proceedings processing project promising publishers real references related resulting rivest robustness scenario section selection sensitivity sets shift shifts show sigkdd simple sixth springer stage stierli strategy streams subjects substantial such suitable support switches systems thank that their this time towards tracking trade tsymbal ultimately university usefulness using varying versus very virtual vorburger wang wearable weighted widmer with work working would zurich http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.84 74 1 acquire action agrawal algorithm algorithms also appears appl applications association average based baskets between beyond brin called conclusion condition conf conference considered consists correlation correlations cyclic cyclically data 
database databases dawak dblp defining developed different differs discovering discovery engineering episodes erttap execution existing expert extensive fast fcip figure find finding first florida follow followed frequent from future fuzzy generalizing gong have icde icdm identifying information international item items itemset itemsets kantarcioglu knowledge large length less likely logic mannila many market method minimumsupport mining model modeling models montreal motwani numbe number objects occurs orlando pages pair pairs pattern patterns periodic presented proceedings quebec ramaswamy real references related relationship repeated repetition rules sciences segment sequence sequences sequential sets sigmod silberschatz silverstein sixth srikant support swami syst taipei that those threshold time times toivonen toroslu torosluk traditional transactions type under underlying useful user usually verkamo very which wise work world zden zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.12 16 Active Learning to Maximize Area Under the ROC Curve Matt Culver, Deng Kun, and Stephen Scott Dept. 
of Computer Science 256 Avery Hall University of Nebraska Lincoln, NE 68588-0115 {mculver,kdeng,sscott}@cse.unl.edu above achieve acknowledgments active adaptive advances algorithm algorithms alternative analysis annual appears application applications apte area areas atlas authors available bagging balancing baram base based bayes below blake boosting brinker cancer cauwenberghs characteristic choice classification classifier classifiers closest cohn colic comments commercial committee comparing completed computational computing conf conference convergence cortes cost credit currently curve curves data databases dataset decis decremental detection diabetes discovery discussions diverse diversity drummond edition ensembles envelopes error estauc estimates estimation evidence examples experimenting exploitation exploration extending facility fifth figure found francisco frank freund funded further future games gaussian generalization good grant green hanley helpful hettich heuristics holden holte html http icdm ieee improved improving includes incorporating incremental indepedent information intelligence international intl ionosphere iyengar john journal kaufmann knowledge koller labeled ladner learning lehmann less lindenbaum lower machine machines making mamitsuka markovitch mccallum mcclish mcneil mean meaning median melville merz method methods minimization minimum mining mitra mlearn mlrepository model mohri mooney more morgan multiclass murthy nearest nebraska needed neighbor neural newman nguyen nick nonparametrics number online operating opper optimal optimising optimization orthogonal osugi overall pages parentheses park part pattern perform performance performer pillar place poggio points practical preclustering probabilistic probability problems proc proceedings processing psychophysics query radiology rakotomamonjy ranc random ranked ranks rasp rate ratio receiver reduction references repository resampling research reviewers rocai rosset rusakov sampling 
schohn scott selection selective seung shamir signal sixth smeulders software sompolinsky sons southey starting statistical strategies strong study support svms swets synthetic systems table target techniques testing text than thank that their then theory there this through tishby tive tong tools toward training trans under university useful using usps utilization vector vectors very visualizing volume vote well when wiley wilkinson will wins with witten work workshop xiao yaniv zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.25 4 Bayesian State Space Modeling Approach for Measuring the Effectiveness of Marketing Activities and Baseline Sales from POS Data Tomohiro Ando Graduate School of Business Administration Keio University 2-1-1 Hiyoshi-Honcho, Kohoku-ku, Yokohama-shi, Kanagawa, 223-8523, Japan andoh@kbs.keio.ac.jp abraham acceleration accurate achieve acknowledgement acknowledges actual advertisement akihiro allows also american analysis ando annals anonymous appear application applied approach association attempted attractive author auto available axis ball balls barnard baseline bayes bayesian because below best between beverage beverages biometrics biometrika blattberg boatwright bold bpic brands calculated calculations cambridge candidate carlin carlo chain chains chapman chemical chichester coefficient coffee cold comments complexity computer computing conclusions conference confidence considered considering consists constructive consumer cont correlated correlation correlations corresponding coupon covariance criterion daily dashed data dataset deals decision decompose dempster dependencies dependency dependent depends determine developed deviations different dimensional discount discounting discrimination discussion display distributional distributions dynamic each edge edges effect effects empirical employed eppen equation equations estimated estimating estimation evaluation explore exploring extraction factors fast field figure fitting 
fluctuations forecasting framework fred from function further future gaussian general generate gersch gilks grant graphical graphs grateful gratefully group gupta hall hence henderson here hierarchical high higuchi horizontal icdm identify identifying impact implemented implications important improving indeterminate indicate indicates individual inference information inoue institute integrals intelligence interesting international intervals into introduction investigate item items iterations japan japanese jelly journal kamakura kitagawa kondo kopalle large larger lecture letters levels leverage liebermann like likelihood limited linde line linear lines lodish louis lunch lunches machine management marketing markov marks marsh matrices matrix maze mcculloch mcmc mean means measures mela meng method methods metropolis mining model modeling modelling models monte more much multi multivariate music neslin nondurables nonlinear nonstationary noodle normative note notes numbers observation obtained operations order other packed pairwise paper parameters part partial performance physics planning poisson posterior practically practice predict predictive prelaunch price pricing prior priors problem proceedings product productivity promotion promotions proposed purchase quelch rain range recorded referees references regressive relationship reported represented research respectively restaurant result results retailer rice richardson robert rosenbluth royal sales sandwich sandwiches sato scale scanner science scientists scores seconds selection series sets shrinkage significant simultaneously since sinica sixth size smoothness society space specific spiegelhalter springer standard state statistica statistical statistics stochastic store strong structure study suggest suggestions supported switching system table tackling teller tellis terms that theoretical these thin this three tierney time titterington transactions trend tsuchiya type unit unobserved used using utilized 
value values vertical volatility week weekly were wermuth what when which whittaker wiley with would yamaguchi yogurt york young zufryden http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.76 84 Getting the Most Out of Ensemble Selection Rich Caruana, Art Munson, Alexandru Niculescu-Mizil Department of Computer Science, Cornell University {caruana, mmunson, alexn} @cs.cornell.edu able about acknowledgments across actually adaptive additive adds algorithm almost also alternative always amount analysis angelis anonymous appears arbitrary available average averaged averaging award backstrom bagging base bayesian because before benefit better between boosting breiman building burer calibrating calibration cardie caruana classification classifier classifiers close combining comes comments comparison competitive compute conclusions conference cornell counter crew cross crossvalidation curves data designing dietterich different directly domingos drafts duin during easily effect effectiveness effects eight embedded embedding emnlp ensemble ensembles equal except expected expensive experiments explain exploring factors faster figure finally first forests francisco frequently from full further fusion getting giacinto good greatly half having help helpful heterogeneous high higher hill hillclimb hillclimbing however hurt hurts icdm icml icpr improve improved improvement improvements increase increases increasing independent indicating individual inez inside intelligent international intuition journal kaufmann ksikes large lars learning less level libraries likelihood lower machine machines make margin margineantu mart methods metric metrics mining mismatch mizil model models more morgan most much multiple munoz munson nearly needed niculescu number ones opposed optimization optimizing ordered other others outperform outputs overfit overfitting paper performance platt practice predetermined predicting predictions press probabilistic probabilities problem proceedings 
programming provided provides proxies pruning published random rarely reduces reducing references regularized repeatedly report research resistant result results reviewers risk roli roughly rows said same scale scarce seems selection selective semidefinite september sets show sibling significant sixth size slightly small some streaming street suarez such suggest sums super superior supervised support supported surprisingly systems table target technical than thank that them then there these third this total train training truly tsoumakas university unsurprisingly used using validated validation variance varying vector vernazza version vlahavas were when while will with within words work would yielding yields york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.144 111 Similarity of Temporal Query Logs Based on ARIMA Model acmsigkdd addison agglomerative algorithms analysis andrade annual applications august backbone baeza beeferman berger book bursts chapman chatfield clustering communications company conference content control data development discovery discrete engine feedback financial forecasting fourier france groschwitz gunopulos hall hierarchical hill icdm identifying ieee information international interscience introduction jenkins johnson knowledge long longman louisiana management mcgill mcgraw meek mining model modern neto nsfnet online orleans paris periodicities polyzos prentice press proceedings psychometrika publishing queries query references reinsel research retrieval ribeiro salton schemes scientific search series sigir sigmod similarities sixth states statistic sundararajan term theory time traffic transform tsay united user using vagena vlachos wesley wiley words world yates york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.60 143 Distances and (Indefinite) Kernels for Sets of Objects Adam Woznica, Alexandros Kalousis, Melanie Hilario University of Geneva, Computer Science Department Rue General Dufour 24, 1211 Geneva
4, Switzerland {woznica, kalousis, hilario}@cui.unige.ch acta advances akutsu also angles approach artificial asymmetric axis based becker belong better between biological bollmann bruynooghe cambridge canu case class classification classifier computation conference costa cuturi data davy decomposition define defined desobry determinate dietterich directly dissimilarity distance distances domain duin ecml editor editors eiter elements esann european examine experimental experiments exploit extensions favorably finally first fitzgerald flach focusing frasconi from fukumizu functions future gains gartner gaussian generalized ghahramani goldman graepel graph groups have herbrich icdm icml improve improved induced inductive informatica information informative instance intelligence international jebara journal july kaufmann kernel kernels king kondor kowalczyk lathrop learning like logic lozano machine mahe mannila marginalized mary matchings matrix measures menchetti methods metric mining mixture more morgan muggleton multiinstance multiple mutagenesis networks neural obermayer october only other over pacl page pages pairs pairwise paper parallel pekalska perez perform performance perret point polynomial porto portugal positive predictive preliminary press principal problem proceedings processing programming proposed prototypes proximity putable ramon rectangles references regularization reported research results sammut sdorra selecting semigroup sets shashua show shown significant significantly sixth smola solving space specific srinivasan sternberg substitution such support svms symposium systems technique than that their these this time tsuda twenty types ueda unordered using vector vectors vert volume washington weighted whether which will with wolf workshop would wrobel york zhang http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.162 98 Robert Jaschk aachen acknowledgement actes additional afterwards again agreeing algebra algorithm algorithmen algorithms also 
always analysis analyzing appear applications approach artificial association avancees axis based bases bastide batagelj begriffsanalyse beitrag berlin biedermann binary bock boston cambridge charm chein classification closed closure collected common comparison computer computing concept concepts conceptual conclude conf conference connections consequently contexts covers darmstadt data datasets demanding denecke depicts dept different directly discovery discrete domingue donnees dordrecht ecos editor editors efficient ellis emes especially every expensive explained ferligoj figure folksonomies formal formalen found foundations frequent from funded further gained galois ganter general grows harvard heidelberg hierarchies hotho hsiao icdm icio ifcs implementation implications improvement increase information intelligence interested international jaschk journees june knowledge krolak lakhal large larger lattice lattices least lecture lehmann lenski level levinson like lnai logarithmically looking luder magnitude mannheim massive mathematical mathematics method mine mining mode model month more much mugnier nepomuk next notes number obiedkov observe october octobre only ordered organization orlik overall pages papers part pasquier peirce polytechnic powerset preprint press proc proceedings project provides providing pruning ranking references reidel relation rensselaer report research responsible restructuring results retrieval rich richter rival rule rules runtime scale scaled schmitz schwerdt science search seconds seen semantic sets shaker shown shows since sixth size small snapshot sowa speed springer start steep structures studies stumme such sure systems taouil technical than that theory they this three tools triadic trias triconcepts trilattices tripat triples trisets universit until users using verlag vocabulary volume what when which while wille wissenschaftsverlag with without wolff work zaki ziberna http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.139 
103 Searching for Pattern Rules agrawal agresti analysis approach april arrays association august based basic baskets bayardo beyond bound brin candidate categorical change coefficient computer conference contract correlated correlation correlations crossclassifications data database dataset department derivable detecting diego discovery efficiently exploiting extended fast fimi fischer free frequent generalizing generation goethals hamilton helsinki heun http icdm identifying ieee interesting international itemset john knowledge kramer kumar lattices market metric mining morishita most motwani muhonen orlando pairs pattern patterns pazzani pearson pods press proc proceedings pruning redundant references regina report repository reynolds rules science searching seattle sese sets shekhar siam sigact sigart sigkdd sigmod silverstein sixth statistical string strongly suffix support symposium system technical toivonen traversing tree tucson university upper using washington wiley with without xiong york zaki http://doi.ieeecomputersociety.org/10.1109/ICDM.2006.104 125 Minimum Enclosing Spheres Formulations for Support Vector Ordinal Regression advances algorithm approach approaches bhattachryya boundaries burges cheung classification classifier classifiers computation conference constraint core currently data datasets design details editors elsewhere especially extended fast formulations graepel herbrich icdm idea improvements information international investigating journal keerthi kernel kopf kwok large learning levin machine machines march margin methods minimal mining multiclass murthy nature neural obermayer optimization ordinal pages paper peled platt press principle proceedings processing proposed rank ranking references regression reported research results roth schol sequential sets shashua shevade sixth smola springer statistical support systems theory these this training tsang useful using vapnik vector verlag very will with york zimac