http://www.informatik.uni-trier.de/~ley/db/conf/icdm/icdm2003.html ICDM 2003 http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780363abs.htm 45 Association Rule Mining in Peer-to-Peer Systems Ran Wolff and Assaf Schuster Technion - Israel Institute of Technology Email: ranw,assaf @cs.technion.ac.il agrawal algorithm algorithms anti approximate association associations august barbara bases basket beach berkeley between brin california chakravarthy cheung chile china communication condor conf conference constrained controllable correct counting counts data databases december devices discovery distributed dunham dynamic efficient engineering entropia fast florida frequency hipc home hong http icde icdm ieee imielinski implication incremental information international items itemset june karypis knowledge kong kumar large management manku market miami mining motwani namely number ogihara output over pages parallel parameters parthasarathy proc proceedings project record references rules santa santiago scalable schuster september seti setiathome sets shafer sigmod size skew srikant streams supplied swami systems that third thomas transactions tsur ullman united user very vldb washington wisc wolff zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780505abs.htm 66 Icon-based Visualization of Large High-Dimensional Datasets Ping Chen Chenyi Hu Wei Ding Heloise Lynn, Yves Simon academic advances ahrens american arizona association avalos behavior beijing bruckner california canada challenges chernoff china christopher color colors combining computer conference context cord cybernetics data datasets details diego diffusion dimensional display displays dynamical efficient enns environment exploratory exvis faces facesto fective foley generation glance graphical graphically graphics grinstein grller healey help high higher icdm iconographics icons ieee images integrated interface international issue jacobs james journal kremers laidlaw large 
levkowitz lffelmann local london merging method mining mouse multidimensional multiple multivariate next overall pages parameter people perception phoenix pickett points press proceedings readhead references represent representation republic ribarsky rosenblum same scientific shenyang space spinal states statistical summary systems tensor texture textures third time tools transactions united visualization visualizing volume wang wegenkittl williams york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780743abs.htm 125 Applying Noise Handling Techniques to Genomic Data: A Case Study Choh Man Teng Institute for Human and Machine Cognition 40 South Alcaniz Street, Pensacola FL 32501, USA academic account accuracy across additional against algorithm algorithmic always approaches artificial attributes automated between biology biomolecular biopolymers case cases circumstances clark classification collagen combination comparison computed concept conference connections correcting correction data databases decision desirable diagnosis differences different discarding discovery disease duplicated dzeroski each editors effective efficient elimination energy engineering experimental explore fairly feature features filtering finding first florida forthcoming fourteenth free from gamberger gave genomic give handling helpful huang hunter icdm ieee imperfecta improvement increase increased induction inductive information intelligence intelligent interaction interest international john kaufmann klein kluwer knowledge kollman large lavrac learning like likely little lose loss machine makes many measure mechanism mechanisms medical methods mining molecular mooney more morgan motoda mutations netic niblett noise noisy number obtained opposed osteogenesis outliers pages peptide performance point polishing portion possible presence proceedings programs pruning publishers quinlan redundancy references relevant remaining removing research resulting results rise robust 
selection seventh since sixteenth society some sparse springer stable straightforward study such suggested systems techniques teng that their theory these third this trees under useful using values verlag very while with workshop would http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780043abs.htm 5 Optimized Disjunctive Association Rules via Sampling accuracy agrawal algorithms andm approach appropriate association attributes bases between birmingham blumer buchmann bunemanands case categorical census chervonenkis comparison complexity computer conf conference considered consistent convergence data database databases demonstrated derived dimension dimensional discovery disjunctive editors ehrenfeucht engineering evaluation experimental experiments explorations fifteenth finding forestry fornumericattributes fraction fukuda general given haussler here hipp icdm ieee imielinski indeed indicate informationsystems international issues items jacm jajodia journal july kaufman knowledge large learnability lecture level lnai management mathematical meersman minconf mining miningoptimizedsupportrules mohan morgan morimoto morishita nakhaeizadeh notes numeric occur october offer ogihara optimal optimized optimizing pages parthasarathy particular pkdd practitioner press previous principles proc proceedings proceedingsofthe proe promises quantitative random rastogi rastogiandk real reasonable records reduction references relaxation research reservoir results ride rule rules sample sampling sarda scheme science section sets shim sigact sigart sigkdd sigmod size software springer such suggest support survey swami symposium systems that theoretical theory these third together toivonen tokuyama tradeoffs transactions tzer unnecessary using vapnik verlag very viewed vijayaraman visualization vitter warmuth washington when wijsen will with workshop zaki zelenko http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780059abs.htm 7 Identifying Markov Blankets 
with Decision Tree Induction aaai abramson adaptive agriculture alarm algorithms aliferis also aluallim anaheim analysis andersen appear approach arnone artificial attribute barley based bayesian beinlich belief bell berlin binder blanket blankets both brown building cambridge camda cardie caruana case causal causation chavez cheng children computation computational computers conference cooper darden data davidson decision denote design development dietterich direct discovery distribution edwards efficient electronics eleventh european every expert expression faithful feature features filters flairs forecasting freitag from function genomic glymour greedy greiner growing hailfinder hardwiring have hidden hugin icdm ieee improve induction inference information intelligence intelligent international irrelevant jensen john joint journal kanazawa kaufman kelly knowledge kohavi koller kristensen large learning leukemia local machine malting many margaritis markov mateo medicine method micorarray mining monitoring morgan murphy national neapolitan neighborhoods network networks ninth node olesen optimal organization other parents pearl pedersen pesticides prediction press principled probabilistic proceedings proof proposition quinlan rasmussen reasoning references regulatory relations relevancy report russell sahami same sample scale scheines search selection separates severe shell sigkdd sons spirtes spouses springer statistics statnikov study subset suermondt support system systems technical techniques tenth that then theorem theory third thirteenth thrun time toward towards trees tsamardinos unique universes using variable variables verlag weather weinberg wiley winkler with without workshop wrappers http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780685abs.htm 111 Postprocessing Decision Trees to Extract Actionable Knowledge about acknowledgment action addition agrawal algorithm algorithms always amount around association bases because best both 
canadian charles chen close combinations compare comparison conclude conclusions conference considering contributed customer customers data databases deployment described desired direct discovers discovery discussed effective effectiveness efficient efficiently engineering enterprises evaluate expectations experiments exponentially fall fast fayyad field figure find finding forms formulated found fourth from future government grant greedy gregory groups guarantee hand higher hong icdm ieee important improves increases increasing intelligent interesting international irrespective issue issues kaufmann keim knowledge kong kriegel large larger least limited ling magazine marketing maximal maximize mining models more morgan much needs note nserc number obtain obtained offer optimal order oriented other over pages paper patterns piatetsky possible postprocessing problem problems proceedings profit property provided qiang quality real reasonable references relationship research resources result results rules runtime same scale seconds september sets shapiro shown shows small solution solutions special specific srikant status supported techniques terminates than that third this those tielin time transactions transformed usama user using values versions very visualization visualizing vldb well which will with work world yang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780681abs.htm 110 Center-Based Indexing for Nearest Neighbors Search Arkadiusz Wojna Institute of Informatics, Warsaw University ul. 
Banacha 2, 02-097 Warsaw, Poland wojna@mimuw.edu.pl acceleration access acknowledgments acta algorithm allows almost andrzej associative author average based bentley better binary bisecting blake boley bound branch brin california chicago classification classifier combined combining committee communications comparison composite computation computations computer computers computing conference correspond criteria criterion data databases directly distance effective effectiveness engineering factors finkel first from fukunaga fundamenta gaede general gora grant grants grateful gunther html http icdm ieee included indexing induction informatica informaticae information instance international irvine iterative kalantari keys large learning letters linear logic machine mcdonald means measure measured merz methods metric mimuw mining ministry mlearn mlrepository multidimensional narendra near nearest neighbor neighbors number operations pages paper particularly partment pddp performance point polish present presentation presented problem procedure proceedings processing professor proximity pruning quad queries real references remarks repository research retrieval riona rses rule satisfying savaresi science scientific search searching sets several siam similarity single skowron software spaces splitting state step structure structures summary supported surveys system technology than that third this time times transactions tree trees twenty uhlmann university used useful very what while with wojna work http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780625abs.htm 96 Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques Huseyin Polat and Wenliang Du Systems Assurance Institute Department of Electrical Engineering and Computer Science Syracuse University, 121 Link Hall, Syracuse, NY 13244 E-mail: hpolat,wedu @ecs.syr.edu accuracy aggregate agrawal algorithmic algorithms along analysis annual anonymity anonymizer applying architecture 
artificial august based believe bergstrom borchers breese canny collaborative communications compared compromise computer conference cooperative crowds dallas data development disclosed disclosure disguised does empirical especially factor filtering finland fourteenth framework further gordon grouplens heckerman herlocker http iacovou icdm ieee improved information intelligence international july kadie konstan madison maltz management march miller mining more much netnews news oakland open original pages performing prediction predictive preserving privacy proceedings references reiter research resnick retrieval riedl rubin scheme security sigir sigmod srikant suchak supported symposium system tampere that third those transaction transactions uncertainty usenet users whose with work http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780243abs.htm 30 Efficient Nonlinear Dimension Reduction for Clustered Data Using Kernel Functions Cheong Hee Park Haesun Park Dept. of Computer Science and Engineering University of Minnesota Minneapolis, MN 55455 chpark@cs.umn.edu hpark@cs.umn.edu acadamic advances algorithm analysis anouar appear applications approach based baudat billings burges cambridge centroids cities classification component computation computations computer conference cost cristianini criterion data decomposition department dimensional discovery discrimi discriminant dortmund douglas duda edition editors eigenvalue engineering england error extraction feature first fisher fukunaga function functions generalized generalizing golub harlow hart hopkins howland html http icdm ieee information input international interscience introduction invariant janardan jeon joachims johns kernel kernels knirsch knowledge large larsen learning least loan longman lower machines making mathematics matrix methods mika minimum mining minnesosta mlearn mlrepository muller nant networks neural nonlin nonlinear numerical optimization orthogonal other pages park pattern 
practical press problem problems proceedings processing ratsch recognition references report reports representation reproducing rosen roth saitoh scale scholkopf science scientific second september shawe signal singular smola space spaces squared squares statistical steinhage stork support systems taylor technical text theory third transactions tutorial twin undersampled universitat university using value vector versus viii weston wiley wilson with york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780155abs.htm 19 Direct Interesting Rule Generation aaai abstract acmkdd admissible agrawal algorithm algorithms analysis apriori arizona arti association associations august avoids based basket baskets bawa bayardo between beyond bhalotia boswell brin brussels california canada candidate charging chen chile cial cient clark class cohen computer conf conference constraintbased consuming correlations counting criteria dallas data database databases datar dense design dhar dimensional directly discovery driven dynamic editor edmonton effective eighth enabled engineering ewsl excluded experimentally experiments fast faster fifth finding forwardly frequent fujiwara fukuda generalizing generate generating generation gionis group gunopulos haritsa hash hill hong icde icdm ieee imal imielinski implication improvements induction inductive indyk information informative intelligence intelligent interest interesting interestingness international intl items itemset journal june knowledge kohavi kumar large lavrac learning less machine management market mason massive mcgraw measure memory menlo michalski mining mitchell montreal morimoto morishita most motwani mozetic ogihara optimal optimized opus orsay overview page pages paris park parthasarathy pattern patterns performance piatetsky post prediction presentation press proc proceedings proceedinmgs properties proved prunable pruning quebec real recent record redundant references research right rule rules santiago 
scheme search selecting sets seventh shah shapiro shen shenoy showed sigkdd sigmod silverstein smallest society some special srikant srivastava strong sudarshan support swami system systems texas than that these third through time tokuyama tokyo topor transactions tsur tucson turbo tuzhilin twentieth ullman undergo uninteresting universite unordered uses using vertical very visualization volume washington webb with without world yang york zaki zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780533abs.htm 73 Comparing Pure Parallel Ensemble Creation Techniques Against Bagging Lawrence O. Hall, Kevin W. Bowyer1, Robert E. Banfield, Divya Bhadoria W. Philip Kegelmeyer2 and Steven Eschrich Computer Science & Engineering, University of South Florida, Tampa, Florida 33620-5399 {hall, rbanfiel, dbhadori}@csee.usf.edu adaboost advances algorithms analysis annual bagging bauer belmont boosting bowyer brazdil bregman breiman builder cambridge characterization chawla classification collins comparison computational conference constructing cybernetics data databases datasets decision dept dietterich distances distributed domingos empirical ensembles evaluation experimental finite forests francisco frank friedman from gama group hall html http hulten icdm ieee implementations infinite information intelligence international irvine java kaufmann kegelmeyer kohavi large learnability learning letters liacc like logistic machine mateo merz method methods mining mlearn mlrepository moore morgan murphy neural olshen pages parallel pattern performance practical predictors press proceedings processing programs project quinlan random randomization recognition references regression report repository schapire singer springer statlog stone strength subspace systems technical theory third thirteenth three time tools transactions tree trees univ variants very visualization voting wadsworth weak with witten 
http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780403abs.htm 50 Mining Plans for Customer-Class Transformation about acknowledgment acting actionable actions actual adaptive agents agrawal algorithm algorithms andconquer apers applications apply approach approximation april artificial assume august auplan authors barto behavior between boston both bouzeghoub broad cambridge canada cassandra cathala chen computer conclusion conference consider costsensitive could data database databases dayal decision demonstrate direct directions discovery divide dmkd domains edbt editors edmonton effective efficient efficiently eighth engineering execute executed executions exhaustive experimental extending failures following formulate fourth freespan frequent future gardarin generalizations germany government grant growth hall have heidelberg high hong icde icdm ieee important improvements integrated integrates intelligence intermediate international introduction issue issues journal kaelbling kaufmann knowledge kong large learning lesh ling littman machine make making march marketing masseglia mateo mining modern modified moore more morgan mortazavi norvig observable observed obtain ogihara only optimal pages part partially pattern patterns pbulishers pednault performance pinto plan planmine planning plans pomdp poncelet possible prefix prefixspan prentice press principles priori problem problems proceedings programming projected proposed quality quinlan real references reinforcement report research results returns river robot robots russell saddle scalability science search sequence sequences sequential sessions sigmod society solution solutions spade special springerverlag srikant states stochastic such supported survey sutton taipei taiwan technical technology test tests than that third truly university unsupervised upper utilities utility volume when will with without work workshop world yang york zadrozny zaki 
http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780299abs.htm 37 Introducing Uncertainty into Pattern Discovery in Temporal Event Sequences Xingzhi Sun, Maria E. Orlowska, and Xue Li School of Information Technology and Electrical Engineering The University of Queensland, Australia {sun, maria, xueli}@itee.uq.edu.au accommodate agrawal algorithm algorithms applicable apriori association asynchronous based between called case caused challenges chen compared complexity computational conference containing counting covered data database dayal defined definition designed discovery dmkd edbt effective efficient efficiently environment episodes event eventoriented events extended fast fault find finding freespan frequent general generalizations growth have hellerstein however icde icdm ieee improvements inaccurate increases infominer international interval intervals into introduced introducing knowledge long mannila matching maximal measure metric mining model modified mortazavi noisy number order orlowska otherwise overlap overlapped pages pakdd paper part partially pattern patterns performance periodic periods pinto precise prefix prefixspan probability problem problems proc proceedings projected proposed provided references risk rules sequences sequential series sigkdd sigmod significance solution srikant still support surprising temporal that there these third this time toivonen tolerant traditional tung type types uncertain uncertainty unit unknown verkamo vldb wang will with within yang zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780339abs.htm 42 Visualization of Rule's Similarity using Multidimensional Scaling Shusaku Tsumoto and Shoji Hirano Department of Medical Informatics, Shimane University, School of Medicine, Enya-cho Izumo City, Shimane 693-8501 Japan tsumoto@computer.org, hirano@ieee.org about academic acknowledgement active adams adopt advances amsterdam analysis another applications approach approximation areas 
artificial assign assigns attribute automated based between boca bradshow busse cacm cambridge capture cartesian chapman clinical cluster complexity computational conclusion conference coordinate could covering creative culture data databases define dempster difference dimensional discovery dissimilarities distance domain dordrecht each eckart edition editor editors education evaluated everitt evidence experimental expert experts explorations extracting fayyad fedrizzi finally flood found from frontiers function further future gives grant grzymala hall hill hold icdm ieee implementation indice indices induction information intelligence international into intuitive japan john kacprzyk kluwer knowledge langley london lower made matrix mcgraw measures medical method mining ministry motoda multidimensional near neurology nonmetric number others pages pairs paper pawlak piatetsky plane point preliminary press principles priority proceedings process processes property propose psychometrika publishers rank raton references relations reported research results rough rule rules scalability scaling science sciences scientific semantic sets several shafer shapiro show shows similar similarities similarity similiaries simon since skowron smyth sons sports stress studies study such supported supporting syntactic system technology that theory these third this three transitivity tsumoto types useful value victor viewpoint visualization volumes which whose wiley will with work yager york young zytkow http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780629abs.htm 97 Semantic Role Parsing: Adding Semantic Structure to Unstructured Text Sameer Pradhan, Kadri Hacioglu, Wayne Ward, James H. 
Martin, Daniel Jurafsky Center for Spoken Language Research, University of Colorado, Boulder, CO 80303 aarseth advances algorithm analysed annotating argument arguments artificial augmentations automatic baker bartlett berkeley best bies bikel boulder cambridge canada canary categorial center chaniak chen chunking classification classifiers clustering coling colorado combinatory comparison computational conclusions conference converting cooccurrence corpora cslr current daniel data deep detailed detection discovery drawback each editors edmonton emnlp entity estimators extended extraction features ferguson fillmore first framenet france from generalizes gildea grammar grammatically groups hacioglu harabagiu have head hockenmaier hofmann human icdm identification identifinder identifying ieee immediate improvement improving independent indicate information institute intelligence international into islands japan jurafsky karin katz kingsbury kipper knowledge krugler labeling labels laboratory language large lattice learning learns letting like linguistics lowe lrec machine machines macintyre made mapping marcinkiewicz marcus margin martha martin massachussetts memo miller mining model models montreal name named nature necessity number original other others ours outperforms output overall overcome palmas palmer parser parsing penn performance philadelphia plan platt power pradhan precision predicate press probabilities probability proceedings project propbank providing puzicha ralph rambow recall recognition references replaced report reported research resulted results retrieval role roles sapporo schasberger schuurmans schwartz scolkopf score scott sean semantic sentence shallow significantly similar small smola source spain spoken springer statistical structure structures substantial such support surdeanu syntactic system table tagger target tasks technical technology test thank that their thematic then theory third this those toulouse treebank using valerie vapnik 
vector verlag wallis ward weischedel what which williams with word words work working would york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780727abs.htm 121 Text Mining for a Clear Picture of Defect Reports: A Praxis Report army based categorization chulani conference control customer data davidson decision deriving different document documents enterprise escom european figure file from goetz html http icdm ieee index induction information international introduction itlpubl johnson journal june leszkowicz march metrics minimal mining moore oles portal proceedings quality references related rule santhanam satisfaction service shtml size software symbolic system systems text third tree uncategorized venkataraman view ward zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780529abs.htm 72 The Rough Set Approach to Association Rule Mining about american amir appear approach argue artificial aspects association associations attributes aumann bell classification collection computational concept conference confidence connections cooccurrences data databases demonstrate discover discovery document documents european feldman found france fresko guan however icdm ieee improve indicate information intelligence international introduce issue journal keyword kinar kloesgen kluwer knowledge lecture level lindell liphstat many maximal maximally means method methods mined mining mohamed moreover much nantes notes pawlak pkdd principles proceedings quafafou quality rajman reasoning references regular rough rule rules schler science september sets similar simpler society some special springer strong symposium systems term text than that then theoretical theory they third tool topic using which while with yehuda zamir zilberstain zytkow http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780371abs.htm 46 MPIS: Maximal-Profit Item Selection with Cross-Selling Considerations academic accesses acknowledgements advices agrawal algebra 
algorithm algorithms almaden also applications approach association assortment authoritative authority beasley believe between binary blischok both brijs byte candidate case center certain chain changed climbing clustered codes college combinatorial completeness comput computationally computers conf conference cross cuhk cybernetics data database databases dataset dbminer decision decisions definition derivatives determine different discovering discovery discrete discussed done dynamical earmarked edition effect effective efficient enhanced environment environments epileptic every executive experiments expert fast fifth finite formulation framework freeman frequent frontline functions future futures garey generalized generation generator generous global goethals gold grant guide hall hedberg help heuristic heuristical highly hill hiller hohenbalken hong horst html http hull hyperlinked iasemidis icdm ieee imilienski imperial included information international internet intractability introduction item items jacm johnson journal kleinberg kluwer knowledge known kong large lecture leon lieberman like linear london loss management mannila mathematical maximize mcgraw method methodologies methods microeconomic mining model modeling mpis much notes number october operations optimal optimization optimizing options other page pages papadimitriou paper parashar pardalos patterns pattipati ping polytopes predictability prefetching prentice problem problems proc proceedings product profit profset programming propose providing pseudoconcave publishers quadratic quest quin raghavan rank ranking real references reflect related report research retail rule rules rush sackellares safronov sahni school searching second seizures selection selling servers sets seventh shiau shopping show siam sigkdd sigmod soft solution solve solver source sources srikant stanford store story study such supermarket supported swami swinnen symp syndata synthetic system systems technical technology 
tells thank that theory these third this thoai toivonen total transaction ullman unconstrained used user using vanhoof verkamo view vldb volume wang ways wets which wide willett with without work world would wwwdb http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780323abs.htm 40 Learning Bayesian Networks from Incomplete Data Based on EMI Method algorithm algorithms annual anonymous approach approximating approximations aril artificial bacchus based bayesian belief bell bound buntine causal cheng chickering china chow collapse combination comments computational conf conference cooper data december dejong dependence discrete distributions efficient engineering evolutionary fourteenth friedman from gecco geiger graphical greiner guide heckerman helpful herskovits hidden hong icdm ieee incomplete induction inference information intelligence intelligent international journal july kaufmann kelly knowledge kong laskey learning likelihood literature machine management marginal mateo method mining missing morgan myers national networks november pages pakdd pearl plausible poly principle probabilistic probability proceeding proceedings ramoni readers reasoning rebane recovery references sebastiani singh sixth statistical statistics structural systems thank theory third tian transactions trees uncertainty using variables with http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780621abs.htm 95 Structure Search and Stability Enhancement of Bayesian Networks Hanchuan Peng and Chris Ding Computational Research Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA, 94720, USA Email: hpeng@lbl.gov, chqding@lbl.gov about algorithm algorithms analysis andobservationaldata andyoo approach based bayesian being belief bell bioinformatics buntine causaldiscoveryfroma cell cheng chickering clustering collaborative compendium conf conference cooper data davatzikos dependency detecting ding discovery efficient elidan expression 
filtering friedman from fromdata fromperturbedexpressionprofiles functional geiger genes genetic guide hard heckerman herskovits hughe hulten icdm ieee ieeetrans ieeetranskde images imaging induction inference inferring information intelligent international jordan kadie knowledgemanagement koller kuijpers larranaga lbnl learning learningingraphicalmodels literature machine machinelearning machinelearningresearch macrotonano medical meek method methods microsoft microsoftresearch mining mitpress mixture modular morgankaufmann morphological murga network networks networksisnp ofexperimental pami pearl peng performanceanalysisofcontrolparameters plausibleinference poza preliminary probabilistic probabilisticnetworksfromdata proceedings profiles reasoning references regev regulated regulatory research rounthwaite sanmateo statistics structural structure study subnets subnetworks symp systems technicalreport theory third toolkit tutorial using visualization winmine with yurramendi http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780195abs.htm 24 Parsing Without a Grammar: Making Sense of Unknown File Formats Levon Lloyd and Steven Skiena Department of Computer Science State University of New York at Stony Brook Stony Brook, NY 11794-4400 {lloyd, skiena}@cs.sunysb.edu accept accepting accuracy accurate acknowledgments addison alberto algorithmic algorithms alternate alternating analysis analyzing another apostolico appears arbitrary around attempts average baltimore because begin between bimodal biology both browsing cambridge case categorization certain changes character characters cliffs clustering collaborative collected complexity components computational computer conclusions concrete conference conjectured consecutive context correctly could crawl criteria data delimiter delimitering delimiters delimiting determining differences difficult digital direction directions discrete discussions distance distances distinguish distribution divide dubes each ecsl 
edition eliminating englewood english even example experience extracting extracts fairly file files finally find first format formatted formed frequency frequentlyoccurring from general given graham greater greatly gusfi hall have icdm ideal identify identifying ieee improve incorporate independent information input interesting international into issues jain january joint knuth large last latex lead least length libraries library lifantsev like long lorie major manning mark maryland match matching mathematics maxim mining most multiple must needed negative neighboring nevill notably number numbers occurrence occurrences open pages pair pairing particular patashnik performing periodic periodicity plaintext positive postscript potentially practice predictions prentice presented preservation press printable proceedings program project putative quartile quartiles quotation quotations ranking ratio recognition recognized reduce reed references reflect rehttp remain represent research results rich roanoke robustly robustness satisfies science score second self separations sequence sequences several short shown siam sign since small software sort source specific states string strings strongly structural structure structures such sufficiently sunysb symbol symbols symposium system table techniques term test text than thank that then these third this thus times tool topic trees trial type united university unix unknown used using utility values very virginia well wesley when which will witten would yuntis http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780139abs.htm 17 Localized Prediction of Continuous Target Variables Using Hierarchical Clustering Aleksandar Lazarevic1, Ramdev Kanapady2, Chandrika Kamath3, Vipin Kumar1, Kumar Tamma2 adaptive ahpcrc aleks algorithm analysis applications approach arlington artificial asce assessment backpropagation based baxter berlin bicanoc boston braun breiman calculations carnegie caruana chen civil clustering complex 
computation computers computing conference congress conjugate connection continuous continuum criterion cross damage data databases detection dimensional direst discovery diverse document edelman engineering estimation experiments experts faster feature feedforward francisco friedman functions gradient griffths hagan hajela hierarchical icdm ieee incomplete information integrity international interpretation intrator jacobs jordan journal kamath kanapady karypis kluwer knowledge kumar lazarevic learn learning lecture linear list localized machine make marquardt mathematics mellon menhaj method minimization mining mitchell mixtures modal models multiple multitask multivariate networks neural nonconvex nondestructive notes numerical obradovic pittsburgh powell predicting prediction proceedings references regression report representation responses reston riedmiller royal rprop sandhu science selective sensitive series siam society spatial springer statistical structure structures structuring suitable sullivan szewczyk tamma target task tasks technical tests theoretical third thrun training transactions transfer university using variables verlag with zhao http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780697abs.htm 114 Clustering Item Data Sets with Association-Taxonomy Similarity Ching-Huang Yun*, Kun-Ta Chuang+ and Ming-Syan Chen*+ Department of Electrical Engineering* Graduate Institute of Communication Engineering+ National Taiwan University Taipei, Taiwan, ROC E-mail: chyun@arbor.ee.ntu.edu.tw, doug@arbor.ee.ntu.edu.tw, mschen@cc.ee.ntu.edu.tw acknowledgement adherence agrawal algorithm algorithms also assess association attributes authors bases basket categorical categorybased chen china chuang cikm cluster clustering conference council data databases devised discovery education engineering experimental extensions fast features guha huang icdm ieee information international item items knowledge large management market means mining ministry national 
outperforms pages paper part prior proceedings project quality rastogi real references republic results robust rock rules science september sets shim shown significantly srikant supported taiwan taxonomy that third this transactions using validated values very wang with works http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780723abs.htm 120 Understanding Helicoverpa armigera Pest Population Dynamics related to Chickpea Crop Using Neural Networks able abundance accuracy advance agricultural agriculture analysis andhra applications applied april arid armegira armigera arnold assessment association attack avoid avoiding being bhatnagar bibliography biological chemical chickpea china climatic coefficient comprehensive concepts conclusions conducted conference continues control correlation crop crops data databases dean decrease decreased decreasing deficits delay delhi department dhandapani disease dynamics eastern economic education effectiveness enemies entomology environment expected experiment experimental experiments extracting false farmers figure forecasting foundation given growing hall haykin helicoverpa help here high hits horticulture host however hyderabad icdm icrisat ieee importance improved increase increasing india indian influence insect institute international jackson jiawei johnson journal kamber kanojia kaufmann king larvae losses management manjunath micheline mining missed models more morgan multivariat natural network networks neural next number office over parameters particular patterns pawar peaks pearson performance pest pesticides pests pimbert plant plants population possible pradesh predict predicted predictions predicts prentice probability proceeding proceedings programs protection quality rainfall regional research resources result results richard rules sciences semi sets shen show shows significantly simon simple simulation sithnantham sprays srivastava states statistical stochastic summary surveillance technique 
techniques that their theories these third this three trivedi tropics united university unnecessary using value volume weather week weeks were wichern with workshop would zhao zhongua zuorui http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780497abs.htm 64 Towards Simple, Easy-to-Understand, yet Accurate Classifiers above accuracy acknowledgements applied asimov bagging based been best bottom bound boundary bred breiman buja cabrera calibrated caragea carver cases classes classifier classifiers classify college colors combination combining comp compensated computational computer computing conference constructed controls cook data deere diego dietterich different dimensional directions distributed drawn drish each ease easily ensemble estimates even exist expect experiments explored figure first foundation francisco from gaining generated graduate grand grants graphical grid groups here highdimensional higher honavar hurley hyperplane hyperplanes icdm identify ieee insights international into joachims john journal kernel kernels large learning lecture linear linsvm literature loss machine machines making manual methods mining more multidimensional national nature nested normal normals notes obtaining other outperforms over overlaid part particular pattern performance pioneer planes plot points practical predictors press probability proceedings projection projections purpose pursuit randomly references results runs scale scheme schemes science scientific second separating sets show shown shows siam similar smaller solution some spaces sphere springer statistical statistics support supported surface tangent that theory third this three through tool tour understanding underway uniformly used using vapnik variance various vector vectors verlag viewing visual visualization visualize visualized ways weighted weighting what where which while will with work would york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780443abs.htm 55 CBC: Clustering 
Based Text Classification Requiring Minimal Labeled Data Hua-Jun Zeng1 Xuan-Hui Wang2 Zheng Chen1 Hongjun Lu3 Wei-Ying Ma1 adam adjust advances aided algorithm algorithms also analysis analyzing applicability approach apte assign assumption automated automatic background based bayes because been before best bhavani blum cannot capturer cardie carried caruana case categorization chute cikm classification classifiers classifying clustering colt combining comparable comparison conclusion conference considered constrained constraint constraints cotraining could damerau data decision dempster developments discriminative documents ecml edinburgh effectiveness enhance enough evaluate examination example examples existing experiments exploited fall feature features ferra focused form forty from further future generally generative ghani given guide herman higher icdm icml ieee impact incomplete independence inference information initial instance international into joachims jordan journal kamvar klein knowledge kowalczyk kowldege label labeled labels laird large largest learning level lewis likelihood link linoff local logistic lower machine machines mainly making maning many mapping masand maximum mccallurn means measure memory mentioned method methods minimal mining mitchell most must naive natural nature neural news nigam oles ones only other over paper perception performance plan point poor predictor presented prior probability problems proceedings processing provide raskutti reason reasoning references regression relevant report reported retrieval rogers royal rubin rules salton science seeger selection semi series showed sigir sigkdd similarity simple size small smallest society some sophisticated space springer starting statistical stories study such sufficient superior supervised support systems technical text than that theory there these they third this thrun tois training transductive tsvm understand university unlabeled usability using value vapnik variance vector 
verlag version very wagstaff waltz weak weiss when which while with work works yang york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780451abs.htm 56 Regression Clustering Bin Zhang Hewlett-Packard Research Laboratories, Palo Alto, CA 94304 bzhang@hpl.hp.com academic algorithm algorithms analy analysis annaul anzmac appear april august available based before benefit berkeley berlin california center century challenge chapman chaudhuri chicago classification cluster clustered clustering clusters clusterwise cluterwise comparison computation computing conference converge corn correction costa data databases dataset datenanalyse dayal dempster density desarbo discovery disovery dissection dissertation duda dynamic edition elkan estimating estimation extensions facing fast fietz fifth finite first fitting fixed france free from functioncenters fuzzy gaffney gaul generalized hall hamburg hamerly harmonic hart hastie hennig html http icdm ieee ijcnn incomplete information institut international intl introduction john joint journal july junge kluwer knowledge korea krishnan laird lazarevic learn learning likelihood likelyhood linear locarek lyon machine macqueen madigan market marketing mathematical mathmatsche maximum mclachlan means method methodology methods mining miximum mixtures modellen models montgomery multivariate networks neural neyman nips number obradovic obser ordering pakdd paper part partial pattern peck performance pinto point preprint press probability proceedings processing publishers references regression research royal rubin scene segmentation seoul sept series setting seventeenth siam silverman simultaneous smyth society some sons south spath spatial spatio springer stanford stat statistic statistical statistics steenkamp steps stochastik structuring symposium systems temporal third tibs tibshirani torgo trajectory under universitat university unsupervised using vations vining visionary walther washington wedel weighting wiley 
williams workshop xvii york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780617abs.htm 94 Mining Frequent Itemsets in Distributed and Dynamic Databases M. E. Otey C. Wang S. Parthasarathy A. Veloso W. Meira Jr. agrawal algorithm algorithms alsabti amount area arlington association associations august because become becomes between bodagala cheung cierniak cisrc clusters communication compile computer conclusions conf conference considered contrast data databases deal demon diego discovered discovery distributed dynamic efficient engg engineering evolving examined experiments fast frequent future ganti generated global greater harder heterogeneous high icdm ieee incorporate increases incremental international investigating involved involves issues itemset itemsets journal knowledge large latencies local maintenance many meanwhile meira minimize minimizing mining model monitoring more network networks number offs only order otey pages paper parallel parthasarathy possibility presented problem proc proceedings query ranka references relatively research response resulting results rules sacrifice sampling scheduling shafer sites skewness small smaller some support technique techniques tend there these they third this thomas threshold time trade trans transactions transferred update updating updation using veloso volume wang when where whether which wide will with work workstations zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780537abs.htm 74 Improving Home Automation by Discovering Regularly Occurring Device Usage Patterns Edwin O. Heierman, III Diane J. 
Cook aaai able accuracy achieves actions activities adaptive addition additional agent agrawal algorithm algorithms also analyzed anomalous anticipate applying approach april architecture august auto automate automatically automation between bhattacharaya canada coen collected communications complexity component comprehend compression conclusions conference consists contained cook correctly correlation currently daily data dataset datasets davison december decision demonstrated design detect determine device discover discovered discovering discovery domain econds efficient empirically engineering enhance environment environments episode episodes evaluating event excessively figure finally following frequent from future generated have heierman helps highlighted hirsh home icde icdm identify identifying ieee improve improved improvement incorporate incorrectly increased indicate information inhabitants intelligent intend interaction interactions international intervals investigating itemset itemsets knowledge laboratory large learner likelihood linear mannila march mavhome mechanism membership mine minimums mining minutes month montreal mozer multiple must nature near needs number observed occur occurred occurrence occurrences occurring once only optimal part participants patterns penultimate performance plot predicting prediction predictive principles proc proceedings processing provides providing randomly range ranging references regularity report residence results role runtime rutgers sequences sequential serve several should show significant smart spring srikant stanford state statistical symposium synthetic systems taipei taiwan technical techniques test testing tests that third this three time times toivonen unable university used user value values varying verkamo versus week weekly which width widths will window windows wireless with within work york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780709abs.htm 117 A K-NN Associated Fuzzy 
Evidential Reasoning Classifier with Adaptive Neighbor Selection Hongwei Zhu, Otman Basir Department of Systems Design Engineering, University of Waterloo 200 University Ave. W., Waterloo, Ontario, N2L 3G1, Canada h4zhu@engmail.uwaterloo.ca, obasir@uwaterloo.ca accuracies accuracy acknowledgments adaptive algorithm algorithms almost also applications appreciate approximate associated authors automation based basir best both bruzzone cagliari choice classification classifier classifiers combination comparative compared comparing computational conclusions conference constructing cover curves cybern data demonstrated dempster denoeux denoex developed deviations discerned effective efficient entropy equivalent evidence evidencetheoretic evidential experiments fast feltwell flatter from fusion fuzzy generalizing giacinto givens good gray ground hart have icdm icsc ieee image images implementation indicates information insensitivity intelligence international japan july justification keller lett like logic mathematic maximum minimum mining model much nearest neighbor neighbors neural notation number overlapped pages paper pattern performs pignistic press priceton princeton proc proceedings prof proposed providing reasoning recognit recognition references remote remotesensing result resulted robotics roli rule scheme selection sensing sensitive sensitivity shafer shannon shown shows small standard statistical structure structures supervised switzerland symposium syst table terms than that their theory third this trans truth university using variations visual well when which whose with works would zouhal zurich http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780091abs.htm 11 Scalable Model-based Clustering by Working on Data Summaries about accelerating accuracy accurate advances after aggregate algorithm algorithms also among applications apply approaches approximates approximation associated attributes autoclass automatically average based bayesian because 
behavior bemads biased birch both bradley bwem cadez cannot cardinality categorized cheeseman chen chinese chiu classical classification cluster clustering clusters combining comparable comparison complicated component comprehensive compressed computational computing concepts conclusion conference continuous converges cortes counterparts covariance cubes data database databases density densitybiased derived designed determine develop difference different dimensions directly discovery dumouchel each editors effective efficient emads embody environment established even exclusive exemplify exists expectation experimental extensions fact faloutsos fang fast faster fayyad files first flat flatter four fraley framework from function future gaffney gaussian general generate generated generates given granularities grid have heckerman heterogeneous hierarchical higher hong hours however icdm icpr ieee illustrated improved indicate indicates individuals information insensitivity international into items iterative jeris john johnson journal kamber kaufmann knowledge kong krishnan large larger learning level life likelihood little livny local loss machine magnitude main mathematically maxima maximization mclachlan mean meek meila method methods mining mixed mixture mixtures model models moore more morgan multiresolution mutually nips novelties number olap orders original other pages palmer phase pregibon probabilistic procedure procedures proceedings pseudo publishers quality query ramakrishnan random real references reina renders resources results robust running runs sampem sampling scalable scientific second seconds sets several shanmugasundaram shown siam sigmod significant significantly slightly smaller smyth some sons sound spatial squashing statistically statistics step stutz summaries summarization summary synthetic system systems tailed takes techniques test than that their them theory thesis thiesson third this three through times trees ttest type under university 
used using value very vldb volinsky volume wang which wiley will with work working worse worst york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780235abs.htm 29 TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model Olfa Nasraoui, Cesar Cardona Uribe, Carlos Rojas Coronel Department of Electrical and Computer Engineering The University of Memphis 206 Engineering Science Bldg., Memphis, TN 38152 onasraou,ccardona,crojas @memphis.edu Fabio Gonzalez Department of Systems and Industrial Engineering National University of Colombia Bogota, Colombia email: fgonza@ing.unal.edu.co access adam adaptative alamitos algorithm algorithms also american analysis appear approach arlington artificial association associations babu barbara based bases beach biosystems birch bradley callaghan chen china chronological clustering clusters cohen competitive compression computation computer conf conference congress context continuous cooke cybernetics dasgupta data databases days density discovered discovering discovery distributed distribution dong efficient eighth emerge ester evolutionary explorations fayyad figure focs foundations frigui functions fuzzy garden gecco genetic gonzalez gonzalezissupportedbythenationaluniversityofcolombia guha hampel hinneburg hits hitsperusagetrendversus hong hsinchu hunt icdm ieee immune influence input international jerne john joshi keim knowledge kong kriegel krishnapuram large last late learning livny logs management method mining mishra motwani multidimensional multimedia nasraoui natural navigation neal newsletter noise noisier noisy number only order oregon over pages patterns portland presented press proceedings profiles queries ramakrishnan rate record redondo references regression reina relational requirements robust ronchetti rousseeuw sander scaling science scientific sensitive series session sessionnumberwhenallsessionsarepresentedinnaturalchronological sessions 
siam sigkdd sigmod sons spatial specifed springer stahel statistics step streams symposium system systems taiwan tending third time timmis trend trends trendversussessionnumberwhensessionsarepresentedinreverseorderfromtrendf usage using verlag versus very vldb wang weakens weakier when while widom wiley with york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780355abs.htm 44 Interactive Visualization and Navigation in Large Data Collections using the Hyperbolic Space Jorg Walter · Jorg Ontrup · Daniel Wessling · Helge Ritter Neuroinformatics Group · Department of Computer Science University of Bielefeld · D-33615 Bielefeld · Germany E-mail: walter@techfak.uni-bielefeld.de able abstracts achievable agents analysis appear applicable applications approach approaches balance bartlett based beginner between card cartographic chapman closeness collection collections computational computer computers conceptually conf conference content context cook coxeter crucial czerwinski data decoupled demands developments differential dimensional directed discovery document documents ease ecological elementary elsevier euclidean examination flexible focus foraging from geometry give goal graphics graphs grid guide hall hierarchies high hmds human hybrid hyperbolic icdm ieee important individual info information initial interactive interface international issue jones knowledge kohonen lamping large laying linear making mapping maps massive meant methods mining modest more morgan much multidimensional munzner networks neural offering organization organizing pages pirolli plane potential precise precision present press print proc proceedings publishers references represent representation research riemannian rigid risden ritter sammon scaling scheme sciences second self semantic series sigchi sigkdd simple since skupin software space spaces spec springer stage step streams strike structure studies symp systems technique technology temporal text that think third 
thorpe topics toronto towards trans trees university useful user using viewing visual visualization visualizations visualizationt visualize visualizing volume walter wege which while wise with http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780251abs.htm 31 Sequence Modeling with Mixtures of Conditional Maximum Entropy Distributions access acknowledgements acoustics advances alexandrin algorithm algorithmical also america analysis andrew annals annual applications approach april artficial association based basford behavior berger biokdd biological both buehler cadez cambridge carnegie challenge chemical chen chudova citeseer classes clustering clusters cmucs collaborative comparisons computational conditional conference corresponding could coupling cross darroch darya data david decoupled dekker della dempster derived dimensional disambiguation discovery discussions distributions documents domains done dynamic editors efficiently empirical employing entropy eren even experiments exploit extensions fast features fields fifth filtering finite first firstorder form formalism from future gaussian generalized giles goodman have heckerman hidden high hmms icdm icml idea ieee improvement incomplete inducing information intelligence international internet irvine isit iterative jaynes jelinek john jordan journal knowledge krishnan laboratories lafferty laird language latent lead learning like likelihood linear linguistics local lyle machine major make manavoglu mannila many marcel markov mathematical maxent maximum mccallum mclachlan meek meeting mellon method methods mining mixture mixtures model modeling models more mozer naming natural navigation neural nips ofspeech online onto open optimization others over page pages paper part pattern patterns pavlov pennock perform performance performed petsche pietra plan popescul powerful practical prediction predictions predictive present press previous principle prior probabilistic probabilstic problem proc 
procedure proceedings processing productive profiles prohibitively projected provide providing query ramdom random ratcliff ratnaparkhi real recognition recommending references remains report rival rosenfeld royal rubin scalable scaling schein sequence sequences sequential sets seventh shown shuurmans sigkdd signal significant simulated site slow smoothing smyth society solving sons sparse speech stand statistical statistics steps still structure study suboptimal such suggest support symposium systems tagging technical than thank that then theory third this though thus training transaction transactions uncertainty uncover ungar university used user users using value visualization volume wang well were where white widely wiley with work world would york zhao http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780203abs.htm 25 Eren Manavoglu Pennsylvania State University 001 Thomas Building University Park, PA 16802 manavogl@cse.psu.edu ability able achieve acknowledgements acoustics action adaptive addition algorithm also analysis analytical annals apply approach april architecture artificial association automatic automatically autonomous available avoid based bayesian been behavior behaviors bergstorm better bollacker both braries browsing buchner cadez cambridge carnegie carolina chapel chen chickering citation citeseer classes classification clustering cmucs collaborative collection communications compare complex computational computer conclusions conditional conference cooley cooperative cost customized darroch data demonstrated dempster density dependencies dependency described dictive different digital dimensional discovering discovery distributions domains dominant dynamic each entropy estimation etzioni expand experiments fast features fields fifteenth filtering finite first fourth from future gaussian generalized generate generating giles global goodman grant grouplens groups heckerman hidden high hill iacovou icdm identify identifying ieee 
incomplete indexing individual inducing information instance insufficient integrating intelligence intend interested international internet interpreting introduced investigated irvine iterative jelinek journal kadie knowledge known labs lafferty laird lawrence learning like likelihood linear lockheed long machine making managed manavoglu mannila marketing markov martin mathematical maxent maximum mechanism meek mellon method methods mining mixture mixtures mobasher model modeling models motivated mulvenna naming national navigation netnews networks neural north online open order ordered pages paper partially pattern patterns pavlov pays pennock perform performed perkowitz personalization pietra plan planning predicting predictive preparation press prior problem proceedings processing profiles proposed provide random ratcliff real recognition recognizing recommendations record references report represented representing research resnick riedl rosenfeld rounthwaite royal rubin rule scaling sequence sequences sequential services sigmod signal simple site sites smoothing smyth society sparse specific speech srivastava statistical statistics steve strengths strong suchak supported synthesizing systems technical techniques term thank that therefore these third this through time traditional training transaction transactions university unknown usage used useful user users using visualization visualize visualizing weaknesses were where whereas white wide with work world would http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780553abs.htm 78 Comparing Naive Bayes, Decision Trees, and SVM with AUC and Accuracy Jin Huang Jingjing Lu Charles X. 
Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 {jhuang, jlu, cling}@csd.uwo.ca aaai academic academy accuracy accurate acknowledgements acquisition addition administration againstaccuracy algorithms also analysis appear applications approaches area artificial attributes automatic bayes bayesian benchmarking better beyond blake both bradley brown business california case chang class classification classifier classifiers codes compare comparing comparison computer conclude conclusions conditions conference conferenceonartificial consistent continuous cost current curve dansi dash data databases datasets decision density department dept difference discovery discretization discriminating distribution domingos economics edition editor enabling engineering estimation evaluated evaluation experimentally experiments expression fawcett fayyad ferri fifteenth flach foster fukunaga gene general gratefully gray great grundy guidelines have help helped hernandez hornik html http huajie huang hussain icdm icml ieee ijcai ijcnn important imprecise independence induction inductive information intelligence international interval introduction irani irvine itory jianning joint kaufmann kernel kindly knowledge knowledgebased kohavi kononenko langley learning least leisch library libsvm ling machine machines many mateo measure merz methods meyer microarray mining mlearn mlrepository more morgan multi multiclass naive national networks neural nineteenth optimality orallo other outperform pacificasia pages pattern pazzani performance popular predictive press previous probabilities probabilitybased proceedings proceedingsof produce programs proved provide providing provost qian quinlan ranking real recognition references related report repos retrofitting rule science sciences scores second should significant similar simple sixth smyth source springer squares stages statistical statistically support suykens taipei taiwan technical
technique tenth terms than thank that then therrien they third thirteenth thomas topics toward tree trees trends under university using valued vandewalle various vector version very vienna visualization wang washington which wielinga wiley will with work world york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780549abs.htm 77 Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism Jun Huan, Wei Wang, Jan Prins Department of Computer Science University of North Carolina, Chapel Hill {huan, weiwang, prins}@cs.unc.edu acknowledgement algorithm algorithms apriori artificial association based berhold borgelt bunks challenge champaign common computer computing conference conferrence confirmed data databases dataguides datasets decision demonstrated detection discovery efficiency efficient efficiently enabling evaluation executable fast ffsm finding forest formulation fragments frequent from further gain generator george goldman graph gspan gudes huan icdm ieee ijcai illinois indexing inokuchi intelligence international isomorphism jiawei joint karypis king kuramochi largest margin michihiro mining minnesota molecular molecules motoda muggleton ogihara optimization over pages parthasarathy pattern patterns performance pkdd predictive presence prins proc proceedings providing query real recognition references relevant report retrieval rules science semi semistructured shearer shimony sigkdd similarity srinivasan sternberg structured subgraph substructure substructures synthetic technique thank third toxicology trees university urbana using vanetik various venkatesh video vldb wang washio wide widom xifeng zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780469abs.htm 57 Integrating Fuzziness into OLAP for Multidimensional Fuzzy Association Rules Mining agarwal agrawal alhajj also approach architecture arslan association attributes automated case chan chiang cikm cioss compression conclusions conference confidence cube 
cubes data databases deals dexa different dimensional discovery effect efficient faloutsos fast figure from fuzzy general generation guided gupta hidber hybrid icde icdm ictai ieee international kamber kaya knowledge large level margaritis meta method methods minimum mining modeling multidimensional multiple netcube numbers online paper polat presented proc proceedings proposed quantitative references rule rules sarawagi scalable selu sets sigmod summary support that third this three thrun tkde tool using utilizes values vldb with zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780315abs.htm 39 Detecting Interesting Exceptions from Medical Test Data with Visual Summarization Einoshin Suzuki, Takeshi Watanabe, Division of Electrical and Computer Engineering, Yokohama National University, Japan suzuki@ynu.ac.jp, nabekun@slab.dnj.ynu.ac.jp Hideto Yokoi, Katsuhiko Takabayashi, Chiba University Hospital, Japan yokoih@telemed.ho.chiba-u.ac.jp, takaba@ho.chiba-u.ac.jp about acknowledgments active algorithm also area believe berka card challenge clark conference culture current data discovery domain download ecml ecmlpkdd editors education establishing francisco from general grant grasping hepatitis http icdm ieee induction information international japanese kaufmann knowledge learning lisp machine makinlay mining ministry morgan niblett partially patient pkdd priority proceedings prototypelines readings references research science scientific september serve shneiderman sports status such supported technology that third this those used useful visualization whole work would http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780589abs.htm 87 Using Discriminant Analysis for Multi-class Classification Tao Li Shenghuo Zhu Mitsunori Ogihara accurate achieved allwein approach benchmark binary class classification classifiers conference data databasess datasets decomposition efficient experiments future general generalized good gsvd have icdm 
ieee international margin mining multi multiclass overcome performance problem problems proceedings reducing references schapire shown simple singer singular singularity technique test text that third unifying used value work ysis http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780027abs.htm 3 Zigzag: a new algorithm for mining large inclusion dependencies in databases Fabien De Marchi, Jean-Marc Petit Laboratoire LIMOS, UMR CNRS 6158 Université Blaise Pascal - Clermont-Ferrand II, 24 avenue des Landais 63 177 Aubière cedex, France {demarchi,jmpetit}@math.univ-bpclermont.fr abiteboul abstract acta addison aggarwal agrawal algorithm algorithms alonso analysis angeles application approximation arizona armstrong association atkinson bancilhon bases bastide bayardo bell berge bertino beyond bocca bohm boolean borders boulicaut brockhausen brodie bykowski calders california casanova cercone cheng chile clio closed companion computer computing condensed conference constraints control counting crete cybernetica czech data database databases demetrovics dense dependencies dependency derivable design dewitt dimentional discovering discovery ecml edbt edinburgh editors efficient efficiently eiter enforcing engineering european existing exploration explorations extended extending fagin fast florida folding form foundations free frequency frequent from functional furtado generating generation goethals gottlob gouda greece gryz guided gunopulos haas hernandez heterogeneity high holland hull hypergraph hypergraphs icde icdm identifying ieee implementation implication inclusion inference inferring information integrity intelligent interaction interesting interna international itemset itemsets jarke jeffery jensen jose journal justification kantola kaufmann kedem khardon knowledge koeller lakhal large lattices lavrac learning lecture leung level levelwise levene library logical loizou long lopes machine management managing mannila marchi mass mathematic maximal 
maximum miller minimal mining mitchell morgan normal north notes optimization orlando orlowska pages papadimitriou pasquier pattern patterns petit pincer pods pokorny popa poster prague press principles problem problems proceedings project qian queries query ramos reading record references referencial related relation relational remarks representation republic rigotti rules rundensteiner saltenis saltor santiago schek schiefer science sciences scotland search seattle semantic sets siam sigkdd sigmod siirtola society some spain springer srikant stumme symposium system systems taouil techniques technology their theories third tional tiwary toivonen toumani tour towards transactions transversals tucherman tucson tuning universal using valduriez valencia very vianu vincent vldb volume wesley with wrobel zaki zaniolo zdonik http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780597abs.htm 89 Chang-Tien Lu Dept. of Computer Science Virginia Polytechnic Institute and State University 7054 Haycock Road Falls Church, VA 22043 ctlu@vt.edu aaai accurate advances algorithm algorithms also although american analysis analyze anomalies application applications applied approach association attribute attributes averaged avoided barnett based between bivariate bonus both brandley burean california cambridge candidate candidates carries census chapman chawla claiming clouds column commence commonly computation computational computer computing conclusion conference confirm contours counties craig ctlu data databases datasets degree department depth detected detecting detection developed difference different dimensional discovery distance dynamic edition editor effectiveness environmental europa examination existing exists experimental exploratory exploring fact falsely fast focuses fourth from further furthermore generating geographic geographical geoinformatica global graph graphics haining hall haslett hawkins http icdm identification ieee implements important index 
indicators information intelligent interested international iteration iterative john johnson journal june knorr knowledge kwok large last lewis lisa local locating mapview median method methodologies methods mining more multivariate neighbors obtained ordering other outlier outlierness outliers package pages painho panatier paper point points practical prentice press proc proceedings project propose proposed ratio readers reducing refer references regular respect results risk rousseeuw running ruts science sciences selected seventh shekhar show shows sigkdd social software spatial springer stated statistical statistician statistics summary system systems table terms than that their these they third this those three tools tour true unified united university unwin used value variowin ventura verlag very vldb wang where which wiley wills with york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780489abs.htm 62 Mining Relevant Text from Unlabelled Documents Daniel Barbará Carlotta Domeniconi Ning Kang Information and Software Engineering Department George Mason University Fairfax, VA 22030 dbarbara,cdomenic,nkang @gmu.edu aaai algorithm automatic axis biasmap blum cardinality catego categorization chen class clustering collection combining common computational computer conference cristianini cross crosslanguage data databases distribution document dumais during earn ester european figure frequent fung hierarchical html http huang icdm ieee image indexing information input international itemsets joachims kandola kindermann labelled landauer language latent learning leopold letsche lewis littman machine machines martin mini mining mitchell multimedia neural nips pattern porter porterstemmer precision proceedings processing program recognition references represent retrieval reuters rization sample semantic shawe siam similarity small space speech spring stripping support symposium systems table tartarus taylor test text texts theory third topic 
training unlabelled using values vector vision wang with zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780509abs.htm 67 Indexing and Mining Free Trees Yun Chi, Yirong Yang, Richard R. Muntz Department of Computer Science University of California, Los Angeles, CA 90095 {ychi,yyr,muntz}@cs.ucla.edu addison agrawal algorithm algorithms also among analysis application apply assigned association authors available based both canonical chung computer conclusions conference data database databases datasets defined description design detailed discover efficient efficiently equivalent expressed family fast findings forest form foundation free frequent full grant homeomorphism hopcroft icdm ieee indexing interested international isomorphism journal july labeled material mining more muntz national necessarily november opinions order paper performance presented problem proceedings readers real recommendations refer references reflect report reports representation represents rules science september sigkdd srikant string study subgraph subtree subtrees supported synthetic tech technical technique techniques that therefore third this those time total traditional tree trees ucla ullman under unique upon used version views vldb wesley which with work yang zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780669abs.htm 107 Combining the web content and usage mining to understand the visitor behavior in a web site Juan Velásquez, Hiroshi Yasuda and Terumasa Aoki Research Center for Advanced Science and Technology, University of Tokyo, Japan E-mail:{jvelasqu,yasuda,aoki}@mpeg.rcast.u-tokyo.ac.jp about acquire acquiring addison advanced allow aoki appear august baeza based behavior berendt bezdek binary browsing capable chapter characteristics clustering clusters codes conf conference content cooley correcting data december definition deletions derived dokl each ecommerce education effective engineering feature first found from future greenwich 
hotho icdm ieee improve increase information insertions integrating intelligent international introduced introducing italy jersey journal june knowledge knowlegde levenshtein maps measure methodology mining mobasher modern more neto newark organizing oxford pages patterns personalization phys preferences preparation presented proceedings procs proposed references relational research result retrieval reversals ribeiro richard runkler sardinia self semantic september sequence sessions similarity site spent srivastava study stumme sung systems table technologies technology their them third this three time towards university usage useful user using variables velasquez very visited visitor weber wesley which wide with work world yasuda yates http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780275abs.htm 34 Statistical Relational Learning for Document Mining aaai abbeel about academy accuracy activity adaptive aggregates aggregation aided akaike algorithmic algorithms allows also analysis anderson annals application approaches artificial attributes autocorrelation autonomous based bayesian become better bias biological blockeel bollacker boolean bratko case categorization cause chakrabarti citation classes classification classifying clausal clauses cluster clustering clusters cohn combining comparison complexity computer concept conference connectivity constraints construction content craven critical cumulativity current data database datalog debugging decision dehaspe derived describing determination determine digital dimension discovery discriminative distribution document domingos down dzeroski editors effects efficiency enhanced entropy estimating even expanded exploiting explore extend extension feature features first flach flake foster frequent friedman from furnkranz further generating getoor giles glover grows haas help hoff hofmann horn hyperlinks hypertext icdm iclp icml ieee ijcai improve incorporates indexing inducing induction inductive indyk 
information intelligence intelligent international introduction invention iterative jensen karalic kersting king knobbe knowledge koller kramer laer large lavrac lawrence leaps learners learning libraries likelihood limited link linkage logic logical logistic machine many markov maximum meta metalearning method mining missing model modeling models more muggleton multirelational national navigation necessary network networks neville nips ontological optimizations order other pages patterns pennock perlich pfeffer pkdd popescul predicate prediction predictions preferable press principle probabilistic problems proceedings produce produced program programming programs prolog promise proposal propositional propositionalisation propositionalization provost quantitative query raedt random rapidly rather reasons references regression reinforcement relational relations representations richer rigorous roth sampling sato scalability scale scaling schema schwartz sciences search security select selection semantics shapiro shih should siebes sigmod slattery snowbird social space sparse springer springerverlag srinivasan statistical statistics stochastic structural structure study subspaces such support symposium taskar text than their theory third thirtythree time toivonen towards training trees truly tsioutsiouliklis ungar upgrade used using verlag volume weld what widmer will with working workshop workshops http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780701abs.htm 115 Dimensionality Reduction Using Kernel Pooled Local Discriminant Information achieve acoustics adaptive advances again akernel analysis anouar applied approach average based baudat both bylocally calculated cambridge chellappa chen chicago classification component components computation computed conf conference contrast courant cristianini dash data decrease dimension dimensional dimensionality discriminant distributions duction duda each edition editors eigenvalue eigenvalues embedding error 
etemad face faces failed feature figure generalized global hart hasselmo hastie hilbert human icdm ieee information intelligent international interscience introduction irwin john jolliffe journal kernel kpca kpools kutner learning left linear local machines mathematical method methods mining models mozer muller nachtsheim nearest neighbor neter neural nonlinear normalized other pages panel paper pattern performs physiacs ponent pooled pooling presents press principal problem proc proceedings processing rapidly rather recognition reduction references regression represent representations represented right roweis samplesize saul scene scholkopf science selection shawe shows signal significant small smola solve sons spaces speech springer statistical subspace subspaces summary support surprising system systems taylor that third this tibshirani touretzky type university used using vector verlag volume wasserman were which wiley york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780179abs.htm 22 Building Text Classifiers Using Positive and Unlabeled Examples Bing Liu Department of Computer Science University of Illinois at Chicago {liub@cs.uic.edu} Yang Dai Department of Bioengineering University of Illinois at Chicago {yangdai@uic.edu} Xiaoli Li, Wee Sun Lee School of Computing, National University of Singapore/Singapore-MIT Alliance {lixl, leews}@comp.nus.edu.sg Philip S. Yu IBM T. J. 
Watson Research Center, Yorktown Heights, NY 10598, USA aaai accepted acquire active actuaries adding addition advances agrawal algorithm allan among applications approach athena automatic banerjee based basu bayardo bayes bennett blum bockhorst boser brockhausen buckley burges capacity care case categorization chang class classification classifiers classify cliffs cluster clustering colt combining comparison concepts conference craven data databases demiriz dempster denis dimensional distribution document documents ecml edbt effect englewood enhancing environment estimating event examination example examples experiments exploiting faculty features feedback ferra filter formula freund from gale general ghani gilleron girosi goldman guyon high hill icdm icml ieee ijcai incomplete inductive information institute intensive interactive international introduction ipmu joachims journal kernel knoblock knowledge kowalczyk labeled laird lang laplace large learning lewis lidstone likelihood lkopf logistic machine machines making management manevitz many massachusetts maximum mccallum mcgill mcgraw memo methods microsoft mining minton mitchell models modern monitoring mooney morik muggleton multiclass multiview muslea nature netnews neural newsweeder nigam note osuna page parameters partially pebl platt positive posteriori practical probabilities problems proceedings processing queries rakutti references regression relations relevance relevant report research retrieval robust rocchio royal rubin salton scale scholkopf seeding semi semisupervised sequential shawe sigir smart smola society springer srikant statistical study supervised support svms system systems technical technology text theory third through thrun tommasi training transactions tuning unlabeled using value vapnik vcdimension vector verlag very weakly weighted williamson with workshop yang yousef zhang zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780267abs.htm 33 Exploiting Unlabeled Data 
for Improving Accuracy of Predictive Data Mining Kang Peng, Slobodan Vucetic, Bo Han, Hongbo Xie, Zoran Obradovic Center for Information Science and Technology, Temple University, Philadelphia, PA 19122, USA {kangpeng, vucetic, hanbo, hongbox, zoran}@ist.temple.edu ability accurate acid acids acknowledgements adaptive advances algorithm alternatives altschul amino amount applications approaches appropriate apweiler arcing automatic available bairoch bartlett based bayesian beneficial berkeley between bias biased biochemistry biological biology blast blatter blum boeckmann breiman brown buckley business casp ceder characterization characterize class classification classifier classifiers colt combining comes compared comparison comput computation conclusion conditional conference considered contrast contrasting could data database dempster detect detection difference disorder distribution documents domain domingos donovan duin dunker edinburgh effect effectively elkan estimate estimates estimation estreicher examples experiments extensive families feedback framework friedman from function gapped gasteiger generation geosci grant greatly hierarchy however huang hughes iakoucheva icdm icml icpr ieee ignored image images imbalance important improvement improving incomplete indicate institute integral integrating international into intrinsic issue japkowicz kernel knowledgebase labeled laird landgrebe large lawson learn learning likelihood linial lipman lippmann local mach machines madden manage maps margin martin maximum mccallum methods michoud miller mining mitchell mitigating models multiclass network neural nigam nucleic object obradovic olshen optimizing outlying outputs paper part partially peng performance pets phan phenomenon pilbout platt pokrajac porter possibly posteriori practical predicting predictive press probabilistic probabilities probability problem problems proc proceedings process program programs properties proposed prot protein proteins protomap 
provided provost quality radivojac range reducing references regression regularized relevance remote report results retrieval richard rubin salton sample samples schiffer schneider scholkopf school schuurmans scores search seeger sens sequence sequences shahshahani should showed siam significance similar size small smola society solving space special standard statistical stern stone strategies strength stripping study successful suffix supervised supplement support supported swiss synthetic technical term text that theory therefore third this those thrun tian trained training trans transforming trees trembl true unbiased underrepresented uniform university unlabeled useful using vapnik variance vector very vucetic wadsworth weighting well when while wiley with working yona york zadrozny zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780163abs.htm 20 Spatial Interest Pixels (SIPs): Useful Low-Level Features of Visual Media Data affect alto alvey analysis annals applications arya belhumeur bergen bruce burton cambridge chellappa class cognition college color combined component components composite computational conference consulting corner data detector dimensionality edge educational eigenfaces ekman eugenics face faces facial features fisher fisherfaces friesen gevers hancock harris hespanha huber human iccv icdm ieee image indexing international invariant jolliffe journal kriegman landy linear machine manchester maryland measurements memory mining modeling models multiple nearest neighbor pages palo park perception pictures pixel press principal principle problems proc proceedings processing projection psychologist psychology recognition reduce references robust searching segregation shape sirohey smeulders specific statistics stephens survey taxonomic texture thesis third tpami university using video vision visual wiley wilson http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780705abs.htm 116 A Feature Selection Framework for 
Text Filtering Zhaohui Zheng Rohini Srihari Sargur Srihari CEDAR, Department of Computer Science and Engineering University at Buffalo, The State University of New York zzheng3,rohini,srihari @cedar.buffalo.edu advanced approaches automated butterworths case categorization combining comparative computing conference data developement digital dissertation distributed ecdl european evaluation evidence experiments feature features fourteenth from galavotti homogeneous icdm ieee imbalanced information international journal learning libraries ljubljana london machine mining mladeni negative optimally pages pedersen perceptron positive proceedings proved references research retrieval rijsbergen sebastiani selection sets sigir simi slovenia srihari statistical study surveys technology text third twentieth university usability workshop yang zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780493abs.htm 63 A User-driven and Quality-oriented Visualization for Mining Association Rules Julien Blanchard, Fabrice Guillet, Henri Briand IRIN - Polytech'Nantes - University of Nantes La Chantrerie - BP50609 44306 Nantes cedex 3 France {julien.blanchard, fabrice.guillet, henri.briand}@polytech.univ-nantes.fr aaai acquisition advances agrawal aiding analyzing anand applied approach assisted association attribute bandhari basic blanchard bozdogan brachman briand centered conference control data databases decision decisionmaking definition discovered discovery dominance entropic european exploration fast fayyad focusing from gras guillaume guillet helsinki human humphreys icdm ieee implication improving information infovis intensity interaction interactive interestingness international klemettinen knowledge kumar kuntz lehn level machine mannila measure methodology mining model montgomery multi organization patterns philippe piatetsky press principles proc proceedings process processes production references report right rules search selecting shapiro sigkdd smyth 
software springer srikant srivastava statistical structure studies summarization svenson symposium technical text third thomas toivonen toward university usability userdriven vari verkamo version visualization visualizing whitney with wong http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780259abs.htm 32 MaPle: A Fast Algorithm for Maximal Pattern-based Clustering Jian Pei Xiaoling Zhang Moonjung Cho Haixun Wang Philip S. Yu able aggarwal agrawal algorithm algorithms applications arep association automatic based best between beyer biclustering biology both candidate capturing cheng church cluster clustering clusters compression conf conference correlation data databases dimensional entropy expression extraction fascicles fast finding frequent generalized generation harvard high http icde icdm icdt ieee intelligent international items jagadish large life maple matrix maximal meaningful method micro mining molecular nearest neighbor numerical outperforms pattern patterns previously proc proceedings projected proposed real references results rules semantic sets show sigmod similarity spaces srikant subspace synthetic system tavazoie test that third vldb wang when with without yang yeast http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780665abs.htm 106 Active Sampling for Feature Selection Sriharsha Veeramachaneni Paolo Avesani ITC-IRST, Via Sommarive 18 - Loc. Pantè, I-38050 Povo, Trento, Italy E-mail: sriharsha,avesani @irst.itc.it acquisition active against algorithm applications artificial assessment assume atlas bayes best between binns blum both class classification classifier classifiers coefficient cohn conditional conf conference correlation cost crop current data datamining decision design difference discovery disease distributions domingos elisseefi empirical error estimated estimation evaluation examples feature features figure francisco general generalization genetic guyon hughes hybrid icdm ieee improving incidence independence 
independent induction intelligence international introduction invest joint journal kaufmann knowledge koller ladner langley learning loss machine madden making management mccallum mean metacost method mining moreover morgan multinomial nominal nyrop only optimal order padmanabhan pages pest plant plot probability proc proceedings provost random rank ranking rate rates real reduction references related relevant research resources saar sample sampling selection sensitive should size solution spearman specific square strongly suitable support symposium text third through time tong toward tree true tsechansky turney valued variable vector werf where with zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780517abs.htm 69 Information Theoretic Clustering of Sparse Co-Occurrence Data Inderjit S. Dhillon and Yuqiang Guan Department of Computer Sciences University of Texas Austin, TX 78712-1188, USA inderjit, yguan@cs.utexas.edu advanced algorithm annealing augmented award bottleneck career classification clustering clusters compression computer conf conference cover data dept deterministic dhillon dimensional divisive document duda edition elements feature filter friedman grant guan hart high icdm ieee information international iterative john kogan kumar lang learning local mach machine mallela maximization method mining netnews news occurrence optimization pages pattern problems proc proceedings program references regression related report research rose sciences search sept sequential sigir slonim sons sparse stork technical texas text theoretic theory third thomas tishby university unsupervised using weeder wiley word york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780147abs.htm 18 An Algebra for Inductive Query Evaluation Sau Dan Lee Luc De Raedt Institut für Informatik Albert-Ludwigs-Universität Freiburg Georges-Köhler-Allee, Gebäude 079 D-79110 Freiburg im Breisgau Germany {danlee,deraedt}@informatik.uni-freiburg.de abstract 
academic accounts acknowledgements agrawal alberta algebraic algorithm algorithmica algorithms also amanda amount another answering application artificial association august australia authors automata baralis been bioinformatics boolean borders boston both bucila bytes calgary canada chapter cheung cinq clare class closed collected compute computer computing conclusions conference consortium constraint constraints construction consuming convex cost dasfaa data databases dawak december department developed development dimension dimensions discovered discovery domain down dual dualminer each edmonton effect effective effectively efficiently employed engineering evaluate evaluation experimental explorations extended fast feature finding florence found fragment framework from functional functions further furthermore gehrke general generalization generalized generalizes generalizing greenberg have heap heikki helma higher hirsh icdm ieee ijcai illustrate implemented important improved incremental inductive instead intelligence international isbn italy itemsets jaeger january japan july kaufmann kifer king kluwer knowledge kramer learning levelwise like line linear lncs machine machinery maebashi maintaining maintenance manfred mannila many march matching maximum melbourne memory mining mitchell molecular more morgan most needs nevertheless notion november number observation only operations opportunities optimization original pages paper part partly pattern patterns perspective phenotype planning present presented proc proceedings process processed programs project providing pruning psaila publishers pushed queries query querying question raedt realistic recorded references refinement remaining report represent representing research resulting results ross rules save saved saves science search september series sets shown sigkdd solution space spaces speed springer srikant strategy strings studied subqueries such suffix supported switching symposium table technique thank 
that their theoretical theories theory there these they third this thus time tions toivonen total traces traditional trees ukkonen under underpinnings university unix usage used useful users using version vldb volume weiner well were when white whole whom with work would yeast http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780395abs.htm 49 On Precision and Recall of Multi-Attribute Data Extraction from Semistructured Sources able acknowledgments address adelberg agents alphabet angluin anonymous approach arasu artificial ashish assumption attribute attributes authors automatic automatically automaton autonomous azavant based bases benchmark binary boundary breunig campbell cannot cases cikm cohen common complete complexity computer conference consistent control crescenzi current data databases deal detection discovery documents doorenbos dung ecology eliminate embley etzioni example examples experimental expressive extended extension extracting extraction finding finite flexible free from future garcia generating generation given gold grants hammer helpfulcomments hierarchical however html hurst icdm identification ieee ignored ijcai important induction inference information intelligence intelligent internation international internet issue jensen jiang joint journal kifer knoblock knowledge kushmerick large larger leaf learn learning liddle like list lists machine made maier management maximal mecca merialdo middendrof minimal minimum mining minton molina more multiple muslea need nested nestorov node nodes nodose noise nullalbe ontology over paes pages part patterns perkowitz plan practice presence problem problems proceedings raih ramakrishnan readily record recycling referees references regular results rich roadrunner rules sahuguet schema science semi semistructured sequences sets setup shortest sigmod simplifying single sites smith soderland some sources spans state strings structured structuring subsequences suggestions supersequence 
supersequences supported system systems tables techniques template test text thank that their theoretical third this those tool towards training transducers tree tsimmis types ukkonen understand unstructured useful using value various vassalos very vldb webdb weld where wide will with work workshop world would wrapper wrappers wrapping yang yerneni http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780569abs.htm 82 Effectiveness of Information Extraction, Multi-Relational, and Semi-Supervised Learning for Predicting Functional Properties of Genes aggregation algorithms approaches area based biocomputing biokdd biological blum bradley cheng classification combining competition computational conference contributions craven curve data deletion denecke effectiveness eleventh emmy entry evaluation experiments explorations extraction fellowship flach foundation from fukuda gene german hatzis hayashi hidden icdm identifying ieee inductive inference information international joachims kramer krogel labeled landwehr lavrac learning leek logic machine machines marco marcus markov master mining mitchell models morishita multi multirelational names noether pacific page papers pattern predicting prediction proceedings programming propositionalization protein recognition references regulation relational report results sche scheffer science sese sigkdd springer support supported symposium takagi tamura text thanks their theory thesis third tobias towards training transductive transformation tsunoda ucsd under unlabeled using vector view with workshop wrobel http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780187abs.htm 23 OP-Cluster: Clustering by Tendency in High Dimensional Space Jinze Liu and Wei Wang Computer Science Department University of North Carolina Chapel Hill, NC, 27599 {liuj, weiwang }@cs.unc.edu agarwal aggarwal algorithm although among analysis arep array association ayres biclustering birch bitmaps called capture capturing close closed 
cluster clustering clusters compact conference consistent correlation cosine data databases depth depthfirst devised dimensional dimensions discover discovery distance dmkd effectively efficient efficiently environment euclidean exhibited first flannick gehrke generation harvard high http icde icdm ieee international july large livny long manifest matrix measured method mining model namely noisy objects orihara pages paper parallel parsad parthasarathy pattern patterns proceedings proposed ramakrishnan references rules sequential sets sigkdd sigmod similarity space specified still structure subset subspace tendency that they third this threshold tree user using very wang with yang yeast zaki zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780513abs.htm 68 T-Trees, Vertical Partitioning and Distributed Association Rule Mining Frans Coenen, Paul Leng and Shakil Ahmed Department of Computer Science, The University of Liverpool Liverpool, L69 3BX, UK {frans,phl,shakil}@csc.liv.ac.uk addison advantages against agrawal algorithm algorithms also amount appear approach approaches apriori arnold associated association based better both bramer challenges chattratichat clearly coenen compared computing conclusions conf conference darlington data dataset demonstrated demonstrates described development discovery distributed distribution effectively efficiency engineering enhanced especially established evaluated evaluation experimental facilitates fast founded freeman generally generic ghanem good goulbourne have hning hupfer icdm ieee implement increases input intelligent international itemset itself javaspaces journal kaufman khler knowledge large largely lends leng many mechanisms message methods minimal mining more morgan much number offered optimising ordering parallel parallelisation partial passing patterns performs practice preece principal principles proc proceedings processes readily references research respect responses rule rules scale shafer size
springer srikant structure structures summary support sutiwaraphun systems task than that third this those transactions tree ttree unlike used using vertical vldb wesley when which with xviii yang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780521abs.htm 70 Links Between Kleinberg's Hubs and Authorities, Correspondence Analysis, and Markov Chains academic adapted addition advantage after algorithm allow already analysis aperiodic appendix applications approach artificial authoritative authorities basis being bremaud brin bringing carlo chain chains characterise citation classified closely column complex component computations compute computed computer computing conclusions conference convergence corrections correspond correspondence corresponding could data databases decreasing denote density depends diag distance distribution each easily eigenvalue eigenvalues eigenvector eigenvectors elements entries environment equal equivalent eventually extended fields finding form from gibbs golub greenacre have hence higher hill hopkins hubs hyperlinked icdm idea ieee ijcai important informa initial initially instead intelligence international interpretations interpreted introduce introduction irreducible johns joint jordan journal kind kleinberg known laboratory left lempel link loan made markov matrices matrix maybe mcgraw means mentionned method mining model monte moran more moreover motwani multiplicity multivariate navigation next nonnegative numerical obtain often only order others page pagerank pairs papoulis particular pillai positive press princeton probability proceedings processes produces proof propose provide qtel queues random ranking rate real references related relational report represent results return right rows salsa same scores semidefinite show showed simulation since solution sorted sources springer stability stanford starting starts state states statistical steady steadystate step steps stewart stochastic structure structures 
subdominant such system systems technical term that then theory therefore they third this through thus time tion transactions transition uivi university used value values variables vector vectors verlag walk well when where which will winograd with york zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780677abs.htm 109 aaai absolute accomplished accuracy acted advancing adverse algorithm also analysis analyst analyzed anesthesia anesthesiology applications applied approach artificial bagging bayesian berlin between boeing bootstrap boundaries breiman brute burden cahalan calculating carried chapman characterizing client computation computational computer computers concerned conclusion conference confidence conjunct conjunctive construct contained cost data database dataset datasets decision defined depending depth design detection dfisher directed disadvantage discovered disparate display dissertation domain domains easy efron eijkel endeavor essay estimation etzioni events executing expected experimentation expert experts exploited exploration fisher fitted focus force forrest four from future general generally give goldsmith good greater hall have highly holte homogeneous hours http icdm identical ieee images increased induced induction intel intelligence intelligent international introduction invite just king knowledge kubat larger learning lists machine manufacturing massive matwin methods mine mining minute modern monograph more multicenter multidimensional multitude must number other outcomes over pentium perioperative portions predictors preliminary press probabilities problem procedures proceedings radar reduce reducing references rehder replication representation research retrieving riddle risks rule rules running satellite scaling search seconds segal severe sheer should similarity single skepticism some space spills springer stability stable storage stored storing strategies study suited summary supported that their thesis third this 
three tibshirani time took twelve type understand understanding university using vanderbilt variability verlag visualizing vuse waitman washington well with work workstation york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780601abs.htm 90 Learning Rules for Anomaly Detection of Hostile Network Traffic Matthew V. Mahoney and Philip K. Chan Department of Computer Sciences Florida Institute of Technology Melbourne, FL 32901 {mmahoney,pkc}@cs.fit.edu acknowledgments adam adaptive afil agrawal alad algorithms anomaly association assurance attack available barbara based bases chan code computer conf conference couto cyber darpa data defense detecting detection dist ethernet eval evaluation failure fast florida floyd fried funded haines hoagland hostile http icdm ieee information international intl intruders intrusion intrusions jajodia korba large learning leland lerad lightweight line lippmann lisa mahoney mining mmahoney model modeling monitoring nature netad network networking networks partially paxson phad poisson popyack proc proceedings raid real references report roesch rules security self sigcomm silicon silicondefense similar skinner snort software source spade spice srikant symposium system taqqu tech technical third this time traffic transactions usenix valdes very willinger wilson work workshop http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780735abs.htm 123 Inference of Protein-Protein Interactions by Unlikely Profile Pair Byung-Hoon Park, George Ostrouchov, Gong-Xin Yu, Al Geist, Andrey Gorin, and Nagiza F. 
Samatova Computational Biology Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory {parkbh, samatovan}@ornl.gov acad achsel acids actin activated activation american analyses analysis analyze assigning based bateman binding binds biochemical biochemistry bioinformatics biol biology blocks bock brain brings carcinoma categorical cell cells cellular center chains chen chimeric chromosome classified clinical cofilin comparative complete complex computational conference conservation context contiguity correlated coupling coverage cross current dandekar data database detecting development directed domains drubin electrophoresis elongation endocrinology endometrial enright evaluation events experiments factor families features fibroblast fienberg filaments fingerprint fonstein forms free from function functional functions fusion gene genome genomes genomic gough growth hand henikoff hierarchical highthroughput homologue homology hprp human huynen icdm ieee including increased inferences inhibits interact interacting interaction interactions international interpro journal kendrickjones keratinocyte kinase laboratory light machines maltsev maps marcotte margalit markers mathematical messer metabolism methods mining mitogen modeling molecular mulder myosin national natl nature networks nishida notices novel nucleic opinion order overbeek pathway pawson pazos pellegrini peptide pfam phylogenetic physically porcine predict predicting prediction primary proc proceedings profiles program proliferation promotes protein proteins pusch qualitative quantitative references regulatory research resources ribosomal ridge schlessinger science sciences sequences sequencesignatures sequestration several signal silico society souza specific sponsored sprinzak stable structural structure studying subdomain switching synechococcus taniguchi terminal that their third this tolife tool transduction trends tropomyosin unwindase used valencia with work xenarios yeast 
http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780211abs.htm 26 Privacy-preserving Distributed Clustering using Generative Models Srujana Merugu and Joydeep Ghosh Electrical and Computer Engineering University of Texas at Austin Austin, TX 78712 {merugu, ghosh}@ece.utexas.edu accumulation acknowledgments aggarwal agrawal algorithm algorithms also arindam association azoury bagging banerjee bayesian bounds bradley breiman carlo chain classification cluster clustering collective combining computation computer conference cooperative cover data database dempster density dept design dhillon distributed distributions documents editor editors elements ensembles erlbaum estimation estimators evaluation evfimievski evidence exponential family fayyad framework fred from gehrke ghosh grants handbook helpful heterogeneous hierarchical hill icdm icml icpr ieee imum incomplete inference information initialization international iterative jain jmlr johnson july kargupta knowledge koku labeled laird large lawrence learning like likelihood lindell line linearly lncs loss machine markov mccallum mcgraw memory methodological methods mining mitchell modha monte multiprocessors neal nigam pages papoulis parallel part partitionings pinkas predictors preserving principles privacy probabilistic probability proceedings processes quantification random ravi references refinement reina relative report reuse royal rubin rules scalable scale science series sigkdd sigmod smyth society springer srikant stacking statistical stochastic strategies strehl suggestions supported symposium systems technical text thank their theory third this thomas thrun toronto university unlabeled using variables verlag volume warmuth wiley with wolpert work would yamanishi york zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780649abs.htm 102 A Hybrid Data-Mining Approach in Genomics and Text Structures actually almost also analysis annu another appear applications applied applies
appropriate articlerender back based bases basic basis been believed better between bible biol biomol biophys cabios caim cancun characteristics characterization clustering comparative compared compbio complete components computing conclusions conditional conference confidently configurations connections connective conversion converted converting coregulation crisp current data deal decision derive determine determining developed different directly discourse discussion distance distances domain download dynamics each eisen elsewhere embs enhances equally error errors eugenek example expected exploring expression extension fast fcgi finally fira first fiser format found further fuzzy gasch gauss gene generate genes genome genomes genomic hanai have hidden histogram home honda html http hugheygif hughkrogh hybrid icdm identification ieee implementation important improve including indeed individual industrial inferring informatics international into introduced invited kobayashi kononov kroghgif large least lecture letters like literature logic looks making markov mart mass mathematics means melo method methods mexico mine mining mlynek modeling models modified moreover much needed netcom network neural neuro neurofuzzy next numerical obtained occur occurrences oradea ornl ornlreview other overall papers parts patyra performance plays plenary plus poor positions predict predicting prediction predictor predictors proceedings process promising proposed protein proved pubmed pubmedcentral pubmedid quality quite rdsor reconstructed recurrence references renom reported reports representation reprint research results role romania same sanchez separation september sequence sequences series sets several show showing similar single specific square struct structure stuart successful successive suitable symbols symmetrical system systems talkers teodorescu tested teubner text texts than that their then thinking third those through tomida tool train training trend tried type 
uberbacher ucsc university used using values various varying version very virus visual were where which wiley will with word words work writer yamakawa yeast yields http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780419abs.htm 52 A new optimization criterion for generalized discriminant analysis on undersampled problems Jieping Ye Ravi Janardan Cheong Hee Park Haesun Park academic algebra algorithms analysis based berry brien centroids classification cluster clustering computations concept conference criterion data decomposition decompositions deerwester dhillon dimension dimensional discriminant dubes duda dumais edition fukunaga furnas generalized generalizing golub hall harshman hart hopkins howland icdm ieee indexing information intelligent international introduction jain janardan jeon john journal landauer large latent learning least linear loan lower machine matrix mining minnesota modha numerical optimization paige park pattern prentice preserving press problems proceedings recognition reduction references report representation retrieval review rosen saunders science semantic siam simax singular society sparse squares statistical stork structure technical text third towards undersampled univ university using value wiley http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780411abs.htm 51 Segmenting Customer Transactions Using a Pattern-Based Clustering Approach Yinghui Yang and Balaji Padmanabhan Operations and Information Management Department The Wharton School, University of Pennsylvania {yiyang, balaji}@wharton.upenn.edu aaai above accurate across advances aggarwal agrawal alberta algorithm algorithms allenby along also alternatives american analysis another answers application approach approaches approximate argued association assumed attributes august australia authoritative automation based behavior beil best better between build canada categorical categories category chapter choice cikm city claim clope cluster clustering 
clusters coming comp compared computer concept conf conference consumer continue criterion customer customers data december definitions demonstrated density department derived described design determining difference differences different discovery discriminant discussion distribution distributions document does domain dynamical each eater econometrics edmonton effective effectively efficiency enable engineering estimation existence experiments extend fast finally firms first fraley frequent from functions future gibson good guan guha harder hence heterogeneity heuristic hierarchically highly hope hypergraph hypergraphs hypothesized icdm ieee ignore ignores important improve incrementally influence information international introduction intuitive investigate issues items itemsets journal july june kansas karypis kimbrough kleinberg know knowledge kumar large learning limitations lines madison management mannila many marketing means mentioned method methods metric mining minnesota mixture mobasher model modelbased models more multilevel natural need november observed opportunities other overall padmanabhan paper partitioning pattern patterns performs presented press probability proc proceedings procs proposed provide raftery raghavan rastogi recent references report representation representing research results rock rossi rule rules seattle sets shekhar shim sigkdd sigmod similar similarity similarly simplify sites solutions spirit srikant statistical statistics subjective subsequent such suggest sydney systems tech technical technique techniques technology term termbased text that their there these they third this toivonen traditional transactional transactions type understand university usage using utilizing verkamo view viewed vldb vlsi wang washington which while will wits work workshop yaca yang york zhao zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780003abs.htm 0 Efficient Multidimensional Quantitative Hypotheses Generation Amihood Amir 
Department of Computer Science Bar-Ilan University,52900 Ramat-Gan, Israel (972-3)531-8770, amir@cs.biu.ac.il and College of Computing, Georgia Tech, Atlanta, GA 30332-0280 Reuven Kashi RUTCOR-Rutgers Center for Operations Research Rutgers, The State University of New Jersey 640 Bartholomew Rd, Piscataway, NJ 08854-8003 kashi@cs.biu.ac.il Nathan S. Netanyahu Department of Computer Science Bar-Ilan University 52900 Ramat-Gan, Israel (972-3)531-8865, nathan@cs.biu.ac.il and Center for Automation Research, Univ. of Maryland, College Park, MD 20742 accepted acknowledgments advantages agrawal algorithm amir analyzing approaches attributes authors based bases bayardo both cluster clusters combinations complexity computer conclusions conf conference contained correlations creates daniel data database databases described detection dimensional discovery discussions elements engineering estimation every everything feedback fields fifth finding from function generate generated generator gives gratefully gries growth haas haralick have helpful high histogram histograms housten however icdm ieee image improved indeed interesting international ioannidis italy john june kashi keim knowledge kriegel large leroy linear local madison markus mathematical method mining misra most netanyahu november number outlier over pages poosala predicates probability proc proceedings programming proposed providing quantitative queries range ranging rather record records references regression relations repeated respect robust rome rousseeuw rules running science seeking seidl selectivity september series sets shekita shows sigkdd sigmod similar size sizes some sons sought space statistical statistics structural supporting synthetic table techniques texture thank that thesis third three thus time times triple univ used variables various very visual vldb wawryniuk wiley will wisconsin with http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780565abs.htm 81 PixelMaps: A New Visual Data 
Mining Approach for Analyzing Large Spatial Data Sets Daniel A. Keim, Christian Panse, Mike Sips University of Konstanz, Germany advizor advizorsolutions algorithm algorithms amounts analyst approach arcview case clustering comparative concepts conference constraints danalys data databases defined degree discovery edinburgh effective effectiveness efficient error esri evolutionary exploration fast features figure from gridfit herrmann hinneburg homepage html http icdm ieee insight intensive interfaces international invited kamber kaufmann keim knowledge koutsofios large library measurement mineset mining morgan multi multimedia multiobjective nature noise north objective optimization optimizer overlap pages paper parallel pdfs pixelmap position ppsn preservation problem proc proceedings publishers references relative scale section september sets software solutions solving spatial study systems talk techniques telecommunication thiele third user using visual visualizing white whitepapers with workshop zitzler http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780557abs.htm 79 SVM Based Models for Predicting Foreign Currency Exchange Rates about accepted accounting accuracy accurate achieved actual addition affect agent also although analysis appear appeared appears applic application arima attractive australian back based basis behavior behaviour believe best better british case cases change changes check choice chosen clearly complex complicated comput computing conclusion conclusions conf conference connor considered consistent control convincing correct could coverage currencies currency currently data deviational deviations difference different directional dollar downward drops each equally error errors every evidence exactly exchange exhibits experiment favour figure finance financial first fishwich following forecast forecasting forecasts forex four framework francosco free from function functions further gain general gestel given good grothmann 
happen hence higher hill historical holden however hybrid icdm icnnsp ideally ieee illustrated illustrates impact improvement increased increases indicates indicators individual intelligent international investigates investigation irrespective japanese jenkins jhee journal kamruzzaman kernel kernels larger later learning least less linear little long loss lower machine machines management managerial market markets matching maximizing maximum means medeiros melbourne method metrics minimizes minimum mining mixed mnse modeling models monitoring more most much multi multiple nanjing nets network networks neuneier neural neurocomputing nmse nsme observed omega only orsa other over paper parameter parameters pattern pedreira perform performance performs point poly polynomial possible pound practical practice practitioners predicted predicting prediction proc proceedings processing produced produces propagation radial range rate rates reducing references regularization reguralization remain remains remus respect result results same sarker science second selecting selection sensitivity series short should show showed showing shows signal similar single situation small smooth some spline squares statistical still stock study submitted success suitable superior support suppose systems tang technical term terms than that their then theory there third this thought three time total trans transitions trend trends type under unit units upward using value values vapnik variation varies vector vectors veiga versus very view wang week weeks when whereas while wide wiley winner with within words would yields york zhang zimmermann http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780083abs.htm 10 Mining Significant Pairs of Patterns from Graph Structures with Class Labels Akihiro Inokuchi and Hisashi Kashima Tokyo Research Laboratory, IBM Japan 1623-14, Shimotsuruma, Yamato, Kanagawa, 242-8502, Japan {inokuchi,hkashima}@jp.ibm.com aids algorithm answering antiviral
application approach approaches aprioribased arikawa arimura artificial asai association associations based both comlab complete concept conference connected cook correlated data database databases deterministic discovered discovering discovery docs efficient efficiently european exception exceptions fast feature february finding forest fragment frequent from fuzzy global gonzalez graph graphs groups gspan helma holder horst html http icdm icml ieee ijcai implication inokuchi intelligence intensity international itemset joint karypis kawasoe knowledge knowledgediscovery kodratoff kramer kuramochi large lattice learning levelwise machine machlearn march metric mining molecular morishita most motishita motoda negative nishimura oldwww optimization oucl pattern patterns pkdd pods positive practice principles proc proceedings pruning raedt references relational report research rough rsfd rules sakamoto screen semi sese sets siam space springer statistical stochastic structured subgraph subgraphs substructure substructures summarizing surprising suzuki symposium systems third traversing trees undirected unexpected unified version washio with workshop zaki zhang zytkow http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780645abs.htm 101 Mining the Web to Discover the Meanings of an Ambiguous Word about accomplished account accounted achieved admit advances algorithm ambiguous amount analysen annual applications apply approach arbitrary arns association assoziationen assumption automatic automatically available aware based been behavior benefit berechnung best better bias called cambridge candidate case cases certain certainly changes choose chosen church clear close closer clustering collocation collocations column comments common commonly compared computational computed computer computers conclusions conference confidence considering constraint contain contemporary controlled could course crucial current data defining described describing descriptor 
descriptors desirable despite dictionary different difficult difficulty diplomarbeit disambiguation discovering discovery discussion disk dissertation distinction distinctions doctoral document documents does domains dominant early edmonton engineering english enough entries evaluation even examine example expected experiments extending fachbereich fails familiar field first florida following formulated foundations fourth frequent from fully further future gain gale generation germany given good google hand handheld happened have heaven hildesheim homograph humanities icdm identify identifying ieee include increasing indicate induction information instead instrumentation interesting international internet interpretation interpretations into issue journal kilgarrif knowledge known language large learners less lexikalischer like likely limitations limited linguistics list longman look looks main manning manuscript many master mcevoy meanings means meeting mehrdeutigkeiten memory methods mining mixtures more much multiword natural neglecting neill nelson nevertheless norms number observed occurrences often olms only other ours over paderborn pages pair pairs palm palmer pantel perform performance phil physical plans plausible poach pointed poisson positive possible presented press problem problems proceedings processing produced produces promising proposed prospects psychologie quantitative question quickly rapp rare rather reason reasonable references reflect reflected reflects related research respective restrict restricting results retrieved retrieving right rivaling sake same search second selection semantic sense senses senseval should sigkdd since some sometimes somewhat south space special specifying speech sprachstatistische stage statistical step straightforward supervised surprising surprisingly syntactic system table take tamir task technology tendency terms text texts than that then there these thesis third this thus time towards unexpected unfortunately 
universit university unsupervised usage used using utilizes version vocabulary walling well wheeler when where which with word words work yarowsky http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780633abs.htm 98 Mining Semantic Networks for Knowledge Discovery adaptive addison analog astar automatic better carpenter categorization chunk conceptual conclude cone conference corpus cream dagan data databases dictionaries disambiguation discovery entity exploiting fast feldman from fuzzy graph grossberg hmmbased http icdm ieee improvement indeed information international kanagasa knowledge lead learning lesk machine mind mining named networks neural pages patterns people performance pine precision proceedings processing rajaraman readable recall recognition references resonance rosen sense sigdoc sowa stable structure structures system tagger tell textmining textual that third thus tmcorpus using wesley zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780331abs.htm 41 Combining Multiple Weak Clusterings Alexander Topchy, Anil K. 
Jain, and William Punch Computer Science Department, Michigan State University, East Lansing, MI, 48824, USA {topchyal, jain, punch}@cse.msu.edu aaai accumulation accuracy acknowledgements acquisition adaptive again agglomerative algorithm algorithms also analysis analyzed annals annual arises artificial association astronomical attractive automated award bagged bagging balanced based because been behavior being best bias biological boosting breiman built case categorical categories category center city class classical classification classifier close cluster clustering clusterings clusters cognitive collective combination combining compared comparing complexity component components computation computational computing conceptual conclusion conf conference consensus considered consistent contrasting control corresponding corter criterion cspa data datasets definition details dimitriadou discrimination distributed dominated dubes economics effectively effectiveness effort empirical ensemble ensembles erlbaum even evidence experiments explanatory expression extended factor faster finding first fisher flynn formulated framework fred freund friedman from function functions galaxy gene generalized ghosh gluck grant half hall happened have heterogeneous hgpa hierarchical hillsdale hornik humphreys hypergraph hyperplanes icdm icpr ieee impossibility improved increases incremental information informationsverarbeitung institut intelligence intelligent international intra introduced jain jersey johnson journal kargupta kaufmann kellam kittler kleinberg knowledge large lawrence learning leisch limited lncs machine management many martin mathematics matrix means median medicine menlo method methods michigan mining mirkin model modeling modelling monterrey morgan most multiple murty mutual networks neural nips number objects obtained odewahn offered olshen orengo overall papers parallel parameters park part partition partitions pattern pennington performance performed 
pharmocology power prefered prentice press previous problem problematic proc proceedings processing produktionsmanagement prohibits projecting projections quebec quinlan random recognition references regression reinterpreting related representation research resolution respects results reuse review rings roli same scale schapire science second sets seventh several show shown simple since size sizes slightly society speed splits springer star state stochastic stockwell stone strehl study subspaces such supported surveys swift systems terms that theorem therefore third thirteenth this though three times toward trees tucker uncertainty unified university used using utility values variance verlag vienna viral votingmerging wadsworth weak weingessel well what when which wien with work working workshop worse zaki zumach http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780739abs.htm 124 Regulatory Element Discovery Using Tree-structured Models Tu Minh Phuong, Doheon Lee, Kwang Hyung Lee Department of BioSystems, Korea Advanced Institute of Science and Technology 373-1 Guseong-dong Yuseong-gu Daejeon 305-701, Korea {phuong, dhlee, khlee}@bioif.kaist.ac.kr academy acid acids activities american analysis andisoprenoids appendix approaches architecture associated association available binary biogenesis bioinformatics biology biosynthesis botstein breakdown breiman brown budding bussemaker cell cerevisiae church classification cluster combinatorial computational conference conlon correlation cycle cytoskeleton data database destination detection determination discovered discovery display eisen element elements energy estep eukaryotic expression fatty fattyacid feature friedman full functionally generation genes genetic genetics genome genomes groups harvard html http hughes icdm identification identifying ieee integrating international isoprenoid journal keles laan lieb lipid lipids list longitudinal metal method methods mewes mining mips mitotic models molecular 
motcomb motif motifs multiple name names national nature network networks niemann nucleic ohler olshen orfnum other oxford patterns pilpel press proceedings program promoter promoters protein proteins recent references regression regulation regulatory related research responses science segal selection sequences siggia spellman sporulation statistical stone structured systematic tavazoie third tpilpel transcription transcriptional transport transporters tree trees trends using wadsworth wide with yeast zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780473abs.htm 58 Analyzing High-Dimensional Data by Subspace Validity Amihood Amir, Reuven Kashi, Nathan S. Netanyahu Bar-Ilan University Department of Computer Science 52900 Ramat-Gan, Israel {amir,kashi,nathan}@cs.biu.ac.il Daniel Keim, Markus Wawryniuk University of Konstanz Computer & Information Science 78457 Konstanz, Germany {keim,wawryniu}@informatik.uni-konstanz.de agarwal aggarwal aggrawal algorithm algorithms amir analyze analyzing applications approach appropriately automatic automatically bases boundary bounded boxes carlo center cluster clustering clusters computer conclusions conference considering constructs dallas data databases dempster diego dimensional dimensions discovery effective everything extending fast faster finding from function future gehrke generalized graphics gunopulos have high higher hinneburg hyperbox icdm idea ieee image implementation incomplete information international into italy january jones journal june kashi keim knowledge laird large likelihood management march maximizes maximum meaningful method methodology mining monte murali netanyahu notes obtain optimal other pages park past pennsylvania philadephia plan press proc proceedings procopiuc projected projections projective pursue quality quantitative raghavan references research resulted roma royal rubin sampling seattle september sets shown sigkdd sigmod size small society spaces statistical subspace system 
techniques texas them third those transactions tutorial tvcg unbounded uses very visual visualization vldb washington wawryniuk well with wolf work http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780485abs.htm 61 A Fast Algorithm for Computing Hypergraph Transversals and its Application in Mining Emerging Patterns acknowledgements advanced against algorithm also approach assistance averaged becomes berge blowout border call calls cases challenging clearly competitive computing conference considerably considered data datasets determines diff dimensionality edges elias elsevier evaluation examined executable expected expertise faster from grant half handling here holland hypergraphs icdm ieee important impractical impressive improvement improvements increases indicate initial instance instances international kavvadias library magnitude many mathematical mechanism mining more must next north number order over part partition partitioning partnership performance positive potential presented priori problem proceedings prohibitive property proportional providing references result results satimage search shown shows single slower space stavropoulos strength suffered superior supported system table than thanks that there these third this times tree uniformly very victorian volume waveform when which while with without work http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780307abs.htm 38 1 acknowledgements acoustics addison adelson algorithms also analysis annealing applications applied approach artificial banks based bebis better biological bovik browsing buhmann burges challenging chromosome chung clark clustering combining company comparison complete component compression computer computing conclusion conference crcd customized cybernetics data daugman design desing detection deterministic diba different discovery discrete discrimination domain dunn edge efficient eliminate encoding engineering evaluate evolutionary explicitly face family 
farrokhnia features fifth filter filters flynn followed ford framework freeman functions further future gabor geisler general genetic georgiopoulos goldberg grant grigorescu hall hamamoto handwritten have higgins hofmann iapr icdm ieee image images improving independent initiative intelligence intelligent international introductory jain journal knowledge kuizinga learning localized machine machines manjunath mehrotra method miller mining mitani miyamichi motor multichannel multiple murty namuduri nature network neural nevada numerals object okombi optical optimal optimization optimized optimizing otimally papageorgiou paper parameter parameters part pattern performance petkov plan poggio powerful prentice problem proceedings processing proposed provides puzicha ranganathan recognition recognizing redundancy references reno research responses retrieval review road schemes search segementation segmentation selected selection september sets shoji signal simple singapore spatial specifically speech springer statistical steerable such support supported surveys system systematic systems techniques test tested texture textured than theory third this tomota tools traditional trainable transactions transforms transportation trucco turner tutorial types uchimura uing under university unsupervised using uthiram vapnik vector vehicle verification verlag verri viii vision watanabe wavelet weldon were wesley work workshop yasuda yielded http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780011abs.htm 1 ExAMiner: Optimized Level-wise Frequent Pattern Mining with Monotone Constraints Francesco Bonchi, Fosca Giannotti Pisa KDD Laboratory agrawal algorithm algorithms association based chen chile conf conference constraints convertible data databases discovery effective fast frequent hash icde icdm ieee international item knowledge kohavi lakshmanan large mason mining pages park performance proc proceedings real references rule rules santiago sets seventh sigkdd sigmod 
srikant third twentieth very with world zheng http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780051abs.htm 6 Is random model better? On its accuracy and efficiency Wei Fan, Haixun Wang, Philip S. Yu, and Sheng Ma aaai ability accuracy accurate achieves advances agrawal agrawl algorithm algorithms amit annual application approach artificial averaged averaging bagging based been beliefs best blunt boat boosting both bound bradford breiman brodley brunk builds buntine california chan chapman choice choose classification classifier classifying columbia combining common comparable comparing complete completely computation compute computer computing conclusion conference confidence constructed construction contrary cost costs could create cristianini data database databases datasets decision decisions dependent depth derive different discovery discuss diversity does domingos each editor editors elkan empirical empirically ensure entirely estimate european evaluate even exactly expensive extending extensible fast feature features first fourth francisco freund from frontiers function gain ganti gehrke geman generalization given guarantee half hall hand have held heuristic heuristics higher hypothesis icdm ieee independent inductive information intelligence international item jordan kaufmann kearns knowledge kohavi kunz large learning less line london loss lower machine main making management mansour mehta memory meta metacost metalearning method methods minimal minimisation mining misclassification model more morgan most multiple necessary neural number occam once optimal optimality optimistic orthogonal outputs pages parallel perceptron picking point posteriori practice predictors press probabilities probability problems proceedings processing programs propose proposed pruning quantization quinlan ramakrishnan random randomized randomly razors real recognition references requirement results risk rissanen scalable scanned scans schapire sciences sensitive 
seventh shafer shape sharp shawe shown sigkdd sigmod significantly single sliq solla splitting sprint statistics step still stochastic store structural structure subset such sufficient symposium system systems taylor techniques technology than that then theoretic theory thesis third time times topdown training tree trees twentysecond university unknown untested update very vldb when with world zadrozny http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780387abs.htm 48 Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution acknowledgments affinity agrawal algorithm algorithms also alternative among anti applications army association associations avoid based baskets bayardo between beyond brin bulletin burdick butterworths calimlim called center charm cheung chin cikm closed closet clustering code cohen collaborative comments committee computer computing concept conference confidence confident contract copy correlations cross crosssupport daad data databases datar demonstrated detection developed different dimensional dimensionality discovering discovery distribution dmkd dubes edition efficient efficiently eighth elements engineering fast finally finding frequent friedman from fujiwara furthermore future gehrke generalizing generating gionis grant hall hastie high hsiao hyperclique hypergraph icde icdm ieee imielinski indyk inference information interest interesting interestingness international introduced involving item items itemset itemsets jain johannes karypis kumar large learning levels like llnl london long mafia march market maximal measure measures michael miner mining minnesota mohammed monotone motwani nasa number omiecinski pages partially patterns perclique performance potential prediction prentice proc proceedings properties property providing pruning reduction references report requirement research results retrieval right rijsbergen rules science selecting sets shashi shekhar showed sigkdd sigmod 
silverstein skewed springer spurious srikant srivastava statistical steinbach strong such summary support supported swami technical thank there third this tibshirani tkde transactional ullman univ used using utilizes valuable variety vldb wang with without work would xiong yang zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780693abs.htm 113 General MC: Estimating Boundary of Positive Class from Small Positive Data Hwanjo Yu hwanjoyu@cs.uiuc.edu Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 USA abruptly acapulco affect also appear approximation articial australia based bayes becomes boundary canada chang class classification classified compare compared completely conf conference could data daviddlewis details discovery documents dominant drops earn edmonton engineering enjoy ever example examples false form fraction from gives good highly howso html http hurts icdm icml ideal ieee ijcai intelligence international isvm joint knowledge labeled large learning less letter likely limitations liub machine machines main manually maxico method methods mining mlearn mlrepository movies naive negative negatives noisy noted number only osvm other outperform outperforms overfit page pages partially pattern pebl performance point positive positives previous proc proceedings published randomly readme recall recently recognition refer references report repository resources respectively results reuters same sampled scores seven shows significant significantly single slowly small some space standard subsititute supervised support susceptible svmc sydney table testcollections text that third this those thought trained training transactions true under universal unlabeled unreasonable used uses using vary vector vectors very watching when where which with without http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780427abs.htm 53 Sentiment Analyzer: Extracting Sentiments about a Given Topic using Natural 
Language Processing Techniques aaai access accuracy accurate accurately achieved adjectives advanced advertisements affect algorithm algorithms also although amento amount analysis anazon annotated answers apfa applied approach areas articles associates association automation available based berland better boards boys building cacm callan cases characteristic charniak chen classification classifier classifiers cogsci coincidence collocation comparable computation computational conf conference consistently continued contrast corpora corpus corresponding creter currently data dave degrades demonstrated designed detecting detection dictionary difficult direction discussion document documents does down dunning emnlp emotion enabling english entropy estimation exact expect experience experiments expert extracting extraction fact fair feature feedback finding finer foundations from fukushima full future fuzzy gallery general girls granularity handle handles hatzivassiloglou have hearst high hill however huettner human icdm icml identified ieee improvements increasing inevitable information inheritance initial intelligent international interpretation involvement issue katz kinds knowledge lafferty language lanuage large lawrence learning level lexi lexicogra likelihood line linguistics machine mainly management manning manual marcinkiewicz marcus margin market maximum mcdonald mckeown mean message methods miller mining mixtures model modeling more morinaga motor natural necessarily neutral news nouns occurring ofspeech online open operational opinion opinions orientation pages pang papers parsing part parts patterns peanut penn pennock perceptual perform performance phoaks plan point potentially precision predicting press princeton proc proceedings processing product provide quality questionnaire ratnaparkhi recommendations references refinement related relationship relevant reputations require research results retrieval review reviews reviewseer rovinelli sack santorini 
schutze second seen selection semantic semantics sentence sentences sentiment sentiments sharing sharply sigir sigkdd skills some special spring state statistical statistics stock structure style subasic subject subjective subjects substantial successfully surprise symp system systems tagging targeted techniques television term terms terveen teteishi text than that their theory these third those thumbs thus tong topic tracking trans treebank turney typing unlike unsupervised uses using vaithyanathan validation very view well when whissell wide wiebe will with women word wordnet work workshop world yahoo yamanishi zhai zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780541abs.htm 75 Ontologies Improve Text Document Clustering Andreas Hotho, Steffen Staab, Gerd Stumme {hotho,staab,stumme}@aifb.uni-karlsruhe.de Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany abisworkshop adaptive adaptivitat agirre aifb background based briefly cacm categorization clustering coling collection comparison computational conceptual could database density details disambiguation document domain dortmund english evaluation extensive given gutschke hannover henze hotho hypermedia improved improves increases instance institute introduction issue karlsruhe karypis kategorisierung knowledge kumar lehren lernen lernens lernobjekten lewis lexical linguistics llwa maschinellen measures methoden miller mining more most only ontology open other pages paper points present proc purity rahmen references report results reuters rigau sense short significant special specific staab state steinbach strategy studienarbeit stumme tailored technical techniques test text textuellen this towards universitat university using values veronis well when wissen word wordnet workshop workshopwoche http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780715abs.htm 118 Findings from a Practical Project Concerning Web Usage Mining achieved activities additional administraci 
adverts advisable also amounts analyses analysis analyze annals ante anual application articulocladea artikelcladea asamblea attraction automobile avoid based berendt bought breiman browsing bytes carline case chapman city cladea clara classification classified classify click clicks clinton club commitment communication compatibility concrete conference configuration configurations configurator consejo content controlling cooley correctly could counted crisp crispwp ctica customer data decision dellmann detailed discount discovery diss distribution done duration during dynamic emerge escuelas estad evaluation extended files financing findings first following formulation friedman from function further future gain gained german give given globalizaci group guide have help hitting however html http icdm ideas identification ieee important improve improvement improving incentives include including increased indications inferencial info information instrument interesting international internet into introduce issue kerber khabaza knowledge kumar larger latinoamericano lies lisis management mann marketing mathematical measures measuring mexico mining minnesota mobasher model modelos more motivated muenster must navigation navigational needs negocios nuevos number offers olshen only order other outlook pages paper paths patterns performance person pitkow planning pohle possibility possible practical predict predicted preparation problems proceedings product profile programming project projects proved provide provided provides random rate realised realtime recommendations references regarded regression reinartz relevant reliable research robot santa scope search semantics session sessions shearer simplifies site sites sitios sixth special specific spiliopoulou split splits splitting srivastan start started static statistics status step steps stica stochastically stone subsequently success such suitable supply support supporting systems technical test than that their them 
therefore third this those through thus todos total tree trees university usage used user users using utilizaci valuable variable variables version very visualize volkswagen wadsworth well were when whether which whitney wide will wirth with within without worked world would http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780613abs.htm 93 Protecting Sensitive Knowledge By Data Sanitization access achieve addition again alberta algo algorithm algorithms also alters among applications approaches approximately april association atallah balance balancing based bertino between called canada chicago china city clifton computer conclusions conference confidence controlled cost dasseni data database databases december department direct disclosure discovery does drops effective efficient elmagarmid enabling encryption engineering expensive experimental false figure file files fixing flexibility frequent from gives have hide hiding hong iane ibrahim icdm ideas ieee illinois important improvement improves improving indexing information international introduce introduced inverted involved itemset japan july june knowledge known kong large limitation literature lowest maebashi mainly makes method mining misses more moreover need note november number oliveira original other over owner pages paper pattern pittsburg point pointers possible presented preserving prevent privacy proc proceedings protection purposes range record references relevant report reproduce restricted restrictive results revealed robust rule rules sanitization sanitized sanitizing saygin scalability scale scan scans science security sense sensitive shows sigmod significant size sliding slightly someone strong support symposium technical that there third this thousands threshold thresholds time transaction transactional transactions tune university unknowns useful using varied verykios want well whenever while window with workshop 
http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780593abs.htm 88 Interpretations of Association Rules by Granular Computing algorithm also apart association associations attributes based between bulletin categorical china chinese cohen computers computing concept condition conference contribution criteria current data database databases dealing decision describe determining discovery easily efficient engineering evidence extended find finding frequencies from fuzzy generalized granular granules heinsoln icdm ieee include information interesting international interpretation journal knowledge kruse lattice level main measure methods mining more multiple numeric numerical optimized other paper patterns pawlak presents proc proceedings provide pruning pursuit random rastogi rdinternational reasoning references relationships rough rules schwecke sets shim significantly society springer structure support system systems than that theory third this transactions trends uncertainties uncertainty used vagueness verlag when with without york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780171abs.htm 21 Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis aggarwal algorithms analysis anatomy applications arginine artificial association associations based between biology breunig brin cambridge candan collaboration communications complex computer conf conference connections construction cook correlation costa darpa darpatech data datasets dbms dealing detecting detection dimensional dirty discovery distancebased document dutra efficient engine engineering evidence evidences extracting extraction faust fayyad fifth fish forensic foundation freitas from generation graph hamilton high hilderman holder http human hypertextual icdm identifying ieee implicit inductive intelligence intelligent interestingness international isolated italy journal kimball knorr knowledge kovalerchuk kriegel kumar large link literatures local logic 
magazine measures medicine melville methods mining mooney mutually national network next optics outlier outliers page patterns perez perspective perspectives piatetsky pkdd presentations press principles proc proceedings process program programming public ramaswamy rastogi raynaud reasoning references regina relational report rule rupert sander scale science search senator sets seventh shapiro shavlik shekhar shim sigkdd sigmod site smyth social somatomedin spatial speeches statistics survey swanson syndrome systems tang technical third undiscovered university useful valdes venice vityaev vldb volumes wasserman wide with workshop world zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780291abs.htm 36 A High-Performance Distributed Algorithm for Mining Association Rules Assaf Schuster, Ran Wolff, and Dan Trock Technion - Israel Institute of Technology Email: {assaf,ranw,dtrock}@cs.technion.ac.il additional agrawal algorithm algorithms ananthanarayana applications approach asso association average baltimore bangalore barbara basket beach becomes been between boston both bottleneck bounds brin california candidacy candidate candidates case chen chernoff cheung chile ciation cluster communication compensate computers conclusions conf conference conjunction constraints continue cost counting data database databases december demonstrate depends discovering discovery discussed disk distributed dsampling dynamic earlier effective efficiency efficient engineering enough even every exact experiments extending fact fast faster feel figure finally first florida fraser frequent from full future generalized generation good graphs greater guided hagerup hajj hashbased have held high hipc however icdm ieee iftode imielinski implication improvement increases increasing india information intend interesting international into ipps items itemset jarai jose journal june kedem knowledge large larger lead letters loading major management march market maximum memory 
miami mild mines mining more motwani murty navathe nothing november number numbers october omiecinski open other outperforms overhead pages parallel parallelize parallelized park partition partitioning pattern patterns performance pincer potential presented previous proc proceedings processing prove provided push questions range realized record references relation remain report require research resident rules runs sampling santa santiago savasere saving scalability scalable scan scans schuster search second seen september sets shafer share sigkdd sigmod simon single size slowdown smaller some speed speedup srikant still subramanian such superlinear swami systems technical technology tests than that then there they third this toivonen tour towards transactions tsur ullman university until uses value version very virmani vldb washington when will with without wolff workshop would zaiane http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780673abs.htm 108 Class Decomposition via Clustering: A New Framework for Low-Variance Classifiers Ricardo Vilalta, Murali-Krishna Achari, and Christoph F. 
Eick Department of Computer Science University of Houston Houston TX, 77204-3010, USA {vilalta, amkchari, ceick}@cs.uh.edu accuracy algorithm also analysis approach assignment assignments bagging bayes best bias bienenstock blake boosting boundaries breiman california class classes classifier clustering clusters computation computational computer conference configuration cost data databases decision decomposition demonstrate dept dilemma domains doursat either elements equal european evident experiments explain exploited explores finding freund friedman future geman hastie icdm ieee improve improvement increased induced inference information international irvine journal knowledge learning linear look machine maximize mining most naive networks neural over pages performance possible prediction predictors probabilistic proceedings proposed quality real references repository resulting results rish schapire sciences searching show space springer statistical that third tibshirani university used variance verlag vilalta ways will with work world http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780731abs.htm 122 Mining Production Data with Neural Network & CART Mingkun Li, Shuo Feng, Ishwar K. 
Sethi, Jason Luciow, Keith Wagner accounts achieved additionally adjust also analysis applications applying approach approaches assembly based bauer been belmont belue between boston both breiman breneman capabilities cart center change chen classification coated coater coating commerce comprehensive concepts conclusion conference confirmed control corp currently customer data dead design determining dhond different digest direct discussed dominant duda each electricity electronic embrechts employed engineers error exactly experience features feng foundation friedman from gain give given glass group guardian guide guideline gupta hart haykin highly hill however icdm ieee importance important improve improvement industries input insight international into inventories john kamber kaufmann kewley known layer layers learning line machine macmillan march markets measurements methods mining model models morgan most multilayer nadel nature need network networks neural neurocomputing nonparametric november often olshen only operator optimizing order other paper pattern perceptrons pharmaceuticals powerful primary proceedings process production provides publishers pyzdek quality quantitative range rank ranked ranking readings references regression relationships relative report results robustness same science score sensitivity served sethi setting shorten should sigma significance silver sons specification splitter springer statistical stone stork strip surrogate table technical techniques technology that theory they third this three throughout time tolerance transactions transmission tree trees usage used using vadhavkar vapnik variable variables verify virtual wadsworth well were what when which widely wiley with within working http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780219abs.htm 27 Change Profiles Taneli Mielikainen HIIT Basic Research Unit Department of Computer Science University of Helsinki, Finland Taneli.Mielikainen@cs.Helsinki.fi aaai 
abdulghani absolute acknowledgments adams additive advances agrawal algorithms also annual approximability approximate approximation apriori artificial association ausiello automata average averages based bases basic bastide beeri behaviour bernstein binary blockeel bolton boolean borders borgelt boulicaut bound buneman bussche bykowski calders candidate case castro census cercone chaining change chapter chen chicago chromatic chung closed closures clustering coffman collections combinatorial comments comparing comparison complexity compstat computational computer computing concise condensed condensing conejo conference constraints constructive could counting courcoubetis crescenzi cubegrades curves dasgupta data database databases derivable detection deviation discovering discovery discrete disjunction distance distribution dong down dynamic editors efficiently eidenbenz eight elements elomaa encouragement engineering episodes error estimate estimation estivill event examined example experiments explorations exploring expression fast fayyad feder feige figure find floris forest free frequency frequent friedman from functions further gamberger garey garofalakis gaussian geerts general generalization generalize generalizing generation generators geometric global goethals greene grieser guarantees guntzer hand hardle hashing hastie heikki hennessy hierarchical hipp holt icdm icdt ieee ifip illinois imielinski implementation impossibility induction inference information intelligence international interval inverted investigated ipums itemsets johnson joint journal kann karpinski karypis keim kenyon khachiyan kilian kivinen kleinberg knowledge kruse kryszkiewicz kurakochi lakhai lakhal landscapes languages lavrac learning lecture letters levelwise local management mannila many marchettispaccamela mathematics mean methods middle mielikainen minimal mining mitchell morales nakhaeizadeh naughton negative neural nicely nips noise noisified notes number ones online open 
optimal optimization ordering outside packing pages paper pasquier path paths pattern patterns perfect performance perturbed physika piatetsky pkdd position possibilities possible prediction press principles probability problem problems proceedings processing profile profiles programming properties protasi pruning queries rabani random rastogi references regular reidys representation representations review rigorously rigotti ronz rule rules sampling scalable schemes science search seem seems sequences sequential series sets shapiro shim shor should siam sigact sigart sigkdd sigmod sloan smyth society some somewhat springer srikant stadler standard statistical statistics still studied stumme subgraphs suitable suited survey symposium systems tanaka taouil techniques terano thank that their theorem theorems theories theory third thus tibshirani tight todorovski toivonen transactions trees triguero twenteenth twentieth uniform upper useful using uthurusamy vega verkamo verlag views volume weber well widmayer wish with without workshop yamamoto yannakakis zaki zero http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780481abs.htm 60 Efficient Subsequence Matching in Time Series Databases Under Time and Amplitude Transformations above acknowledgments adaptive advances afrati agrawal alexander algorithm algorithms along also although amplitude applied approach approaches approximate approximately approximation arbitrary argyros artificial asia aspects attention attracted attribute barbara based bases beach been both candidate cannot chakrabarti chaos chicago china comments complexity computations concentrated concentration concerning conf conference creating criterion czech data database databases datasets definition diego different digital dimakis dimensionality discontinuity discovering discovery discussions dismissals diverse done draft dynamic edinburgh efficient efficiently engineering ermopoulos european exact extend extended extensively extracting 
faloutsos false fast finance find finding follows foto foundations fractals fraction furthermore greece gunopulos gurel handle hard have high hong however http icdm idea identifying ieee inappropriate include including index indexing information initial insightful intelligence interesting international invaluable invariant jagadish japan journal kahveci keogh knowledge kokkinos kong kyoto landmarks large length lengths like linear locally magnitude management managment mandelbrot mannila manolopoulos maragos massive matches matching mehrotra menlo method mining minneapolis model modeling modulations more moreover mostly much multi newport ninth nonlinear norway ntua number only operations order organization orlando other over pacific pages paper park parker pattern pazzani perng philadelphia pioneer practice prague presented principles probabilistic problem proceedings process processing produced propose proposed queries query querying ranganathan recent reduction references regarding related republic requires research results retrieval risk robust santa santorini scaled scaling scientific scotland search searching seems sequence sequences series shift shifting sigmod signal significant similar similarity simple since singh skew small smyth solve some speech springer statistical subsequence subsequences subsequent swami symposium systems targyros technique techniques than thank that them there these they third this time tools transform transformation transformations trivial trondheim under used using variation variations vast verlag version very wang warping whether with without wong work would yield zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780577abs.htm 84 Tree-structured Partitioning Based on Splitting Histograms of Distances accrue adjusted algorithm algorithms also american analysis applied applying assessment automatic based berlin bing breiman called cancer cell choice classical classification clearly close cltree clus cluster 
clustering clusters columns composed computed conference construction correspond correspondence corresponding criteria data decision disagrees discrimination distance distinguish each eisen electronic euclidean evaluation events everitt expression feature features figure find finding five found friedman from gene genet ground groups hamish hand hard hartigan have heinemann hennig html http human icdm icpr ieee image index information intelligence international irises iyer jeffrey kaufman kaufmann kohonen lakamper latecki learning lect lines london lowered machine maps matrix mean measure measurements medg method methods mining misclassifications misclassified misclassifies morgan normalization normalized notes objective objects observations obtained olshen only organizing outperforms pamethod pami parameters partitioning parts pattern patterns pavel pergamenschikov perou philip plot points proc proceedings program proposed quinlan rameters rand recognition references regression report research resulted resulting results retrieval rijn ross rousseeu rows rules same samples scherf self separated shape show shown similarity slide software space spellman springer stathome statistic statistical statsoft statsoftinc stone summarized survey systematic table techniques tering textbook that there they third this three through thus trans tree trees truth unpredictable using vantage variance variation very videos visual wadsworth waltham when where which wildt wiley with without yiyuan york zero http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780609abs.htm 92 Simple Estimators for Relational Bayesian Classifiers Jennifer Neville, David Jensen and Brian Gallagher Knowledge Discovery Laboratory, Department of Computer Science, University of Massachusetts Amherst, 140 Governors Drive, Amherst, MA 01003 USA {jneville | jensen | bgallag}@cs.umass.edu aaai ability acknowledgments afmc afrl aggregating analysis appears approach artificial attribute avgprob avoiding 
based basic bayes bayesian bias biased both building categorization characteristics circumstances classification classifier classifiers clearly clustering comparison complex conf conference contained contract contributions correlated darpa data decomposition degree development disparity domain domingos draft dzeroski earlier effects engines estimation estimators event experiments exploit fairgrieve first flach freiburg friedman further future gallagher germany getoor good have icdm identified ieee imdb include inductive information intelligence international intl jensen joint june kersting koller lachiche lavrac learning logic loss lower machine massachusetts mccallum mining models multiple multiset naive national neville nigam number numbers optimality order over overall pazzani performance performs pfeffer principles probabilistic proc proceedings programming programs quite raedt rare real references relational rennie report research result reveals ross search segal select sets seymore simple specific springer squared supported synthetic taskar tasks tech technical text thank that third this true under unified university usaf values variance verlag well when which will with within work workshop world zero http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780227abs.htm 28 Complex Spatial Relationships Robert Munro and Sanjay Chawla and Pei Sun School of Information Technologies University of Sydney rmunro, chawla, psun2712 @it.usyd.edu.au aaai address advances agrawal algorithms allows analysis apart application applied area association attributes bailey bases bernstein biological bocca both brin candidate chawla chen colocation combination complex component computer computing conf conference confident coordinate data databases demonstrated described determining different directions discovering discovery discussed editors efficient egenhofer engineering especially even evident explicitly extent fast features forms frequent from fundamentally future 
gain gatrell generation geographic goal guilford have herring huang icdm icml ieee implemented important improvements inclusion information inherent interactive interest interesting international intl investigating jarke kaufmann knowledge koperski lake language large learning lecture lesser limitations location longman machine making management mentioned mine mining morgan munro must natural naughton necessary need negative notes numeric optimized other pages patterns peuquet piatetsky positive presentation press proc proceedings prove purely rastogi references relationships report represent representation representations results rules school science scientific several shapiro shekhar shim should sigmod significance simple space spatial spatiotemporal springer srikant step strengths strong such summary support sydney symp symposium system systems technical technologies that there third this those threshold time tour transactional transactions treated types university valley verlag very vldb volume volumes when where with without work would xiong zaniolo zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780637abs.htm 99 Impact Studies and Sensitivity Analysis in Medical Data Mining with ROC-based Genetic Learning Michèle Sebag Jérôme Azé Noel Lucas PCRI, CNRS UMR 86-23, Université Paris-Sud Orsay, 91405 France Michele.Sebag, Jerome.Aze, Noel.Lucas @lri.fr accuracy acknowledgment against algorithm algorithms allowing also analysis angeline annals application arcing area argued artificial atherosclerosis attributes baeck bartlett based been better boosting boughton bradley breast breiman canadian cancer card case challenge chatellier class classification classifiers classifying collobert colombet compact comparing complex concerned conclusion conf conference construction cost curve data databases deci discovery domingos ecmlpkdd editors effectiveness estimation evaluation evolution evolutionary explanation exploitations exploited extending
fawcett feature ferri figure flach fogel free freund from further general generalisation genetic gueyffier hand harvest hernandez http hunag hypotheses hypothesis icdm identification ieee imaging impact information inspection instance interactions international jaulent kaufmann kernel knowledge kohavi languages learner learning liardet limited linear ling lisp lucas machine mackinlay making margin maria measure medical meta method methods mining models moment more morgan multiple neural orallo oxford pages paper parallel pattern perspectives pluridisciplinary porto practice precise presents press problems proc proceedings provides provost readable recognition references representations research resources risk roger sebag selection sensitive sensitivity shapire shneiderman simple sion some springer statistical statistics strong studies such temeckova than thank thanks that their theory think third this till trans trees under university using valuable vapnik verlag vision visual visualization voting warmly wasson weights wiley will with york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780035abs.htm 4 Frequent Sub-Structure-Based Approaches for Classifying Chemical Compounds Mukund Deshpande, Michihiro Kuramochi and George Karypis University of Minnesota, Department of Computer Science/Army HPC Research Center Minneapolis, MN 55455 aaai accurate acids activity advances after against aided aids akihiro algorithm algorithms american amount analyzing andrew antiviral application applications approach approaches apriori artificial ashwin association atomic atoms attribute attributes automated automatic available average based berthold besides better bing bioinformatics biological bond borgelt brockhausen care case cases challenge chapter chemical chemist chemists chloromycetin christian cikm class classification classifier classifying cliffs clustering cluto cmar coefficients column combining comp comparable compare comparing comparisons complete 
compounds computer computing concept conclusions conf conference conjunction connectivities constants construction cook coordinates correlation covering critical data databases dataset datasets decision dept derivatives derived deshpande design despite discovering discovery djoko domain drug effective efficient either eleventh engineering englewood environments european evaluation even existing experimental fact fawcett feature features fifteenth finally find finding fragments france frequent from fujita gasteiger generation geometric george gonzalez graham graph graw grid growth gspan hall hammett hand hansch have helma highly hill hiroshi holder http hyperplane icdm ieee ijcai imprecise improved inductive information inokuchi integrating intelligence intelligent intensive internation international january jian jiawei joachims john joint journal july karypis kaufmann kernel king knowledge kramer kuramochi large last leach leads learning less logic lyon machine making maloney maolney mateo method methods michael michihiro mining minnesota mitchell mitpress modeling models molecular molecules monitoring moreover morgan morik most motoda muggleton muir mukund multiple mutagenecity nature number obtained organic pages paper particular partition pattern performance phenoxyacetic pkdd plant pnas potential potentially practical practice predict predictions predictive prentice presented press principles problem proc proceedings produced programming programs project provides provost puter qsar quantitative quinlan raedt references regulators relationships relevant report required requires respectively result results reviews richards robust ross rudolph rule rules sadowski scale scheme science screen screening screensaver selection september sequential showed shown sigkdd society srinivasan sriniviasan statistical stephen sternberg still streich structural structure structures strucutre study subdue subduecl subgraph subgraphs substantially substituent substructure 
substructures support system systems table takashi technical tetrahedron than that their then theory third this those time toolkit topological toxicology transactions understand university used using valuable values vapnik vector virtual wang washio wenmin which wiley with without workshop wynne xifeng yiming york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780435abs.htm 54 Cost-Sensitive Learning by Cost-Proportionate Example Weighting Bianca Zadrozny, John Langford , Naoki Abe Mathematical Sciences Department IBM T. J. Watson Research Center Yorktown Heights, NY 10598 adacost additional advances agnostic anifantis application applied archive artificial association bagging bayesian because boosting both bound bureau california chan class classes classification classifiers communications computation computer conference connection cost costs costsensitive criteria data decision decisions diego digits direct direction discovery distribution dmef dmefdset domingos drummond efficient efficiently ehrenfeucht elkan error especially estimating estimation european examples expected exploiting foundations freund from future general generalization haussler hettich holte http icdm ieee implies important information insensitive intelligence international irvine joachims joint journal kaufmann kearns kernel knowledge large learnable learning library line lower machine making margineantu marketing mateo mathematics metacost method methods minimal mining misclassification morgan multiclass naive national needed neumann noise number open performance practical press probabilities probability problem proceedings programs property queries quinlan random rate reduction references relatively report representation respect scale schapire sciences sellie sensitive sensitivity series shtml splitting standards statistical still stolfo support system technical techniques theoretic theory there these third thirteenth this tolerant toward tree unexplored university unknown 
used using valiant various vector weights when with work york young zadrozny zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780657abs.htm 104 Enhancing Techniques for Efficient Topic Hierarchy Integration agrawal algorithms annotation annotations applications approach approaches apte august automated automatic available based bayes between boostexter boosting boyapati building callan cascading case catalogs categorization chakrabarti cheng chinese chung class classes classification classifier classifiers comprehensive computational conference content csie damerau data databases decision dept design discovery ecml enhancing evaluation experiment extend external feature features general generation going hierarchical hierarchies hierarchy hill hong icdm icml ieee ignore improve improving information integrating international into joachims journal kong language large learning lewis liew linear linguistics machine machines many master mccallum mcgraw methods mining mitchell naive national news notice organizing papka perceptron performance portals proach probabilistic proc proceedings processing provided proximity raghavan rakesh ramakrishnan references relevant report research restriction retrieval rosenfeld rules scalable schapire selection shows shrinkage sigir signature significantly singer srikant statistical study such support susan system taxonomies technical text that thesis third topic towards traditional training trees tsay university usability using vector viewed vldb wang weiss where wide with world yang zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780605abs.htm 91 An Algorithm for the Exact Computation of the Centroid of Higher Dimensional Polyhedra and its Application to Kernel Machines Frederic Maire Smart Devices Laboratory, School of SEDC, IT Faculty, Queensland University of Technology, 2 George Street, GPO Box 2434, Brisbane Q 4001, Australia. 
advances algorithm algorithmss analysis analytical applications based bayes bayesian billiard cambridge campbell classification comput conference convex data england estimating estimator europhys expression generalization graepel grant haussler herbrich holloway icdm ieee ijcai information international introduction journal kernel kernels large lasserre learning lett london machine machines massachusetts mika mining muller network neural opper optimal optimization pages partially perceptron performance phys playing point polyhedron press proc proceedings processing ratch references report research royal rujn scale scholkopf scholkopft shawe smola space support supported system taylor tech technical theory third this trans tsuda univ vector version volume watkin williamson with work workshop http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780661abs.htm 105 Pattern Discovery based on Rule Induction and Taxonomy Generation Shusaku Tsumoto and Shoji Hirano Department of Medical Informatics, Shimane University, School of Medicine, Enya-cho Izumo City, Shimane 693-8501 Japan tsumoto@computer.org, hirano@ieee.org above academic acknowledgments acquisition active advances algorithm alto analysis areas automated based basis between boca busse chapman characteristics clinical closely cluster compare compared conclusion conference coverage culture data databases datasets decision dempster dietterich difference diseases dordrecht edition editor editors education empirical equal everitt evidence examined example expert experts extraction fedrizzi flood focus focusing from future generates grant grouping grzymala hall hierarchical icdm ieee implementation important induction information intelligent international japan john kacprzyk kaufmann kluwer knowledge large larger learning london machine maximum measure mechanisms medical method methods minimum mining ministry model morgan multidimensional other pages palo paper pawlak preliminary priority proceedings 
programs propose proposed publishers quinlan raton readings real realize reasonable reasoning references research results role rough rule rules scaling science sciences scientific selected sets shafer shavlik shows skowron sons sports study suggest supported system technology than that then theory third this thus tsumoto using value values very whose wiley will with work world yager york http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780689abs.htm 112 Frequent-Pattern based Iterative Projected Clustering Man Lung Yiu and Nikos Mamoulis Department of Computer Science and Information Systems University of Hong Kong Pokfulam Road, Hong Kong mlyiu2,nikos @csis.hku.hk additional agarwal aggarwal agrawal algorithm algorithms applications appropriate assigning association attributes automatic beyer blake candidate carlo categorical close cluster clustering clusters comparing conclusions conditions conference consider considered data database databases datasets definition devise dimensional discovered distance each effective effectiveness efficiency efficient else evaluated extended faloutsos fast fastmap figure finding first frequent further future gehrke generalized generation goldstein guha gunopulos heuristics high hope html icde icdm icdt identifed ieee improved improving indexing ineclus international jones large learning machine meaningful measures meit merging merz mineclus mining mlearn mlrepository monte more multimedia murali nearest neighbor other outliers paper park patterns points presented proceedings proclus procopiuc projected projective pruning quality raghavan ramakrishnan rastogi real reduce references repository results robust rock rules scalability scalable search shaft shim shown sigmod significantly similar size small some space spaces srikant subspace subspaces synthetic that third this traditional under using various visualization vldb when with without wolf http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780585abs.htm 
86 Ensembles of Cascading Trees Jinyan Li Huiqing Liu Institute for Infocomm Research 21 Heng Mui Keng Terrace, Singapore 119613 jinyan,huiqing @i2r.a-star.edu.sg acknowledgment acute algorithm algorithms appear bagging bari bauer behm bioinformatics boosting breiman campana cancer cascading cell cheng classification classifying comments comparison concept conference constructing data decision diagnosis diagnostic dietterich discovery downing draft editor emerging empirical ensembles evans experimental experiments expression features forest freund from gene giving good groups http icdm identifying ieee indicates international italy jinyan july kaufmann kindly kiong kohavi learning leukemia limsoon lower lymphoblastic machine mahfouz mateo matter methods mining more morgan naeve necessary outcome pages paper patel patients patterns pediatric perfect play prediction predictors proceedings profiles profiling programs quinlan raimondi random randomization ranked references relling role ross rules saitta schapire sdmc shurtleff significant simple some star subtype subtypes than thank that third thirteenth this three trees underlying used useful using variants very voting were whether wilkins williams with wong yeoh zhou http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780641abs.htm 100 K-D Decision Tree: accelerated acceleration accelerations adding against albert algorithm algorithms almost analysis application applications applied approximate arya associative based bentley bhattacharya binary boosting changing classification classifier classifiers combination communications computer conclusions condensed condensing conference confirmed consistent constant cover cybernetics dasarathy data decision demonstrate design dimensions distributions done down drastic dummy editing enlarging exploitation faster following freund further future gates generalization graphs hart have height http icdm identification ieee ijcai incomparable incomparably information 
instance international isenhour journal kddt keep keeping kibler learning library line lowry machine makes margin means merging minimal mining modifying monica mount much multidimensional nature nearest neighbor neighbour netanyahu noise number optimal others paper partitioning pattern potential poulsen power proc proceedings produces properties propose proximal proximity real reduced reducing references ritter rule safe samples sanchez santa schapire sciences search searching section selective shown silverman slightly slows spatial speed splinger statistical still superior symposium synergy system systems tasks than that theoretic theory these third this three time tolerant tomek tools toussaint townsend training trans transactions tree trees used vapnik vast will woodruff works zone http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780525abs.htm 71 Fast PNN-based Clustering Using K-nearest Neighbor Graph Pasi Fränti, Olli Virmajoki and Ville Hautamäki Department of Computer Science, University of Joensuu, PB 111, FIN-80101 Joensuu, Finland. 
franti@cs.joensuu.fi, ovirma@cs.joensuu.fi, villeh@cs.joensuu.fi activity adjacency algorithm algorithms amer analysis applied arya assoc assp based bull buzo canagarajah chang class cluster clustering code codebook compresion conference constantinou cover data design detection distance efficient electronics engineering equitz exact fast function gray grouping hierarchical icdm ieee image imaging implementation international jose kaukoranta linde linkage maps mean memory method methods minimum mining mount nearest neighbor nevalainen november objective optical optimize ordered pairwise partial practical proceedings quantization quantizer references ross search september shen simple snowbird spanning speeding spie statist statistics third trees using utah vector virmajoki ward http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780123abs.htm 15 Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift Jeremy Z. Kolter and Marcus A. Maloof Department of Computer Science Georgetown University Washington, DC 20057-1232, USA {jzk, maloof}@cs.georgetown.edu aaai abound accuracy acknowledgments adaboost alamitos algorithm algorithms also although american anonymous anticipate appear application artificial association attributes authors avrim bagging based bauer bayes bayesian beneficial berkman beyond blake blum boosting bounded bounds breiman calendar california candidate changing classification classifiers clouse comments community comparison computation computer concept concepts conducted conference considered constructing contexts continuous cvfdt data databases decision department description determining dietterich different discovery distributed distributions does domain domingos drafts drift earlier efficient empirical encountered ensemble ensembles estimating evaluation examples experimental experiments experts explicitly fern francisco freund from general generation georgetown givan granger grant handling headden helpful hidden high 
hoeffding html http hulten hybrid hypotheses icdm ieee illinois incremental induction inequalities information instance institute intelligence international investigate investigations irrelevant irvine john joint journal kaufmann knowledge kohavi kubat langley large larson lead learning likely line linear littlestone machine maclin majority maloof manuscript mechanisms memory menlo merz methodology methods michalski mining mlearn mlrepository morgan naive nanb national networks neural noise online opitz opportunities pages park part partial paul periodically popular positive predictors presence present press probability problems proceedings processing program programs quickly quinlan random randomization references releasing removing report repository research respective restructure restructuring results reviewers robust rule scalable scale scaling schapire scheduling schlimmer science sciences selecting site speed spencer standards statistical stolfo streaming streams street study such sums support supported systems tain target technical technology thank that their these third this those three threshold timechanging tracking tree trees uiucdcs uncertainty under undergraduate underlying university urbana utgoff variables variants vfdt voting warmuth weight weighted weightedmajority when widmer will william winnow with work would yielded york zhang http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780561abs.htm 80 Facilitating Fuzzy Association Rules Mining by Using Multi-Objective Genetic Algorithms for Automated Clustering achieved adaptation addition adjusts advantages agrawal algorithm algorithms alhajj analysis application approach appropriate arslan artificial association attribute attributes automated automatically autonomous base based cambridge case chan changing cikm clustering clusters comparative conclusions conference cure data database databases demonstrated desired determined dexa direction duration each edition effective efficient 
evolutionary experiments figure finally find first found from functions fuzzy given guha holland hong icde icdm ictai ieee implemented important information intelligence intelligent interesting international interval itemsets kaya large lent lisi membership method methods miller minimum mining more multi natural number objective obtain obtained optimizes optimum over paper pareto polat possible press proc proceedings proposed provide quantitative rastogi references relational required rlatot rules runtime second sets shim sigmod solutions spatial srikant strength structure study summaries summary support swami systems tables that thiele third this through thus timnu together total transactions tuning using value values vldb which widom yager yang zhang zitzler http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780653abs.htm 103 Model Stability: A key factor in determining whether an algorithm produces an optimal model from a matching distribution Kai Ming Ting and Regina Jing Ying Quek KaiMing.Ting@infotech.monash.edu.au Gippsland School of Computing and Information Technology Monash University, Victoria 3842, Australia. 
acknowledgements algorithm bayesian because blake california class classification classifier classifiers computer conduct conference conventional cost costsensitive criteria criterion curves data databases decision degree degrees demonstrate dept depth determining difference different distribution diverge divergence draft drummond effect empirical engineering environments evaluation experiments exploiting factor fawcett found francisco from generally geoff gscit have helps holte html http ieee imprecise increases induce influence initial instance international irvine issues kaufmann kmting knowledge learning lesser level levelled likely machine many mark matching maximum merz method micallef mlearn mlrepository model models monash more morgan much naive optimal particular performance place proceedings produces program proportional provides provost pruned pruning quek quinlan reduces reduction references repository result robust ross rutgers sensitivity shows significance size some space splitting stability study suggestions takes than that this ting transactions tree trees university unpruned unstable using valuable varying webb weighting weiss whether wisdom with http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780347abs.htm 43 TSP: Mining Top-K Closed Sequential Patterns Petre Tzvetkov Xifeng Yan Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign, Illinois, U.S.A. 
tzvetkov, xyan, hanj @cs.uiuc.edu accurate acmsigmod additional adopts afshar agrawal algorithm algorithms allows applying april arlington attributes australia avignon ayres based because bitmaps canada cases categorical challenging charm chen closed closegraph closet clospan clustering competitive complete conclude conclusions conf conference constraint correct covers currently dallas data database databases datasets dayal definition deliver delivers derives developed develops discovering discovery distinct dmkd during dynamically early edbt edmonton efficient efficiently engineering ensures episodes even experimental extending fast features flannick following france fransisco frequent further gehrke generalizations germany graph growth guha have heidelberg hsiao huge icde icdm ieee improvements including incorporating information intends international itemset itemsets japan july knowledge large learning length less like lose machine maebashi mannila many mines minimum mining montreal more mortazavi most much multi novel number ones only optimization outperforms pages paper pass pattern patterns performance performs pinto practical preferable prefix prefixspan present problem proc proceedings process projected proposed provides prune raising rastogi recently reduce references related results robust rock search sequence sequences sequential setting several shim shows siam sigkdd solution space spade spam srikant strategy studied study support sydney taipei taiwan techniques technology termination tern than that then these third this threshold through toivonen traditional traversal tzvetkov used using verification verkamo wang washington were when which while with without work workshop zaki http://csdl.computer.org/comp/proceedings/icdm/2003/1978/00/19780379abs.htm 47 Efficient Data Mining for Maximal Frequent Subtrees acknowledgement agrawal algorithm algorithms allow ancestor applications april apriori arikawa arimura arlington asai association based canada 
candidate chen chile city code computer conference cong data database databases december descendent discovering discovery dunham edmonton efficient efficiently embedded engineering european fast forest frequent from generation giugno graph gspan hierarchical icdm ieee inokuchi international japan june karypis kawasoe knowledge kuramochi large like madison maebashi mining mohammed motoda pages park path pattern patterns press principles proceedings prof program quent references relationship rules santiago satamoto searching semi sending sequential shasha siam sigact sigart sigkdd sigmod society source srikant structured subgraph substructure substructures subtrees symposium systems taipei taiwan thank third transactions traversal tree trees twentieth very wang washio which wisconsin without would xiao zaki