http://oregonstate.edu/conferences/icml2007/paperlist.html  ICML 2007

http://www.machinelearning.org/proceedings/icml2007/papers/243.pdf  102  A Dependence Maximization View of Clustering
http://www.machinelearning.org/proceedings/icml2007/papers/449.pdf  4  Scalable Training of L1-Regularized Log-Linear Models
http://www.machinelearning.org/proceedings/icml2007/papers/224.pdf  134  Map Building without Localization by Dimensionality Reduction Techniques
http://www.machinelearning.org/proceedings/icml2007/papers/546.pdf  122  Multifactor Gaussian Process Models for Style-Content Separation
http://www.machinelearning.org/proceedings/icml2007/papers/60.pdf  119  A Kernel Path Algorithm for Support Vector Machines
http://www.machinelearning.org/proceedings/icml2007/papers/222.pdf  30  Non-Isometric Manifold Learning: Analysis and an Algorithm
http://www.machinelearning.org/proceedings/icml2007/papers/75.pdf  124  On Learning with Dissimilarity Functions
http://www.machinelearning.org/proceedings/icml2007/papers/354.pdf  41  Efficient Inference with Cardinality-based Clique Potentials
http://www.machinelearning.org/proceedings/icml2007/papers/553.pdf  20  Learning to Compress Images and Videos
http://www.machinelearning.org/proceedings/icml2007/papers/558.pdf 146 On the Relation Between Multi-Instance Learning and Semi-Sup ervised Learning aaai aistats amar analysis andrews applications applied approach artificial auer australia axisparallel bags banff barbados based belkin bennett between blake bled blum bonn boosting both bottou bound branch bray brown cambridge canada categorization cavtat cavtatdubrovnik chapelle chen cheung chevaleyre class classification classifier collobert colt combining comparison computer convexity craven croatia csurka czech dance data databases decision demiriz department detection dietterich directed dissertation doctoral documents dooly dubrovnik eccv ecml edinburgh embedded empirical enhancing ensembles ervised evaluation examples experts fields framework francisco frank freiburg from functions fung gaussian generalized germany ghahramani goldman graphs harmonic hofmann hotel html http icml ieee ijcai image improved inference instance instances institute intel italy ject jiang joachims journal keerthi keogh kernel kernels keypoints kwok labeled labelled lafferty large lathrop lazy learners learning level ligence lincoln literature lkopf logistic lozano machine machines madison mangasarian maron matveeva mccallum merz method methods miles miller mining missing missl mitchell mixture mlearn mlrepository multi multiinstance multiple multipleinstance nashville natural nebraska nigam nips niyogi page pakdd pattern pfahringer pittsburgh platt prague press problem rahmani ratan real reasoning rectangles references regions regression regularization relation report repository research ruffo rules savannah scalability scene scholkopf schuurmans science sciences scotland scott security selection semi semio semisupervised sindhwani single sinz slovenia smola solving statistical supervised support survey sydney technical technique text theoretical thrun torino trading training trans transductive trees tsochantaridis turin university unlabeled unlabelled unsupervised using uyar valued variables vector versus viola vishwanathan vision visual wang washington weidmann weston williamston williamstown wisconsin with workshop zhang zhou zien zucker http://www.machinelearning.org/proceedings/icml2007/papers/589.pdf 104 Sparse Eigen Metho ds by D.C. 
Programming alon analysis applied aspremont barkai biology broad cadima cancer clustering colon components correlations direct expression formulation gene ghaoui gish interpretation jolliffe jordan lanckriet levine loadings mack normal notterman patterns principal references revealed sparse statistics tissues tumor ybarra http://www.machinelearning.org/proceedings/icml2007/papers/489.pdf 9 Structural Alignment based Kernels for Protein Structure Classification accuracies acids alignment athena bank berman bertsimas bhat bhattacharya bhattacharyya bioinformatics bourne chandra classes classfn classification combinatorial comparison contd dali data different engineering extension fast feng from full gilliland hcka incremental introduction jections kernels length linear negative nucleic optimal optimization pairwise path positive programs prot protein references research results retrieval rmsd robust scientific scop shindyalov spectral structure structures suppl table tsitsiklis various weissig westbrook http://www.machinelearning.org/proceedings/icml2007/papers/458.pdf 37 Bayesian Actor-Critic Algorithms advantage anderson baird barto control cybernetics difficult elements ieee laboratory learning like neuron problems references report solve sutton systems technical that trans updating wright http://www.machinelearning.org/proceedings/icml2007/papers/495.pdf 28 Percentile Optimization in Uncertain Markov Decision Processes with Application to Efficient Exploration applied approximation bound brafman cannot case convex effectively general information lower mathematics nemirovski nominal operations optimization problem references research reward robust tennenholtz uncertainty value with http://www.machinelearning.org/proceedings/icml2007/papers/162.pdf 109 On the Role of Tracking in Stationary Environments afforded algorithm american arbitrary arise become becomes black caruana causing choice coherence coherent conference convergence david davies different domains effect emphasized enzenberger evaluation exactly example exclusive exploiting focus from have having idbd implications important improved inductive journal justify large later learning machine meta methodological multiple nagging near negligible neural nips observations opportunities overlooked parts performance problem proceedings provide references relatedness research resolving retrospective review route schuller sequence showed single solved stationary stepsize study substantially task tasks temporal temporally that then theory these this tracking transfer using version which while white without workshop world years http://www.machinelearning.org/proceedings/icml2007/papers/158.pdf 8 Learning Distance Function by Co ding Similarity adaptation alignment based basu belongie bennett bilenko blake bmvc boosting chang clustering constraints context cover cristianini databases distance elements elissee equivalence from gacs hertz hillel icml ieee image information integrating ject jmlr john kandola kernel learning machine mahalanobis malik matching merz metric mooney nips pami performance proc puzicha recognition references repository retrieval semi shape shawe shental sons stepwise supervised target taylor theory thomas trans using vitanyi weinshall wiley yeung zurek http://www.machinelearning.org/proceedings/icml2007/papers/575.pdf 145 Spectral Clustering and Transductive Learning with Multiple Views advances american ando annual applications argyriou becchetti belkin blum boldi cambridge castillo cbms chapelle chung cluster 
collection combining complexity computational conference data donato forum graph graphs herbster information kernels labeled laplacian laplacians large learning mathematical mathematics matveeva mitchel neural niyogi pechyony pontil press proc processing providence rademacher reference references regional regression regularization santini scholkopf semi series sigir society spam spectral stable supervised systems theory training transductive unlabeled vigna weston with workshop yaniv zhang http://www.machinelearning.org/proceedings/icml2007/papers/205.pdf 56 Nonmyopic Active Learning of Gaussian Pro cesses: An Exploration­Exploitation Approach active addition address agnostic agric algorithm algorithms ambrose analyses analysis applied approach balcan bayesian believe better between beygelzimer biol biomet borgwardt bounds california caselton castro cens characterization combinatorial complexity conclusions condition covariance cover cressie data decreasing deployable deployment design designs dissertation doctoral dunsmuir ecological electronic elements entropy estimated estimation evaluated exact exploitation exploiting exploration extended faster fisher friedman from gaussian gilbert graepel gramacy gretton growth guestrin handle harmon here high hydraulic hyperkernels hypothesis icann icml ijcnn improvement including infomechanical information insight insights interscience into jmlr journal kaiser kernel koller krause langford learning lett local locations logarithmic machine matrices maximum method methods mixtures model models monitoring motivating much mutual natural near network networked nims nips nonmyopic nonstationary nott nowak obermayer observation optimal optimizing paciorek paper parameter parameters perform phase phases placements point potential prediction preprint presented press priori prob probabilistic problems process processes proposed proved provide provided quality queyranne rapid rasch rasmussen rates real realworld references regression rejection report resolution results river sample sampleproblem sampling schlkopf selection sensor sequential several shewry show significant singh smola space spatial statist statistics stealey stein storkey strategies strategy structure structured submodularity switch systems techn technical test testing than that theoretical theory this thomas toeplitz treed tresp truncated uncertainty under university unknown used using variational wallat water when wiley willett williams williamson with wynn zidek http://www.machinelearning.org/proceedings/icml2007/papers/596.pdf 54 Statistical Predicate Invention aaai alchemy alyawarra annotated anthropology approach artificial attributes austria austrian autocorrelation bain based behavior berkeley better biomedical bottleneck buntine burnside california change classes classification clich cluster clustering cognitive combinatorial comparative comprehensive computer concept concepts connections constraining constructive costa craven data davis default denham department dependencies detection deterministic dhillon dimensionality dirichlet discovering dissertation distinguishing doctoral domain domingos during dyads efficiency efficient elidan engineering enhanced exceptions fields file first friedman from functional genomics getoor griffiths group hidden hierarchies hierarchy http hypertext icdm icml icpsr ijcai improving induction inference infinite information institute intel intelligence intermediate introduction invention inverting ject jensen journal kaufmann kemp koller kramer kriegel 
langley latent learning level leveraging ligent logic long lotner machine mallela mansinghka markov mccallum mccray mitchell model models modha monotonic morgan muggleton multi multiple nation nations networks neville nips noise nonverbal ontology order osherson other overlapping page patterns pazzani pearl pitman plausible playing poon popescul predicate predicates press probabilistic probability proc processes reasoning references relational report representation research resolution richardson roles rosenfeld rummel science seattle shrinkage silverstein singla slattery smith social sound spectral srinivasan statistical statistics stern stob stochastic structure struyf sumner system systems taskar technical tenenbaum text theoretic tresp type ueda ungar university upper variable variables vienna view washington wilkie with wogulis wolfe workshop yamada zhang http://www.machinelearning.org/proceedings/icml2007/papers/303.pdf 10 Discriminative Learning for Differing Training and Test Distributions abbreviations according advances algorithm also based beta bias biased bickel binary block brevity case characterizes cholesky chosen classifiers compute condition conditionals contained convex convexity corner corollary correcting criterion crout decomposition define defined definite density derivatives diag diagonal differing dirichlet discriminative distributions dudik elements enhanced entropy equation equivalent estimate estimation examples exists exponential expressed filtering following found global gradient hence hessian holds indicates information instances kernelized known labels large larger ldiag leads learning likelihood likely linear local logistic matrix maximum means more negative neural newton norm notational number omit only optimization optimum paramen phillips pockets points positive posterior prior problem processing proves references rephrase replaced resolving respect rewrite rule sample samples satisfy schapire scheffer sdim second selection semi separators space spam such sufficiently symmetric systems test that theorem there this those thus touch training turn update version well which with within http://www.machinelearning.org/proceedings/icml2007/papers/312.pdf 129 Learning to Combine Distances for Complex Representations adaptive analysis applying argyriou artificial attwood avriel axisparallel basic bertinoro bradley cambridge classes classification classifying collapsing colt combinations component continuously convex data dietterich diterpene domeniconi dover dzeroski editorial elucidation fingerprints from globerson goldberger gunopulos heidtke hilario hinton inductive instance intelligence issue italy kernels kremer lathrop lavrac lazy learning logic lozano machines methods metric micchelli mining mitchell multiple nearest neighbor neighbourhood nips nonlinear parameterized perez pisa pkdd platt pontil press problem programming protein publications rectangles references relational review roweis salakhutdinov scholkopf schulze siems solving special spectra springer structure support using vector verlag weiss wettschereck with workshop york http://www.machinelearning.org/proceedings/icml2007/papers/461.pdf 81 Fast and Effective Kernels for Relational Learning from Texts above accuracy accurate advances algorithm algorithms alicante allows alone also anal answer answering answers aone applications apply approach arbor australia automatic average beam berlin best between biased bikel bmvc boughorbel case challenge challenges charniak clef coling collins communications 
computation computations conclusions confidence constituent contain convolution corley could croft cross cumby dagan database datasets demonstrated dependency derived design deteriorates devoted different discrete documents does dolan domains duffy dynamic ecml efficient empirical empirically employed england english entailment entailments entropy equivalence evaluation exercise experiment extraction family fast feature ferro finally fleuret fold from fully functions future germany getoor giampiccolo glickman haasdonk haim have here highly icml ieee indefinite information innovative inspired instances intell interpretation issue italy joachims journal kernel kernels kmax language large latter learning learns less lexical like limited limits london lower mach machine magnini making matrix maximum measuring mercer methods michigan mihalcea miller mixing modeling morristown moschitti naacl name natural network note notes number object obtain other over overcome overview pair paper parser parsing pascal pattern penas perceptron performs ponte practical presented press problem proc proceedings promising proposed provided question ranking recognition reduces references relation relational repeated reported research resources results retrieval reveal richardella rodrigo role roth sama samples scale schwartz search seattle semantic show sigir significant similarities similarity single slightly some southampton space spain special split state statistical statistically structures study suggests support svms sydney syntactic szpektor table tagging tarel testing text texts textual than that their them they this train trans trec trees tutorial using validation vector venice verdejo very voorhees voted washington weischedel what which with wordnet working workshop would york zanzotto zelenko http://www.machinelearning.org/proceedings/icml2007/papers/472.pdf 38 Exponentiated Gradient Algorithms for Log-Linear Structured Prediction above accelerated accuracy acoustical adaboost addressed advances aistats algebra algorithm algorithms also america annual applied approach argument arguments assigns assuming baker bartlett because beck below between boosting bounded bregman buchholz challenging choice classification clearly collins compact comparison components computation computed condition conditional conditions conditonal conll consists constant constants continuity convergent converges convex convexity correct corresponds cover crammer current data dataset define defining definition denote dependencies dependency dependent descent described difference directed distances distributions divergence dual duality duan edges efficiently elements endency entire equivalently experiment exponential exponentiated express fast feature fenchel fields fixed following follows form from function games given goal gradient grammars guestrin hastie haussler have holds icml identical implies import information inside interior into jaakkola jected jective keerthi kernel kivinen koller labeling lafferty language large learning leave lebanon lemma letters likelihood limit linear logistic machine machines mapping margin marginals markov marsi maximum mcallester mccallum mcdonald meeting memisevic methods minimal minka mirror model models monotone multilingual multiple murphy must naacl natural necessarily need networks next nips nonlinear normalized notation numerical obtain online operations optimal optimization optimizers optimum outside over pair pairs paper parse parsers parsing part parts pereira platt point positive 
predicted prediction predictors press probabilistic probability problem proc proof proved random rate reapplying recognition references regression repeated report reports required research result satisfies schapire schmidt schraudolph score seen segmenting sentences sequence sequential series setting shalev shallow shared shevade show shwartz similar simplified since singer slovene society some space spanning speech starting states stochastic strict structured structures subgradient subsequence such support task taskar teboulle technical that then theory there therefore this thomas those thus together tokens toronto trainable training tree trees unique univ update updates using value values variant vector versus vishwanathan warmuth weight where which wiley with words written yields zero http://www.machinelearning.org/proceedings/icml2007/papers/378.pdf 138 Robust Multi-Task Learning with t-Pro cesses acknowledgement analysis ando applications argyriou authors bakker bayesian bharat biometrika cambridge carlin caruana chapman clustering collaborative component data discussions distribution ecme estimation evgeniou extensions feature fernandez framework from gating gaussian gelman ghahramani graphical hall heskes hierarchical icml independent inference introduction jaakkola jmlr jordan kero kotz kriegel latent learning like lkopf machine methods modelling models multi multiple multitask multivariate nadara nels nips ordinal pitfalls pontil predictive press proceedings processes rasmussen references regression regularized related rubin saul schwaighofer sigkdd sinica smola statistica steel stern structures student task tasks tdistributions thank their tresp university unlabeled using valuable variational williams with would yang zhang zoubin http://www.machinelearning.org/proceedings/icml2007/papers/72.pdf 24 Boosting for Transfer Learning accuracy adaptae advances algorithm annals annual application artificial auxiliary bartlett based bias biased bickel boosting borgwardt boser brief carin caruana classification classifiers classify computational computer conference cori correcting covariate data daum david decisiontheoretic density dietterich dirichlet dissertation domain econometrica effectiveness enhanced entropy error estimation evaluating explanation exploiting fakultat fifth filtering first fourteenth freund function generalization gretton guyon heckman huang improving inductive inference informatik information intel international introduction joachims joint journal kaelbling kluwer kullback later learn learning leibler liao ligence likelihood line logistic machine machines marcu margin marx mathematical maximum methods mitchell more multiple multitask neural nips optimal phillips planning predictive proceedings processing recting references regression relatedness report research rosenstein sample samples schapire scheffer schmidhuber scholkopf schuller sciences second selec selection shift shimodaira sixteenth smola source sources spam specification statistical statistics strategies sufficiency support system systems task technical text theory thing thrun tion training transductive transfer twenty under unlabeled using vapnik vector voting weighting with workshop years zadrozny http://www.machinelearning.org/proceedings/icml2007/papers/605.pdf 55 Kernelizing PLS, Degrees of Freedom, and Efficient Mo del Selection academic additive advances algebra algorithm algorithms alternative american analysis applications applied approach association asymptotic bank best between braun buhmann 
building bureau calculation calibration cambridge cent chapman chemometrics christophersen classification communications computation computations computer conference conjugate connection covariance cross data degrees delve denoising descripion deviation differential dimension efficient efron eigenvalue error espen estimation estimator exploiting feature fewer fold frank freedom friedman geladi generalized gmdl golub gradients hall hansen hastie helland hilbert hoog hopkins http industrial information integral intel international intervals iteration jacobian jects johns journal kernel kernelizing laboratory lanczos latent learning least leastsquares lecture lemberge length ligent lindgren linear lingj loan london machine manne many mathematical mathematics matrix matthews method methods minimum model models muller multivariate national neural nipals nnar nonlinear notes observational operators overview part partial path penalties penlidis perspectives phatak prediction press principle problem proceedings processing proofs properties quantitative recursive reduction references regression relations reproducing research result results rilley rosipal scandinavian science selection serneels sets shrinkage simulation society sociology solution some space spaces spline springer squares standard standards statistical statistics structure subspace systems table techniques technometrics theory tibshirani tools toronto trejo twentieth univariate university using validation variables variance view wahba washington with within wold http://www.machinelearning.org/proceedings/icml2007/papers/140.pdf 57 On One Metho d of Non-Diagonal Regularization in Sparse Bayesian Learning ability advances algorithms application applied approach approximate arbitrary artificial assign august automatic bayesian besides bioinformatics bishop blake both boutilier calculation calculations cambridge canada cancer case cawley characteristic claim classical classification classifier classifiers coefficients comparing complementary complexity complicated computation computing conference consider constant constructing contribute convenient databases decomposition defined degrees demonstrated determination diag diagonal dietterich different dimensional direct directions directly effectively efficient eigenvalues eigenvectors erfc erfcx error estimation evidence example expectation expression fast faul features formulations found framework freedom function gene generalization generalized girolami goldszmidt here hessian hettich hoffmann ijcnn independently indicate indirectly individual inference information integral integrals intel international irrelevant jaakkola joint jordan july kaufmann kernel laplace large latter leads learn learning leen ligence likelihood linear logistic lrevm mach machine machines mackay marginal matrix maximisation means measure merz metho methods minka models montreal more morgan mueller multinomial natural neal negative networks neural newman ninth nonnegative number optimized optimizing parameter pattern platt positive present press prior priors proceedings processing product promising propagation provides pruning rather reasonable recognition references regression regularisation regularization regularized relevance relevant repository respect responsible revm scaled scholkopf secondary seems selection seventeenth simple solla some sparse springer statistical statistics such suggestion supervised suppose symmetric systems talbot tests than that then they this tipping trick true types uncertainty 
unite used using values variational vector very weights west where which whose williams with workshop write written yieldm york http://www.machinelearning.org/proceedings/icml2007/papers/242.pdf 121 Transductive Regression Piloted by Inter-Manifold Relations acknowledgement aistats algorithm algorithms alignment applications belkin best between blondel blum both brefeld carin chapelle class classes classification colt combining comput computer conclusions constructed contract cortes council cross cuhk data davidwilliams dedicated demonstrated described developed different dimensionality discuss dooren efficient eigenmaps embedding employed estimating eurocolt experiments extended extraction first framework from function functions further gartner generalized geometric global goldberg grants graph graphs hartemink have herbrich heymans hong icml inference information inter jardo ject kernel knowledge kong krishnapuram label labeled labels langford laplacian large learning least linear locally manifold manifolds matrix matveeva measure method mitchell mohri moreover multi nbchc neural nips niyogi nonlinear novel order over paper pilot predict preferences presented problem propagation proposed real reduction regression regularised regularization regularized relations representation representer research roweis sample saul scheffer schlkopf science sciences searching semi semisupervised senellart separately siam silva similarity sindhwani smola squares state superiority supervised supported synonym synthesized tenenbaum theorem this thus training transductive trick unlabeled utilize values vapnik vertices weinberger weston with without work world wrobel http://www.machinelearning.org/proceedings/icml2007/papers/549.pdf 108 Piecewise Pseudolikeliho o d for Efficient Training of Conditional Random Fields abbeel analysis annotation applications approximate artificial bayesian besag boltzmann bonn carnegie cascading complexity computation conditional conf conference consistency control data daum differential dissertation distributions doctoral domains emnlp empirical entropy errors estimation estimators extraction factor fields finkel first fleming francisco freitag fully germany gibbs gidas graphs hyvarinen icml inference informal information intel international kaufmann koller labeling lafferty language large lattice learning ligence likelihood linguistic lions machine machines manning marcu margin markov maximum mccallum mellon methods models morgan natural neural optimization pereira pipelines polynomial prediction probabilistic problem proc proceeding pseudolikelihood random references sample search segmentation segmenting sequence solving springer statistical statistician stochastic structured systems theory time twenty uncertainty university visible york http://www.machinelearning.org/proceedings/icml2007/papers/457.pdf 98 Graph Clustering with Network Structure Indices academy algorithms annual arithmetic biological brandes centrality clustering clusterings community complex computing conference coppersmith cuts data delling dhillon discovery eccs european experiments flake freeman gaertler generating girvan graph guan international internet kernel know kulis ledge mathematics matrix means minimum mining multiplication national networks newman nineteenth normalized proceedings progressions references sciences sigkdd significant social spectral structure symposium systems tarjan theory trees tsioutsiouliklis wagner winograd http://www.machinelearning.org/proceedings/icml2007/papers/169.pdf 
107 A Kernel-based Causal Learning Algorithm achieved algorithm algorithms analysis approach approximating arcs aronszajn asia bach baker based bayesian bell cancer cancers causal cheng chickering chow cigarette columns component connected convenience cooper corresponding covariance cross data denote dependence deppro detected different direction discrete distributions efficient entries faithfulness fine fraumeni from fully gates generated geographic given greedy greiner here herskovits identification ieee incompatibility inddet independent indpro induction information informationtheory institute intl joint jordan kelly kernel kernels learning mach machine math mcmc measures meek method model monotone mwst national network networks operators optimal percentages points probabilistic probability rank references representations reproducing sampled scheinberg search skeleton smoking stands states statistics structure structures table text theory times tract training trans trees true underlying united urinary using variations with within http://www.machinelearning.org/proceedings/icml2007/papers/148.pdf 97 More Efficiency in Multiple Kernel Learning argyriou convex evgeniou feature learning multi pontil references report task technical http://www.machinelearning.org/proceedings/icml2007/papers/403.pdf 126 What Is Decreased by the Max-sum Arc Consistency Algorithm? abstract agreement algorithm algorithms also analysis appear approach approaches approx approximation artificial austria based because beek berkeley bessiere bistarelli block cannot case center chapter chekuri codognet coefficient commutativity comparison computer conditions conf consider consistency consistent constraint constraints control convergent convex cooper coordinate corresponds csps cybernetics czech declarative decreased decreasing dept descent descriptions deville diffusion difications dimensional discrete distributive drawings dumka dynamic easy ecai elsevier encyclopedia energy enhancing equivalent estimation exponential facets false families fargier find flach formalisms formulation framework frameworks francisco from gadducci generalized generating generic georget giginjak givry global glushkov graphical group guaranteed handbook haralick hard hentenryck hoesel however hummel hyderabad hyper ieee ijcai india inference information institute integer intel interest intl italy jaakkola joint jordan kaufmann khanna kibernetika kiev kolen kolmogorov koster koval kovalevsky labeling lambrecht lecture letters lifting ligence ligent linear lirmm longer machine machines mackworth manipulation mashiny massachusetts maxsum mceliece meseguer message metric minima minimization minimizing minimum models montanari montreal morgan naor naukova need networks noisy nonsmooth notes numbers observed operations optima optimal optimization other partial passing patt pattern pearl perception plausible poor press probabilistic problem problems proc programming programs propagation properties rational reasoning recog recognition references relaxation remains report research review reviewed reweighted rosenfeld rossi russian satisfaction scene scenes schiex schlesinger science semantic semiring semirings sets setting shadows shapiro shekhovtsov siam signals similar simplifies since sistemy soft solvable solving some specializations springer statistics still structural structure subclasses symposium syntactic systems technical technology teng than theorems theory things this thus trans transformations tree trees under unified university 
unpublished upravlyayushchie used using ussr valuation value valued variational verdu verfaillie verlag very vision visual wainwright walsh waltz werner what wiley willsky winter with workshop york zosin zucker http://www.machinelearning.org/proceedings/icml2007/papers/559.pdf 61 Learning a Meta-Level Prior for Feature Relevance from Multiple Related Tasks about above adding advances adviser allows also analysis annotation applies apply approach approaches argument argyriou artificial automatic baxter bayes bayesian been believe benefits between both cambridge caruana case cases change chapman classes classification collaborative completely component computation conference constructing convex convexity convolution could covariances cover data decomposition defined different direction directions discussed discussion dissertation doctoral each easier easily efficient empirical enough ensemble estimation evgeniou extended factor factors feature features filtering finally fink first follow formulation framework from function future gaussian generalize generalized geoffrey ghahramani gildea goal grouped hacioglu hall have heskes hierarchy hinton human hypothesis improving independent inducing inductive information informative informed intelligence interclass interesting international interpolation issue jmlr jointly jordan jurafsky kaelbling kernel kernels kingsbury koller krugler labeling language lasso latent learn learning learns leveraging linear london losing mach machine mackay marcus marlin martin mccallum mccullagh mean meta metafeatures methods micchelli mitchell model modeling models moschitti multi multiclass multiple multitask neal nelder networks neural nips online only other ours over paired palmer paper parameter parsing penn perspective pontil practice pradhan predicting prediction press prior priors probabilistic problem proc proceeding proceedings processes processing propose proposed raina raises references regression related relationships relevance research roles rosenfeld royal same sampling schwaighofer seeger selection semantic semiparameteric sets several shalev shallow sharing shrinkage shwatz simple singer special specifically statist statistic statistics study support systems taken task taskar tasks technology test text than that their them theoretic thereby therefore thing this thrun tibshirani tranfer transfer treebank tresp ullman unseen using values variable variables variance variances vary vector viamultiple viewed ward weights when where which with without wong work workshop would yang yuan zhang http://www.machinelearning.org/proceedings/icml2007/papers/182.pdf 50 Neighbor Search with Global Geometry: A Minimax Message Passing Algorithm adaptation adaptive algorithm analysis application applied artificial belkin bengio between carroll chang classificaton classified cluster clustering collaborative computation computing conference data delalleau dimensionality dissimilarity domeniconi efficient eigenmaps factor fouss frey function garcke gower graph graphs grids griebel gunopulos icml ieee induction information intel international kschischang laplacian learning length ligence linear linkage links locally loeliger machine manifolds matrix metric minimax minimum nearest neighbor neural niyogi nodes novel parametric partial pattern peng pirotte proceedings product psychometrika recommendation reduction references representation riemannian ross roux saerens semi semisupervised similarities single spanning sparse statistics supervised theory training trans tree 
trees with workshop yeung http://www.machinelearning.org/proceedings/icml2007/papers/453.pdf 79 Mixtures of Hierarchical Topics with Pachinko Allo cation allocation author authors bayes beyond blei carlo chinese correlations diggle dirichlet distribution documents estimating gratton griffiths hierarchical icml implicit inference jordan journal latent learning machine mccallum methods minka mixture model modeling models monte nested nips nonparametric ofwords pachinko process references research restaurant rosen royal smyth society statistical steyvers structured tenenbaum topic wallach http://www.machinelearning.org/proceedings/icml2007/papers/425.pdf 80 Three New Graphical Mo dels for Statistical Language Mo delling aistats bengio ducharme importance jauvin journal language learning machine model nets neural probabilistic quick references research sampling training vincent http://www.machinelearning.org/proceedings/icml2007/papers/98.pdf 93 Reinforcement Learning by Reward-weighted Regression for Op erational Space Control academic actual advancing algorithms also andrieu applications applied approach approaches artificial atkeson automation avoid believe broader bullock cognitive comparative complex comput computation conclusion contributes control controller cory dayan delayed demonstrated different distal doucet dynamics efficient entirely envisioned equivalent estimation expectationmaximization experiments featherstone finding first flavor followed force formulation found framework freitas from grossberg guenther haruno here hinton hoboken idea identification ieee immediate implementations intel into introduced introduction iros jectories jordan journal kaebling kawato khatib kluwer learning ligence little littman machine manipulators mcmc method methodology methods mistry model modern moore mosaic motion motor much multijoint nakanishi neural neuroscience nonparameteric operational optimal optimization organizing outs paper particular peters practical problem problems publishers reaching real realizations realized realm redundancy references regression reinforcement research resolution resolving reward rewards rewardweighted robot robotic robotics robots roll rumelhart scalable schaal science search second self sensorimotor simulation simulations solutions some space spall state statistics stochastic success supervised survey system systems task teacher techniques temporally that there this time tool transformation type udwadia unified unifying using vijayakumar ways weighted which while wiley will with wolpert http://www.machinelearning.org/proceedings/icml2007/papers/388.pdf 90 Multi-armed Bandit Problems with Dep endent Arms active advances agrawal allocation applied armed bandit based bounds chang coarse complexity dasgupta dynamic four frostig gittins index indices journal learning mean multi multiarmed nips optimal policies probability problem processes proofs references regret royal sample series society statistical stopping theorem trust weiss with http://www.machinelearning.org/proceedings/icml2007/papers/392.pdf 17 Local Similarity Discriminant Analysis about achieve additional advances advantage algorithm algorithms analogy analysis annealing appear approach approaches asymmetric based because belongie berg better bicego blake both buhmann cambridge campenhout case categorical categories categorization category cazzanti centroid characteristics class classification classifier classifiers clustering cognition cognitive combined common communications compared competitively 
computation compute computer conceptual conditional conf consistently contexts cost costs could coulomb cover data databases denker described design deterministic devroye dimension discriminant discriminative discussion dissimilarity distance distances distinctive distortion duin dyadic earlbaum efficient electrostatic elements entropy error experiments feature features flexible friedlander friedman gati gdalyahu generalized generalizing generative goldfarb graepel gupta gyorfi hamamoto hastie have helpful herbrich hettich hillsdale hochreiter hoffmann however ieee image information intel intl intuitive issue jacobs jaynes ject john journal judgments kernel koppal large learning letters ligence lloyd local logic lower lugosi machine machines maire malik matching matrix maximum meanbased memorybased merz methods metric minimizing misclassification mitani mixed model models mozer multiclass murino natural nearest neighbor neighborhood neural newman nonmetric nonparametric number numerical obermayer other pacl pairwise particularly pattern pekalska pelillo perceptual performs practical press probabilistic probability problems proc processing produces progress proposed prototype provide proximity psvm psychological psychology puzicha rates rationale reasoning recognition references relational relative repository representation representative research retrieval review ridder rosch rules salzberg sample samples science shape show simard similarities similarity similaritybased small sons spaces special springer springerverlag stanfill statistical studies suited support symbolic symp systems test than that theory they thomas tibshirani torsello toward training trans transformation tversky underlying university using values vector verlag viewed vision visual waltz weighted weighting weinshall well when where which wiley with york zhang http://www.machinelearning.org/proceedings/icml2007/papers/404.pdf 26 Information-Theoretic Metric Learning algorithms applications basu bilenko censor clustering conf data discovery framework knowledge mining mooney optimization oxford parallel press probabilistic proc references semi sigkdd supervised theory university zenios http://www.machinelearning.org/proceedings/icml2007/papers/448.pdf 31 Hierarchical Maximum Entropy Density Estimation across adaptive agree among analysis analyze annual another appear appears application applied approaches artif assume average based baxter beal because benefit berkeley better between bias biodiv biodiversity biology blei both boyd bution cambridge caruana case change choice class classes classification clusters community compared comparison conclusion concrete conf conference cons consistent constraints constructing convex data density depth determined devroye diadromous difference differences different dirichlet distribution distributions distrii dramatic drielsma each east ecography elith empirical empty entirely entropy estimation estimator evaluation even every evoke experience experiments extended extension extremely ferrier fifteenth figure fish fixed formalism four fourth freshwater from gelman generalized gets graham group groups guarantees gyorfi hastie have hierarchical hierarchies hierarchy highly hill however hypothesis improve improvement improves improving inadmissibility individual inductive inequality influence information informative intell international james jaynes jordan justified kazama knowledge koller language larger learning leathwick level likelihood loosely loss lugosi machine main majority manion 
http://www.machinelearning.org/proceedings/icml2007/papers/255.pdf 133 The Matrix Stick-Breaking Process for Flexible Multi-Task Learning
http://www.machinelearning.org/proceedings/icml2007/papers/284.pdf 82 Dimensionality Reduction and Generalization
http://www.machinelearning.org/proceedings/icml2007/papers/407.pdf 99 Restricted Boltzmann Machines for Collaborative Filtering
http://www.machinelearning.org/proceedings/icml2007/papers/387.pdf 34 Combining Online and Offline Knowledge in UCT
http://www.machinelearning.org/proceedings/icml2007/papers/380.pdf 148 Transductive Support Vector Machines for Structured Variables
http://www.machinelearning.org/proceedings/icml2007/papers/369.pdf 141 Nonlinear Independent Component Analysis with Minimal Nonlinear Distortion
http://www.machinelearning.org/proceedings/icml2007/papers/282.pdf 115 Entire Regularization Paths for Graph Data
http://www.machinelearning.org/proceedings/icml2007/papers/16.pdf 15 Feature Selection in a Kernel Space
http://www.machinelearning.org/proceedings/icml2007/papers/517.pdf 116 Discriminative Gaussian Process Latent Variable Model for Classification
http://www.machinelearning.org/proceedings/icml2007/papers/581.pdf 69 Quadratically Gated Mixture of Experts for Incomplete Data Classification
http://www.machinelearning.org/proceedings/icml2007/papers/349.pdf 73 Adaptive Mesh Compression in 3D Computer Graphics using Multiscale Manifold Learning
http://www.machinelearning.org/proceedings/icml2007/papers/114.pdf 70 Trust Region Newton Methods for Large-Scale Logistic Regression
http://www.machinelearning.org/proceedings/icml2007/papers/62.pdf 117 Experimental Perspectives on Learning from Imbalanced Data
http://www.machinelearning.org/proceedings/icml2007/papers/215.pdf 147 Dynamic Hierarchical Markov Random Fields and their Application to Web Data Extraction
http://www.machinelearning.org/proceedings/icml2007/papers/467.pdf 118 Learning from Interpretations: A Rooted Kernel for Ordered Hypergraphs
http://www.machinelearning.org/proceedings/icml2007/papers/81.pdf 22 Full Regularization Path for Sparse Principal Component Analysis
http://www.machinelearning.org/proceedings/icml2007/papers/564.pdf 40 Recovering Temporally Rewiring Networks: A Model-based Approach
http://www.machinelearning.org/proceedings/icml2007/papers/518.pdf 0 Quantum Clustering Algorithms
http://www.machinelearning.org/proceedings/icml2007/papers/137.pdf 106 Robust Mixtures in the Presence of Measurement Errors
http://www.machinelearning.org/proceedings/icml2007/papers/398.pdf 21 Magnitude-Preserving Ranking Algorithms
http://www.machinelearning.org/proceedings/icml2007/papers/463.pdf 127 Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach
http://www.machinelearning.org/proceedings/icml2007/papers/497.pdf 125 Winnowing Subspaces
http://www.machinelearning.org/proceedings/icml2007/papers/229.pdf 2 Uncovering Shared Structures in Multiclass Classification
http://www.machinelearning.org/proceedings/icml2007/papers/601.pdf 68 A Permutation-Augmented Sampler for DP Mixture Models
http://www.machinelearning.org/proceedings/icml2007/papers/548.pdf 67 A Novel Orthogonal NMF-Based Belief Compression for POMDPs
http://www.machinelearning.org/proceedings/icml2007/papers/541.pdf 47 Bayesian Compressive Sensing and Projection Optimization
http://www.machinelearning.org/proceedings/icml2007/papers/401.pdf 112 Classifying Matrices with a Spectral Regularization
http://www.machinelearning.org/proceedings/icml2007/papers/408.pdf 60 Hierarchical Gaussian Process Latent Variable Models
http://www.machinelearning.org/proceedings/icml2007/papers/329.pdf 110 Cross-Domain Transfer for Reinforcement Learning
http://www.machinelearning.org/proceedings/icml2007/papers/532.pdf 140 Maximum Margin Clustering Made Practical
http://www.machinelearning.org/proceedings/icml2007/papers/280.pdf 51 A Recursive Method for Discriminative Mixture Learning
http://www.machinelearning.org/proceedings/icml2007/papers/372.pdf 149 Multiclass Multiple Kernel Learning
http://www.machinelearning.org/proceedings/icml2007/papers/211.pdf 7 Focused Crawling with Scalable Ordinal Regression Solvers
http://www.machinelearning.org/proceedings/icml2007/papers/160.pdf 19 Minimum Reference Set Based Feature Selection for Small Sample Classifications
http://www.machinelearning.org/proceedings/icml2007/papers/358.pdf 75 Automatic Shaping and Decomposition of Reward Functions
http://www.machinelearning.org/proceedings/icml2007/papers/154.pdf 3 Two-view Feature Generation Model for Semi-supervised Learning
http://www.machinelearning.org/proceedings/icml2007/papers/257.pdf 29 Unsupervised Prediction of Citation Influences
http://www.machinelearning.org/proceedings/icml2007/papers/219.pdf 64 A Transductive Framework of Distance Metric Learning by Spectral Dimensionality Reduction
http://www.machinelearning.org/proceedings/icml2007/papers/483.pdf 128 Beamforming using the Relevance Vector Machine
http://www.machinelearning.org/proceedings/icml2007/papers/168.pdf 131 On Learning Linear Ranking Functions for Beam Search
http://www.machinelearning.org/proceedings/icml2007/papers/278.pdf 53 Local Dependent Components
http://www.machinelearning.org/proceedings/icml2007/papers/237.pdf 23 Kernel Selection for Semi-Supervised Kernel Machines
http://www.machinelearning.org/proceedings/icml2007/papers/381.pdf 11 Solving MultiClass Support Vector Machines with LaRank
http://www.machinelearning.org/proceedings/icml2007/papers/87.pdf 136 Least Squares Linear Discriminant Analysis
http://www.machinelearning.org/proceedings/icml2007/papers/331.pdf 59 An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation
http://www.machinelearning.org/proceedings/icml2007/papers/515.pdf 95 Self-taught Learning: Transfer Learning from Unlabeled Data
http://www.machinelearning.org/proceedings/icml2007/papers/139.pdf 16 Learning to Rank: From Pairwise Approach to Listwise Approach
http://www.machinelearning.org/proceedings/icml2007/papers/537.pdf 45 Learning Nonparametric Kernel Matrices from Pairwise Constraints
http://www.machinelearning.org/proceedings/icml2007/papers/561.pdf 84 Revisiting Probabilistic Models for Clustering with Pair-wise Constraints
http://www.machinelearning.org/proceedings/icml2007/papers/455.pdf 66 Large-scale RLSC Learning Without Agony
http://www.machinelearning.org/proceedings/icml2007/papers/7.pdf 76 Asymmetric Boosting
http://www.machinelearning.org/proceedings/icml2007/papers/285.pdf 114 Simpler Core Vector Machines with Enclosing Balls
http://www.machinelearning.org/proceedings/icml2007/papers/326.pdf 49 Most Likely Heteroscedastic Gaussian Process Regression
http://www.machinelearning.org/proceedings/icml2007/papers/441.pdf 74 Simple, Robust, Scalable Semi-supervised Learning via Expectation Regularization
corduneanu cozman criterion data delalleau dellalleau dempster framework from harmonic incomplete information jaakkola jmlr label laird largescale learning likelihood lkopf maximum mixing multiple platt predictive press propagation quadratic references regularization risks roux royal rubin schlkopf scholkopf semi stat structures supervised tasks unlabeled with zhang zien http://www.machinelearning.org/proceedings/icml2007/papers/170.pdf 86 Multi-Task Learning for Sequential Data via iHMMs and the Nested Dirichlet Pro cess advances algorithm algorithms american analysis annals applications approximate aspect association aucouturier based bayesian beal between bharadwa blei breaking buzo carin carlin caruana case chapman chinese classification clip college communications computational computed conference constructive couchman data definition density design dirichlet discovering dissertation doctoral dunson escobar estimation expo ferguson figure function gatsby gaussian gelfang gelman ghahramani gibbs gray griffiths hall hidden hierarchical ieee ihmm ihmms index induction inference infinite information international ishwaran ismir iterations james jordan journal krishnapuram learning lewis liao linde logan london machine many markov matrix measures merging methods mixture mixtures model models multi multimedia multiple multitask music nested neural neuroscience nonparametric omohundro pachet posium priors problems proceedings process processes processing quantizer rabiner raftery rasmussen recognition references research respectively restaurant results retrieval rodriguez rubim runkle salomon same sampler sampling selected sequences sequential sethuraman signal similarity sinica some speech statistica statistical statistics stern stick stolcke structure submitted sullivan systems target task tasks tenenbaum thurn topic training trans transactions tutorial unit university using variational vector west whats with http://www.machinelearning.org/proceedings/icml2007/papers/562.pdf 1 Learning Random Walks to Rank No des in Graphs accuracy agarwal aggarwal albert algorithms anatomy annals approach artificial authority balmin barab based bertinoro bipartite bousquet brin burges burkard chakrabarti cheeger chung clickthrough colt combinatorics conference cortes cost cover data databases deeds deeper descent directed dissertation doctoral elements elisseeff engine engines entities faloutsos framework from functions generalization gradient graepel graph graphs hamilton herbrich high hristidis huang hullender hypertextual icdm icml inequality information inside international internet jectrank jeong joachims john journal kernels keyword kondor labeled langville laplacians large largescale laucius lazier learning lkopf machine margin mathematics matveeva metabolic meyer mining model models multiple nature nested networked networks neural nips niyogi nonsmooth norm obermayer oltvai optimizing ordinal organization page pagerank papakonstantinou personalized prediction push queries ragno rank ranker ranking recursive references regression regularization relational renshaw research retrieval rmat rudin scale scaling search seattle shaked sigir sigkdd smola sons stability stanford statistical structured support taskar theory thomas tombor toronto university unlabeled using vector vldb washington widom wiley with wong workshop zhan zhou http://www.machinelearning.org/proceedings/icml2007/papers/568.pdf 58 Online Kernel PCA with Entropic Matrix Up dates additive advances algorithms analysis annual appear 
application approach arora averaging bertinoro best boosting bounds bousquet boyd bregman cambridge colt combinatorial component computation computational computer computing conference convex decisiontheoretic dimension eigenvalue eurocolt european expert experts exponentiated freund generalization germany gradient herbster http information italy jections journal kale kernel kernels kivinen kuzmin learning leaving line linear lkopf logarithmic machine manfred march matrix minimization mixing muller multiplicative neural nips nonlinear nordkirchen online optimization past path pittsburg posteriors prediction predictions predictor press primaldual problem proc proceedings processing programs pubs randomized references regret research schapire sciences semidefinite small smola span springer symposium system systems takimoto that theory tracking tsch tsuda ucsc university updates vandenberghe variance version versus vishwanathan warmuth with http://www.machinelearning.org/proceedings/icml2007/papers/21.pdf 113 Approximate Maximum Margin Algorithms with Rules Controlled by the Numb er of Mistakes about accuracy advances algorithm algorithms also analysis appears approach approximate arbitrary asymptotic attributes available belonging brain cambridge chang chosen cjlin class classification classifiers classsification comparison competitive completely conclusions condition constant controlled convergence course cristianini csie cutting data desplayed diminish direct does duda ecml effective empirically entirely established establishing evidence experimental exploited extensions family fast faster fixing from fully generic gentile hands hart however http hyperplane implements independent information introduction involving joachims kernel large learning length library libsvm like linear long machine machines making margin maximal maximum meaningful methods micra microsoft minimal misclassification mistakes model moreover much need norm number online only optimal optimization organization parameter pattern perceptron perf plane platt possible powerful powers practical practitioner presented press probabilistic proved provided psychological rate recently references relatively relaxation relaxed remarkable report reported rescaled research results review rosenblatt rules scale scene secs sequential shawe simple since size skillful slow soft software sparsity springer statistical storage such sufficiently suggest support suppression svmlight svms table taylor technical than that their theoretical theory this time times tools training tsampouka university usefulness value values vapnik vector verlag very weight were which whole wiley with http://www.machinelearning.org/proceedings/icml2007/papers/291.pdf 132 Modeling Changing Dependency Structure in Multivariate Time Series american analysis andrieu annals appear appl application applications approach aspremont assoc association banerjee barry bayesian beal biol biometrika buhlmann carvalho century change changepoint class classification clustering components computing conf considerations convex covariance cowell cross curve cvpr dahl dawid decomposable dellaert denison dept determination dimensional distribution distributions doucet duration dynamic efficient estimation exact expert fearnhead financial fitting fitzgerald functional gaussian geiger gelman genet genomics ghahramani ghaoui giudici graphical graphs green hartigan heckerman hengartner hidden hierarchical high holmes hyper ieee implications inference infinite intl inverse journal lancaster 
http://www.machinelearning.org/proceedings/icml2007/papers/244.pdf 103 Supervised Feature Selection via Dependence Estimation
http://www.machinelearning.org/proceedings/icml2007/papers/336.pdf 137 Discriminant Kernel and Regularization Parameter Learning via Semidefinite Programming
http://www.machinelearning.org/proceedings/icml2007/papers/233.pdf 42 Sparse Probabilistic Classifiers
http://www.machinelearning.org/proceedings/icml2007/papers/192.pdf 139 On the Value of Pairwise Constraints in Classification and Consistency
http://www.machinelearning.org/proceedings/icml2007/papers/409.pdf 13 Multiple Instance Learning for Sparse Positive Bags
http://www.machinelearning.org/proceedings/icml2007/papers/29.pdf 5 Multiclass Core Vector Machine
http://www.machinelearning.org/proceedings/icml2007/papers/341.pdf 14 Cluster Analysis of Heterogeneous Rank Data
http://www.machinelearning.org/proceedings/icml2007/papers/173.pdf 72 Discriminant Analysis in Correlation Similarity Measure Space
http://www.machinelearning.org/proceedings/icml2007/papers/422.pdf 92 Analyzing Feature Generation for Value-Function Approximation
http://www.machinelearning.org/proceedings/icml2007/papers/89.pdf 143 Conditional Random Fields for Multi-agent Reinforcement Learning
http://www.machinelearning.org/proceedings/icml2007/papers/216.pdf 39 Best of Both: A Hybridized Centroid-Medoid Clustering Heuristic
http://www.machinelearning.org/proceedings/icml2007/papers/106.pdf 63 Support Cluster Machine
http://www.machinelearning.org/proceedings/icml2007/papers/379.pdf 27 An Integrated Approach to Feature Invention and Model Construction for Drug Activity Prediction
http://www.machinelearning.org/proceedings/icml2007/papers/371.pdf 77 Linear and Nonlinear Generative Probabilistic Class Models for Shape Contours
http://www.machinelearning.org/proceedings/icml2007/papers/426.pdf 91 Learning for Efficient Retrieval of Structured Data with Noisy Queries
http://www.machinelearning.org/proceedings/icml2007/papers/470.pdf 33 Manifold-Adaptive Dimension Estimation
http://www.machinelearning.org/proceedings/icml2007/papers/206.pdf 85 Comparisons of Sequence Labeling Algorithms and Extensions
http://www.machinelearning.org/proceedings/icml2007/papers/351.pdf 111 Incremental Bayesian Networks for Structure Prediction
http://www.machinelearning.org/proceedings/icml2007/papers/375.pdf 44 A Bound on the Label Complexity of Agnostic Active Learning
http://www.machinelearning.org/proceedings/icml2007/papers/491.pdf 87 Regression on Manifolds Using Kernel Dimension Reduction
http://www.machinelearning.org/proceedings/icml2007/papers/377.pdf 130 Local Learning Projections
http://www.machinelearning.org/proceedings/icml2007/papers/85.pdf 18 Direct Convex Relaxations of Sparse SVM
http://www.machinelearning.org/proceedings/icml2007/papers/161.pdf 100 Sample Compression Bounds for Decision Trees
http://www.machinelearning.org/proceedings/icml2007/papers/37.pdf 78 Bottom-Up Learning of Markov Logic Network Structure
http://www.machinelearning.org/proceedings/icml2007/papers/591.pdf 96 Online Discovery of Similarity Mappings
http://www.machinelearning.org/proceedings/icml2007/papers/360.pdf 12 Efficiently Computing Minimax Expected-Size Confidence Regions
http://www.machinelearning.org/proceedings/icml2007/papers/105.pdf 120 Dirichlet Aggregation: Unsupervised Learning towards an Optimal Metric for Proportional Data
http://www.machinelearning.org/proceedings/icml2007/papers/416.pdf 36 Gradient Boosting for Kernelized Output Spaces
http://www.machinelearning.org/proceedings/icml2007/papers/587.pdf 101 Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
http://www.machinelearning.org/proceedings/icml2007/papers/523.pdf 94 Tracking Value Function Dynamics to Improve Reinforcement Learning with Piecewise Linear Function Approximation
http://www.machinelearning.org/proceedings/icml2007/papers/245.pdf 46 Parameter Learning for Relational Bayesian Networks
http://www.machinelearning.org/proceedings/icml2007/papers/124.pdf 135 Asymptotic Bayesian Generalization Error when Training and Test Distributions are Different
http://www.machinelearning.org/proceedings/icml2007/papers/521.pdf 65 Adaptive Dimension Reduction Using Discriminant Analysis and K-means Clustering
http://www.machinelearning.org/proceedings/icml2007/papers/35.pdf 142 Optimal Dimensionality of Metric Space for Classification
http://www.machinelearning.org/proceedings/icml2007/papers/493.pdf 62 Scalable Modeling of Real Graphs using Kronecker Multiplication
http://www.machinelearning.org/proceedings/icml2007/papers/129.pdf 89 A Fast Linear Separability Test by Projection of Positive Points on Subspaces
http://www.machinelearning.org/proceedings/icml2007/papers/180.pdf 52 Infinite Mixtures of Trees
http://www.machinelearning.org/proceedings/icml2007/papers/335.pdf 88 Learning State-Action Basis Functions for Hierarchical MDPs
http://www.machinelearning.org/proceedings/icml2007/papers/366.pdf 43 Supervised Clustering of Streaming Data for Email Batch Detection
http://www.machinelearning.org/proceedings/icml2007/papers/471.pdf 35 Robust Non-linear Dimensionality Reduction using Successive 1-Dimensional Laplacian Eigenmaps
http://www.machinelearning.org/proceedings/icml2007/papers/225.pdf 83 Unsupervised Estimation for Noisy-Channel Models
http://www.machinelearning.org/proceedings/icml2007/papers/276.pdf 123 Hybrid Huberized Support Vector Machines for Microarray Classification
http://www.machinelearning.org/proceedings/icml2007/papers/450.pdf 25 Intractability and Clustering with Constraints
http://www.machinelearning.org/proceedings/icml2007/papers/370.pdf 48 Constructing Basis Functions from Directed Graphs for Value Function Approximation
http://www.machinelearning.org/proceedings/icml2007/papers/188.pdf 71 Relational Clustering by Symmetric Convex Coding
http://www.machinelearning.org/proceedings/icml2007/papers/444.pdf 144 Spectral Feature Selection for Supervised and Unsupervised Learning
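Each entry above pairs a paper's PDF URL with its number in the proceedings and its title. The short Python sketch below is a rough illustration of reading such a listing into structured records; it assumes the one-entry-per-line "URL number Title" layout used here, and the file name icml2007_paperlist.txt is hypothetical.

import re
from typing import List, NamedTuple

class PaperEntry(NamedTuple):
    url: str      # link to the PDF on machinelearning.org
    number: int   # paper number as given in the listing
    title: str    # paper title

# Matches lines of the form "<pdf-url> <number> <title>".
ENTRY_RE = re.compile(r"^(http\S+\.pdf)\s+(\d+)\s+(.+)$")

def parse_listing(text: str) -> List[PaperEntry]:
    """Collect every well-formed entry line into a PaperEntry record."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match:
            url, number, title = match.groups()
            entries.append(PaperEntry(url, int(number), title))
    return entries

if __name__ == "__main__":
    # Hypothetical local copy of this listing.
    with open("icml2007_paperlist.txt") as f:
        for entry in parse_listing(f.read()):
            print(entry.number, entry.title)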