Invited Applications Paper

The Role of Machine Learning in Business Optimization

Chid Apte
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598
APTE@US.IBM.COM

Appearing in Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010.

Abstract

In a trend that reflects the increasing demand for intelligent applications driven by business data, IBM today is building out a significant number of applications that leverage machine learning technologies to optimize business process decisions. This talk highlights this trend and describes the many different ways in which leading-edge machine learning concepts are being utilized in business applications developed by IBM for its internal use and for clients.

1. Overview

IBM has embarked upon a major strategy that focuses on Business Analytics and Optimization as a key technology for helping businesses optimize their routine operational and strategic decisions. This strategy essentially acknowledges the natural progression and evolution in the way businesses have been leveraging information, from the traditional "descriptive" analytics methods, to increasing use of "predictive" analytics, and eventually "prescriptive" analytics. Descriptive analytics of data allows a user to get a retrospective view on the business, getting answers to questions like "what happened", "how many times", and "where". Predictive analytics allows a user to get a prospective view on the business, getting answers to questions like "what could happen", "what if these trends continue", and "what might happen next if ...". Prescriptive analytics allows a user to obtain an actionable solution, getting an answer to the question "what is the set of required actions" to take to achieve a business objective, under a given set of predictions and business constraints.

Several case studies have overwhelmingly demonstrated that businesses dramatically improve their competitive presence by embracing predictive and prescriptive applications to optimize their business decision making. While traditional database management systems and the relational framework have provided the engine for enabling "descriptive" analytics that permit users to query and get rich reports from data, it is expected that Statistics and Machine Learning will fuel the use of "Predictive Analytics" from data, and Optimization Methods and Stochastic Analysis will provide the basis for "Prescriptive Analytics".

IBM's interest in Machine Learning dates back to the 1950s, when Arthur Samuel's Checkers program made history by demonstrating that a computer program could play checkers well enough to beat human experts, and could learn from its experience in playing against humans to become a better player. Early usage of Statistical and Machine Learning methods in IBM was driven by large-scale research and development programs in Speech Recognition, Handwriting Recognition, Natural Language Understanding, Vision, and Game Playing systems. While projects like these have continued in IBM ever since the seminal days of the Checkers project, we have recently witnessed a significant growth in the focus and interest in machine learning as an application technology for the sorts of enterprise business applications that IBM builds and delivers. The primary driver of this shift has been the vast amounts of information that businesses today readily collect and manage on all aspects of their processes.

The ongoing rapid growth of on-line data due to the widespread use of database technology has driven a new appetite for Machine Learning and Data Mining. The challenge of extracting useful insights from data draws upon research in statistics, data management, pattern recognition, and machine learning. Key advances in robust and scalable data mining techniques, methods for fast pattern detection from very large databases, and innovative applications of machine learning for business applications have come from our worldwide research laboratories.

Early demonstrations of successful machine learning applications started appearing inside IBM in the manufacturing quality control and computer performance management areas in the early 90s. Both domains were of high interest to the company given its singular focus then on hardware manufacturing internally and high availability systems offerings externally. Supervised learning methods such as decision trees and rule induction were shown to be extremely useful in fast generation of insightful predictive models from data that could be used to proactively manage quality control issues on the manufacturing line, and performance issues in large-scale computer systems running complex client workloads.

As IBM's focus started shifting towards services and solutions for clients in the mid-90s, access became possible to vast amounts of business data, including marketing campaign data, financial credit data, airline reservation systems data, clinical care and genomics data, call center interaction data, telecom call data, and much more. Many of the supervised learning methods that had been used very successfully internally were now being applied to very similar client problems. Some of IBM's earliest machine learning application work was in the area of marketing and customer relationship management. Applications were developed and deployed in client locations for campaign management, including targeted marketing and cross-sell/up-sell recommendation engines, using methods like decision trees, rule induction, and collaborative filtering.
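To make the idea of inducing a predictive rule from data concrete, the following is a minimal sketch of a one-level decision tree (a "decision stump"), the simplest relative of the decision tree and rule induction methods mentioned above. It is not the method used in any IBM application. The function names, the feature meanings (oven temperature, line speed), and the data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def induce_rule(rows, labels):
    """Find the single (feature, threshold) split with the fewest training errors.

    Returns (feature_index, threshold, label_if_at_or_below, label_if_above).
    """
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best[1:]


def predict(rule, row):
    """Apply an induced rule to one example."""
    f, t, l_lab, r_lab = rule
    return l_lab if row[f] <= t else r_lab


# Hypothetical quality-control data: (oven temperature, line speed) per unit.
rows = [(200, 1.0), (205, 1.2), (210, 0.9), (230, 1.1), (235, 1.0), (240, 1.3)]
labels = ["ok", "ok", "ok", "defect", "defect", "defect"]

rule = induce_rule(rows, labels)
print(rule)
print(predict(rule, (215, 1.0)))
```

A full decision tree learner would apply the same search recursively to each side of the split; the single rule is kept here because it can be read directly as an "if temperature exceeds a threshold, expect a defect" statement.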
As more complex business problems have started presenting themselves, teams in IBM have started leveraging more advanced ideas from machine learning to overcome challenges arising from special characteristics of the data, such as link information, an insufficient number of labeled examples, sparseness and high dimensionality, sequential prediction requirements, and timely analysis requirements. This has triggered the successful use and deployment of applications based upon emerging machine learning techniques for diverse domains, including credit card fraud detection, social media insights, marketing optimization, delinquency management, life sciences and healthcare management, customer and consumer insights, and sustainability management.

Some application domains that the IBM teams are working on are, in turn, providing fertile ground for machine learning research. One example is social media insights, where the problems associated with learning from large-scale community-based information interaction repositories (e.g. blogs), and concepts such as emerging topics and sentiment detection, are driving new directions in active learning, semi-supervised learning, and graphical modeling. Interest in root-cause diagnostics for complex instrumented systems is also driving new directions in machine learning research. Understanding causality from spatio-temporal data generated by such systems is driving new advances in methods for learning temporal causality.

The interplay between machine learning and optimization is taking on a new dimension as well. The role of optimization methods within modern machine learning algorithms is well understood. However, as we build out end-to-end decision support systems that incorporate both prediction and prescription, there is an increasing need for coupling predictive modeling and optimization for generating solution plans. The reinforcement learning paradigm is turning out to be a surprisingly useful approach for many business applications that have this need.
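The coupling of prediction and prescription described above can be illustrated with a small two-stage sketch. It is a deliberately simple "predict, then optimize" pipeline under stated assumptions, not reinforcement learning and not any IBM system. The segment names, response counts, revenue, cost, and budget are all hypothetical.

```python
def estimate_response_rates(history):
    """Predictive step: smoothed response rate per customer segment.

    `history` maps segment -> (contacts, responses). Add-one (Laplace)
    smoothing keeps small segments away from rates of exactly 0 or 1.
    """
    return {seg: (resp + 1) / (n + 2) for seg, (n, resp) in history.items()}


def plan_campaign(customers, rates, revenue, cost, budget):
    """Prescriptive step: pick customers to contact under a budget.

    Every contact has the same cost, so taking customers in decreasing
    order of expected profit maximizes total expected profit. Customers
    with non-positive expected profit are never contacted.
    """
    scored = [(rates[seg] * revenue - cost, cid) for cid, seg in customers]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best first, ties by id
    max_contacts = int(budget // cost)
    return [(cid, profit) for profit, cid in scored[:max_contacts] if profit > 0]


# Hypothetical past campaign results: segment -> (contacts, responses).
history = {"frequent": (98, 29), "occasional": (98, 9), "lapsed": (98, 1)}
customers = [
    ("c1", "lapsed"),
    ("c2", "frequent"),
    ("c3", "occasional"),
    ("c4", "frequent"),
    ("c5", "occasional"),
]

rates = estimate_response_rates(history)
plan = plan_campaign(customers, rates, revenue=50.0, cost=2.0, budget=6.0)
for cid, profit in plan:
    print(cid, round(profit, 2))
```

In this sketch the prediction is made once and then handed to the optimizer. A sequential formulation, in which each action changes the data available for later predictions, is the setting where reinforcement learning becomes relevant.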
As machine learning applications become embedded in operational IT environments, the need for scalability, robustness, and automation has become a pre-eminent concern. The requirement to learn in-situ from massive-scale data is driving the need for the design and implementation of scalable machine learning algorithms, and frameworks for parallel machine learning are being developed to enable a standards-based approach to developing such algorithms, sometimes leveraging special-purpose hardware accelerators and emerging multi-core architectures. Furthermore, applications that operate in high-volume, low-latency streaming environments are driving the need for real-time and on-line machine learning methods, more so than ever before.

Acknowledgments

I would like to thank my many colleagues in numerous Machine Learning R&D teams working across IBM's global labs for making this compelling story possible.

References

Apte, C., Morgenstern, L., and Hong, S. J. AI at IBM Research. IEEE Intelligent Systems, 15(6):51-57, 2000.

Davenport, T., and Harris, J. Competing on Analytics. Harvard Business School Press. ISBN: 1422103323.

IBM Research Machine Learning Communities: http://domino.research.ibm.com/comm/research.nsf/pages/r.ai.html, http://domino.research.ibm.com/comm/research.nsf/pages/r.kdd.html, http://domino.watson.ibm.com/comm/research.nsf/pages/r.nlp.html, http://domino.watson.ibm.com/comm/research.nsf/pages/r.uit.html.

Samuel, A. L. Some studies in machine learning using the game of checkers. IBM Journal of Research and Development, 3(3):211-229, 1959.