
Machine Learning CMSC 726 Fall 2011


Background and Description
Machine learning is all about finding patterns in data. The whole
idea is to replace the "human writing code" with a "human supplying
data" and then let the system figure out what it is that the person
wants to do by looking at the examples. The most central concept in
machine learning is generalization: how to generalize beyond
the examples that have been provided at "training time" to new
examples that you see at "test time." A very large fraction of what
we'll talk about has to do with figuring out what generalization
means. We'll look at it from lots of different perspectives and
hopefully gain some understanding of what's going on.
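To make the training-time versus test-time distinction concrete, here is a minimal sketch, not part of the course materials. It assumes a made-up one-dimensional problem (label 1 when x > 0.5, with 20% of labels flipped) and a 1-nearest-neighbor predictor; the function names and noise level are illustrative choices only. Memorizing the training set gives perfect training accuracy, but accuracy on fresh examples is noticeably lower.

```python
import random


def make_data(n, rng, noise=0.2):
    """Draw n points in [0, 1); the label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # label noise: the training labels are not fully trustworthy
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def accuracy(train, examples):
    """Fraction of `examples` whose label matches the 1-NN prediction."""
    correct = sum(predict_1nn(train, x) == y for x, y in examples)
    return correct / len(examples)


rng = random.Random(0)        # fixed seed so the run is repeatable
train = make_data(200, rng)   # examples seen at "training time"
test = make_data(200, rng)    # new examples seen at "test time"

train_acc = accuracy(train, train)  # each point is its own nearest neighbor
test_acc = accuracy(train, test)    # what we actually care about

print(f"training accuracy: {train_acc:.2f}")
print(f"test accuracy:     {test_acc:.2f}")
```

The gap between the two numbers is one simple way to see why performance on the training examples says little about generalization.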
This class will showcase machine learning technology in the context of
recommender systems, à la what you see on Amazon or Netflix (or
eHarmony). The data we'll be working with is recommendations for CS
courses at UMD!
There are a few cool things about machine learning that I hope to get
across in class. The first is that it's broadly applicable. These
techniques have led to significant advances in many fields, including
stock trading, robotics, machine translation, computer vision,
medicine, etc. The second is that there is a very close connection
between theory and practice. While this course is more on the
"practical" side of things, almost everything we will talk about has a
huge amount of accompanying theory. The third is that once you
understand the basics of machine learning technology, it's a very open
field in which lots of progress can be made quickly, essentially by
figuring out ways to formalize whatever we can figure out about the
world.
Prerequisites: I take prerequisites seriously. There will be a
lot of math in this class and if you do not come prepared, life
will be rough. You should be able to take derivatives by hand
(preferably of multivariate functions), you should know what dot
products are and how they are related to projections onto subspaces,
you should know what Bayes' rule is and you should know that it's okay
for the density of a Gaussian probability distribution to be greater
than one. I've provided some reading
material to refresh these issues in your head, but if you haven't
at least seen these things before, you should beef up your math
background before class begins. On the
programming side, projects will be in Python; you should understand basic
computer science concepts (like recursion), basic data structures
(trees, graphs), and basic algorithms (search, sorting, etc.). (If
you know MATLAB, here's a nice cheat sheet.)
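As a quick self-check on the math prerequisites, the following sketch (an illustration only, using the standard library and made-up numbers) evaluates a Gaussian density that exceeds one, projects a vector onto another using dot products, and applies Bayes' rule. The helper names and all numeric values are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project(u, v):
    """Project u onto the line spanned by v: (u.v / v.v) v."""
    scale = dot(u, v) / dot(v, v)
    return [scale * b for b in v]


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) from P(A), P(B | A) and P(B | not A)."""
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# A narrow Gaussian (sigma = 0.1) has a density above 1 at its mean...
peak = gaussian_pdf(0.0, 0.0, 0.1)
# ...yet the density still integrates to (approximately) 1.
dx = 0.001
area = sum(gaussian_pdf(-1.0 + i * dx, 0.0, 0.1) for i in range(2001)) * dx

# Projection of u onto v; the residual is orthogonal to v.
u, v = [3.0, 4.0], [1.0, 0.0]
projection = project(u, v)
residual = [a - b for a, b in zip(u, projection)]

# Bayes' rule with hypothetical numbers: 1% prior, 90% hit rate, 5% false alarms.
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

print(f"peak density: {peak:.3f}, area under curve: {area:.3f}")
print(f"projection: {projection}, residual . v = {dot(residual, v)}")
print(f"posterior: {posterior:.3f}")
```

A density is a rate per unit of x rather than a probability, which is why values above one are fine as long as the total area is one.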
Structure of Class
I will take a slightly nonstandard approach to class time. I will
not spend 3 hours per week going over material that was in the
readings. As a result, you should read. And you should do the short
written assignments. My responsibility will be to help you understand
things that are hard, and to give you an insider's view of the field.
Class time will be interactive. Certain homework problems will be
marked for in-class presentation, and you will do them. The
rest of class time will be spent talking about issues that arise,
things that I think are particularly interesting, doing activities
and/or demos.
Your responsibilities are as follows:
 Read the assigned reading assignments before class. There
are inline questions in the reading that you should be prepared to
answer in class. Though you don't need to be right, you do need to
have an answer (or face public embarrassment!).
 It will be very helpful if you write down a short list of
questions before class, though this isn't actually required. I'm
serious about reading; to demonstrate that, most reading assignments
are about 10 pages long (some are 15ish).
 Complete the assigned weekly homework assignments before class.
Some will be "starred", meaning that we will spend the first part of
class time going over the solutions. Students will present the
solutions: you will be chosen to present uniformly at random (without
replacement). We have our own handin
system.
 Participate actively in class discussions and on Piazza
(questions and answers are considered participation!).
Given that this is a three-credit class, I expect you to spend nine
hours per week working on ML stuff. Three of those hours will be in
class. Of the remaining six, I expect about two to be spent reading
(one hour per assignment), two to be spent on written homeworks and
two to be spent on projects. If things are taking significantly more
time than this, you should talk to us.
Grading
The purpose of grading (in my mind) is to provide extra incentive for
you to keep up with the material and to ensure that you exit the class
as a machine learning genius. If everyone gets an A, that
would make me happy (sadly, it hasn't happened yet). The components
of grading are:
 27%  Programming projects
There are three programming projects, each worth 9% of your final
grade. You will be graded on both code correctness and your
analysis of the results. These must be completed in teams of two or
three students.

 18%  Written homeworks
There are thirteen written homeworks (one per week), each worth 1.5%
of your final grade (lowest one dropped). They will be graded on a high-pass (100%),
low-pass (50%) or fail (0%) basis. These are to be completed
individually. (The
initial homework, HW00, is not graded, but required if you do not
want to fail.)

 25%  Midterm exam
Roughly halfway through the semester, there will be a midterm exam
that covers everything up until that point. Obviously it is to be
completed individually, but it is open-book.

 25%  Final (practical) exam
Everyone is to complete a final project, in teams of arbitrary size,
which will play the role of a practical final exam.
We will discuss the scope of the project later
in class. 
 5%  Class participation
You will be graded on your
in-class presentations of homework questions and other general
participation, including participation in the comments on the blog. This is mostly subjective.
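The weights above can be combined as in the following sketch. It is an illustration under stated assumptions, not the official grading procedure: it assumes each component is scored as a fraction between 0 and 1, that the three projects are averaged, and that dropping the lowest homework leaves twelve marks worth 1.5% each. It ignores the late policy. The function names and the example scores are hypothetical.

```python
# Component weights, in percent of the final grade (from the list above).
WEIGHTS = {
    "projects": 27.0,
    "homeworks": 18.0,
    "midterm": 25.0,
    "final": 25.0,
    "participation": 5.0,
}

# Homework marks and the credit each one earns.
HOMEWORK_MARKS = {"high-pass": 1.0, "low-pass": 0.5, "fail": 0.0}


def homework_fraction(marks):
    """Average credit over the homeworks after dropping the single lowest mark."""
    scores = sorted(HOMEWORK_MARKS[m] for m in marks)
    kept = scores[1:]  # drop the lowest one
    return sum(kept) / len(kept)


def final_grade(project_fractions, homework_marks, midterm, final, participation):
    """Weighted total in percent; all scores except homework_marks are fractions in [0, 1]."""
    projects = sum(project_fractions) / len(project_fractions)
    return (
        WEIGHTS["projects"] * projects
        + WEIGHTS["homeworks"] * homework_fraction(homework_marks)
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final"] * final
        + WEIGHTS["participation"] * participation
    )


# Hypothetical student: thirteen homeworks with ten high-passes, two low-passes, one fail.
marks = ["high-pass"] * 10 + ["low-pass"] * 2 + ["fail"]
example = final_grade([0.9, 0.8, 1.0], marks, midterm=0.8, final=0.9, participation=1.0)
perfect = final_grade([1.0, 1.0, 1.0], ["high-pass"] * 13, 1.0, 1.0, 1.0)

print(f"example total: {example:.1f}%")
print(f"perfect total: {perfect:.1f}%")
```

Because the lowest homework is dropped, a single failed homework costs nothing in this sketch, while each additional low-pass costs 0.75 percentage points.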

Late homeworks are not allowed (without prior approval). This
is because I need to put solutions up on the web page. You may hand
any project in up to 48 hours late; however, once it is late by one
minute, your final score will be halved.
We will post notes on the blog when assignments have been graded. If
you handed something in and do not get a score for an
assignment, you have a one-week moratorium on complaints.
Textbooks
There are no official books for this course. Our primary source will
be a collection of notes (aka CIML) I
have been writing. Some recommended (but not required) books:
Schedule (tentative)
The following schedule is subject to change, but likely not by very
much. The readings listed are readings that you should have finished
by that date. Everything is due by 10:55am on the date listed on the
schedule. Programming assignments are to be completed in Python.
Written assignments are to be submitted in PDF format.
One thing that students have pointed out in the past that I'll point
out to you is that Wikipedia has a
bunch of good articles related to machine learning and statistics.
Basic statistics topics in particular (various distributions, rules of
probability, etc.) are very well explained there. I highly
recommend it as an alternative source of information.
Date | Topics | Readings | Due | Notes
01 Sep | [1] What is machine learning? | CIML 1-1.2 | |
Basic Supervised Learning
06 Sep | [2] Decision trees and inductive bias | CIML 1.3-1.9 | HW00 |
08 Sep | [3] Geometry and nearest neighbors | CIML 2-2.3 | HW01 |
13 Sep | [4] K-means clustering | CIML 2.4-2.6 | |
15 Sep | [5] Perceptrons | CIML 3-3.4 | HW02 |
20 Sep | [6] Perceptrons II | CIML 3.5-3.7 | |
22 Sep | [7] Practical issues and evaluation | CIML 4-4.8 | HW03 |
27 Sep | [8] Imbalanced and multiclass classification | CIML 5-5.2 | P1 |
29 Sep | [9] Ranking and collective classification | CIML 5.3-5.5 | HW04 |
Advanced Supervised Learning
04 Oct | [10] Linear models and gradient descent | CIML 6-6.4 | |
06 Oct | [11] Subgradient descent and support vector machines | CIML 6.5-6.7 | HW05 |
11 Oct | [12] Probabilistic modeling | CIML 7 | |
13 Oct | [13] Probabilistic modeling II | CIML 7 | HW06 |
18 Oct | [14] Neural networks | CIML 8 | |
20 Oct | [15] Neural networks II | CIML 8 | HW07 |
25 Oct | [16] Kernel methods | CIML 9-9.4 | |
27 Oct | [17] Kernel methods II | CIML 9.5-9.6 | HW08 |
01 Nov | [18] Ensemble methods | CIML 11 | P2 |
03 Nov | [19] Efficient learning | CIML 12 | HW09 |
Unsupervised Learning
08 Nov | [20] Linear unsupervised learning | CIML 13-13.2 | Midterm |
10 Nov | [21] Nonlinear unsupervised learning | CIML 13.3-13.5 | HW10 |
15 Nov | [22] Expectation maximization | CIML 14-14.3 | |
17 Nov | [23] Expectation maximization II | CIML 14.4-14.5 | HW11 |
22 Nov | [24] Semi-supervised learning | ssl_survey (sec 2-4) | |
Advanced Topics
29 Nov | [25] Hidden Markov models | hmmssl | |
01 Dec | [26] Graphical models | bp | HW12 |
06 Dec | [27] Online learning | online (175) | P3 |
08 Dec | [28] Structured learning | | |
13 Dec | [29] Bayesian learning | bayesslides | HW13 |
Homework Assignments
All written homeworks are due on Thursday. See the schedule above for due dates.
You may hand in your homework/projects here.
You're free to use the LaTeX source in any way you want, but you'll need haldefs.sty and notes.sty to build them.
Written Homeworks

HW00: Survey and background check (tex)
HW01: Basic concepts and geometry (tex)
HW02: Clustering and perceptrons (tex)
HW03: Perceptrons and evaluation (tex)
HW04: Complex predictions (tex)
HW05: Gradient descent and friends (tex)
HW06: Probabilistic and neural modeling (tex)
HW07: Kernel methods (tex)
HW08: Kernels II (tex)
HW09: Ensembles and efficiency (tex)
HW10: Unsupervised learning (tex)
HW11: Expectation maximization (tex)
HW12: Graphical models (tex)
HW13: Advanced learning topics
Programming Projects

P0: Unix/Python/NumPy tutorial
P1: Basic classification; solution
P2: Complex classification
P3: Unsupervised learning
Final Project
See here
Useful Links
This course has been taught (by me!) in the past: Fall 2009, Fall 2008, Spring 2008 and Spring 2007.
This course is similar to several other machine learning courses, taught at
other universities:
CMU (Tom Mitchell and Andrew Moore),
Stanford (Andrew Ng),
Cornell (Thorsten Joachims) and
Edinburgh (Sethu Vijayakumar).
There has also been a series of summer schools on
machine learning, some of which have videos up.
Although you won't need to use any of this software for your homeworks/projects, there are a large number of open-source machine learning toolkits out there. (Some of these may be useful for the competition.) A small sample:
 Torch3: a generic machine learning library, particularly good for neural networks, but also a lot more!
 MegaM: Optimization software for maximum entropy models, uses conjugate gradient for binary/binomial problems and LM-BFGS for multiclass problems
 FastDT: Very fast decision tree learner that implements bagging and boosting
 libSVM: a very efficient library for SVMs
 SVMLight: another efficient library for SVMs
 Weka: the "de facto" machine learning/data-mining library
 Mallet: a library for structured prediction with CRFs (plus other stuff)
Course Policies
Cheating: Any assignment or exam that is handed in must be your own work. However, talking with one another to understand the material better is strongly encouraged. Recognizing the distinction between cheating and cooperation is very important.
If you copy someone else's solution, you are cheating. If you let someone else copy your solution, you are cheating.
If someone dictates a solution to you, you are cheating. Everything you hand in must be in your own words, and based on your own understanding of the solution.
If someone helps you understand the problem during a high-level discussion, you are not cheating. We strongly encourage students to help one another understand the material presented in class, in the book, and general issues relevant to the assignments.
When taking an exam, you must work independently. Any collaboration during an exam will be considered cheating.
Any student who is caught cheating will be given an E in the course and referred to the University Student Behavior Committee. Please don't take that chance. If you're having trouble understanding the material, please let us know and we will be more than happy to help.
ADA: Any student eligible for and requesting reasonable
academic accommodations due to a disability is requested to
provide, to the instructor in office hours, a letter of
accommodation from
the Office of
Disability Support Services (DSS) within the first two weeks of
the semester. You may reach them at 301-314-7682 or by visiting
Susquehanna Hall on the 4th Floor.
College guidelines: Document concerning adding, dropping, etc. here.