■ █▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▒▓▓▓▓█▓▓█▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓█▓
▌ ▐ ▐▀▌ █ █▀▌ ▐▀▌ ▌ ▐ ▐▌▐▌ ▐▀▀ ▀▌▐▌▐▀ █▒▒▒▒▒▒▒▒▒▒▒▓▒▒▒▒█▒▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
█▀█ █▀█ █ █ █ █▀█ █ █ ▐▐▌█ █▀ ▌▐▌▐ ▒▒▒▒▒▒▒▒▒▓▒▒▒█▒▒▒▒▒▒▒▒▒▒▓█████▒▓▓▓▓▓▓▓█▓▓▓▓
▌ ▐ ▌ ▐ ▐██ ███ ▌ ▐ █▄█ ▐ █ ▐██ ▄▌▐▌▐▄ █▒▒▒▒▒▒▒▒▓▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▓▓█▓▓▓▓
█▒▒▒▒▒▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▓▓▓▓█▓
(W)elcome (A)dvisees Tal(k)s █▒▒▒▒▒█▓████▒▓▒▒▒▒▓▒▒▒▒▒▒▒▒▓▓▓█▓▓▓▓▓
(P)ublications (V)ita/Bio/Contact (F)AQ ▓▒▒▒▒▒▒█▒███▒▒▒▓▒▒▒▒▒▒▒▒▒▓▓▓▓▓▓▓▓
(T)eaching (B)log (L)inks █▒▒▒▒▒▒▒██▒▒▒▓████▒▒▒▒▒▓▓▓▓▓▓▓▓
▓▒▒▒██▒▒▒▒▓▓▓█████▒▒▓▓▓▓█▓▓▓
███▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▓▓▓
My thesis committee is made up of
Daniel Marcu (CS),
Kevin Knight (CS),
Eduard Hovy (CS),
Stefan Schaal (CS),
Gareth James (Statistics),
Andrew McCallum (UMass).
The thesis is available in PDF or
Postscript format (warning: it's a
big file!). BibTeX is also
availble. You can also download my defense slides in either OpenOffice format or PDF (warning: animations don't come
through in PDF).
The thesis abstract is:
Natural language processing is replete with problems whose outputs are
highly complex and structured. The current state-of-the-art in
machine learning is not yet sufficiently general to be applied to
general problems in NLP. In this thesis, I present Searn (for
"search-learn"), an approach to learning for structured
outputs that is applicable to the wide variety of problems
encountered in natural language (and, hopefully, to problems in other
domains, such as vision and biology). To demonstrate Searn's general applicability,
I present applications in such diverse areas as automatic document
summarization and entity detection and tracking. In these
applications, Searn is
empirically shown to achieve state-of-the-art performance.
Searn is based on an
integration of learning and search. This contrasts with
standard approaches that define a model, learn parameters for that
model, and then use the model and the learned parameters to produce
new outputs. In most NLP problems, the "produce new
outputs" step includes an intractable computation. One must
therefore employ a heuristic search function for the production step.
Instead of shying away from search, Searn attacks it head on and considers structured
prediction to be defined by a search problem. The
corresponding learning problem is then made natural: learn parameters
so that search succeeds.
The two application domains I study most closely in this thesis are
entity detection and tracking (EDT) and automatic document
summarization. EDT is the problem of finding all references to
people, places and organizations in a document and identifying their
relationships. Summarization is the task of producing a short summary
for either a single document or for a collection of documents. These
problems exhibit complex structure that cannot be captured and
exploited using previously proposed structured prediction algorithms.
By applying Searn to
these problems, I am able to learn models that benefit from highly
complex, non-local features. Such features would not be available to
structured prediction algorithm that require model tractability.
These improvements lead to state-of-the-art performance on
standardized data sets with low computational overhead.
Searn operates by
transforming structured prediction problems into a collection of
classification problems, to which any standard binary classifier may
be applied (for instance, a support vector machine or decision tree).
In fact, Searn
represents a family of structured prediction algorithms depending on
the classifier and search space used. From a theoretical perspective,
Searn satisfies a strong
fundamental performance guarantee: given a good classification
algorithm, Searn yields
a good structured prediction algorithm. Such theoretical results are
possible for other structured prediction only when the underlying
model is tractable. For Searn, I am able to state strong results that are
independent of the size or tractability of the search space. This
provides theoretical justification for integrating search with
learning.
|
credits: design and font inspired by Seth Able's LoRD, some images converted to ANSI using ManyTools, original drawing of me by anonymous.
last updated on nineteen november, two thousand twenty four.
|