Overview
Current Team
Past Members
Publications
Software
Project funded by the National Science Foundation (IIS-1409287, UMD; IIS-1409739, BYU)
PI: Jordan Boyd-Graber (Maryland), co-PI: Leah Findlater (UW), PI: Kevin Seppi (BYU)
Individuals and organizations must cope with massive amounts of unstructured text information: individuals sifting through a lifetime of e-mail and documents, journalists understanding the activities of government organizations, companies reacting to what people say about them online, or scholars making sense of digitized documents from the ancient world. This project’s research goal is to bring together two previously disconnected components of how users understand this deluge of data: algorithms to sift through the data and interfaces to communicate the results of the algorithms. This project allows users to provide feedback to algorithms that were typically employed on a “take it or leave it” basis: if the algorithm makes a mistake or misunderstands the data, users can correct the problem using an intuitive user interface and improve the underlying analysis. This project jointly improves both the algorithms and the interfaces, leading to deeper understanding of, faster introduction to, and greater trust in the algorithms we rely on to understand massive textual datasets. Furthermore, source code and functional demos will be shared publicly, and tutorials will be shared online and in person to aid the adoption of the methodologies.
This project enables computer algorithms and humans each to apply their strengths and collaborate in managing and making sense of large volumes of textual data. It “closes the loop” in novel ways to connect users with a class of big data analysis algorithms called topic models. This connection is made through interfaces that empower the user to change the underlying models by refining the number and granularity of topics, adding or removing words considered by the model, and adding constraints on what words appear together in topics. The underlying model also enables new visualizations in the form of a Metadata Map that uses active learning to focus users’ limited attention on the most important documents in a collection. Users annotate documents with useful meta-data and thereby further improve the quality of the discovered topics. The project includes evaluations of these methods through careful user studies and in-depth case studies to demonstrate that topics are more coherent, users can more quickly provide annotations, users trust the underlying algorithms more, and users can more effectively build an understanding of their textual data.
Jordan Boyd-Graber Associate Professor, Computer Science/Language Science/iSchool/UMIACS (Maryland)
Leah Findlater Assistant Professor, Human Centered Design and Engineering (UW)
Kevin Seppi Professor, Computer Science (BYU)
Piper Armstrong MS Student, Computer Science (BYU)
Stephen Cowley MS Student, Computer Science (BYU)
Wilson Fearn MS Student, Computer Science (BYU)
Courtni Byun MS Student, Computer Science (BYU)
Fenfei Guo PhD Student, Computer Science (UMD)
Varun Kumar Applied Scientist at Amazon
Pedro Rodriguez PhD Student, Computer Science (Colorado)
Alison Smith-Renner PhD Student, Computer Science (UMD)
Thang Nguyen PhD Student, Computer Science (UMD)
Eric Ringger Associate Professor, Computer Science (BYU). Now Senior Director of Machine Learning for Personalization at Zillow
Paul Felt PhD Student, Computer Science (BYU). Now Software Engineer at IBM
Ethan Garofolo MS Student, Computer Science (BYU)
Jeff Lund PhD Student, Computer Science (BYU). Now at Google
Connor Cook Undergrad, Computer Science (BYU). Now an MS Student at the University of Colorado Boulder
Tak Yeon Lee PhD Student, Computer Science (UMD). Now Research Scientist at Adobe
You Lu MS Student, Computer Science (Colorado). Now PhD Student at Virginia Tech
Viet-An Nguyen PhD Student, Computer Science (UMD). Now Research Scientist at Facebook
Nozomu Okuda MS Student, Computer Science (BYU)
Forough Poursabzi PhD Student, Computer Science (Colorado). Now Postdoc at MSR NYC
@inproceedings{Smith:Boyd-Graber:Fan:Birchfield:Wu:Weld:Findlater-2020, Author = {Alison Smith and Jordan Boyd-Graber and Ron Fan and Melissa Birchfield and Tongshuang Wu and Dan Weld and Leah Findlater}, Booktitle = {Computer-Human Interaction}, Year = {2020}, Url = {http://umiacs.umd.edu/~jbg//docs/2020_chi_explanation.pdf}, Title = {No Explainability without Accountability: An Empirical Study of Explanations and Feedback in Interactive ML}, }
@inproceedings{Smith:Kumar:Boyd-Graber:Seppi:Findlater-2020, Author = {Alison Smith and Varun Kumar and Jordan Boyd-Graber and Kevin Seppi and Leah Findlater}, Booktitle = {Intelligent User Interfaces}, Url = {http://umiacs.umd.edu/~jbg//docs/2020_iui_control.pdf}, Year = {2020}, Title = {Digging into User Control: Perceptions of Adherence and Instability in Transparent Models}, }
@inproceedings{Guo:Boyd-Graber:Iyyer:Findlater-2020, Author = {Fenfei Guo and Jordan Boyd-Graber and Mohit Iyyer and Leah Findlater}, Location = {France (but only in dreams)}, Booktitle = {Linguistic Resources and Evaluation Conference}, Url = {http://umiacs.umd.edu/~jbg//docs/2020_lrec_sense.pdf}, Year = {2020}, Title = {Which Evaluations Uncover Sense Representations that Actually Make Sense?}, }
@inproceedings{Lund:Armstrong:Fearn:Cowley:Byun:Boyd-Graber:Seppi-2019, Title = {Automatic and Human Evaluation of Local Topic Quality}, Author = {Jeffrey Lund and Piper Armstrong and Wilson Fearn and Stephen Cowley and Courtni Byun and Jordan Boyd-Graber and Kevin Seppi}, Booktitle = {Association for Computational Linguistics}, Year = {2019}, Location = {Florence, Italy}, Url = {http://umiacs.umd.edu/~jbg//docs/2019_acl_local.pdf}, }
@inproceedings{Kumar:Smith:Findlater:Seppi:Boyd-Graber-2019, Author = {Varun Kumar and Alison Smith and Leah Findlater and Kevin Seppi and Jordan Boyd-Graber}, Booktitle = {Association for Computational Linguistics}, Year = {2019}, Location = {Florence, Italy}, Title = {Why Didn't You Listen to Me? Comparing User Control of Human-in-the-Loop Topic Models}, Url = {http://umiacs.umd.edu/~jbg//docs/2019_acl_control.pdf}, }
@article{Pruss:Fujinuma:Daughton:Paul:Arnot:Szafir:Boyd-Graber-2019, Author = {Dasha Pruss and Yoshinari Fujinuma and Ashlynn Daughton and Michael Paul and Brad Arnot and Danielle Szafir and Jordan Boyd-Graber}, Journal = {PLOS ONE}, Year = {2019}, Title = {Zika discourse in the Americas: A multilingual topic analysis of {Twitter}}, Url = {https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0216922}, }
@article{Wallace:Rodriguez:Feng:Yamada:Boyd-Graber-2019, Title = {Trick Me If You Can: Human-in-the-loop Generation of Adversarial Question Answering Examples}, Author = {Eric Wallace and Pedro Rodriguez and Shi Feng and Ikuya Yamada and Jordan Boyd-Graber}, Journal = {Transactions of the Association for Computational Linguistics}, Year = {2019}, Volume = {7}, Url = {http://umiacs.umd.edu/~jbg//docs/2019_tacl_trick.pdf}, }
@inproceedings{Smith:Kumar:Boyd-Graber:Seppi:Findlater-2018, Author = {Alison Smith and Varun Kumar and Jordan Boyd-Graber and Kevin Seppi and Leah Findlater}, Booktitle = {Intelligent User Interfaces}, Year = {2018}, Title = {User-Centered Design and Evaluation of a Human-in-the-Loop Topic Modeling System}, Url = {http://umiacs.umd.edu/~jbg//docs/2018_iui_itm.pdf}, }
@inproceedings{Felt:Ringger:Seppi:Boyd-Graber-2018, Title = {Learning from Measurements in Crowdsourcing Models: Inferring Ground Truth from Diverse Annotation Types}, Author = {Paul Felt and Eric Ringger and Kevin Seppi and Jordan Boyd-Graber}, Booktitle = {International Conference on Computational Linguistics}, Year = {2018}, Location = {Santa Fe, New Mexico}, Url = {http://umiacs.umd.edu/~jbg//docs/2018_coling_measurements.pdf}, }
@inproceedings{Lund:Cook:Seppi:Boyd-Graber-2017, Title = {Tandem Anchoring: A Multiword Anchor Approach for Interactive Topic Modeling}, Author = {Jeff Lund and Connor Cook and Kevin Seppi and Jordan Boyd-Graber}, Booktitle = {Association for Computational Linguistics}, Year = {2017}, Location = {Vancouver, British Columbia}, Url = {http://umiacs.umd.edu/~jbg//docs/2017_acl_multiword_anchors.pdf}, }
@inproceedings{Smith:Kumar:Boyd-Graber:Seppi:Findlater-2017, Author = {Alison Smith and Varun Kumar and Jordan Boyd-Graber and Kevin Seppi and Leah Findlater}, Booktitle = {CHI 2017 Designing for Uncertainty Workshop}, Year = {2017}, Location = {Denver, CO}, Title = {Accounting for Input Uncertainty in Human-in-the-Loop Systems}, Url = {http://visualization.ischool.uw.edu/hci_uncertainty/papers/Paper11.pdf}, }
@inproceedings{Lu:Lund:Boyd-Graber-2017, Title = {Why ADAGRAD Fails for Online Topic Modeling}, Author = {You Lu and Jeff Lund and Jordan Boyd-Graber}, Booktitle = {Empirical Methods in Natural Language Processing}, Url = {http://umiacs.umd.edu/~jbg//docs/2017_emnlp_adagrad_olda.pdf}, Year = {2017}, Location = {Copenhagen, Denmark}, }
@article{Lee:Smith:Seppi:Elmqvist:Boyd-Graber:Findlater-2017, Author = {Tak Yeon Lee and Alison Smith and Kevin Seppi and Niklas Elmqvist and Jordan Boyd-Graber and Leah Findlater}, Journal = {International Journal of Human-Computer Studies}, Year = {2017}, Title = {The Human Touch: How Non-expert Users Perceive, Interpret, and Fix Topic Models}, Url = {http://umiacs.umd.edu/~jbg//docs/2017_ijhcs_human_touch.pdf}, }
@article{Boyd-Graber-2017, Author = {Jordan Boyd-Graber}, Journal = {The Bridge}, Year = {2017}, Title = {Humans and Computers Working Together to Measure Machine Learning Interpretability}, Volume = {47}, Pages = {6--10}, }
@article{Smith:Lee:Poursabzi-Sangdeh:Boyd-Graber:Seppi:Elmqvist:Findlater-2017, Author = {Alison Smith and Tak Yeon Lee and Forough Poursabzi-Sangdeh and Jordan Boyd-Graber and Kevin Seppi and Niklas Elmqvist and Leah Findlater}, Journal = {Transactions of the Association for Computational Linguistics}, Year = {2017}, Title = {Evaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Labels}, Volume = {5}, Url = {http://umiacs.umd.edu/~jbg//docs/2017_tacl_eval_tm_viz.pdf}, Pages = {1--15}, }
@inproceedings{Poursabzi-Sangdeh:Boyd-Graber:Findlater:Seppi-2016, Title = {ALTO: Active Learning with Topic Overviews for Speeding Label Induction and Document Labeling}, Author = {Forough Poursabzi-Sangdeh and Jordan Boyd-Graber and Leah Findlater and Kevin Seppi}, Booktitle = {Association for Computational Linguistics}, Year = {2016}, Location = {Berlin, Germany}, Url = {http://umiacs.umd.edu/~jbg//docs/2016_acl_doclabel.pdf}, }
@inproceedings{Smith:Lee:Poursabzi-Sangdeh:Boyd-Graber:Seppi:Elmqvist:Findlater-2016, Author = {Alison Smith and Tak Yeon Lee and Forough Poursabzi-Sangdeh and Jordan Boyd-Graber and Kevin Seppi and Niklas Elmqvist and Leah Findlater}, Booktitle = {CHI Human Centred Machine Learning Workshop}, Year = {2016}, Location = {San Jose, CA}, Title = {Human-Centered and Interactive: Expanding the Impact of Topic Models}, }
@inproceedings{Sultan:Boyd-Graber:Sumner-2016, Title = {Bayesian Supervised Domain Adaptation for Short Text Similarity}, Author = {Md Arafat Sultan and Jordan Boyd-Graber and Tamara Sumner}, Booktitle = {North American Association for Computational Linguistics}, Year = {2016}, Location = {San Diego, CA}, Url = {http://umiacs.umd.edu/~jbg//docs/2016_naacl_sts.pdf}, }
@inproceedings{Nguyen:Boyd-Graber:Resnik:Miler-2015, Title = {Tea Party in the House: A Hierarchical Ideal Point Topic Model and Its Application to Republican Legislators in the 112th Congress}, Author = {Viet-An Nguyen and Jordan Boyd-Graber and Philip Resnik and Kristina Miler}, Booktitle = {Association for Computational Linguistics}, Year = {2015}, Location = {Beijing, China}, Url = {http://umiacs.umd.edu/~jbg//docs/2015_acl_teaparty.pdf}, }
Accessible Abstract: In the mid-2010s, the Republican Party in the United States diverged: mainstream conservatives split from the so-called "tea party" caucus. However, the primary statistical tool for analyzing political factions in legislative bodies (ideal point models) fails to account for these changes. This is because the schism is not fully reflected in voting patterns but rather in how politicians present themselves: thus we need to extend these models to capture not just how politicians vote but also how they frame particular issues. This paper proposes a new model to capture framing differences within a voting bloc to start explaining the new subcoalitions of the Republican caucus.
@inproceedings{Felt:Ringger:Boyd-Graber:Seppi-2015, Title = {Making the Most of Crowdsourced Document Annotations: Confused Supervised {LDA}}, Author = {Paul Felt and Eric Ringger and Jordan Boyd-Graber and Kevin Seppi}, Booktitle = {Conference on Computational Natural Language Learning}, Year = {2015}, Location = {Beijing, China}, Url = {http://umiacs.umd.edu/~jbg//docs/2015_conll_cslda.pdf}, }
@inproceedings{Yang:Downey:Boyd-Graber-2015, Title = {Efficient Methods for Incorporating Knowledge into Topic Models}, Author = {Yi Yang and Doug Downey and Jordan Boyd-Graber}, Booktitle = {Empirical Methods in Natural Language Processing}, Year = {2015}, Location = {Lisbon, Portugal}, Url = {http://umiacs.umd.edu/~jbg//docs/2015_emnlp_fast_priors.pdf}, }
@inproceedings{Bach:Huang:Boyd-Graber:Getoor-2015, Title = {Paired-Dual Learning for Fast Training of Latent Variable Hinge-Loss MRFs}, Author = {Stephen H. Bach and Bert Huang and Jordan Boyd-Graber and Lise Getoor}, Location = {Lille, France}, Booktitle = {International Conference on Machine Learning}, Year = {2015}, Url = {http://umiacs.umd.edu/~jbg//docs/2015_icml_paired_dual.pdf}, }
@inproceedings{Nguyen:Boyd-Graber:Lund:Seppi:Ringger-2015, Title = {Is your anchor going up or down? {F}ast and accurate supervised topic models}, Author = {Thang Nguyen and Jordan Boyd-Graber and Jeff Lund and Kevin Seppi and Eric Ringger}, Booktitle = {North American Association for Computational Linguistics}, Year = {2015}, Location = {Denver, Colorado}, Url = {http://umiacs.umd.edu/~jbg//docs/2015_naacl_supervised_anchor.pdf}, }
@inproceedings{Smith:Chuang:Hu:Boyd-Graber:Findlater-2014, Title = {Concurrent Visualization of Relationships between Words and Topics in Topic Models}, Author = {Alison Smith and Jason Chuang and Yuening Hu and Jordan Boyd-Graber and Leah Findlater}, Booktitle = {ACL Workshop on Interactive Language Learning, Visualization, and Interfaces}, Year = {2014}, Location = {Baltimore, Maryland}, }
@article{Hu:Boyd-Graber:Satinoff:Smith-2014, Title = {Interactive Topic Modeling}, Author = {Yuening Hu and Jordan Boyd-Graber and Brianna Satinoff and Alison Smith}, Journal = {Machine Learning}, Url = {http://umiacs.umd.edu/~jbg//docs/2014_mlj_itm.pdf}, Year = {2014}, Volume = {95}, Pages = {423--469}, Publisher = {Springer}, }
@article{Smith-2014, Author = {Marcus Smith}, Year = {2014}, Title = {Prof. Ringger and Natural Language Processing}, Journal = {Thinking Aloud}, Url = {https://cs.byu.edu/article/prof-ringger-and-natural-language-processing}, }
This work is supported by the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the researchers and do not necessarily reflect the views of the National Science Foundation.