Hal Daumé III - about me

You enter a dark forest. Standing in front of you is:

A professor named Hal Daumé III (he/him). He wields appointments in Computer Science where he is a Volpi-Cupal Professor, as well as Language Science at UMD where he leads the TRAILS, the Institute for Trustworthy AI in Law & Society (in Spring 2024 he's teaching a gen-ed course You and I, and Generative AI; past: Trustworthy ML (F23), AI (S23), Human-AI Interaction (F22), Just ML (F21)); he was formerly also a Senior Principal Researcher at Microsoft Research NYC. He and his wonderful advisees like to study questions related to how to get machines to becomes more adept at human language (and artificial intelligence tasks more broadly), by developing models and algorithms that allow them to learn from data. (Keywords: natural language processing and machine learning.) The two major questions that really drive their research these days are:

    (1) how can we get computers to learn
        through natural interaction with people/users?

and (2) how can we do this in a way that minimize harms
        in the learned models?

He's discussed interactive learning informally in a Talking Machines Podcast and more technically in recent talks; and has discussed fairness/bias in broad terms in a (now somewhat outdated) blog post. He is the author of the online textbook A Course in Machine Learning, which is fully open source.

Hal is super fortunate to be a member of, and have awesome colleagues in the Computional Linguistics and Information Processing Lab (which he formerly directed), the Human-Computer Interaction Lab, and the Center for Machine Learning. If you want to contact him, email is your best bet; you can also find him on @haldaume3 on Twitter. Or, in person, in his office (IRB 4134).

If you're a prospective grad student or grad applicant, please read his FAQ to answer some common questions. If you're thinking of inviting him for a talk or event, please ensure that the event is organized in an inclusive manner (inclusion rider). More generally, if you are organizing a conference, workshop or other event, you may wish to read the NeurIPS D&I survey results (joint with Katherine Heller), Humberto Corona's collection of resources/advice, or two blog posts on this topic.

I acknowledge that I live and work on the ancestral and unceded lands of the Piscataway People, who were among the first in the Western Hemisphere to encounter European colonists, as well as the lands of the Lenape and Nacotchtank people.

Recent Publications:

Causal Effect of Group Diversity on Redundancy and Coverage in Peer-Reviewing
Navita Goyal, Ivan Stelmakh, Nihar Shah and Hal Daumé III
arxiv, 2024
[Abstract] [BibTeX]

A large host of scientific journals and conferences solicit peer reviews from multiple reviewers for the same submission, aiming to gather a broader range of perspectives and mitigate individual biases. In this work, we reflect on the role of diversity in the slate of reviewers assigned to evaluate a submitted paper as a factor in diversifying perspectives and improving the utility of the peer-review process. We propose two measures for assessing review utility: review coverage -- reviews should cover most contents of the paper -- and review redundancy -- reviews should add information not already present in other reviews. We hypothesize that reviews from diverse reviewers will exhibit high coverage and low redundancy. We conduct a causal study of different measures of reviewer diversity on review coverage and redundancy using observational data from a peer-reviewed conference with approximately 5,000 submitted papers. Our study reveals disparate effects of different diversity measures on review coverage and redundancy. Our study finds that assigning a group of reviewers that are topically diverse, have different seniority levels, or have distinct publication networks leads to broader coverage of the paper or review criteria, but we find no evidence of an increase in coverage for reviewer slates with reviewers from diverse organizations or geographical locations. Reviewers from different organizations, seniority levels, topics, or publications networks (all except geographical diversity) lead to a decrease in redundancy in reviews. Furthermore, publication network-based diversity alone also helps bring in varying perspectives (that is, low redundancy), even within specific review criteria. Our study adopts a group decision-making perspective for reviewer assignments in peer review and suggests dimensions of diversity that can help guide the reviewer assignment process.

@inproceedings{daume24diversity, title = {Causal Effect of Group Diversity on Redundancy and Coverage in Peer-Reviewing}, author = {Navita Goyal and Ivan Stelmakh and Nihar Shah and Daum\'e, III, Hal}, booktitle = {arxiv}, year = {2024}, url = {http://hal3.name/docs/#daume24diversity}, }

Understanding the Impacts of Language Technologies’ Performance Disparities on African American Language Speakers
Jay Cunningham, Su Lin Blodgett, Michael Madaio, Hal Daumé III, Christina Harrington and Hanna Wallach
Conference of the Association for Computational Linguistics (ACL), 2024
[Abstract] [BibTeX]

This paper examines the experiences of African American Language (AAL) speakers when using language technologies. Previous work has used quantitative methods to uncover performance disparities between AAL speakers and White Mainstream English speakers when using language technologies, but has not sought to understand the impacts of these performance disparities on AAL speakers. Through interviews with 19 AAL speakers, we focus on understanding such impacts in a contextualized and human-centered manner. We find that AAL speakers often undertake invisible labor of adapting their speech patterns to successfully use language technologies, and they make connections between failures of language technologies for AAL speakers and a lack of inclusion of AAL speakers in language technology design processes and datasets. Our findings suggest that NLP researchers and practitioners should invest in developing contextualized and human-centered evaluations of language technologies that seek to understand the impacts of performance disparities on speakers of underrepresented languages and language varieties.

@inproceedings{daume24aal, title = {Understanding the Impacts of Language Technologies’ Performance Disparities on African American Language Speakers}, author = {Jay Cunningham and Su Lin Blodgett and Michael Madaio and Daum\'e, III, Hal and Christina Harrington and Hanna Wallach}, booktitle = {Proceedings of the Conference of the Association for Computational Linguistics (ACL)}, year = {2024}, url = {http://hal3.name/docs/#daume24aal}, }

ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles
Kayo Yin, Chinmay Singh, Fyodor O Minakov, Vanessa Milan, Hal Daumé III, Cyril Zhang, Alex X. Lu and Danielle Bragg
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
[Abstract] [BibTeX]

Deaf and hard-of-hearing (DHH) students face significant barriers in accessing science, tech- nology, engineering, and mathematics (STEM) education, notably due to the scarcity of STEM resources in signed languages. To help ad- dress this, we introduce ASL STEM Wiki: a parallel corpus of 254 Wikipedia articles on STEM topics in English, interpreted into over 300 hours of American Sign Language (ASL). ASL STEM Wiki is the first continuous sign- ing dataset focused on STEM, facilitating the development of AI resources for STEM edu- cation in ASL. We identify several use cases of ASL STEM Wiki with human-centered ap- plications. For example, because this dataset highlights the frequent use of fingerspelling for technical concepts, which inhibits DHH stu- dents’ ability to learn, we develop models to identify fingerspelled words—which can later be used to query for appropriate ASL signs to suggest to interpreters.

@inproceedings{daume24aslstem, title = {ASL STEM Wiki: Dataset and Benchmark for Interpreting STEM Articles}, author = {Kayo Yin and Chinmay Singh and Fyodor O Minakov and Vanessa Milan and Daum\'e, III, Hal and Cyril Zhang and Alex X. Lu and Danielle Bragg}, booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2024}, url = {http://hal3.name/docs/#daume24aslstem}, }

Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
Maharshi Gor, Hal Daumé III, Tianyi Zhou and Jordan Boyd-Graber
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
[Abstract] [BibTeX]

Recent advancements of large language models (LLMs) have led to claims of AI surpassing humans in natural language processing (NLP) tasks such as textual understanding and reasoning. This work investigates these assertions by introducing CAIMIRA, a novel framework rooted in item response theory (IRT) that enables quantitative assessment and comparison of problem-solving abilities of question-answering (QA) agents: humans and AI systems. Through analysis of over 300,000 responses from ~70 AI systems and 155 humans across thousands of quiz questions, CAIMIRA uncovers distinct proficiency patterns in knowledge domains and reasoning skills. Humans outperform AI systems in knowledge-grounded abductive and conceptual reasoning, while state-of-the-art LLMs like GPT-4 and LLaMA show superior performance on targeted information retrieval and fact-based reasoning, particularly when information gaps are well-defined and addressable through pattern matching or data retrieval. These findings highlight the need for future QA tasks to focus on questions that challenge not only higher-order reasoning and scientific thinking, but also demand nuanced linguistic interpretation and cross-contextual knowledge application, helping advance AI developments that better emulate or complement human cognitive abilities in real-world problem-solving.

@inproceedings{daume24caimira, title = {Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA}, author = {Maharshi Gor and Daum\'e, III, Hal and Tianyi Zhou and Jordan Boyd-Graber}, booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2024},

HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models
Huy Nghiem and Hal Daumé III
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
[Abstract] [BibTeX]

The widespread use of social media necessitates reliable and efficient detection of offensive content to mitigate harmful effects. Although sophisticated models perform well on individual datasets, they often fail to generalize due to varying definitions and labeling of "offensive content." In this paper, we introduce HateCOT, an English dataset with over 52,000 samples from diverse sources, featuring explanations generated by GPT-3.5-Turbo and curated by humans. We demonstrate that pretraining on HateCOT significantly enhances the performance of open-source Large Language Models on three benchmark datasets for offensive content detection in both zero-shot and fewshot settings, despite differences in domain and task. Additionally, HateCOT facilitates effective K-shot fine-tuning of LLMs with limited data and improves the quality of their explanations, as confirmed by our human evaluation. Our repository is available at https: //github.com/hnghiem-usc/hatecot .

@inproceedings{daume24hate, title = {HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models}, author = {Huy Nghiem and Daum\'e, III, Hal}, booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)}, year = {2024}, url = {http://hal3.name/docs/#daume24hate}, }

More papers please!

Recent Talks:

AI UK: Doing better in data science – from algorithmic fairness to diversity
Anjali Mazumder, Shakir Mohamed, Danielle Belgrave, Maria De-Arteaga, and Hal Daumé III
The Alan Turing Institute AI UK Roadmap, March 2021
[Video]

Coded Bias Panel Discussion at the University of Maryland
Margrét Bjarnadóttir, Nicol Turner Lee, Deborah Raji, Adam Wenchel, and Hal Daumé III (moderator)
March, 2021
[Video]

Responsible AI Systems and Experiences
Abolfazl Asudeh (moderator), Hal Daumé III, Golnoosh Farnadi, Bernease Herman, Bill Howe (moderator), Yuval Moskovitch, Katie Shilton, and Jenn Wortman Vaughan
Panel at VLDB 2021
[Video]

Tech Ethics in a Changing World
Catherine Bannister, Mary Lacity, Cindy Moehring, and Hal Daumé III
Northwest Arkansas Tech Summit, 2021
[Video]

Language (Technology) Is Power: Exploring the Inherent Complexity of NLP Systems
Hal Daumé III and Sam Charrington (host)
TWIML AI Podcast, 2020
[Video]

More talks please!

Contact information:

    email: me AT hal3 DOT name               skype: haldaume3
    phone: 301-405-1073                    twitter: haldaume3
   office: IRB 4150                         github: hal3

I can't reply to all prospective students email; please read this before emailing me.

credits: design and font inspired by Seth Able's LoRD, some images converted to ANSI using ManyTools, original drawing of me by anonymous.
last updated on nineteen november, two thousand twenty four.