RI: EAGER: Collaborative Research: Adaptive Heads-up Displays for Simultaneous Interpretation

Project funded by the National Science Foundation: IIS-1748663 (UMD), IIS-1748642 (CMU)
PI: Graham Neubig, Carnegie Mellon University
PI: Hal Daumé III, University of Maryland
co-PI: Jordan Boyd-Graber, University of Maryland
co-PI: Leah Findlater, University of Washington

Overview

Interpretation, the task of translating speech from one language to another, is an important tool for facilitating communication in multilingual settings such as international meetings, travel, and diplomacy. However, simultaneous interpretation, in which the translation must be produced while the speaker is still speaking, is an extremely difficult task requiring a high level of experience and training. In particular, simultaneous interpreters often find certain content, such as technical terms, names of people and organizations, and numbers, especially hard to translate correctly. This EArly-concept Grant for Exploratory Research (EAGER) project aims to create automatic interpretation assistants that help interpreters with this difficult-to-translate content by recognizing it in the source language and displaying translations on a heads-up display (similar to a teleprompter) that interpreters can consult if they wish. The goal is to make simultaneous interpretation more effective and accessible, making conversations across languages and cultures more natural, more common, and more productive, and joining communities and cultures across the world in trade, cooperation, and friendship.

The major goal of the project is to examine methods for creating heads-up displays for simultaneous interpreters, providing real-time assistance with difficult-to-translate content. There are a number of goals for the project, including design, method development, and prototyping. These can be broken down into the following:

  1. Create offline translation assistants: Create static aids that convey useful information to interpreters, automating the process of creating “cheat sheets”: given a short description of the material to be interpreted, automatically build a lexicon specific to that domain. This includes discovering salient terms and finding translations for those terms (see the first sketch after this list).
  2. Create machine-in-the-loop translation assistants: Create a display that listens to the speaker (in the source language), and possibly to the interpreter as well, to help produce fluent translations. A key requirement for these interfaces is that they not overwhelm the interpreter with irrelevant material (see the second sketch after this list).
  3. Create methods for robust prediction: Noise manifests itself as MT errors when models are applied to bilingual text, or as ASR errors when models are applied to speech. In addition, the inherently sequential process of interpretation means the input is often incomplete, and models must be able to handle this (see the third sketch after this list).
  4. Learning from explicit and implicit feedback: To create models that learn when and how to give suggestions to interpreters, we need a training signal about which suggestions are appropriate in a particular interpretation context. We can ask interpreters using the system to give explicit feedback in real time, or examine whether implicit feedback can be gleaned from observing user behavior in a deployed system (see the fourth sketch after this list).
  5. Create an initial design and elicit interpreter feedback: Conduct participatory design sessions consisting of three components: (1) semi-structured interview questions on support needs during interpreting, (2) critique of mock-ups that explore a range of possible design elements (e.g., display type, size, and placement; type and amount of information displayed), and (3) an opportunity for participants to sketch or describe their own design enhancements.
  6. Evaluate the proposed interpretation interface: Deploy the system in a real interpretation setting and collect preliminary assessments of objective translation quality, the users’ subjective experience with the system, and cognitive load.
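A minimal sketch of the “cheat sheet” idea in goal 1, under simplifying assumptions: candidate terms in a short domain description are scored by how much more frequent they are than in general text, then paired with translations from a bilingual lexicon. The background statistics, lexicon, and scoring function here are illustrative placeholders, not the project’s actual models.

    # Hypothetical cheat-sheet builder: salient-term discovery plus lexicon lookup.
    from collections import Counter
    import math

    def salient_terms(domain_text, background_counts, background_total, top_k=5):
        """Rank words in the domain text by a simple log-odds salience score."""
        counts = Counter(domain_text.lower().split())
        total = sum(counts.values())
        scores = {}
        for word, count in counts.items():
            p_domain = count / total
            p_background = (background_counts.get(word, 0) + 1) / (background_total + 1)
            scores[word] = math.log(p_domain / p_background)
        return sorted(scores, key=scores.get, reverse=True)[:top_k]

    # Toy background statistics and bilingual lexicon (purely illustrative).
    background = Counter("the of and to in a is for on with".split() * 100)
    lexicon = {"arthroscopy": "artroscopia", "meniscus": "menisco", "ligament": "ligamento"}

    description = "Keynote on knee arthroscopy covering meniscus repair and ligament reconstruction"
    cheat_sheet = [(t, lexicon.get(t, "?"))
                   for t in salient_terms(description, background, sum(background.values()))]
    print(cheat_sheet)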
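A minimal sketch of the filtering behavior described in goal 2, assuming words arrive from a hypothetical speech recognizer: only terms found in a prepared glossary are surfaced, each term is shown at most once, and the on-screen list is capped so the interpreter is not overwhelmed. The glossary and word stream are toy data, not the project’s interface.

    # Hypothetical machine-in-the-loop display filter for incoming recognized words.
    from collections import deque

    class SuggestionFilter:
        def __init__(self, glossary, max_on_screen=3):
            self.glossary = glossary                    # term -> translation
            self.shown = set()                          # terms already displayed this session
            self.screen = deque(maxlen=max_on_screen)   # what is currently visible

        def process(self, word):
            """Return the current display after seeing one recognized word."""
            term = word.lower()
            if term in self.glossary and term not in self.shown:
                self.shown.add(term)
                self.screen.append((term, self.glossary[term]))
            return list(self.screen)

    glossary = {"arthroscopy": "artroscopia", "meniscus": "menisco", "ligament": "ligamento"}
    display = SuggestionFilter(glossary)
    for w in "the arthroscopy procedure repairs the meniscus and the ligament".split():
        print(w, "->", display.process(w))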
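A minimal sketch of the robustness idea in goal 3: augment training data by simulating the conditions a model will face at test time, namely ASR-style word errors and incomplete (prefix) input from the sequential nature of interpretation. The noise model here is a crude placeholder chosen for illustration.

    # Hypothetical data-augmentation helpers for noise-robust training.
    import random

    def simulate_asr_noise(words, error_rate=0.1):
        """Randomly drop or duplicate words to mimic recognition errors."""
        noisy = []
        for w in words:
            r = random.random()
            if r < error_rate / 2:
                continue                 # deletion error
            noisy.append(w)
            if r > 1 - error_rate / 2:
                noisy.append(w)          # insertion (repeated word) error
        return noisy

    def random_prefix(words):
        """Truncate a sentence to a random prefix, as seen mid-utterance."""
        cutoff = random.randint(1, len(words))
        return words[:cutoff]

    sentence = "the committee approved the budget for fiscal year twenty twenty".split()
    print(simulate_asr_noise(sentence))
    print(random_prefix(sentence))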
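A minimal sketch of the feedback-learning idea in goal 4: a logistic model that decides whether to show a suggestion is updated online from a reward signal, which could be explicit (a button press) or implicit (the interpreter actually used the suggested translation). The features and update rule are illustrative assumptions, not the project’s method.

    # Hypothetical online policy for deciding when to show a suggestion.
    import math

    class ShowSuggestionPolicy:
        def __init__(self, n_features, lr=0.1):
            self.w = [0.0] * n_features
            self.lr = lr

        def prob_show(self, features):
            z = sum(wi * xi for wi, xi in zip(self.w, features))
            return 1.0 / (1.0 + math.exp(-z))

        def update(self, features, shown, helpful):
            """Push the score up if a shown suggestion helped, down if it was ignored."""
            if not shown:
                return
            target = 1.0 if helpful else 0.0
            error = target - self.prob_show(features)
            self.w = [wi + self.lr * error * xi for wi, xi in zip(self.w, features)]

    # Features might encode term rarity, speaker speed, and time since the last suggestion.
    policy = ShowSuggestionPolicy(n_features=3)
    policy.update([1.0, 0.2, 0.5], shown=True, helpful=True)
    print(policy.prob_show([1.0, 0.2, 0.5]))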


Project Team

Jordan Boyd-Graber
Assistant Professor, Computer Science (Maryland)
Hal Daumé III
Professor, Computer Science (Maryland)
Leah Findlater
Associate Professor, Human Centered Design and Engineering (UW)
Alvin Grissom II
Assistant Professor, Computer Science (Ursinus)
Graham Neubig
Assistant Professor, Computer Science (CMU)
Wenyan Li
MS student, Electrical Engineering (Maryland)
Denis Peskov
Ph.D. student, Computer Science (Maryland)
Jo Shoemaker
Ph.D. student, Computer Science (Maryland)
Craig Stewart
MS student, Computer Science (CMU)
Nikolai Vogler
MS student, Computer Science (CMU)
Chen Zhao
Ph.D. student, Computer Science (Maryland)


Publications (Selected)

Software

Datasets

Media

Acknowledgments

This work is supported by the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the researchers and do not necessarily reflect the views of the National Science Foundation.