Statistical machine translation (SMT) systems perform poorly when applied to new target domains. This degradation in quality can be as much as 1⁄3 of the original system’s performance. We conducted a summer workshop at JHU in 2012 to better understand and address the issues that arise in domain adaptation for MT. This page will host both data and papers related to this enterprise.




Data and Code


Many thanks to the entire DAMT team, plus George Foster (NRC) for his expertise, Dragos Munteanu (Language Weaver) for initial brainstorming, the whole JHU team (especially Sanjeev Khudanpur) for making the workshop happen, and the various funders who contributed to this work (including Google, ODNI, NSF, DARPA).

If you make use of any of this data, we only ask that you acknowledge us by citing:

    author = {Marine Carpuat and Hal Daum\'e III and Alexander Fraser and Chris Quirk and Fabienne Braune
              and Ann Clifton and Ann Irvine and Jagadeesh Jagarlamudi and John Morgan and Majid Razmara
              and Ale\v{s} Tamchyna and Katharine Henry and Rachel Rudinger},
    title = {Domain Adaptation in Machine Translation: Final Report},
    booktitle = {2012 Johns Hopkins Summer Workshop Final Report},
    year = {2012},
    url = {}
You can read the final report if you like!