Machine Translation
François Yvon
  • LAST REVIEWED: 30 August 2022
  • LAST MODIFIED: 13 January 2014
  • DOI: 10.1093/obo/9780199772810-0170


Machine translation (MT) is an interdisciplinary scientific field that brings together linguists, lexicologists, computer scientists, and translation practitioners in pursuit of a common goal: to design and develop electronic resources and computer software capable of automatically translating a document in a source language (SL) into an equivalent text in a target language (TL). By extension, machine translation technologies also include computer-assisted translation (CAT) tools, which help human translators perform their work more efficiently. Machine translation started in the 1950s with attempts to automatically translate Russian into English. Realization of the extreme difficulty of the task led the MT community to concentrate its efforts on more focused and realistic problems, giving rise to the field of natural language processing (NLP). MT was thus broken down into three main subproblems: analyzing the SL text into a more abstract representation, transferring this representation into an equivalent target representation, and, finally, generating a proper surface realization in the TL. Capitalizing on advances in applied NLP and artificial intelligence, MT made slow progress over the next thirty years, using mostly symbolic models of language processing to accomplish the analysis, transfer, and generation processes. In spite of several remarkable achievements, these models were challenged in the 1980s by corpus-based methodologies, which rely on the analysis of large bodies of manually translated bitexts to generate translations of new documents. In particular, the statistical approaches to machine translation introduced in the early 1990s, and improved over the following decade, have rapidly gained momentum. Relying on the systematic exploitation of huge corpora of monolingual texts and multilingual bitexts available on the Internet, these approaches appear to be the most effective today for a wide variety of uses. Statistical approaches can handle almost any language pair, provided that sufficient parallel corpora are available. Most studies, nonetheless, focus on machine translation into English.


Very few textbooks deal solely with machine translation. Although Jurafsky and Martin 2009 contains only a concise introduction to the topic, the volume as a whole provides an in-depth exposition of the conceptual background necessary to understand the vast literature on MT. Koehn 2010 focuses exclusively on statistical approaches, whereas Hutchins and Somers 1992 and Arnold, et al. 1994 are classic references documenting early rule-based approaches to MT.

  • Arnold, Douglas J., Lorna Balkan, Siety Meijer, R. Lee Humphreys, and Louisa Sadler. 1994. Machine translation: An introductory guide. Manchester, UK: Blackwells NCC.

    Similar in scope to Hutchins and Somers 1992 but with a less technical perspective, this book is a good choice for a more general audience. Also available online.

  • Hutchins, W. John, and Harold L. Somers. 1992. An introduction to machine translation. London: Academic Press.

    A basic course book covering all topics related to the design and development of MT systems, from the linguistic problems to the detailed analysis of some prototypical rule-based engines. Does not cover corpus-based methodology.

  • Jurafsky, Daniel, and James H. Martin. 2009. Speech and language processing: An introduction to natural language processing, speech recognition, and computational linguistics. 2d ed. Upper Saddle River, NJ: Pearson Prentice Hall.

    This general-purpose NLP textbook covers a very large spectrum of topics. It notably includes (pp. 799–830) a rather broad, nontechnical introduction to MT and to translation problems from a computational linguistics perspective, with many valuable references.

  • Koehn, Philipp. 2010. Statistical machine translation. Cambridge, UK: Cambridge Univ. Press.

    The most comprehensive reference textbook on statistical machine translation, including many recent extensions of the statistical framework.
