In This Article: Machine Translation

  • Introduction
  • Conferences and Workshops
  • Journals
  • History
  • Corpora

Machine Translation
by François Yvon
  • LAST REVIEWED: 30 August 2022
  • LAST MODIFIED: 24 April 2023
  • DOI: 10.1093/obo/9780199772810-0170

Introduction

Machine translation (MT) is an interdisciplinary scientific field that brings together linguists, lexicologists, computer scientists, and translation practitioners in the pursuit of a common goal: to design and develop electronic resources and computer software capable of automatically translating a document in a source language (SL) into an equivalent text in a target language (TL). By extension, machine translation technologies also include tools aimed at helping human translators perform their work more efficiently using computer-assisted translation (CAT) technology. Machine translation started in the late 1950s with attempts to automatically translate Russian into English. Realization of the extreme difficulty of the task led the MT community to concentrate its efforts on more focused and realistic problems, giving rise to the field of natural language processing (NLP). MT was thus broken down into three main sub-problems: analyzing the SL text into a more abstract representation, transferring this representation into an equivalent target representation, and, finally, generating a proper surface realization in the TL. Capitalizing on progress in applied NLP and artificial intelligence, MT advanced slowly over the next thirty years, using mostly symbolic models of language processing to accomplish the analysis, transfer, and generation steps. Despite several remarkable achievements, these models were challenged in the 1980s by corpus-based methodologies, which rely on the analysis of large bodies of manually translated bitexts to generate translations of new documents. In particular, the statistical approaches to machine translation introduced in the early 1990s, and subsequently improved during the next decade, rapidly gained momentum. Since around 2014, statistical approaches have been superseded by more powerful machine learning techniques based on artificial neural networks. Relying on the systematic exploitation of huge corpora of monolingual texts and multilingual bitexts available on the Internet, "neural machine translation" appears to be the most effective approach today for a wide variety of uses. Neural approaches can handle almost any language pair, provided that sufficient parallel corpora are available. A remarkable recent evolution is the development of multilingual translation models that are able to handle multiple translation directions in a single system.
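
To make the neural, corpus-based paradigm sketched above concrete, the short Python example below translates one English sentence with a publicly available pretrained neural model. The library (Hugging Face transformers) and the model identifier (Helsinki-NLP/opus-mt-en-fr) are illustrative choices made for this sketch, not systems discussed in this article; any pretrained translation model could be substituted.

    # Minimal sketch of neural machine translation in practice, assuming the
    # Hugging Face "transformers" library is installed and the example model
    # "Helsinki-NLP/opus-mt-en-fr" (English -> French) can be downloaded.
    from transformers import pipeline

    # Load a pretrained encoder-decoder model trained on parallel corpora (bitexts).
    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")

    # Translate a source-language (English) sentence into the target language (French).
    result = translator("Machine translation converts a source text into an equivalent target text.")
    print(result[0]["translation_text"])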

Textbooks

Very few textbooks deal solely with machine translation. The most up-to-date textbook is Koehn 2020, which contains a full exposition of neural MT. Koehn 2010 focuses solely on statistical approaches, whereas Hutchins and Somers 1992 and Arnold, et al. 1994 are classical references documenting early rule-based approaches to MT. General-purpose natural language processing (NLP) textbooks also include concise presentations of MT; this is notably the case for Eisenstein 2019 and for the latest edition of Jurafsky and Martin 2009. Together, these volumes contain an in-depth exposition of the entire conceptual background necessary to understand the vast literature on MT.

  • Arnold, Douglas J., Lorna Balkan, Siety Meijer, R. Lee Humphreys, and Louisa Sadler. 1994. Machine translation: An introductory guide. Manchester, UK: NCC Blackwell.

    Similar in scope to Hutchins and Somers 1992 with a less technical perspective, this book makes a good choice for a more general audience.

  • Eisenstein, Jacob. 2019. An introduction to natural language processing. Cambridge, MA: MIT Press.

    A modern introduction to the field of language processing targeting computer scientists, with a detailed presentation of machine learning techniques and models. A dedicated chapter (18) discusses machine translation models.

  • Hutchins, W. John, and Harold L. Somers. 1992. An introduction to machine translation. London: Academic Press.

    A basic course book covering all topics related to the design and development of MT systems, from linguistic problems to detailed analysis of some prototypical rule-based engines. Somewhat outdated, as it does not cover corpus-based methodologies.

  • Jurafsky, Daniel, and James H. Martin. 2009. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. 3d ed. Upper Saddle River, NJ: Pearson Prentice Hall.

    This general-purpose NLP textbook covers a very large spectrum of topics. The third revision includes a major rewrite of the machine translation chapter, which integrates recent advances in the field (chapter 10). It also includes an introduction to translation problems from a computational linguistics perspective with many valuable references.

  • Koehn, Philipp. 2010. Statistical machine translation. Cambridge, UK: Cambridge Univ. Press.

    The most comprehensive reference textbook on statistical machine translation, including many extensions of the statistical framework.

  • Koehn, Philipp. 2020. Neural machine translation. Cambridge, UK: Cambridge Univ. Press.

    DOI: 10.1017/9781108608480

    This recent textbook documents the latest technical evolutions of machine translation and provides readers with a complete overview of the neural machine translation paradigm.
