Finite State Languages
Mans Hulden
  • LAST MODIFIED: 27 March 2014
  • DOI: 10.1093/obo/9780199772810-0181

Introduction


A finite-state language (equivalently “regular language,” “type 3 language,” or “regular set”) belongs to the class of formal languages whose sentences can be generated or characterized by a number of different abstract devices, all of which are ultimately equivalent in generative capacity. These devices include type 3 generative grammars (regular grammars), regular expressions, finite automata, and read-only Turing machines. Essentially, almost any general model of computation that is restricted to a finite memory of predefined size falls into this class. The origins of finite-state machines lie in early abstract neuron models and theories of computation, and they were later found to be equivalent to type 3 generative grammars. Today, interest in finite-state models is vast and encompasses research in formal language theory, mathematics, linguistics, logic, engineering, and theoretical computer science.

Since the 1950s, finite-state languages have been investigated, and argued both for and against, as a potential model for capturing linguistic structure, particularly in the subdomains of syntax, morphology, and phonology. While finite-state models are often assumed to be too weak to capture syntactic structure, at least elegantly, they are now a mainstay of practical models of phonology and morphology in computational linguistics. Research into finite-state models of natural language continues because these models offer fruitful ways of approaching such matters as computational concerns, efficiency, learnability properties, and cognitive plausibility. Finite-state transducers, translation devices based on finite automata, are often categorized as “finite-state models” as well and are extensively used as generic devices for representing various linguistic translations, such as phonological alternation patterns. In more recent developments, finite-state models enhanced with probabilistic information have been used to manipulate statistical models of language, and these are now widely employed for practical tasks in written language and speech processing. The literature on the topic traditionally employs different notation and expository style depending on the venue, with linguistics, mathematics (including formal language theory), and computer science publications using slightly varying conventions.
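
The "finite memory" restriction described above can be made concrete with a small sketch. The example below is purely illustrative and is not drawn from any of the works cited here: the names `DFA` and `accepts`, and the choice of language (strings over {a, b} containing an even number of a's), are assumptions made for the sake of the example. The point is that the recognizer remembers nothing about its input beyond which of a fixed, finite set of states it is currently in.

```python
# Hypothetical example: a deterministic finite automaton (DFA) over {a, b}
# that accepts exactly the strings containing an even number of 'a's.
DFA = {
    "start": "even",
    "accepting": {"even"},
    "delta": {
        ("even", "a"): "odd",
        ("even", "b"): "even",
        ("odd", "a"): "even",
        ("odd", "b"): "odd",
    },
}


def accepts(dfa, word):
    """Run the DFA on word; the only memory used is the current state."""
    state = dfa["start"]
    for symbol in word:
        key = (state, symbol)
        if key not in dfa["delta"]:
            return False  # symbol outside the alphabet: reject
        state = dfa["delta"][key]
    return state in dfa["accepting"]
```

Under these assumptions, `accepts(DFA, "abba")` holds while `accepts(DFA, "ab")` does not. The same language could equally be written as a regular expression such as `(b*ab*a)*b*`, which is one instance of the equivalences the introduction describes.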

Foundational Works

The first formalization of what were later to be called finite automata is found in McCulloch and Pitts 1943, which presents what is essentially a neural network model. This model was investigated intensively in subsequent years, with Kleene 1956 providing a more modern interpretation and showing the equivalence of regular expressions and finite automata. Many interesting properties of finite-state languages were discovered during the 1950s, including the result that the expressive powers of different types of automata, regular expressions, and certain grammars are equivalent. Other discoveries from the same period include the notion that finite-state machines have canonical minimal representations (Moore 1956, Myhill 1957). Rabin and Scott 1959 introduces the influential concept of nondeterminism, while Chomsky 1959 places finite-state languages, generated by “type 3 grammars,” in what is now called the Chomsky hierarchy. The early analysis in Chomsky 1956, with its judgment that finite-state languages were unsuitable for describing natural language syntax, was very influential in the domain of linguistics. Thompson 1968 marks the beginning of extensive use of finite-state techniques in computational text search, a circumstance that would later influence the development of finite-state methods in computational linguistics.

  • Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions on Information Theory 2.3: 113–124.

    DOI: 10.1109/TIT.1956.1056813

    Presents one of the earliest arguments against the adequacy of finite-state models in capturing syntactic phenomena in language. The argument is essentially repeated in the seminal Syntactic Structures, published in 1957 (The Hague: Mouton).

  • Chomsky, Noam. 1959. On certain formal properties of grammars. Information and Control 2.2: 137–167.

    DOI: 10.1016/S0019-9958(59)90362-6

    An early analysis of the generative capacity of grammars in the Chomsky hierarchy.

  • Kleene, S. C. 1956. Representation of events in nerve nets and finite automata. In Automata studies. Edited by Claude E. Shannon and John McCarthy, 3–42. Annals of Mathematics Studies 34. Princeton, NJ: Princeton Univ. Press.

    An early central work that makes the leap from McCulloch-Pitts neuron abstractions to regular expressions and finite automata and analyzes the resulting algebraic properties (note that “regular event,” a term often used in the earlier literature, is synonymous with “regular language”). The equivalence of regular expressions and finite automata is now known as “Kleene’s theorem.”

  • McCulloch, Warren S., and Walter Pitts. 1943. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5.4: 115–133.

    DOI: 10.1007/BF02478259

    An influential article that presents a precursor of what are now called finite automata, in the form of a network of abstract “neuron” elements.

  • Moore, Edward F. 1956. Gedanken-experiments on sequential machines. In Automata studies. Edited by Claude E. Shannon and John McCarthy, 129–153. Annals of Mathematics Studies 34. Princeton, NJ: Princeton Univ. Press.

    An early paper that introduces the idea that every regular language is representable by a unique minimal automaton.

  • Myhill, John. 1957. Finite automata and the representation of events. Technical Report WADD TR-57-624. Dayton, OH: Wright Patterson Air Force Base.

    The first proof of the Myhill-Nerode theorem—a tight, formal characterization of regular languages.

  • Rabin, Michael O., and Dana Scott. 1959. Finite automata and their decision problems. IBM Journal of Research and Development 3.2: 114–125.

    DOI: 10.1147/rd.32.0114

    A landmark paper that, among other things, introduces the idea of nondeterministic machines and shows the equivalence of nondeterministic and deterministic finite automata, as well as the equivalence of two-way and one-way automata. Together, these results show that a large range of seemingly different descriptive devices all correspond to the same class of regular languages.

  • Thompson, Ken. 1968. Programming techniques: Regular expression search algorithm. Communications of the ACM 11.6: 419–422.

    DOI: 10.1145/363347.363387

    An influential paper that introduces regular expression text search.
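
The equivalence of nondeterministic and deterministic finite automata noted under Rabin and Scott 1959 is usually demonstrated with the subset (powerset) construction. The following is a minimal sketch of that textbook technique, not a reproduction of the original paper's presentation. It assumes a nondeterministic automaton without epsilon transitions, and the names `determinize`, `dfa_accepts`, and the example automaton (strings over {a, b} whose second-to-last symbol is a) are hypothetical, chosen only for illustration.

```python
from collections import deque


def determinize(nfa_delta, start, accepting):
    """Subset construction for an NFA without epsilon transitions.

    nfa_delta maps (state, symbol) to a set of successor states.
    Returns (dfa_delta, dfa_start, dfa_accepting), where each DFA state
    is a frozenset of NFA states.
    """
    alphabet = {symbol for (_, symbol) in nfa_delta}
    dfa_start = frozenset([start])
    dfa_delta = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        # A subset is accepting if it contains any accepting NFA state.
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in alphabet:
            # Union of all NFA moves from the states in the current subset.
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_delta, dfa_start, dfa_accepting


def dfa_accepts(dfa_delta, dfa_start, dfa_accepting, word):
    """Run the constructed DFA; assumes word uses only alphabet symbols."""
    state = dfa_start
    for symbol in word:
        state = dfa_delta[(state, symbol)]
    return state in dfa_accepting


# Hypothetical NFA: accepts strings whose second-to-last symbol is 'a'.
NFA_DELTA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
DFA_DELTA, DFA_START, DFA_ACCEPTING = determinize(NFA_DELTA, 0, {2})
```

Each state of the resulting deterministic machine stands for the set of states the nondeterministic machine could be in after reading the same input, so the two machines accept the same strings. The deterministic version may need more states, but the number is always finite, since a finite set has only finitely many subsets.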
