To Homepage of Tero Harju

Topics in

Combinatorics on Words


Combinatorics on words studies general properties of discrete sequences over alphabets, i.e., it studies combinatorial aspects of free semigroups. The tools come from many parts of mathematics. The motivation and applications of this theory are manyfold, since all discrete chains of events can be represented as words. Combinatorics on words has a natural connection to algebra through groups and semigroups.

The basic resources for the topic are the two books of Lothaire:

  • M. Lothaire, Combinatorics on Words,
    Addison-Wesley, 1983

  • M. Lothaire, Algebraic Combinatorics on Words,
    Cambridge University Press, 2002

For introduction and problems, see
  • J. Karhumäki, Combinatorics on Words: A New Challenging Topic (TUCS report TR645.pdf)

*Topics*

(1) Ehrenfeucht's Conjecture: Every system of word equations has a finite equivalent subsystem.

Systems of equations is a classical topic in commutative algebra. In combinatorics on words this topic is partly motivated by the previous conjecture from the early 1970's which was solved by Albert, Lawrence, and Guba in 1985. The theorem is a noncommutative version of Hilbert's basis theorem. In the following articles the problem is studied in the setting of general semigroups:

A survey of the topic is given in

  • Tero Harju, Juhani Karhumäki and Wojtek Plandowski
    Independent Systems of Equations
    Chapter 13 in M. Lothaire, Algebraic Combinatorics on Words
    (J. Berstel and D. Perrin, eds.), Cambridge University Press, 2002, pages 443 - 472.

(2) Guibas-Odlyzko Theorem: For every word w, there exists a binary word w' having exactly the same periods as w. (Especially w' has the same length as w.)

For a short proof, see

  • Vesa Halava, Tero Harju and Lucian Ilie
    Periods and binary words (TUCS report TR213.ps.gz)
    J. Combinat. Theory Ser. A, 89 (2000), 298 - 303.

(3) Duvals' Conjecture: Let w be a word of length n. Then the length of a nontrivial Duval's extension wu of w satisfies |wu| < 2n-1.

A Duval extension of an unbordered word w of length n is a word wu which does not have unbordered factors of length greater n. Such an extension is trivial if u is a prefix of w. The topic was initiated by Ehrenfeucht and Silberger in 1979 and by Duval in 1982. The following paper proves this conjecture in a stronger version giving a strict relationship between periods and unbordered factors of words.

(4) Border Correlation: The words with maximal number of unbordered conjugates coincide with the cyclically overlap-free words.

This is a starting point for the border correlation function which specifies which conjugates of a word are bordered.

  • Tero Harju and Dirk Nowotka
    Border correlation of binary words
    J. Combin. Theory A 108 (2004), 331-341.
    doi:10.1016/j.jcta.2004.07.009

    The sequence of number of images of the border correlation function for binary words obtained a place in Sloane's On-Line Encyclopedia of Integer Sequences:
    1,2,4,7,11,18,29,47,76,121,199,310,521,841,1364,2207,3571, ... Item A091838
    The odd positions seem to be related to the Lucas numbers, but for the even positions we have no interpretation. (The first "exceptional case" is n=12.)

(5) Critical Factorization Theorem: Each word has a critical point, around which the minimum repetition has length equal to the period of the word.

This is one of the basic results in combinatorics on words. It was first proved by Cesari and Vincent (1978), and in the present form it is due to Duval (1979). A short proof based on Crochemore and Perrin (1991) is given in the following article, where also density of critical points is studied.

  • Tero Harju and Dirk Nowotka
    On the density of critical factorizations
    Theoretical Informatics and Applications 36 (2002), 315 - 327.
    doi:10.1051/ita:2002016

(6) Defect Theorem: If n words satisfies a nontrivial relation, then these words can be expressed using n-1 simpler words.

Defect theorem belongs to folklore, and it is one of the corner stones of combinatorics on words. Improvements and generalizations of this result are proved in

  • Tero Harju and Juhani Karhumäki
    On the defect theorem and simplifiability,
    Semigroup Forum 33 (1986), 199 - 217.

To Homepage of Tero Harju