Research in Combinatorics on Words
by Vesa Halava

Updated November 2006

Relational Words

In combinatorics on words the partial words have been under wide study recently. A partial word is a word containing "holes" or "do not know symbols." Denote the hole by . Now a partial word can be seen as a word over the alphabet A {}, where A is a finite alphabet. Two partial words u, v are compatible if they are of equal length and in all the position where both u and v have a letter (not a hole) these letters are the same. The motivation for studying the partial words comes from the biological sequences such as DNA, RNA and proteins. The properties studied have been the usual properties of the word, such as repetiotions/periods, code properties for sets of words etc. In sequence comparison you align two sequences, for example genes, in order to find correspondence between them. This alignment corresponds to a construction of compatible partial words. Clearly also the word relations can be made to describe this phenomenon with specific similarity relations on symbols. Another important operation in molecular biology is DNA sequencing. There the role of partial words as well as word relations is to model the task of fragment assembly. We introduce gaps (indicated by ) in DNA pieces (addg, dgtgc, ccad) to let the nucleotides align perfectly. Partial words have also been considered in DNA computing as good solutions to DNA encodings.

The relational words are a generalization of the partial words, where instead of the hole relation we have any possible relation of the letters. Note that the partial words are a special case here, since the "diamond relation" R can be defined by aR and R a for all letters a A. (The relations in letters are defined to be symmetric and reflexive, not necessarily transitive)

The research in the relational words has been initialized in our recent manuscripts [1-3] (with T. Harju and T. Kärki). It is immediate that the motivation of the theory of the partial words motivates to study relational words also. In [1] defined the relational codes which are generalizations of codes in the combinatorics on words. Let R and S be two letter relations on the alphabet A. A set X of words in A* is a relational ( R,S) code, if for all n,m ³ 1, and x1, ..., xn, y1, ... ,ym we have

x1... xn R y1 ... ym m=n and xj S yj for all j=1,...,n.

In [2] we studied the defect effect on sets of relational words.

At the moment we study more the periodical properties of the relational words, in the spirit of the theorem of Fine and Wilf. In [3] we proved a Fine and Wilf like theorem for the case when we have a "regular" period and a global relational period on the word.
My Homepage

  1. Vesa Halava, Tero Harju and Tomi Kärki:
    Relational codes of words pdf file:
  2. Vesa Halava, Tero Harju and Tomi Kärki:
    Defect theorems with compatibility relations pdf file:
  3. Vesa Halava, Tero Harju and Tomi Kärki:
    The Theorem of Fine and Wilf for Relational Periods pdf file: