To Homepage of Tero Harju

Shortly on

DNA: combinatorics and computing


In 1982 Nobel prize winner Richard Feynman suggested that computers might be essentially more powerful if they were based on nanoscale or even quantum scale components. In this line of research alternative models of computation are inspired by phenomena in biology and physics. There are several different principles underlying modelling of natural systems; e.g., molecular computing is based on aspects of recombination in molecular biology, and quantum computing exploits quantum parallelism. In these topics, new methodological perspectives are conducted: ultimately molecular computing aims at alternatives for silicon hardware by implementing algorithms in biological hardware, and quantum computing aims at hardware allowing quantum effects.

The main topics of the research group in the field of DNA computing concentrates mostly on modelling of recombination processes of genomic DNA.

Mathematical problems on DNA computing

DNA computing is a novel area at the crossroad of mathematics, computer science and molecular biology which is aimed to use processes in molecular biology for computational purposes. The advantage of this line of technology can be seen from the fact that a single gram of DNA can hold more information than 1011 compact disks. This field of research is at an early stage in its development, and thus requires new insights from both theoretical and practical points of view. The main ingredient of DNA computing is its potential for massive parallelism when compared to more traditional models of computation.

Problems in DNA based computing were initiated by Tom Head (Binghamton, N.Y.) in automata theory in the mid 1980's, when he introduced his splicing systems as a model for genetic recombination. This problem area was made more widely known by Leonard Adleman (U. Southern California) in the beginning of 1994 using a more computationally oriented approach.

The main problem proposed by T. Head, regularity problem of the splicing systems, was solved positively by Tero Harju together with K. Culik II (U. South Carolina) in 1991.

  • K. Culik and T. Harju, Splicing semigroups of dominoes and DNA,
    Discrete Appl. Math. 31 (1991), 261 - 277

    ICALP-version: Dominoes and the regularity of DNA splicing languages,
    (ICALP'89, Stresa, Italy), LNCS 372 (1989), 222 - 233.

Together with A. Ehrenfeucht (Boulder) I. Petre (Abo Akademi, Turku), D.M. Prescott (Boulder) and G. Rozenberg (Leiden), we have been working on modelling molecular recombination operations using combinatorial and computational tools. These operations are motivated by the acrobatic genetic reassembly processes that happen in unicellular organisms called Stichotrichs ciliates. The reassembling process transforms a defragmented and shuffled inactive micronuclear genome to a functional macronuclear genome of the expressible genes. The mathematical tools for this modelling process include permutations, strings and graphs. All these tools can be used to represent genomes and their recombination operations. The advantage of this variety of tools is that they provide different levels of abstraction to the theory, and some problems are easier to handle in one or the other of these frameworks. For instance, graphs are often good for proofs although their connection to genomes is not as clear as of the other representations.

  • A. Ehrenfeucht, T. Harju, I. Petre, D. M. Prescott, and G. Rozenberg
    ''Computation in Living Cells. Gene Assembly in Ciliates''
    Springer-Verlag, 2004 (201 pages).
    Book on formal models for gene assembly in ciliates.
    See Preface and References

To Homepage of Tero Harju