Some Restrictions of Periodicity in Strings


Simon Puglisi Department of Computing, Curtin University of Technology, Perth, Australia

Bill Smyth
Department of Computer Science, McMaster University, Canada.

Andrew Turpin
School of Computer Science and Information Technology, RMIT University, Melbourne, Australia.


Status

Proc. Australasian Workshop on Combinatorial Algorithms, September 2005.

Abstract

Given a string $\s{x}=\s{x}[1..n]$, a \itbf{repetition} of period $p$ in \s{x} is a substring $\s{u}^r = \s{x}[i..i\+ rp\- 1]$, $p = |\s{u}|$, $r \ge 2$, where neither $\s{u} = \s{x}[i..i\+ p\- 1]$ nor $\s{x}[i..i\+ (r\+ 1)p\- 1]$ is a repetition. The maximum number of repetitions in any string \s{x} is well known to be $\Theta(n\log n)$. A \itbf{run} or \itbf{maximal periodicity} of period $p$ in \s{x} is a substring $\s{u}^r\s{t} = \s{x}[i..i\+ rp\+ |\s{t}|\- 1]$ of \s{x}, where $\s{u}^r$ is a repetition, \s{t} a proper prefix of \s{u}, and no repetition of period $p$ begins at position $i\- 1$ of \s{x} or ends at position $i\+ rp\+ |\s{t}|$. In 2000 Kolpakov \& Kucherov showed that the maximum number $\rho(n)$ of runs in any string \s{x} is $O(n)$, but their proof was nonconstructive and provided no specific constant of proportionality. At the same time, they presented experimental data strongly suggesting that $\rho(n) < n$. %Related work by Fraenkel \& Simpson shows that the maximum %number $\sigma(n)$ of {\it distinct} squares in any string \s{x} %satisfies $\sigma(n) < 2n$, %while experiment again encourages the belief that in fact $\sigma(n) < n$. Recently, Fan et al.(``A new periodicity lemma'', {\it Sixteenth \CPM}, 2005) took a first step toward proving these conjectures, by presenting results that establish limitations on the number of squares of a specified range of periods that can occur over a specified range of positions in \s{x}. In this paper, we further tighten these restrictions by showing how the existence of two squares $\s{u}$ and $\s{v}$ (\s{v} longer than \s{u}) at the same position $i$ in $\s{x}$ limits the occurrence of smaller squares with period $w\in (|\s{v}|-|\s{u}|, |\s{u}|)$ in the neighborhood around $i$.