In formal language theory and computer science, a **substring** is a contiguous sequence of characters within a string. For instance, "*the best of*" is a substring of "*It was the best of times*". This is not to be confused with subsequence, which is a generalization of substring. For example, "*Itwastimes*" is a subsequence of "*It was the best of times*", but not a substring.

**Prefixes** and **suffixes** are special cases of substrings. A prefix of a string is a substring of that occurs at the beginning of ; likewise, a suffix of a string is a substring that occurs at the end of .

The list of all substrings of the string "*apple*" would be "*apple*", "*appl*", "*pple*", "*app*", "*ppl*", "*ple*", "*ap*", "*pp*", "*pl*", "*le*", "*a*", "*p*", "*l*", "*e*", "" (note the empty string at the end).

## Substring

A string is a substring (or factor)^{[1]} of a string if there exists two strings and such that . In particular, the empty string is a substring of every string.

Example: The string `ana`

is equal to substrings (and subsequences) of `banana`

at two different offsets:

banana ||||| ana|| ||| ana

The first occurrence is obtained with `b`

and `na`

, while the second occurrence is obtained with `ban`

and being the empty string.

A substring of a string is a prefix of a suffix of the string, and equivalently a suffix of a prefix; for example, `nan`

is a prefix of `nana`

, which is in turn a suffix of `banana`

. If is a substring of , it is also a subsequence, which is a more general concept. The occurrences of a given pattern in a given string can be found with a string searching algorithm. Finding the longest string which is equal to a substring of two or more strings is known as the longest common substring problem.
In the mathematical literature, substrings are also called **subwords** (in America) or **factors** (in Europe).^{[citation needed]}

## Prefix

A string is a prefix^{[1]} of a string if there exists a string such that . A *proper prefix* of a string is not equal to the string itself;^{[2]} some sources^{[3]} in addition restrict a proper prefix to be non-empty. A prefix can be seen as a special case of a substring.

Example: The string `ban`

is equal to a prefix (and substring and subsequence) of the string `banana`

:

banana ||| ban

The square subset symbol is sometimes used to indicate a prefix, so that denotes that is a prefix of . This defines a binary relation on strings, called the prefix relation, which is a particular kind of prefix order.

## Suffix

A string is a suffix^{[1]} of a string if there exists a string such that . A *proper suffix* of a string is not equal to the string itself. A more restricted interpretation is that it is also not empty^{[1]}. A suffix can be seen as a special case of a substring.

Example: The string `nana`

is equal to a suffix (and substring and subsequence) of the string `banana`

:

banana |||| nana

A suffix tree for a string is a trie data structure that represents all of its suffixes. Suffix trees have large numbers of applications in string algorithms. The suffix array is a simplified version of this data structure that lists the start positions of the suffixes in alphabetically sorted order; it has many of the same applications.

## Border

A border is suffix and prefix of the same string, e.g. "bab" is a border of "babab" (and also of "babooneatingakebab").

## Superstring

A **superstring** of a finite set of strings is a single string that contains every string in as a substring. For example, is a superstring of , and is a shorter one. Generally, one is interested in finding superstrings whose length is as small as possible;^{[clarification needed]} a concatenation of all strings of in any order gives a trivial superstring of . A string that contains every possible permutation of a specified character set is called a superpermutation.

## See also

## References

- ^
^{a}^{b}^{c}Lothaire, M. (1997).*Combinatorics on words*. Cambridge: Cambridge University Press. ISBN 0-521-59924-5. **^**Kelley, Dean (1995).*Automata and Formal Languages: An Introduction*. London: Prentice-Hall International. ISBN 0-13-497777-7.**^**Gusfield, Dan (1999) [1997].*Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology*. USA: Cambridge University Press. ISBN 0-521-58519-8.