Suffix Array: What is a Suffix Array? A Powerful Tool for Solving String Problems

Suffix array is an array that stores the starting positions of suffixes sorted lexicographically. A suffix is a substring starting from each position in the string to the end (e.g., for "banana", the suffixes are "banana", "anana", etc.). The lexicographical comparison rule is: compare the first different character by its size; if the characters are the same, compare subsequent characters in order; if one suffix is a prefix of the other, the shorter one is considered smaller. Taking "abrac" as an example, the sorted suffix starting positions array is [0, 3, 4, 1, 2] (e.g., the suffix starting at position 0 "abrac" is less than the one at position 3 "ac", and they are arranged in this order). The core value of suffix arrays lies in efficiently solving string problems: by leveraging the close relationship (long common prefix length) between adjacent suffixes after sorting, it can quickly handle tasks such as finding the longest repeated substring and verifying substring existence. For example, the LCP array can be used to find the longest repeated substring, or binary search can verify if a substring exists. In summary, the suffix array provides an efficient solution for string problems by sorting suffix starting positions and is a practical tool for string processing.

Read More