Journal article
Top Tree Compression of Tries
We present a compressed representation of tries based on top tree compression [ICALP 2013] that works on a standard, comparison-based, pointer machine model of computation and supports efficient prefix search queries. Namely, we show how to preprocess a set of strings of total length n over an alphabet of size σ into a compressed data structure of worst-case optimal size O(n/ log σn) that given a pattern string P of length m determines if P is a prefix of one of the strings in time O(min (mlog σ, m+ log n)).
We show that this query time is in fact optimal regardless of the size of the data structure. Existing solutions either use Ω (n) space or rely on word RAM techniques, such as tabulation, hashing, address arithmetic, or word-level parallelism, and hence do not work on a pointer machine. Our result is the first solution on a pointer machine that achieves worst-case o(n) space.
Along the way, we develop several interesting data structures that work on a pointer machine and are of independent interest. These include an optimal data structures for random access to a grammar-compressed string and an optimal data structure for a variant of the level ancestor problem.
Language: | English |
---|---|
Publisher: | Springer US |
Year: | 2021 |
Pages: | 3602-3628 |
ISSN: | 14320541 and 01784617 |
Types: | Journal article |
DOI: | 10.1007/s00453-021-00869-w |
ORCIDs: | Bille, Philip and Gørtz, Inge Li |