Book chapter ยท Conference paper
Compressed Indexing with Signature Grammars
The compressed indexing problem is to preprocess a string S of length n into a compressed representation that supports pattern matching queries. That is, given a string P of length m report all occurrences of P in S.We present a data structure that supports pattern matching queries in O(m + occ(lg lg n + lg(epsilon) z)) time using O(z lg(n/z)) space where z is the size of the LZ77 parse of S and epsilon > 0 is an arbitrarily small constant, when the alphabet is small or z = O(n(1-delta)) for any constant delta > 0.
We also present two data structures for the general case; one where the space is increased by O(z lg lg z), and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if P occurs in S in O(m) time using O(z lg(n/z)) space.
Our results are mainly obtained by a novel combination of a randomized grammar construction algorithm with well known techniques relating pattern matching to 2D-range reporting.
Language: | English |
---|---|
Publisher: | Springer |
Year: | 2018 |
Pages: | 331-345 |
Proceedings: | 13th Latin American Theoretical INformatics Symposium |
Series: | Lecture Notes in Computer Science |
Journal subtitle: | 13th Latin American Symposium, Buenos Aires, Argentina, April 16-19, 2018, Proceedings |
ISBN: | 3319774034 , 3319774042 , 9783319774039 and 9783319774046 |
ISSN: | 16113349 and 03029743 |
Types: | Book chapter and Conference paper |
DOI: | 10.1007/978-3-319-77404-6_25 |
ORCIDs: | Christiansen, Anders Roy and Ettienne, Mikko Berggren |