About

Log in?

DTU users get better search results including licensed content and discounts on order fees.

Anyone can log in and get personalized features such as favorites, tags and feeds.

Log in as DTU user Log in as non-DTU user No thanks

DTU Findit

Conference paper

Longest Common Extensions via Fingerprinting

From

Department of Informatics and Mathematical Modeling, Technical University of Denmark1

Algorithms and Logic, Department of Informatics and Mathematical Modeling, Technical University of Denmark2

Computer Science and Engineering, Department of Informatics and Mathematical Modeling, Technical University of Denmark3

Technical University of Denmark4

The longest common extension (LCE) problem is to preprocess a string in order to allow for a large number of LCE queries, such that the queries are efficient. The LCE value, LCE s (i,j), is the length of the longest common prefix of the pair of suffixes starting at index i and j in the string s. The LCE problem can be solved in linear space with constant query time and a preprocessing of sorting complexity.

There are two known approaches achieving these bounds, which use nearest common ancestors and range minimum queries, respectively. However, in practice a much simpler approach with linear query time, no extra space and no preprocessing achieves significantly better average case performance. We show a new algorithm, Fingerprint k , which for a parameter k, 1 ≤ k ≤ [log n], on a string of length n and alphabet size σ, gives O(k n1/k) query time using O(k n) space and O(k n + sort(n,σ)) preprocessing time, where sort(n,σ) is the time it takes to sort n numbers from σ.

Though this solution is asymptotically strictly worse than the asymptotically best previously known algorithms, it outperforms them in practice in average case and is almost as fast as the simple linear time algorithm. On worst case input, this new algorithm is significantly faster in practice compared to the simple linear time algorithm.

We also look at cache performance of the new algorithm, and we show that for k = 2, cache optimization can improve practical query time.

Language: English
Publisher: Springer
Year: 2012
Pages: 119-130
Proceedings: The 6th International Conference on Language and Automata Theory and Applications, LATA
Series: Lecture Notes in Computer Science
Journal subtitle: 6th International Conference, Lata 2012 a Coruña, Spain, March 5-9, 2012 Proceedings
ISBN: 3642283314 , 3642283322 , 9783642283314 and 9783642283321
ISSN: 03029743
Types: Conference paper
DOI: 10.1007/978-3-642-28332-1_11
ORCIDs: Bille, Philip and Gørtz, Inge Li

DTU users get better search results including licensed content and discounts on order fees.

Log in as DTU user

Access

Analysis