Journal article
Faster Approximate String Matching for Short Patterns
We study the classical approximate string matching problem, that is, given strings P and Q and an error threshold k, find all ending positions of substrings of Q whose edit distance to P is at most k. Let P and Q have lengths m and n, respectively. On a standard unit-cost word RAM with word size w≥log n we present an algorithm using time \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O\biggl(nk \cdot \min\biggl(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}\biggr) + n\biggr)$$\end{document} When P is short, namely, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$m = 2^{o(\sqrt{\log n}\,)}$\end{document} or \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$m =2^{o(\sqrt{w/\log w}\,)}$\end{document} this improves the previously best known time bounds for the problem.
The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.
Language: | English |
---|---|
Publisher: | Springer-Verlag |
Year: | 2012 |
Pages: | 492-515 |
ISSN: | 14330490 and 14324350 |
Types: | Journal article |
DOI: | 10.1007/s00224-011-9322-y |
ORCIDs: | Bille, Philip |