Journal article

Fast searching in packed strings

In Journal of Discrete Algorithms (Amsterdam), 2011, Volume 9, Issue 1, pp. 49-56

From

Algorithms and Logic, Department of Informatics and Mathematical Modeling, Technical University of Denmark¹

Department of Informatics and Mathematical Modeling, Technical University of Denmark²

Abstract

Given strings P and Q, the (exact) string matching problem is to find all positions of substrings in Q matching P. The classical Knuth–Morris–Pratt algorithm [SIAM J. Comput. 6 (2) (1977) 323–350] solves the string matching problem in linear time, which is optimal if we can only read one character at a time.
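For orientation, the linear-time baseline mentioned above can be sketched as follows. This is a minimal textbook formulation of Knuth–Morris–Pratt that reads one character at a time; it is not the packed-string algorithm of the paper, and the function names `failure_function` and `kmp_search` are illustrative choices, not names taken from the source.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        # Fall back through shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(pattern, text):
    """Return the start positions of all occurrences of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    occurrences = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            occurrences.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping matches
    return occurrences
```

Each text character is inspected a constant amortized number of times, which gives the O(n + m) behaviour that the abstract takes as its starting point.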

However, most strings are stored in a computer in a packed representation with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. In this paper we study the worst-case complexity of string matching on strings given in packed representation. Let m⩽n be the lengths of P and Q, respectively, and let σ denote the size of the alphabet.
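As a rough illustration of what a packed representation looks like, the sketch below stores each character in ⌈log₂ σ⌉ bits and fills machine words with as many characters as fit. The 64-bit word size, the bit layout, and the helper names `pack` and `char_at` are assumptions made for this example only; the paper works with a word size logarithmic in n and does not prescribe this layout.

```python
def pack(text, alphabet, word_bits=64):
    """Pack text into integers of word_bits bits, several characters per word.

    Returns (words, bits_per_char, chars_per_word).
    """
    code = {ch: i for i, ch in enumerate(alphabet)}
    # Bits needed per character: ceil(log2(sigma)), at least 1.
    bits = max(1, (len(alphabet) - 1).bit_length())
    per_word = word_bits // bits
    words = []
    for start in range(0, len(text), per_word):
        word = 0
        for offset, ch in enumerate(text[start:start + per_word]):
            word |= code[ch] << (offset * bits)
        words.append(word)
    return words, bits, per_word


def char_at(words, bits, per_word, alphabet, i):
    """Recover character i from the packed words."""
    word = words[i // per_word]
    return alphabet[(word >> ((i % per_word) * bits)) & ((1 << bits) - 1)]
```

With a four-letter alphabet such as "ACGT", each character takes 2 bits, so a single 64-bit word holds 32 characters and one word read delivers 32 characters at once.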

On a standard unit-cost word-RAM with logarithmic word size we present an algorithm using time $O(n/\log_\sigma n + m + \mathrm{occ})$. Here $\mathrm{occ}$ is the number of occurrences of P in Q. For $m = o(n)$ this improves the $O(n)$ bound of the Knuth–Morris–Pratt algorithm. Furthermore, if $m = O(n/\log_\sigma n)$ our algorithm is optimal, since any algorithm must spend at least $\Omega\left(\frac{(n+m)\log\sigma}{\log n} + \mathrm{occ}\right) = \Omega(n/\log_\sigma n + \mathrm{occ})$ time to read the input and report all occurrences.
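The equality between the two forms of the lower bound follows from a change of logarithm base together with m⩽n. The short derivation below spells this out; it assumes, as the abstract does, that a word holds Θ(log n) bits, so that the packed input occupies on the order of (n+m) log σ / log n words.

```latex
\[
\log_\sigma n = \frac{\log n}{\log \sigma}
\quad\Longrightarrow\quad
\frac{n \log \sigma}{\log n} = \frac{n}{\log_\sigma n}.
\]
% Since m <= n, we have n <= n + m <= 2n, hence
\[
\Omega\!\left(\frac{(n+m)\log\sigma}{\log n} + \mathrm{occ}\right)
= \Omega\!\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right).
\]
% If additionally m = O(n / \log_\sigma n), the upper bound becomes
\[
O\!\left(\frac{n}{\log_\sigma n} + m + \mathrm{occ}\right)
= O\!\left(\frac{n}{\log_\sigma n} + \mathrm{occ}\right),
\]
% which matches the lower bound up to constant factors.
```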

The result is obtained by a novel automaton construction based on the Knuth–Morris–Pratt algorithm combined with a new compact representation of subautomata allowing an optimal tabulation-based simulation.

Language:	English
Year:	2011
Pages:	49-56
ISSN:	15708675 and 15708667
Types:	Journal article
DOI:	10.1016/j.jda.2010.09.003
ORCIDs:	Bille, Philip

Keywords

Knuth–Morris–Pratt algorithm; String matching; Word RAM

DTU Library
