Topic: searching compressed data

topics > computer science > programming > Group: patterns

information retrieval

approximate string matching and pattern matching with errors
compressed data
data compression algorithms
database queries, joins, and relational algebra
external search and sort
search algorithms
text compression


Searching compressed data can be faster than decompressing the data and then searching.

Simple compression schemes, such as byte-pair or Huffman encoding, are better for searching. (cbb 2/07)

Subtopic: searching block-sorted data up

Quote: use Burrows-Wheeler block-sorted data for dynamic searching; compressed, rapid searches; groups similar substrings together [»willK12_2003]

Subtopic: search compressed text up

Quote: string matching on LZW compressed text; up to 50% faster than decompress and search; LZgrep for gzip files [»navaG10_2005]
Quote: algorithm for pattern matching in UNIX Z-compressed files; O(n log m + m) for short patterns of length m [»amirA1_1994]
Quote: search compressed text with byte-pair compression; faster search, 30% compression, byte-pairs can not overlap [»manbU4_1997]
Quote: fast searching of compressed text using Huffman encoding, shift-or search, and Boyer-Moore filter; 30% compression, 2-8x faster than agrep [»demoES8_1998]
Quote: DCA compression and antidictionaries allow search and parallel algorithms because their encodings do not depend on context except for the block's prefix [»crocM7_1999]

Subtopic: regular expression up

Quote: first solution for regular expression searching in compressed text; faster than decompression plus search [»navaG7_2001]

Subtopic: approximate search up

Quote: first practical algorithm for approximate pattern matching over Ziv-Lempel compressed text; multipattern search; faster than decompressing plus searching [»navaG3_2001a]

Subtopic: search compressed database up

Quote: use a skipped index for fast queries in a compressed database

Related Topics up

Group: information retrieval   (25 topics, 674 quotes)

Topic: approximate string matching and pattern matching with errors (19 items)
Topic: compressed data (16 items)
Topic: data compression algorithms (53 items)
Topic: database queries, joins, and relational algebra (34 items)
Topic: external search and sort (23 items)
Topic: search algorithms (40 items)
Topic: text compression
(17 items)

Updated barberCB 10/05
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.