Group: information retrieval
Topic: approximate string matching and pattern matching with errors
Topic: compressed data
Topic: data compression algorithms
Topic: database queries, joins, and relational algebra
Topic: external search and sort
Topic: search algorithms
Topic: text compression
| |
Summary
Searching compressed data can be faster than decompressing the data and then searching.
Simple compression schemes, such as byte-pair or Huffman encoding, are better for searching. (cbb 2/07)
Subtopic: searching block-sorted data
Quote: use Burrows-Wheeler block-sorted data for dynamic searching; compressed, rapid searches; groups similar substrings together [»willK12_2003]
| Subtopic: search compressed text
Quote: string matching on LZW compressed text; up to 50% faster than decompress and search; LZgrep for gzip files [»navaG10_2005]
| Quote: algorithm for pattern matching in UNIX Z-compressed files; O(n log m + m) for short patterns of length m [»amirA1_1994]
| Quote: search compressed text with byte-pair compression; faster search, 30% compression, byte-pairs can not overlap [»manbU4_1997]
| Quote: fast searching of compressed text using Huffman encoding, shift-or search, and Boyer-Moore filter; 30% compression, 2-8x faster than agrep [»demoES8_1998]
| Quote: DCA compression and antidictionaries allow search and parallel algorithms because their encodings do not depend on context except for the block's prefix [»crocM7_1999]
| Subtopic: regular expression
Quote: first solution for regular expression searching in compressed text; faster than decompression plus search [»navaG7_2001]
| Subtopic: approximate search
Quote: first practical algorithm for approximate pattern matching over Ziv-Lempel compressed text; multipattern search; faster than decompressing plus searching [»navaG3_2001a]
| Subtopic: search compressed database
Quote: use a skipped index for fast queries in a compressed database [»wittIH_1994]
|
Related Topics
Group: information retrieval (25 topics, 674 quotes)
Topic: approximate string matching and pattern matching with errors (19 items)
Topic: compressed data (16 items)
Topic: data compression algorithms (53 items)
Topic: database queries, joins, and relational algebra (34 items)
Topic: external search and sort (23 items)
Topic: search algorithms (40 items)
Topic: text compression (17 items)
|