Topic: approximate string matching and pattern matching with errors
Topic: compressed data
Topic: pattern matching
Topic: search algorithms
Topic: strings
Topic: text compression
Topic: trees
| |
Summary
A suffix trie is a tree whose nodes represents suffices. The root represents all possible strings in a text.
Suffix tries identify common subsequences. A suffix trie may need compression. (cbb 6/06)
Subtopic: searching suffix tries and suffix arrays
Quote: with suffix trie, search time independent of text length [»nelsMR8_1996]
| Quote: use compressed suffix arrays and trees to search for short strings in O(1) time with O(n) space [»grosR5_2000]
| Quote: implement a trie with a double array; 3-5x faster retrieval, 5-9x slower insertion, 17% smaller than list [»aoeJI9_1992]
| Quote: simple, fast algorithm for approximate string matching; special cases k-break periodic strings; uses suffix tree; short, overlapping text partitions; pattern at least 5k^3 where k is the maximum edit distance [»coleR9_2002]
| Quote: grep pattern matching by suffix array of vocabulary or by inverted index of digrams or trigrams [»zobeJ7_2006]
| Subtopic: constructing suffix trees
Quote: survey of suffix array construction algorithms; more space efficient than suffix trees [»puglSJ6_2007]
| Quote: Maniscalco & Puglisi's algorithm was fastest for suffix array construction; best three algorithms had a working memory of 5-6n bytes [»puglSJ6_2007]
| Quote: on-line construction of suffix tree in O(N) [»nelsMR8_1996]
| Quote: could construct suffix tree for human genome in 46G memory and nine hours [»kurtS11_1999]
| Quote: wotdlazy, an efficient algorithm for write-only, top-down construction of suffix trees; less than 12 bytes per input character; faster and less space than McCreight's hash table implementation [»giegR7_1999]
| Subtopic: space efficient tries
Quote: use Kurtz' space efficient suffix trees in 10n space, McCreight's constructor in O(kn) time, and an indexed root [»balkB10_2000]
| Quote: space efficient suffix trees; 10 bytes per input character, just as fast; best for binary and text [»kurtS11_1999]
| Quote: Bonsai trees compress large trie structures to 1/3; encodes indices with 32 by 32 bit multiplication [»darrJJ3_1993]
| Quote: Scrabble game uses a 'dawg'; i.e., a trie with merged sub-tries; takes a fifth the space [»appeAW5_1988]
| Quote: simple, space efficient suffix trees using balanced parentheses; O(n lg n) space and O(m) search [»munrJI5_2001]
| Subtopic: burst trie
Quote: use burst trie for sorted data; containers (e.g., binary tree) accessed by a trie; better performance than binary tree, splay tree, ternary tree; only 25% slower than hash table [»heinS4_2002]
| Quote: trend heuristic for bursting by using one counter per container; start value with bonuses and penalties; burst container when 0 [»heinS4_2002]
| Subtopic: dynamic trie
Quote: use dynamic suffix arrays to detect, on-line, the occurrences of a string subject to incremental insertions and deletions [»ferrP1_1995]
| Subtopic: common substring
Quote: compress assembly code by identifying common substrings via suffix trees; 7% compression on average [»frasCW6_1984]
| Subtopic: problems with suffix arrays
Quote: while fast for grep, suffix arrays do not scale well; no compression, 1.7x memory-resident data, no ranked queries [»zobeJ7_2006]
|
Related Topics
Topic: approximate string matching and pattern matching with errors (19 items)
Topic: compressed data (16 items)
Topic: pattern matching (42 items)
Topic: search algorithms (40 items)
Topic: strings (13 items)
Topic: text compression (17 items)
Topic: trees (21 items)
|