
QuoteRef: wittIH_1991





Reference

Witten, I.H., Bell, T.C., Nevill, C.G., "Indexing and compressing full-text databases for CD-ROM", Journal of Information Science, 17, pp. 265-271, 1991.

Quotations
266 ;;Quote: absolute references compress better than hierarchical references; encode gaps between successive occurrences
266 ;;Quote: lexicon of all words in database, linked to concordance, disk address, lexical hierarchy, and permuted index
267 ;;Quote: compress text by arithmetic coding of sentences using lexicon; 20-word sentence into 220 bits with 3% overhead
268 ;;Quote: don't use a stop list; compress common words by predicting the inter-word gap; 100 words are 76% of references and 44% of compressed size
270 ;;Quote: compress lexicon entry into 3 bytes; 8 characters on average, shared prefix, compressed suffix, encoded count, predicted entry size
270 ;;Quote: handle partial match queries by a permuted dictionary where each word appears in all possible rotations
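Two of the ideas quoted above are easy to illustrate: storing the gaps between successive occurrences instead of absolute positions, and answering partial-match queries from a permuted dictionary that holds every rotation of every word. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the "$" end-of-word marker, and the single "*" wildcard syntax are all hypothetical, and the paper's actual coding of the gaps is not reproduced here.

```python
from bisect import bisect_left

END = "$"  # hypothetical end-of-word marker, assumed absent from all words


def to_gaps(positions):
    """Turn sorted absolute positions into gaps between successive occurrences."""
    prev = 0
    gaps = []
    for p in positions:
        gaps.append(p - prev)
        prev = p
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by accumulating a running sum."""
    total = 0
    positions = []
    for g in gaps:
        total += g
        positions.append(total)
    return positions


def build_permuted(words):
    """Return a sorted list of (rotation, word) for every rotation of word + END."""
    entries = []
    for w in words:
        marked = w + END
        for i in range(len(marked)):
            entries.append((marked[i:] + marked[:i], w))
    entries.sort()
    return entries


def partial_match(entries, pattern):
    """Return words matching a pattern with exactly one '*' wildcard."""
    head, tail = pattern.split("*")
    # Rotate the pattern so the wildcard falls at the end:
    # head*tail becomes the rotation prefix tail + END + head.
    prefix = tail + END + head
    i = bisect_left(entries, (prefix,))
    found = set()
    while i < len(entries) and entries[i][0].startswith(prefix):
        found.add(entries[i][1])
        i += 1
    return sorted(found)


if __name__ == "__main__":
    print(to_gaps([3, 7, 8, 20]))
    index = build_permuted(["compression", "completion", "computer", "index"])
    print(partial_match(index, "comp*ion"))
```

Small gaps occur often for frequent words, which is what makes them cheaper to encode than absolute positions. In the permuted dictionary, a wildcard anywhere in the pattern becomes a simple prefix search once the pattern is rotated, at the cost of storing one entry per character of each word.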


Related Topics

ThesaHelp: references t-z (309 items)
Topic: compressed data (16 items)
Topic: absolute vs. relative names (12 items)
Topic: full-text indexing (35 items)
Topic: data compression algorithms (53 items)
Topic: text compression (16 items)
Topic: information retrieval with an index (32 items)
Topic: pattern matching (42 items)

Collected barberCB 1/94
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.