
Quote: public-domain code for compressing and indexing large document collections; for large collections, the entire retrieval system is less than 40% the size of the original

QuoteRef: wittIH_1994, p. 367



Topic:
text compression

Quotation Skeleton

The [public-domain] mg system uses a local Bernoulli … [in the index]. This provides fast decoding with quite acceptable compression … The within-document frequencies … are stored using the γ code … For the small files [ca. 4-15 Mbyte], the index consumed … [and the entire retrieval system was nearly 50% the size of the original]; for the two larger files [ca. 120-2000 Mbyte TREC], the augmented inverted file … [and the entire retrieval system was less than 40% the size of the original]. [mg uses huffword for text compression.]
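
The gamma (γ) code named in the quotation can be illustrated with a minimal sketch. It assumes the usual Elias gamma convention: a unary prefix of one-bits ended by a zero gives the length, followed by the remaining binary digits of the value. The function names and the bit-string representation are illustrative assumptions, not taken from the mg source, and the sketch does not cover the local Bernoulli coding used for the index itself.

```python
def gamma_encode(x):
    """Return the Elias gamma code of a positive integer as a string of '0'/'1'."""
    if x < 1:
        raise ValueError("gamma code is defined for integers >= 1")
    n = x.bit_length() - 1              # n = floor(log2(x))
    prefix = "1" * n + "0"              # unary code for n + 1
    # Remaining n low-order bits of x (empty when n == 0).
    suffix = format(x - (1 << n), "b").zfill(n) if n else ""
    return prefix + suffix


def gamma_decode(bits, pos=0):
    """Decode one gamma-coded integer starting at bits[pos].

    Returns (value, next_position).
    """
    n = 0
    while bits[pos] == "1":             # count the unary prefix
        n += 1
        pos += 1
    pos += 1                            # skip the terminating '0'
    rest = int(bits[pos:pos + n], 2) if n else 0
    return (1 << n) + rest, pos + n


def gamma_decode_all(bits):
    """Decode a concatenation of gamma codes into a list of integers."""
    values, pos = [], 0
    while pos < len(bits):
        value, pos = gamma_decode(bits, pos)
        values.append(value)
    return values


# Hypothetical within-document frequencies for one term's inverted list.
freqs = [1, 3, 2, 1, 7]
stream = "".join(gamma_encode(f) for f in freqs)
decoded = gamma_decode_all(stream)
```

Small values get short codes (a frequency of 1 costs a single bit), which suits within-document frequencies because most of them are small.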

Copyright clearance needed for quotation.


Related Topics

Topic: text compression (16 items)

Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.