
QuoteRef: zobeJ7_2006




Topic: full-text indexing
Topic: suffix trie and suffix array
Topic: pattern matching
Topic: signature files
Topic: searching the Web
Topic: hypertext links

Reference

Zobel, J., Moffat, A., "Inverted files for text search engines", ACM Computing Surveys, 38, 2, July 2006, pp. 1-56.

Quotations
2 ;;Quote: indexing core for constructing a document-level index with ranked query evaluation; refinements for reorganization, phrases, and compression; includes bibliography
8 ;;Quote: use inverted files for text query evaluation; better than suffix arrays and signature files
8 ;;Quote: all terms should be indexed; including numbers, URL tokens, and stopwords
13 ;;Quote: build two indices, a word-level index for phrase and Boolean searches and a document-level index for ranked queries
13 ;;Quote: for phrase queries, use an index for word pairs that begin with a common word; indexing just the three commonest words this way halves phrase querying time
16 ;;Quote: merge-based index construction scales well; 100MB of memory, can minimize disk space overhead, only one parsing pass, compression effective
32 ;;Quote: while fast for grep, suffix arrays do not scale well; no compression, 1.7x memory-resident data, no ranked queries
32 ;;Quote: grep pattern matching by suffix array of vocabulary or by inverted index of digrams or trigrams
33 ;;Quote: problems with signature files -- false matches are linear in the collection size, large index, more disk accesses for short queries, and no ranked queries
35 ;;Quote: index the anchor text as a title of the target page; helps identify a document; good for Web searching
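
The two-index arrangement noted above (a word-level index for phrase and Boolean searches, a document-level index for ranked queries) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the tokenizer, and the simple TF-IDF-style weighting are all assumptions chosen for brevity, and the sketch omits compression, length normalization, and on-disk merging.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase, keep every alphanumeric token,
    # so numbers and stopwords are indexed too.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_indexes(docs):
    """Build both indexes in one parsing pass over a list of strings."""
    doc_index = defaultdict(dict)   # term -> {doc_id: within-document frequency}
    word_index = defaultdict(dict)  # term -> {doc_id: [word positions]}
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(tokenize(text)):
            doc_index[term][doc_id] = doc_index[term].get(doc_id, 0) + 1
            word_index[term].setdefault(doc_id, []).append(pos)
    return doc_index, word_index


def phrase_query(word_index, phrase):
    """Return ids of documents that contain the terms at adjacent positions."""
    terms = tokenize(phrase)
    if not terms or any(t not in word_index for t in terms):
        return []
    # Documents that contain every term of the phrase.
    candidates = set(word_index[terms[0]])
    for t in terms[1:]:
        candidates &= set(word_index[t])
    hits = []
    for d in sorted(candidates):
        # Keep start positions where term k appears exactly k words later.
        starts = set(word_index[terms[0]][d])
        for offset, t in enumerate(terms[1:], start=1):
            starts &= {p - offset for p in word_index[t][d]}
        if starts:
            hits.append(d)
    return hits


def ranked_query(doc_index, n_docs, query):
    """Rank documents with an assumed, simplified TF-IDF-style score."""
    scores = defaultdict(float)
    for t in tokenize(query):
        postings = doc_index.get(t, {})
        if not postings:
            continue
        idf = math.log(1 + n_docs / len(postings))
        for d, freq in postings.items():
            scores[d] += (1 + math.log(freq)) * idf
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

The word-level index stores positions and is therefore larger, which is one reason to keep a separate, smaller document-level index for ranking.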

Related Topics

Topic: full-text indexing (37 items)
Topic: suffix trie and suffix array (20 items)
Topic: pattern matching (42 items)
Topic: signature files (21 items)
Topic: searching the Web (53 items)
Topic: hypertext links (45 items)

Collected barberCB 8/06
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.