Topic: approximate string matching and pattern matching with errors
Topic: hash table and hash functions
Topic: signature files
Topic: information filters
Topic: spelling errors
| |
Summary
A hash filter removes uninteresting entries from a retrieval set. It is an approximate membership tester.
A bloom filter hashes values to bits. It matches if all bits are set. (cbb 10/07)
Subtopic: Bloom filter
Quote: Bloom filter (failure hash) by hashing to multiple bits; need at least half 0's and constant time processing [»blooBH7_1970]
| Quote: if allow false hits and a sparse message universe, then can use a much smaller hash area to identify messages [»blooBH7_1970]
| Quote: explicit-state model checking with bitstate verification and Bloom filters; e.g., Spin, cache coherence, and network protocols [»dillPC4_2004]
| Quote: Z uses a Bloom filter for checking spelling in a dictionary [»woodSR6_1981]
| Quote: designed Bloom filter for differential file by simulating heavy day and extreme loads [»gremLL9_1982, OK]
| Subtopic: approximate membership testing
Quote: gives approximate membership testers that do better than Bloom filters [»cartL5_1978]
| Quote: an approximate membership tester correctly accepts every member but rarely accepts a non-member [»cartL5_1978]
| Quote: gives approximate membership tester that does not need an independent sequence of hash functions [»cartL5_1978]
| Subtopic: compressing hash filters
Quote: compress hash values for spelling list by storing differences; partition into M bins for speed [»mcilMD1_1982]
| Quote: store hash filter by differences between successive hash values [»bentJ5_1985]
| Subtopic: optimizing Bloom filter
Quote: for open-addressing, triple hashing modifies the increment as well as the hash value; much higher accuracy for Bloom filters [»dillPC4_2004]
| Quote: triple hashing with LOOKUP2 generating 96 bits of data; as fast as double hashing with 1/13th as many omissions [»dillPC4_2004]
| Quote: avoid MOD operations within the iteration part of a Bloom filter [»dillPC4_2004]
| Quote: found the best Bloom filter for 2,300 unique keys was 3K-4K bytes and three or more transformations; <2% filter error rate [»gremLL9_1982]
| Quote: the performance of a Bloom filter depends on its loading factor, the number of set bits [»gremLL9_1982]
| Quote: instead of using multiple hash values in a Bloom filter, use a partitioned bit vector; easier monitoring and no performance difference [»mullJK8_1983]
| Quote: gives probability of unset bits in a Bloom filter [»mullJK8_1983]
| Quote: expression for false-positives in a hash filter [»stanC12_1986, OK]
|
Related Topics
Topic: approximate string matching and pattern matching with errors (19 items)
Topic: hash table and hash functions (41 items)
Topic: signature files (21 items)
Topic: information filters (22 items)
Topic: spelling errors (18 items)
|