Topic: information retrieval by relevance

An item is relevant to a query if it is potentially useful to the requester. Relevance is inherently subjective, and it is not the same as authority. An authority is a trusted source of correct information. An authority is always relevant, but a relevant item is not necessarily authoritative.

Relevance and precision typically counterbalance each other. It is easy to write a specific query that retrieves a small subset of the relevant items. As the query is broadened, it retrieves more relevant items but also more irrelevant ones.

An alternative is a narrow query to locate relevant items, followed by browsing to locate all relevant items. Browsing is effective if all relevant items are linked together. (cbb 10/07)
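The trade-off between narrow and broad queries can be made concrete with the standard retrieval measures of precision (the fraction of retrieved items that are relevant) and recall (the fraction of relevant items that are retrieved). The following is a minimal sketch; the item numbers and the two query result sets are hypothetical, chosen only to illustrate the effect.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: items 1-4 are relevant to the requester.
relevant = {1, 2, 3, 4}
narrow_query = {1, 2}                    # specific query: few items, all relevant
broad_query = {1, 2, 3, 4, 5, 6, 7, 8}   # broadened query: all relevant items plus noise

narrow = precision_recall(narrow_query, relevant)
broad = precision_recall(broad_query, relevant)
print("narrow:", narrow)
print("broad: ", broad)
```

In this toy data the narrow query is fully precise but finds only half of the relevant items, while the broad query finds all of them at half the precision.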

Subtopic: relevance vs. precision

Quote: the fallacy of abundance: in a large information retrieval system, it is hard to write reasonable queries that do not retrieve at least some relevant documents [»blaiDC1_1996]
Quote: specific queries are inappropriate for broad information needs with many relevant documents [»pratW7_1999]
Quote: recall studies of large document retrieval systems depended on the persistence of the evaluators and where they looked for unretrieved, relevant documents [»blaiDC1_1996]
Quote: full-text search retrieved more relevant documents than a manual index
Quote: in a 100-article collection, manual and full-text search retrieved fewer than half of the relevant documents on average

Subtopic: relevance vs. authority

Quote: web search concerns authoritativeness instead of relevance; an authority is a trusted source of correct information that has a strong web presence [»boroA2_2005]
Quote: want authoritative results from broad-topic queries; the set of relevant results is too large [»kleiJM9_1999]

Subtopic: relative preference

Quote: skipped search results are clearly less relevant than selected results; pairwise relative preference had 80% agreement with manually ranked results [»joacT8_2007]
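One way to read the quote above is as a rule for turning clicks into pairwise preferences: a selected result is preferred over every unselected result ranked above it. The sketch below implements that reading only; the function name and the sample ranking are hypothetical, and the cited study may use additional strategies.

```python
def click_skip_preferences(ranking, clicked):
    """Return (preferred, less_preferred) pairs.

    Each clicked result is preferred over every unclicked result
    that was ranked above it (and was therefore skipped).
    """
    clicked = set(clicked)
    pairs = []
    skipped = []  # unclicked results seen so far, in rank order
    for doc in ranking:
        if doc in clicked:
            pairs.extend((doc, s) for s in skipped)
        else:
            skipped.append(doc)
    return pairs


# Hypothetical result list; the user clicked the third and fifth results.
ranking = ["a", "b", "c", "d", "e"]
pairs = click_skip_preferences(ranking, clicked={"c", "e"})
for preferred, other in pairs:
    print(f"{preferred} > {other}")
```

The output is a set of relative judgments rather than absolute relevance scores, which matches the quote's emphasis on pairwise preference.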

Subtopic: relevance search -- Thesa

Quote: by searching and selecting links, one can find anything that is "out there"
Quote: with hypertext, continually uncover new items of interest, often with growing relevance
Quote: hypertext for specialist access and muster of information; enhance existing material with commentary, surveys, connections; maintains relevance [»nelsTH_1967]
Quote: use dogears for transient bookmarks and breadth-first traversal; allows prefetching of all interesting pages [»newfD11_1998]
Quote: categorize links for breadth-first search into never seen, visited, pending, and uninteresting (i.e., seen but not selected) [»newfD11_1998]
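The breadth-first traversal with four link categories described in the last two quotes can be sketched as follows. This is an illustration under stated assumptions, not the cited system: the `LinkState` names, the `links` dictionary, and the `is_interesting` callback (standing in for the reader's choice to select a link or pass over it) are all hypothetical.

```python
from collections import deque
from enum import Enum


class LinkState(Enum):
    NEVER_SEEN = "never seen"
    PENDING = "pending"              # selected, waiting to be read
    VISITED = "visited"
    UNINTERESTING = "uninteresting"  # seen but not selected


def breadth_first_browse(links, start, is_interesting):
    """Visit pages breadth-first, classifying every link that is seen."""
    state = {page: LinkState.NEVER_SEEN for page in links}
    state[start] = LinkState.PENDING
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        state[page] = LinkState.VISITED
        order.append(page)
        for target in links.get(page, []):
            if state.get(target, LinkState.NEVER_SEEN) is not LinkState.NEVER_SEEN:
                continue  # already pending, visited, or rejected
            if is_interesting(target):
                state[target] = LinkState.PENDING
                queue.append(target)
            else:
                state[target] = LinkState.UNINTERESTING
    return order, state


# Hypothetical link structure; the reader passes over the "ads" link.
links = {
    "home": ["ir", "ads"],
    "ir": ["recall", "home"],
    "ads": ["spam"],
    "recall": [],
    "spam": [],
}
order, state = breadth_first_browse(links, "home", lambda page: page != "ads")
print(order)
print({page: s.value for page, s in state.items()})
```

The pending queue is also the natural place to prefetch pages, since everything in it has been selected but not yet read.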

Subtopic: relevance as helpful for need

Quote: relevance is being potentially helpful to a user in the resolution of a need [»greeR9_1995]

Subtopic: relevance to query

Quote: relevant nodes for a search can be saved in a private document for later recovery [»walkJH11_1987]
Quote: iterative search by identifying relevant and irrelevant documents; augment query with keywords from relevant documents [»saltG4_1970]
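The iterative search in the last quote can be sketched as a simple query-expansion step. This is a deliberately simplified illustration, not the cited method: it adds the terms that occur in the most relevant documents and in no irrelevant document, whereas classical relevance feedback reweights term vectors. The function name, the `max_new_terms` parameter, and the sample documents are hypothetical.

```python
from collections import Counter


def expand_query(query_terms, relevant_docs, irrelevant_docs, max_new_terms=2):
    """Augment a query with keywords drawn from documents judged relevant."""
    # Count how many relevant documents contain each term.
    relevant_counts = Counter(
        word for doc in relevant_docs for word in set(doc.lower().split())
    )
    irrelevant_terms = {
        word for doc in irrelevant_docs for word in doc.lower().split()
    }
    candidates = [
        (count, term)
        for term, count in relevant_counts.items()
        if term not in irrelevant_terms and term not in query_terms
    ]
    # Most frequent first; ties broken alphabetically for deterministic output.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(query_terms) + [term for _, term in candidates[:max_new_terms]]


# Hypothetical feedback round for an ambiguous one-word query.
query = ["jaguar"]
relevant_docs = ["jaguar cat habitat", "jaguar cat prey", "big cat habitat"]
irrelevant_docs = ["jaguar car dealer"]
expanded = expand_query(query, relevant_docs, irrelevant_docs)
print(expanded)
```

Running the expanded query and judging its results again gives the next iteration of the loop.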

Subtopic: word senses and relevance

Quote: query word sense mismatches are far more likely to appear in nonrelevant documents than in relevant documents [»krovR4_1992]

Subtopic: reachability

Quote: the best link analysis algorithm ranks nodes by their reachability; BFS combines InDegree with the HITS algorithm; 44% highly relevant [»boroA2_2005]
Quote: tightly-knit communities, cycles, and isolated components are generally irrelevant; reachability and high in-degree are better measures of relevance; BFS/InDegree better than HITS/PageRank [»boroA2_2005]
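To illustrate the two measures named in these quotes, the sketch below computes in-degree and a simple reachability score on a small link graph. It is not the BFS algorithm evaluated in the cited paper: the decayed count over in-links, the `decay` parameter, and the sample graph are assumptions made for illustration.

```python
from collections import deque


def in_degree(graph):
    """Count distinct pages linking to each page (self-links ignored)."""
    scores = {node: 0 for node in graph}
    for source, targets in graph.items():
        for target in set(targets):
            if target != source:
                scores[target] = scores.get(target, 0) + 1
    return scores


def reachability_score(graph, node, decay=0.5):
    """Sum decay**(d - 1) over every page that reaches `node` in d link steps."""
    # Build the reverse graph so the search can walk in-links.
    reverse = {}
    for source, targets in graph.items():
        for target in targets:
            reverse.setdefault(target, set()).add(source)
    distance = {node: 0}
    queue = deque([node])
    score = 0.0
    while queue:
        current = queue.popleft()
        for source in reverse.get(current, ()):
            if source not in distance:
                distance[source] = distance[current] + 1
                score += decay ** (distance[source] - 1)
                queue.append(source)
    return score


# Hypothetical graph: "hub" is widely reachable; "x" and "y" form an isolated cycle.
graph = {"a": ["hub"], "b": ["hub"], "c": ["a"], "hub": [], "x": ["y"], "y": ["x"]}
degrees = in_degree(graph)
hub_score = reachability_score(graph, "hub")
cycle_score = reachability_score(graph, "x")
print(degrees, hub_score, cycle_score)
```

Because each page is counted once at its shortest distance, the two-page cycle cannot inflate its own score, which is consistent with the quote's remark that cycles and isolated components are generally irrelevant.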

Subtopic: proximity measure

Quote: use fractional distance metrics to preserve proximity in high-dimensional spaces
Quote: survey on querying multimedia databases; similarity search via feature transformation into a high dimensional space [»bohmC9_2001]
Quote: curse of dimensionality; space behaves differently above 10-d; volume and area depend exponentially on dimension [»bohmC9_2001]
Quote: an index partition in high-dimensional space spans most of the space in most dimensions, from border to border; coarse partitioning [»bohmC9_2001]
Quote: reasonably selective range and nearest-neighbor queries have a huge extension in each dimension [»bohmC9_2001]
Quote: in high dimensions, need a large search radius; better to use brute force [»chavE9_2001]
Quote: use equivalence relations, pivots, and compact partitions to index metric spaces for proximity queries; pivots are best if memory is available, compact partitioning is best in high dimension [»chavE9_2001]
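Two ideas from the quotes above, fractional distance measures and brute-force search, can be sketched together. The example assumes that "fractional" means a Minkowski-style distance with exponent p below 1; the function names and sample points are hypothetical. Note that for p < 1 the triangle inequality does not hold, so pivot-based pruning would not apply directly and the sketch simply scans every point.

```python
def minkowski(x, y, p):
    """Minkowski-style distance; p < 1 gives a fractional distance measure."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def nearest_neighbor(query, points, p=0.5):
    """Brute-force nearest neighbor: compare the query against every point."""
    return min(points, key=lambda point: minkowski(query, point, p))


# Hypothetical points: one differs from the query in a single coordinate,
# the other differs a little in every coordinate.
query = (0.0, 0.0, 0.0, 0.0)
one_big_gap = (2.0, 0.0, 0.0, 0.0)
many_small_gaps = (0.9, 0.9, 0.9, 0.9)
points = [one_big_gap, many_small_gaps]

euclidean_choice = nearest_neighbor(query, points, p=2)
fractional_choice = nearest_neighbor(query, points, p=0.5)
print(euclidean_choice, fractional_choice)
```

With p = 2 the point with many small differences is nearer, while with p = 0.5 the point that matches the query in most coordinates is nearer, which shows how the choice of exponent changes what counts as proximity.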

Subtopic: relevance is subjective

Quote: only the user can judge what is relevant to his or her own need [»greeR9_1995]
Quote: no automated routine can distinguish useless from relevant communications; especially when a user's tasks and interests change [»hiltSR7_1985]
Quote: if two scientists judge relevance of a document, only 60% agreement [»clevC4_1984]
Quote: "important" is subjective and a continuous scale; different things are important to different people in different contexts at different times [»kentW1_1985]
Quote: approximately half of the twice-indexed abstracts were assigned to different headings [»sievMC1_1991]
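The agreement figures in these quotes can be read as simple percent agreement between two sets of judgments. The quotes do not say how agreement was measured, so the calculation below is an assumption, and the judgments are hypothetical.

```python
def agreement_rate(judgments_a, judgments_b):
    """Fraction of items on which two judges gave the same judgment."""
    if len(judgments_a) != len(judgments_b):
        raise ValueError("both judges must rate the same items")
    if not judgments_a:
        raise ValueError("at least one item is required")
    matches = sum(a == b for a, b in zip(judgments_a, judgments_b))
    return matches / len(judgments_a)


# Hypothetical relevance judgments (True = relevant) for five documents.
judge_a = [True, True, False, True, False]
judge_b = [True, False, False, True, True]
rate = agreement_rate(judge_a, judge_b)
print(f"{rate:.0%} agreement")
```

Percent agreement does not correct for agreement that would occur by chance, so it is best treated as a rough indicator.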

Subtopic: problems with relevance

Quote: 5% of Excite queries used 'More Like This'; traditional IR searching uses relevance feedback more [»jansBJ1_1998]
Quote: categories and indices decay; over time, they decline in relevance; e.g., Dewey Decimal system

Related Topics

Topic: access by pattern matching (18 items)
Topic: full-text indexing (37 items)
Topic: information retrieval by searching (35 items)
Topic: information retrieval with queries (18 items)
Topic: problem of classifying information (42 items)
Topic: searching the Web (53 items)
Topic: using keywords to search hypertext (26 items)

ThesaHelp: how to find everything relevant to some topic or question
ThesaHelp: why does directed search succeed for Thesa

Updated barberCB 6/05
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.