Topic: information retrieval by relevance

Topic: access by pattern matching
Topic: full-text indexing
Topic: information retrieval by searching
Topic: information retrieval with queries
Topic: problem of classifying information
Topic: searching the Web
Topic: using keywords to search hypertext

ThesaHelp: how to find everything relevant to some topic or question
ThesaHelp: why does directed search succeed for Thesa

Summary

An item is relevant to a query if it is potentially useful to the requester. Relevance is inherently subjective. Relevance is not the same as authority. An authority is a trusted source of correct information. An authority is always relevant, but a relevant item is not always authoritative.
Relevance and precision typically counterbalance each other. It is easy to write a specific query that retrieves a small subset of the relevant items. As the query is broadened, it retrieves more relevant items but also more irrelevant ones.
An alternative is a narrow query to locate some relevant items, followed by browsing to locate the rest. Browsing is effective if all relevant items are linked together. (cbb 10/07)
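
The trade-off between a narrow and a broad query is usually measured as precision (the fraction of retrieved items that are relevant) and recall (the fraction of relevant items that are retrieved). The following is a minimal sketch; the item numbers and the two result lists are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: items 1-4 are relevant to the requester.
relevant = {1, 2, 3, 4}
narrow = [1, 2]                     # specific query: few items, all relevant
broad = [1, 2, 3, 4, 5, 6, 7, 8]    # broadened query: everything relevant, plus noise

print(precision_recall(narrow, relevant))  # (1.0, 0.5)
print(precision_recall(broad, relevant))   # (0.5, 1.0)
```

The narrow query is precise but misses half of the relevant items. The broad query finds them all, but half of what it returns is irrelevant.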

Subtopic: relevance vs. precision

Quote: the fallacy of abundance: in a large information retrieval system, it is hard to write reasonable queries that do not retrieve at least some relevant documents [»blaiDC1_1996]
Quote: specific queries are inappropriate for broad information needs with many relevant documents [»pratW7_1999]
Quote: recall studies of large document retrieval systems depended on the persistence of the evaluators and where they looked for unretrieved, relevant documents [»blaiDC1_1996]
Quote: full-text search retrieved more relevant documents than a manual index
Quote: in a 100-article collection, manual and full-text search each retrieved fewer than half of the relevant documents on average

Subtopic: relevance vs. authority

Quote: web search concerns authoritativeness instead of relevance; an authority is a trusted source of correct information that has a strong web presence [»boroA2_2005]
Quote: want authoritative results from broad-topic queries; the set of relevant results is too large [»kleiJM9_1999]

Subtopic: relative preference

Quote: skipped search results are clearly less relevant than selected results; pairwise relative preference had 80% agreement with manually ranked results [»joacT8_2007]
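
One way to read the quote above is that every result a user skipped on the way to a selected result is less relevant than the selected one. The sketch below derives such pairwise preferences from a single ranked list. It assumes the user scans results from top to bottom; the function name and the sample ranking are hypothetical, and this is not claimed to be the exact procedure of the cited study.

```python
def click_skip_above(ranking, clicked):
    """Return (preferred, less_preferred) pairs from one result list.

    For each clicked result, every result ranked above it that was
    not clicked is treated as less relevant than the clicked one.
    """
    clicked = set(clicked)
    pairs = []
    for pos, doc in enumerate(ranking):
        if doc in clicked:
            for skipped in ranking[:pos]:
                if skipped not in clicked:
                    pairs.append((doc, skipped))
    return pairs


ranking = ["a", "b", "c", "d", "e"]   # results in the order shown to the user
clicked = ["c", "e"]                  # results the user selected
pairs = click_skip_above(ranking, clicked)
print(pairs)
# [('c', 'a'), ('c', 'b'), ('e', 'a'), ('e', 'b'), ('e', 'd')]
```

The pairs say nothing about absolute relevance. They only order one result relative to another, which is why the quote speaks of relative preference.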

Subtopic: relevance search -- Thesa

Quote: by searching and selecting links, one can find anything that is "out there"
Quote: with hypertext, one continually uncovers new items of interest, often with growing relevance
Quote: hypertext for specialist access and muster of information; enhance existing material with commentary, surveys, connections; maintains relevance [»nelsTH_1967]
Quote: use dogears for transient bookmarks and breadth-first traversal; allows prefetching of all interesting pages [»newfD11_1998]
Quote: categorize links for breadth-first search into never seen, visited, pending, and uninteresting (i.e., seen but not selected) [»newfD11_1998]
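
The four link categories in the last quote can be sketched as states in a breadth-first traversal. This is an illustration only: the link graph, the page names, and the `is_interesting` callback are hypothetical stand-ins for a reader's choices, and every link target is assumed to appear as a key in `links`.

```python
from collections import deque
from enum import Enum


class LinkState(Enum):
    NEVER_SEEN = "never seen"
    PENDING = "pending"              # selected, waiting to be visited
    VISITED = "visited"
    UNINTERESTING = "uninteresting"  # seen but not selected


def breadth_first_browse(links, start, is_interesting):
    """Visit pages breadth-first, recording a state for every page."""
    state = {page: LinkState.NEVER_SEEN for page in links}
    state[start] = LinkState.PENDING
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        state[page] = LinkState.VISITED
        order.append(page)
        for target in links[page]:
            if state[target] is not LinkState.NEVER_SEEN:
                continue  # already categorized
            if is_interesting(target):
                state[target] = LinkState.PENDING
                queue.append(target)
            else:
                state[target] = LinkState.UNINTERESTING
    return order, state


links = {
    "home": ["ir", "ads"],
    "ir": ["recall", "precision"],
    "ads": ["spam"],
    "recall": [],
    "precision": [],
    "spam": [],
}
order, state = breadth_first_browse(links, "home", lambda page: page != "ads")
print(order)               # ['home', 'ir', 'recall', 'precision']
print(state["ads"].value)  # uninteresting
print(state["spam"].value) # never seen
```

Pages in the pending state are the ones a browser could prefetch while the reader is still on the current page.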

Subtopic: relevance as helpful for need

Quote: relevance is being potentially helpful to a user in the resolution of a need [»greeR9_1995]

Subtopic: relevance to query

Quote: relevant nodes for a search can be saved in a private document for later recovery [»walkJH11_1987]
Quote: iterative search by identifying relevant and irrelevant documents; augment query with keywords from relevant documents [»saltG4_1970]
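
The iterative search in the last quote can be sketched as simple query expansion. The scoring here is a deliberately simplified assumption: a term gains one point for each relevant document that contains it and loses one for each irrelevant document. It is not the weighting used in the cited work, and the sample documents are made up.

```python
from collections import Counter


def expand_query(query, relevant_docs, irrelevant_docs, k=2):
    """Add the k terms that best separate relevant from irrelevant documents."""
    score = Counter()
    for doc in relevant_docs:
        score.update(set(doc.lower().split()))
    for doc in irrelevant_docs:
        score.subtract(set(doc.lower().split()))
    known = set(query)
    candidates = [(count, term) for term, count in score.items()
                  if count > 0 and term not in known]
    # Highest score first; ties are broken alphabetically so the result is stable.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return list(query) + [term for _, term in candidates[:k]]


query = ["jaguar"]
relevant_docs = ["jaguar habitat rainforest", "jaguar prey rainforest cat"]
irrelevant_docs = ["jaguar car dealer", "jaguar car engine"]
expanded = expand_query(query, relevant_docs, irrelevant_docs)
print(expanded)  # ['jaguar', 'rainforest', 'cat']
```

Each round, the user marks more documents as relevant or irrelevant, and the expanded query is run again.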

Subtopic: word senses and relevance

Quote: query word sense mismatches are far more likely to appear in nonrelevant documents than for relevant documents [»krovR4_1992]

Subtopic: reachability

Quote: the best link analysis algorithm ranks nodes by their reachability; BFS combines InDegree with the Hits algorithm; 44% highly relevant [»boroA2_2005]
Quote: tightly-knit communities, cycles, and isolated components are generally irrelevant; reachability and high in-degree are better measures of relevance; BFS/InDegree better than Hits/PageRank [»boroA2_2005]
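
To make the two measures concrete, the sketch below counts in-degree and a simple form of reachability: the number of distinct nodes that can reach a given node within a few links. This is an illustration under stated assumptions, not the BFS algorithm evaluated in [»boroA2_2005], whose weighting is not given in these notes. The graph and the `max_hops` parameter are hypothetical.

```python
from collections import deque


def in_degree(links):
    """Count incoming links for every node."""
    counts = {node: 0 for node in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def reach_count(links, node, max_hops=2):
    """Count distinct nodes that can reach `node` within max_hops links."""
    incoming = {}
    for source, targets in links.items():
        for target in targets:
            incoming.setdefault(target, set()).add(source)
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        current, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not look further back than max_hops
        for source in incoming.get(current, ()):
            if source not in seen:
                seen.add(source)
                frontier.append((source, hops + 1))
    return len(seen) - 1  # exclude the node itself


links = {
    "a": ["hub"], "b": ["hub"], "c": ["a"], "d": ["c"], "hub": [],
    "x": ["y"], "y": ["x"],   # a tightly-knit pair that only cites itself
}
print(in_degree(links)["hub"])   # 2
print(reach_count(links, "hub")) # 3
print(reach_count(links, "x"))   # 1
```

The isolated pair `x` and `y` each have an incoming link, but almost nothing reaches them, so they score low on reachability.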

Subtopic: proximity measure

Quote: use fractional distance metrics to preserve proximity in high dimensional spaces
Quote: survey on querying multimedia databases; similarity search via feature transformation into a high dimensional space [»bohmC9_2001]
Quote: curse of dimensionality; space behaves differently above 10-d; volume and area depend exponentially on dimension [»bohmC9_2001]
Quote: an index partition in high-dimensional space spans most of the space in most dimensions, from border to border; coarse partitioning [»bohmC9_2001]
Quote: reasonably selective range and nearest-neighbor queries have a huge extension in each dimension [»bohmC9_2001]
Quote: in high dimensions, need a large search radius; better to use brute force [»chavE9_2001]
Quote: use equivalence relations, pivots, and compact partitions to index metric spaces for proximity queries; pivots are best if memory is available, compact partitioning is best in high dimensions [»chavE9_2001]
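
The pivot idea in the last quote can be sketched as follows. Distances from every point to a few pivots are stored in advance, and the triangle inequality then rules out many points without computing their real distance to the query. This is a minimal sketch assuming Euclidean distance; the class name, the points, and the choice of pivots are hypothetical. The distance function must satisfy the triangle inequality for the filter to be correct.

```python
import math


class PivotIndex:
    def __init__(self, points, pivots, distance=math.dist):
        self.points = points
        self.pivots = pivots
        self.distance = distance
        # Precompute the distance from every point to every pivot.
        self.table = [[distance(p, v) for v in pivots] for p in points]

    def range_query(self, query, radius):
        """Return (matches, number of distances computed to data points)."""
        query_to_pivot = [self.distance(query, v) for v in self.pivots]
        matches, computed = [], 0
        for point, row in zip(self.points, self.table):
            # Triangle inequality: |d(q, v) - d(p, v)| <= d(q, p) for any pivot v.
            lower_bound = max(abs(qv - pv)
                              for qv, pv in zip(query_to_pivot, row))
            if lower_bound > radius:
                continue  # cannot be within the radius; skip the real distance
            computed += 1
            if self.distance(query, point) <= radius:
                matches.append(point)
        return matches, computed


points = [(0, 0), (1, 0), (5, 5), (9, 9), (10, 10)]
index = PivotIndex(points, pivots=[(0, 0), (10, 10)])
matches, computed = index.range_query((1, 1), radius=1.5)
print(matches)   # [(0, 0), (1, 0)]
print(computed)  # 2
```

Only two of the five points needed a real distance computation. The precomputed table costs memory in proportion to the number of points times the number of pivots, which is the memory cost the quote refers to.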

Subtopic: relevance is subjective

Quote: only the user can judge what is relevant to his or her own need [»greeR9_1995]
Quote: no automated routine can distinguish useless from relevant communications; especially when a user's tasks and interests change [»hiltSR7_1985]
Quote: if two scientists judge relevance of a document, only 60% agreement [»clevC4_1984]
Quote: "important" is subjective and a continuous scale; different things are important to different people in different contexts at different times [»kentW1_1985]
Quote: approximately half of the twice-indexed abstracts were assigned to different headings [»sievMC1_1991]
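
Agreement figures such as the 60% above are simple to compute once two judges have rated the same documents. The sketch below uses made-up judgments, chosen so that the result happens to match that figure; it does not reproduce any data from the cited studies.

```python
def percent_agreement(judge_a, judge_b):
    """Fraction of documents on which two judges give the same verdict."""
    if len(judge_a) != len(judge_b):
        raise ValueError("both judges must rate the same documents")
    if not judge_a:
        raise ValueError("no judgments to compare")
    same = sum(a == b for a, b in zip(judge_a, judge_b))
    return same / len(judge_a)


# Hypothetical relevance judgments for five documents (True = relevant).
judge_a = [True, True, False, True, False]
judge_b = [True, False, False, True, True]
agreement = percent_agreement(judge_a, judge_b)
print(agreement)  # 0.6
```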

Subtopic: problems with relevance

Quote: 5% of Excite queries used 'More Like This'; traditional IR searching uses relevance feedback more [»jansBJ1_1998]
Quote: categories and indices decay; over time, they decline in relevance; e.g., Dewey Decimal system [»nelsTH_1967]

Related Topics

Topic: access by pattern matching (18 items)
Topic: full-text indexing (37 items)
Topic: information retrieval by searching (35 items)
Topic: information retrieval with queries (18 items)
Topic: problem of classifying information (42 items)
Topic: searching the Web (53 items)
Topic: using keywords to search hypertext (26 items)
ThesaHelp: how to find everything relevant to some topic or question
ThesaHelp: why does directed search succeed for Thesa