
Topic: problem of classifying information





Summary

Categorization is surprisingly difficult. Categories are fluid, like a natural language. The same items may be classified differently by different people. Items may be assigned to multiple categories. The concepts of category, relation, attribute, and identity are intertwined. (cbb 11/07)
Subtopic: categories are fluid

Quote: the boundaries and extent of a classification can be arbitrary [»kentW_1978]
Quote: there is not a natural set of categories; a system defines its categories [»kentW_1978]
Quote: an exhaustive classification is impossible; always new, unanticipated concepts [»sowaJF_1984]
Quote: the things called 'games' form a family, and their similarities are family resemblances [»wittL_1958a]
Quote: species cannot be classified into circles or prearranged groups
Quote: the antinomies of formal logic arise because classifications must be modified for unforeseen objects [»poinH_1908, OK]
Quote: with cancellation, Clyde the elephant could be just about anything, including a giraffe; the Elephant node is just a collection of properties [»bracRJ10_1983]
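The cancellation point in the last quote can be illustrated with a toy inheritance network. This is a minimal sketch, not Brachman's formalism: the `Node` class, its `lookup` method, and every property name below are hypothetical choices made for illustration.

```python
class Node:
    """A node in a toy inheritance network (hypothetical, for illustration)."""

    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.properties = properties

    def lookup(self, key):
        # A locally stored value overrides (cancels) any inherited default.
        node = self
        while node is not None:
            if key in node.properties:
                return node.properties[key]
            node = node.parent
        return None


elephant = Node("Elephant", color="gray", legs=4, has_trunk=True, neck="short")
giraffe = Node("Giraffe", color="spotted", legs=4, has_trunk=False, neck="long")

# Clyde is linked to Elephant, yet cancels every default that differs from Giraffe.
clyde = Node("Clyde", parent=elephant, color="spotted", has_trunk=False, neck="long")

keys = ["color", "legs", "has_trunk", "neck"]
clyde_view = {k: clyde.lookup(k) for k in keys}
giraffe_view = {k: giraffe.lookup(k) for k in keys}
print(clyde_view == giraffe_view)
```

With unrestricted cancellation, the link from Clyde to Elephant constrains nothing: Clyde's visible properties match the giraffe's, so the Elephant node acts only as a bundle of overridable defaults.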

Subtopic: category as communication

Quote: author and reader of hierarchies often have different interpretations; e.g., in videotext, nightlife decomposed into entertainment, attractions, etc. [»shasD12_1987]
Quote: information retrieval is a process of communication between inquirers and indexers; i.e., a problem of language and meaning, or use [»blaiDC_1990]
Quote: words are ambiguous because we must use a finite set to communicate a non-denumerable set of concepts [»kentW_1978]
Quote: in ordinary communication, the ability to stretch and modify meanings is essential [»kentW_1978]

Subtopic: categories and language

Quote: we cannot intelligibly say that conceptual schemes are different or the same; nor can we distinguish conceptual schemes as a partial or total failure of translation [»daviD11_1974]
Quote: all intelligence must have a subcognitive substrate to compose categories; otherwise need to define all possible attributes of each category in every context [»frenRM1_1990]
Quote: the same words may have different meanings when describing a class; a serious difficulty for recognition programs [»bongM_1967]
Quote: use Bongard patterns to test more complex recognition programs
Quote: Hopi call insect, airplane and aviator the same; but Eskimos have many words for snow [»kentW_1978]
Quote: earliest reference to Eskimos and snow; four unrelated words for snow on the ground, snow drift, falling snow, and drifting snow [»martL6_1986]
Quote: Eskimo uses two roots for snow itself; qanik for snow in the air and aput for snow on the ground [»martL6_1986]
Quote: only an appeal to semantics can resolve the syntactic ambiguity of 'time flies like an arrow'; three acceptable structures [»oettAG_1972]

Subtopic: inconsistent classification

Quote: cardiologists were inconsistent in diagnosing men with chest pain [»funkM4_1983, OK]
Quote: measured indexing consistency in MEDLINE: 75% for checktags to 34% for main headings with subheadings [»funkM4_1983]
Quote: any set of ordinary English thesauri show how differently the same vocabulary can be treated
Quote: a retrieval thesaurus, like any thesaurus, is somewhat arbitrary
Quote: index consistency highest for checktags, a small set of basic-level categories preprinted on the indexing form [»funkM4_1983]

Subtopic: multiple classifications

Quote: Agenda items tend to be filed in only a few places; why filing cabinets work acceptably [»kaplSJ7_1990]
Quote: a manual may be a 'book' in one library, but not a book in another (because it has a soft cover) [»kentW_1978]
Quote: a person on leave may be an employee for benefits but not for payroll [»kentW_1978]
Quote: phenomena can be taxonomized in an infinite number of ways at different hierarchical levels; interesting generalizations for each one
Quote: to describe data, need to account for an entity's membership in innumerable sets [»joneTC4_1979b]
QuoteRef: bateG_1972 ;;34 Macbeth III.i "Ay--in the catalogue ye go for men; as hounds and greyhounds, mongrels, spaniels, curs, shoughs, water-rugs and demi-wolves are clept all by the name of dogs"
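The person-on-leave quote above can be sketched as one entity tested against purpose-specific membership rules. This is only an illustration under assumptions: the `Person` fields, the `EMPLOYEE_RULES` table, and both rules are hypothetical; the source says only that the two answers differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    has_contract: bool
    on_leave: bool


# Each purpose supplies its own membership rule for the category "employee".
# Both rules are hypothetical stand-ins for whatever a real system defines.
EMPLOYEE_RULES = {
    "benefits": lambda p: p.has_contract,
    "payroll": lambda p: p.has_contract and not p.on_leave,
}


def employee_for(person):
    """Return the set of purposes for which `person` counts as an employee."""
    return {purpose for purpose, rule in EMPLOYEE_RULES.items() if rule(person)}


pat = Person("Pat", has_contract=True, on_leave=True)
print(sorted(employee_for(pat)))
```

Here "employee" is not a single set with fixed members; each system that uses the word defines its own boundary.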

Subtopic: multiple relationships

Quote: a language is things related to one another in many different ways; no one thing in common, e.g., language games [»wittL_1958a]
Quote: while dogs inherit characteristics from mammals, the definition of mammals is partly from dogs [»coxBJ7_1983]

Subtopic: categories vs. attributes vs. relationships vs. individuals

Quote: the same piece of information can be a category, an attribute, or a relationship; e.g., parents and children [»kentW_1978]
Quote: statements about individuals and classes are different logical types; hard to predict one from the other [»bateG_1979]
Quote: the is-a relation does not hold for particular instances of squares and rectangles; e.g. a rectangle of height 5 and width 20 [»winkJF8_1992]
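One common reading of the squares-and-rectangles quote can be sketched with mutable objects. This is an assumption about the argument, not a reproduction of the cited paper: the classes and the `resize` and `is_valid` methods are hypothetical.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def resize(self, width, height):
        # Any rectangle may take any width and height.
        self.width = width
        self.height = height


class Square(Rectangle):
    """Declared as a kind of Rectangle, so it inherits resize()."""

    def __init__(self, side):
        super().__init__(side, side)

    def is_valid(self):
        # A square must keep its sides equal.
        return self.width == self.height


shape = Square(5)
shape.resize(20, 5)  # legal for any Rectangle: width 20, height 5
print(isinstance(shape, Rectangle), shape.is_valid())
```

As sets, every square is a rectangle. As individual, changeable instances, an object declared a Square can be driven into a state (width 20, height 5) that is a perfectly good rectangle but no longer a square.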

Subtopic: data type as classification

Quote: a strongly-typed language such as C++ distorts the original, intuitive class structure [»wolfW9_1989]
Quote: a weakly-typed language such as Flavors allows an intuitive class structure [»wolfW9_1989]

Subtopic: elaborate systems do not work

Quote: users failed at establishing elaborate filing schemes for archived information; too much time and effort [»barrD7_1995]

Subtopic: dimensionality

Quote: curse of dimensionality; space behaves differently above 10-d; volume and area depend exponentially on dimension [»bohmC9_2001]
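The exponential dependence of volume on dimension can be checked numerically. The sketch below uses the standard formula for the volume of a d-dimensional ball, V = pi^(d/2) / Gamma(d/2 + 1) * r^d; the function name `ball_to_cube_ratio` and the choice of the ball inscribed in the unit cube are illustrative assumptions, not taken from the cited paper.

```python
import math


def ball_to_cube_ratio(d):
    """Volume of the ball inscribed in the unit cube, divided by the cube's volume (1)."""
    r = 0.5
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d


for d in (2, 3, 10, 20):
    print(d, f"{ball_to_cube_ratio(d):.2e}")
```

In 2-d the inscribed disk covers most of the square, but by 10-d the inscribed ball holds well under 1% of the cube's volume: almost all of the space lies in the corners, which is one way that space behaves differently in high dimensions.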

Subtopic: multi-database and federated database

Quote: classified the types of structural and representational discrepancies in multidatabase systems; basis for UniSQL/M [»kimW12_1991]

Subtopic: automatic classification does not work

Quote: no automated routine can distinguish useless from relevant communications; especially when a user's tasks and interests change [»hiltSR7_1985]
Quote: Andrew's Advisor system first tried automatic filing of messages by subject; 50% misclassification, too cluttered [»boreNS9_1988]


Related Topics

Group: database model   (15 topics, 316 quotes)
Group: information retrieval   (25 topics, 674 quotes)

Topic: abstraction by common attributes (19 items)
Topic: classification (65 items)
Topic: definition by example (26 items)
Topic: hypertext nodes (19 items)
Topic: fundamental concepts such as type, attributes, relationships are all the same (37 items)
Topic: information retrieval by relevance (33 items)
Topic: information retrieval by topic (16 items)
Topic: limitations of artificial intelligence and cognitive science (64 items)
Topic: limitations of hierarchical structures (10 items)
Topic: loosely structured data (20 items)
Topic: manual indexing (19 items)
Topic: non-hierarchical classification and multiple classification (16 items)
Topic: people better than computers (35 items)
Topic: personal information (41 items)
Topic: problem of assigning names (25 items)
Topic: problems with information retrieval (51 items)
Topic: problems with type inheritance (20 items)
Topic: taxonomy (16 items)


Updated barberCB 6/05
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.