Group: database model
Group: information retrieval
Topic: abstraction by common attributes
Topic: classification
Topic: definition by example
Topic: hypertext nodes
Topic: fundamental concepts such as type, attributes, relationships are all the same
Topic: information retrieval by relevance
Topic: information retrieval by topic
Topic: limitations of artificial intelligence and cognitive science
Topic: limitations of hierarchical structures
Topic: loosely structured data
Topic: manual indexing
Topic: non-hierarchical classification and multiple classification
Topic: people better than computers
Topic: personal information
Topic: problem of assigning names
Topic: problems with information retrieval
Topic: problems with type inheritance
Topic: taxonomy
Summary
Categorization is surprisingly difficult. Categories are fluid, like a natural language. The same items may be classified differently by different people. Items may be assigned to multiple categories. The concepts of category, relation, attribute, and identity are intertwined. (cbb 11/07)
Subtopic: categories are fluid
Quote: the boundaries and extent of a classification can be arbitrary [»kentW_1978]
| Quote: there is not a natural set of categories; a system defines its categories [»kentW_1978]
| Quote: an exhaustive classification is impossible; always new, unanticipated concepts [»sowaJF_1984]
| Quote: the things called 'games' form a family, and their similarities are family resemblances [»wittL_1958a]
| Quote: species cannot be classified into circles or prearranged groups
| Quote: the antinomies of formal logic arise because classifications must be modified for unforeseen objects [»poinH_1908, OK]
| Quote: with cancellation, Clyde the elephant could be just about anything, including a giraffe; the Elephant node is just a collection of properties [»bracRJ10_1983]
| Subtopic: category as communication
Quote: author and reader of hierarchies often have different interpretations; e.g., in videotext, nightlife decomposed into entertainment, attractions, etc. [»shasD12_1987]
| Quote: information retrieval is a process of communication between inquirers and indexers; i.e., a problem of language and meaning, or use [»blaiDC_1990]
| Quote: words are ambiguous because we must use a finite set to communicate a non-denumerable set of concepts [»kentW_1978]
| Quote: in ordinary communication, the ability to stretch and modify meanings is essential [»kentW_1978]
| Subtopic: categories and language
Quote: we cannot intelligibly say that conceptual schemes are different or the same; nor can we distinguish conceptual schemes as a partial or total failure of translation [»daviD11_1974]
| Quote: all intelligence must have a subcognitive substrate to compose categories; otherwise need to define all possible attributes of each category in every context [»frenRM1_1990]
| Quote: the same words may have different meanings when describing a class; a serious difficulty for recognition programs [»bongM_1967]
| Quote: use Bongard patterns to test more complex recognition programs
| Quote: Hopi call insect, airplane and aviator the same; but Eskimos have many words for snow [»kentW_1978]
| Quote: earliest reference to Eskimos and snow; four unrelated words for snow on the ground, snow drift, falling snow, and drifting snow [»martL6_1986]
| Quote: Eskimo uses two roots for snow itself; qanik for snow in the air and aput for snow on the ground [»martL6_1986]
| Quote: only an appeal to semantics can resolve the syntactic ambiguity of 'time flies like an arrow'; three acceptable structures [»oettAG_1972]
| Subtopic: inconsistent classification
Quote: cardiologists were inconsistent in diagnosing men with chest pain [»funkM4_1983, OK]
| Quote: measured indexing consistency in MEDLINE: 75% for checktags to 34% for main headings with subheadings [»funkM4_1983]
| Quote: any set of ordinary English thesauri shows how differently the same vocabulary can be treated
| Quote: a retrieval thesaurus, like any thesaurus, is somewhat arbitrary
| Quote: index consistency highest for checktags, a small set of basic-level categories preprinted on the indexing form [»funkM4_1983]
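Indexing consistency can be expressed as the overlap between the term sets that two indexers assign to the same document. The sketch below assumes a simple overlap ratio (shared terms divided by all distinct terms used by either indexer); the notes above do not say which formula the cited study used, so this is an illustration only. The function name and the sample terms are hypothetical.

```python
def indexing_consistency(terms_a: set[str], terms_b: set[str]) -> float:
    """Overlap ratio between two indexers' term sets (assumed measure)."""
    shared = len(terms_a & terms_b)   # terms both indexers assigned
    total = len(terms_a | terms_b)    # all distinct terms assigned by either
    # Two empty term sets are treated as fully consistent.
    return shared / total if total else 1.0


# Hypothetical terms assigned to the same article by two indexers.
indexer_1 = {"chest pain", "male", "diagnosis"}
indexer_2 = {"chest pain", "male", "cardiology"}

print(indexing_consistency(indexer_1, indexer_2))  # 2 shared of 4 distinct -> 0.5
```

A small, fixed vocabulary (such as preprinted checktags) leaves fewer ways to disagree, which is one plausible reason such a ratio would come out higher for it than for open-ended headings.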
| Subtopic: multiple classifications
Quote: Agenda items tend to be filed in only a few places; why filing cabinets work acceptably [»kaplSJ7_1990]
| Quote: a manual may be a 'book' in one library, but not a book in another (because it has a soft cover) [»kentW_1978]
| Quote: a person on leave may be an employee for benefits but not for payroll [»kentW_1978]
| Quote: phenomena can be taxonomized in an infinite number of ways at different hierarchical levels; interesting generalizations for each one
| Quote: to describe data, need to account for an entity's membership in innumerable sets [»joneTC4_1979b]
| QuoteRef: bateG_1972 ;;34 Macbeth III.1 "Ay--in the catalogue ye go for men; as hounds and greyhounds, mongrels, spaniels, curs, shoughs, water-rugs and demi-wolves are clept all by the name of dogs"
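The employee-on-leave example suggests that category membership depends on the context asking the question. A minimal sketch, assuming membership is stored per context instead of as a single fixed type; the class, method, and context names are hypothetical and not taken from the cited sources.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    # Maps a context (e.g. "benefits") to the categories that apply in it.
    categories: dict[str, set[str]] = field(default_factory=dict)

    def classify(self, context: str, category: str) -> None:
        """Record that this person belongs to `category` within `context`."""
        self.categories.setdefault(context, set()).add(category)

    def is_a(self, category: str, context: str) -> bool:
        """Membership is only meaningful relative to a context."""
        return category in self.categories.get(context, set())


# Hypothetical person on leave: an employee for benefits, not for payroll.
pat = Person("Pat")
pat.classify("benefits", "employee")

print(pat.is_a("employee", "benefits"))  # True
print(pat.is_a("employee", "payroll"))   # False
```

The design choice is that no single answer to "is Pat an employee?" exists; each system that asks supplies its own context.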
| Subtopic: multiple relationships
Quote: a language is things related to one another in many different ways; no one thing in common, e.g., language games [»wittL_1958a]
| Quote: while dogs inherit characteristics from mammals, the definition of mammals is partly from dogs [»coxBJ7_1983]
| Subtopic: categories vs. attributes vs. relationships vs. individuals
Quote: the same piece of information can be a category, an attribute, or a relationship; e.g., parents and children [»kentW_1978]
| Quote: statements about individuals and classes are different logical types; hard to predict one from the other [»bateG_1979]
| Quote: the is-a relation does not hold for particular instances of squares and rectangles; e.g. a rectangle of height 5 and width 20 [»winkJF8_1992]
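One reading of the square and rectangle example is that "square" behaves more like an attribute of a particular rectangle than like a subclass. The sketch below assumes that reading: it models only rectangles and computes squareness from the instance's dimensions. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    height: float
    width: float

    @property
    def is_square(self) -> bool:
        # "Square" is derived from the attributes of this instance,
        # not declared by a separate Square subclass.
        return self.height == self.width


r = Rectangle(height=5, width=20)  # the instance from the quote
s = Rectangle(height=5, width=5)

print(r.is_square)  # False
print(s.is_square)  # True
```

Here the same piece of information (being a square) appears as an attribute, where a type hierarchy would have made it a category.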
| Subtopic: data type as classification
Quote: a strongly-typed language such as C++ distorts the original, intuitive class structure [»wolfW9_1989]
| Quote: a weakly-typed language such as Flavors allows an intuitive class structure [»wolfW9_1989]
| Subtopic: elaborate systems do not work
Quote: users failed at establishing elaborate filing schemes for archived information; too much time and effort [»barrD7_1995]
| Subtopic: dimensionality
Quote: curse of dimensionality; space behaves differently above 10-d; volume and area depend exponentially on dimension [»bohmC9_2001]
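The exponential dependence of volume on dimension can be illustrated with a standard result: the fraction of the cube [-1, 1]^d occupied by its inscribed unit ball. This is a general illustration, not the analysis from the cited source; the function name is hypothetical.

```python
import math


def ball_to_cube_ratio(d: int) -> float:
    """Fraction of the cube [-1, 1]^d filled by the inscribed unit ball."""
    # Volume of the d-dimensional unit ball: pi^(d/2) / Gamma(d/2 + 1).
    ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    # Volume of the enclosing cube with side length 2.
    cube_volume = 2.0 ** d
    return ball_volume / cube_volume


for d in (1, 2, 3, 5, 10, 20):
    print(f"d={d:2d}  ratio={ball_to_cube_ratio(d):.3e}")
```

The ratio starts at 1 for d=1 and falls rapidly, so in high dimensions almost all of the cube's volume lies in its corners, far from the center.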
| Subtopic: multi-database and federated database
Quote: classified the types of structural and representational discrepancies in multidatabase systems; basis for UniSQL/M [»kimW12_1991]
| Subtopic: automatic classification does not work
Quote: no automated routine can distinguish useless from relevant communications; especially when a user's tasks and interests change [»hiltSR7_1985]
| Quote: Andrew's Advisor system first tried automatic filing of messages by subject; 50% misclassification, too cluttered [»boreNS9_1988]
Related Topics
Group: database model (15 topics, 316 quotes)
Group: information retrieval (25 topics, 674 quotes)
Topic: abstraction by common attributes (19 items)
Topic: classification (65 items)
Topic: definition by example (26 items)
Topic: hypertext nodes (19 items)
Topic: fundamental concepts such as type, attributes, relationships are all the same (37 items)
Topic: information retrieval by relevance (33 items)
Topic: information retrieval by topic (16 items)
Topic: limitations of artificial intelligence and cognitive science (64 items)
Topic: limitations of hierarchical structures (10 items)
Topic: loosely structured data (20 items)
Topic: manual indexing (19 items)
Topic: non-hierarchical classification and multiple classification (16 items)
Topic: people better than computers (35 items)
Topic: personal information (41 items)
Topic: problem of assigning names (25 items)
Topic: problems with information retrieval (51 items)
Topic: problems with type inheritance (20 items)
Topic: taxonomy (16 items)