abstract ;;Quote: to design a database, should focus on facts to be maintained by the database
|
102 ;;Quote: first identify facts to be maintained and then aggregate facts into data records
|
102 ;;Quote: all facts are connections between things instead of attributes or relationships
|
103 ;;Quote: facts can connect any number of things, e.g., John bought a certain computer at Sears
|
103 ;;Quote: need to know what a fact connects and why
|
103 ;;Quote: if a fact connects several things of one type, need to know their roles
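
A minimal sketch of the three notes above, assuming a fact is stored as a relationship name (the "why") plus a set of (role, thing) pairs (the "what"). The `Fact` class, the role names, and the person `Mary` are hypothetical illustrations, not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One fact: a named connection among any number of things."""
    relationship: str   # why the things are connected
    roles: tuple        # (role name, thing) pairs: what is connected

    def thing(self, role):
        """Return the thing that plays the given role in this fact."""
        return dict(self.roles)[role]


# A fact may connect more than two things.
purchase = Fact(
    "bought",
    (("buyer", "John"), ("item", "a certain computer"), ("store", "Sears")),
)

# Two things of one type (people) are told apart only by their roles.
supervision = Fact(
    "supervises",
    (("supervisor", "Mary"), ("subordinate", "John")),
)
```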
|
104 ;;Quote: an entity is anything which a noun or noun phrase can reference
|
104 ;;Quote: a subtype inherits the properties and facts of its parent type
|
104 ;;Quote: a relationship type (a fact) has a name and a fixed number of things
|
104 ;;Quote: a relationship can be one-to-one, one-to-many, or many-to-many
|
106 ;;Quote: an ideal representation gives every entity a unique symbol that acts as its surrogate
|
106 ;;Quote: a symbol or representation is any character string that stands for something, e.g., 'six feet'
|
106 ;;Quote: most entities do not have unique names, need a descriptive phrase such as employee 999999
|
106 ;;Quote: a number represents an entity if we know the symbol type, i.e., naming universe
|
107 ;;Quote: a symbol type is the lexical constraints on allowable strings
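
One way to picture a symbol type as lexical constraints is a regular expression over allowable strings. The sketch below assumes employee numbers are exactly six digits, which is only suggested by the 'employee 999999' example; the names `EMPLOYEE_NUMBER` and `is_symbol_of` are hypothetical.

```python
import re

# Assumed lexical constraint: an employee number is exactly six digits.
EMPLOYEE_NUMBER = re.compile(r"\d{6}")


def is_symbol_of(symbol_type, text):
    """True if the whole string satisfies the symbol type's constraints."""
    return symbol_type.fullmatch(text) is not None
```

Under this reading, `is_symbol_of(EMPLOYEE_NUMBER, "999999")` holds, while the string 'six feet' is a symbol of some other type.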
|
108 ;;Quote: some entities have no representation of their own; it is derived from another representation or a related entity
|
108 ;;Quote: some entities are only known via a related entity, e.g., the 1980 election
|
108 ;;Quote: representation of some entities derived from context, e.g., the name of the home team
|
109 ;;Quote: refer to most things by a combination of pointing, noun phrases, and context
|
109 ;;Quote: a representation is complete if there is one for every member of a type; otherwise representations are facts
|
109 ;;Quote: nonunique representations commonly occur in data; discriminated by local conventions
|
109 ;;Quote: uniqueness is required when an agent can't otherwise distinguish objects
|
109 ;;Quote: multiple representations for an entity cause problems of existence testing and multiple participation
|
109+ ;;Quote: a singular representation avoids multiple representations for an entity
|
110 ;;Quote: a representation, such as names, should be stable due to cost of renaming
|
110 ;;Quote: LP is 1 if complete else 0. MP is 1 if singular, else n. e.g., 1*n for 'at least one'
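
The rule above can be sketched as a small function. This assumes LP and MP stand for least and most participation (consistent with the later notes on page 112), and uses the string "n" for an unbounded maximum; the function names are hypothetical.

```python
def participation(complete, singular):
    """Return (LP, MP) for a representation of an entity type."""
    lp = 1 if complete else 0      # complete: every entity has at least one
    mp = 1 if singular else "n"    # singular: no entity has more than one
    return lp, mp


def notation(lp, mp):
    """Format the pair in the LP*MP style used in the note."""
    return f"{lp}*{mp}"


# A complete but non-singular representation: 'at least one'.
print(notation(*participation(complete=True, singular=False)))  # prints 1*n
```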
|
111 ;;Quote: records cannot simply correspond to major entities; problems include what the entities are, drifting assignment, and relationships vs. attributes
|
111 ;;Quote: lengths, colors etc. can occur in a database without existence being announced
|
111 ;;Quote: need database mechanism to announce the existence of an entity
|
111+ ;;Quote: database entities that exist will have single-valued facts associated with them
|
112 ;;Quote: for each fact specify an identifier, a relationship name, entity type for each role (if necessary), least and most participations
|
113 ;;Quote: a relationship needs a name and, for each role, a name, a type, and least/most participation
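
A fact specification of this shape might be recorded as follows. This is a sketch under stated assumptions: `RoleSpec`, `FactSpec`, `over_participating`, the "works in" example, and the use of `None` for an unbounded most participation are all hypothetical, not the source's notation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleSpec:
    name: str
    entity_type: str
    least: int             # least participation: 0 or 1
    most: Optional[int]    # most participation: 1, or None for "n"


@dataclass(frozen=True)
class FactSpec:
    identifier: str
    relationship: str
    roles: tuple           # one RoleSpec per role


def over_participating(spec, role_name, instances):
    """Entities that fill a role more often than its most participation."""
    role = next(r for r in spec.roles if r.name == role_name)
    if role.most is None:
        return []
    counts = Counter(inst[role_name] for inst in instances)
    return sorted(e for e, c in counts.items() if c > role.most)


# Hypothetical fact: each employee works in exactly one department.
works_in = FactSpec(
    "F1",
    "works in",
    (
        RoleSpec("employee", "Employee", least=1, most=1),
        RoleSpec("department", "Department", least=0, most=None),
    ),
)
```

Checking least participation would also need the full population of each entity type, so it is left out of this sketch.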
|
113 ;;Quote: an entity type has a name and a super type or symbol type
|
113 ;;Quote: a pseudo-record contains one field for each role in a fact
|
113 ;;Quote: in a pseudo-record, actual entities (e.g., employees) sit in fields of the record
|
113 ;;Quote: initially, pseudo-records do not contain digital data and there is one record for each attribute of an entity
|
114 ;;Quote: employee numbers are good representations for entities
|
114 ;;Quote: a pseudo-key is a set of fields that uniquely identifies a record
|
114 ;;Quote: merge pseudo-records when they have compatible keys
|
114 ;;Quote: if a pseudo-record key allows zero participation, it can merge by padding with null values
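
The merge step above can be sketched as follows, assuming each pseudo-record type is a mapping from a single-field key (an employee) to one other field, and that `None` plays the part of the null value. The function name and the sample data are hypothetical, and compound keys are not handled.

```python
def merge_on_key(tables):
    """Merge pseudo-records that share a key into one record per key.

    `tables` maps a field name to a {key: value} dict. A key missing
    from a table (zero participation) is padded with None.
    """
    keys = set()
    for values in tables.values():
        keys.update(values)
    return {
        key: {field: values.get(key) for field, values in tables.items()}
        for key in sorted(keys)
    }


# Two pseudo-record types, both keyed by employee.
tables = {
    "name": {999999: "John"},
    "department": {999999: "Sales", 888888: "Parts"},
}
merged = merge_on_key(tables)
```

Here employee 888888 has no name fact, so the merged record carries a null in that field.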
|
115 ;;Quote: when merging on compound keys, should only merge attributes of the same relationship
|
116 ;;Quote: soft keys can be null; merging soft keys propagates softness to all keys
|
116 ;;Quote: when possible, a data record should aggregate all facts about an entity
|
117 ;;Quote: need to eliminate nonsymbol fields from pseudo-records to make them proper records
|
119 ;;Quote: in fact-based analysis, start with facts instead of entities; do not need to distinguish relationship from attribute
|
119 ;;Quote: instead of normalizing badly assembled records, should construct normalized records directly
|