Topic: translation of data


Data from one location may need translation on arrival at another location. Data formats may differ, as may terminology and organization. Binary formats require exact translation, while XML and other text formats allow partial translation.
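
A rough sketch of the difference between exact and partial translation, in Python. The record layout, the field names, and the `<reading>` element are assumptions made up for this illustration; they do not come from any of the cited systems.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical binary layout: little-endian unsigned 16-bit id, then a 32-bit float.
BINARY_LAYOUT = "<Hf"

def decode_binary(payload: bytes) -> dict:
    # Exact translation: struct.unpack raises struct.error unless the
    # payload has exactly the size the layout demands.
    sensor_id, value = struct.unpack(BINARY_LAYOUT, payload)
    return {"id": sensor_id, "value": value}

def decode_xml(text: str) -> dict:
    # Partial translation: convert the elements we understand, ignore the rest.
    root = ET.fromstring(text)
    known = {}
    for tag, convert in (("id", int), ("value", float)):
        node = root.find(tag)
        if node is not None and node.text is not None:
            known[tag] = convert(node.text)
    return known

binary_record = struct.pack(BINARY_LAYOUT, 7, 21.5)
xml_record = "<reading><id>7</id><value>21.5</value><site>roof</site></reading>"

print(decode_binary(binary_record))
print(decode_xml(xml_record))  # the unknown <site> element is skipped
```

The binary decoder fails on a payload that is even one byte too long, while the XML decoder still recovers the fields it knows when the sender adds new ones.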

The wire format may be self-describing, or sufficiently regular for type inference. Type erasure is the opposite process: it turns typed data into an untyped stream. (cbb 11/07)

Subtopic: data conversion

Quote: for any two data structures there is a largest common substructure, at worst, the bit; useful for communication [»hehnEC3_1983]
Quote: MIT's Common system for data type conversion [»giffDK7_1985, OK]

Subtopic: external data format

Quote: an external data format should be self-describing and preserved under round-tripping; e.g., Lisp S-expressions [»simeJ1_2003]
Quote: Kleisli's data exchange format is CPL values; self-describing, e.g., {| indicates a bag; no fixed schema or type declaration [»wongL9_2000]
Quote: describe data with a declarative language; it is robust, concise, and self-documenting; generates a parser, validator, printer, profiler, formatting tools, query support, etc. [»fishK6_2005]
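
A minimal sketch of a self-describing format that survives round-tripping, using an S-expression-like syntax in Python. The syntax subset (integers, quoted strings, nested lists) and the function names `to_sexpr` and `from_sexpr` are assumptions for illustration; this is not the Lisp reader or any cited system's format.

```python
def to_sexpr(value) -> str:
    # Lists become parenthesized forms, strings are quoted, integers are bare,
    # so the text itself says which kind of value each item is.
    if isinstance(value, list):
        return "(" + " ".join(to_sexpr(v) for v in value) + ")"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    raise TypeError(f"unsupported value: {value!r}")

def from_sexpr(text: str):
    value, pos = _parse(text, 0)
    if text[pos:].strip():
        raise ValueError("trailing input")
    return value

def _skip_space(text, pos):
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def _parse(text, pos):
    pos = _skip_space(text, pos)
    if pos >= len(text):
        raise ValueError("unexpected end of input")
    if text[pos] == "(":
        items, pos = [], pos + 1
        while True:
            pos = _skip_space(text, pos)
            if pos >= len(text):
                raise ValueError("missing ')'")
            if text[pos] == ")":
                return items, pos + 1
            item, pos = _parse(text, pos)
            items.append(item)
    if text[pos] == '"':
        chars, pos = [], pos + 1
        while pos < len(text) and text[pos] != '"':
            if text[pos] == "\\":
                pos += 1  # take the escaped character literally
                if pos >= len(text):
                    break
            chars.append(text[pos])
            pos += 1
        if pos >= len(text):
            raise ValueError("unterminated string")
        return "".join(chars), pos + 1
    end = pos
    while end < len(text) and not text[end].isspace() and text[end] not in '()"':
        end += 1
    return int(text[pos:end]), end  # int() raises ValueError on a bad token

data = ["point", 3, -4, ["label", 'say "hi"'], "42", 42]
encoded = to_sexpr(data)
print(encoded)
print(from_sexpr(encoded) == data)
```

Because strings are quoted and integers are not, the string "42" and the integer 42 remain distinct after a round trip, with no schema needed on the receiving side.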

Subtopic: inferred type, self-describing

Quote: Kleisli and its self-describing data exchange use parametric polymorphism and type inference; from functional programming [»wongL9_2000]
Quote: all types may be inferred in CPL; allows any data source to be used as long as the inferred type and actual structure are compatible [»wongL9_2000]
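
A small sketch of inferring a type from the data itself and checking it against what a consumer expects. This is Python, not CPL; the type notation, the function names `infer_type` and `compatible`, and the sample rows are assumptions for illustration, and compatibility is reduced here to plain equality of the inferred and expected types.

```python
def infer_type(value) -> str:
    # Derive a type description purely from the structure of the value.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        fields = ", ".join(
            f"{name}: {infer_type(field)}" for name, field in sorted(value.items())
        )
        return "(" + fields + ")"
    if isinstance(value, list):
        element_types = {infer_type(item) for item in value}
        if not element_types:
            return "[?]"  # an empty list carries no evidence about its elements
        if len(element_types) > 1:
            raise TypeError(f"mixed element types: {sorted(element_types)}")
        return "[" + element_types.pop() + "]"
    raise TypeError(f"unsupported value: {value!r}")

def compatible(value, expected: str) -> bool:
    # Simplification: accept a source only if its inferred type equals the expected one.
    return infer_type(value) == expected

rows = [
    {"gene": "BRCA1", "length": 81189},
    {"gene": "TP53", "length": 19149},
]
print(infer_type(rows))
print(compatible(rows, "[(gene: string, length: int)]"))
```

No schema or type declaration accompanies `rows`; the consumer only needs the inferred type to agree with the structure it plans to use.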

Subtopic: type erasure

Quote: type erasure is the inverse of validation; an untyped value validates to a typed value if and only if the typed value matches the type and erases to the untyped value [»simeJ1_2003]
Quote: XML cannot distinguish integers from strings; XML Schema requires all type extensions before validation [»simeJ1_2003]
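
A toy sketch of validation and erasure as inverse operations, in Python. The two type names, the functions `validate`, `matches`, and `erase`, and the restriction to canonical integer spellings are assumptions made so that erasure exactly undoes validation; real XML Schema validation is far richer.

```python
def validate(untyped: str, type_name: str):
    # Turn untyped text into a typed value, or fail.
    if type_name == "string":
        return untyped
    if type_name == "integer":
        try:
            typed = int(untyped)
        except ValueError:
            raise ValueError(f"{untyped!r} is not an integer") from None
        if str(typed) != untyped:
            # Assumption: reject spellings such as "042" or " 42" so that
            # erasing the result gives back exactly the original text.
            raise ValueError(f"{untyped!r} is not in canonical form")
        return typed
    raise ValueError(f"unknown type: {type_name}")

def matches(typed, type_name: str) -> bool:
    if type_name == "string":
        return isinstance(typed, str)
    if type_name == "integer":
        return isinstance(typed, int) and not isinstance(typed, bool)
    return False

def erase(typed) -> str:
    # Forget the type: only the text remains.
    return str(typed)

# The same untyped text validates against both types ...
as_integer = validate("42", "integer")
as_string = validate("42", "string")
# ... and both typed values erase to identical text.
print(erase(as_integer) == erase(as_string))
```

After erasure the integer 42 and the string "42" are indistinguishable, which is why a type must be supplied again before the text can be validated.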

Subtopic: markup conversion

Quote: built a prototype translator from a standard markup form into an internal format [»mamrSA5_1987]
Quote: little work in automatic translation systems; should support both construction and use of translation software [»mamrSA5_1987]
Quote: define translations in both directions by an attribute grammar [»mamrSA5_1987]
Quote: Generalized Markup Language, GML, does not restrict documents to an application, formatting style, or processing system [»goldCF6_1981]
Quote: MUCH imports SGML documents, builds text blocks, makes nodes of section headers, and links nodes by the table of contents [»radaR3_1993]
Quote: translates from Standard Generalized Markup Language; widely used [»mamrSA5_1987]
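
A much simpler stand-in for the idea of defining a translation once and using it in both directions. This Python sketch is table-driven, not an attribute grammar, and the tag names and the `{BEGIN ...}`/`{END ...}` internal format are invented for illustration.

```python
import re

# Hypothetical pairs: (markup tag, internal command name).
TAG_TABLE = [("title", "TITLE"), ("para", "PARAGRAPH"), ("emph", "ITALIC")]
TO_INTERNAL = dict(TAG_TABLE)
TO_MARKUP = {internal: tag for tag, internal in TAG_TABLE}

def markup_to_internal(text: str) -> str:
    # <tag> becomes {BEGIN NAME}; </tag> becomes {END NAME}.
    def replace(match):
        closing, tag = match.group(1), match.group(2)
        return "{" + ("END " if closing else "BEGIN ") + TO_INTERNAL[tag] + "}"
    return re.sub(r"<(/?)(\w+)>", replace, text)

def internal_to_markup(text: str) -> str:
    # The inverse direction reads the same table backwards.
    def replace(match):
        kind, name = match.group(1), match.group(2)
        return "<" + ("/" if kind == "END" else "") + TO_MARKUP[name] + ">"
    return re.sub(r"\{(BEGIN|END) (\w+)\}", replace, text)

document = "<title>Notes</title><para>Some <emph>marked</emph> text.</para>"
internal = markup_to_internal(document)
print(internal)
print(internal_to_markup(internal) == document)
```

Keeping a single table for both directions means the two translators cannot drift apart; an unknown tag raises `KeyError` rather than being translated silently.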

Subtopic: type hierarchy translation

Quote: automatic translation is practical if groups share a specialization hierarchy; helps preserve the meaning of messages [»leeJ1_1990]
Quote: translate between partially shared object types: automatic adoption, type translation, translation to supertype [»leeJ1_1990]
Quote: want to translate between type hierarchies for semi-structured messages; preserve meaning while allowing autonomy for different groups [»leeJ3_1988]
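
A sketch of translation to a supertype when two groups share only part of a specialization hierarchy. The hierarchy, field names, and the `translate` function are hypothetical; the cited work describes the approach, not this code.

```python
# Hypothetical shared hierarchy: type -> (parent type, fields introduced by the type).
HIERARCHY = {
    "message": (None, ["sender", "subject"]),
    "meeting": ("message", ["time", "place"]),
    "seminar": ("meeting", ["speaker"]),
}

def fields_of(type_name: str) -> list:
    # A type has its ancestors' fields plus its own.
    parent, own = HIERARCHY[type_name]
    return (fields_of(parent) if parent else []) + own

def translate(message: dict, receiver_types: set) -> dict:
    # Walk up the hierarchy until we reach a type the receiver understands.
    type_name = message["type"]
    while type_name is not None and type_name not in receiver_types:
        type_name = HIERARCHY[type_name][0]
    if type_name is None:
        raise LookupError("no shared supertype")
    # Keep only the fields that the supertype defines.
    translated = {"type": type_name}
    for field in fields_of(type_name):
        if field in message:
            translated[field] = message[field]
    return translated

seminar = {
    "type": "seminar", "sender": "lee", "subject": "Type hierarchies",
    "time": "3pm", "place": "Room 12", "speaker": "malone",
}
print(translate(seminar, {"message", "meeting"}))
```

The receiver never adopted the "seminar" type, so the message arrives as a "meeting": the speaker field is lost, but the rest of the meaning is preserved and still processable automatically.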

Subtopic: database conversion

Quote: a heterogeneous database may need a common view despite different definitions, algorithms, and units; users can specify the mappings [»ahmeR12_1991]
Quote: a Pegasus schema makes a local data source appear as a Pegasus database; maps between data models and query languages [»ahmeR12_1991]
Quote: classified the types of structural and representational discrepancies in multidatabase systems; basis for UniSQL/M [»kimW12_1991]
Quote: schema conflicts occur when the same information uses different structures, names, data types, or constraints [»kimW12_1991]
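
A sketch of user-specified mappings that resolve name and unit conflicts into a common view. The site names, column names, and the `to_common_view` function are hypothetical; this is not the Pegasus or UniSQL/M interface.

```python
# Hypothetical mappings: local column -> (common column, conversion function).
MAPPINGS = {
    "site_a": {
        "emp_name": ("name", str),
        "salary_usd": ("salary_usd", float),
    },
    "site_b": {
        "fullname": ("name", str),  # same information, different name
        "salary_kusd": ("salary_usd", lambda k: float(k) * 1000.0),  # different unit
    },
}

def to_common_view(site: str, row: dict) -> dict:
    # Rename each local column and convert its value to the common unit and type.
    mapping = MAPPINGS[site]
    return {
        common: convert(row[local])
        for local, (common, convert) in mapping.items()
    }

row_a = {"emp_name": "Grace", "salary_usd": "61000"}
row_b = {"fullname": "Ada", "salary_kusd": 52.5}
print(to_common_view("site_a", row_a))
print(to_common_view("site_b", row_b))
```

Each site keeps its own structure; only the mapping table has to know that "fullname" and "emp_name" mean the same thing and that one site stores salaries in thousands.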

Subtopic: change notification

Quote: if groups translate between type hierarchies for semi-structured messages, they need to notify other groups of changes [»leeJ3_1988]

Subtopic: thunks, procedural conversion

Quote: PODUS uses interprocedures to change procedure interfaces and mapper procedures to change data formats [»segaME3_1993]
Quote: a thunk stores the address of a parameter into a known location for calling a procedure [»ingePZ1_1961]
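
A loose modern analogy only: the original thunks delivered the address of a parameter for ALGOL 60 call-by-name, whereas the Python sketch below uses a zero-argument closure that is re-evaluated each time the parameter is used. The names `sum_by_name`, `env`, and `set_i` are invented for illustration.

```python
def sum_by_name(term, set_index, low, high):
    # `term` is a thunk: calling it evaluates the argument expression
    # in the caller's environment at the moment of use.
    total = 0
    for i in range(low, high + 1):
        set_index(i)     # update the variable the thunk refers to
        total += term()  # re-evaluate the argument expression now
    return total

values = [1, 2, 3]
env = {"i": 0}  # stands in for the caller's index variable

def set_i(i):
    env["i"] = i

# The expression values[i] ** 2 is passed unevaluated, wrapped in a closure.
result = sum_by_name(lambda: values[env["i"]] ** 2, set_i, 0, 2)
print(result)
```

The callee never sees the expression itself, only a uniform way to obtain its current value, which is what lets one procedure interface serve many kinds of arguments.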

Subtopic: problems with translation

Quote: the variety of data formats for simple facts makes it difficult to interchange data between sites and applications

Related Topics

Group: data   (140 topics, 3126 quotes)
Group: digital communication   (11 topics, 296 quotes)
Group: distributed database   (6 topics, 194 quotes)
Group: distributed systems   (14 topics, 348 quotes)
Group: document preparation   (8 topics, 180 quotes)
Group: information retrieval   (25 topics, 674 quotes)

Topic: file (22 items)
Topic: implementing distributed systems and applications (41 items)
Topic: natural language translation (8 items)
Topic: semistructured messages for automated processing (22 items)
Topic: text markup and structured text (25 items)
Topic: type conversion (33 items)
Topic: XML schemas (16 items)

Updated barberCB 1/06
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.