QuoteRef: gherS10_2003


ThesaHelp: ACM references f-l
Topic: examples of distributed systems and applications
Topic: reliability of distributed systems
Topic: read-only and write-once file systems
Topic: log-structured file system
Topic: information hiding
Topic: interface between program modules
Topic: external search and sort
Topic: unique numeric names as surrogates
Topic: examples of file systems
Topic: replicated data
Topic: file system reliability
Topic: implementing distributed systems and applications
Topic: coordinated processes
Topic: event time
Topic: file cache
Topic: updating information in a distributed system
Topic: b-trees
Topic: deadlocks
Topic: memory management by garbage collection
Topic: error safe systems
Topic: one-way hash function

Reference

Ghemawat, S., Gobioff, H., Leung, S.-T., "The Google file system", SOSP '03, Bolton Landing, New York, USA, October 2003, ACM, pp. 1-15 ??.

Quotations
abstract ;;Quote: Google's scalable distributed file system was designed for frequent component failure and huge, append-only files
abstract+;;Quote: Google co-designed its applications and file system API
2 ;;Quote: use file snapshot and record append for producer-consumer queues and many-way merging; minimizes synchronization overhead; record append at-least-once at a known offset
2 ;;Quote: a GFS file consists of fixed-size chunks identified by a globally unique 64-bit handle; a chunk is a Linux file
2+;;Quote: replicate GFS file chunks on multiple chunkservers and racks; usually three replicas
2 ;;Quote: a single GFS master manages the file system metadata for chunkservers; e.g., namespace, access control, mappings, garbage collection
2+;;Quote: use heartbeat messages for communication between master and chunkservers
2 ;;Quote: direct access to 64MB chunks with cached metadata; few master communications, persistent TCP connection to chunkserver, all metadata in master's memory
4 ;;Quote: GFS supports serial and concurrent writes and record appends; a mutation is replicated and may fail
4+;;Quote: a file region is consistent if all clients see the same data; it is also defined if clients see the entire mutation
5 ;;Quote: use leases to maintain order across replicas; master grants a chunk lease to the primary replica, and the primary orders the mutations
6 ;;Quote: chunkservers forward data immediately to the next replica over a precomputed path
6+;;Quote: from the GFS network topology, estimate distance by IP addresses
7 ;;Quote: use metadata to quickly snapshot a file or directory tree; copy-on-write to same chunkserver
7 ;;Quote: GFS uses a B-tree to map namespace to metadata; no directory nodes; 100 bytes per file
7+;;Quote: each GFS namespace node has a read-write lock; consistent total order to prevent deadlock
8 ;;Quote: use garbage collection to reclaim deleted files; mark deleted files by rename; on scan, delete from metadata; delete metadata for orphaned chunks
8+;;Quote: heartbeat messages sync chunkservers and master incrementally; deletes orphaned chunks
9 ;;Quote: master and chunkservers restart in seconds no matter how they are terminated; no distinction between normal and abnormal termination
9+;;Quote: GFS uses shadow masters instead of mirrors; sub-second delay
11 ;;Quote: GFS aggregate read rate reaches about 75% of the physical link limit with 16 readers (6 MB/s per client); aggregate write rate reaches about half of its theoretical limit
15 ;;Quote: use checksumming to detect data corruption at disk level; with many disks, occurs frequently
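
The checksumming quotation above can be illustrated with a minimal sketch. It is not GFS code: the 64 KiB block size, the use of CRC-32 from Python's `zlib` module, and the function names `block_checksums` and `find_corrupt_blocks` are all assumptions made for illustration. The idea shown is only that a checksum stored per fixed-size block lets a reader detect which blocks changed on disk.

```python
import zlib
from typing import List

# Assumed block granularity for this sketch; not taken from the quotations.
BLOCK_SIZE = 64 * 1024


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return one CRC-32 value per fixed-size block of data."""
    return [
        zlib.crc32(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def find_corrupt_blocks(data: bytes, expected: List[int],
                        block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indexes of blocks whose checksum no longer matches the stored one."""
    actual = block_checksums(data, block_size)
    if len(actual) != len(expected):
        raise ValueError("block count changed since checksums were recorded")
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


if __name__ == "__main__":
    original = bytes(200 * 1024)        # 200 KiB of zeros, i.e. 4 blocks
    stored = block_checksums(original)  # checksums recorded at write time
    damaged = bytearray(original)
    damaged[70 * 1024] ^= 0xFF          # simulate corruption inside block 1
    print(find_corrupt_blocks(bytes(damaged), stored))  # prints [1]
```

Checking per block rather than per file means a reader can localize the damage and, in a replicated system, fetch only the affected data from another replica.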


Collected barberCB 8/04
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.