Topic: file system reliability

consistency testing
database consistency and reliability
log-structured rollback-recovery
replicated data
resourceful, redundant systems for reliability
self-identifying data structures
temporal database


A file system must be reliable; otherwise users cannot trust it with their data. Multiple methods are needed, including offsite archives, replicated or mirrored data, self-identifying data structures, and update logs.

Multiple offsite archives and backups protect against catastrophic failure such as fire or earthquake. Recovery may be slow, especially for large file systems. Archives do not contain recent updates.

File systems should recover smoothly from sector errors. Important data should be mirrored. Directory information should be included with files themselves. Prior versions should be readily available.

Self-identifying data allows the file system to be reconstructed from the data alone. (cbb 4/98)
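
As an illustration of reconstruction from the data alone, here is a minimal Python sketch. It assumes each disk page carries a label of (file id, version number, page number), as in the labels quoted under "hints and consistency checks" below. The names Label and rebuild, and the use of None for an unlabeled page, are hypothetical and not taken from any cited system.

```python
from collections import defaultdict
from typing import NamedTuple, Optional

class Label(NamedTuple):
    file_id: int   # which file owns the page
    version: int   # version of that file
    page_no: int   # position of the page within the file

def rebuild(disk: list[Optional[Label]]):
    """Scan every page label and rebuild the file map and the free list."""
    files = defaultdict(dict)  # (file_id, version) -> {page_no: disk address}
    free = []
    for addr, label in enumerate(disk):
        if label is None:      # assumption: an unlabeled page is free
            free.append(addr)
        else:
            files[(label.file_id, label.version)][label.page_no] = addr
    # Order each file's pages by page number to recover its layout.
    layout = {key: [pages[n] for n in sorted(pages)]
              for key, pages in files.items()}
    return layout, free

disk = [Label(7, 1, 1), None, Label(7, 1, 0), Label(9, 2, 0)]
layout, free = rebuild(disk)
print(layout)  # {(7, 1): [2, 0], (9, 2): [3]}
print(free)    # [1]
```

No directory is consulted: the labels alone determine which pages belong to which file and which pages are free.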

Subtopic: high reliability

Quote: a design goal was a highly reliable file system which did not need independent backups [»corbFJ_1979]
Quote: a file system should be high performance, robust against some software errors, and use available hardware [»hagmR_1987]
Quote: prevent needing to rebuild a file system; ZFS has not needed FSCK in two years of deployment [»browD9_2007]
Quote: master and chunkservers restart in seconds no matter how they are terminated; no abnormal termination [»gherS10_2003]

Subtopic: error rates

Quote: error rates have remained constant while capacity and bandwidth have increased tremendously [»browD9_2007]
Quote: disks have one uncorrectable error every 10 to 20 terabytes
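
To put the quoted rate in perspective, the short Python sketch below computes the expected number of uncorrectable errors for a given amount of data read. The probability estimate assumes errors arrive independently (a Poisson model); that assumption, the function names, and the 5-terabyte example are illustrative and not from the cited source.

```python
import math

def expected_errors(terabytes_read: float, tb_per_error: float) -> float:
    """Mean number of uncorrectable errors for the given amount read."""
    return terabytes_read / tb_per_error

def p_at_least_one(terabytes_read: float, tb_per_error: float) -> float:
    """Chance of one or more errors, assuming independent (Poisson) errors."""
    return 1.0 - math.exp(-expected_errors(terabytes_read, tb_per_error))

# The quoted range: one uncorrectable error every 10 to 20 terabytes.
for tb_per_error in (10, 20):
    mean = expected_errors(5, tb_per_error)
    prob = p_at_least_one(5, tb_per_error)
    print(f"5 TB read, 1 error per {tb_per_error} TB: "
          f"mean {mean:.2f}, P(at least one) {prob:.2f}")
```

Under this model, a single pass over a few terabytes already has a substantial chance of hitting an uncorrectable error.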

Subtopic: replicated data

Quote: replicate GFS file chunks on multiple chunkservers and racks; usually three replicas
Quote: a file region is consistent if all clients see the same data; it is also defined if clients see the entire mutation
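
The idea of spreading three replicas over chunkservers and racks can be sketched in Python as follows. This is not GFS's actual placement policy, which the quote does not describe; the round-robin rule, the function place_replicas, and the rack and server names are assumptions for illustration only.

```python
def place_replicas(servers_by_rack: dict[str, list[str]],
                   copies: int = 3) -> list[str]:
    """Pick `copies` distinct chunkservers, visiting racks round-robin
    so that the replicas span as many racks as possible."""
    queues = {rack: list(servers)
              for rack, servers in sorted(servers_by_rack.items())}
    chosen = []
    while len(chosen) < copies and any(queues.values()):
        for rack in queues:
            if queues[rack] and len(chosen) < copies:
                chosen.append(queues[rack].pop(0))
    return chosen

racks = {"rack-a": ["cs1", "cs2"], "rack-b": ["cs3"], "rack-c": ["cs4", "cs5"]}
print(place_replicas(racks))  # ['cs1', 'cs3', 'cs4']
```

With one replica per rack, the loss of a whole rack still leaves two copies of the chunk.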

Subtopic: file recovery -- limit scope of failures

Quote: failure of one or two consecutive sectors should only affect the corresponding file; the file name table survives single sector errors [»hagmR_1987]
Quote: file recovery from single sector errors should be fast; cannot scan the entire disk [»hagmR_1987]
Quote: each file has a leader page that contains file name information, unique id, and run table header [»hagmR_1987, OK]
Quote: write changes of the file name table and leader pages to a redo log; recovery in about two seconds; log entry format [»hagmR_1987]
Quote: leader pages for files not as robust as headers and sector labels; bugs no longer detected when they occur; can damage file system [»hagmR_1987]
Quote: a file system should include a rebuilding procedure to reconstruct the file system; affects system design [»lampBW4_1974]
Quote: rebuilding reads all the labels, gets full names for all files, gets free pages, checks directories; takes 30 seconds [»lampBW4_1974]
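
The redo-log quote in this subtopic can be illustrated with a minimal Python sketch: each change to the name table is forced to a log before it is applied, and recovery replays the log. The JSON record layout, the NameTable class, and the file name redo.log are assumptions for illustration; they are not Cedar's log entry format.

```python
import json
import os
import tempfile

class NameTable:
    def __init__(self, log_path: str):
        self.log_path = log_path
        self.entries = {}          # file name -> leader page address
        self.recover()

    def recover(self) -> None:
        """Replay the redo log to rebuild the table after a restart."""
        self.entries = {}
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                try:
                    rec = json.loads(line)
                except json.JSONDecodeError:
                    break          # torn final record: stop the replay here
                self._apply(rec)

    def _apply(self, rec: dict) -> None:
        if rec["op"] == "set":
            self.entries[rec["name"]] = rec["leader"]
        elif rec["op"] == "delete":
            self.entries.pop(rec["name"], None)

    def update(self, rec: dict) -> None:
        # Force the log record to disk before changing the table itself.
        with open(self.log_path, "a") as log:
            log.write(json.dumps(rec) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(rec)

path = os.path.join(tempfile.mkdtemp(), "redo.log")
table = NameTable(path)
table.update({"op": "set", "name": "a.txt", "leader": 120})
table.update({"op": "set", "name": "b.txt", "leader": 344})
table.update({"op": "delete", "name": "a.txt"})

recovered = NameTable(path)   # simulates a restart after a crash
print(recovered.entries)      # {'b.txt': 344}
```

Recovery reads only the log, not the whole disk, which is why it can be fast.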

Subtopic: hints and consistency checks

Quote: never trust the hardware; compute a checksum with the data; store them separately on disk; recover data if they do not agree [»browD9_2007]
Quote: a page consists of absolutes (file id, version number, page number) and hints (for efficiency, verified, reconstructible) [»lampBW4_1974]
Quote: a page's label (file id etc) is always checked before writing; only written when freed or defined [»lampBW4_1974]
Quote: use checksumming to detect data corruption at disk level; with many disks, occurs frequently [»gherS10_2003]
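
A minimal Python sketch of checksum verification with repair follows. It assumes two mirrored copies of a block and a checksum stored apart from the data. SHA-256 and the function read_verified are illustrative choices; the quotes above do not name an algorithm or say how the data is recovered.

```python
import hashlib

def checksum(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

def read_verified(copies: list[bytearray], expected: bytes) -> bytes:
    """Return the first copy that matches the checksum and repair the rest."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected),
                None)
    if good is None:
        raise IOError("all copies fail the checksum")
    for c in copies:
        if bytes(c) != good:
            c[:] = good            # overwrite the damaged copy in place
    return good

data = b"file system block"
stored_sum = checksum(data)        # assumption: kept apart from the data
mirror = [bytearray(data), bytearray(data)]
mirror[0][0] ^= 0xFF               # simulate silent corruption of one copy

assert read_verified(mirror, stored_sum) == data
assert bytes(mirror[0]) == data    # the damaged copy was repaired
```

Keeping the checksum apart from the block means that a single bad write is unlikely to damage both.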

Subtopic: mirroring

Quote: UNIX file system described by a replicated, immutable superblock [»mckuMK8_1984]
Quote: Cedar's file name table is mirrored to different locations; both copies checked on reads; logging allows delayed writes [»hagmR_1987]
Quote: Amoeba files are replicated on two disks; the create-file operation can return when 0, 1 or 2 copies written to disk [»vanrR10_1988]
Quote: GFS uses shadow masters instead of mirrors; sub-second delay [»gherS10_2003]

Subtopic: read everything ever written to disk

Quote: magnetic force microscopy can read everything ever written to a magnetic disk [»gutmP7_1996]
Quote: degaussing purges data from magnetic disks but it also makes the disk unusable [»gutmP7_1996]
Quote: sensitive data should not be written to disk; lock into memory instead

Related Topics

Topic: archives (19 items)
Topic: consistency testing (60 items)
Topic: database consistency and reliability (15 items)
Topic: log-structured rollback-recovery (13 items)
Topic: replicated data (51 items)
Topic: resourceful, redundant systems for reliability (38 items)
Topic: self-identifying data structures (18 items)
Topic: temporal database (25 items)

Updated barberCB 6/04
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.