Lesson 10 - File System Intro
Slide 1 - Distributed File Systems
- Lesson 10
- Data Management
Slide 2 - Lesson Reading
- Distributed File Systems: Concepts and Examples, E. Levy, A. Silberschatz, ACM Computing Surveys, Vol 22(4), Dec 1990, pp. 321-374
- Digital copy of the paper is maintained at the Association for Computing Machinery (ACM) Digital Library
Slide 3 - ACM Digital Library
- ACM is a professional society of computing professionals. It sponsors many of the scholarly conferences and workshops in computer science.
- Since computer science does most of its publishing in conferences (instead of journals), it counts on its professional societies to be stewards of the articles that are published as products of the conferences.
- ACM maintains a digital library of digital content it sponsors: transactions (journals), conference proceedings, wider audience publications, studies that it sponsors (e.g., computer science education)
Slide 4 - ACM Digital Library
- The other main professional society for computer science professionals is the IEEE Computer Society.
- Prof Plale has been a member of both ACM and IEEE CS since 1999.
- The Distributed File Systems article is freely available for download from ACM Digital Library from a computer within the IU domain (or connected to the IU domain through VPN).
Slide 5 - Why this paper?
- The Levy & Silberschatz paper was published in ACM Surveys in 1990.
- ACM Computing Surveys publishes excellent survey papers, and this is one of them.
- The paper, while a bit dated, gives an excellent overview of distributed file systems. And it is free.
- The paper touches on the core principles of distributed computing that are essential to our study of noSQL stores.
Slide 6 - Sections in DFS paper to ignore
- In 2.1, skip paragraphs 3-5
- In 2.2, skip paragraphs 3 and 4
- Skip 4.2 and 4.3: these can be advanced reading
- Skip all of Ch 7 Unix United
- Skip all of Ch 9 Locus
- Skip all of 10-12 (Sprite, Andrew, Related work)
Slide 7 - File system
- Often a subsystem of an operating system, a file system’s purpose is to provide persistent storage.
- Persistent storage – usually on disk. Unlike memory (sometimes called RAM – Random Access Memory), whose contents are lost when a program shuts down, persistent storage retains its contents even when the computer is rebooted.
Slide 8 - Terms
- Objects (files) in a persistent store have the nice property of being immune to temporary failures in a system.
- A general purpose file system by its nature supports simultaneous use by multiple users.
- A distributed file system allows the objects to be located across multiple computers.
- A distributed system is a collection of loosely coupled machines interconnected by a communication network.
Slide 9 - We will use the shorthand “DFS” to refer to a distributed file system. “DFS” is not in common use when talking about distributed file systems; it is just shorthand for our purposes.
Slide 10 - Terms
- Service: software entity running on one or more machines; it implements functionality of some kind. It offers that functionality through a set of methods or operations often called its API – Application Programming Interface.
- Server: service software running on a single machine. The machine + the service is called the server.
Slide 11 - Terms cont
- Client: process that can invoke a service through the API (e.g., set of operations) that form its client interface.
- A client interface for a file service is formed by a set of file operations.
- e.g., Create a file, Delete a file, Read a file, Write a File
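The client interface above can be sketched as a tiny in-memory service. All names here are hypothetical and the dict stands in for persistent storage; a real server would write to disk:

```python
# Hypothetical sketch of a file service's client interface: the four
# operations named on the slide, backed by an in-memory dict rather
# than real persistent storage.
class FileService:
    def __init__(self):
        self._files = {}              # name -> bytes (stand-in for disk)

    def create(self, name):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = b""

    def delete(self, name):
        del self._files[name]

    def read(self, name):
        return self._files[name]

    def write(self, name, data):
        self._files[name] = data

# A client invokes the service only through this API.
svc = FileService()
svc.create("/a/x.txt")
svc.write("/a/x.txt", b"hello")
print(svc.read("/a/x.txt"))           # b'hello'
```

The point of the sketch is that the client sees only the operations, never how or where the bytes are stored.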
Slide 12 - Simple architecture
Slide 13 - Transparencies
- The multiplicity and dispersion of servers and storage devices should be transparent to application users (clients of the DFS).
- Known as transparencies
- Network transparency: clients should be able to access remote files using the same set of file operations applicable to local files
- Performance transparency (another type of transparency): the amount of time needed to satisfy a service request (retrieve a file) should be independent of the location of the file.
- Key: performance of a DFS should be comparable to that of a conventional (non-distributed) file system.
Slide 14 - Failures are Critical Subject!
- Failures are a defining characteristic of distributed computing …
- because a system running over multiple machines fails differently than does one running on a single box.
- Communication faults, machine failures, storage device crashes, and decays of storage media are all considered to be faults that should be tolerated to some extent.
Slide 15 - Naming
- Naming is a mapping between logical and physical objects.
- URL and IP address
- Users deal with logical data objects represented by file names whereas the system manipulates physical blocks of data stored on disk tracks.
- Multilevel mapping provides users with an abstraction of a file that hides the details of how and where the file is actually stored on disk.
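The multilevel mapping can be sketched with two hypothetical tables: a directory maps a user-visible name to a logical file id, and an index maps that id to physical blocks. The structures, ids, and device names below are invented for illustration:

```python
# Sketch of multilevel naming: the user deals only with a file name;
# the system resolves it through layers down to disk blocks.
directory = {"report.txt": 17}                # file name -> logical file id
index = {17: [("disk0", 4), ("disk0", 9)]}    # file id -> (device, block)

def blocks_for(name):
    """Resolve a user-visible name down to its physical blocks."""
    return index[directory[name]]

print(blocks_for("report.txt"))   # [('disk0', 4), ('disk0', 9)]
```

Each level hides the one below it: the user never sees file ids, and the file id layer never sees names.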
Slide 16 - Naming Plays Role in Transparency
- Location transparency: name of a file does not reveal any hint as to its physical storage location
- Location independence: the name of a file need not be changed when the file’s physical storage location changes
- Thus a location-independent naming scheme is a dynamic mapping, since it can map the same file name to different locations at two different instances of time.
Slide 17 - Example of location transparency, independence
Slide 18 - Location table available to all machines
- Lookup /a/b/c/x.txt
- Pathname is /a/b/c
- File name is x.txt
Slide 19 - Location table available to all machines
- Lookup /a/b/c/x.txt
- Pathname is /a/b/c
- File name is x.txt
- Suppose pathname /a/b/c/x.txt translates to low-level identifier <cu3,11>
- The only place the mapping is maintained is the location table, so when cu3 is relocated (re-hosted), only the location table has to change
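A minimal sketch of the location-table idea, using the slide's cu3 example (the machine names and the table contents are assumptions for illustration):

```python
# The only place cu3's physical host is recorded is the location table,
# so re-hosting cu3 is a one-entry change; pathnames are untouched.
location_table = {"cu3": "machineA"}

def locate(pathname):
    # Suppose the pathname /a/b/c/x.txt translates to <cu3, 11>
    # (the translation itself is elided in this sketch).
    component, file_id = "cu3", 11
    return location_table[component], file_id

print(locate("/a/b/c/x.txt"))     # ('machineA', 11)

# Relocate cu3: only the location table changes.
location_table["cu3"] = "machineB"
print(locate("/a/b/c/x.txt"))     # ('machineB', 11)
```

This is exactly what makes the naming scheme location independent: the same name maps to different locations at different times.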
Slide 20 - Semantics of sharing
- Semantics of sharing are rules that govern (or guarantee) behavior.
- These rules specify the effects (what to expect) when multiple clients simultaneously access a shared file.
- The rules specify when modifications of data by a client are observable, if at all, by other, remote clients.
Slide 21 - UNIX semantics hold in this case of single machine where B sees effect of A’s write.
- UNIX file system semantics: when a READ follows a WRITE, the READ returns the value just written. When two WRITES happen in quick succession, followed by a READ, the value read is the value stored by the last write. That is, every READ of a file sees the effects of all previous WRITES performed on that file.
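A quick single-machine demonstration of these semantics using ordinary low-level file operations:

```python
# On a single machine, UNIX semantics hold: a READ that follows two
# WRITEs in quick succession returns the value stored by the last write.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"first")
os.lseek(fd, 0, os.SEEK_SET)
os.write(fd, b"second")           # last write wins
os.lseek(fd, 0, os.SEEK_SET)
value = os.read(fd, 100)
print(value)                      # b'second'
os.close(fd)
os.remove(path)
```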
Slide 22 - UNIX semantics fail at action #3 in this distributed setting: process B has read stale data.
- UNIX semantics: every read of a file sees the effects of all previous writes performed on that file.
Slide 23 - Session semantics make this behavior correct.
- Session semantics: writes to an open file are visible immediately to local clients but are invisible to remote clients who have the same file open simultaneously.
- Once the file is closed, the changes made to it are visible only in sessions that start later. Already-open instances of the file do not reflect these changes.
Slide 24 - UNIX semantics limitation in distributed systems
- UNIX semantics work in a distributed system as long as there is one file server and clients do not cache files, but performance is then poor.
- In a distributed system with caching, obsolete values may be returned: UNIX semantics assume an absolute, globally visible order of all read/write events, but each machine has its own clock, so no such global order exists.
- Session semantics loosen the rules
Slide 25 - Session semantics
- Changes to an open file are initially visible only to the process (or possibly machine) that modified the file. Only when the file is closed are the changes made visible to other processes (or machines).
- These relaxed semantics redefine the behavior of the earlier example as being correct. When A closes the file, it sends a copy to the server, so subsequent READs get the new value.
- Session semantics are implemented in most distributed file systems.
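A toy simulation of the rules above; the Server and Session classes are hypothetical, not a real DFS API:

```python
# Session semantics: a client's writes go to a private copy of the
# file; close() publishes them to the server, and only sessions
# opened afterwards see the new value.
class Server:
    def __init__(self):
        self.files = {"f": b"old"}

class Session:
    def __init__(self, server, name):
        self.server, self.name = server, name
        self.copy = server.files[name]     # snapshot taken at open()

    def write(self, data):
        self.copy = data                   # visible only to this session

    def read(self):
        return self.copy

    def close(self):
        self.server.files[self.name] = self.copy   # publish on close

srv = Server()
a = Session(srv, "f")         # client A opens the file
b = Session(srv, "f")         # client B has it open simultaneously
a.write(b"new")
print(b.read())               # b'old' -- A's write is invisible to B
a.close()
print(b.read())               # b'old' -- B's already-open instance is unchanged
print(Session(srv, "f").read())   # b'new' -- a later session sees the change
```

Note that B keeps its snapshot even after A closes, matching the rule that already-open instances do not reflect the changes.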
Slide 26 - takeaway
- Failures, and how they are handled, define distributed computing
- Naming helps hide location, which is critical for supporting changes to a big system over time (and big systems are more prone to failures)
- Distance matters: it creates the need to pay attention to the order in which READ/WRITE/OPEN/CLOSE etc. events occur