Andrew file system
The Andrew File System is a distributed networked file system
which uses a set of trusted servers to present a homogeneous, location-transparent file name space to all the client workstations. It was developed by Carnegie Mellon University as part of the Andrew Project. It is named after Andrew Carnegie and Andrew Mellon. Its primary use is in distributed computing.
Features
AFS has several benefits over traditional networked file systems, particularly in the areas of security and scalability. It is not uncommon for enterprise AFS cells to exceed 25,000 clients. AFS uses Kerberos for authentication, and implements access control lists on directories for users and groups. Each client caches files on the local filesystem for increased speed on subsequent requests for the same file. This also allows limited filesystem access in the event of a server crash or a network outage.
Read and write operations on an open file are directed only to the locally cached copy. When a modified file is closed, the changed portions are copied back to the file server. Cache consistency is maintained by a callback mechanism. When a file is cached, the server makes a note of this and promises to inform the client if the file is updated by someone else. Callbacks are discarded and must be re-established after any client, server, or network failure, including a time-out. Re-establishing a callback involves a status check and does not require re-reading the file itself.
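The callback idea described above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not the AFS protocol: the class names `FileServer` and `Client`, their methods, and the example path are all hypothetical, the whole file is stored on close (rather than only the changed portions), and a broken callback is repaired by refetching the file (rather than by a status check).

```python
class FileServer:
    """Toy server: stores file contents and remembers which clients hold callbacks."""

    def __init__(self):
        self.files = {}      # path -> bytes
        self.callbacks = {}  # path -> set of clients promised a notification

    def fetch(self, path, client):
        # Hand out the file and record the callback promise for this client.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files[path]

    def store(self, path, data, writer):
        # Called when a client closes a modified file.
        self.files[path] = data
        # Notify every other client that cached the file (break its callback).
        for client in self.callbacks.get(path, set()) - {writer}:
            client.break_callback(path)
        self.callbacks[path] = {writer}


class Client:
    """Toy client: reads and writes go to the local cache only."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # path -> bytes (locally cached copy)
        self.valid = set()  # paths whose callback is still intact

    def open(self, path):
        # Contact the server only if there is no valid callback for the path.
        if path not in self.valid:
            self.cache[path] = self.server.fetch(path, self)
            self.valid.add(path)
        return self.cache[path]

    def write(self, path, data):
        # Writes touch only the cached copy until the file is closed.
        self.cache[path] = data

    def close(self, path):
        # Simplification: send the whole cached file back to the server.
        self.server.store(path, self.cache[path], self)
        self.valid.add(path)

    def break_callback(self, path):
        self.valid.discard(path)


# Two clients cache the same (hypothetical) file; one of them updates it.
PATH = "/afs/example.edu/notes.txt"
server = FileServer()
server.files[PATH] = b"v1"
a, b = Client(server), Client(server)
a.open(PATH)
b.open(PATH)
b.write(PATH, b"v2")
b.close(PATH)
a_callback_broken = PATH not in a.valid  # the server notified client a
a_sees = a.open(PATH)                    # client a refetches the new contents
```

In this sketch client `a` never polls the server: it keeps using its cached copy until the server breaks the callback, which is the property that lets the real design scale to many clients.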
A consequence of the file locking strategy is that AFS does not support large shared databases or record updating within files shared between client systems. This was a deliberate design decision based on the perceived needs of the university computing environment. It leads, for example, to the use of a single file per message in the original email system for the Andrew Project, the Andrew Message System, rather than a single file per mailbox.
A significant feature of AFS is the volume, a tree of files, sub-directories and AFS mountpoints (links to other AFS volumes). Volumes are created by administrators and linked at a specific named path in an AFS cell. Once a volume is created, users of the filesystem may create directories and files as usual without concern for the physical location of the volume. A volume may have a quota assigned to it in order to limit the amount of space consumed. As needed, AFS administrators can move that volume to another server and disk location without the need to notify users; indeed the operation can occur while files in that volume are being used.
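Location transparency of volumes can be pictured as two separate lookup tables: one from a path in the shared name space to a volume, and one from a volume to its current server and disk. The following is a minimal sketch under that assumption; the table names, the volume name `user.alice`, the host names and the partition labels are all hypothetical and do not reflect how an actual AFS cell stores this information.

```python
# Path in the shared name space -> volume linked at that path (hypothetical data).
mount_points = {"/afs/example.edu/user/alice": "user.alice"}

# Volume -> (server, disk partition) currently holding it (hypothetical data).
volume_location = {"user.alice": ("fs1.example.edu", "partition-a")}


def resolve(path):
    """Return (server, partition, path inside the volume) for a shared-name-space path."""
    best = None
    for mount in mount_points:
        # Pick the longest mount point that is a prefix of the path.
        if path == mount or path.startswith(mount + "/"):
            if best is None or len(mount) > len(best):
                best = mount
    if best is None:
        raise FileNotFoundError(path)
    server, partition = volume_location[mount_points[best]]
    return server, partition, path[len(best):] or "/"


def move_volume(volume, server, partition):
    """Administrator action: only the location table changes, never the path."""
    volume_location[volume] = (server, partition)


before = resolve("/afs/example.edu/user/alice/thesis.tex")
move_volume("user.alice", "fs2.example.edu", "partition-b")
after = resolve("/afs/example.edu/user/alice/thesis.tex")
```

The path a user types is the same before and after the move; only the second table changes, which is why users do not need to be notified.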
AFS volumes can be replicated to read-only cloned copies. When accessing files in a read-only volume, a client system will retrieve data from a particular read-only copy. If at some point that copy becomes unavailable, clients will look for any of the remaining copies. Again, users of that data are unaware of the location of the read-only copy; administrators can create and relocate such copies as needed. The AFS command suite guarantees that all read-only volumes contain exact copies of the original read-write volume at the time the read-only copy was created.
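The fallback behaviour for read-only copies can be sketched as a loop over replicas. This is an illustrative sketch only: the function `read_from_replicas`, the exception `ReplicaUnavailable`, and the idea of modelling each replica as a callable are assumptions made for the example, not part of AFS.

```python
class ReplicaUnavailable(Exception):
    """Raised by a replica that cannot be reached (hypothetical)."""


def read_from_replicas(replicas, path, preferred=0):
    """Read `path` from the preferred read-only copy, falling back to the others.

    `replicas` is a list of callables, each taking a path and returning bytes
    or raising ReplicaUnavailable. Returns (index of replica used, data).
    """
    if not replicas:
        raise ReplicaUnavailable("no read-only copies configured")
    # Try the preferred copy first, then every remaining copy in order.
    order = [preferred] + [i for i in range(len(replicas)) if i != preferred]
    for i in order:
        try:
            return i, replicas[i](path)
        except ReplicaUnavailable:
            continue
    raise ReplicaUnavailable(f"no read-only copy of {path} is reachable")


def replica_down(path):
    raise ReplicaUnavailable(path)


def replica_up(path):
    return b"contents of " + path.encode()


used, data = read_from_replicas([replica_down, replica_up], "/afs/example.edu/doc/readme")
```

Because every read-only copy holds the same snapshot of the read-write volume, the caller does not need to care which index was used.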
The file name space on an Andrew workstation is partitioned into a shared and local name space. The shared name space (usually mounted as /afs on the Unix filesystem) is identical on all workstations. The local name space is unique to each workstation. It only contains temporary files needed for workstation initialization and symbolic links to files in the shared name space.
The Andrew File System heavily influenced Version 4 of Sun Microsystems' popular Network File System (NFS). Additionally, a variant of AFS, the Distributed File System (DFS), was adopted by the Open Software Foundation in 1989 as part of their Distributed Computing Environment.
Implementations
There are three major implementations, Transarc (IBM), OpenAFS and Arla, although the Transarc software is losing support and is deprecated. AFS (version two) is also the predecessor of the Coda file system.
A fourth implementation exists in the Linux kernel source code since at least version 2.6.10. Committed by Red Hat, this is a fairly simple implementation still in its early stages of development and therefore incomplete.
Available permissions
The following Access Control List permissions can be granted:
Lookup (l) - allows a user to list the contents of the AFS directory, examine the ACL associated with the directory and access subdirectories.
Insert (i) - allows a user to add new files or subdirectories to the directory.
Delete (d) - allows a user to remove files and subdirectories from the directory.
Administer (a) - allows a user to change the ACL for the directory. Users always have this right on their home directory, even if they accidentally remove themselves from the ACL.
Permissions that affect files and subdirectories include:
Read (r) - allows a user to look at the contents of files in a directory and list files in subdirectories. Files that are to be granted read access to any user, including the owner, need to have the standard UNIX "owner read" permission set.
Write (w) - allows a user to modify files in a directory. Files that are to be granted write access to any user, including the owner, need to have the standard UNIX "owner write" permission set.
Lock (k) - allows a user to run programs that need to "flock" files in the directory.
Additionally, AFS includes Application ACLs (A)-(H) which have no effect on access to files.
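The permission letters listed above lend themselves to a simple lookup table. The following sketch, assuming a rights string made of those letters (for example "rlidwka"), expands it into permission names; the function name `describe_rights` and the table names are hypothetical and are not part of any AFS tool.

```python
# Letter -> permission name, taken from the list above.
DIRECTORY_RIGHTS = {"l": "lookup", "i": "insert", "d": "delete", "a": "administer"}
FILE_RIGHTS = {"r": "read", "w": "write", "k": "lock"}

# Application ACLs (A)-(H) carry no meaning for file access.
APPLICATION_RIGHTS = set("ABCDEFGH")


def describe_rights(rights):
    """Expand a string of rights letters into a list of permission names."""
    names = []
    for letter in rights:
        if letter in DIRECTORY_RIGHTS:
            names.append(DIRECTORY_RIGHTS[letter])
        elif letter in FILE_RIGHTS:
            names.append(FILE_RIGHTS[letter])
        elif letter in APPLICATION_RIGHTS:
            names.append(f"application ({letter})")
        else:
            raise ValueError(f"unknown right: {letter!r}")
    return names


full = describe_rights("rlidwka")
read_only = describe_rights("rl")
```

A user granted only "rl" on a directory can list it and read its files, but cannot add, remove or modify anything in it.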