Ext2
Encyclopedia
The ext2 or second extended filesystem is a file system
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

 for the Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

 kernel. It was initially designed by Rémy Card
Rémy Card
Rémy Card is a French software developer who is noted for his contributions to the Linux kernel.He is credited as one of the primary developers of the Extended file system and Second Extended file system for Linux.- Bibliography :...

 as a replacement for the extended file system
Extended file system
The extended file system or ext was implemented in April 1992 as the first file system created specifically for the Linux operating system. It has metadata structure inspired by the traditional Unix File System and was designed by Rémy Card to overcome certain limitations of the Minix file...

 (ext).

The canonical implementation of ext2 is the ext2fs filesystem driver in the Linux kernel. Other implementations (of varying quality and completeness) exist in GNU Hurd
GNU Hurd
GNU Hurd is a free software Unix-like replacement for the Unix kernel, released under the GNU General Public License. It has been under development since 1990 by the GNU Project of the Free Software Foundation...

, MINIX 3
MINIX 3
MINIX 3 is a project to create a small, highly reliable and functional Unix-like operating system. It is published under the BSD license.The main goal of the project is for the system to be fault-tolerant by detecting and repairing its own faults on the fly, without user intervention...

, Mac OS X
Mac OS X
Mac OS X is a series of Unix-based operating systems and graphical user interfaces developed, marketed, and sold by Apple Inc. Since 2002, has been included with all new Macintosh computer systems...

 (third-party), Darwin
Darwin (operating system)
Darwin is an open source POSIX-compliant computer operating system released by Apple Inc. in 2000. It is composed of code developed by Apple, as well as code derived from NeXTSTEP, BSD, and other free software projects....

 (same third-party as Mac OS X but untested), some BSD kernels, in Atari MiNT
MiNT
MiNT is a free software alternative operating system kernel for the Atari ST and its successors. Together with the free system components fVDI , XaAES , and TeraDesk , MiNT provides a free TOS compatible replacement OS that is capable of multitasking.MiNT was originally released by Eric Smith as...

, and as third-party Microsoft Windows
Microsoft Windows
Microsoft Windows is a series of operating systems produced by Microsoft.Microsoft introduced an operating environment named Windows on November 20, 1985 as an add-on to MS-DOS in response to the growing interest in graphical user interfaces . Microsoft Windows came to dominate the world's personal...

 drivers.

ext2 was the default filesystem in several Linux distribution
Linux distribution
A Linux distribution is a member of the family of Unix-like operating systems built on top of the Linux kernel. Such distributions are operating systems including a large collection of software applications such as word processors, spreadsheets, media players, and database applications...

s, including Debian
Debian
Debian is a computer operating system composed of software packages released as free and open source software primarily under the GNU General Public License along with other free software licenses. Debian GNU/Linux, which includes the GNU OS tools and Linux kernel, is a popular and influential...

 and Red Hat Linux
Red Hat Linux
Red Hat Linux, assembled by the company Red Hat, was a popular Linux based operating system until its discontinuation in 2004.Red Hat Linux 1.0 was released on November 3, 1994...

, until supplanted more recently by ext3
Ext3
The ext3 or third extended filesystem is a journaled file system that is commonly used by the Linux kernel. It is the default file system for many popular Linux distributions, including Debian...

, which is almost completely compatible with ext2 and is a journaling file system
Journaling file system
A journaling file system is a file system that keeps track of the changes that will be made in a journal before committing them to the main file system...

. ext2 is still the filesystem of choice for flash
Flash memory
Flash memory is a non-volatile computer storage chip that can be electrically erased and reprogrammed. It was developed from EEPROM and must be erased in fairly large blocks before these can be rewritten with new data...

-based storage media (such as SD cards, and USB flash drive
USB flash drive
A flash drive is a data storage device that consists of flash memory with an integrated Universal Serial Bus interface. flash drives are typically removable and rewritable, and physically much smaller than a floppy disk. Most weigh less than 30 g...

s) since its lack of a journal minimizes the number of writes and flash devices have only a limited number of write cycles. Recent kernels, however, support a journal-less mode of ext4
Ext4
The ext4 or fourth extended filesystem is a journaling file system for Linux, developed as the successor to ext3.It was born as a series of backward compatible extensions to ext3, many of them originally developed by Cluster File Systems for the Lustre file system between 2003 and 2006, meant to...

, which would offer the same benefit along with a number of ext4-specific benefits.

History

The early development of the Linux kernel was made as a cross-development under the Minix
Minix
MINIX is a Unix-like computer operating system based on a microkernel architecture created by Andrew S. Tanenbaum for educational purposes; MINIX also inspired the creation of the Linux kernel....

 operating system. Naturally, it was obvious that the Minix file system
MINIX file system
-History:MINIX was written from scratch by Andrew S. Tanenbaum in the 1980s, as a Unix-like operating system whose source code could be used freely in education...

 would be used as Linux's first file system. The Minix file system was mostly free of bugs, but used 16-bit offsets internally and thus only had a maximum size limit of 64 megabyte
Megabyte
The megabyte is a multiple of the unit byte for digital information storage or transmission with two different values depending on context: bytes generally for computer memory; and one million bytes generally for computer storage. The IEEE Standards Board has decided that "Mega will mean 1 000...

s. There was also a filename length limit of 14 characters. Because of these limitations, work began on a replacement native file system for Linux.

To ease the addition of new file systems and provide a generic file API
Application programming interface
An application programming interface is a source code based specification intended to be used as an interface by software components to communicate with each other...

, VFS
Virtual file system
A virtual file system or virtual filesystem switch is an abstraction layer on top of a more concrete file system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way...

, a virtual file system layer was added to the Linux kernel. The extended file system (ext
Extended file system
The extended file system or ext was implemented in April 1992 as the first file system created specifically for the Linux operating system. It has metadata structure inspired by the traditional Unix File System and was designed by Rémy Card to overcome certain limitations of the Minix file...

), was released in April 1992 as the first file system using the VFS API and was included in Linux version 0.96c. The ext file system solved the two major problems in the Minix file system (maximum partition size and filename length limitation to 14 characters), and allowed 2 gigabyte
Gigabyte
The gigabyte is a multiple of the unit byte for digital information storage. The prefix giga means 109 in the International System of Units , therefore 1 gigabyte is...

s of data and filenames of up to 255 characters. But it still had problems: there was no support of separate timestamp
Time code
A timecode is a sequence of numeric codes generated at regular intervals by a timing system.- Video and film timecode :...

s for file access, inode
Inode
In computing, an inode is a data structure on a traditional Unix-style file system such as UFS. An inode stores all the information about a regular file, directory, or other file system object, except its data and name....

 modification, and data modification.

As a solution for these problems, two new filesystems were developed in January 1993: xiafs
Xiafs
Xiafs was a file system for the operating system Linux which was conceived and developed by Frank Xia and was based on the MINIX file system. Today it is obsolete and not in use, except possibly in some historic installations.-History:...

 and the second extended file system (ext2), which was an overhaul of the extended file system incorporating many ideas from the Berkeley Fast File System. ext2 was also designed with extensibility in mind, with space left in many of its on-disk data structures for use by future versions.

Since then, ext2 has been a testbed for many of the new extensions to the VFS API. Features such as POSIX
POSIX
POSIX , an acronym for "Portable Operating System Interface", is a family of standards specified by the IEEE for maintaining compatibility between operating systems...

 ACLs
Access control list
An access control list , with respect to a computer file system, is a list of permissions attached to an object. An ACL specifies which users or system processes are granted access to objects, as well as what operations are allowed on given objects. Each entry in a typical ACL specifies a subject...

 and extended attributes were generally implemented first on ext2 because it was relatively simple to extend and its internals were well-understood.

On Linux kernels prior to 2.6.17, restrictions in the block driver mean that ext2 filesystems have a maximum file size of 2TB.

ext2 is still recommended over journaling file systems on bootable USB flash drives and other solid-state drive
Solid-state drive
A solid-state drive , sometimes called a solid-state disk or electronic disk, is a data storage device that uses solid-state memory to store persistent data with the intention of providing access in the same manner of a traditional block i/o hard disk drive...

s. ext2 performs fewer writes than ext3 since it does not need to write to the journal. As the major aging factor of a flash chip is the number of erase cycles, and as those happen frequently on writes, this increases the life span of the solid-state device. Another good practice for filesystems on flash devices is the use of the noatime mount option, for the same reason.

ext2 data structures

The space in ext2 is split up into block
Block (data storage)
In computing , a block is a sequence of bytes or bits, having a nominal length . Data thus structured are said to be blocked. The process of putting data into blocks is called blocking. Blocking is used to facilitate the handling of the data-stream by the computer program receiving the data...

s. These blocks are grouped into block groups, analogous to cylinder groups in the Unix File System. There are typically thousands of blocks on a large file system. Data for any given file is typically contained within a single block group where possible. This is done to reduce external fragmentation and minimize the number of disk seeks when reading a large amount of consecutive data.

Each block group contains a copy of the superblock and block group descriptor table, and all block groups contain a block bitmap, an inode bitmap, an inode table and finally the actual data blocks.

The superblock
Unix File System
The Unix file system is a file system used by many Unix and Unix-like operating systems. It is also called the Berkeley Fast File System, the BSD Fast File System or FFS...

 contains important information that is crucial to the booting of the operating system
Operating system
An operating system is a set of programs that manage computer hardware resources and provide common services for application software. The operating system is the most important type of system software in a computer system...

, thus backup copies are made in multiple block groups in the file system. However, typically only the first copy of it, which is found at the first block of the file system, is used in the booting.

The group descriptor stores the location of the block bitmap, inode bitmap and the start of the inode table for every block group and these, in turn are stored in a group descriptor table.

Inodes

Every file or directory is represented by an inode. The inode includes data about the size, permission, ownership, and location on disk of the file or directory.

Example of ext2 inode structure:
Quote from the linux kernel documentation for ext2:

"There are pointers to the first 12 blocks which contain the file's data in the inode. There is a pointer to an indirect block (which contains pointers to the next set of blocks), a pointer to a doubly indirect block (which contains pointers to indirect blocks) and a pointer to a trebly indirect block (which contains pointers to doubly indirect blocks)."


So, there is a structure in ext2 that has 15 pointers, the first 12 are for direct blocks. Pointer number 13 points to an indirect block, number 14 to a doubly indirect block and number 15 to a trebly indirect block.

Directories

Each directory is a list of directory entries. Each directory entry associates one file name with one inode number, and consists of the inode number, the length of the file name, and the actual text of the file name. To find a file, the directory is searched front-to-back for the associated filename. For reasonable directory sizes, this is fine. But for huge large directories this is inefficient, and ext3
Ext3
The ext3 or third extended filesystem is a journaled file system that is commonly used by the Linux kernel. It is the default file system for many popular Linux distributions, including Debian...

 offers a second way of storing directories that is more efficient than just a list of filenames.

The root directory is always stored in inode number two, so that the file system code can find it at mount time. Subdirectories are implemented by storing the name of the subdirectory in the name field, and the inode number of the subdirectory in the inode field. Hard links are implemented by storing the same inode number with more than one file name. Accessing the file by either name results in the same inode number, and therefore the same data.

The special directories "." and ".." are implemented by storing the names "." and ".." in the directory, and the inode number of the current and parent directories in the inode field. The only special treatment these two entries receive is that they are automatically created when any new directory is made, and they cannot be deleted.

Allocating Data

When a new file or directory is created, the EXT2 file system must decide where to store the data. If the disk is mostly empty, then data can be stored almost anywhere. However, performance is maximized if the data is clustered with other related data to minimize seek times.

The EXT2 file system attempts to allocate each new directory in the group containing its parent directory, on the theory that accesses to parent and children directories are likely to be closely related. The EXT2 file system also attempts to place files in the same group as their directory entries, because directory accesses often lead to file accesses. However, if the group is full, then the new file or new directory is placed in some other non-full group.

The data blocks needed to store directories and files can be found by looking in the data allocation bitmap. Any needed space in the inode table can be found by looking in the inode allocation bitmap.

File system limits

Theoretical ext2 filesystem limits under Linux
Block size: 1 KB 2 KB 4 KB 8 KB
max. file size: 16 GB 256 GB 2 TB 2 TB
max. filesystem size: 4* TB 8 TB 16 TB 32 TB


The reason for some limits of the ext2-file system are the file format of the data and the operating system's kernel. Mostly these factors will be determined once when the file system is built. They depend on the block size and the ratio of the number of blocks and inodes.

In Linux the block size is limited by the architecture page size.

There are also some userspace programs that can't handle files larger than 2 GB
Large file support
Large file support, often abbreviated to LFS, is the term frequently applied to the ability to create files larger than 2 GiB on 32-bit operating systems.- Rationale :...

.

The maximum file size is limited to min( ((b/4)3+(b/4)2+b/4+12)*b, 232*b ) due to the i_block (an array of EXT2_N_BLOCKS) and i_blocks( 32-bits integer value ) representing the amount of b-bytes "blocks" in the file.

The limit of sublevel-directories is 31998 due to the link count limit. Directory indexing is not available in ext2, so there are performance issues for directories with a large number of files (10,000+). The theoretical limit on the number of files in a directory is 1.3 × 1020, although this is not relevant for practical situations.

Note: In Linux kernel 2.4 and earlier block devices were limited to 2 TB, limiting the maximum size of a partition regardless of block size.

Compression extension

e2compr is a modification to the ext2 file system
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...

 driver in the Linux kernel
Linux kernel
The Linux kernel is an operating system kernel used by the Linux family of Unix-like operating systems. It is one of the most prominent examples of free and open source software....

 to support online compression and decompression of files on file system level without any support by user applications.

e2compr is a small patch against the ext2 file system that allows on-the-fly compression
Data compression
In computer science and information theory, data compression, source coding or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use....

 and decompression. It compresses only regular files; the administrative data (superblock, inode
Inode
In computing, an inode is a data structure on a traditional Unix-style file system such as UFS. An inode stores all the information about a regular file, directory, or other file system object, except its data and name....

s, directory
Directory (file systems)
In computing, a folder, directory, catalog, or drawer, is a virtual container originally derived from an earlier Object-oriented programming concept by the same name within a digital file system, in which groups of computer files and other folders can be kept and organized.A typical file system may...

 files
Computer file
A computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished...

 etc.) are not compressed (mainly for safety reasons). Access to compressed blocks is provided for read and write operations. The compression algorithm and cluster size
Cluster (file system)
In computer file systems, a cluster or allocation unit is the unit of disk space allocation for files and directories. To reduce the overhead of managing on-disk data structures, the filesystem does not allocate individual disk sectors, but contiguous groups of sectors, called clusters.On a disk...

 is specified on a per-file basis. Directories can also be marked for compression, in which case every newly created file in the directory will be automatically compressed with the same cluster size and the same algorithm that was specified for the directory.

e2compr is not a new file system. It is only a patch to the ext2 file system made to support the EXT2_COMPR_FL flag. It does not require you to make a new partition, and will continue to read or write existing ext2 file systems. One can consider it as simply a way for the read and write routines to access files that could have been created by a simple utility similar to gzip or compress. Compressed and uncompressed files coexist nicely on ext2 partitions.

The latest e2compr-branch is available for current releases of 3.0, 2.6 and 2.4 Linux kernels. The latest patch for the kernel 3.0 series was released in August 2011 and provides mulitcore and himem support. There are also older branches for older 2.0 and 2.2 kernels.

See also

  • e2fsprogs
    E2fsprogs
    e2fsprogs is a set of utilities for maintaining the ext2, ext3 and ext4 file systems. Since those file systems are often the default for Linux distributions, it is commonly considered to be essential software....

  • ext
    Extended file system
    The extended file system or ext was implemented in April 1992 as the first file system created specifically for the Linux operating system. It has metadata structure inspired by the traditional Unix File System and was designed by Rémy Card to overcome certain limitations of the Minix file...

     : ancestor of ext2
  • ext3
    Ext3
    The ext3 or third extended filesystem is a journaled file system that is commonly used by the Linux kernel. It is the default file system for many popular Linux distributions, including Debian...

     : extended version of ext2
  • ext4
    Ext4
    The ext4 or fourth extended filesystem is a journaling file system for Linux, developed as the successor to ext3.It was born as a series of backward compatible extensions to ext3, many of them originally developed by Cluster File Systems for the Lustre file system between 2003 and 2006, meant to...

     : fourth extended file system
  • StegFS
    StegFS
    StegFS is a free file system for Linux. It is licensed under the GPL. It was principally developed by Andrew D. McDonald and Markus G. Kuhn. It is a steganographic file system based on the ext2 filesystem....

     : a steganographic file system
    Steganographic file system
    Steganographic file systems are a kind of file system first proposed by Ross Anderson, Roger Needham, and Adi Shamir. Their paper proposed two main methods of hiding data: in a series of fixed size files originally consisting of random bits on top of which 'vectors' could be superimposed in such a...

     based on ext2
  • cloop
    Cloop
    The compressed loopback device or cloop is a module for the Linux kernel. It adds support for transparently decompressed, read-only block devices. It is not a compressed file system in itself....

  • SquashFS
    SquashFS
    SquashFS is a compressed read-only file system for Linux. SquashFS compresses files, inodes and directories, and supports block sizes up to 1 MB for greater compression...

  • List of file systems
  • Comparison of file systems
    Comparison of file systems
    -General information:-Limits:-Metadata:-Features:-Allocation and layout policies:-Supporting operating systems:-See also:* Comparison of archive formats* Comparison of file archivers* List of archive formats* List of file archivers...

  • Filesystem in Userspace (FUSE)
    Filesystem in Userspace
    Filesystem in Userspace is a loadable kernel module for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code...


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK