NTFS
NTFS is the standard file system of Windows NT, including its later versions Windows 2000, Windows XP, Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System), such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACLs) and file system journaling.
History
In the mid 1980s, Microsoft and IBM formed a joint project to create the next generation of graphical operating system. The result of the project was OS/2, but Microsoft and IBM disagreed on many important issues and eventually separated. OS/2 remained an IBM project, and Microsoft started work on Windows NT. The OS/2 file system HPFS contained several important new features, and when Microsoft created its new operating system, it borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual, since dozens of codes were available and other major file systems have their own codes; FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms that identify the file system in a partition of type 07 must therefore perform additional checks. NTFS also owes some of its architectural design to Files-11, used by VMS: Dave Cutler was the main lead for both VMS and Windows NT.
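The additional check needed for a type 07 partition can be sketched as follows. This is a minimal illustration, not a complete detector: it assumes the commonly documented NTFS boot sector layout, in which the 8-byte OEM ID "NTFS    " sits at byte offset 3 and the sector ends with the bytes 55 AA. The function name looks_like_ntfs is hypothetical.

```python
NTFS_OEM_ID = b"NTFS    "  # 8-byte OEM ID expected at offset 3 of an NTFS boot sector


def looks_like_ntfs(boot_sector: bytes) -> bool:
    """Heuristic: does the first sector of a type-07 partition look like NTFS?"""
    if len(boot_sector) < 512:
        return False
    has_oem_id = boot_sector[3:11] == NTFS_OEM_ID
    has_end_marker = boot_sector[510:512] == b"\x55\xaa"
    return has_oem_id and has_end_marker


# Build a fake 512-byte sector that carries only the two markers checked above.
sector = bytearray(512)
sector[3:11] = NTFS_OEM_ID
sector[510:512] = b"\x55\xaa"
print(looks_like_ntfs(bytes(sector)))  # an all-zero sector would be rejected
```

A real implementation would validate further fields before trusting the result; the OEM ID alone only separates NTFS from other users of the same type code.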
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support.
V1.2 supports compressed files, named streams, ACL-based security, etc.
V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor.
V3.1 expanded the Master File Table (MFT) entries with a redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking and self-healing functionality, though these features owe more to additional functionality of the operating system than to the file system itself.
The NTFS.sys driver version (for example, NTFS v5.0, introduced with Windows 2000) should not be confused with the on-disk NTFS format version (v3.1 since Windows XP). The NTFS v3.1 on-disk format is unchanged from the introduction of Windows XP and is used in Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7. The confusion arises when features implemented in the NTFS.sys driver within the Windows OS are not distinguished from features of the NTFS on-disk format. For instance, when Microsoft detailed the new NTFS features in Windows 2000 it called the result NTFS v5.0, yet that is the version of the NTFS.sys driver; the on-disk format is only at v3.0.
(EFS).
and uses the NTFS Log ($LogFile) to record metadata changes to the volume.
This is a critical function of NTFS (one that FAT and FAT32 do not provide). It ensures that the file system's complex internal data structures (notably the volume allocation bitmap), data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), and indices (for directories and security descriptors) remain consistent after a system crash, and it allows easy rollback of uncommitted changes to these critical data structures when the volume is remounted.
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings. The journal is made available for applications to track changes to the volume. This journal can be enabled or disabled on non-system volumes and is not enabled by default for a newly added drive.
subsystem in Windows NT, hard links are similar to directory junctions, but used for files instead of directories. Hard links can only be applied to files on the same volume since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records and directory entries that are linked and updated together. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
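The idea that a hard link is simply a second name for the same file can be illustrated with Python's standard os.link call. This is a hedged sketch: the file names and the helper demo_hard_link are hypothetical, and it shows only the user-visible behavior, not the MFT filename records themselves.

```python
import os
import tempfile


def demo_hard_link(directory: str) -> tuple[int, bool, str]:
    """Create a file, hard-link it under a second name, and write via the link."""
    original = os.path.join(directory, "original.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as f:
        f.write("shared contents")
    os.link(original, alias)  # both names must live on the same volume
    with open(alias, "a") as f:
        f.write(" + appended via alias")
    with open(original) as f:
        text = f.read()
    # Link count, whether both names refer to one file, and the shared data.
    return os.stat(original).st_nlink, os.path.samefile(original, alias), text


with tempfile.TemporaryDirectory() as tmp:
    print(demo_hard_link(tmp))
```

Data written through either name is visible through the other, because both directory entries refer to the same underlying file record.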
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS streams were introduced in Windows NT 3.1 to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
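The "filename:streamname" convention can be illustrated with a small parser. This is a sketch under stated assumptions: the function split_stream_name is hypothetical, it treats a leading single letter followed by a colon as a drive specifier rather than a stream separator, and it ignores any further suffix after the stream name.

```python
def split_stream_name(path: str) -> tuple[str, str]:
    """Split 'file:stream' into (file, stream); stream is '' for the main stream."""
    prefix, rest = "", path
    # A leading drive specifier such as "C:" is not a stream separator.
    if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
        prefix, rest = path[:2], path[2:]
    # Only the final path component can carry a stream name.
    head, sep, tail = rest.rpartition("\\")
    name, _, stream = tail.partition(":")
    return prefix + head + sep + name, stream


print(split_stream_name("text.txt:extrastream"))
print(split_stream_name("C:\\docs\\text.txt:extrastream"))
print(split_stream_name("text.txt"))
```

One consequence of the drive-letter rule is that a one-letter file name followed by a stream name is ambiguous in this notation, so the sketch resolves it in favor of the drive.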
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added by Internet Explorer (and now by other browsers) to mark files that have been downloaded from external sites: such files may be unsafe to run locally, and the local shell will require confirmation from the user before opening them. When the user indicates that they no longer want this confirmation dialog, the ADS is simply dropped from the MFT entry for the downloaded file.
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the data content of the media files themselves (as embedded tags do when the media file format supports them, such as MPEG and OGG containers). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. Most media players, however, prefer to store this information in their own separate database instead of ADS, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files are files which contain sparse data sets, that is, files with segments stored at different file offsets and no actual storage space used for the space between segments. When a file is read back, the file system driver returns zeros for any data that does not actually exist, so the file may appear to be mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner, with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
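The read behavior described above can be modeled in a few lines. This is an in-memory illustration only, not how the NTFS driver stores extents: the class SparseFileModel is hypothetical and assumes that written segments never overlap.

```python
class SparseFileModel:
    """Toy model: only written segments occupy storage; gaps read back as zeros."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.segments: dict[int, bytes] = {}  # file offset -> stored bytes

    def write(self, offset: int, data: bytes) -> None:
        # Assumes the new segment does not overlap an existing one.
        self.segments[offset] = bytes(data)
        self.length = max(self.length, offset + len(data))

    def read(self, offset: int, size: int) -> bytes:
        size = max(0, min(size, self.length - offset))
        out = bytearray(size)  # zero-filled: this is what a gap looks like
        for seg_off, seg in self.segments.items():
            start = max(offset, seg_off)
            end = min(offset + size, seg_off + len(seg))
            if start < end:
                out[start - offset:end - offset] = seg[start - seg_off:end - seg_off]
        return bytes(out)

    def allocated_bytes(self) -> int:
        return sum(len(seg) for seg in self.segments.values())


f = SparseFileModel(1_000_000)
f.write(0, b"head")
f.write(999_996, b"tail")
print(f.length, f.allocated_bytes(), f.read(0, 8))
```

The file reports a length of one million bytes while the model stores only the eight bytes that were actually written.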
files using the LZNT1 algorithm (a variant of LZ77).
Files are compressed in 16-cluster chunks. With 4 kB clusters, files are compressed in 64 kB chunks. If the compression reduces 64 kB of data to 60 kB or less, NTFS treats the unneeded 4 kB pages like empty sparse file clusters; they are not written. This allows reasonable random-access times. However, large compressible files become highly fragmented, because every 64 kB chunk becomes a smaller fragment.
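The space saving per chunk follows from simple arithmetic, sketched below. The function clusters_written is hypothetical and models only the rule stated above: a chunk is stored compressed when doing so frees at least one whole cluster, and otherwise all 16 clusters are written.

```python
def clusters_written(compressed_bytes: int,
                     cluster_size: int = 4096,
                     unit_clusters: int = 16) -> int:
    """Clusters actually written for one compression chunk."""
    needed = -(-compressed_bytes // cluster_size)  # ceiling division
    if needed < unit_clusters:
        return needed          # the remaining clusters are left unallocated
    return unit_clusters       # no whole cluster saved: store the full chunk


for size_kb in (10, 60, 61, 64):
    print(size_kb, "kB ->", clusters_written(size_kb * 1024), "clusters")
```

With the default parameters, a chunk that compresses to 60 kB occupies 15 clusters, while one that only reaches 61 kB still occupies all 16.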
Compression is not recommended by Microsoft for files exceeding 30MB because of the performance hit.
The best use of compression is for files which are repetitive, written seldom, usually accessed sequentially, and not themselves compressed. LOG files are an ideal example. Compressing files which are less than 4kB or already compressed (like .zip or .jpg or .avi) may make them bigger as well as slower. Avoid compressing executables like .exe and .dll (they may be paged in and out in 4kB pages). Never compress system files used at bootup like drivers or NTLDR or winload.exe or BOOTMGR.
Although read–write access to compressed files is often, but not always, transparent, Microsoft recommends avoiding compression on server systems and/or network shares holding roaming profiles because it puts a considerable load on the processor.
Single-user systems with limited hard disk space can benefit from NTFS compression for small files, from 4+kB to 64kB or more, depending on compressibility. Files less than 900 bytes or so are stored with the directory entry in the MFT.
The slowest link in a computer is often not the CPU but the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed.
(This assumes that compressed file fragments are stored consecutively.)
NTFS compression can also serve as a replacement for sparse files when a program (e.g., a download manager) is not able to create files without content as sparse files.
). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk.
to group changes to files together into a transaction. The transaction will guarantee that all changes happen, or none of them do, and it will guarantee that applications outside the transaction will not see the changes until they are committed.
It uses techniques similar to those used for Volume Shadow Copies (i.e. copy-on-write) to ensure that overwritten data can be safely rolled back, and a CLFS log to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
Transactional NTFS does not restrict transactions to just the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services. These transactions are coordinated network-wide with all participants using a specific service, the DTC, to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed file systems, including their local live or offline caches.
Encrypting File System (EFS) provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI and the EFS File System Run-Time Library (FSRTL).
EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because it takes much less time to encrypt and decrypt large amounts of data with a symmetric cipher than with an asymmetric key cipher. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case a user loses access to their key, support for additional decryption keys has been built into the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and NTFS-provided compression are mutually exclusive; however, NTFS can be used for one and a third-party tool for the other.
The support of EFS is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. They also allow administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
The support of disk quotas is not available in Basic, Home and MediaCenter versions of Windows, and must be activated after installation of Professional, Ultimate and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it reparses the name lookup, passing the user-controlled reparse data to every file system filter driver that is loaded into Windows. Each filter driver examines the reparse data to see whether it is associated with that reparse point; if a filter driver determines a match, it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions are similar to volume mount points, but reference other directories in the file system instead of other volumes. For instance, the directory
, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and have the semantics of a hardlink (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and are resolved on the server side, as they share the same security realm of the local system or domain on which the parent volume is mounted, and the same security settings for their contents as the content of the target directory; however, the junction itself may have distinct security settings. Unlinking a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both are used by system software installers). This additional security restriction has probably been made to prevent users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hard links: reference counting is not used on the target contents, nor even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic links (with an additional restriction on the location of the target). They are an optimized version that allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and they can be resolved on the server side (when they are found in remote shared directories).
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side, so when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (created with MKLINK symLink targetFilename) or to directories (created with MKLINK /D symLinkD targetDirectory), but (unlike Unix symbolic links) the type of the link must be specified when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS also checks whether it has the correct type (file or directory), and returns a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile or CreateFile API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch).
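The two properties described above (the link type is chosen at creation time, and the target need not exist) can be illustrated with Python's standard os.symlink call, whose target_is_directory argument plays the role of the /D switch on Windows and is ignored on other platforms. The helper make_dangling_link and the file names are hypothetical, and creating symbolic links on Windows may require additional privileges.

```python
import os
import tempfile


def make_dangling_link(directory: str, as_directory: bool) -> tuple[bool, bool]:
    """Create a symbolic link to a target that does not exist yet."""
    target = os.path.join(directory, "missing-target")
    link = os.path.join(directory, "link")
    # The link type is fixed here, at creation time, even without a target.
    os.symlink(target, link, target_is_directory=as_directory)
    # islink() inspects the link itself; exists() follows it to the target.
    return os.path.islink(link), os.path.exists(link)


with tempfile.TemporaryDirectory() as tmp:
    print(make_dangling_link(tmp, as_directory=True))
```

The link is created successfully, but following it fails until a target of the matching type appears.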
, which is a technique by which memory copying is not really done until one copy is modified.
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that does not support it will result in the contents of those previous versions being lost.
-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance-enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. Paragon Software Group sells a read-write driver named NTFS for Mac OS X, which is also included on some models of Seagate hard drives. Native NTFS write support has been discovered in Mac OS X 10.6 and later, but it is not activated by default, although hacks exist to enable it. However, user reports indicate that the functionality is unstable and tends to cause kernel panics, which is probably why write support has not been enabled or advertised.
driver. It is included in most Linux distributions. Other solutions exist as well:
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
, and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku, in addition to Linux, through FUSE. A read/write driver for MS-DOS called "NTFS4DOS", free for personal use, also exists. OpenBSD offers read-only NTFS support by default on i386 and amd64 platforms as of version 4.9, released on 1 May 2011.
and, on Windows 2000 and later, FAT32.
Microsoft added the built-in ability to shrink or expand a partition, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. So shrinking will often require relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore.
(DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
data (file name, creation date, access permissions through access control lists, and contents) are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development; an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 code units are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
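The validity check that NTFS does not perform on names can be sketched as follows. The function is_well_formed_utf16 is hypothetical; it applies the standard UTF-16 rule that a high surrogate (D800 to DBFF) must be immediately followed by a low surrogate (DC00 to DFFF), and that a low surrogate may not appear on its own.

```python
def is_well_formed_utf16(units: list[int]) -> bool:
    """Check that a sequence of 16-bit values has no unpaired surrogates."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                return False
            i += 2
        elif 0xDC00 <= unit <= 0xDFFF:
            return False  # low surrogate with no preceding high surrogate
        else:
            i += 1
    return True


print(is_well_formed_utf16([0x0041, 0xD83D, 0xDE00]))  # letter + surrogate pair
print(is_well_formed_utf16([0x0041, 0xD83D]))          # name ends mid-pair
```

Both sequences are acceptable as NTFS names according to the text above, but only the first can be decoded as UTF-16 without error, which is why software that assumes valid Unicode may mishandle some names.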
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file lookup times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems.
The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
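How a file ID can carry both a record number and a reuse count is sketched below. The bit layout is not given in the text above; the sketch assumes the commonly documented 64-bit file reference, with the MFT record number in the low 48 bits and the reuse (sequence) count in the high 16 bits. The function names are hypothetical.

```python
RECORD_BITS = 48  # assumed width of the MFT record number field


def split_file_reference(ref: int) -> tuple[int, int]:
    """Return (record_number, reuse_count) from a 64-bit file reference."""
    record_number = ref & ((1 << RECORD_BITS) - 1)
    reuse_count = (ref >> RECORD_BITS) & 0xFFFF
    return record_number, reuse_count


def is_stale(ref: int, current_reuse_count: int) -> bool:
    """A reference is stale if the record has been reused since it was taken."""
    return split_file_reference(ref)[1] != current_reuse_count


ref = (3 << RECORD_BITS) | 5          # record 5, third use of that record
print(split_file_reference(ref))
print(is_stale(ref, current_reuse_count=4))
```

Because the count changes each time a record is recycled for a new file, an old reference to the same record number no longer matches and can be rejected.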
expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
These metafiles are treated specially by Windows and are difficult to view directly: special purpose-built tools are needed. One such tool is nfi.exe, the "NTFS File Sector Information Utility", which is freely distributed as part of the Microsoft "OEM Support Tools".
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the FileOpen or FileCreate API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files, or the index data for directories are handled the same way as other data for alternate data streams, or for standard attributes. They are just one of the attributes stored in one or several attribute lists.
All streams of a given file may be displayed by using nfi.exe, the "NTFS File Sector Information Utility", which is freely distributed as part of the Microsoft "OEM Support Tools".
workers. The amount of data which fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy filenames and no ACLs.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, NTFS formats a 74.5 GB partition with 19,543,064 clusters of 4 KB. Subtracting system files (a 64 MB log file, a 2,442,888-byte Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four MFT records per cluster, this volume theoretically could hold almost 4 × 19,526,158 = 78,104,632 resident files.
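The arithmetic in this example can be reproduced with a short script. It is a sketch under two assumptions that are not stated in the text: that an MFT record is 1 KB (which yields the four records per cluster quoted above), and that the Bitmap file holds one bit per cluster, rounded up to a multiple of 8 bytes (which reproduces the quoted 2,442,888 bytes). The free-cluster figure is taken directly from the text.

```python
CLUSTER_SIZE = 4 * 1024
MFT_RECORD_SIZE = 1024        # assumption: four records fit in a 4 KB cluster
TOTAL_CLUSTERS = 19_543_064
FREE_CLUSTERS = 19_526_158    # figure quoted in the text


def bitmap_bytes(total_clusters: int) -> int:
    """One bit per cluster, rounded up to a whole number of 8-byte words."""
    return -(-total_clusters // 64) * 8


def max_resident_files(free_clusters: int,
                       cluster_size: int = CLUSTER_SIZE,
                       record_size: int = MFT_RECORD_SIZE) -> int:
    """Upper bound if every free cluster were filled with MFT records."""
    return (cluster_size // record_size) * free_clusters


print(bitmap_bytes(TOTAL_CLUSTERS))
print(max_resident_files(FREE_CLUSTERS))
```

The result is an upper bound only, since it ignores the space that directories and indices for those files would need.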
File Names: File names are limited to 255 UTF-16 code units. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase, and $Extend. The . (dot) and $Extend entries are both directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code units.
Maximum Volume Size: In theory, the maximum NTFS volume size is 2^64 − 1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32 − 1 clusters. For example, using 64 kB clusters, the maximum Windows XP NTFS volume size is 256 TB minus 64 kB. Using the default cluster size of 4 kB, the maximum NTFS volume size is 16 TB minus 4 kB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with UEFI and 64-bit support.
Maximum File Size: As designed, the maximum NTFS file size is 16 EB minus 1 KB or 18,446,744,073,709,550,592 bytes. As implemented, the maximum NTFS file size is 16 TB minus 64 kB or 17,592,185,978,880 bytes.
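The volume and file size limits quoted above follow directly from the cluster counts and can be checked with a few lines of arithmetic. The helper max_volume_bytes is hypothetical, and the units are taken as binary (1 kB = 1024 bytes), which is what makes the quoted byte counts come out exactly.

```python
KB = 1024
TB = 1024 ** 4
EB = 1024 ** 6


def max_volume_bytes(cluster_size: int, cluster_count_bits: int = 32) -> int:
    """Largest volume addressable with 2^bits - 1 clusters of the given size."""
    return (2 ** cluster_count_bits - 1) * cluster_size


# Volume limits as implemented (2^32 - 1 clusters).
print(max_volume_bytes(64 * KB) == 256 * TB - 64 * KB)
print(max_volume_bytes(4 * KB) == 16 * TB - 4 * KB)

# File size limits: as designed, and as implemented.
designed_max_file = 16 * EB - 1 * KB
implemented_max_file = 16 * TB - 64 * KB
print(designed_max_file, implemented_max_file)
```

Each comparison evaluates to true, and the two file size values match the byte counts given in the text.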
Alternate Data Streams: Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip unwanted streams.
File system
A file system is a means to organize data expected to be retained after a program terminates by providing procedures to store, retrieve and update data, as well as manage the available space on the device which contain it. A file system organizes data in an efficient manner and is tuned to the...
of Windows NT
Windows NT
Windows NT is a family of operating systems produced by Microsoft, the first version of which was released in July 1993. It was a powerful high-level-language-based, processor-independent, multiprocessing, multiuser operating system with features comparable to Unix. It was intended to complement...
, including its later versions Windows 2000
Windows 2000
Windows 2000 is a line of operating systems produced by Microsoft for use on personal computers, business desktops, laptops, and servers. Windows 2000 was released to manufacturing on 15 December 1999 and launched to retail on 17 February 2000. It is the successor to Windows NT 4.0, and is the...
, Windows XP
Windows XP
Windows XP is an operating system produced by Microsoft for use on personal computers, including home and business desktops, laptops and media centers. First released to computer manufacturers on August 24, 2001, it is the second most popular version of Windows, based on installed user base...
, Windows Server 2003
Windows Server 2003
Windows Server 2003 is a server operating system produced by Microsoft, introduced on 24 April 2003. An updated version, Windows Server 2003 R2, was released to manufacturing on 6 December 2005...
, Windows Server 2008, Windows Vista
Windows Vista
Windows Vista is an operating system released in several variations developed by Microsoft for use on personal computers, including home and business desktops, laptops, tablet PCs, and media center PCs...
, and Windows 7.
NTFS supersedes the FAT file system as the preferred file system for Microsoft’s Windows operating systems. NTFS has several improvements over FAT and HPFS (High Performance File System), such as improved support for metadata and the use of advanced data structures to improve performance, reliability, and disk space utilization, plus additional extensions such as security access control lists (ACLs) and file system journaling.
History
In the mid-1980s, Microsoft and IBM formed a joint project to create the next generation of graphical operating system. The result of the project was OS/2, but Microsoft and IBM disagreed on many important issues and eventually separated. OS/2 remained an IBM project, and Microsoft started to work on Windows NT. The OS/2 file system HPFS contained several important new features. When Microsoft created their new operating system, they borrowed many of these concepts for NTFS. Probably as a result of this common ancestry, HPFS and NTFS share the same disk partition identification type code (07). Sharing an ID is unusual since there were dozens of available codes, and other major file systems have their own code. FAT has more than nine (one each for FAT12, FAT16, FAT32, etc.). Algorithms which identify the file system in a partition of type 07 must perform additional checks. It is also clear that NTFS owes some of its architectural design to Files-11, used by VMS: Dave Cutler was the main lead for both VMS and Windows NT.
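As a rough illustration of the additional checks needed for a type 07 partition, the sketch below looks for the 8-byte OEM ID "NTFS    " that an NTFS boot sector carries at byte offset 3. The function name and the in-memory sample sectors are hypothetical, and the sketch makes no attempt to positively identify HPFS.

```python
def identify_type_07(boot_sector: bytes) -> str:
    """Guess the file system behind partition type 07 from its first sector."""
    if len(boot_sector) < 11:
        raise ValueError("need at least the first 11 bytes of the partition")
    # An NTFS boot sector carries the OEM ID "NTFS    " at offset 3.
    if boot_sector[3:11] == b"NTFS    ":
        return "NTFS"
    # Ruling HPFS in would need further checks, not attempted here.
    return "not NTFS (possibly HPFS)"


# Hypothetical sectors built in memory for illustration only.
ntfs_like = b"\xeb\x52\x90" + b"NTFS    " + bytes(501)
other = bytes(512)
print(identify_type_07(ntfs_like))
print(identify_type_07(other))
```

A real tool would read the first sector of the partition from the disk device rather than build it in memory.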
Versions
The NTFS on-disk format has five released versions:
- v1.0 with NT 3.1, released mid-1993
- v1.1 with NT 3.5, released fall 1994
- v1.2 with NT 3.51 (mid-1995) and NT 4 (mid-1996) (occasionally referred to as "NTFS 4.0", because the OS version is 4.0)
- v3.0 from Windows 2000 ("NTFS V5.0" or "NTFS5")
- v3.1 from:
  - Windows XP (autumn 2001; "NTFS V5.1")
  - Windows Server 2003 (spring 2003; occasionally "NTFS V5.2")
  - Windows Server 2008 and Windows Vista (mid-2005) (occasionally "NTFS V6.0")
  - Windows Server 2008 R2 and Windows 7 (occasionally "NTFS V6.1")
V1.0 and V1.1 (and newer) are incompatible: that is, volumes written by NT 3.5x cannot be read by NT 3.1 until an update on the NT 3.5x CD is applied to NT 3.1, which also adds FAT long file name support.
V1.2 supports compressed files, named streams, ACL-based security, etc.
V3.0 added disk quotas, encryption, sparse files, reparse points, update sequence number (USN) journaling, the $Extend folder and its files, and reorganized security descriptors so that multiple files which use the same security setting can share the same descriptor.
V3.1 expanded the Master File Table (MFT) entries with a redundant MFT record number (useful for recovering damaged MFT files).
Windows Vista introduced Transactional NTFS, NTFS symbolic links, partition shrinking, and self-healing functionality, though these features owe more to additional functionality of the operating system than to the file system itself.
The NTFS.sys version (e.g., NTFS v5.0, introduced with Windows 2000) should not be confused with the on-disk NTFS format version (v3.1 since Windows XP). The NTFS v3.1 on-disk format is unchanged from the introduction of Windows XP and is used in Windows Server 2003, Windows Server 2008, Windows Vista, and Windows 7. The confusion arises when no distinction is made between features implemented in the NTFS.sys driver within the Windows OS and features of the NTFS on-disk format. For example, when Microsoft detailed the new NTFS features in Windows 2000, it called the result NTFS v5.0, yet it is the NTFS.sys driver that is at that version; the on-disk format is only at v3.0.
Features
NTFS v3.0 includes several new features over its predecessors: sparse file support, disk usage quotas, reparse points, distributed link tracking, and file-level encryption, also known as the Encrypting File System (EFS).
NTFS Log
NTFS is a journaling file system and uses the NTFS Log ($LogFile) to record metadata changes to the volume.
Journaling is a critical function of NTFS (one that FAT/FAT32 does not provide). It ensures that the file system's complex internal data structures remain consistent in case of system crashes, and it allows easy rollback of uncommitted changes to these critical data structures when the volume is remounted. The protected structures include the volume allocation bitmap, data moves performed by the defragmentation API, modifications to MFT records (such as moves of some variable-length attributes stored in MFT records and attribute lists), and indices (for directories and security descriptors).
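The rollback of uncommitted changes can be pictured with a toy undo log. This is only a sketch of the general journaling idea, with hypothetical names throughout. It does not reproduce the actual $LogFile record format or NTFS's recovery algorithm, and a dictionary stands in for the volume metadata.

```python
_MISSING = object()  # marks "key did not exist before the change"


class ToyJournal:
    """Toy undo log over a dict that stands in for volume metadata."""

    def __init__(self, metadata):
        self.metadata = metadata
        self.log = []          # (txn_id, key, old_value) records
        self.committed = set()

    def write(self, txn_id, key, new_value):
        # Record the old value first, then change the metadata in place.
        self.log.append((txn_id, key, self.metadata.get(key, _MISSING)))
        self.metadata[key] = new_value

    def commit(self, txn_id):
        self.committed.add(txn_id)

    def recover(self):
        # On "remount", undo uncommitted changes, newest first.
        for txn_id, key, old_value in reversed(self.log):
            if txn_id in self.committed:
                continue
            if old_value is _MISSING:
                self.metadata.pop(key, None)
            else:
                self.metadata[key] = old_value
        self.log.clear()


volume = {"bitmap": "0011"}
journal = ToyJournal(volume)
journal.write(1, "bitmap", "0111")
journal.commit(1)
journal.write(2, "bitmap", "1111")      # "crash" happens before commit
journal.write(2, "mft[42]", "moved")
journal.recover()
print(volume)  # {'bitmap': '0111'}
```

Transaction 1 survives because it was committed, while both changes made by transaction 2 are undone.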
USN Journal
The USN Journal (Update Sequence Number Journal) is a system management feature that records changes to all files, streams and directories on the volume, as well as their various attributes and security settings. The journal is made available for applications to track changes to the volume. This journal can be enabled or disabled on non-system volumes and is not enabled by default for a newly added drive.
Hard links and short filenames
Originally included to support the POSIX subsystem in Windows NT, hard links are similar to directory junctions, but used for files instead of directories. Hard links can only be applied to files on the same volume since an additional filename record is added to the file's MFT record. Short (8.3) filenames are also implemented as additional filename records and directory entries that are linked and updated together. Hard links also have the behavior that changing the size or attributes of a file may not update the directory entries of other links until they are opened.
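The aliasing effect of hard links can be tried with Python's standard `os.link`. The sketch below is a minimal illustration with hypothetical file names. It runs in a temporary directory and assumes the underlying file system supports hard links, as NTFS does.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "report.txt")
    alias = os.path.join(d, "report-link.txt")
    with open(original, "w") as f:
        f.write("v1")
    os.link(original, alias)             # second name for the same file
    with open(alias, "a") as f:
        f.write("+v2")                   # a write through one name...
    with open(original) as f:
        content = f.read()               # ...is visible through the other
    link_count = os.stat(original).st_nlink
    os.remove(original)                  # removing one name keeps the data
    still_there = os.path.exists(alias)

print(content, link_count, still_there)  # v1+v2 2 True
```

Both names must live on the same volume, which is why the sketch creates them in one directory.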
Alternate data streams (ADS)
Alternate data streams allow more than one data stream to be associated with a filename, using the filename format "filename:streamname" (e.g., "text.txt:extrastream"). Alternate streams are not listed in Windows Explorer, and their size is not included in the file's size. Only the main stream of a file is preserved when it is copied to a FAT-formatted USB drive, attached to an e-mail, or uploaded to a website. As a result, using alternate streams for critical data may cause problems. NTFS streams were introduced in Windows NT 3.1 to enable Services for Macintosh (SFM) to store Macintosh resource forks. Although current versions of Windows Server no longer include SFM, third-party Apple Filing Protocol (AFP) products (such as Group Logic's ExtremeZ-IP) still use this feature of the file system.
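The "filename:streamname" form can be used directly as a path on Windows. The following sketch is illustrative only, and both helper names are hypothetical. The write helper assumes a Windows system with the file on an NTFS volume and refuses to run elsewhere.

```python
import os


def stream_path(filename: str, stream_name: str) -> str:
    """Build the "filename:streamname" form that addresses an alternate stream."""
    return f"{filename}:{stream_name}"


def write_stream(filename: str, stream_name: str, data: bytes) -> None:
    """Write data to an alternate stream (assumes Windows and an NTFS volume)."""
    if os.name != "nt":
        raise OSError("alternate data streams need Windows and an NTFS volume")
    with open(stream_path(filename, stream_name), "wb") as f:
        f.write(data)


print(stream_path("text.txt", "extrastream"))  # text.txt:extrastream
```

On a suitable system, `write_stream("text.txt", "extrastream", b"hidden note")` would leave the size reported for text.txt unchanged, which is the behavior the paragraph above describes.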
Malware has used alternate data streams to hide its code; some malware scanners and other special tools now check for data in alternate streams. Microsoft provides a tool called Streams to allow users to view streams on a selected volume.
Very small ADS are also added by Internet Explorer (and now also other browsers) to mark files that have been downloaded from external sites: they may be unsafe to run locally, and the local shell will require confirmation from the user before opening them. When the user indicates that this confirmation dialog is no longer wanted, the ADS is simply dropped from the MFT entry for the downloaded file.
Some media players have also tried to use ADS to store custom metadata for media files, in order to organize collections without modifying the effective data content of the media files themselves (unlike embedded tags, which are available when the media file format supports them, as MPEG and OGG containers do). This metadata may be displayed in Windows Explorer as extra information columns, with the help of a registered Windows Shell extension that can parse it. Most media players, however, prefer to use their own separate database instead of ADS for storing this information, notably because ADS are visible to all users of these files, instead of being managed with distinct per-user security settings and having their values defined according to user preferences.
Sparse files
Sparse files are files which contain sparse data sets: segments are stored at different file offsets, with no actual storage space used for the space between segments. When a file is read back, the file system driver returns zeros for any data that does not actually exist, so the file may appear to be mostly filled with zeros. Database applications, for instance, sometimes use sparse files. Because of this, Microsoft has implemented support for efficient storage of sparse files by allowing an application to specify regions of empty (zero) data. An application that reads a sparse file reads it in the normal manner, with the file system calculating what data should be returned based upon the file offset. As with compressed files, the actual sizes of sparse files are not taken into account when determining quota limits.
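The read-back behavior described above can be observed with ordinary file calls. The sketch below writes two small segments separated by a gap and reads zeros from the gap. The file name is hypothetical. Whether the gap actually occupies no storage depends on the file system; on NTFS, the application must additionally mark the file as sparse, which this sketch does not do.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "sparse.bin")
    with open(path, "wb") as f:
        f.write(b"head")
        f.seek(1024 * 1024)   # jump 1 MiB ahead without writing anything
        f.write(b"tail")
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(4)
        gap = f.read(16)      # bytes from the region that was never written

print(size, gap == bytes(16))  # 1048580 True
```

The reader needs no special handling: the gap simply reads as zeros, as the text notes.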
File compression
NTFS can compress files using the LZNT1 algorithm (a variant of LZ77).
Files are compressed in 16-cluster chunks. With 4kB clusters, files are compressed in 64kB chunks. If the compression reduces 64kB of data to 60kB or less, NTFS treats the unneeded 4kB pages like empty sparse file clusters: they are not written. This allows reasonable random-access times. However, large compressible files become highly fragmented, since every 64kB chunk becomes a smaller fragment.
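The chunk arithmetic can be made concrete with a short calculation. This is a sketch under the figures given in the text (4kB clusters, 16-cluster chunks). The function name is hypothetical, and the rule that a chunk saving no whole cluster is stored at full size is an assumption inferred from the 60kB threshold.

```python
import math

CLUSTER = 4 * 1024      # 4kB clusters, as in the text
CHUNK_CLUSTERS = 16     # compression unit of 16 clusters (64kB)


def clusters_written(compressed_bytes: int) -> int:
    """Clusters stored for one 64kB chunk after compression."""
    needed = math.ceil(compressed_bytes / CLUSTER)
    # Assumption: if no whole cluster is saved, the chunk is stored in full.
    return needed if needed < CHUNK_CLUSTERS else CHUNK_CLUSTERS


for size_kb in (20, 60, 61):
    print(size_kb, clusters_written(size_kb * 1024))
# 20 5
# 60 15
# 61 16
```

A chunk that compresses to 61kB still needs all 16 clusters, which is why 60kB is the break-even point.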
Compression is not recommended by Microsoft for files exceeding 30MB because of the performance hit.
The best use of compression is for files which are repetitive, written seldom, usually accessed sequentially, and not themselves compressed. LOG files are an ideal example. Compressing files which are less than 4kB or already compressed (like .zip or .jpg or .avi) may make them bigger as well as slower. Avoid compressing executables like .exe and .dll (they may be paged in and out in 4kB pages). Never compress system files used at bootup like drivers or NTLDR or winload.exe or BOOTMGR.
Although read–write access to compressed files is often, but not always, transparent, Microsoft recommends avoiding compression on server systems and/or network shares holding roaming profiles because it puts a considerable load on the processor.
Single-user systems with limited hard disk space can benefit from NTFS compression for small files, from 4+kB to 64kB or more, depending on compressibility. Files less than 900 bytes or so are stored with the directory entry in the MFT.
The slowest link in a computer is usually not the CPU but the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed.
(This assumes that compressed file fragments are stored consecutively.)
NTFS compression can also serve as a replacement for sparse files when a program (e.g., a download manager) is not able to create files without content as sparse files.
Volume Shadow Copy
The Volume Shadow Copy Service (VSS) keeps historical versions of files and folders on NTFS volumes by copying old, newly overwritten data to a shadow copy (copy-on-write). The old file data is overlaid on the new when the user requests a revert to an earlier version. This also allows data backup programs to archive files currently in use by the file system. On heavily loaded systems, Microsoft recommends setting up a shadow copy volume on a separate disk.
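The copy-on-write idea can be sketched with a toy block store. This is a simplified model with hypothetical names, not the VSS implementation: the first time a block is overwritten, its old content is saved to a shadow area, from which the earlier version can be viewed or restored.

```python
class ToyShadowVolume:
    """Toy copy-on-write store: a list of blocks plus one shadow copy."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.shadow = {}   # block index -> content at snapshot time

    def write(self, index, data):
        # Save the old block only on its first overwrite since the snapshot.
        self.shadow.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def snapshot_view(self):
        # Overlay the saved old data on the current blocks.
        return [self.shadow.get(i, b) for i, b in enumerate(self.blocks)]

    def revert(self):
        for index, old in self.shadow.items():
            self.blocks[index] = old
        self.shadow.clear()


vol = ToyShadowVolume(["a", "b", "c"])
vol.write(1, "B")
vol.write(1, "BB")
print(vol.blocks, vol.snapshot_view())  # ['a', 'BB', 'c'] ['a', 'b', 'c']
```

Only overwritten blocks are copied, so the shadow area grows with the amount of change rather than with the size of the volume.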
Transactional NTFS
As of Windows Vista, applications can use Transactional NTFS to group changes to files together into a transaction. The transaction guarantees that either all changes happen or none of them do, and that applications outside the transaction will not see the changes until they are committed.
It uses techniques similar to those used for Volume Shadow Copies (i.e., copy-on-write) to ensure that overwritten data can be safely rolled back, and a CLFS log to mark the transactions that have not yet been committed, or those that have been committed but not yet fully applied (in case of a system crash during a commit by one of the participants).
Transactional NTFS does not restrict transactions to just the local NTFS volume, but also includes other transactional data or operations in other locations, such as data stored in separate volumes, the local registry, or SQL databases, or the current states of system services or remote services. These transactions are coordinated network-wide with all participants using a specific service, the DTC, to ensure that all participants receive the same commit state, and to transport the changes that have been validated by any participant (so that the others can invalidate their local caches for old data or roll back their ongoing uncommitted changes). Transactional NTFS allows, for example, the creation of network-wide consistent distributed file systems, including their local live or offline caches.
Encrypting File System (EFS)
EFS provides strong and user-transparent encryption of any file or folder on an NTFS volume. EFS works in conjunction with the EFS service, Microsoft's CryptoAPI, and the EFS File System Run-Time Library (FSRTL).
EFS works by encrypting a file with a bulk symmetric key (also known as the File Encryption Key, or FEK), which is used because it takes less time to encrypt and decrypt large amounts of data than an asymmetric key cipher would. The symmetric key that is used to encrypt the file is then encrypted with a public key that is associated with the user who encrypted the file, and this encrypted data is stored in an alternate data stream of the encrypted file. To decrypt the file, the file system uses the private key of the user to decrypt the symmetric key that is stored in the file header. It then uses the symmetric key to decrypt the file. Because this is done at the file system level, it is transparent to the user. Also, in case a user loses access to their key, support for additional decryption keys has been built into the EFS system, so that a recovery agent can still access the files if needed. NTFS-provided encryption and NTFS-provided compression are mutually exclusive; however, NTFS can be used for one and a third-party tool for the other.
EFS support is not available in the Basic, Home, and MediaCenter versions of Windows, and must be activated after installation of the Professional, Ultimate, and Server versions of Windows or by using enterprise deployment tools within Windows domains.
Quotas
Disk quotas were introduced in NTFS v3. They allow the administrator of a computer that runs a version of Windows that supports NTFS to set a threshold of disk space that users may use. They also allow administrators to keep track of how much disk space each user is using. An administrator may specify a certain level of disk space that a user may use before they receive a warning, and then deny access to the user once they hit their upper limit of space. Disk quotas do not take into account NTFS's transparent file compression, should this be enabled. Applications that query the amount of free space will also see the amount of free space left to the user who has a quota applied to them.
Disk quota support is not available in the Basic, Home, and MediaCenter versions of Windows, and must be activated after installation of the Professional, Ultimate, and Server versions of Windows or by using enterprise deployment tools within Windows domains.
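The warning level and upper limit described for quotas can be modeled in a few lines. This is a toy model with hypothetical names and byte counts, intended only to show the two thresholds and the per-user free space figure. It is not how NTFS tracks quota data on disk.

```python
from dataclasses import dataclass


@dataclass
class ToyQuota:
    warning_bytes: int   # level at which the user is warned
    limit_bytes: int     # upper limit at which further use is denied
    used_bytes: int = 0

    def allocate(self, nbytes: int) -> str:
        if self.used_bytes + nbytes > self.limit_bytes:
            return "denied"
        self.used_bytes += nbytes
        if self.used_bytes > self.warning_bytes:
            return "warning"
        return "ok"

    def free_bytes(self) -> int:
        # What an application asking for free space would see for this user.
        return self.limit_bytes - self.used_bytes


quota = ToyQuota(warning_bytes=80, limit_bytes=100)
results = [quota.allocate(50), quota.allocate(40), quota.allocate(20)]
print(results, quota.free_bytes())  # ['ok', 'warning', 'denied'] 10
```

The denied request leaves the usage unchanged, so the user still sees 10 bytes free.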
Reparse points
This feature was introduced in NTFS v3. Reparse points are used by associating a reparse tag in the user space attribute of a file or directory. When the object manager (see Windows NT line executive) parses a file system name lookup and encounters a reparse attribute, it will reparse the name lookup, passing the user-controlled reparse data to every file system filter driver that is loaded into Windows. Each filter driver examines the reparse data to see whether it is associated with that reparse point, and if that filter driver determines a match, then it intercepts the file system call and executes its special functionality. Reparse points are used to implement Volume Mount Points, Directory Junctions, Hierarchical Storage Management, Native Structured Storage, Single Instance Storage, and Symbolic Links.
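The dispatch described above, where each filter driver inspects the reparse data and the matching one takes over, can be sketched as a simple lookup. The tag names, handlers, and function below are all hypothetical stand-ins for illustration. Real reparse tags are numeric values and real filter drivers run in the kernel.

```python
from typing import Callable, List, Optional, Tuple

# Each toy "filter driver" is a (tag it owns, handler for the reparse data) pair.
FilterDriver = Tuple[str, Callable[[str], str]]


def resolve(reparse_tag: str, reparse_data: str,
            filters: List[FilterDriver]) -> Optional[str]:
    """Offer the reparse data to every filter; the one whose tag matches handles it."""
    for tag, handler in filters:
        if tag == reparse_tag:
            return handler(reparse_data)
    return None  # no loaded filter recognizes this reparse point


filters = [
    ("JUNCTION", lambda data: "redirect to " + data),
    ("HSM", lambda data: "recall " + data + " from archive"),
]
print(resolve("JUNCTION", "D:\\linkeddir", filters))  # redirect to D:\linkeddir
print(resolve("UNKNOWN", "x", filters))               # None
```

New behavior is added by registering another pair, which mirrors how reparse points let filter drivers extend the file system.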
Volume mount points
Volume mount points are similar to Unix mount points, where the root of another file system is attached to a directory. In NTFS, this allows additional file systems to be mounted without requiring a separate drive letter (such as C: or D:) for each.
Once a volume has been mounted on top of an existing directory of another volume, the contents previously listed in that directory become invisible and are replaced by the content of the root directory of the mounted volume. The mounted volume could still have its own drive letter assigned separately. The file system does not allow volumes to be mutually mounted on each other. Volume mount points can be made to be either persistent (remounted automatically after system reboot) or not persistent (must be manually remounted after reboot).
Mounted volumes may use other file systems than just NTFS; notably they may be remote shared directories, possibly with their own security settings and remapping of access rights according to the remote file system policy.
Directory junctions
Directory junctions are similar to volume mount points, but reference other directories in the file system instead of other volumes. For instance, the directory C:\exampledir with a directory junction attribute that contains a link to D:\linkeddir will automatically refer to the directory D:\linkeddir when it is accessed by a user-mode application. This function is conceptually similar to symbolic links to directories in Unix, except that the target in NTFS must always be another directory (typical Unix file systems allow the target of a symbolic link to be any type of file) and have the semantics of a hardlink (i.e., they must be immediately resolvable when they are created).
Directory junctions (which can be created with the command MKLINK /J junctionName targetDirectory and removed with RMDIR junctionName from a console prompt) are persistent, and resolved on the server side as they share the same security realm of the local system or domain on which the parent volume is mounted and the same security settings for its contents as the content of the target directory; however the junction itself may have distinct security settings. Unlinking a directory junction does not delete files in the target directory.
Note that some directory junctions are installed by default on Windows Vista, for compatibility with previous versions of Windows, such as Documents and Settings in the root directory of the system drive, which links to the Users physical directory in the root directory of the same volume. However, they are hidden by default, and their security settings are set up so that Windows Explorer will refuse to open them from within the Shell or in most applications, except for the local built-in SYSTEM user or the local Administrators group (both are used by system software installers). This additional security restriction was probably introduced to prevent users from finding apparent duplicate files in the joined directories and deleting them by mistake, because the semantics of directory junctions are not the same as those of hard links: reference counting is not used on the target contents, nor even on the referenced container itself.
Directory junctions are soft links (they will persist even if the target directory is removed), working as a limited form of symbolic link (with an additional restriction on the location of the target). They are an optimized form that allows faster processing of the reparse point with which they are implemented, with less overhead than the newer NTFS symbolic links, and they can be resolved on the server side (when they are found in remote shared directories).
Symbolic links
Symbolic links (or soft links) were introduced in Windows Vista. Symbolic links are resolved on the client side. So when a symbolic link is shared, the target is subject to the access restrictions on the client, and not the server.
Symbolic links can be created either to files (created with MKLINK symLink targetFilename) or to directories (created with MKLINK /D symLinkD targetDirectory), but (unlike Unix symbolic links) the type of the link (file or directory) must be specified when the link is created. The target, however, need not exist or be available when the symbolic link is created: when the symbolic link is accessed and the target is checked for availability, NTFS also checks whether it has the correct type (file or directory), and returns a not-found error if the existing target has the wrong type.
They can also reference shared directories on remote hosts or files and subdirectories within shared directories: their target is not mounted immediately at boot, but only temporarily on demand while opening them with the OpenFile or CreateFile API. Their definition is persistent on the NTFS volume where they are created (all types of symbolic links can be removed as if they were files, using DEL symLink from a command line prompt or batch).
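The point that a symbolic link may be created before its target exists can be tried out with a short Python sketch. This is only an illustration, not how MKLINK is implemented: it uses the standard library call os.symlink, whose target_is_directory argument plays roughly the role of the /D switch on Windows and is ignored elsewhere. The file names are hypothetical, and on Windows the call may fail unless the account is allowed to create symbolic links.

```python
import os
import tempfile


def dangling_symlink_state() -> tuple[bool, bool]:
    """Create a link to a target that does not exist yet.

    Returns (link_exists, target_reachable) for the freshly created link.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "missing-target.txt")  # hypothetical name, never created
        link = os.path.join(tmp, "symLink")
        # target_is_directory=False declares a file link (no /D equivalent);
        # the argument only has an effect on Windows.
        os.symlink(target, link, target_is_directory=False)
        # islink() inspects the link itself; exists() follows it to the target.
        return os.path.islink(link), os.path.exists(link)


if __name__ == "__main__":
    print(dangling_symlink_state())
```

The link is reported as present while following it fails, which matches the behaviour described above: availability and type of the target are only checked when the link is accessed.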
Single Instance Storage (SIS)
When there are several directories that have different, but similar, files, some of these files may have identical content. Single instance storage allows identical files to be merged into one file, with references created to that merged file. SIS consists of a file system filter that manages copies, modification and merges to files; and a user space service (or groveler) that searches for files that are identical and need merging. SIS was mainly designed for remote installation servers as these may have multiple installation images that contain many identical files; SIS allows these to be consolidated but, unlike for example hard links, each file remains distinct; changes to one copy of a file will leave others unaltered. This is similar to copy-on-write, which is a technique by which memory copying is not really done until one copy is modified.
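The groveler's search for identical files can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration, not the actual SIS service: it simply groups files by a SHA-256 digest of their contents, and the function names and sample paths are made up for the example.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_identical_files(root: str) -> list[list[Path]]:
    """Group regular files under root whose contents share a SHA-256 digest."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Only groups with more than one member are candidates for merging.
    return [group for group in by_digest.values() if len(group) > 1]


def demo() -> list[list[str]]:
    """Build two hypothetical installation images and list the duplicate groups."""
    samples = [
        ("image1/setup.dll", b"same bytes"),
        ("image2/setup.dll", b"same bytes"),
        ("image2/readme.txt", b"different bytes"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for name, content in samples:
            path = Path(tmp, name)
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(content)
        groups = find_identical_files(tmp)
        return [[p.relative_to(tmp).as_posix() for p in group] for group in groups]


if __name__ == "__main__":
    print(demo())
```

A real implementation would also have to keep each merged file logically distinct, so that a write to one copy leaves the others unaltered; the sketch covers only the detection step.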
Hierarchical Storage Management (HSM)
Hierarchical Storage Management is a means of transferring files that are not used for some period of time to less expensive storage media. When the file is next accessed, the reparse point on that file determines that it is needed and retrieves it from storage.
Native Structured Storage (NSS)
NSS was an ActiveX document storage technology that has since been discontinued by Microsoft. It allowed ActiveX Documents to be stored in the same multi-stream format that ActiveX uses internally. An NSS file system filter was loaded and used to process the multiple streams transparently to the application, and when the file was transferred to a non-NTFS formatted disk volume it would also transfer the multiple streams into a single stream.
Interoperability
Details on the implementation's internals are not released, which makes it difficult for third-party vendors to provide tools to handle NTFS.
Microsoft Windows
While the different NTFS versions are for the most part fully forward- and backward-compatible, there are technical considerations for mounting newer NTFS volumes in older versions of Microsoft Windows. This affects dual-booting, and external portable hard drives.
For example, attempting to use an NTFS partition with "Previous Versions" (a.k.a. Volume Shadow Copy) on an operating system that does not support it will result in the contents of those previous versions being lost.
Mac OS X
Mac OS X 10.3 and later include read-only support for NTFS-formatted partitions. The GPL-licensed NTFS-3G also works on Mac OS X through FUSE and allows reading and writing to NTFS partitions. A performance enhanced commercial version, called Tuxera NTFS for Mac, is also available from the NTFS-3G developers. Paragon Software Group sells a read-write driver named NTFS for Mac OS X, which is also included on some models of Seagate hard drives. Native NTFS write support has been discovered in Mac OS X 10.6 and later, but is not activated by default, although hacks do exist to enable the functionality. However, user reports indicate the functionality is unstable and tends to cause kernel panics, probably the reason why write support has not been enabled or advertised.
Linux
The ability to read and write to NTFS is provided by the NTFS-3G driver. It is included in most Linux distributions. Other solutions exist as well:
- Linux kernel 2.2: Kernel versions 2.2.0 and later include the ability to read NTFS partitions.
- Linux kernel 2.6: Kernel versions 2.6.0 and later contain a driver written by Anton Altaparmakov (University of Cambridge) and Richard Russon. It supports file read, overwrite and resize.
- NTFSMount: A read/write userspace NTFS driver. It provides read-write access to NTFS, excluding writing compressed and encrypted files, changing file ownership, and access rights.
- Tuxera NTFS: High-performance read/write commercial kernel driver, mainly targeted for embedded devices, from Tuxera, which also develops the open source NTFS-3G driver.
- NTFS for Linux: A commercial driver with full read/write support available as free and non-free download(s) from Paragon Software Group.
- Captive NTFS (discontinued): A 'wrapping' driver which uses Windows' own driver, ntfs.sys.
Note that all three userspace drivers, namely NTFSMount, NTFS-3G and Captive NTFS, are built on the Filesystem in Userspace (FUSE), a Linux kernel module tasked with bridging userspace and kernel code to save and retrieve data. All drivers listed above (except Tuxera NTFS and Paragon NTFS for Linux) are open source (GPL). Due to the complexity of internal NTFS structures, both the built-in 2.6.14 kernel driver and the FUSE drivers disallow changes to the volume that are considered unsafe, to avoid corruption.
Others
eComStation and FreeBSD offer read-only NTFS support (there is a beta NTFS driver that allows write/delete for eComStation, but it is generally considered unsafe). A free third-party tool for BeOS, which was based on NTFS-3G, allows full NTFS read and write. NTFS-3G also works on Mac OS X, FreeBSD, NetBSD, Solaris, QNX and Haiku, in addition to Linux, through FUSE. A free for personal use read/write driver for MS-DOS called "NTFS4DOS" also exists. OpenBSD offers read-only NTFS support by default on i386 and amd64 platforms as of version 4.9, released 1 May 2011.
Conversion from other file systems
Microsoft provides a tool (convert.exe) to convert to NTFS from other file systems. Supported systems include HPFS (only on Windows NT 3), FAT16 and, on Windows 2000 and later, FAT32.
Resizing
Various third-party tools are capable of safely resizing NTFS partitions. Starting with Windows Vista, Microsoft added the built-in ability to shrink or expand a partition, but this capability is limited because it will not relocate page file fragments or files that have been marked as unmovable. So shrinking will often require relocating or disabling any page file, the index of Windows Search, and any Shadow Copy used by System Restore.
Universal time
For historical reasons, the versions of Windows that do not support NTFS all keep time internally as local zone time, and therefore so do all file systems other than NTFS that are supported by current versions of Windows. However, Windows NT and its descendants keep internal timestamps as UTC and make the appropriate conversions for display purposes. Therefore, NTFS timestamps are in UTC. This means that when files are copied or moved between NTFS and non-NTFS partitions, the OS needs to convert timestamps on the fly. But if some files are moved when daylight saving time (DST) is in effect, and other files are moved when standard time is in effect, there can be some ambiguities in the conversions. As a result, especially shortly after one of the days on which local zone time changes, users may observe that some files have timestamps that are incorrect by one hour. Due to the differences in implementation of DST between the northern and southern hemispheres, this can result in a potential timestamp error of up to 4 hours in any given 12 months.
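The one-hour discrepancy can be reproduced with plain arithmetic. The following Python sketch assumes a hypothetical zone that is UTC-5 in standard time and UTC-4 during DST; it does not model how Windows actually performs the conversion, only why the same zone-less local timestamp maps to two different UTC values depending on which offset is in force when the file is moved.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zone offsets chosen for illustration only.
STANDARD = timezone(timedelta(hours=-5))
DAYLIGHT = timezone(timedelta(hours=-4))


def local_to_utc(naive_local: datetime, offset: timezone) -> datetime:
    """Interpret a zone-less local timestamp with the given offset, then convert to UTC."""
    return naive_local.replace(tzinfo=offset).astimezone(timezone.utc)


def conversion_discrepancy(naive_local: datetime) -> timedelta:
    """Difference between converting with the standard offset and with the DST offset."""
    return local_to_utc(naive_local, STANDARD) - local_to_utc(naive_local, DAYLIGHT)


if __name__ == "__main__":
    stamp = datetime(2011, 7, 1, 12, 0)  # local time as stored on a non-NTFS volume
    print(local_to_utc(stamp, DAYLIGHT))
    print(local_to_utc(stamp, STANDARD))
    print(conversion_discrepancy(stamp))
```

Because the stored local timestamp carries no record of which offset applied when it was written, the converter has to pick one, and picking the wrong one shifts the result by exactly the DST adjustment.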
Internals
In NTFS, all file data, including file name, creation date, access permissions (by the use of access control lists), and contents, are stored as metadata in the Master File Table. This abstract approach allowed easy addition of file system features during Windows NT's development; an interesting example is the addition of fields for indexing used by the Active Directory software.
NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.). This means UTF-16 code units are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard).
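The difference between "any sequence of 16-bit values" and "valid UTF-16" can be shown with a small Python check. This is a sketch under the assumption that the 16-bit values are laid out in little-endian byte order; the helper names are invented for the example and are not part of any NTFS API.

```python
import struct


def units_to_bytes(units: list[int]) -> bytes:
    """Pack 16-bit values as little-endian bytes (byte order is an assumption)."""
    return struct.pack(f"<{len(units)}H", *units)


def is_valid_utf16(units: list[int]) -> bool:
    """Return True if the sequence of 16-bit values is well-formed UTF-16."""
    try:
        units_to_bytes(units).decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(is_valid_utf16([0x0041, 0x0042]))          # "AB"
    print(is_valid_utf16([0xD83D, 0xDE00]))          # a proper surrogate pair
    print(is_valid_utf16([0x0041, 0xD800, 0x0042]))  # lone surrogate between letters
```

A name such as the third one is acceptable to a file system that only stores 16-bit values, yet a strict UTF-16 decoder rejects it, which is why tools that read NTFS names cannot assume every name decodes cleanly.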
Internally, NTFS uses B+ trees to index file system data. Although complex to implement, this allows faster file look up times in most cases. A file system journal is used to guarantee the integrity of the file system metadata but not individual files' content. Systems using NTFS are known to have improved reliability compared to FAT file systems.
The Master File Table (MFT) contains metadata about every file, directory, and metafile on an NTFS volume. It includes filenames, locations, size, and permissions. Its structure supports algorithms which minimize disk fragmentation. A directory entry consists of a filename and a "file ID" which is the record number representing the file in the Master File Table. The file ID also contains a reuse count to detect stale references. While this strongly resembles the W_FID of Files-11, other NTFS structures radically differ.
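How a record number and a reuse count can share one identifier, and how the count exposes stale references, is sketched below in Python. The 48-bit/16-bit split is the layout commonly documented for NTFS file references, but it is not stated in this article and should be read as an assumption; the function names and the dictionary standing in for the MFT are hypothetical.

```python
RECORD_BITS = 48                       # assumed width of the MFT record number
RECORD_MASK = (1 << RECORD_BITS) - 1
MAX_REUSE_COUNT = 0xFFFF               # assumed 16-bit reuse count


def pack_file_id(record_number: int, reuse_count: int) -> int:
    """Combine a record number and a reuse count into one 64-bit file ID."""
    if not 0 <= record_number <= RECORD_MASK:
        raise ValueError("record number out of range")
    if not 0 <= reuse_count <= MAX_REUSE_COUNT:
        raise ValueError("reuse count out of range")
    return (reuse_count << RECORD_BITS) | record_number


def unpack_file_id(file_id: int) -> tuple[int, int]:
    """Split a file ID back into (record_number, reuse_count)."""
    return file_id & RECORD_MASK, file_id >> RECORD_BITS


def is_stale(file_id: int, current_reuse_counts: dict[int, int]) -> bool:
    """A reference is stale if the record was reused since the ID was issued."""
    record_number, reuse_count = unpack_file_id(file_id)
    return current_reuse_counts.get(record_number) != reuse_count


if __name__ == "__main__":
    file_id = pack_file_id(27, 3)
    print(unpack_file_id(file_id))
    print(is_stale(file_id, {27: 3}))  # record still holds the same file
    print(is_stale(file_id, {27: 4}))  # record was freed and reused
```

The design choice illustrated here is that a directory entry does not need to be updated when a record is recycled: the mismatch in counts is enough to tell that the old reference no longer points at the same file.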
Metafiles
NTFS contains several files which define and organize the file system. In all respects, most of these files are structured like any other user file ($Volume being the most peculiar), but are not of direct interest to file system clients. These metafiles define files, back up critical file system data, buffer file system changes, manage free space allocation, satisfy BIOS expectations, track bad allocation units, and store security and disk space usage information. All content is in an unnamed data stream, unless otherwise indicated.
Segment Number | File Name | Purpose |
---|---|---|
0 | $MFT | Describes all files on the volume, including file names, timestamps, stream names, and lists of cluster numbers where data streams reside, indexes, security identifiers, and file attributes like "read only", "compressed", "encrypted", etc. |
1 | $MFTMirr | Duplicate of the first vital entries of $MFT, usually 4 entries (4 kilobytes). |
2 | $LogFile | Contains transaction log of file system metadata changes. |
3 | $Volume | Contains information about the volume, namely the volume object identifier, volume label, file system version, and volume flags (mounted, chkdsk requested, requested $LogFile resize, mounted on NT 4, volume serial number updating, structure upgrade request). This data is not stored in a data stream, but in special MFT attributes: If present, a volume object ID is stored in an $OBJECT_ID record; the volume label is stored in a $VOLUME_NAME record, and the remaining volume data is in a $VOLUME_INFORMATION record. Note: volume serial number is stored in file $Boot (below). |
4 | $AttrDef | A table of MFT attributes which associates numeric identifiers with names. |
5 | . | Root directory. Directory data is stored in $INDEX_ROOT and $INDEX_ALLOCATION attributes both named $I30. |
6 | $Bitmap | An array of bit entries: each bit indicates whether its corresponding cluster is used (allocated) or free (available for allocation). |
7 | $Boot | Volume boot record. This file is always located at the first clusters on the volume. It contains bootstrap code (see NTLDR/BOOTMGR) and a BIOS parameter block including a volume serial number and cluster numbers of $MFT and $MFTMirr. $Boot is usually 8192 bytes long. |
8 | $BadClus | A file which contains all the clusters marked as having bad sectors. This file simplifies cluster management by the chkdsk utility, both as a place to put newly discovered bad sectors, and for identifying unreferenced clusters. This file contains two data streams, even on volumes with no bad sectors: an unnamed stream contains bad sectors (it is zero length for perfect volumes); the second stream is named $Bad and contains all clusters on the volume not in the first stream. |
9 | $Secure | Access control list database which reduces the overhead of having many identical ACLs stored with each file, by storing each unique ACL in this database only (contains two indices: $SII (Standard_Information ID) and $SDH (Security Descriptor Hash) which index the stream named $SDS containing the actual ACL table). |
10 | $UpCase | A table of Unicode uppercase characters for ensuring case insensitivity in Win32 and DOS namespaces. |
11 | $Extend | A filesystem directory containing various optional extensions, such as $Quota, $ObjId, $Reparse or $UsnJrnl. |
12 ... 23 | | Reserved for $MFT extension entries. |
usually 24 | $Extend\$Quota | Holds disk quota information. Contains two index roots, named $O and $Q. |
usually 25 | $Extend\$ObjId | Holds distributed link tracking information. Contains an index root and allocation named $O. |
usually 26 | $Extend\$Reparse | Holds reparse point data (such as symbolic links). Contains an index root and allocation named $R. |
27 ... | file.ext | Beginning of regular file entries. |
These metafiles are treated specially by Windows and are difficult to directly view: special purpose-built tools are needed.
One such tool is nfi.exe, the "NTFS File Sector Information Utility", which is freely distributed as part of the Microsoft "OEM Support Tools".
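The $Bitmap entry in the table above describes one bit per cluster. A minimal Python sketch of reading such a bitmap follows. It assumes that cluster 0 corresponds to the least significant bit of the first byte, which is an assumption of this example rather than something the table states, and the function names are illustrative only.

```python
def cluster_is_allocated(bitmap: bytes, cluster: int) -> bool:
    """Return True if the bit for the given cluster number is set."""
    byte_index, bit_index = divmod(cluster, 8)
    # Assumed ordering: bit 0 of each byte describes the lowest-numbered cluster.
    return bool((bitmap[byte_index] >> bit_index) & 1)


def count_free_clusters(bitmap: bytes, total_clusters: int) -> int:
    """Count clusters whose bit is clear, i.e. available for allocation."""
    return sum(
        1
        for cluster in range(total_clusters)
        if not cluster_is_allocated(bitmap, cluster)
    )


if __name__ == "__main__":
    # Two bytes describe 16 clusters; clusters 0, 2 and 15 are marked as used.
    sample = bytes([0b00000101, 0b10000000])
    print(cluster_is_allocated(sample, 0))
    print(cluster_is_allocated(sample, 1))
    print(count_free_clusters(sample, 16))
```

One bit per cluster keeps the structure compact: a volume with a million clusters needs only about 122 KB of bitmap, at the cost of a linear scan when searching for free space.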
From MFT records to attribute lists, attributes, and streams
For each file (or directory) described in an MFT record, there is a linear repository of stream descriptors (also named attributes), packed together in a variable-length record (also named an attributes list) with extra padding to fill the fixed 1 KB size of every MFT record. This repository fully describes the effective streams associated with that file.
Each stream (or attribute) itself has a single type (internally just a fixed-size integer in the stored descriptor, but most often handled in applications using an equivalent symbolic name in the OpenFile or CreateFile API call), a single optional stream name (completely unrelated to the effective filenames), plus optional associated data for that stream. For NTFS, the standard data of files and the index data of directories are handled the same way as the data of alternate data streams or of standard attributes. They are just one of the attributes stored in one or several attribute lists.
- For each file described in the MFT record (or in the non-resident repository of stream descriptors, see below), the stream descriptors identified by their (stream type value, stream name) must be unique. Additionally, NTFS has some ordering constraints for these descriptors.
- There's a predefined null stream type, used to indicate the end of the list of stream descriptors in the streams repository for that file. It must be present as the last stream descriptor in each stream repository (all other storage space available after it is ignored and consists only of padding bytes to match the record size in the MFT or a cluster size in a non-resident streams repository).
- Some stream types are required and must be present in each MFT record, except unused records that are just indicated by a stream with null stream type.
- This is the case for the standard attributes, which are stored as a fixed-size record containing the timestamps and other basic single-bit attributes (compatible with those managed by FAT/FAT32 in DOS or Windows 95/98 applications).
- Some stream types cannot have a name and must remain anonymous.
- This is the case for the standard attributes, or for the preferred NTFS "filename" stream type, or the "short filename" stream type, when it is also present (for compatibility with DOS-like applications, see below). It is also possible for a file to only contain a short filename, in which case it will be the preferred one, as listed in the Windows Explorer.
- The filename streams stored in the streams repository do not make the file immediately accessible through the hierarchical filesystem. In fact, all the filenames must be indexed separately in at least one separate directory on the same volume, with its own MFT entry and its own security descriptors and attributes, that will reference the MFT entry number for that file. This allows the same file or directory to be "hardlinked" several times from several containers on the same volume, possibly with distinct filenames.
- The default data stream of a regular file is a stream of type $DATA but with an anonymous name, and the ADS's are similar but must be named.
- In contrast, the default data stream of a directory has a distinct type and is not anonymous: it has a stream name ("$I30" in NTFS 3+) that reflects its indexing format.
All streams of a given file may be displayed by using nfi.exe ("NTFS File Sector Information Utility"), which is freely distributed as part of the Microsoft "OEM Support Tools".
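The rules listed above can be illustrated with a small in-memory model. This is only a sketch of the constraints as this section states them, not of the on-disk format: the class `StreamDescriptor`, the function `check_descriptors`, and the `END` marker value are hypothetical names chosen for illustration, and the stream types are represented by symbolic names rather than their stored integer codes.

```python
from dataclasses import dataclass
from typing import List, Optional

# Symbolic stream type names; the stored integer codes are deliberately omitted.
STANDARD_INFORMATION = "$STANDARD_INFORMATION"
FILE_NAME = "$FILE_NAME"
DATA = "$DATA"
INDEX_ROOT = "$INDEX_ROOT"
END = "END"  # hypothetical stand-in for the end-of-list stream type

# Stream types that must remain anonymous, per the rules above.
MUST_BE_ANONYMOUS = {STANDARD_INFORMATION, FILE_NAME}


@dataclass(frozen=True)
class StreamDescriptor:
    stream_type: str
    name: Optional[str] = None  # None means an anonymous stream
    data: bytes = b""


def check_descriptors(descriptors: List[StreamDescriptor]) -> List[str]:
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    if not descriptors or descriptors[-1].stream_type != END:
        problems.append("list must end with the end marker")
    if descriptors and descriptors[0].stream_type == END:
        return problems  # unused record: only the end marker is expected
    seen = set()
    for d in descriptors:
        if d.stream_type == END:
            break  # anything after the marker is padding and is ignored
        key = (d.stream_type, d.name)
        if key in seen:
            problems.append(f"duplicate descriptor {key}")
        seen.add(key)
        if d.stream_type in MUST_BE_ANONYMOUS and d.name is not None:
            problems.append(f"{d.stream_type} must be anonymous")
    if STANDARD_INFORMATION not in {t for t, _ in seen}:
        problems.append("missing required standard attributes")
    return problems


# A regular file: anonymous default $DATA plus one named alternate data stream.
regular_file = [
    StreamDescriptor(STANDARD_INFORMATION),
    StreamDescriptor(FILE_NAME, data=b"report.txt"),
    StreamDescriptor(DATA, data=b"hello"),
    StreamDescriptor(DATA, "notes", b"alternate stream"),
    StreamDescriptor(END),
]
print(check_descriptors(regular_file))
```

A directory would be modelled the same way, with a named `INDEX_ROOT` descriptor (name `"$I30"`) in place of the anonymous `DATA` stream.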
Resident vs. non-resident data streams
To optimize storage and reduce the I/O overhead for the very common case of streams with very small associated data, NTFS prefers to place this data within the stream descriptor itself, provided that the descriptor then does not exceed the maximum size of the MFT record or the maximum size of a single entry within a non-resident stream repository (see below). Otherwise, the stream descriptor does not store the data directly but just stores an allocation map listing the clusters that hold the actual data elsewhere on the volume. When the stream data can be accessed directly from within the stream descriptor, it is called "resident data" by computer forensics workers. The amount of data which fits is highly dependent on the file's characteristics, but 700 to 800 bytes is common in single-stream files with non-lengthy filenames and no ACLs.
- Some stream descriptors (such as the preferred filename, the basic file attributes, or the main allocation map for each non-resident stream) cannot be made non-resident.
- Data streams that are encrypted by NTFS, sparse, or compressed cannot be made resident.
- The format of the allocation map for non-resident streams depends on its capability of supporting sparse data storage. In the current implementation of NTFS, once the data of a non-resident stream has been marked and converted as sparse, it cannot be reverted to non-sparse data, so it cannot become resident again unless this data is fully truncated, discarding the sparse allocation map completely.
- When a non-resident data stream is so fragmented that its effective allocation map cannot fit entirely within the MFT record, the allocation map may also be stored as a non-resident stream, with just a small resident stream containing the indirect allocation map to the effective non-resident allocation map of the non-resident data stream.
- When there are too many streams for a file (including ADS's, extended attributes, or security descriptors), so that their descriptors cannot all fit within the MFT record, a non-resident stream may also be used to store an additional repository for the other stream descriptors (except those few small streams that cannot be non-resident), using the same format as the one used in the MFT record, but without the space constraints of the MFT record.
The NTFS filesystem driver will sometimes attempt to relocate the data of some of these non-resident streams into the streams repository, and will also attempt to relocate the stream descriptors stored in a non-resident repository back to the stream repository of the MFT record, based on priority and preferred ordering rules, and size constraints.
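The basic residency decision described in this section can be sketched as a simple size check. This is a simplified illustration and not the driver's actual logic: the function name `can_be_resident` and its `other_bytes` parameter (the space already taken in the record by the header and the other stream descriptors) are assumptions made for the example, and the priority and ordering rules mentioned above are not modelled.

```python
MFT_RECORD_SIZE = 1024  # the fixed 1 KB MFT record size described in the text


def can_be_resident(data_len: int, other_bytes: int, *,
                    encrypted: bool = False, sparse: bool = False,
                    compressed: bool = False,
                    record_size: int = MFT_RECORD_SIZE) -> bool:
    """Rough check of whether stream data could live inside the MFT record.

    other_bytes is a made-up input: the bytes already used in the record by
    its header and by the other stream descriptors, which vary per file.
    """
    if encrypted or sparse or compressed:
        return False  # such streams cannot be made resident
    return other_bytes + data_len <= record_size


# Assuming 300 bytes of other descriptors (an illustrative figure only):
print(can_be_resident(700, 300))                   # fits in the record
print(can_be_resident(800, 300))                   # needs an allocation map
print(can_be_resident(100, 300, compressed=True))  # never resident
```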
Since resident files do not directly occupy clusters ("allocation units"), it is possible for an NTFS volume to contain more files than there are clusters. For example, NTFS formats a 74.5 GB partition with 19,543,064 clusters of 4 KB. Subtracting system files (a 64 MB log file, a 2,442,888-byte Bitmap file, and about 25 clusters of fixed overhead) leaves 19,526,158 clusters free for files and indices. Since there are four 1 KB MFT records per 4 KB cluster, this volume could theoretically hold almost 4 × 19,526,158 = 78,104,632 resident files.
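The last step of this example can be reproduced with a few lines of arithmetic. The sketch below takes the free-cluster count quoted in the text as given rather than re-deriving it, and the variable names are chosen for illustration only.

```python
CLUSTER_SIZE = 4 * 1024      # 4 KB clusters, as in the example
MFT_RECORD_SIZE = 1024       # fixed 1 KB MFT record
total_clusters = 19_543_064  # clusters on the formatted partition
free_clusters = 19_526_158   # figure quoted in the text, taken as given

records_per_cluster = CLUSTER_SIZE // MFT_RECORD_SIZE
max_resident_files = records_per_cluster * free_clusters
partition_gib = total_clusters * CLUSTER_SIZE / 2**30

print(records_per_cluster)      # 4
print(f"{max_resident_files:,}")  # 78,104,632
print(round(partition_gib, 2))  # 74.55
```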
Limitations
The following are a few limitations of NTFS:
File Names: File names are limited to 255 UTF-16 code units. Certain names are reserved in the volume root directory and cannot be used for files. These are: $MFT, $MFTMirr, $LogFile, $Volume, $AttrDef, . (dot), $Bitmap, $Boot, $BadClus, $Secure, $Upcase, and $Extend. Of these, . (dot) and $Extend are directories; the others are files. The NT kernel limits full paths to 32,767 UTF-16 code units.
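A minimal sketch of checking a candidate name against these two rules follows. The helper names `utf16_units` and `name_problems` are hypothetical. The length is counted in 16-bit UTF-16 units, so a character outside the Basic Multilingual Plane counts twice. Reserved names are compared exactly as listed, which is a simplification.

```python
RESERVED_ROOT_NAMES = {
    "$MFT", "$MFTMirr", "$LogFile", "$Volume", "$AttrDef", ".",
    "$Bitmap", "$Boot", "$BadClus", "$Secure", "$Upcase", "$Extend",
}
MAX_NAME_UNITS = 255


def utf16_units(name: str) -> int:
    """Number of 16-bit units needed to store the name as UTF-16."""
    return len(name.encode("utf-16-le")) // 2


def name_problems(name: str, in_volume_root: bool = False) -> list:
    """Return the reasons a name would be rejected (empty list if none)."""
    problems = []
    if utf16_units(name) > MAX_NAME_UNITS:
        problems.append("name is longer than 255 UTF-16 units")
    if in_volume_root and name in RESERVED_ROOT_NAMES:
        problems.append("name is reserved in the volume root directory")
    return problems


print(name_problems("a" * 255))                    # []
print(name_problems("a" * 256))                    # too long
print(name_problems("$MFT", in_volume_root=True))  # reserved in the root
print(utf16_units("\U0001F600"))                   # 2 (one surrogate pair)
```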
Maximum Volume Size: In theory, the maximum NTFS volume size is 2^64−1 clusters. However, the maximum NTFS volume size as implemented in Windows XP Professional is 2^32−1 clusters. For example, using 64 kB clusters, the maximum Windows XP NTFS volume size is 256 TB minus 64 kB. Using the default cluster size of 4 kB, the maximum NTFS volume size is 16 TB minus 4 kB. (Both of these are vastly higher than the 128 GB limit lifted in Windows XP SP1.) Because partition tables on master boot record (MBR) disks only support partition sizes up to 2 TB, dynamic or GPT volumes must be used to create NTFS volumes over 2 TB. Booting from a GPT volume to a Windows environment requires a system with UEFI and 64-bit support.
Maximum File Size: As designed, the maximum NTFS file size is 16 EB minus 1 KB or 18,446,744,073,709,550,592 bytes. As implemented, the maximum NTFS file size is 16 TB minus 64 kB or 17,592,185,978,880 bytes.
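The volume and file size figures quoted above follow directly from the cluster counts and cluster sizes. The short sketch below re-derives them, reading the Windows XP limit as 2^32 − 1 clusters and using binary units (1 kB = 2^10 bytes, 1 TB = 2^40 bytes, 1 EB = 2^60 bytes). The function name `max_volume_bytes` is chosen for illustration.

```python
KB, TB, EB = 2**10, 2**40, 2**60


def max_volume_bytes(cluster_size: int, max_clusters: int = 2**32 - 1) -> int:
    """Largest volume size for a given cluster size and cluster-count limit."""
    return max_clusters * cluster_size


xp_64k = max_volume_bytes(64 * KB)  # Windows XP limit with 64 kB clusters
xp_4k = max_volume_bytes(4 * KB)    # Windows XP limit with 4 kB clusters
designed_max_file = 16 * EB - 1 * KB
implemented_max_file = 16 * TB - 64 * KB

print(xp_64k == 256 * TB - 64 * KB)  # True
print(xp_4k == 16 * TB - 4 * KB)     # True
print(f"{designed_max_file:,}")      # 18,446,744,073,709,550,592
print(f"{implemented_max_file:,}")   # 17,592,185,978,880
```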
Alternate Data Streams: Windows system calls may handle alternate data streams. Depending on the operating system, utility and remote file system, a file transfer might silently strip data streams. A safe way of copying or moving files is to use the BackupRead and BackupWrite system calls, which allow programs to enumerate streams, to verify whether each stream should be written to the destination volume and to knowingly skip unwanted streams.
Developers
NTFS developers include:
- Tom Miller
- Gary Kimura
- Brian Andrew
- David Goebel
See also
- Comparison of file systems
- Files-11: ODS-2 has similarities to NTFS (compare INDEXF.SYS and $Mft, and BITMAP.SYS and $Bitmap, for example)
- HPFS, file system created for the OS/2 operating system
- ntfsresize
- Samba (software)
External links
- Documentation:
- Microsoft NTFS Technical Reference
- FSUtil file operations, useful for manipulating or creating files in an NTFS volume from Windows
- NTFS.com – documentation and resources for NTFS
- Low-level description of NTFS disk structures from the Linux-NTFS project
- Implementations:
- NTFS-3G – an open source read/write NTFS driver for Linux, FreeBSD, Mac OS X, NetBSD, Solaris and Haiku.
- Linux-NTFS – an open source project to add NTFS support to the Linux kernel (write support is limited), and write POSIX-compatible utilities for accessing and manipulating NTFS (ntfsprogs; includes ntfsls, ntfsresize, ntfsclone, etc.). Linux NTFS FAQ and howto
- Captive NTFS – a shim which used the Windows NTFS driver to access NTFS file systems under Linux
- Change Log Journal Parser – useful tool for parsing the Windows NTFS Change Log Journal on live systems.