CFS (Compact File Set file format)

Overview

The Compact File Set (CFS) is an open archive file format and software distribution container file format.
Basic CFS files are compatible with ISO files. The format is intended to be similar enough to ISO-9660 that many systems and applications can read CFS unchanged, while other applications will require only minor modifications. It is based on:
  • ISO-9660
  • Joliet Extensions
  • ISO-9660:1999
  • Compact ISO


It is available for use in free or commercial applications without charge. No part of the format is believed to be covered by patents.

The primary application is expected to be container files for various archiving and distribution applications, but CFS may also be useful when written directly to CD/DVD media.

Goals

  • Simplify use with data compression and with non-seeking storage (pipes, sockets, tape).
  • Simplify implementation of read and write applications compared to traditional ISO-9660/UDF based images.
  • Improve consistency and interchange of data between different applications.
  • Simplify implementation of applications that modify images.
  • Increase storage efficiency by using less image space for media structures and duplicated directory data.
  • Eliminate the folder count limitation imposed in ISO-9660 by the path table.
  • Eliminate the file size limitations imposed by various compatibility restrictions with use of ISO-9660 and UDF.

Main differences of CFS from ISO-9660

  • The layout and contents of the media header (first 40K) are fixed, always containing the same sequence of volume structures and data.
  • All file names and text fields are stored as big-endian UCS-2, as specified in the Joliet extensions.
  • Arbitrary limits on file name length and directory depth are removed; file names are bounded only by the ISO-9660 file record structure, at 110 16-bit characters.
  • All directory data is written after the last block of file data.
  • Readers are expected to handle files over 4GB in size.
  • Path tables are optionally generated but are not used.

Media header

The first 20 blocks (40K) of the logical image are the media header. The layout of the media header is compatible with the various descriptor and directory structures for ISO-9660. The first block of file data is stored in block 20, immediately following the media header.

The media header has the following layout:
  • block 0-11: all zero
  • block 12: compatibility readme file text
  • block 13: compatibility root folder
  • block 14: compatibility little-endian path table
  • block 15: compatibility big-endian path table
  • block 16: ISO-9660 compatibility primary volume descriptor
  • block 17: ISO-9660 supplementary volume descriptor
  • block 18: ISO-9660 terminating descriptor
  • block 19: all zero

The primary volume descriptor in the media header references the fixed compatibility root folder and readme, to help users identify applications and systems that do not use the supplementary volume descriptor. The supplementary volume descriptor indicates the UCS-2 character set and references the real directory structure. The media header should be initialized exactly as is done in the logic in this header file. No additional application data, system data, comments, dates, text, etc., should be added to the media header.
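
As a rough illustration of the fixed layout, the sketch below maps each media header block to its role and classifies a volume descriptor block. It assumes the 2048-byte block size implied by 20 blocks occupying 40K, and the standard ISO-9660 descriptor framing (a type byte followed by the identifier CD001). The names HEADER_LAYOUT, block_offset, block_role and descriptor_type are invented for this example and are not part of the CFS specification.

```python
BLOCK_SIZE = 2048          # assumed: 40K media header / 20 blocks
HEADER_BLOCKS = 20         # file data starts at block 20

# Role of each media header block, as listed in the layout above.
HEADER_LAYOUT = {
    12: "compatibility readme file text",
    13: "compatibility root folder",
    14: "compatibility little-endian path table",
    15: "compatibility big-endian path table",
    16: "primary volume descriptor",
    17: "supplementary volume descriptor",
    18: "terminating descriptor",
}

# Standard ISO-9660 volume descriptor type codes.
DESCRIPTOR_TYPES = {1: "primary", 2: "supplementary", 255: "terminator"}


def block_offset(block):
    """Byte offset of a logical block within the image."""
    return block * BLOCK_SIZE


def block_role(block):
    """Role of a media header block; unlisted header blocks are all zero."""
    if not 0 <= block < HEADER_BLOCKS:
        raise ValueError("block is outside the media header")
    return HEADER_LAYOUT.get(block, "all zero")


def descriptor_type(block_data):
    """Classify one 2048-byte volume descriptor, or return None."""
    if len(block_data) != BLOCK_SIZE or block_data[1:6] != b"CD001":
        return None
    return DESCRIPTOR_TYPES.get(block_data[0])
```

A reader could, for example, seek to block_offset(17) and confirm that descriptor_type reports "supplementary" before following the reference to the real directory structure.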

Unicode file names

All file names and the system ID and volume ID fields of the
supplementary volume descriptor are encoded as big-endian
UCS-2.

File name lengths are limited by the 8-bit file record size
to 110 16-bit characters.
No arbitrary limits are imposed on directory hierarchy depth
or on the combined length of a file name and its enclosing
folder name components. Readers will need to choose an
appropriate limit for their environment and perform checks
as necessary.
As in ISO-9660:1999, version numbers are not added to file
names.
As in ISO-9660:1999, the special meaning of the '.' and ';'
characters during file name sorting is eliminated.
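
The 110-character figure can be reproduced with a little arithmetic. The sketch below assumes the usual ISO-9660 directory record layout: 33 fixed bytes before the name, a total record length stored in one byte, and one padding byte after a name of even byte length. The function names max_name_chars and encode_name are hypothetical helpers for this example only.

```python
RECORD_FIXED_BYTES = 33   # ISO-9660 directory record bytes before the name
MAX_RECORD_BYTES = 255    # the record length is stored in a single byte


def max_name_chars():
    """Largest UCS-2 name (in 16-bit characters) that fits in one record."""
    chars = (MAX_RECORD_BYTES - RECORD_FIXED_BYTES) // 2
    # A UCS-2 name always has an even byte length, so ISO-9660 adds one
    # padding byte after it; step down until the padded record fits.
    while RECORD_FIXED_BYTES + 2 * chars + 1 > MAX_RECORD_BYTES:
        chars -= 1
    return chars


def encode_name(name):
    """Encode a file name as big-endian UCS-2, enforcing the record limit."""
    if any(ord(ch) > 0xFFFF for ch in name):
        raise ValueError("character outside the UCS-2 range")
    if len(name) > max_name_chars():
        raise ValueError("name does not fit in a file record")
    # For characters in the Basic Multilingual Plane, UTF-16 and UCS-2
    # produce the same two bytes per character.
    return name.encode("utf-16-be")
```

Limits on directory depth and total path length are deliberately left out, since the format leaves those to each reader.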

Optional path tables

Path tables consume media space with redundant information,
and restrict media to a maximum of 64K folders. Readers
should not reference path tables.
Writers may choose to generate path tables to increase
compatibility with ISO-9660 readers. Path tables must be
written with the directory data (folder extents), beyond
the last block of file data. Note that correct path tables
cannot be generated for media containing more than 64K
folders.
Writers that are modifying existing media may choose to
remove existing path tables.
If path tables are not present then the three related volume
descriptor fields in the supplementary volume descriptor
must be set to zero.

Extended attributes

Extended attributes are reserved for future extensions to
CFS. Writers must not create extended attributes. Readers
must gracefully handle extended attributes if they exist.

Contiguous file data and restricted use of duplicate file records for multi-extent files

All data for each file must exist in one contiguous extent.
This is true even when the file is represented using
multiple file records.
Interleaved files must not be created. Associated files
must not be created.

Duplicate file records are to be used only to allow
representing files with data extents that are larger
than 4GiB-2048. Duplicate file records are not to be used
to represent files with fragmented data. When duplicate
file records are used, the multi-extent flag must also
be used as specified in ISO-9660:1999.
Duplicate file records should not be created unless the
total data size of the file is greater than 4GiB-2048.
When duplicate file records exist for a file, all but
the last file record must have a data extent that is
exactly 4GiB-2048 bytes in size.
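
The rules above can be sketched as a small function that splits one contiguous file into file records. It assumes 2048-byte blocks and the ISO-9660 multi-extent flag (bit 7 of the file flags, set on every record except the last); the name file_records and the tuple layout are illustrative, not taken from the specification.

```python
BLOCK_SIZE = 2048
MAX_EXTENT_BYTES = 4 * 1024**3 - BLOCK_SIZE   # 4GiB-2048
MULTI_EXTENT_FLAG = 0x80                      # ISO-9660 file flags, bit 7


def file_records(start_block, size):
    """Return (start_block, byte_length, flags) for each file record."""
    records = []
    block = start_block
    remaining = size
    # Every record except the last covers exactly 4GiB-2048 bytes.
    while remaining > MAX_EXTENT_BYTES:
        records.append((block, MAX_EXTENT_BYTES, MULTI_EXTENT_FLAG))
        block += MAX_EXTENT_BYTES // BLOCK_SIZE   # data stays contiguous
        remaining -= MAX_EXTENT_BYTES
    records.append((block, remaining, 0))
    return records
```

A file of 4GiB-2048 bytes or less yields a single record with no flag set, so duplicate records appear only when the size requires them.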

Location of directory data on media

All file data must precede all folder extents and path tables
on media. The intent is that an image modifying application
can read the entire directory into memory, add new file data
to the image, and rewrite an updated directory after the new
file data.
Writers will need to determine the last block of file data
after reading the entire directory.
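
One way a writer might find where new file data can start is sketched below. It assumes the writer has already collected the (start block, byte length) pair of every file extent from the directory, and that blocks are 2048 bytes; end_of_file_data is a hypothetical helper name.

```python
BLOCK_SIZE = 2048
HEADER_BLOCKS = 20   # file data begins right after the media header


def end_of_file_data(extents):
    """First block after the last file data block.

    `extents` holds (start_block, byte_length) pairs for file data only;
    folder extents and path tables are excluded because they get rewritten.
    """
    end = HEADER_BLOCKS
    for start_block, byte_length in extents:
        blocks = (byte_length + BLOCK_SIZE - 1) // BLOCK_SIZE  # round up
        end = max(end, start_block + blocks)
    return end
```

New file data would then be appended from the returned block onward, followed by the updated directory.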

Media header patch area

When the media header is modified, either at the end of image
creation or as part of later modifications to an existing
image, only some specific fields are to be updated. These
fields exist entirely within the media header patch area.
Only the media header patch area should be re-written. This
allows more options when dealing with image container file
formats or transports with limited seeking or overwrite
capability (compressed formats, pipes, sockets).

Format extensions and compound file systems

All files and folders written in the image must be accessible
through the single directory structure referenced from the
supplementary volume descriptor.
Compound file systems, such as those including UDF or HFS
structures, are not allowed.
Rock Ridge and other ISO-9660 extensions are not allowed.

Extensions for archiving system specific attributes

Future versions of CFS may include extensions to allow storing
system specific attributes such as time fields, security
descriptors, access control lists, resource forks, symbolic
links, etc. Developers with a need for these extensions should
contact Pismo Technic with requirements and/or suggestions.

Media formats

CFS images are either written to CD/DVD media, or are stored
in a media container file. The media container file can be
a raw dump of the CFS image, referred to here as DD, but
more commonly known as ISO files. Also, the media container
file can be a more structured container format that provides
additional features such as compression and spanning.
CFS images are only compliant with this specification when
they are stored in DD or CISO (Compact ISO) format media
files. When burned to CD/DVD media or when stored in other
media container file formats such as NRG or DAA, the
combination is not CFS compliant and should not be referred
to as a CFS file.

Note: Compact ISO is not the same format as the compressed
ISO format common in PlayStation Portable homebrew development.
The PSP compressed ISO format is also referred to as CISO, but
its file extension is CSO.


CFS writing applications should default to writing DD format
media container files unless the user has specified container
file options that require CISO (spanning, compression, ...). This
provides more intuitive interchange with systems and applications
that support DD CD/DVD images but do not support CFS.

See also

  • Comparison of archive formats
  • List of archive formats
  • Free file format
  • Open format


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 