ISAM
ISAM stands for Indexed Sequential Access Method, a method for indexing data for fast retrieval. ISAM was originally developed by IBM for mainframe computers. Today the term is used for several related concepts:
- Specifically, the IBM ISAM product and the algorithm it employs.
- A database system where an application developer directly uses an Application Programming Interface to search indexes in order to locate records in data files. In contrast, a relational database uses a query optimizer which automatically selects indexes.
- An indexing algorithm that allows both sequential and keyed access to data. Most databases now use some variation of the B-Tree for this purpose, although the original IBM ISAM and VSAM implementations did not do so.
- Most generally, any index for a database. Indexes are used by almost all databases, both relational and otherwise.
In an ISAM system, data is organized into records which are composed of fixed-length fields. Records are stored sequentially, originally to speed access on a tape system. A secondary set of structures known as indexes contains "pointers" into the tables, allowing individual records to be retrieved without having to search the entire data set. This is a departure from the contemporaneous navigational databases, in which the pointers to other data were stored inside the records themselves. The key improvement in ISAM is that the indexes are small and can be searched quickly, thereby allowing the database to access only the records it needs. Additionally, modifications to the data do not require changes to other data, only to the table and indexes in question.
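The layout described above can be illustrated with a minimal sketch. It assumes a made-up record format (a 4-byte integer key and a 12-byte name field) and a sparse index holding the first key of each block; the names `RECORD`, `build_file`, `build_index` and `lookup` are hypothetical and do not correspond to any IBM interface.

```python
import bisect
import struct

# Hypothetical fixed-length record: 4-byte integer key plus a 12-byte name.
RECORD = struct.Struct("<i12s")


def build_file(rows):
    """Pack rows in key order into one contiguous byte string (the data file)."""
    return b"".join(RECORD.pack(key, name.encode()) for key, name in sorted(rows))


def build_index(data, records_per_block=2):
    """Sparse index: the first key of each block and that block's byte offset."""
    block_size = RECORD.size * records_per_block
    index = []
    for offset in range(0, len(data), block_size):
        key, _ = RECORD.unpack_from(data, offset)
        index.append((key, offset))
    return index, block_size


def lookup(data, index, block_size, key):
    """Search the small index, then scan only the one block it points to."""
    keys = [k for k, _ in index]
    slot = bisect.bisect_right(keys, key) - 1
    if slot < 0:
        return None  # key is smaller than every key in the file
    start = index[slot][1]
    end = min(start + block_size, len(data))
    for offset in range(start, end, RECORD.size):
        k, name = RECORD.unpack_from(data, offset)
        if k == key:
            return name.rstrip(b"\x00").decode()
    return None


rows = [(30, "carol"), (10, "alice"), (20, "bob"), (40, "dave"), (50, "erin")]
data = build_file(rows)
index, block_size = build_index(data)
```

Because every record has the same size, a byte offset is enough to locate one, and the index stays much smaller than the data it describes. Calling `lookup(data, index, block_size, 40)` reads a single block rather than the whole file.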
When an ISAM file is created, the index nodes are fixed, and their pointers do not change during later inserts and deletes (only the content of leaf nodes changes afterwards). As a consequence, if inserts to a leaf node exceed the node's capacity, new records are stored in overflow chains. If there are many more inserts than deletions from a table, these overflow chains can gradually become very large, which increases the time required to retrieve a record.
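A toy model may make the overflow behaviour concrete. The sketch below assumes a non-empty, pre-sorted set of keys and a leaf capacity of two; the class `StaticIsam` and its methods are illustrative names, not part of any real ISAM product.

```python
import bisect


class StaticIsam:
    """Toy ISAM file: index fixed at build time, overflow chains for extra inserts."""

    def __init__(self, sorted_keys, leaf_capacity=2):
        self.capacity = leaf_capacity
        # Primary leaves are filled once, when the file is created.
        self.leaves = [
            list(sorted_keys[i:i + leaf_capacity])
            for i in range(0, len(sorted_keys), leaf_capacity)
        ]
        # The index (lowest key of each leaf) never changes after this point.
        self.index = [leaf[0] for leaf in self.leaves]
        self.overflow = [[] for _ in self.leaves]

    def _leaf_for(self, key):
        return max(bisect.bisect_right(self.index, key) - 1, 0)

    def insert(self, key):
        i = self._leaf_for(key)
        if len(self.leaves[i]) < self.capacity:
            bisect.insort(self.leaves[i], key)
        else:
            self.overflow[i].append(key)  # chain grows; index is untouched

    def delete(self, key):
        i = self._leaf_for(key)
        for bucket in (self.leaves[i], self.overflow[i]):
            if key in bucket:
                bucket.remove(key)
                return True
        return False

    def search(self, key):
        """Return (found, records_examined) to show the cost of long chains."""
        i = self._leaf_for(key)
        examined = 0
        for k in self.leaves[i] + self.overflow[i]:
            examined += 1
            if k == key:
                return True, examined
        return False, examined


f = StaticIsam([10, 20, 30, 40])
for k in (21, 22, 23, 24):
    f.insert(k)  # the first leaf is already full, so these go to its chain
```

After the four inserts, `f.index` is unchanged, but a search for a key at the end of the chain has to examine every record in the leaf and its overflow chain first. Reorganizing the file (rebuilding leaves and index from scratch) is the usual remedy, which this sketch leaves out.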
Relational databases can easily be built on an ISAM framework with the addition of logic to maintain the validity of the links between the tables. Typically the field being used as the link, the foreign key, will be indexed for quick lookup. While this is slower than simply storing the pointer to the related data directly in the records, it also means that changes to the physical layout of the data do not require any updating of the pointers: the entry will still be valid.
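The trade-off can be sketched with two in-memory tables. Everything here is an assumption made for illustration: the tables, the field names and the helpers `build_index` and `orders_for` are hypothetical, and a list position stands in for a record's physical location.

```python
from collections import defaultdict

# Hypothetical tables: each record is a dict, and its list position stands in
# for the physical location of the record.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]  # parent table
orders = [
    {"order_id": 100, "customer_id": 2},
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 2},
]


def build_index(table, field):
    """Map each value of `field` to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(table):
        index[record[field]].append(position)
    return index


def orders_for(customer_id, orders, fk_index):
    """Follow the link by key value through the index, not by stored pointer."""
    return sorted(orders[p]["order_id"] for p in fk_index.get(customer_id, []))


fk_index = build_index(orders, "customer_id")
before = orders_for(2, orders, fk_index)

# Change the physical layout: records move, but the key values inside them do
# not, so only this table's index has to be rebuilt.
orders.reverse()
fk_index = build_index(orders, "customer_id")
after = orders_for(2, orders, fk_index)
```

Both `before` and `after` name the same orders, even though every order record changed position. Had the customer records stored direct pointers to order positions, each of those pointers would have needed updating after the move.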
ISAM is very simple to understand and implement, as it primarily consists of direct, sequential access to a database file. It is also very inexpensive. The tradeoff is that each client machine must manage its own connection to each file it accesses. This, in turn, leads to the possibility of conflicting inserts into those files, leading to an inconsistent database state. This is typically solved with the addition of a client-server framework which marshals client requests and maintains ordering. This is the basic concept behind a DBMS (Database Management System), which is a client layer over the underlying data store.
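As a rough illustration of how a server can marshal requests, the sketch below funnels inserts from two client threads through a single queue, so they are applied one at a time. The `server` and `client` functions and the in-memory `store` are assumptions made for the example and do not model any particular DBMS.

```python
import queue
import threading


def server(requests, store):
    """Single consumer: applies inserts one at a time, in arrival order."""
    while True:
        item = requests.get()
        if item is None:          # sentinel: shut down
            break
        key, value, done = item
        if key in store:
            done.put(False)       # conflicting insert is rejected
        else:
            store[key] = value
            done.put(True)


def client(name, keys, requests, results):
    """Clients never touch the store; they only send requests to the server."""
    for key in keys:
        done = queue.Queue(maxsize=1)
        requests.put((key, name, done))
        results.append((name, key, done.get()))


requests, store, results = queue.Queue(), {}, []
worker = threading.Thread(target=server, args=(requests, store))
worker.start()

# Two clients try to insert overlapping keys at the same time.
clients = [
    threading.Thread(target=client, args=("a", range(0, 6), requests, results)),
    threading.Thread(target=client, args=("b", range(3, 9), requests, results)),
]
for t in clients:
    t.start()
for t in clients:
    t.join()
requests.put(None)
worker.join()
```

Which client wins each contested key depends on thread scheduling, but every key ends up stored exactly once and each losing request receives an explicit rejection instead of silently overwriting data.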
ISAM was replaced at IBM with a methodology called VSAM (Virtual Storage Access Method). Still later, IBM developed DB2 which, as of 2004, IBM promotes as its primary database management system. VSAM is the physical access method used in DB2.
The OpenVMS operating system uses the Files-11 file system in conjunction with RMS (Record Management Services). RMS provides an additional layer between the application and the files on disk that provides a consistent method of data organization and access across multiple 3GL and 4GL languages. RMS provides four different methods of accessing data: Sequential, Relative Record Number Access, Record File Address Access, and Indexed Access.
The Indexed Access method of reading or writing data only provides the desired outcome if the file is in fact organized as an ISAM file with the appropriate, previously defined keys. Access to data via the previously defined key(s) is extremely fast. Multiple keys, overlapping keys and key compression within the indexes are supported. A utility to define or redefine keys in existing files is provided. Records can be deleted, although "garbage collection" is done via a separate utility.
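The idea of several predefined keys over one file, with deleted space reclaimed by a separate pass, can be sketched as follows. This is a toy model only: the class `IndexedFile` and its methods are hypothetical and are not the RMS interface.

```python
import bisect


class IndexedFile:
    """Toy indexed file with several predefined keys (not the RMS API)."""

    def __init__(self, key_fields):
        self.key_fields = key_fields
        self.records = []                            # slot number -> record or None
        self.indexes = {f: [] for f in key_fields}   # sorted (value, slot) pairs

    def put(self, record):
        slot = len(self.records)
        self.records.append(record)
        for field in self.key_fields:
            bisect.insort(self.indexes[field], (record[field], slot))

    def get(self, field, value):
        """Keyed access: only works for fields that were defined as keys."""
        if field not in self.indexes:
            raise KeyError(f"{field!r} is not a defined key")
        entries = self.indexes[field]
        i = bisect.bisect_left(entries, (value, -1))
        found = []
        while i < len(entries) and entries[i][0] == value:
            record = self.records[entries[i][1]]
            if record is not None:       # skip slots marked as deleted
                found.append(record)
            i += 1
        return found

    def delete(self, slot):
        self.records[slot] = None        # space is not reclaimed here

    def compact(self):
        """Separate clean-up pass: drop dead slots and rebuild every index."""
        live = [r for r in self.records if r is not None]
        self.records, self.indexes = [], {f: [] for f in self.key_fields}
        for record in live:
            self.put(record)


f = IndexedFile(["part_no", "supplier"])
f.put({"part_no": 7, "supplier": "acme"})
f.put({"part_no": 3, "supplier": "zenith"})
f.put({"part_no": 9, "supplier": "acme"})
acme_parts = [r["part_no"] for r in f.get("supplier", "acme")]
f.delete(0)
slots_before_compact = len(f.records)
f.compact()
```

Lookups succeed only on `part_no` and `supplier`, the keys defined when the file was created; asking for any other field raises an error. Deleting a record leaves its slot in place until `compact` is run, mirroring the separate clean-up utility mentioned above.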
ISAM-style Implementations
- Berkeley DB
- Btrieve
- C-ISAM
- cTreePlus
- Dataflex proprietary database
- dBase and related products Clipper and Foxpro
- Digital Equipment Corporation Record Management Services
- Enscribe is the HP Tandem structured file access method
- Extensible Storage Engine
- Microsoft Access
- MySQL implements and extends ISAM as MyISAM.
- Paradox
- Raima Database Manager
- Superbase database family