Digital preservation
Digital preservation is the set of processes, activities and management of digital information over time to ensure its long-term accessibility. The goal of digital preservation is to preserve materials resulting from digital reformatting, and particularly information that is born-digital with no analog counterpart. Because of the relatively short lifecycle of digital information, preservation is an ongoing process.
In the language of digital imaging and electronic resources, preservation is no longer just the product of a program but an ongoing process. In this regard the way digital information is stored is important in ensuring its longevity. The long-term storage of digital information is assisted by the inclusion of preservation metadata.
Digital preservation is defined as long-term, error-free storage of digital information, with means for retrieval and interpretation, for the entire time span for which the information is required. Long-term is defined as "long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community. Long Term may extend indefinitely". "Retrieval" means obtaining needed digital files from the long-term, error-free digital storage, without possibility of corrupting the continued error-free storage of the digital files. "Interpretation" means that the retrieved digital files (for example texts, charts, images or sounds) are decoded and transformed into usable representations. This is often interpreted as "rendering", i.e. making the content available for a human to access. In many cases, however, it means that the files can be processed by computational means.
Digital Format Preservation Concerns
Society's heritage has been presented on many different materials, including stone, vellum, bamboo, silk, and paper. Now a large quantity of information exists in digital forms, including emails, blogs, social networking websites, national elections websites, web photo albums, and sites which change their content over time. According to "Preserving the Internet", a Scientific American article by Brewster Kahle, who founded the Internet Archive in 1996, the average life of a URL in 1997 was 44 days.
The unique characteristics of digital forms make it easy to create content and keep it up to date, but at the same time bring many difficulties in the preservation of this content. Margaret Hedstrom points out that "...digital preservation raises challenges of a fundamentally different nature which are added to the problems of preserving traditional format materials."
Physical deterioration
The media on which digital contents are stored are more vulnerable to deterioration and catastrophic loss than some analog media such as paper. While acid paper is prone to deterioration, becoming brittle and yellowing with age, the deterioration may not become apparent for some decades and progresses slowly. It remains possible to retrieve information without loss once deterioration is noticed. Digital data recording media may deteriorate more rapidly and once the deterioration starts, in most cases there may already be data loss. This characteristic of digital forms leaves a very short time frame for preservation decisions and actions.
Digital obsolescence
Another challenge is the issue of long-term access to data. Digital technology is developing quickly, and retrieval and playback technologies can become obsolete in a matter of years. When faster, more capable and less expensive storage and processing devices are developed, older versions may be quickly replaced. When software or a decoding technology is abandoned, or a hardware device is no longer in production, records created with such technologies are at great risk of loss, simply because they are no longer accessible. This process is known as digital obsolescence.
This challenge is exacerbated by a lack of established standards, protocols and proven methods for preserving digital information. Copies of data were formerly saved on tapes, but media standards for tapes have changed considerably over the last five to ten years, and there is no guarantee that tapes will be readable in the future. Recovering these materials may require special tools. Hedstrom further explained that almost all digital library research has been focused on "...architectures and systems for information organization and retrieval, presentation and visualization, and administration of intellectual property rights" and that "...digital preservation remains largely experimental and replete with the risks associated with untested methods".
Strategies
In 2006, the Online Computer Library Center developed a four-point strategy for the long-term preservation of digital objects that consisted of:
- Assessing the risks for loss of content posed by technology variables such as commonly used proprietary file formats and software applications.
- Evaluating the digital content objects to determine what type and degree of format conversion or other preservation actions should be applied.
- Determining the appropriate metadata needed for each object type and how it is associated with the objects.
- Providing access to the content.
There are several additional strategies that individuals and organizations may use to actively combat the loss of digital information.
Refreshing
Refreshing is the transfer of data between two instances of the same type of storage medium so that there are no bitrate changes or alteration of data. For example, census data may be transferred from an old preservation CD to a new one. This strategy may need to be combined with migration when the software or hardware required to read the data is no longer available or is unable to understand the format of the data. Refreshing will likely always be necessary due to the deterioration of physical media.
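As a rough illustration of refreshing, the following Python sketch copies a file to a new location and checks that the copy is bit-identical to the original. The function names `sha256_of` and `refresh` are hypothetical, and the use of SHA-256 digests for the comparison is an assumption of this example rather than something the strategy itself prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm that no bits changed.

    Hypothetical helper: raises OSError if the digests differ,
    otherwise returns the digest shared by both copies.
    """
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise OSError(f"refresh failed: {destination} differs from {source}")
    return after
```

The returned digest could be stored alongside the new copy so that a later refresh can be checked against it.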
Migration
Migration is the transferring of data to newer system environments (Garrett et al., 1996). This may include conversion of resources from one file format to another (e.g., conversion of Microsoft Word to PDF or OpenDocument), from one operating system to another (e.g., Windows to Linux) or from one programming language to another (e.g., C to Java) so the resource remains fully accessible and functional. Resources that are migrated run the risk of losing some type of functionality since newer formats may be incapable of capturing all the functionality of the original format, or the converter itself may be unable to interpret all the nuances of the original format. The latter is often a concern with proprietary data formats.
The US National Archives' Electronic Records Archives (ERA) and Lockheed Martin are jointly developing a migration system that will preserve any type of document, created on any application or platform, and delivered to the archives on any type of digital media. In the system, files are translated into flexible formats, such as XML, so that they remain accessible to future technologies. Lockheed Martin argues that it would be impossible to develop an emulation system for the National Archives ERA because the volume of records and cost would be prohibitive.
Replication
Creating duplicate copies of data on one or more systems is called replication. Data that exists as a single copy in only one location is highly vulnerable to software or hardware failure, intentional or accidental alteration, and environmental catastrophes like fire, flooding, etc. Digital data is more likely to survive if it is replicated in several locations. Replicated data may introduce difficulties in refreshing, migration, versioning, and access control since the data is located in multiple places.
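One way to keep replicated copies consistent is to compare them periodically. The sketch below is a minimal illustration, not an established procedure: the function name `audit_replicas` is hypothetical, and treating the most common digest as the correct one is an assumption that only holds when most copies are intact.

```python
import hashlib
from collections import Counter


def audit_replicas(replicas: dict[str, bytes]) -> list[str]:
    """Return the names of replicas that differ from the most common copy.

    Assumption: the digest shared by the largest number of replicas
    is taken to be the authentic one.
    """
    if not replicas:
        return []
    digests = {
        name: hashlib.sha256(data).hexdigest()
        for name, data in replicas.items()
    }
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)
```

Called with three copies of a record where one has been altered, the function returns only the name of the altered copy, which can then be replaced from one of the intact replicas.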
Emulation
Emulation is the replicating of the functionality of an obsolete system. Examples include emulating an Atari 2600 on a Windows system or emulating WordPerfect 1.0 on a Macintosh. Emulators may be built for applications, operating systems, or hardware platforms. Emulation has been a popular strategy for retaining the functionality of old video game systems, such as with the MAME project. The feasibility of emulation as a catch-all solution has been debated in the academic community (Granger, 2000).
Raymond A. Lorie has suggested that a Universal Virtual Computer (UVC) could be used to run any software in the future on a yet unknown platform. The UVC strategy uses a combination of emulation and migration. The UVC strategy has not yet been widely adopted by the digital preservation community.
Jeff Rothenberg, a major proponent of emulation for digital preservation in libraries, working in partnership with the Koninklijke Bibliotheek and the Nationaal Archief of the Netherlands, has recently helped launch Dioscuri, a modular emulator that succeeds in running MS-DOS, WordPerfect 5.1, DOS games, and more.
Metadata attachment
Metadata is data on a digital file that includes information on creation, access rights, restrictions, preservation history, and rights management. Metadata attached to digital files may be affected by file format obsolescence. ASCII is considered to be the most durable format for metadata because it is widespread, backwards compatible when used with Unicode, and utilizes human-readable characters, not numeric codes. It retains the information, but not the structure in which the information is presented. For higher functionality, SGML or XML should be used. Both markup languages are stored in ASCII format, but contain tags that denote structure and format.
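To make the contrast between plain ASCII and tagged markup concrete, the following Python sketch writes a small metadata record as XML encoded in ASCII. The function name `build_metadata` and the element names are hypothetical and do not follow any particular metadata standard; the example only shows how tags carry structure while the stored bytes remain ASCII.

```python
import xml.etree.ElementTree as ET


def build_metadata(fields: dict[str, str]) -> bytes:
    """Serialize metadata fields as an ASCII-encoded XML record.

    Each key becomes a child element of a hypothetical <metadata> root,
    so keys must be valid XML element names. Characters outside ASCII
    are written as numeric character references by the serializer.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="us-ascii")


record = build_metadata({
    "created": "1996-05-01",               # illustrative values only
    "access_rights": "public",
    "preservation_history": "refreshed to new media",
})
```

Parsing `record` with `ET.fromstring` recovers each field by its tag name, which is the structural information that an untagged ASCII file would not retain.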
Trustworthy digital objects
Digital objects that can speak to their own authenticity are called trustworthy digital objects (TDOs). TDOs were proposed by Henry M. Gladney to enable digital objects to maintain a record of their change history so future users can know with certainty that the contents of the object are authentic. Other preservation strategies like replication and migration are necessary for the long-term preservation of TDOs.
Digital sustainability
Digital sustainability encompasses a range of issues and concerns that contribute to the longevity of digital information. Unlike traditional, temporary strategies, and more permanent solutions, digital sustainability implies a more active and continuous process. Digital sustainability concentrates less on the solution and technology and more on building an infrastructure and approach that is flexible with an emphasis on interoperability, continued maintenance and continuous development. Digital sustainability incorporates activities in the present that will facilitate access and availability in the future.
Digital preservation standards
To standardize digital preservation practice and provide a set of recommendations for preservation program implementation, the Reference Model for an Open Archival Information System (OAIS) was developed. The reference model (ISO 14721:2003) includes the following responsibilities that an OAIS archive must abide by:
- Negotiate for and accept appropriate information from information Producers.
- Obtain sufficient control of the information provided to the level needed to ensure Long-Term Preservation.
- Determine, either by itself or in conjunction with other parties, which communities should become the Designated Community and, therefore, should be able to understand the information provided.
- Ensure that the information to be preserved is Independently Understandable to the Designated Community. In other words, the community should be able to understand the information without needing the assistance of the experts who produced the information.
- Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original.
- Make the preserved information available to the Designated Community.
OAIS is concerned with all technical aspects of a digital object’s life cycle: ingest into and storage in a preservation infrastructure, data management, accessibility, and distribution. The model also addresses metadata issues and recommends that five types of metadata be attached to a digital object: reference (identification) information, provenance (including preservation history), context, fixity (authenticity indicators), and representation (formatting, file structure, and what "imparts meaning to an object’s bitstream").
Gladney's proposal of TDOs was preceded by the Research Libraries Group's (RLG) development of "attributes and responsibilities" that denote the practices of a "Trusted Digital Repository" (TDR). The seven attributes of a TDR are: "compliance with the Reference Model for an Open Archival Information System (OAIS), Administrative responsibility, Organizational viability, Financial sustainability, Technological and procedural suitability, System security, Procedural accountability." Among RLG’s attributes and responsibilities were recommendations calling for the collaborative development of digital repository certifications, models for cooperative networks, and sharing of research and information on digital preservation with regard to intellectual property rights.
Digital sound preservation standards
In January 2004, the Council on Library and Information Resources (CLIR) hosted a roundtable meeting of audio experts discussing best practices, which culminated in a report delivered March 2006. This report investigated procedures for reformatting sound from analog to digital, summarizing discussions and recommendations for best practices for digital preservation. Participants made a series of 15 recommendations for improving the practice of analog audio transfer for archiving:
- Develop core competencies in audio preservation engineering. Participants noted with concern that the number of experts qualified to transfer older recordings is shrinking and emphasized the need to find a way to ensure that the technical knowledge of these experts can be passed on.
- Develop arrangements among smaller institutions that allow for cooperative buying of esoteric materials and supplies.
- Pursue a research agenda for magnetic-tape problems that focuses on a less destructive solution for hydrolysis than baking, relubrication of acetate tapes, and curing of cupping.
- Develop guidelines for the use of automated transfer of analog audio to digital preservation copies.
- Develop a web-based clearinghouse for sharing information on how archives can develop digital preservation transfer programs.
- Carry out further research into nondestructive playback of broken audio discs.
- Develop a flowchart for identifying the composition of various types of audio discs and tapes.
- Develop a reference chart of problematic media issues.
- Collate relevant audio engineering standards from organizations.
- Research safe and effective methods for cleaning analog tapes and discs.
- Develop a list of music experts who could be consulted for advice on transfer of specific types of musical content (e.g., determining the proper key so that correct playback speed can be established).
- Research the life expectancy of various audio formats.
- Establish regional digital audio repositories.
- Cooperate to develop a common vocabulary within the field of audio preservation.
- Investigate the transfer of technology from such fields as chemistry and materials science to various problems in audio preservation.
Updated technical guidelines on the creation and preservation of digital audio have been prepared by the International Association of Sound and Audiovisual Archives (IASA).
Examples of digital preservation initiatives
- The Library of Congress operates the National Digital Information Infrastructure and Preservation Program.
- The British Library is responsible for several programmes in the area of digital preservation. The National Archives of the United Kingdom have also pioneered various initiatives in the field of digital preservation.
- The Safety Deposit Box software from Tessella is being widely adopted by national archives and has been selected by the national archives of the UK, the Netherlands, Switzerland, Finland, Estonia, Malaysia and Austria, as well as by FamilySearch and the Wellcome Collection. It was recently awarded the Queen's Awards for Enterprise in the UK.
- Ex Libris Rosetta is commercial software that helps memory institutions collect, manage, archive and preserve their digital collections, ensuring data integrity and access over time. The system enables managing digital entities end to end, from submission to dissemination. A rule-based workflow engine and open architecture allow institutions using the system to develop unique plug-in tools and applications to enhance the system’s ingest, management, preservation and delivery processes.
- The MetaArchive Cooperative is a library-run, collaborative approach to digital preservation that embeds digital preservation infrastructure and knowledge in each of its constituent member institutions. Composed mainly of university libraries, the Cooperative functions as a network in which each preserved file is replicated seven times, stored in geographically distinct locations across four countries, and carefully managed from ingest (as a SIP) to dissemination (as a DIP).
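The idea of keeping several geographically separate copies of each file can be sketched as a simple audit that compares checksums across replicas and flags any copy that disagrees with the majority. This is only an illustration: the function name, the location labels and the use of SHA-256 are assumptions, and the code does not describe the software any of the initiatives above actually run.

```python
import hashlib
from collections import Counter


def audit_replicas(replicas):
    """Return (consensus_digest, locations_needing_repair).

    `replicas` maps a hypothetical location name to the bytes stored there.
    A copy is flagged when its digest differs from the digest held by a
    strict majority of the copies.
    """
    if not replicas:
        raise ValueError("no replicas to audit")

    digests = {
        location: hashlib.sha256(data).hexdigest()
        for location, data in replicas.items()
    }
    consensus, votes = Counter(digests.values()).most_common(1)[0]

    # Without a strict majority there is no trustworthy reference copy.
    if votes <= len(replicas) / 2:
        raise ValueError("replicas disagree and no majority digest exists")

    damaged = sorted(loc for loc, d in digests.items() if d != consensus)
    return consensus, damaged
```

With seven copies, up to three can be damaged independently before the majority vote stops identifying a reference copy, which is one reason a network might choose an odd and fairly large replica count.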
Large-scale digital preservation initiatives (LSDIs)
Many research libraries and archives have begun or are about to begin large-scale digital preservation initiatives (LSDIs). The main players in LSDIs are cultural institutions, commercial companies such as Google and Microsoft, and non-profit groups including the Open Content Alliance (OCA), the Million Book Project (MBP), and HathiTrust. The primary motivation of these groups is to expand access to scholarly resources.
LSDIs: library perspective
Approximately 30 cultural entities, including the 12-member Committee on Institutional Cooperation (CIC), have signed digitization agreements with either Google or Microsoft. Several of these cultural entities are participating in the Open Content Alliance (OCA) and the Million Book Project (MBP). Some libraries are involved in only one initiative, while others have diversified their digitization strategies by participating in several. The three main reasons for library participation in LSDIs are access, preservation, and research and development. It is hoped that digital preservation will ensure that library materials remain accessible for future generations. Libraries have a perpetual responsibility for their materials and a commitment to archive their digital materials. Libraries plan to use digitized copies as backups for works in case they go out of print, deteriorate, or are lost or damaged.
See also
- Backup
- Bit rot
- Charles M Dollar
- Data archaeology
- Digital artifactual value
- Database preservation
- Digital asset management
- Data format management
- Digital curation
- Digital continuity
- Digital library
- Digital obsolescence
- Digital reformatting
- Enterprise content management
- File format
- Information Lifecycle Management
- List of digital preservation initiatives
- New media art preservation
- Margaret Hedstrom
- Preservation metadata
- Section 108 Study Group
- Trustworthy Repositories Audit & Certification (TRAC)
- Universal Virtual Computer
- Web archiving
External links
- National Digital Information Infrastructure and Preservation Program at the Library of Congress
- Digital Preservation page from the Digital Library Federation
- "Thirteen Ways of Looking at...Digital Preservation"
- Cornell University Library's Digital Imaging Tutorial
- What is Digital Preservation? - an introduction to digital preservation by Digital Preservation Europe
- Macroscopic 10-Terabit–per–Square-Inch Arrays from Block Copolymers with Lateral Order. Science magazine article about prospective usage of sapphire in digital storage media technology
- Animations introducing digital preservation and curation
- Capture Your Collections: Planning and Implementing Digitization Projects A CHIN (Canadian Heritage Information Network) Resource
- Digitales Archiv Hessen Digital preservation page by Hessisches Hauptstaatsarchiv Wiesbaden