Data corruption
Data corruption refers to errors in computer data that occur during writing, reading, storage, transmission, or processing, and that introduce unintended changes to the original data. Computer storage and transmission systems use a number of measures to provide data integrity, that is, freedom from such errors.

In general, when data corruption occurs, the file containing that data may become inaccessible, and the system or the related application will report an error. For example, if a Microsoft Word file is corrupted, an attempt to open it in Word produces an error message and the file is not opened. Some programs offer to repair the file automatically after the error, and some cannot. Whether repair is possible depends on the level of corruption and on the application's built-in error handling. Corruption has various causes.

Transmission

Data corruption during transmission has a variety of causes. Interruption of data transmission causes information loss. Environmental conditions can interfere with data transmission, especially with wireless transmission methods. Heavy clouds can block satellite transmissions. Wireless networks are susceptible to interference from devices such as microwave ovens.

Storage

Data loss during storage has two broad causes: hardware failure and software failure. Background radiation, head crashes, and aging or wear of the storage device fall into the former category, while software failure typically occurs due to bugs in the code.

Error detection and correction may occur in the hardware, in the disk subsystem or adapter, or in software that implements error checking and correction (e.g., RAID software such as mdadm for Linux).

There are two types of data loss:
  • Undetected: also known as "silent corruption". These problems have been attributed to errors during the write process to disk. These are the most dangerous errors, as there is no indication that the data is incorrect.
  • Detected: these errors are most often caused by disk drive problems. Errors may be either permanent or temporary; temporary errors can be overcome when the hardware repeats the operation. Errors are normally detected by the hardware: either by the disk drive, which checks the data read from the disk against the ECC/CRC error correcting code stored alongside the data on disk, or, in the case of a RAID array, by comparing the contents of the RAID strips with the ECC checksum or parity of the RAID stripe.
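
The drive-level check described in the list above can be illustrated with a small sketch. It assumes a hypothetical layout in which each sector's payload is followed by a 4-byte CRC-32; the function names are invented for illustration, and real drives use their own ECC schemes rather than this exact format.

```python
import zlib


def write_sector(payload: bytes) -> bytes:
    """Return the payload followed by its CRC-32 (hypothetical on-disk layout)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def read_sector(stored: bytes) -> bytes:
    """Return the payload if the stored CRC matches, otherwise raise ValueError."""
    payload, crc_bytes = stored[:-4], stored[-4:]
    if zlib.crc32(payload) != int.from_bytes(crc_bytes, "big"):
        raise ValueError("CRC mismatch: sector is corrupt")
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate corruption by inverting one bit of the stored bytes."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)
```

A single flipped bit in the stored sector makes `read_sector` raise, which corresponds to a detected error. If the wrong payload is written together with a CRC computed over that wrong payload, the check passes, which corresponds to the silent case.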

Countermeasures

When data corruption behaves as a Poisson process, where each bit of data has an independently low probability of being changed, data corruption can generally be detected by the use of checksums, and can often be corrected by the use of error correcting codes.
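
As one illustration of an error correcting code, the sketch below implements the classic Hamming(7,4) code, which adds three parity bits to four data bits and can correct any single flipped bit in the resulting seven-bit codeword. It is chosen here only because it is small; the function names are illustrative, and the article does not say which codes a given device uses.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, error_position).

    error_position is 0 when no error was seen, otherwise the 1-based
    position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # the syndrome is the position of the bad bit
    return [c[2], c[4], c[5], c[6]], syndrome
```

If two bits of a codeword are flipped, this code miscorrects, which matches the point that correction works when errors are rare and independent.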

If an uncorrectable data corruption is detected, procedures such as automatic retransmission or restoration from backups can be applied. Certain levels of RAID disk arrays can store and evaluate parity bits for data across a set of hard disks and can reconstruct corrupted data upon the failure of one or more disks, depending on the level of RAID implemented.
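
The parity reconstruction mentioned above can be sketched with byte-wise XOR, the operation behind single-parity RAID levels. The block contents and function names below are made up for illustration, and a real array also handles striping, block sizes, and disk I/O.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute the parity block for one stripe of data blocks."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])
```

For example, with data blocks `b"AAAA"`, `b"BBBB"`, and `b"CCCC"`, losing the second block and calling `reconstruct` with the other two plus the parity yields `b"BBBB"` again. XOR parity by itself can rebuild a block only when it is already known which block is missing or bad.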

Today, many errors are detected and corrected by the disk drive using the ECC/CRC codes which are stored on disk for each sector. If the disk drive detects multiple read errors on a sector, it may copy the failing sector to another part of the disk, remapping the failed sector to a spare sector without the involvement of the operating system (though this may be delayed until the next write to the sector).

This "silent correction" can lead to other problems if disk storage is not managed well, as the disk drive will continue to remap sectors until it runs out of spares, at which time the temporary correctable errors can turn into permanent ones as the disk drive deteriorates. S.M.A.R.T. provides a standardized way of monitoring the health of a disk drive, and tools are available for most operating systems to check the disk drive automatically for impending failures by watching for deteriorating S.M.A.R.T. parameters.

"Data scrubbing" is another method to reduce the likelihood of data corruption, as disk errors are caught and recovered from before multiple errors accumulate and overwhelm the number of parity bits. Instead of parity being checked on each read, the parity is checked during a regular scan of the disk, often done as a low-priority background process. Note that the "data scrubbing" operation activates a parity check. If a user simply runs a normal program that reads data from the disk, the parity is not checked unless parity-check-on-read is both supported and enabled on the disk subsystem.

If appropriate mechanisms are employed to detect and remedy data corruption, data integrity can be maintained. This is particularly important in commercial applications (e.g. banking), where an undetected error could either corrupt a database index or change data to drastically affect an account balance, and in the use of encrypted or compressed data, where a small error can make an extensive dataset unusable. While the study by CERN has often been referenced as showing large levels of data corruption, the disk subsystem which was the subject of the paper was set up with RAID5 and a single parity bit (hence could not recover from a single "silent" error), did not use parity-check-on-read (and hence could not detect "silent errors" through parity checking of the RAID stripe), and did not use data scrubbing. The disk storage was also subject to a microcode software bug which caused higher levels of errors than normal.
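
To make the scrubbing idea discussed in this section concrete, here is a minimal sketch of one scrub pass over an in-memory model of an array. It assumes each stripe holds its data blocks, one XOR parity block, and a CRC-32 per block; that layout, the dictionary keys, and the function names are assumptions made for illustration and are not taken from any particular RAID implementation.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Build a stripe: data blocks, one parity block, and a CRC per block."""
    blocks = list(data_blocks) + [xor_blocks(data_blocks)]
    return {"blocks": blocks, "crcs": [zlib.crc32(b) for b in blocks]}


def scrub(stripes):
    """Check every block of every stripe and repair single-block damage in place.

    Returns (repaired, unrecoverable): two lists of stripe indices.
    """
    repaired, unrecoverable = [], []
    for index, stripe in enumerate(stripes):
        blocks, crcs = stripe["blocks"], stripe["crcs"]
        bad = [j for j, block in enumerate(blocks) if zlib.crc32(block) != crcs[j]]
        if not bad:
            continue
        if len(bad) == 1:
            # One bad block: rebuild it by XOR-ing all the other blocks.
            others = [b for j, b in enumerate(blocks) if j != bad[0]]
            blocks[bad[0]] = xor_blocks(others)
            repaired.append(index)
        else:
            # More than one bad block exceeds what single parity can rebuild.
            unrecoverable.append(index)
    return repaired, unrecoverable
```

Running `scrub` regularly means a stripe with one damaged block is repaired while the rest of the stripe is still good. If a second block in the same stripe is damaged before the next pass, the stripe is reported as unrecoverable.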

See also

  • Computer science
  • Bit rot
  • Data integrity
  • Database integrity
  • Reed-Solomon error correction
  • Forward error correction
  • RAID
  • Radiation hardening
  • Inaccessible boot device
  • Blue Screen of Death
  • List of data recovery software
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 