Failover
In computing, failover is automatic switching to a redundant or standby computer server, system, or network upon the failure or abnormal termination of the previously active application, server, system, or network. Failover and switchover are essentially the same operation, except that failover is automatic and usually operates without warning, while switchover requires human intervention.

Systems designers usually provide failover capability in servers, systems or networks requiring continuous availability and a high degree of reliability.

At server level, failover automation usually uses a "heartbeat" cable that connects two servers. As long as a regular "pulse" or "heartbeat" continues between the main server and the second server, the second server will not bring its systems online. The second server takes over the work of the first as soon as it detects an alteration in the "heartbeat" of the first machine. There may also be a third "spare parts" server that has running spare components for "hot" switching to prevent downtime. Some systems can send a notification when failover occurs.
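The heartbeat logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular product works: the class name `HeartbeatMonitor`, the `timeout_s` threshold, and the `on_failover` callback are all hypothetical, and a missed-heartbeat timeout stands in for whatever "alteration" a real system detects.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Runs on the standby server; takes over when the primary's heartbeat stops.

    Hypothetical sketch: names and the timeout rule are assumptions.
    """

    def __init__(
        self,
        timeout_s: float,
        on_failover: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_s = timeout_s      # longest tolerated gap between heartbeats
        self.on_failover = on_failover  # e.g. start services, send a notification
        self.clock = clock              # injectable so the logic can be tested
        self.last_beat = clock()
        self.active = False             # True once the standby has taken over

    def beat(self) -> None:
        """Record a heartbeat received from the primary."""
        self.last_beat = self.clock()

    def check(self) -> bool:
        """Take over if the heartbeat is overdue; return whether the standby is active."""
        overdue = self.clock() - self.last_beat > self.timeout_s
        if not self.active and overdue:
            self.active = True
            self.on_failover()          # invoked only once, on the transition
        return self.active
```

In use, the standby would call `beat()` whenever a pulse arrives and `check()` on a regular timer. The callback is the natural place for the failover notification mentioned above.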

Some systems intentionally do not fail over entirely automatically, but require human intervention. In this "automated with manual approval" configuration, the failover itself runs automatically once a human has approved it.
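One way to picture the "automated with manual approval" configuration is as a small state machine in which a detected failure only queues the failover, and the switch happens when an operator approves it. The sketch below is illustrative only; the state names, the `ApprovedFailover` class, and its methods are hypothetical and not taken from any specific system.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    NORMAL = "normal"
    AWAITING_APPROVAL = "awaiting_approval"
    FAILED_OVER = "failed_over"


class ApprovedFailover:
    """Failover that is detected automatically but gated on human approval."""

    def __init__(self) -> None:
        self.state = State.NORMAL
        self.approved_by: Optional[str] = None

    def report_failure(self) -> None:
        """Called by automated monitoring; does not switch anything yet."""
        if self.state is State.NORMAL:
            self.state = State.AWAITING_APPROVAL

    def approve(self, operator: str) -> None:
        """Called by a human; the automated switch would run from here."""
        if self.state is not State.AWAITING_APPROVAL:
            raise RuntimeError("no failover is pending approval")
        self.approved_by = operator
        self.state = State.FAILED_OVER
```

Recording who approved the switch is a design choice added for this sketch. It reflects the usual reason for such a gate, which is accountability for a disruptive action.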

Failback is the process of restoring a system, component, or service in a state of failover back to its original state (before failure).

The use of virtualization software has allowed failover practices to become less reliant on physical hardware.

Failover in disaster recovery

There are two types of failover:
  1. Automatic failover: Two servers are located in two different geographic locations. If a disaster happens at the host site, the secondary server takes over automatically, without user or support intervention. This usually relies on online data replication from the host to the surviving recovery site, or on clustering technology to fail over to the secondary server. Other high-availability technologies, such as Hyper-V or VMware, can keep the interruption to a minimum so that business resumes as normal. This solution is primarily used for high-reliability or critical applications and systems.
  2. Manual failover: User or support team intervention is necessary. For example, if an abnormality occurs at the host site, the support team has to restore the database manually at the surviving site, then switch users to the recovery site to resume business as usual. This is also known as a backup and restore solution, and is usually used for non-critical applications or systems.

Fallback in disaster recovery

There are also two types of fallback:
  1. After any disaster recovery test that has failed, the fallback or revert takes place.
  2. After recovery is completed, fallback (the return to normal operation) takes place. Failback is the term often used for fallback, but here failback means that there are two recovery sites: it is the switch to the second disaster recovery site.


In short,
  1. Failover (automatic or manual): from host to recovery site
  2. Fallback (automatic or manual): from recovery site to host
  3. Failback (automatic or manual): from recovery site 1 to recovery site 2

See also

  • Data reliability
  • Fault-tolerance
  • High-availability cluster
  • Log shipping
  • Safety engineering

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 