Multi-master replication
Encyclopedia
Multi-master replication is a method of database replication
Replication (computer science)
Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices, or...

 which allows data to be stored by a group of computers, and updated by any member of the group. The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group, and resolving any conflicts that might arise between concurrent changes made by different members. Multi-master replication can be contrasted with master-slave replication, in which a single member of the group is designated as the "master" for a given piece of data and is the only node allowed to modify that data item. Other members wishing to modify the data item must first contact the master node. Allowing only a single master makes it easier to achieve consistency among the members of the group, but is less flexible than multi-master replication.

Advantages

  • If one master fails, other masters continue to update the database
    Database
    A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

    .
  • Masters can be located in several physical sites, i.e. distributed across the network.

Disadvantages

  • Most multi-master replication systems are only loosely consistent, i.e. lazy and asynchronous, violating ACID
    ACID
    In computer science, ACID is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction...

     properties.
  • Eager replication systems are complex and increase communication latency
    Latency (engineering)
    Latency is a measure of time delay experienced in a system, the precise definition of which depends on the system and the time being measured. Latencies may have different meaning in different contexts.-Packet-switched networks:...

    .
  • Issues such as conflict resolution can become intractable as the number of nodes involved rises and latency increases.

Log-Based

A database transaction
Database transaction
A transaction comprises a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions...

 log is referenced to capture changes made to the database. For log-based transaction capturing, database changes can only be distributed asynchronously.

Trigger-Based

Triggers at the subscriber capture changes made to the database and submit them to the publisher. With trigger-based transaction capturing, database changes can be distributed either synchronously
Synchronization (computer science)
In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, so as to reach an agreement or...

 or asynchronously.

Implementations

Many directory servers
Directory service
A directory service is the software system that stores, organizes and provides access to information in a directory. In software engineering, a directory is a map between names and values. It allows the lookup of values given a name, similar to a dictionary...

 based on LDAP
Lightweight Directory Access Protocol
The Lightweight Directory Access Protocol is an application protocol for accessing and maintaining distributed directory information services over an Internet Protocol network...

 implement multi-master replication.

Active Directory

One of the more prevalent multi-master replication implementations in directory servers is Microsoft
Microsoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...

's Active Directory
Active Directory
Active Directory is a directory service created by Microsoft for Windows domain networks. It is included in most Windows Server operating systems. Server computers on which Active Directory is running are called domain controllers....

. Within Active Directory, objects that are updated on one Domain Controller
Domain controller
On Windows Server Systems, a domain controller is a server that responds to security authentication requests within the Windows Server domain...

 are then replicated to other domain controllers through multi-master replication. It is not required for all domain controllers to replicate with each other domain controller as this would cause excessive network traffic in large Active Directory deployments. Instead, domain controllers have a complex update pattern that ensures that all servers are updated in a timely fashion without excessive replication traffic. Some Active Directory needs are however better served by Flexible single master operation
Flexible single master operation
Flexible Single Master Operations , or just single master operation or operations master, is a feature of Microsoft's Active Directory...

.

OpenDS

OpenDS
OpenDS
OpenDS Software is a free, open source directory service, written in Java, and developed as part of the OpenDS project. OpenDS Software implements a wide range of Lightweight Directory Access Protocol and related standards, including full compliance with LDAPv3 but also support for Directory...

 implements multi-master replication since its version 1.0.
The OpenDS multi-master replication is asynchronous, it uses a log with a publish-subscribe mechanism that allows scaling to a large number of nodes. OpenDS replication does conflict resolution at the entry and attribute level. OpenDS replication can be used over a Wide Area Network
Wide area network
A wide area network is a telecommunication network that covers a broad area . Business and government entities utilize WANs to relay data among employees, clients, buyers, and suppliers from various geographical locations...

.

OpenLDAP

The widely used open source LDAP server implements multi-master replication since its version 2.4 (October 2007) http://www.openldap.org/software/roadmap.html.

Oracle

Oracle database
Oracle database
The Oracle Database is an object-relational database management system produced and marketed by Oracle Corporation....

 clusters implement multi-master replication using one of two methods. Asynchronous multi-master replication commits data changes to a deferred transaction queue which is periodically processed on all databases in the cluster. Synchronous
Synchronization (computer science)
In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, so as to reach an agreement or...

 multi-master replication uses Oracle's two phase commit functionality to ensure that all databases with the cluster have a consistent dataset.

MySQL

MariaDB
MariaDB
MariaDB is a community-developed branch of the MySQL database, the impetus being the community maintenance of its free status under GPL, as opposed to any uncertainty of MySQL license status under its current ownership by Oracle....

 and MySQL
MySQL
MySQL officially, but also commonly "My Sequel") is a relational database management system that runs as a server providing multi-user access to a number of databases. It is named after developer Michael Widenius' daughter, My...

 ships with replication support. It is possible to achieve a multi-master replication scheme beginning with MySQL version 3.23. MySQL Cluster
MySQL Cluster
MySQL Cluster is a technology which provides shared-nothing clustering capabilities for the MySQL database management system. It was first included in the production release of MySQL 4.1 in November 2004. It is designed to provide high availability and high performance, while allowing for nearly...

 supports conflict detection and resolution between multiple masters since version 6.3.

There is also an external project, Galera, that provides true multi-master capability.

PostgreSQL

PostgreSQL
PostgreSQL
PostgreSQL, often simply Postgres, is an object-relational database management system available for many platforms including Linux, FreeBSD, Solaris, MS Windows and Mac OS X. It is released under the PostgreSQL License, which is an MIT-style license, and is thus free and open source software...

, beginning from version 9.1, includes built-in synchronous replication, based on shipping the changes made to all database blocks to other systems after commit. Synchronous replication ensures that, for each write transaction, the master waits until at least one slave node has written the data to its transaction log. PostgreSQL also has the ability to run read-only queries against these replicated slaves. This allows splitting read traffic among multiple nodes efficiently.

For PostgreSQL there are also other packages for multi-master replication, including solutions based on two-phase commit. There is Bucardo, rubyrep, PgPool and PgPool-II, PgCluster,Sequoia and PostgresXC as well as some proprietary solutions.

Ingres

Within Ingres  Replicator, objects that are updated on one Ingres server can then replicated to other servers whether local or remote through multi-master replication. If one server fails, client connections can be re-directed to another server. It is not required for all Ingres servers in an environment to replicate with each other as this could cause excessive network traffic in large implementations. Instead, Ingres Replicator provides an elegant and sophisticated design that allows the appropriate data to be replicated to the appropriate servers without excessive replication traffic. This means that some servers in the environment can serve as failover candidates while other servers can meet other requirements such as managing a subset of columns or tables for a departmental solution, a subset of rows for a geographical region or one-way replication for a reporting server. In the event of a source, target, or network failure, data integrity is enforced through this two-phase commit protocol by ensuring that either the whole transaction is replicated, or none of it is. In addition, Ingres Replicator can operate over RDBMS’s from multiple vendors to connect them.

See also

  • Flexible single master operation
    Flexible single master operation
    Flexible Single Master Operations , or just single master operation or operations master, is a feature of Microsoft's Active Directory...

  • Active Directory
    Active Directory
    Active Directory is a directory service created by Microsoft for Windows domain networks. It is included in most Windows Server operating systems. Server computers on which Active Directory is running are called domain controllers....

  • Distributed database management system
    Distributed database management system
    A distributed database management system is a software system that permits the management of a distributed database and makes the distribution transparent to the users. A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network...

  • DNS zone transfer
    DNS zone transfer
    DNS zone transfer, also sometimes known by its opcode mnemonic AXFR, is a type of DNS transaction. It is one of the many mechanisms available for administrators to employ for replicating the databases containing the DNS data across a set of DNS servers. Zone transfer comes in two flavors, full ...

  • Optimistic Replication
    Optimistic replication
    Optimistic replication is a strategy for replication in which replicas are allowed to diverge.Traditional pessimistic replication systems try to guarantee from the beginning that all of the replicas are identical to each other, as if there was only a single copy of the data all along...

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK