Distributed revision control
A distributed revision control system (DRCS), also called a distributed or decentralized version control system (DVCS), keeps track of software revisions and allows many developers to work on a given project without necessarily being connected to a common network.

Distributed vs. centralized

Distributed revision control (DRCS) takes a peer-to-peer approach, as opposed to the client-server approach of centralized systems. Rather than a single, central repository on which clients synchronize, each peer's working copy of the codebase is a bona fide repository.
Distributed revision control conducts synchronization by exchanging patches (change-sets) from peer to peer. This results in some important differences from a centralized system:
  • No canonical, reference copy of the codebase exists by default; only working copies.
  • Common operations (such as commits, viewing history, and reverting changes) are fast, because there is no need to communicate with a central server. Rather, communication is only necessary when pushing or pulling changes to or from other peers.
  • Each working copy effectively functions as a remote backup of the codebase and of its change-history, providing natural protection against data loss.


Other differences are as follows:
  • There may be many "central" repositories.
  • Code from disparate repositories is merged based on a web of trust, i.e., historical merit or quality of changes.
  • Numerous different development models are possible, such as development / release branches or a Commander / Lieutenant model, allowing for efficient delegation of topical developments in very large projects.
  • Lieutenants are project members who have the power to dynamically decide which branches to merge.
  • The network is not involved in most operations.
  • A separate set of "sync" operations is available for sending changes to, or receiving changes from, remote repositories.
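
The split between local operations and explicit "sync" operations can be sketched with a toy model. This is only an illustration under simplifying assumptions: the Repo class, its method names, and the network_calls counter are hypothetical, and real systems such as Git or Mercurial store and exchange history very differently.

```python
# Toy model only: not how any real DVCS stores or transfers history.
class Repo:
    """A peer repository holding its own full change history (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.history = []        # ordered list of change-set ids
        self.network_calls = 0   # counts operations that needed another peer

    def commit(self, change_id):
        # Local operation: no other peer is contacted.
        self.history.append(change_id)

    def log(self):
        # Viewing history is also purely local.
        return list(self.history)

    def pull(self, other):
        # "Sync" operation: the only point where peers communicate.
        self.network_calls += 1
        for change_id in other.history:
            if change_id not in self.history:
                self.history.append(change_id)

    def push(self, other):
        # Pushing is modelled as the other peer pulling from this one.
        other.pull(self)


alice = Repo("alice")
bob = Repo("bob")

alice.commit("a1")   # local
alice.commit("a2")   # local
bob.commit("b1")     # local

bob.pull(alice)      # the only step that involves another peer
print(bob.log())
print(alice.network_calls, bob.network_calls)
```

In this sketch every commit and log call touches only the peer's own history, and the counter changes only when pull is called, which mirrors the point that the network is needed solely for exchanging changes.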


DVCS proponents point to several advantages of distributed version control systems over the traditional centralized model:
  • Allows users to work productively even when not connected to a network
  • Makes most operations much faster since no network is involved
  • Allows participation in projects without requiring permissions from project authorities, and thus arguably better fosters a culture of meritocracy instead of requiring "committer" status
  • Allows private work, so users can use their revision control system even for early drafts they do not want to publish
  • Avoids relying on a single physical machine as a single point of failure.
  • Still permits centralized control of the "release version" of the project
  • For FLOSS software projects, it becomes much easier to create a project fork from a project that is stalled because of leadership conflicts or design disagreements.


Software development author Joel Spolsky describes distributed version control as "possibly the biggest advance in software development technology in the [past] ten years."

One disadvantage of DVCS is that the initial cloning of a repository is slower than a centralized checkout, because all branches and the full revision history are copied. This may matter when the connection is slow and the project is large. For instance, the cloned Git repository for the Linux kernel (all history, branches, tags, etc.) is approximately the size of the checked-out, uncompressed HEAD, whereas the equivalent checkout of a single branch from a centralized system would amount to the compressed size of the contents of HEAD (without any history, branches, tags, etc.). Another problem with DVCS is the lack of the locking mechanisms that are part of most centralized VCSs and that still play an important role for non-mergeable binary files such as graphic assets.

Open systems

An "open system" of distributed revision control is characterized by its support for independent branches, and its heavy reliance on merge operations. Its general characteristics include:
  • Every working copy is effectively a fork.
  • The system implements each branch as a working copy, with merges conducted by ordinary patch exchange, from branch to branch.
  • Code forking therefore occurs more readily, where desired, because every working copy is a potential fork. (By the same token, undesirable forks are easier to mend because, if the dispute can be resolved, re-merging the code is easy.)
  • It may be possible to "cherry-pick" single changes, selectively pulling them from peer to peer.
  • New peers can freely join, without applying for access to a server.
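
The idea of "cherry-picking" a single change from a peer can be sketched as follows. This is a minimal illustration, assuming a change-set is simply an id plus a dictionary of file edits; the functions apply_change and cherry_pick and the data layout are hypothetical and do not correspond to the internals of any particular system.

```python
# Toy model: a change-set is an id plus a dict of file edits (hypothetical layout).
def apply_change(files, change):
    """Return a new file mapping with the change-set's edits applied."""
    updated = dict(files)
    updated.update(change["edits"])
    return updated


def cherry_pick(target, source, change_id):
    """Copy one change-set from source to target, leaving the rest behind."""
    for change in source["changes"]:
        if change["id"] == change_id:
            target["changes"].append(change)
            target["files"] = apply_change(target["files"], change)
            return
    raise KeyError(change_id)


peer_a = {"files": {"README": "v1"}, "changes": []}
peer_b = {
    "files": {"README": "v1", "fix.c": "patched", "draft.c": "wip"},
    "changes": [
        {"id": "fix-123", "edits": {"fix.c": "patched"}},
        {"id": "experiment", "edits": {"draft.c": "wip"}},
    ],
}

# Pull only the bug fix from peer_b; the experimental change stays there.
cherry_pick(peer_a, peer_b, "fix-123")
print(peer_a["files"])
```

Here peer_a ends up with the fix but not the experimental draft, which is the selective, peer-to-peer pulling the list above describes.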


One of the first open systems, BitKeeper, was used in the development of the Linux kernel. When the makers of BitKeeper decided in 2005 to restrict its licensing, Linus Torvalds, looking for a free alternative, started developing his own distributed source control management software, Git.

For a list of distributed revision control systems, see the comparison of revision control software.

Replicated systems

A replicated system of distributed revision control depends on a replicated database. A check-in is equivalent to a distributed commit. Successful commits create a single baseline, which reduces the need for merges. An example of a replicated distributed system is Code Co-op.
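
One way to picture a check-in that behaves like a distributed commit is an all-or-nothing update across replicas. The sketch below is only an illustration and makes no claim about how Code Co-op or any other product works: the Replica class, the check_in function, and the rule that a check-in must name the baseline it was made against are all assumptions introduced here.

```python
# Toy model of an all-or-nothing check-in across replicas.
# Assumption: a check-in names the baseline it was made against, and a replica
# accepts it only if that matches the replica's own current baseline.
class Replica:
    def __init__(self):
        self.baseline = 0   # version number of the shared baseline
        self.log = []       # changes applied so far

    def can_accept(self, based_on):
        return based_on == self.baseline

    def apply(self, change):
        self.log.append(change)
        self.baseline += 1


def check_in(replicas, change, based_on):
    """Apply the change to every replica, or to none of them."""
    if not all(r.can_accept(based_on) for r in replicas):
        return False
    for r in replicas:
        r.apply(change)
    return True


replicas = [Replica(), Replica(), Replica()]
first = check_in(replicas, "add parser", based_on=0)
# This check-in was prepared against the old baseline, so it is refused.
stale = check_in(replicas, "rename module", based_on=0)
print(first, stale, [r.baseline for r in replicas])
```

Because a check-in either reaches every replica or none, all replicas stay on the same baseline in this sketch, which is why such a design would need fewer merges than one where peers diverge freely.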

Work model

The distributed model is generally better suited for large projects with partly independent developers, such as the Linux kernel project, because developers can work independently and submit their changes for merge (or rejection). The distributed model flexibly allows adopting custom source code contribution workflows, with the integrator workflow being the most widely used one.

In the centralized model, developers should serialize their work; otherwise they may run into conflicts between different versions.

History

First-generation open-source DVCSs include Arch and Monotone. The second generation was initiated by the arrival of Darcs, followed by a host of others. Among them, Mercurial and Git were created as potential replacements for BitKeeper when its publisher withdrew it from free use by the Linux kernel project. Bazaar followed not long after.

Before these, closed-source DVCSs such as Sun WorkShop TeamWare (which inspired BitKeeper) were widely used in enterprise settings.

Future

Some natively centralized systems are starting to grow distributed features. For example, Subversion is able to do many operations with no network. It may become more difficult to distinguish natively distributed systems from centralized ones.

There are many tools that rely on version control, such as wikis, file systems, and text editors. Some are starting to adopt DVCS features, and even to integrate with DVCSs, for example the Gazest wiki and ikiwiki.

See also

  • Revision control
  • List of revision control software
  • Comparison of revision control software
  • Category:Software using distributed revision control
  • Repository clone
  • Git, an open-source DVCS developed for Linux kernel development
  • Mercurial, a cross-platform system similar to Git, considered by some to be easier to use
  • BitKeeper
  • Bazaar (software)
  • Concurrent Versions System, a predecessor of distributed version control systems
  • TortoiseHg, a graphical interface for Mercurial

External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 