Subversion
Apache Subversion is a software versioning and revision control system distributed under a free license. Developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Its goal is to be a mostly-compatible successor to the widely used Concurrent Versions System (CVS).
The open source community has used Subversion widely, for example in projects such as the Apache Software Foundation, Free Pascal, FreeBSD, GCC, Django, Ruby, Mono, SourceForge, PHP and MediaWiki. Google Code also provides Subversion hosting for its open source projects. BountySource systems use it exclusively. CodePlex offers access to Subversion as well as to other types of clients.
The corporate world has also started to adopt Subversion. A 2007 report by Forrester Research recognized Subversion as the sole leader in the Standalone Software Configuration Management (SCM) category and as a strong performer in the Software Configuration and Change Management (SCCM) category.
Subversion was created by CollabNet Inc. in 2000 and is now a top-level Apache project being built and used by a global community of contributors.
svn:executable : Makes files on Unix-hosted working copies executable.
svn:mime-type : Stores the Internet media type ("MIME type") of a file. Affects the handling of diffs and merging.
svn:ignore : A list of filename patterns to ignore in a directory. Similar to CVS's .cvsignore file.
svn:keywords : A list of keywords to substitute into a file when changes are made. The file itself must also reference the keywords as $keyword$ or $keyword:...$. This is used to maintain certain information (e.g., author, date of last change, revision number) in a file without human intervention. The keyword substitution mechanism originates from RCS and from CVS.
svn:eol-style : Makes the client convert end-of-line characters in text files. Used when the working copy is needed with a specific EOL style. "native" is commonly used, so that EOLs match the user's OS EOL style. Repositories may require this property on all files to prevent inconsistent line endings, which can cause a problem in itself.
svn:externals : Allows parts of other repositories to be automatically checked out into a sub-directory.
svn:needs-lock : Specifies that a file is to be checked out with file permissions set to read-only. This is designed for use with the locking mechanism. The read-only permission reminds one to obtain a lock before modifying the file: obtaining a lock makes the file writable, and releasing the lock makes it read-only again. Locks are only enforced during a commit operation. Locks can be used without setting this property. However, that is not recommended, because it introduces the risk of someone modifying a locked file; they will only discover it has been locked when their commit fails.
svn:special : This property is not meant to be set or modified directly by users. It is used only for storing symbolic links in the repository. When a symbolic link is added to the repository, a file containing the link target is created with this property set. When a Unix-like system checks out this file, the client converts it to a symbolic link.
svn:mergeinfo : Used to track merge data (revision numbers) in Subversion 1.5 (or later). This property is automatically maintained by the merge command, and it is not recommended to change its value manually.
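The keyword substitution described under svn:keywords can be illustrated with a minimal Python sketch. This is not Subversion's implementation: the function name expand_keywords and the sample keyword values are hypothetical, and only the $keyword$ and $keyword:...$ anchor forms are taken from the description above.

```python
import re

def expand_keywords(text, values):
    """Rewrite $Keyword$ or $Keyword:...$ anchors as $Keyword: value $.

    `values` maps the enabled keyword names to their current values
    (hypothetical sample data); anchors for other names are left untouched.
    """
    # A keyword anchor: '$', a name, an optional ':old value', and a closing '$'.
    pattern = re.compile(r"\$(\w+)(?::[^$\n]*)?\$")

    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)
        return "${}: {} $".format(name, values[name])

    return pattern.sub(substitute, text)

source = "# $Author$\n# $Rev: 41 $\n# $Unknown$\nprice = '$5'\n"
expanded = expand_keywords(source, {"Author": "alice", "Rev": "42"})
print(expanded)
```

Because the pattern also matches an already expanded anchor, running the function again with the same values leaves the text unchanged, which is what allows the information to be refreshed on every change without human intervention.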
Subversion also uses properties on revisions themselves. As with the properties on filesystem entries described above, the names are completely arbitrary, and the Subversion client uses certain properties prefixed with 'svn:'. However, these properties are not versioned and can be changed later.
svn:date : the date and time stamp of a revision
svn:author : the name of the user that submitted the change(s)
svn:log : the user-supplied description of the change(s)
Subversion uses directory copies to handle branches and does not support tagging as a separate feature. A branch is a separate line of development. Tagging refers to labeling the repository at a certain point in time so that it can be easily found in the future.
The system sets up a new branch by using the 'svn copy' command, which should be used in place of the native operating system mechanism. Subversion does not create an entire new file version in the repository with its copy. Instead, the old and new versions are linked together internally and the history is preserved for both. The copied versions take up only a little extra room in the repository because Subversion saves only the differences from the original versions.
All the versions in each branch maintain the history of the file up to the point of the copy, plus any changes made since. One can "merge" changes back into the trunk or between branches.
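Why a copy costs so little can be sketched with a toy Python model in which a branch is just a second reference to an existing directory object. The functions cheap_copy and write_file and the dictionary layout are hypothetical illustrations, not Subversion's storage format.

```python
def cheap_copy(tree, src, dst):
    """Return a new top-level mapping in which `dst` shares the subtree of `src`."""
    new_tree = dict(tree)        # shallow copy of the top level only
    new_tree[dst] = tree[src]    # the branch points at the very same subtree
    return new_tree

def write_file(tree, branch, name, content):
    """Return a new tree in which only `branch` receives a new directory mapping."""
    new_branch = dict(tree[branch])   # unchanged files are still shared
    new_branch[name] = content
    new_tree = dict(tree)
    new_tree[branch] = new_branch
    return new_tree

r1 = {"trunk": {"main.c": "int main(void) { return 0; }", "README": "hello"}}
r2 = cheap_copy(r1, "trunk", "branches/feature")
r3 = write_file(r2, "branches/feature", "README", "hello from the branch")

# Right after the copy, the branch and the trunk are the same object.
print(r2["branches/feature"] is r2["trunk"])
# After a change on the branch, the trunk is unaffected.
print(r3["trunk"]["README"], "|", r3["branches/feature"]["README"])
# Files that were not touched are still shared between the two lines.
print(r3["trunk"]["main.c"] is r3["branches/feature"]["main.c"])
```

In this model the cost of a branch is one extra reference, and later changes add only what differs, which mirrors the behaviour described above without claiming to reproduce the actual differencing algorithm.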
Subversion lacks some repository-administration and management features. For instance, someone may wish to edit the repository to permanently remove all historical records of certain data. Subversion does not have built-in support to achieve this simply.
Subversion stores additional copies of data on the local machine, which can become an issue with very large projects or files, or if developers work on multiple branches simultaneously. These .svn directories on the client side can become corrupted by ill-advised user activity.
Subversion does not store the modification times of files. As such, a file checked out of a Subversion repository will have the 'current' date (instead of the modification time in the repository), and a file checked into the repository will have the date of the check-in (instead of the modification time of the file being checked in). This might not always be what is wanted.
To mitigate this, third-party solutions exist that preserve modification times and other filesystem metadata.
However, giving checked-out files a current date is important as well: this is how tools like make(1) take notice of a changed file and rebuild it.
Subversion does not use a distributed revision control model. Ben Collins-Sussman, one of the designers of Subversion, believes a centralised model would help prevent "insecure programmers" from hiding their work from other team members. Some users of version control systems see the centralised model as detrimental; famously, Linus Torvalds attacked Subversion's model and its developers.
While Subversion stores filenames as Unicode, it does not specify whether precomposition or decomposition is used for certain accented characters (such as é). Thus, files added in SVN clients running on some operating systems (such as OS X) use decomposition encoding, while clients running on other operating systems (such as Linux) use precomposition encoding, with the consequence that those accented characters do not display correctly if the local SVN client is not using the same encoding as the client used to add the files.
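The difference between the two encodings can be shown with Python's standard unicodedata module. The filenames and the helper same_filename are hypothetical; the sketch only demonstrates why two visually identical names compare as different, and does not describe anything the Subversion client does.

```python
import unicodedata

precomposed = "caf\u00e9.txt"    # é as a single code point (U+00E9)
decomposed = "cafe\u0301.txt"    # e followed by a combining acute accent (U+0301)

def same_filename(a, b):
    """Compare two names after normalizing both to the precomposed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(precomposed == decomposed)              # False: the code points differ
print(len(precomposed), len(decomposed))      # 8 9
print(same_filename(precomposed, decomposed)) # True once both are normalized
```

A tool that compares or displays such names without normalizing them first will treat the two spellings as different files, which is the mismatch described above.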
By design, the
Revision numbers are difficult to remember in any version-control system. For this reason, most systems offer symbolic tags as user-friendly references to them. Subversion does not have such a feature, and what its documentation recommends instead is very different in nature. Instead of implementing tags as references to points in history, Subversion recommends making snapshot copies into a well-known subdirectory ("
This history-to-space projection has multiple issues:
1. When a snapshot is taken, the system does not assign any special meaning to the name of the tag/snapshot. This is the difference between a copy and a reference. The revision is recorded and the snapshot can be accessed by URL. This makes some operations less convenient and others impossible. For instance, a naive
2. When two (ideally independent) object types live in the repository tree, a "fight to the top" can ensue. In other words, it is often difficult to decide at which level to create the "
3. Tags, by their conventional definition, are both read-only and light-weight, on the repository and on the client. Subversion copies are not read-only, and while they are light-weight on the repository, they are very heavy-weight on the client.
To address such issues, posters on the Subversion mailing lists have suggested a new feature called "labels" or "aliases". SVN labels would more closely resemble the "tags" of other systems such as CVS or git. The fact that Subversion has global revision numbers opens the way to a very simple label->revision implementation. Yet as of 2010, no progress has been made and symbolic tags are not in the list of the most wanted features.
CollabNet has continued its involvement with Subversion, but the project runs as an independent open source community. In November 2009, the project was accepted into the Apache Incubator, aiming to become part of the Apache Software Foundation's efforts. Since March 2010, the project has been formally known as Apache Subversion, being a part of the Apache Top-Level Projects.
In October 2009, WANdisco announced the hiring of core Subversion committers as the company moved to become a major corporate sponsor of the project. This included Hyrum Wright, president of the Subversion Corporation and release manager for the Subversion project since early 2008, who joined the company to lead its open source team.
The Subversion open-source community does not provide binaries, but potential users can download binaries from volunteers. While the Subversion project does not include an official graphical user interface (GUI) for use with Subversion, third parties have developed a number of different GUIs, along with a wide variety of additional ancillary software.
Work announced in 2009 included SubversionJ (a Java API) and implementation of the Obliterate command, similar to that provided by Perforce. Both of these enhancements were sponsored by WANdisco.
The Subversion committers normally have at least one or two new features under active development at any one time. The 1.7 release of Subversion in October 2011 included a streamlined HTTP transport to improve performance and a rewritten working-copy library.
History
CollabNet founded the Subversion project in 2000 as an effort to write an open-source version-control system which operated much like CVS but which fixed the bugs and supplied some features missing in CVS. By 2001, Subversion had advanced sufficiently to host its own source code. In November 2009, Subversion was accepted into Apache Incubator: this marked the beginning of the process to become a standard top-level Apache project. It became a top-level Apache project on February 17, 2010.
Features
- Commits as true atomic operations (interrupted commit operations in CVS would cause repository inconsistency or corruption).
- Renamed/copied/moved/removed files retain full revision history.
- The system maintains versioning for directories, renames, and file metadata (but not for timestamps). Users can move and/or copy entire directory-trees very quickly, while retaining full revision history.
- Versioning of symbolic links.
- Native support for binary files, with space-efficient binary-diff storage.
- Apache HTTP Server as network server, WebDAV/Delta-V for protocol. There is also an independent server process called svnserve that uses a custom protocol over TCP/IP.
- Branching as a cheap operation, independent of file size (though Subversion itself does not distinguish between a branch and a directory).
- Natively client-server, layered library design.
- Client/server protocol sends diffs in both directions.
- Costs proportional to change size, not to data size.
- Parsable output, including XML log output.
- Open source licensed: the Apache License as of the 1.7 release; prior versions use a derivative of the Apache Software License, v1.1.
- Internationalized program messages.
- File locking for unmergeable files ("reserved checkouts").
- Path-based authorization.
- Language bindings for C#, PHP, Python, Perl, Ruby, and Java.
- Full MIME support: users can view or change the MIME type of each file, with the software knowing which MIME types can have their differences from previous versions shown.
Berkeley DB
Original development of Subversion used the Berkeley DB package.
Subversion has some limitations with Berkeley DB usage when a program that accesses the database crashes or terminates forcibly. No data loss or corruption occurs, but the repository is offline while Berkeley DB replays the journal and cleans up any outstanding locks. When using a Berkeley DB repository, the safest approach is to access it through a single server process running as one user, rather than through a shared filesystem.
FSFS
In 2004, a new storage backend called FSFS was developed. It stores its contents directly within the operating system's filesystem rather than in a database such as Berkeley DB.
FSFS works faster on directories with a large number of files and takes less disk space, due to less logging.
Beginning with Subversion 1.2, FSFS became the default data store for new repositories.
Repository access
Access to Subversion repositories can take place by:
- Local filesystem or network filesystem, accessed by the client directly. This mode uses the file:///path access scheme.
- WebDAV/Delta-V (over http or https) using the mod_dav_svn module for Apache 2. This mode uses the http://host/path access scheme, or https://host/path for secure connections using SSL.
- Custom "svn" protocol (default port 3690), either in plain text or tunneled over SSH. This mode uses either the svn://host/path access scheme for unencrypted transport or the svn+ssh://host/path scheme for tunneling over SSH.
All three means can access both FSFS and Berkeley DB repositories.
Any 1.x version of a client can work with any 1.x server. Newer clients and servers have additional features and performance capabilities, but have fallback support for older clients/servers.
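The mapping from access scheme to access method can be sketched with Python's standard urllib.parse module. The function describe_repository_url, the ACCESS_METHODS table and the host name example.org are hypothetical; only the schemes and the default port 3690 come from the list above, and the sketch does not contact any server.

```python
from urllib.parse import urlsplit

# Scheme-to-method table taken from the access modes listed above.
ACCESS_METHODS = {
    "file": "direct local or network filesystem access",
    "http": "WebDAV/Delta-V through mod_dav_svn",
    "https": "WebDAV/Delta-V through mod_dav_svn over SSL",
    "svn": "custom svn protocol, unencrypted",
    "svn+ssh": "custom svn protocol tunneled over SSH",
}
DEFAULT_SVN_PORT = 3690

def describe_repository_url(url):
    """Split a repository URL and report which access method its scheme selects."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ACCESS_METHODS:
        raise ValueError("unsupported access scheme: %r" % scheme)
    port = parts.port
    if port is None and scheme == "svn":
        port = DEFAULT_SVN_PORT   # plain svn:// falls back to the default port
    return {
        "scheme": scheme,
        "method": ACCESS_METHODS[scheme],
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
    }

for url in ("file:///var/repos/project",
            "https://example.org/svn/project",
            "svn://example.org/project",
            "svn+ssh://example.org/project"):
    info = describe_repository_url(url)
    print(info["scheme"], "->", info["method"])
```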
Layers
Internally, a Subversion system comprises several libraries arranged as layers. Each performs a specific task and allows developers to create their own tools at the desired level of complexity and specificity.
Fs : The lowest level; it implements the versioned filesystem which stores the user data.
Repos : Concerned with the repository built up around the filesystem. It has many helper functions and handles the various "hooks" that a repository may have, e.g. scripts that run when an action is performed. Together, Fs and Repos constitute the "filesystem interface".
mod_dav_svn : Provides WebDAV/Delta-V access through Apache 2.
Ra : Handles "repository access", both local and remote. From this point on, repositories are referred to using URLs, e.g.
- file:///path/ for local access,
- http://host/path/ or https://host/path/ for WebDAV access, or
- svn://host/path/ or svn+ssh://host/path/ for the SVN protocol.
Client, Wc : The highest level. It abstracts repository access and provides common client tasks, such as authenticating users or comparing versions. Subversion clients use the Wc library to manage the local working copy.
Filesystem
One can view the Subversion filesystem as "two-dimensional". Two coordinates are used to unambiguously address filesystem items:
- Path (regular path of Unix-like OS filesystem)
- Revision
Each revision in a Subversion filesystem has its own root, which is used to access contents at that revision. Files are stored as links to the most recent change; thus a Subversion repository is quite compact. The system consumes storage space proportional to the number of changes made, not to the number of revisions.
The Subversion filesystem uses transactions to keep changes atomic. A transaction operates on a specified revision of the filesystem, not necessarily the latest. The transaction has its own root, on which changes are made. It is then either committed, becoming the latest revision, or aborted. The transaction is actually a long-lived filesystem object: a client does not need to commit or abort a transaction itself; rather, it can begin a transaction, exit, and later re-open the transaction and continue using it. Multiple clients can access the same transaction and work together on an atomic change, though no existing clients expose this capability.
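The two coordinates and the transaction life cycle can be sketched with a toy model. The Python below is purely illustrative and is not how Subversion stores data: the class names ToyFilesystem and Transaction are invented for the example, every revision is kept as a complete mapping rather than as links to changes, and a commit performs no conflict or out-of-date checks.

```python
class ToyFilesystem:
    """Toy model: each revision is a complete path -> content mapping."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    def read(self, path, revision):
        """Address an item by its two coordinates: path and revision."""
        return self.revisions[revision][path]

    def begin(self, base_revision=None):
        """Start a transaction based on any revision, not only the latest."""
        if base_revision is None:
            base_revision = len(self.revisions) - 1
        return Transaction(self, base_revision)


class Transaction:
    def __init__(self, fs, base_revision):
        self.fs = fs
        self.base_revision = base_revision
        # The transaction has its own root, on which changes are made.
        self.root = dict(fs.revisions[base_revision])

    def write(self, path, content):
        self.root[path] = content

    def commit(self):
        """Publish the transaction root as the new latest revision."""
        self.fs.revisions.append(self.root)
        return len(self.fs.revisions) - 1

    def abort(self):
        """Discard the changes; no revision is created."""
        self.root = None


fs = ToyFilesystem()

txn = fs.begin()
txn.write("/trunk/README", "hello")
first = txn.commit()

txn = fs.begin()
txn.write("/trunk/README", "hello, world")
second = txn.commit()

# An aborted transaction leaves the revision list untouched.
txn = fs.begin()
txn.write("/trunk/README", "never published")
txn.abort()
```

Reading "/trunk/README" at the first and at the second revision returns different contents, which is the sense in which the filesystem is two-dimensional.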
Properties
One important feature of the Subversion filesystem is properties: simple name=value pairs of text. Properties occur in two different places in the Subversion filesystem. The first is on filesystem entries (i.e., files and directories). These are versioned just like other changes to the filesystem. Users can add any property they wish, and the Subversion client uses a set of properties, which it prefixes with 'svn:'.
svn:executable : Makes files on Unix-hosted working copies executable.
svn:mime-type : Stores the Internet media type ("MIME type") of a file. Affects the handling of diffs and merging.
svn:ignore : A list of filename patterns to ignore in a directory. Similar to CVS's .cvsignore file.
svn:keywords : A list of keywords to substitute into a file when changes are made. The file itself must also reference the keywords as $keyword$ or $keyword:...$. This is used to maintain certain information (e.g., author, date of last change, revision number) in a file without human intervention.
The keyword substitution mechanism originates from RCS and from CVS.
svn:eol-style : Makes the client convert end-of-line characters in text files. Used when the working copy is needed with a specific EOL style. "native" is commonly used, so that EOLs match the user's OS EOL style. Repositories may require this property on all files to prevent inconsistent line endings, which can cause a problem in itself.
svn:externals : Allows parts of other repositories to be automatically checked-out into a sub-directory.
svn:needs-lock : Specifies that a file is to be checked out with file permissions set to read-only. This is designed for use with the locking mechanism. The read-only permission reminds one to obtain a lock before modifying the file: obtaining a lock makes the file writable, and releasing the lock makes it read-only again. Locks are only enforced during a commit operation. Locks can be used without setting this property. However, that is not recommended, because it introduces the risk of someone modifying a locked file; they will only discover it has been locked when their commit fails.
svn:special : This property is not meant to be set or modified directly by users. It is only used for storing symbolic links in the repository. When a symbolic link is added to the repository, a file containing the link target is created with this property set. When a Unix-like system checks out this file, the client converts it to a symbolic link.
svn:mergeinfo : Used to track merge data (revision numbers) in Subversion 1.5 (or later). This property is automatically maintained by the merge command, and it is not recommended to change its value manually.
Subversion also uses properties on revisions themselves. As with the above properties on filesystem entries, the names are completely arbitrary, with the Subversion client using certain properties prefixed with 'svn:'. However, these properties are not versioned and can be changed later.
svn:date : the date and time stamp of a revision
svn:author : the name of the user that submitted the change(s)
svn:log : the user-supplied description of the change(s)
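The svn:keywords substitution described above, in which a file references $keyword$ or $keyword:...$ and the client fills in the value, can be sketched as follows. This is a minimal Python illustration, not the Subversion client's code: the function name expand_keywords and the values dictionary are assumptions made for the example, and details such as keyword aliases are ignored.

```python
import re

# Matches $Keyword$ as well as an already expanded $Keyword: old value $.
KEYWORD_PATTERN = re.compile(r"\$(\w+)(?::[^$\n]*)?\$")


def expand_keywords(text: str, values: dict) -> str:
    """Rewrite each enabled keyword anchor as $Keyword: value $."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            return match.group(0)  # keyword not enabled: leave it untouched
        return f"${name}: {values[name]} $"

    return KEYWORD_PATTERN.sub(substitute, text)


source = "/* $Author$ $Revision: 41 $ $Other$ */"
expanded = expand_keywords(source, {"Author": "alice", "Revision": "42"})
```

Because an already expanded anchor still matches the pattern, running the function again with newer values refreshes the text in place, which is how the information stays current without human intervention.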
Branching and tagging
Subversion uses the inter-file branching model from Perforce to handle branches and does not support tagging. A branch is a separate line of development. Tagging refers to labeling the repository at a certain point in time so that it can be easily found in the future.
The system sets up a new branch by using the 'svn copy' command, which should be used in place of the native operating system mechanism. Subversion does not create an entire new file version in the repository with its copy. Instead, the old and new versions are linked together internally and the history is preserved for both. The copied versions take up only a little extra room in the repository because Subversion saves only the differences from the original versions.
All the versions in each branch maintain the history of the file up to the point of the copy, plus any changes made since. One can "merge" changes back into the trunk or between branches. Due to the differencing algorithm, creating a copy takes very little additional space in the repository.
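The idea that a copy links the old and new versions together instead of duplicating them can be illustrated with a toy model. The Python below is a sketch under loud assumptions: Node, copy_tree, edit and history are invented names, the tree is a flat path-to-node dictionary, and none of this reflects Subversion's actual repository format.

```python
class Node:
    """One version of a file, linked to the version it was derived from."""

    def __init__(self, content, predecessor=None):
        self.content = content
        self.predecessor = predecessor  # shared by every copy of this file


def copy_tree(tree, src, dst):
    """Return a tree in which paths under src also appear under dst.

    The new entries point at the same Node objects, so no content is duplicated.
    """
    new_tree = dict(tree)
    for path, node in tree.items():
        if path == src or path.startswith(src + "/"):
            new_tree[dst + path[len(src):]] = node
    return new_tree


def edit(tree, path, content):
    """Return a tree where path holds a new version linked to the old one."""
    new_tree = dict(tree)
    new_tree[path] = Node(content, predecessor=tree[path])
    return new_tree


def history(node):
    """List the contents of a file from newest to oldest version."""
    contents = []
    while node is not None:
        contents.append(node.content)
        node = node.predecessor
    return contents


trunk_only = {"trunk/main.c": Node("v1")}
branched = copy_tree(trunk_only, "trunk", "branches/feature")
edited = edit(branched, "branches/feature/main.c", "v2")
```

After the edit, the branch copy carries the history ["v2", "v1"] while the trunk file still holds only "v1": both lines of development share the version made before the copy.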
Limitations and problems
A known problem in Subversion affects the implementation of the file and directory rename operation. Subversion implements the renaming of files and directories as a "copy" to the new name followed by a "delete" of the old name. Only the names change; all data relating to the edit history remains the same, and Subversion will still use the old name in older revisions of the "tree". However, Subversion may become confused when files are modified and moved in the same commit. This can also cause problems when a move conflicts with edits made elsewhere, for example when merging branches. The Subversion 1.5 release addressed some of these scenarios, while others remain problematic.
Subversion lacks some repository-administration and management features. For instance, someone may wish to edit the repository to permanently remove all historical records of certain data. Subversion does not have built-in support to achieve this simply.
Subversion stores additional copies of data on the local machine, which can become an issue with very large projects or files, or if developers work on multiple branches simultaneously. These .svn directories on the client side can become corrupted by ill-advised user activity.
Subversion does not store the modification times of files. As such, a file checked out of a Subversion repository will have the 'current' date (instead of the modification time in the repository), and a file checked into the repository will have the date of the check-in (instead of the modification time of the file being checked in). This might not always be what is wanted.
To mitigate this, third-party solutions exist that preserve modification time and other filesystem metadata.
However, giving checked-out files a current date is important as well: this is how tools like make(1) notice that a file has changed and needs rebuilding.
Subversion does not use a distributed revision control model. Ben Collins-Sussman, one of the designers of Subversion, believes a centralised model would help prevent "insecure programmers" from hiding their work from other team members. Some users of version control systems see the centralised model as detrimental; famously, Linus Torvalds attacked Subversion's model and its developers.
While Subversion stores filenames as Unicode, it does not specify whether precomposition or decomposition is used for certain accented characters (such as é). Thus, files added by SVN clients running on some operating systems (such as OS X) use decomposed encoding, while clients running on other operating systems (such as Linux) use precomposed encoding. As a consequence, those accented characters do not display correctly if the local SVN client is not using the same encoding as the client used to add the files.
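The difference between the two encodings of é can be seen with Python's standard unicodedata module. The filenames below are made up for the example, and the helper same_filename is an illustrative assumption about how a tool might compare names, not something Subversion provides.

```python
import unicodedata

# The same visible filename, encoded in two different ways.
precomposed = "caf\u00e9.txt"   # é as the single code point U+00E9
decomposed = "cafe\u0301.txt"   # e followed by the combining acute accent U+0301

# A plain string comparison treats the two names as different.
names_differ = precomposed != decomposed


def same_filename(a: str, b: str) -> bool:
    """Compare two names after normalising both to the precomposed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

Normalising both names to one form (NFC here; NFD would serve equally well) before comparing them makes the two spellings equal.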
By design, the svn log command is always recursive: trying to access the history of a directory systematically pulls out the history of its entire hierarchy. A workaround is not to use the command line but to use a sophisticated SVN client with filtering capabilities.
Subversion tags
This subsection focuses on tags but parts of it also apply to branches.
Revision numbers are difficult to remember in any version-control system. For this reason, most systems offer symbolic tags as user-friendly references to them. Subversion does not have such a feature, and what its documentation recommends instead is very different in nature. Instead of implementing tags as references to points in history, Subversion recommends making snapshot copies into a well-known subdirectory ("tags/") in the space of the repository tree. Only a few predefined references are available: HEAD, BASE, PREV and COMMITTED.
This history-to-space projection has multiple issues:
1. When a snapshot is taken, the system does not assign any special meaning to the name of the tag/snapshot. This is the difference between a copy and a reference. The revision is recorded and the snapshot can be accessed by URL. This makes some operations less convenient and others impossible. For instance, a naive svn diff -r tag1:tag2 myfile does not work; achieving the same result is slightly more complicated, requiring the user to know and input the URLs/paths of the snapshots instead of just their names: svn diff <URL-of-tag1>/myfile <URL-of-tag2>/myfile. Other operations, for instance svn log -r tag1:tag2 myfile, are simply impossible.
2. When two (ideally independent) object types live in the repository tree, a "fight to the top" can ensue. In other words, it is often difficult to decide at which level to create the "tags/" subdirectory:

    trunk/componentfoo/
         /componentbar/
    tags/1.1/componentfoo/
            /componentbar/

or

    componentfoo/trunk/
                /tags/1.1/
    componentbar/trunk/
                /tags/1.1/

3. Tags, by their conventional definition, are both read-only and light-weight, on the repository and on the client. Subversion copies are not read-only, and while they are light-weight on the repository, they are incredibly heavy-weight on the client.
To address such issues, posters on the Subversion mailing lists have suggested a new feature called "labels" or "aliases".
SVN labels would more closely resemble the "tags" of other systems such as CVS or git. The fact that Subversion has global revision numbers opens the way to a very simple label->revision implementation. Yet as of 2010, no progress has been made and symbolic tags are not in the list of the most wanted features.
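Why global revision numbers make a label->revision mapping simple can be shown with a short sketch. Everything in it is hypothetical: Subversion has no such feature, and the LabelStore class, its methods, the revision_range helper and the revision numbers are invented solely to illustrate the suggestion made on the mailing lists.

```python
class LabelStore:
    """Hypothetical table mapping a symbolic label to a global revision number."""

    def __init__(self):
        self._labels = {}

    def set(self, name, revision):
        # Labels behave like conventional tags: read-only once created.
        if name in self._labels:
            raise ValueError(f"label {name!r} already exists")
        self._labels[name] = revision

    def resolve(self, ref):
        """Accept either a label name or a plain revision number."""
        if ref in self._labels:
            return self._labels[ref]
        return int(ref)


def revision_range(labels, spec):
    """Turn a 'start:end' specification into a pair of revision numbers."""
    start, end = spec.split(":")
    return labels.resolve(start), labels.resolve(end)


labels = LabelStore()
labels.set("tag1", 120)   # made-up revision numbers
labels.set("tag2", 245)
resolved = revision_range(labels, "tag1:tag2")
```

With such a table, a range like tag1:tag2 reduces to an ordinary pair of revision numbers, which is the form the -r option already accepts.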
Development and implementation
CollabNet has continued its involvement with Subversion, but the project runs as an independent open source community. In November 2009, the project was accepted into the Apache Incubator, aiming to become part of the Apache Software Foundation's efforts. Since March 2010, the project has been formally known as Apache Subversion, being a part of the Apache Top-Level Projects.
In October 2009, WANdisco announced the hiring of core Subversion committers as the company moved to become a major corporate sponsor of the project. This included Hyrum Wright, president of the Subversion Corporation and release manager for the Subversion project since early 2008, who joined the company to lead its open source team.
The Subversion open-source community does not provide binaries, but potential users can download binaries from volunteers. While the Subversion project does not include an official graphical user interface (GUI) for use with Subversion, third parties have developed a number of different GUIs, along with a wide variety of additional ancillary software.
Work announced in 2009 included SubversionJ (a Java API) and implementation of the Obliterate command, similar to that provided by Perforce. Both of these enhancements were sponsored by WANdisco.
The Subversion committers normally have at least one or two new features under active development at any one time. The 1.7 release of Subversion in October 2011 included a streamlined HTTP transport to improve performance and a rewritten working-copy library.
Source code hosting
The following websites provide free source code hosting for SVN repositories:
- Alioth
- Assembla
- BerliOS
- Betavine
- Freepository
- Google Code
- SourceForge
See also
- List of revision control software
- Comparison of revision control software
- Comparison of Subversion clients
- TortoiseSVN
- DotSVN
- UberSVN
Further reading
- Dispelling Subversion FUD by Ben Collins-Sussman (Subversion developer), as of 2004-12-21
External links
- Previous official site. Not all content has yet been migrated to the new official site.
- Version Control with Subversion, an O'Reilly book available for free online