Wikisource
Wikisource is an online digital library of free content textual sources on a wiki, operated by the Wikimedia Foundation. Its aims are to host all forms of free text, in many languages, and translations. Originally conceived as an archive to store useful or important historical texts, it has expanded to become a general-content library. The project officially began on November 24, 2003 under the name Project Sourceberg. The name Wikisource was adopted later that year and it received its own domain name seven months later. The project has come under criticism for lack of reliability but it is also cited by organisations such as the National Archives and Records Administration.

The project holds works that are in the public domain or freely licensed; that are professionally published works or historical source documents, not vanity products; and that are verifiable. Verification was initially made offline, or by trusting the reliability of other digital libraries. Now works are supported by online scans via the ProofReadPage extension, which ensures the reliability and accuracy of the project's texts. Some Wikisources now only allow works backed up with scans. While the bulk of its collection is text, Wikisource hosts other media, from comics to film to audio books. The only original works allowed on Wikisource are translations and annotations.

History

Wikisource's early (2003–2005) history included several changes of name and location (URL), and the move to language subdomains in 2005.

Early history

The original concept for Wikisource was as storage for useful or important historical texts. These texts were intended to support Wikipedia articles, by providing primary evidence and original source texts, and as an archive in its own right. The collection was initially focussed on important historical and cultural material, distinguishing it from other digital archives such as Project Gutenberg. The project was originally called Project Sourceberg during its planning stages (a play on words for Project Gutenberg).

In 2001, there was a dispute on Wikipedia regarding the addition of primary source material, leading to edit wars over its inclusion or deletion. Project Sourceberg was suggested as a solution to this. In describing the proposed project, user The Cunctator said, "It would be to Project Gutenberg what Wikipedia is to Nupedia."

The project began its activity at ps.wikipedia.org. The contributors understood the "PS" subdomain to mean either "primary sources" or Project Sourceberg. However, this resulted in Project Sourceberg occupying the subdomain of the Pashto Wikipedia (the ISO language code of the Pashto language is "ps").

Project Sourceberg officially launched on November 24, 2003 when it received its own temporary URL, at sources.wikipedia.org, and all texts and discussions hosted on ps.wikipedia.org were moved to the temporary address. A vote on the project's name changed it to Wikisource on December 6, 2003. Despite the change in name, the project did not move to its permanent URL (at http://wikisource.org) until July 23, 2004.

Logo and slogan

Since Wikisource was initially called "Project Sourceberg", its first logo was a picture of an iceberg. Two votes conducted to choose a successor were inconclusive, and the original logo remained until 2006. Finally, for both legal and technical reasons (the picture's license was inappropriate for a Wikimedia Foundation logo, and a photo cannot scale properly), a stylized vector iceberg inspired by the original picture was mandated to serve as the project's logo.

The first prominent use of Wikisource's slogan, "The Free Library", was at the project's multilingual portal, when it was redesigned based upon the Wikipedia portal on August 27, 2005. As in the Wikipedia portal, the Wikisource slogan appears around the logo in the project's ten largest languages.

Clicking on the portal's central images (the iceberg logo in the center and the "Wikisource" heading at the top of the page) links to a list of translations for Wikisource and The Free Library in 60 languages.

ProofReadPage

An extension called ProofReadPage was developed for Wikisource to improve the quality of texts held by the project. This displays pages of scanned works side-by-side with the text relating to that page, allowing the text to be accurately proofread and later checked for quality by any user. Once a book, or other text, has been scanned, it can be modified with image processing software to correct for page rotations and other problems. Once this is done, the images can be converted into a PDF or DjVu file and uploaded to either Wikisource or Wikimedia Commons.

This system ensures the reliability of texts on Wikisource. It not only provides fully proofread works for the project, it also makes the original page scans immediately available to any user so any errors that do make it through the process can be corrected later (or verified as being part of the original). Further, in terms of principle, the extension combines the Wikimedia concept that anyone can contribute with the process of proofreading.

Milestones

Within two weeks of the project's official start at sources.wikipedia.org, over 1,000 pages had been created, with approximately 200 of these being designated as actual articles. On January 4, 2004, Wikisource welcomed its 100th registered user. In early July 2004, the number of articles exceeded 2,400, and more than 500 users had registered. On April 30, 2005, there were 2,667 registered users (including 18 administrators) and almost 19,000 articles. The project passed its 96,000th edit that same day.

On November 27, 2005, the English Wikisource passed 20,000 text-units in its third month of existence, already holding more texts than did the entire project in April (before the move to language subdomains). On February 14, 2008, the English Wikisource passed 100,000 text-units with Chapter LXXIV of Six Months at the White House, a memoir by painter Francis Bicknell Carpenter. In November 2011, the 250,000 text-unit milestone was passed.

Library contents

Wikisource collects and stores in digital format previously published texts, including novels, non-fiction works, letters, speeches, constitutional and historical documents, laws and a range of other documents. All texts collected are either free of copyright or released under the Creative Commons Attribution/Share-Alike License. Texts in all languages are welcome, as are translations. In addition to texts, Wikisource hosts material such as comics, films, recordings and spoken word works. All texts held by Wikisource must have been previously published; the project does not host "vanity press" books or documents produced by its contributors.

A scanned source is preferred on many Wikisources and required on some. Most Wikisources will, however, accept works transcribed from offline sources or acquired from other digital libraries. The requirement for prior publication can also be waived in a small number of cases if the work is a source document of notable historical importance. The legal requirement for works to be licensed or free of copyright remains constant.

The only original pieces accepted by Wikisource are annotations and translations. Wikisource and its sister project Wikibooks both have the capacity for annotated editions of texts. On Wikisource, the annotations are supplementary to the original text, which remains the primary objective of the project. By contrast, on Wikibooks the annotations are primary, with the original text as only a reference or supplement, if present at all. Annotated editions are more popular on the German Wikisource. The project also accommodates translations of texts provided by its users. A significant translation on the English Wikisource is the Wiki Bible project, intended to create a new, "laissez-faire translation" of The Bible.

Language subdomains

A separate Hebrew version of Wikisource (he.wikisource.org) was created in August 2004. The need for a language-specific Hebrew website derived from the difficulty of typing and editing Hebrew texts in a left-to-right environment (Hebrew is written right-to-left). In the ensuing months, contributors in other languages including German requested their own wikis, but a December vote on the creation of separate language domains was inconclusive. Finally, a second vote that ended May 12, 2005, supported the adoption of separate language subdomains at Wikisource by a large margin, allowing each language to host its texts on its own wiki.

An initial wave of 14 languages was set up by Brion Vibber on August 23, 2005. The new languages did not include English, but the code en: was temporarily set to redirect to the main website (wikisource.org).

At this point the Wikisource community, through a mass project of manually sorting thousands of pages and categories by language, prepared for a second wave of page imports to local wikis. On September 11, 2005, the wikisource.org wiki was reconfigured to enable the English version, along with 8 other languages that were created early that morning and late the night before.

Three more languages were created on March 29, 2006, and then another large wave of 14 language domains was created on June 2, 2006. Currently, there are individual subdomains for Wikisources in more than 60 languages, in addition to the languages hosted at wikisource.org, which serves as an incubator or a home for languages without their own subdomains (31 languages are currently hosted locally).

wikisource.org

During the move to language subdomains, the community requested that the main wikisource.org website remain a functioning wiki, in order to serve three purposes:
  1. To be a multilingual coordination site for the entire Wikisource project in all languages. In practice, use of the website for multilingual coordination has not been heavy since the conversion to language domains. Nevertheless, there is some policy activity at the Scriptorium, and multilingual updates for news and language milestones at pages such as Wikisource:2007.
  2. To be a home for texts in languages without their own subdomains, each with its own local main page for self-organization. As a language incubator, the wiki currently provides a home for over 30 languages that do not presently have their own language subdomains. Some of these are very active, and have built libraries with hundreds of texts (such as Esperanto and Volapük), and one with thousands (Hindi).
  3. To provide direct, ongoing support by a local wiki community for a dynamic multilingual portal at its Main Page, for users who go to http://wikisource.org. The current Main Page portal was created on August 26, 2005, by ThomasV, who based it upon the Wikipedia portal.


The idea of a project-specific coordination wiki, first realized at Wikisource, also took hold in another Wikimedia project, namely at Wikiversity's Beta Wiki. Like wikisource.org, it serves Wikiversity coordination in all languages and acts as a language incubator. But unlike Wikisource, its Main Page does not serve as its multilingual portal (which is not a wiki page).

Reception

Larry Sanger has criticised Wikisource and its sister project Wiktionary, arguing that the collaborative nature and technology of these projects mean there is no oversight by experts and that their content is therefore not reliable.

Bart D. Ehrman, a New Testament scholar and professor of religious studies at the University of North Carolina at Chapel Hill, has criticised the English Wikisource's project to create a user-generated translation of The Bible, saying "Democratization isn't necessarily good for scholarship." Richard Elliott Friedman, an Old Testament scholar and professor of Jewish studies at the University of Georgia, has identified errors in the translation of the Book of Genesis.

In 2010, Wikimedia France signed an agreement with the Bibliothèque nationale de France (National Library of France) to add scans from its own Gallica digital library to French Wikisource. As a result, 1,400 public domain French texts were added to the Wikisource library via upload to the Wikimedia Commons. The quality of the transcriptions, previously automatically generated by optical character recognition (OCR), was expected to be improved by Wikisource's human proofreaders.

In 2011, the English Wikisource received many high-quality scans of documents from the National Archives and Records Administration (NARA) as part of their efforts "to increase the accessibility and visibility of its holdings." Processing and upload to Commons of these documents, along with many images from the NARA collection, was facilitated by a NARA "Wikimedian in residence", Dominic. Many of these documents have been transcribed and proofread by the Wikisource community and now feature as links on the National Archives' own online catalog.

External links

Wikisource:
  • English Wikisource
  • Wikisource:For Wikipedians
  • Multilingual portal


About Wikisource:
  • Danny Wool on Wikisource (Wikimedia Foundation article).
  • A personal perspective on the history of Wikisource by Angela Beesley
  • Early discussions and plans for the project (Meta)
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 