E-text
An e-text is, generally, any text-based information that is available in a digitally encoded, human-readable format and read by electronic means; more specifically, the term refers to files in the ASCII character encoding.
E-text has the broad meaning of something electronic that represents words: a binary (or digital) version of a published work of text. Indeed, there are ASCII textbooks available. Such works are now usually called e-books, and the two terms are often used synonymously.
The term e-text is used for the more limited case of data in ASCII text format, while the more general e-book can be in a specialized (and, at times, proprietary) file format. An e-book is commonly bundled by a publisher for distribution (as an e-book, an e-zine, or an Internet newspaper), whereas e-text is distributed in ASCII (or plain text). Metadata relating to the text is sometimes included with e-text, though it appears more frequently with e-books.
Typically, an e-text contains some control characters, such as tabs, line feeds and carriage returns, but no embedded information such as font information, hyperlinks, or inline images. E-text files generally have a one-to-one correspondence between bytes and ordinary readable characters such as letters and digits. E-text files sometimes contain more than ASCII characters, for example when they use an East Asian encoding such as Shift JIS or when they use Unicode. If an e-text is written in Unicode, a UTF standard (such as UTF-8) defines the encoding format. Although e-text files are generally human-readable, they can also be used for data storage by computer programs. Note that a web page with formatted text is not an e-text as such, but its HTML source code is; whether a file is an e-text may thus depend on the level at which one considers it.
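As a rough illustration of the byte-to-character correspondence described above, the following Python sketch compares the number of characters in a string with the number of bytes in its UTF-8 encoding. The function name describe_encoding and the sample strings are assumptions made for this example only.

```python
def describe_encoding(text: str) -> dict:
    """Report how a string maps onto bytes when encoded as UTF-8."""
    utf8 = text.encode("utf-8")
    return {
        "characters": len(text),      # number of Unicode code points
        "utf8_bytes": len(utf8),      # number of bytes after encoding
        "is_ascii": text.isascii(),   # True if every character is ASCII
        # For pure ASCII text, each character occupies exactly one byte.
        "one_to_one": len(utf8) == len(text),
    }


# Hypothetical sample strings: one pure ASCII, one with an accented letter.
plain = describe_encoding("E-text\n")
accented = describe_encoding("café")

print(plain)
print(accented)
```

For the ASCII sample the character and byte counts agree, while the accented letter in the second sample takes two bytes in UTF-8, so the correspondence is no longer strictly one-to-one.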
Most programming languages require source files to be stored as e-text, as do HTML and XML. These files can be opened, read, and edited with a text editor. An e-text file can have the MIME type "text/plain", often with a charset parameter indicating the encoding. Common encodings for e-text include Unicode UTF-8, Unicode UTF-16/UCS-2, ISO/IEC 8859 and ASCII. Transferring e-text files between Unix, Macintosh and Microsoft Windows or DOS computers can be problematic, as the platforms use different control characters to mark the end of a line: a line feed on Unix, a carriage return on the classic Mac OS, and a carriage return followed by a line feed on DOS and Windows.
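One common way to work around the platform differences mentioned above is to normalize line endings when a file is moved. The Python sketch below assumes the usual conventions (LF on Unix, CR on the classic Mac OS, CR+LF on DOS and Windows); the function name normalize_newlines and the sample data are hypothetical.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite every line ending in `data` to the given `newline` sequence."""
    # Handle CR+LF first so it is not mistaken for a CR followed by an LF.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Every line ending is now a single LF; convert to the target convention.
    return unified.replace(b"\n", newline)


# Hypothetical input mixing DOS/Windows, classic Mac OS and Unix line endings.
mixed = b"first\r\nsecond\rthird\n"

unix_style = normalize_newlines(mixed)                 # LF endings
dos_style = normalize_newlines(mixed, newline=b"\r\n") # CR+LF endings

print(unix_style)
print(dos_style)
```

Working on bytes rather than decoded strings keeps the sketch independent of the text encoding, which is adequate for ASCII-compatible encodings such as UTF-8 and ISO/IEC 8859 but not for UTF-16.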
The added functionality (such as searching within the text) and easy portability make e-text popular. Hand-held computers, such as personal digital assistants (PDAs), allow a large number of e-texts to be carried and read on the move more conveniently than text printed on paper.
Project Gutenberg and various other digital libraries use e-text.
See also
- Text file
- E-book
- Electronic paper
- Digital library
- Online Books Page
- Project Gutenberg
- Distributed Proofreaders
- L'Association des Bibliophiles Universels
- Higher Intellect project