Whitespace (computer science)
In computer science, whitespace is any single character or series of characters that represents horizontal or vertical space in typography. When rendered, a whitespace character does not correspond to a visual mark, but typically does occupy an area on a page. For example, the common whitespace symbol " " (Unicode code point U+0020, decimal 32) represents a blank space, used as a word divider in Western scripts.
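
As a small illustration, the code point of the common space character can be inspected in Python. This is only a sketch; the variable names are chosen for the example.

```python
# Inspect the common space character described above.
space = " "

code_point = ord(space)            # integer code point of the character
hex_form = f"U+{code_point:04X}"   # conventional Unicode notation

print(code_point, hex_form, space.isspace())
```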

The term "whitespace" is based on the assumption that the background color used for rendered text is white.

Definition and ambiguity

As is common in technical literature, the two words "white space" have found widespread usage as the single term "whitespace", especially when used as an adjective, as in "whitespace character". Some specifications refer to "white space" while others refer to "whitespace"; there is no difference between the terms, although exactly which characters are being referred to does vary from context to context. For example, the form feed character is "whitespace" in HTML, but is not "white space" in XML.
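
The form feed difference can be sketched in Python. The two sets below are assumptions for this example, written out from the XML 1.0 "S" production and the HTML definition of ASCII whitespace; the names XML_WHITE_SPACE, HTML_WHITESPACE and is_whitespace are hypothetical.

```python
# Whitespace sets as defined by two specifications (assumed here from the
# XML 1.0 "S" production and the HTML "ASCII whitespace" definition).
XML_WHITE_SPACE = {"\u0020", "\u0009", "\u000D", "\u000A"}
HTML_WHITESPACE = {"\u0020", "\u0009", "\u000A", "\u000C", "\u000D"}

FORM_FEED = "\u000C"

def is_whitespace(ch: str, charset: set) -> bool:
    """Return True if the single character ch belongs to the given set."""
    return ch in charset

print(is_whitespace(FORM_FEED, HTML_WHITESPACE))
print(is_whitespace(FORM_FEED, XML_WHITE_SPACE))
```

The same character is therefore classified differently depending on which specification's set is consulted.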

The most common whitespace characters may be typed via the space bar or the Tab key. Depending on context, a line break generated by the Return key (Enter key) may be considered whitespace as well.

Unicode

In Unicode, the Unicode Character Database defines 26 characters as whitespace.
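
A few characters can be examined with Python's standard unicodedata module, as a rough sketch. Note the assumption: Python's str.isspace() uses its own rule, which is related to but not identical to the Unicode whitespace property, and the sample characters are chosen only for illustration.

```python
import unicodedata

# Illustrative sample: space, tab, no-break space, line separator,
# ideographic space, zero width space, and an ordinary letter.
samples = ["\u0020", "\u0009", "\u00A0", "\u2028", "\u3000", "\u200B", "A"]

def describe(ch: str) -> tuple:
    """Return (code point label, general category, str.isspace() result)."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), ch.isspace())

for ch in samples:
    print(describe(ch))
```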

Programming languages

Runs of whitespace (beyond a first whitespace character) occurring within source code written in computer programming languages are generally ignored; such languages are free-form. In Haskell and Python, however, whitespace and indentation are used for syntactical purposes. In the Whitespace language, whitespace characters are the only valid characters for programming, and any other characters are ignored.
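
Both behaviours can be observed in Python itself: extra spaces between tokens on a line are ignored, while indentation is significant. The following is a minimal sketch; the helper name compiles is hypothetical.

```python
# Within a line, extra spaces between tokens do not change the meaning.
compact = eval("1+2")
spaced = eval("1   +   2")

# Leading whitespace, however, is part of Python's syntax.
well_indented = "if True:\n    x = 1\n"
badly_indented = "if True:\nx = 1\n"

def compiles(source: str) -> bool:
    """Return True if Python accepts the source text, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False

print(compact == spaced, compiles(well_indented), compiles(badly_indented))
```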

Still, for most programming languages, excessive use of whitespace, especially trailing whitespace at the end of lines, is considered a nuisance. Correct use of whitespace, however, aids developers: it can make the code easier to read and help group related logic. In interpreted languages, parsing of unnecessary whitespace may affect the speed of execution. In markup languages like HTML, unnecessary whitespace increases the file size, and so may affect the speed of transfer over a network. On the other hand, unnecessary whitespace can also inconspicuously mark code, similar to, but less obvious than, comments in code. This can be desirable to prove an infringement of license or copyright that was committed by copying and pasting.
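
Removing trailing whitespace is a common clean-up step. A minimal Python sketch follows, assuming Unix-style "\n" line endings; the function name strip_trailing_whitespace is hypothetical.

```python
def strip_trailing_whitespace(text: str) -> str:
    """Remove whitespace at the end of each line, keeping line breaks."""
    lines = text.split("\n")
    return "\n".join(line.rstrip() for line in lines)

source = "int x = 1;   \nreturn x;\t\n"
cleaned = strip_trailing_whitespace(source)
print(repr(cleaned))
```

The cleaned text is shorter than the original, which is the file-size effect mentioned above in miniature.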

The C language defines whitespace to be "... space, horizontal tab, new-line, vertical tab, and form-feed". The HTTP network protocol has very strict requirements about what type of whitespace can occur in the control structures (such as the header fields) and where it must and must not occur.
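
The five characters in the C quotation can be written out as a set and used to skip leading whitespace, as a scanner might. This is a sketch in Python rather than C, and the names C_WHITESPACE and skip_whitespace are hypothetical.

```python
# The five characters named in the quotation above:
# space, horizontal tab, new-line, vertical tab, form-feed.
C_WHITESPACE = " \t\n\v\f"

def skip_whitespace(text: str, pos: int = 0) -> int:
    """Return the index of the first character at or after pos that is not
    in C_WHITESPACE (or len(text) if none remains)."""
    while pos < len(text) and text[pos] in C_WHITESPACE:
        pos += 1
    return pos

print(skip_whitespace(" \t\n int x;"))
```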

Literature

On some occasions, such as in a textbook on the Modula-2 computer language published ca. 1985 by Springer-Verlag, it is necessary to explicitly show a symbol to indicate a space code. That book, at least, used the symbol ␣ (Unicode U+2423, decimal 9251, OPEN BOX) to show an explicit space code. (In case it does not render well in a web browser: it looks much like a closing square bracket "]", although not as wide, rotated a quarter-turn clockwise and placed below the writing line. Some fonts render it too narrowly.)
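
The same convention is easy to reproduce when printing text for inspection. A minimal Python sketch, with the hypothetical helper name show_spaces:

```python
OPEN_BOX = "\u2423"  # the OPEN BOX symbol described above

def show_spaces(text: str) -> str:
    """Replace every U+0020 space with the visible OPEN BOX symbol."""
    return text.replace(" ", OPEN_BOX)

print(show_spaces("a b  c"))
```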

The TI-8x series graphing calculators from Texas Instruments, at least the early models, use the same symbol to represent the space character in the keypad silkscreening, although on the calculators' display this character appears as a blank space, as on typical computer monitors.

File names

Such usage is similar to multiword file names written for operating systems and applications that are confused by embedded space codes; such file names instead use an underscore (_) as a word separator, as_in_this_phrase.
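
Converting a multiword name to this form is a one-line substitution. A minimal Python sketch, with the hypothetical helper name underscore_name:

```python
def underscore_name(name: str) -> str:
    """Replace each space in a file name with an underscore."""
    return name.replace(" ", "_")

print(underscore_name("as in this phrase"))
```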

Another such symbol was . This was used in the early years of computer programming when writing on coding forms. Keypunch operators immediately recognized the symbol as an "explicit space".

See also

  • Programming style
  • Whitespace (programming language)
  • Indent style
  • Space (punctuation)
  • Trim (programming)



The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 