Non-breaking space
In computer-based text processing and digital typesetting, a non-breaking space or no-break space (NBSP) is a variant of the space character that prevents an automatic line break (line wrap) at its position. In certain formats (such as HTML), it also prevents the “collapsing” of multiple consecutive whitespace characters into a single space. The non-breaking space is also known as a hard space or fixed space. In Unicode, it is encoded at U+00A0.
Non-breaking behavior
Text-processing software typically assumes that an automatic line break may be inserted anywhere a space character occurs; a non-breaking space prevents this from happening (provided the software recognizes the character). For example, if the text “100 km” will not quite fit at the end of a line, the software may insert a line break between “100” and “km”. To avoid this undesirable behaviour, the editor may choose to use a non-breaking space between “100” and “km”. This guarantees that the text “100 km” will not be broken: if it does not fit at the end of a line, it is moved in its entirety to the next line.
Use as non-collapsing white-space
A second common application of non-breaking spaces is in plain text file formats such as SGML, HTML, TeX, and LaTeX, which sometimes treat sequences of whitespace characters (space, newline, tab, form feed, etc.) as if they were a single white-space character. Such “collapsing” of white-space allows the author to neatly arrange the source text using line breaks, indentation and other forms of spacing without affecting the final typeset result.
In contrast, non-breaking spaces are not merged with neighboring whitespace characters, and can therefore be used by an author to insert additional visible space in the formatted text. For example, in HTML, non-breaking spaces may be used in conjunction with a fixed-width font to create tabular alignment (courier new font family used):
Column 1      Column 2
--------      --------
     1.2           2.3
(Note that the pre tag, the white-space: pre CSS rule, or a table are alternative, if not better, ways to achieve the same result in HTML.)
If ordinary spaces are used instead, they are collapsed when the HTML is rendered and the layout is broken:
Column 1 Column 2
-------- --------
1.2 2.3
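Both properties described above, suppression of line breaks and resistance to whitespace collapsing, can be illustrated with a short Python sketch. The helper `collapse_whitespace` is a hypothetical stand-in for what a renderer does and is not taken from any HTML engine; it assumes that only ASCII whitespace is collapsed. The line-wrapping part relies on the standard `textwrap` module, which breaks lines only at ASCII whitespace.

```python
import re
import textwrap

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def collapse_whitespace(text: str) -> str:
    """Collapse each run of ASCII whitespace into a single space.

    Hypothetical helper that loosely mimics a renderer: only space, tab,
    newline, carriage return and form feed are matched, so U+00A0 is kept.
    """
    return re.sub(r"[ \t\n\r\f]+", " ", text)


# Collapsing: ordinary spaces merge, non-breaking spaces survive.
plain = "1.2" + " " * 5 + "2.3"
hard = "1.2" + NBSP * 5 + "2.3"
print(repr(collapse_whitespace(plain)))  # '1.2 2.3'
print(len(collapse_whitespace(hard)))    # 11 (3 + 5 + 3, nothing removed)

# Line breaking: textwrap does not treat U+00A0 as a break opportunity.
sentence_plain = "The distance is 100 km"
sentence_hard = "The distance is 100" + NBSP + "km"
print(textwrap.wrap(sentence_plain, width=20))  # ['The distance is 100', 'km']
print(textwrap.wrap(sentence_hard, width=20))   # ['The distance is', '100\xa0km']
```

With the ordinary space, “100” and “km” end up on different lines; with the non-breaking space, the whole unit moves to the next line.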
A non-breaking space can also be used to change formatting automatically within a document. This is useful for documents such as class plans and recipe files, where the description of a cell or line may be formatted differently from the actual text or title. For instance, "recipe for: SOURDOUGH" can be set up to change font, point size, color, etc., anywhere on the line. First type text using the first format, then change the formatting as desired and insert the hard space with Ctrl+Shift+Space (in Word). When the cursor reaches the hard space, the formatting changes automatically to the second format. This works well for template documents that are used multiple times, making them quicker and easier to fill in.
Encodings
Format | Representation of non-breaking space |
---|---|
Unicode and ISO/IEC 10646 | U+00A0 NO-BREAK SPACE. Can be encoded in UTF-8 as 0xC2 0xA0. |
ISO/IEC 8859 | 0xA0 |
CP1252 (Windows default in most countries using Germanic or Romance languages) | 0xA0 |
KOI8-R | 0x9A |
EBCDIC | 0x41 |
CP437, CP850, CP866 | 0xFF |
SGML and HTML (including Wikitext) | Character entity reference: &nbsp;. Numeric character references: &#160; or &#xA0; |
TeX | tilde (~) |
ASCII | Not available |
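The byte values in the table can be checked against Python's built-in codecs. This is a minimal sketch: the mapping of table rows to codec names is an assumption of the example (in particular, `cp037` is used as one representative EBCDIC code page and `iso-8859-1` as one part of ISO/IEC 8859), and the `html` module is used to decode the HTML references.

```python
import html

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Codec name -> byte sequence listed in the table.
# The choice of codec names is an assumption of this sketch.
expected = {
    "utf-8": b"\xc2\xa0",
    "iso-8859-1": b"\xa0",
    "cp1252": b"\xa0",
    "koi8-r": b"\x9a",
    "cp037": b"\x41",  # one EBCDIC code page
    "cp437": b"\xff",
    "cp850": b"\xff",
    "cp866": b"\xff",
}

for codec, raw in expected.items():
    encoded = NBSP.encode(codec)
    print(f"{codec:12} {encoded.hex().upper()}  matches table: {encoded == raw}")


def ascii_can_encode(ch: str) -> bool:
    """Return True if the character is representable in 7-bit ASCII."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(ascii_can_encode(NBSP))  # False: ASCII has no non-breaking space

# The HTML entity and both numeric references decode to the same character.
references = ["&nbsp;", "&#160;", "&#xA0;"]
print(all(html.unescape(ref) == NBSP for ref in references))  # True
```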
Unicode defines several other non-break space characters that differ from the regular space in width:
- No-break thin space, known in Unicode as “NARROW NO-BREAK SPACE” (U+202F). This is required for French punctuation (before ?, ! or ;).
- Word joiner, encoded in Unicode 3.2 and above as U+2060 and in HTML as &#8288;. The word joiner does not normally produce any space but prohibits a line break on either side of it.
- The byte order mark, U+FEFF, officially named “ZERO WIDTH NO-BREAK SPACE”, can also be used with the same meaning as the word joiner, but in current documents this use is deprecated. See also Zero-width non-breaking space.
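The characters listed above can be inspected with Python's standard `unicodedata` module. The sketch below prints each character's official name, general category and compatibility decomposition. The two space characters belong to category Zs (space separator) and decompose to a `<noBreak>` variant of the ordinary space, whereas the word joiner and U+FEFF are format characters (Cf) that take up no width.

```python
import unicodedata

# Code points mentioned in this article.
chars = ["\u00a0", "\u202f", "\u2060", "\ufeff"]

for ch in chars:
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch) or "-",
    )

# Prints:
# U+00A0 NO-BREAK SPACE Zs <noBreak> 0020
# U+202F NARROW NO-BREAK SPACE Zs <noBreak> 0020
# U+2060 WORD JOINER Cf -
# U+FEFF ZERO WIDTH NO-BREAK SPACE Cf -
```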
Keyboard entry methods
It is rare for national or international standards on keyboard layouts to define an input method for the non-breaking space. An exception is the Finnish Multilingual Keyboard, accepted as the national standard SFS 5966 in 2008. According to the SFS setting, the non-breaking space can be entered with the key combination AltGr + Space.
Typically, authors of keyboard drivers and application programs (e.g., word processors) have devised their own keyboard shortcuts for the non-breaking space. For example:
System/application | Entry method |
---|---|
Mac OS | Option+Space |
X11 | Compose, Space, Space |
Emacs | Ctrl+X 8 Space |
Vim | Ctrl+K N S |
Windows (all applications) | Alt+0160 or Alt+255 (on numeric keypad) |
Microsoft Word, Dreamweaver, OpenOffice.org (since 3.0) | Ctrl+Shift+Space |
WordPerfect, OpenOffice.org (before 3.0), LyX | Ctrl+Space, Ctrl+Shift+Space for recent OpenOffice (see LP) |
GTK-based applications | Ctrl+Shift+U 00A0 |
Mac Adobe InDesign | Option+Command+X |
Many office applications | Insert → Symbol dialog box (Latin-1 subset, after ~) |
See also
- Byte order mark
- Hyphens in computing, for information about hard and non-breaking hyphens
- List of XML and HTML character entity references
- Orphans and widows
- Punctuation
- Sentence spacing in digital media
- Space (punctuation)