Trim (programming)
In programming, trim or strip is a common string manipulation function which removes leading and trailing whitespace from a string.

For example, the text

' this is a test '

would be changed, after trimming, to

'this is a test'

Variants

The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java do not have these variants built in, although Object Pascal (Delphi) has TrimLeft and TrimRight functions (http://www.freepascal.org/docs-html/rtl/sysutils/trim.html).
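
As an illustration, the one-sided variants can be compared with the two-sided form using Python's built-in string methods. This is only a minimal sketch; the sample text and variable names are arbitrary.

```python
text = "  this is a test  "

left_trimmed = text.lstrip()    # only leading whitespace is removed
right_trimmed = text.rstrip()   # only trailing whitespace is removed
both_trimmed = text.strip()     # whitespace is removed from both ends

print(repr(left_trimmed))    # 'this is a test  '
print(repr(right_trimmed))   # '  this is a test'
print(repr(both_trimmed))    # 'this is a test'
```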

Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, and also offers variants with a predicate parameter (a functor) to select which characters are trimmed.
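
The two parameter styles can be sketched in Python. The character-set form uses the built-in strip method; the predicate form is shown with trim_if, a hypothetical helper written for this illustration (it is not part of Python, nor is it the Boost function).

```python
from typing import Callable


def trim_if(s: str, should_trim: Callable[[str], bool]) -> str:
    """Hypothetical helper: trim both ends while the predicate holds."""
    start, end = 0, len(s)
    while start < end and should_trim(s[start]):
        start += 1
    while end > start and should_trim(s[end - 1]):
        end -= 1
    return s[start:end]


# Character-set form: the argument is a set of characters, not a prefix.
by_charset = "xxyhelloyxx".strip("xy")

# Predicate form: trim digits from both ends.
by_predicate = trim_if("123abc456", str.isdigit)

print(by_charset)    # hello
print(by_predicate)  # abc
```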


An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, Apache Jakarta's StringUtils has a function called stripToNull which returns null in place of an empty string.
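
A rough Python analogue of this behaviour might look as follows. The function strip_to_none is a hypothetical name, and passing None through unchanged is an assumption; this is not the StringUtils implementation.

```python
from typing import Optional


def strip_to_none(s: Optional[str]) -> Optional[str]:
    """Hypothetical analogue: trim, then return None if nothing remains."""
    if s is None:          # assumption: a missing input stays missing
        return None
    trimmed = s.strip()
    return trimmed if trimmed else None


print(strip_to_none("  abc  "))  # abc
print(strip_to_none("     "))    # None
```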

An alternative to trimming a string is space normalization, where in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is done by Trim in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space function in XSLT and XPath.
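
Space normalization can be sketched in Python by splitting on runs of whitespace and rejoining with single spaces. The name normalize_space is borrowed from XPath for illustration only; Python's own definition of whitespace applies here, which may differ from that of a spreadsheet or an XPath processor.

```python
def normalize_space(s: str) -> str:
    """Trim both ends and collapse each inner whitespace run to one space."""
    # split() with no argument splits on whitespace runs and drops
    # the empty pieces at either end.
    return " ".join(s.split())


print(repr(normalize_space("  this   is\ta \n test ")))  # 'this is a test'
```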

While most algorithms return a new (trimmed) string, some alter the original string in place. Notably, the Boost library allows either in-place trimming or a trimmed copy to be returned.
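
The distinction can be illustrated in Python, where strings are immutable and strip therefore always returns a copy. To show the in-place style, the sketch below uses a mutable bytearray and a hypothetical helper, trim_in_place, that treats only space, tab, line feed, and carriage return as whitespace (an assumption made for brevity).

```python
def trim_in_place(buf: bytearray) -> None:
    """Hypothetical helper: remove ASCII whitespace from both ends of buf itself."""
    whitespace = b" \t\n\r"
    end = len(buf)
    while end > 0 and buf[end - 1] in whitespace:
        end -= 1
    del buf[end:]                      # drop the trailing run
    start = 0
    while start < len(buf) and buf[start] in whitespace:
        start += 1
    del buf[:start]                    # drop the leading run


original = "  text  "
trimmed_copy = original.strip()        # new string; original is unchanged

buffer = bytearray(b"  text  ")
trim_in_place(buffer)                  # buffer is modified; nothing is returned
```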

Definition of whitespace

The characters which are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.

Java's trim method considers ASCII spaces and control codes as whitespace, while Java's isWhitespace method recognizes Unicode space characters.

Delphi's Trim function considers characters U+0000 (NULL) through U+0020 (SPACE) to be whitespace.
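
The effect of these differing definitions can be sketched in Python. The helper trim_up_to_space is hypothetical and follows the rule described above for Delphi (every character from U+0000 through U+0020 is trimmed); it is compared with Python's built-in strip, which uses Unicode whitespace instead.

```python
def trim_up_to_space(s: str) -> str:
    """Hypothetical helper: trim characters with code point <= U+0020."""
    start, end = 0, len(s)
    while start < end and ord(s[start]) <= 0x20:
        start += 1
    while end > start and ord(s[end - 1]) <= 0x20:
        end -= 1
    return s[start:end]


# NUL, tab, EM SPACE (U+2003), text, EM SPACE, line feed
sample = "\x00\t\u2003text\u2003\n"

control_rule = trim_up_to_space(sample)  # keeps the EM SPACEs, drops NUL/tab/LF
unicode_rule = sample.strip()            # drops EM SPACE and LF, stops at NUL

print(repr(control_rule))
print(repr(unicode_rule))
```

Neither result is "wrong"; the two functions simply disagree about which characters count as trimmable.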

Usage

Following are examples of trimming a string using several programming languages. All of the implementations shown return a new string and do not alter the original variable.
Example usage | Languages
String.Trim([chars]) | C#, VB.NET, Windows PowerShell
string.strip(); | D
(.trim string) | Clojure
sequence [ predicate? ] trim | Factor
(string-trim '(#\Space #\Tab #\Newline) string) | Common Lisp
(string-trim string) | Scheme
string.trim() | Java, JavaScript (1.8.1+, Firefox 3.5+)
Trim(String) | Pascal (http://gnu-pascal.de/gpc-hr/Trim.html), QBasic, Visual Basic, Delphi
string.strip() | Python
strings.Trim(string, chars) | Go
LTRIM(RTRIM(String)) | Oracle SQL, T-SQL
strip(string [,option , char]) | REXX
string:strip(string [,option , char]) | Erlang
string.strip | Ruby
(string =~ /\S(.*\S)?/s, $&) | Perl 5
string.trim | Perl 6
trim(string) | PHP
[string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] | Objective-C using Cocoa
string withBlanksTrimmed | Smalltalk (Squeak, Pharo)
strip(string) | SAS
string trim $string | Tcl
TRIM(string) or TRIM(ADJUSTL(string)) | Fortran
TRIM(string) | SQL
TRIM(string) or LTrim(string) or RTrim(String) | ColdFusion

Other languages

In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
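
As a sketch of such a custom function, the following Python code trims with a regular expression anchored at both ends of the string. Python already has a built-in strip, so regex_trim is purely illustrative; the name is hypothetical.

```python
import re

# \A and \Z anchor at the very start and very end of the string.
_EDGE_WHITESPACE = re.compile(r"\A\s+|\s+\Z")


def regex_trim(s: str) -> str:
    """Remove leading and trailing whitespace; inner whitespace is kept."""
    return _EDGE_WHITESPACE.sub("", s)


print(repr(regex_trim("  a  b \n")))  # 'a  b'
```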

AWK

In AWK, one can use regular expressions to trim. Schematically, each operation is a substitution that modifies the variable v in place:


ltrim(v) = gsub(/^[ \t]+/, "", v)
rtrim(v) = gsub(/[ \t]+$/, "", v)
trim(v) = ltrim(v); rtrim(v)


or, written as functions that return the trimmed string:


function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
function trim(s) { return rtrim(ltrim(s)); }

C/C++

There is no standard trim function in C or C++. Most of the available string libraries for C (http://www.and.org/vstr/comparison) contain code which implements trimming, or functions that significantly ease an efficient implementation. In some non-standard C libraries the function is called EatWhitespace.

The open source C++ library Boost has several trim variants, including a standard one (http://www.boost.org/doc/html/string_algo/usage.html#id2742817):

#include <string>
#include <boost/algorithm/string/trim.hpp>

std::string trimmed = boost::algorithm::trim_copy(std::string(" string "));


Note that Boost's function named simply trim modifies the input sequence in place and does not return a result (http://www.boost.org/doc/html/trim.html).

Another open source C++ library, Qt, has several trim variants, including a standard one (http://doc.trolltech.com/4.5/qstring.html#trimmed):

#include <QString>

QString trimmed = s.trimmed();  // s is a QString


The Linux kernel also includes a strip function, strstrip, since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses strim instead of strstrip to avoid false warnings (http://www.kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.33-rc1-git1.log).

Haskell

A trim algorithm in Haskell:


import Data.Char (isSpace)
trim :: String -> String
trim = f . f
  where f = reverse . dropWhile isSpace


may be interpreted as follows: f drops the preceding whitespace, and reverses the string. f is then again applied to its own output. Note that the type signature (the second line) is optional.
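
The same two-pass idea can be traced in Python, where the intermediate value is easier to inspect. This is a sketch only; drop_leading_and_reverse is a hypothetical name standing in for f, and Python's str.isspace stands in for isSpace.

```python
from itertools import dropwhile


def drop_leading_and_reverse(s: str) -> str:
    """Counterpart of f: drop leading whitespace, then reverse the rest."""
    return "".join(dropwhile(str.isspace, s))[::-1]


def trim(s: str) -> str:
    """Counterpart of f . f: apply the step twice."""
    return drop_leading_and_reverse(drop_leading_and_reverse(s))


halfway = drop_leading_and_reverse("  ab c  ")
print(repr(halfway))            # '  c ba'  (old trailing spaces now lead)
print(repr(trim("  ab c  ")))   # 'ab c'
```

After the first pass the original trailing whitespace sits at the front of the reversed string, so the second pass removes it and restores the original order.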

J

The trim algorithm in J is a functional description:


trim =. #~ [: (+./\ *. +./\.) ' '&~:


That is: filter (#~) for non-space characters (' '&~:) between leading (+./\) and (*.) trailing (+./\.) spaces.
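
For readers unfamiliar with J, the same mask-based approach can be sketched in Python. A running "or" from the left marks every position at or after the first non-space character, a running "or" from the right marks every position at or before the last one, and only positions marked by both are kept. The name trim_mask is hypothetical, and, like the J version, only the space character is treated as whitespace.

```python
from itertools import accumulate
from operator import or_


def trim_mask(s: str) -> str:
    """Keep characters lying between the first and last non-space character."""
    non_space = [ch != " " for ch in s]
    # Prefix scan: True from the first non-space onward (J: +./\).
    seen_before = list(accumulate(non_space, or_))
    # Suffix scan: True up to the last non-space (J: +./\.).
    seen_after = list(accumulate(reversed(non_space), or_))[::-1]
    return "".join(
        ch for ch, a, b in zip(s, seen_before, seen_after) if a and b
    )


print(repr(trim_mask("  a b  ")))  # 'a b'
```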

JavaScript

There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later), and the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:


String.prototype.trim = function () {
  return this.replace(/^\s+/g, "").replace(/\s+$/g, "");
};

Perl

Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.

Example:

$string =~ s/^\s+//; # remove leading whitespace
$string =~ s/\s+$//; # remove trailing whitespace

or:

$string =~ s/^\s+|\s+$//g ; # remove both leading and trailing whitespace

These examples modify the value of the original variable $string.

Also available for Perl is StripLTSpace in String::Strip from CPAN.

There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:
  • chop removes the last character from a string and returns it.
  • chomp removes the trailing newline character(s) from a string if present. (What constitutes a newline depends on $INPUT_RECORD_SEPARATOR.)


In Perl 6 (later renamed Raku), a major revision of the language, strings have a trim method.

Example:

$string = $string.trim; # remove leading and trailing whitespace
$string .= trim; # same thing

Tcl

The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove. The default is whitespace (space, tab, newline, carriage return).

Example of trimming vowels:


set string onomatopoeia
set trimmed [string trim $string aeiou] ;# result is nomatop
set r_trimmed [string trimright $string aeiou] ;# result is onomatop
set l_trimmed [string trimleft $string aeiou] ;# result is nomatopoeia

XSLT

XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.

Example:

<xsl:variable name="trimmed">
  <xsl:value-of select="normalize-space(string)"/>
</xsl:variable>

XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.

Another XSLT technique for trimming is to utilize the XPath 2.0 substring function.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.