Tr (Unix)
tr is a command in Unix-like operating systems.

When executed, the program reads from the standard input and writes to the standard output. It takes as parameters two sets of characters, and replaces occurrences of the characters in the first set with the corresponding elements from the other set. For example,


tr 'abcd' 'jkmn'


maps 'a' to 'j', 'b' to 'k', 'c' to 'm', and 'd' to 'n'.
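
As a quick illustration of the mapping above, the following minimal sketch pipes a sample string through the command. The input string and the variable name out are chosen here for illustration only.

```shell
# Feed a sample string to tr on standard input.
# Characters that are not in the first set (here, the space) pass through unchanged.
out=$(printf 'abcd cab\n' | tr 'abcd' 'jkmn')
printf '%s\n' "$out"
```

Because tr only reads standard input, a file is normally supplied with shell redirection rather than as an argument.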

Sets of characters may be abbreviated by using character ranges. The previous example could be written:


tr 'a-d' 'jkmn'
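
A minimal sketch of the range form, assuming the C locale is acceptable for the data being processed. Setting LC_ALL=C for the single command pins the meaning of 'a-d' to the four characters a, b, c and d; the sample input and the variable name ranged are illustrative.

```shell
# Run tr in the C locale so the range 'a-d' expands to exactly a, b, c, d.
ranged=$(printf 'abcd cab\n' | LC_ALL=C tr 'a-d' 'jkmn')
printf '%s\n' "$ranged"
```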


In POSIX-compliant versions of tr, the set represented by a character range depends on the locale's collating order, so it is safer to avoid character ranges in scripts that might be executed in a locale different from that in which they were written. Ranges can often be replaced with POSIX character classes such as [:alpha:].
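
As a sketch of replacing ranges with character classes, the following converts lower-case letters to upper case without writing 'a-z' or 'A-Z'. The classes [:lower:] and [:upper:] are used because they pair up one-to-one for translation; the sample input and the variable name upper are illustrative.

```shell
# Upper-case the input using character classes instead of the ranges 'a-z' 'A-Z'.
upper=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"
```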

The -c flag complements the first set of characters.


tr -cd '[:alnum:]'


therefore removes all non-alphanumeric characters.
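
A small sketch of the command above applied to a sample string. The input and the variable name cleaned are illustrative. Note that the trailing newline is also non-alphanumeric, so it is deleted along with the punctuation and spaces.

```shell
# -c complements the set (everything that is NOT alphanumeric),
# and -d deletes the characters in that complemented set.
cleaned=$(printf 'Hello, World! 123\n' | tr -cd '[:alnum:]')
printf '%s\n' "$cleaned"
```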

The -s flag causes tr to squeeze each sequence of identical adjacent characters in its output into a single character. For example,


tr -s '\n' '\n'


replaces sequences of one or more newline characters with a single newline.
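
To make the effect concrete, this sketch squeezes the blank lines out of a short sample input. The input text and the variable names squeezed and expected are illustrative.

```shell
# The sample input contains runs of two and three newlines.
# tr -s collapses each run into a single newline, removing the empty lines.
squeezed=$(printf 'one\n\n\ntwo\n\nthree\n' | tr -s '\n' '\n')
expected=$(printf 'one\ntwo\nthree\n')
printf '%s\n' "$squeezed"
```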

The -d flag causes tr to delete every occurrence of the characters in the specified set from its input. In this case, only a single character set argument is used. The following command removes carriage return characters, thereby converting the line endings of a file from DOS/Windows format to Unix format.


tr -d '\r'
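
A minimal sketch of the conversion, using an in-line sample in place of a real file. The sample text and the variable names unix_text and expected are illustrative.

```shell
# The sample input uses DOS/Windows line endings (carriage return + newline).
# Deleting every carriage return leaves plain Unix newlines.
unix_text=$(printf 'line one\r\nline two\r\n' | tr -d '\r')
expected=$(printf 'line one\nline two\n')
printf '%s\n' "$unix_text"
```

For a real file, the same command would typically be run with redirection, for example tr -d '\r' < dos.txt > unix.txt, where dos.txt and unix.txt are hypothetical file names.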


Most versions of tr, including GNU tr and classic Unix tr, operate on single-byte characters and are not Unicode-compliant. An exception is the Heirloom Toolchest implementation, which provides basic Unicode support.

Ruby and Perl also have an internal tr operator, which operates analogously. Tcl's string map command is more general in that it maps strings to strings while tr maps characters to characters.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 