Filter (Unix)
In Unix and Unix-like operating systems, a filter is a program that gets most of its data from its standard input (the main input stream) and writes its main results to its standard output (the main output stream). Unix filters are often used as elements of pipelines. The pipe operator ("|") on a command line signifies that the main output of the command to the left is passed as main input to the command on the right.
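As a minimal sketch of these definitions, the snippet below defines a small filter as a shell function and places it in a pipeline. The function name to_upper and the sample input are assumptions made for illustration; they are not part of any standard tool.

```shell
# to_upper is a hypothetical filter: it reads lines from standard input
# and writes the upper-cased lines to standard output.
to_upper() {
    tr '[:lower:]' '[:upper:]'
}

# Each "|" feeds the standard output of the command on its left
# into the standard input of the command on its right.
result=$(printf 'alpha\nbeta\n' | to_upper | sort -r)
printf '%s\n' "$result"
```

Because to_upper touches only its standard streams, it can be combined with any other filter without either program knowing about the other.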
The classic filter is grep, which at its simplest prints to its output any lines containing a character string. Here's an example:
cut -d : -f 1 /etc/passwd | grep foo
This finds all registered users that have "foo" as part of their username. The cut command takes the first field (username) of each line of the Unix system password file and passes the results as input to grep, which searches its input for lines containing the character string "foo" and prints them on its output.
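To see why cut runs before grep, the following sketch applies the same pipeline to a few made-up lines in passwd format instead of the real /etc/passwd. The sample accounts are assumptions chosen so that one line contains "foo" outside the username field.

```shell
# Made-up sample data in passwd format (an assumption for illustration).
sample='root:x:0:0:root:/root:/bin/sh
foobar:x:1000:1000::/home/foobar:/bin/sh
alice:x:1001:1001:foo fan:/home/alice:/bin/sh'

# Extract usernames first, then search them: only "foobar" matches.
by_username=$(printf '%s\n' "$sample" | cut -d : -f 1 | grep foo)

# grep alone searches whole lines, so it also counts alice's comment field.
whole_line_count=$(printf '%s\n' "$sample" | grep -c foo)

printf '%s\n' "$by_username" "$whole_line_count"
```

With this sample, the cut and grep pipeline reports only foobar, while grep on the unfiltered lines counts two matches.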
Here is a Perl near-equivalent of the above, which prints the whole matching line from the passwd file rather than only the username:
perl -ne 'print if m/^[^:]*foo/' /etc/passwd
Or, to print only the username, without the rest of the line:
perl -ane '$_ = shift @F; print "$_\n" if /foo/' -F: /etc/passwd
Common Unix filter programs are: cat, cut, grep, head, sort, uniq and tail. Programs like awk and sed can be used to build quite complex filters because they are fully programmable.
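As a rough illustration of that programmability, the username search shown earlier can be written as a single awk or sed filter. This is a sketch run against made-up passwd-style lines (the sample data is an assumption), not the only way to express it.

```shell
# Made-up sample data in passwd format (an assumption for illustration).
sample='root:x:0:0:root:/root:/bin/sh
foobar:x:1000:1000::/home/foobar:/bin/sh
alice:x:1001:1001:foo fan:/home/alice:/bin/sh'

# awk: split each line on ":" and print field 1 when it contains "foo".
awk_names=$(printf '%s\n' "$sample" | awk -F: '$1 ~ /foo/ { print $1 }')

# sed: print only the first field, and only when that field contains "foo".
sed_names=$(printf '%s\n' "$sample" | sed -n 's/^\([^:]*foo[^:]*\):.*/\1/p')

printf '%s\n' "$awk_names" "$sed_names"
```

Both commands do the field extraction and the matching in one process, where the earlier example needed cut and grep together.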
List of Unix filter programs
- awk
- cat
- comm
- cut
- expand
- compress
- fold
- grep
- head
- nl
- perl
- pr
- sed
- sh
- sort
- split
- strings
- tail
- tac
- tee
- tr
- uniq
- wc
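Several of the programs listed above are commonly chained together. The sketch below, using a made-up list of login shells as input (an assumption for illustration), combines sort, uniq, head and awk to find the most frequent value:

```shell
# Made-up input: one login shell per line (an assumption for illustration).
shells='/bin/sh
/bin/bash
/bin/sh
/bin/zsh
/bin/sh
/bin/bash'

# sort groups identical lines, uniq -c counts each group,
# sort -rn orders the counts from highest to lowest,
# head keeps the first line, and awk prints its second field (the shell).
top_shell=$(printf '%s\n' "$shells" | sort | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }')

printf '%s\n' "$top_shell"
```

The first sort is needed because uniq only collapses adjacent identical lines.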