Grep
grep is a command-line text-search utility originally written for Unix. The name comes from the ed command g/re/p (global / regular expression / print). The grep command searches files or standard input globally for lines matching a given regular expression, and prints the lines to the program's standard output.
History
Grep was created by Ken Thompson as a standalone application adapted from the regular expression parser he had written for ed (which he also created). Its official creation date is given as March 3, 1973, in the Manual for Unix Version 4.
Usage
A common use of grep is to search a file for a fixed string, such as apple in the file fruitlist.txt. In this case, grep prints all lines containing apple from the file fruitlist.txt, regardless of word boundaries; therefore lines containing pineapple or apples are also printed. The grep command is case sensitive by default, so this example's output does not include lines containing Apple (with a capital A) unless they also contain apple.
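The basic search described above can be sketched as follows. The contents of fruitlist.txt and the temporary directory are assumptions made purely for illustration.

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple apples pineapple Apple banana > "$workdir/fruitlist.txt"

# Print every line that contains the substring "apple" (case sensitive).
basic_matches=$(grep apple "$workdir/fruitlist.txt")
printf '%s\n' "$basic_matches"
```

With this sample file, the line Apple is not printed because the match is case sensitive.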
To search all .txt files in a directory for apple in a shell that supports globbing, use an asterisk in place of the file name (*.txt rather than fruitlist.txt).
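A minimal sketch of the globbing search, assuming a directory that holds two hypothetical .txt files and one file with a different extension:

```shell
# Hypothetical sample files (assumptions for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple banana > "$workdir/fruitlist.txt"
printf '%s\n' 'green apple' cherry > "$workdir/shopping.txt"
printf '%s\n' apple > "$workdir/notes.md"   # not a .txt file, so the glob skips it

# The shell expands *.txt before grep runs. When given several files,
# grep prefixes each matching line with the name of the file it came from.
glob_matches=$(cd "$workdir" && grep apple *.txt)
printf '%s\n' "$glob_matches"
```

The expansion is done by the shell, not by grep, which is why the text specifies a shell that supports globbing.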
Regular expressions can be used to match more complicated queries. For example, the pattern ^a.ple matches all lines in the file that begin with the letter a, followed by any one character, then the letters ple.
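The regular expression search can be sketched as below. The sample lines are assumptions, and the pattern is quoted so that the shell passes it to grep unchanged.

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple ample 'a ple' pineapple maple > "$workdir/fruitlist.txt"

# ^ anchors the match at the start of the line, . matches any one character.
regex_matches=$(grep '^a.ple' "$workdir/fruitlist.txt")
printf '%s\n' "$regex_matches"
```

Here pineapple and maple are skipped because they do not begin with the letter a.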
As noted above, the term "grep" derives from a usage in ed and related text editors. Before grep existed as a separate command, the same effect might have been achieved by opening the file in ed, giving a command of the form g/re/p to print the relevant lines, and then giving the command to exit from ed.
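The ed session described above might be scripted roughly as follows. This is a sketch under assumptions: the sample file is hypothetical, the search pattern apple is chosen to match the other examples, and ed may not be installed on every system, so the sketch falls back to grep when ed is unavailable.

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple banana pineapple > "$workdir/fruitlist.txt"

# Interactively one would run "ed fruitlist.txt", type g/apple/p, then q.
# Here the two editor commands are fed on standard input instead;
# -s suppresses ed's byte-count diagnostics.
if ! ed_matches=$(printf '%s\n' 'g/apple/p' 'q' | ed -s "$workdir/fruitlist.txt" 2>/dev/null); then
    # Fallback so the sketch still runs where ed is missing.
    ed_matches=$(grep apple "$workdir/fruitlist.txt")
fi
printf '%s\n' "$ed_matches"
```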
Like most Unix commands, grep accepts options in the form of command-line arguments to change many of its behaviors. For example, the -i argument tells grep to be case insensitive, or to ignore case, so that it prints all lines containing apple regardless of capitalization.
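A minimal sketch of the case-insensitive search, assuming a hypothetical sample file with mixed capitalization:

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple Apple PINEAPPLE banana > "$workdir/fruitlist.txt"

# -i ignores case distinctions in both the pattern and the input.
nocase_matches=$(grep -i apple "$workdir/fruitlist.txt")
printf '%s\n' "$nocase_matches"
```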
To print all lines containing apple as a word (pineapple and apples will not match), use the word-regexp option (-w). However, if fruitlist.txt contains apple as a word followed by a hyphen (-) character, that line is also matched.
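The whole-word search, including the hyphen case, can be sketched as below. This assumes a grep that supports -w, as GNU and BSD grep do; the sample lines are hypothetical.

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple pineapple apples apple-pie 'red apple' > "$workdir/fruitlist.txt"

# -w requires the match to be bounded by non-word characters
# (anything other than letters, digits, and underscore) or line edges.
word_matches=$(grep -w apple "$workdir/fruitlist.txt")
printf '%s\n' "$word_matches"
```

The line apple-pie is printed because a hyphen is not a word character, so apple still counts as a whole word there.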
To print only the lines that consist of exactly apple and nothing else, use line-regexp (-x) instead of word-regexp (-w).
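A sketch of the whole-line match, using hypothetical sample data:

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple apple-pie 'red apple' apples > "$workdir/fruitlist.txt"

# -x selects only lines where the pattern matches the entire line.
line_matches=$(grep -x apple "$workdir/fruitlist.txt")
printf '%s\n' "$line_matches"
```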
The -v option (lower-case v) inverts the match: it prints all lines that do NOT contain apple in this example.
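The inverted match can be sketched as follows, again with assumed sample data:

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple banana pineapple cherry > "$workdir/fruitlist.txt"

# -v selects the lines that do not match the pattern.
inverted_matches=$(grep -v apple "$workdir/fruitlist.txt")
printf '%s\n' "$inverted_matches"
```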
Variations
There are countless implementations and derivatives of grep available for many operating systems. Early variants of grep included egrep and fgrep. egrep applies an extended regular expression syntax that was added to Unix after Ken Thompson's original regular expression implementation. fgrep searches for any of a list of fixed strings using the Aho–Corasick string matching algorithm. These variants of grep persist in most modern grep implementations as command-line switches (and standardized as -E and -F in POSIX). In such combined implementations, grep may also behave differently depending on the name by which it is invoked, allowing fgrep, egrep, and grep to be links to the same program.
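The -E and -F switches mentioned above might be used as in this sketch. The sample data and patterns are assumptions chosen to show the difference between extended and fixed-string matching.

```shell
# Hypothetical sample data (an assumption for this sketch).
workdir=$(mktemp -d)
printf '%s\n' apple 'a.ple' ample pear banana > "$workdir/fruitlist.txt"

# -E enables extended regular expressions, where | means alternation.
extended_matches=$(grep -E 'apple|pear' "$workdir/fruitlist.txt")

# -F treats the pattern as a fixed string, so the dot is a literal dot.
fixed_matches=$(grep -F 'a.ple' "$workdir/fruitlist.txt")

printf '%s\n' "$extended_matches" "$fixed_matches"
```

Without -F, the pattern a.ple would also match apple and ample, since an unescaped dot matches any single character.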
Other commands contain the word "grep" to indicate that they search (usually for regular expression matches). The pgrep utility, for instance, displays the processes whose names match a given regular expression.
In Perl, grep is the name of the built-in function that finds elements in a list that satisfy a certain property. This higher-order function is typically named filter in functional programming languages.
pcregrep is an implementation of grep that uses Perl regular expression syntax.
Ports of grep (within Cygwin and GnuWin32, for example) also run under Microsoft Windows. Some versions of Windows feature the similar qgrep command.
Usage as a verb
In December 2003, the Oxford English Dictionary Online added draft entries for "grep" as both a noun and a verb.
A common verb usage is the phrase "You can't grep dead trees", meaning one can more easily search through digital media, using tools such as grep, than one could with a hard copy (i.e., one made from dead trees, paper). Compare with google.
See also
- Boyer–Moore string search algorithm
- List of Unix utilities
- vgrep, or "visual grep"
- find
External links
- The grep command tutorial for Linux / UNIX.
- The grep Command – by The Linux Information Project (LINFO)
- Egrep for linguists An introduction to egrep
- Network grep - A packet analyzer used to match patterns at the network layer
- ack - A grep replacement for searching source code
- "why GNU grep is fast" - implementation details from GNU grep's author.