grep utility page

This time I'm talking about grep, an external command used from the bash command line.

History

grep was originally a command-line tool for the Unix operating system. Given a list of files or standard input, grep searches for text matching one or more regular expressions and outputs only the matching (or non-matching) lines or text.

grep was originally written by Ken Thompson. It began as a feature of the ed editor, and its name comes from the ed command g/re/p (globally search for a regular expression and print). In ed, entering g/re/p prints every line that matches the given regular expression.

In 1973, grep first appeared on the man page in the fourth edition of Unix.

The above is drawn mainly from Wikipedia.

Function

grep uses regular expressions to search text and print the matching lines. The input text can come from standard input or from files (any number of them, optionally given as wildcards). Newer versions of grep can also search recursively through all files in the subfolders of the current directory.

Typical options for grep include:

Pattern selection and interpretation:

-E Interprets the pattern as an extended regular expression (ERE).
-F Interprets the pattern as a list of fixed strings, separated by newlines.
-G Interprets the pattern as a basic regular expression (BRE). This is the default.
-P Interprets the pattern as a Perl-compatible regular expression.
-e <pattern> Specifies the pattern to search for; it can be given more than once.
-f <pattern file> Reads patterns from the given file, one pattern per line, and finds lines that match any of them.
-i Ignores differences in character case.
-w Matches only whole words.
-x Matches only whole lines.

Miscellaneous:

-v Inverts the match, selecting the lines that do not match.
-s Does not display error messages.

Output control:

-b Displays the byte offset of each output line from the beginning of the file.
-c Prints a count of the lines that match the pattern.
-h Does not prefix matching lines with the name of the file they came from.
-H Prefixes each matching line with the name of its file.
-l Lists the names of files whose contents match the pattern.
-L Lists the names of files whose contents do not match the pattern.
-n Prefixes each matching line with its line number.
-o Outputs only the matched parts of each line.
-q Does not print anything; only the exit status is set.
-R/-r Searches directories recursively, the same as specifying "-d recurse".

Context control:

-B <number of lines> Displays that many lines before each matching line, in addition to the matching line itself.
-A <number of lines> Displays that many lines after each matching line, in addition to the matching line itself.
-C <number of lines> or -<number of lines> Displays that many lines both before and after each matching line, in addition to the matching line itself.
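
A few of the options above can be tried on a small sample. This is only a minimal sketch: the file contents are invented for illustration and sample is a hypothetical temporary file.

```bash
# Build a small sample file (contents invented for this illustration).
sample=$(mktemp)
printf '%s\n' 'cat' 'concatenate' 'Cat food' 'dog' > "$sample"

count_matches=$(grep -c 'cat' "$sample")       # number of lines containing "cat"
whole_word=$(grep -w 'cat' "$sample")          # "cat" only as a whole word
whole_line_ci=$(grep -x -i 'CAT' "$sample")    # whole-line match, ignoring case
inverted=$(grep -v -i 'cat' "$sample")         # lines without "cat" in any case

printf 'count: %s\nword: %s\nline: %s\ninverted: %s\n' \
    "$count_matches" "$whole_word" "$whole_line_ci" "$inverted"
rm -f "$sample"
```

Here -c counts two lines ("cat" and "concatenate"), while -w and -x both narrow the result to the single line "cat", and -v -i leaves only "dog".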

Refer to grep --help output for detailed options.

Usage: grep [OPTION]... PATTERN [FILE]...
Search for PATTERN in each FILE.
Example: grep -i 'hello world' menu.h main.c

The complete reference manual is available from the command line via man grep and info grep.

Basic Usage

Search for a word in a file; the command prints every line containing "match_pattern":

grep match_pattern file_name
grep 'match_pattern' file_name
grep "match_pattern" file_name

The three commands above are equivalent as far as grep is concerned. The difference lies in how bash treats the quotes: single quotes keep whitespace in match_pattern together as one argument and suppress all bash expansion (for example, $var is passed through literally), while double quotes also keep whitespace together but still allow bash variable expansion, command substitution, and arithmetic expansion.
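
The quoting difference can be seen in a small experiment. This is a sketch only; pattern is a hypothetical variable, and -F is used in the second command so that grep treats the unexpanded text as a plain string.

```bash
# "pattern" is a hypothetical variable used only for this illustration.
pattern='world'

# Double quotes: bash expands $pattern first, so grep searches for "world".
double=$(echo 'hello world' | grep -c "$pattern")

# Single quotes: grep receives the literal text $pattern, which does not occur.
# (|| true keeps the non-zero "no match" status from stopping a strict script.)
single=$(echo 'hello world' | grep -F -c '$pattern' || true)

echo "double quotes: $double matching line(s)"
echo "single quotes: $single matching line(s)"
```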

Find in multiple files:

grep "match_pattern" file_1 file_2 file_3 ...

Output all lines except those that match, using the -v option:

grep -v "match_pattern" file_name

Another example:

ps -auxef|grep java|grep -v grep

Here grep -v grep removes the lines containing the text grep from the previous results (all running java instances). In effect it excludes the grep java command itself, so that only the real java processes remain.
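
The effect of the two filters can be reproduced without a running java process. In this sketch ps_sample is invented text standing in for ps output; it is not real output from any system.

```bash
# Invented lines standing in for ps output.
ps_sample='user 101 /usr/bin/java -jar app.jar
user 202 grep java
user 303 /usr/sbin/sshd'

# First keep the lines mentioning java, then drop the grep command itself.
java_only=$(printf '%s\n' "$ps_sample" | grep java | grep -v grep)
printf '%s\n' "$java_only"
```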

Highlight matches in color with the --color=auto option:

grep "match_pattern" file_name --color=auto

Use extended regular expressions with the -E option:

grep -E "[1-9]+"
# or
egrep "[1-9]+"

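egrep is equivalent to grep -E and uses extended regular expression syntax. Recent GNU grep releases warn that egrep is obsolescent, so grep -E is the safer spelling.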
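
What -E changes can be shown with the + character. The following is a small sketch on made-up input: in a basic regular expression + is an ordinary character, while with -E it means "one or more of the preceding item".

```bash
# Basic syntax: "a+" means the letter a followed by a literal plus sign.
basic=$(echo 'aaa' | grep -c 'a+' || true)
literal=$(echo 'a+b' | grep -c 'a+')

# Extended syntax: "a+" means one or more a's.
extended=$(echo 'aaa' | grep -E -c 'a+')

echo "basic: $basic, literal plus: $literal, extended: $extended"
```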

Output only the matched part of each line with the -o option:

echo this is a test line. | grep -o -E "[a-z]+\."
line.

echo this is a test line. | egrep -o "[a-z]+\."
line.

Count the lines in a file or text that contain a matching string with the -c option:

grep -c "text" file_name

Print the line number of each matching line with the -n option:

grep "text" -n file_name
# or
cat file_name | grep "text" -n

# Multiple files
grep "text" -n file_1 file_2
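
With more than one file, each output line carries the filename, the line number, and the text. A minimal sketch, using two throwaway files whose contents are invented for illustration:

```bash
# Create two small files in a temporary directory.
workdir=$(mktemp -d)
printf 'alpha\ntext here\n' > "$workdir/file_1"
printf 'text first\nbeta\n' > "$workdir/file_2"

# Output format is filename:line_number:matching_line
numbered=$(cd "$workdir" && grep -n 'text' file_1 file_2)
printf '%s\n' "$numbered"
rm -rf "$workdir"
```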

Print the byte offset at which the pattern matches:

echo gnu is not unix | grep -b -o "not"
7:not

# With -o, the offset is the byte position of the match itself, counted from 0; for this single-line input that is its position within the line. -b is usually used together with -o.
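
The difference between -b alone and -b with -o shows up once the input has more than one line. A small sketch on invented input:

```bash
# Without -o, -b reports where the matching LINE starts in the input.
line_offset=$(printf 'abc\nxyz not\n' | grep -b 'not')

# With -o, -b reports where the MATCH itself starts in the input.
match_offset=$(printf 'abc\nxyz not\n' | grep -b -o 'not')

echo "line:  $line_offset"
echo "match: $match_offset"
```

The first line "abc" plus its newline occupies bytes 0 to 3, so the second line starts at byte 4 and the word "not" at byte 8.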

Search multiple files and find which files match the text:

grep -l "text" file1 file2 file3...

Recursive file search with grep

Search for text recursively through nested directories:

grep "text" . -r -n
# . represents the current directory.

Ignore case when matching the pattern:

echo "hello world" | grep -i "HELLO"
hello world

Specify multiple patterns with the -e option:

echo this is a text line | grep -e "is" -e "line" -o
is
is
line

# You can also use the -f option to match multiple patterns; write the patterns to be matched one per line in a pattern file.
cat patfile
aaa
bbb

echo aaa bbb ccc ddd eee | grep -f patfile -o
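
The pattern-file example can be run end to end as follows. This is a sketch in which the pattern file is created with mktemp instead of being called patfile in the current directory.

```bash
# Write the two patterns, one per line, to a temporary pattern file.
patfile=$(mktemp)
printf 'aaa\nbbb\n' > "$patfile"

# -f reads the patterns from the file; -o prints each match on its own line.
found=$(echo 'aaa bbb ccc ddd eee' | grep -o -f "$patfile")
printf '%s\n' "$found"
rm -f "$patfile"
```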

Include or exclude specified files in grep search results:

# Recursively search for the string "main()" only in .php and .html files under the current directory
# (brace expansion turns this into --include=*.php --include=*.html)
grep "main()" . -r --include=*.{php,html}

# Exclude all README files from the search
grep "main()" . -r --exclude "README"

# Exclude files matching the patterns listed in the file filelist
grep "main()" . -r --exclude-from filelist
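
The include and exclude filters can be checked in a scratch directory. The file names below are invented for this sketch, and -l is added so that only file names are printed.

```bash
# Four files that all contain the search string.
workdir=$(mktemp -d)
for name in index.php page.html README notes.txt; do
    printf 'main()\n' > "$workdir/$name"
done

# Only .php and .html files are searched.
included=$(cd "$workdir" && grep -rl 'main()' . --include='*.php' --include='*.html' | sort)

# Everything except files named README is searched.
excluded=$(cd "$workdir" && grep -rl 'main()' . --exclude='README' | sort)

printf 'included:\n%s\nexcluded:\n%s\n' "$included" "$excluded"
rm -rf "$workdir"
```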

grep and xargs:

#Test files:
echo "aaa" > file1
echo "bbb" > file2
echo "aaa" > file3

grep "aaa" file* -lZ | xargs -0 rm
# After this runs, file1 and file3 are deleted. grep's -Z option terminates each output filename with a zero byte (\0); xargs -0 reads its input, splits the filenames on the zero byte, and then deletes the matching files. -Z is usually used together with -l.

grep silent output:

grep -q "test" filename
# Nothing is printed. The command returns 0 if a match is found and a non-zero value otherwise. Usually used in conditional tests.
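
A conditional test built on -q might look like the following sketch. contains_text is a hypothetical helper written for this illustration; it is not part of grep.

```bash
# contains_text PATTERN TEXT: hypothetical helper that reports whether
# TEXT contains a match for PATTERN, relying only on grep's exit status.
contains_text() {
    if printf '%s\n' "$2" | grep -q "$1"; then
        echo "found"
    else
        echo "not found"
    fi
}

contains_text 'test' 'this is a test'
contains_text 'test' 'nothing here'
```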

Print out the lines before or after the matching text:

# Display three lines after each match, using the -A option:
seq 10 | grep "5" -A 3
5
6
7
8

# Display three lines before each match, using the -B option:
seq 10 | grep "5" -B 3
2
3
4
5

# Display three lines before and three lines after each match, using the -C option:
seq 10 | grep "5" -C 3
2
3
4
5
6
7
8

# If there are multiple matches, "--" is printed as a separator between the groups of results:
echo -e "a\nb\nc\na\nb\nc" | grep a -A 1
a
b
--
a
b
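
When context options are combined with -n, grep marks matching lines with a colon and context lines with a hyphen after the line number. A short sketch using the same seq input:

```bash
# One line of context on each side of the match, with line numbers.
numbered_context=$(seq 10 | grep -n -C 1 '5')
printf '%s\n' "$numbered_context"
```

The output has "5:5" for the matching line and "4-4" and "6-6" for the context lines, which makes it easy to tell the two apart.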

grep -P enables Perl regular expression syntax, so patterns can be written using Perl's regex rules.

Perl regular expression syntax is also known as PCRE (Perl Compatible Regular Expressions). For reference, see the wiki's complete guide to PCRE expressions.

Common usage

find text string recursively

Suppose you don't know which files in a folder contain the text fantasy. You can look for them as follows:

grep -PHni 'fantasy' * -r

This command finds every file under the current folder that contains fantasy and prints the filename, the line number, and the matching line.

If you also need to look at the context of matching text, you can use:

grep -PHni 'fantasy' * -r -C 3

-P uses Perl regular expression syntax

-H prints the name of the file containing each matching line

-n prints the line number of each matching line

-i ignores case

-C 3 also lists the three lines before and the three lines after each match

-B 3 also lists the three lines before each match

-A 3 also lists the three lines after each match
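
To see what the output of such a search looks like, here is a sketch on a scratch directory. The directory layout and file contents are invented, and -P is left out because a plain word needs no Perl syntax.

```bash
# A scratch tree with one matching file in a subfolder.
workdir=$(mktemp -d)
mkdir -p "$workdir/docs"
printf 'intro\nA Fantasy novel\n' > "$workdir/docs/books.txt"
printf 'nothing relevant\n' > "$workdir/todo.txt"

# Each hit is printed as filename:line_number:matching_line
hits=$(cd "$workdir" && grep -rHni 'fantasy' *)
printf '%s\n' "$hits"
rm -rf "$workdir"
```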

find ip address

With the -o option, grep is often used to extract just the text that matches a particular pattern rather than to print the entire matching line.

For example:

$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 56:00:01:c6:ab:01 brd ff:ff:ff:ff:ff:ff
    inet 217.179.87.159/23 brd 217.179.87.255 scope global dynamic ens3
       valid_lft 63125sec preferred_lft 63125sec
3: ens7: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 5a:00:01:c6:ab:01 brd ff:ff:ff:ff:ff:ff

$ ip addr | grep -Po 'inet \d+\.\d+\.\d+\.\d+' | grep -v 'inet 127' | grep -Po '\d+.+'
217.179.87.159

Here the first expression extracts two lines of the form 'inet xxxxx':

inet 127.0.0.1
inet 217.179.87.159

The second expression excludes the 127.0.0.1 line, and the third strips the inet prefix, leaving the IP address we want.

Similar approaches can be used to extract IPv6 addresses.

Of course, stripping the prefix from 'inet 217.179.87.159' with a third grep is a rather laborious method. In practice we would use awk to cut off the first part: awk '{print $2}'. This splits the input text into fields on whitespace, and $2 is the second field, which is the IP address we want.
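
The awk variant can be sketched as follows. Two lines from the sample output stand in for ip addr, and -E with [0-9] replaces -P with \d so that the sketch also runs where grep lacks Perl regex support; that substitution is an assumption of this example, not part of the original pipeline.

```bash
# Two lines taken from the sample ip addr output shown earlier.
sample='    inet 127.0.0.1/8 scope host lo
    inet 217.179.87.159/23 brd 217.179.87.255 scope global dynamic ens3'

addr=$(printf '%s\n' "$sample" \
    | grep -Eo 'inet [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
    | grep -v 'inet 127' \
    | awk '{print $2}')    # keep only the second whitespace-separated field
echo "$addr"
```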

ports

If you want to find the services that are listening on ports on the current host, you can filter the output of the lsof command:

$ sudo lsof -Pni|grep LISTEN
sshd        858              root    3u  IPv4    19572      0t0  TCP *:22 (LISTEN)
sshd        858              root    4u  IPv6    19582      0t0  TCP *:22 (LISTEN)
nginx      6170              root    9u  IPv4 53951827      0t0  TCP *:443 (LISTEN)
nginx      6170              root   10u  IPv4 53951828      0t0  TCP *:8060 (LISTEN)
nginx      6170              root   11u  IPv4 53951829      0t0  TCP *:80 (LISTEN)

Accordingly, we can write a general-purpose function named ports and put it in the .bashrc file, so that port usage is easy to check. The function can be written as follows:

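ports () {
    local x=$1
    if [ -z "$x" ]; then
        # No argument: list all listening TCP sockets and all UDP sockets
        sudo lsof -Pni|grep -P 'LISTEN|UDP'
    else
        # With a port number: keep only the lines that mention ":<port>"
        sudo lsof -Pni|grep -P 'LISTEN|UDP'|grep ":$x"
    fi
}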

Then we can use it like this:

ports
ports 443
ports 22

Note that it is convenient to configure password-free sudo for your Linux account; otherwise you may be asked for your password each time ports calls sudo. Of course, if you only want to check the ports of services you started yourself, you can drop sudo.
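
One thing to keep in mind is that a pattern such as ":80" also matches ":8060". A stricter filter could be sketched as below; filter_port is a hypothetical helper that reads lsof-style lines on standard input, and lsof_sample is abridged from the sample output above.

```bash
# filter_port PORT: hypothetical helper. It requires a non-digit (or the
# end of the line) right after the port, so ":80" does not match ":8060".
filter_port() {
    grep -E ":$1([^0-9]|\$)"
}

lsof_sample='nginx 6170 root  9u IPv4 53951827 0t0 TCP *:443 (LISTEN)
nginx 6170 root 10u IPv4 53951828 0t0 TCP *:8060 (LISTEN)
nginx 6170 root 11u IPv4 53951829 0t0 TCP *:80 (LISTEN)'

loose=$(printf '%s\n' "$lsof_sample" | grep -c ':80')
strict=$(printf '%s\n' "$lsof_sample" | filter_port 80 | wc -l | tr -d ' ')
echo "loose match: $loose line(s), strict match: $strict line(s)"
```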

has-user, has-group

How do you check whether a Linux account exists?

This article does not rely on a dedicated command for this check. Commands such as useradd do fail when the user already exists, but that is not an appropriate way to test.

Instead, we parse the /etc/passwd file ourselves. This file lists all accounts in the system in the form of:

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
...

So, to determine whether a user exists, you only need to check the first field.

awk is an obvious fit for this:

$ cat /etc/passwd|awk -F: '{print $1}'
root
daemon
bin
sys
sync

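However, this article sticks with grep to solve the problem.

has-user() {
    local name=${1:-root}
    # The trailing colon ties the match to the whole first field,
    # so that "joe" does not also match "joey"
    grep -q "^$name:" /etc/passwd
}

has-user 'joe' && echo 'joe exists' || echo 'joe not exists'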

Similarly, we can define a function has-group:

has-group () {
    local name=${1:-root}
    # Match the whole group name field, up to the first colon
    grep -q "^$name:" /etc/group
}

has-group staff && echo 'staff group exists' || echo 'staff group not exists'
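
When matching the first field, it matters whether the pattern includes the field separator. The sketch below uses a throwaway passwd-format file with invented accounts; user_in_file is a hypothetical variant that takes the file as a second argument so it can be tried without touching /etc/passwd.

```bash
# user_in_file NAME FILE: hypothetical helper, true if NAME is the
# complete first field of some line in the passwd-format FILE.
user_in_file() {
    grep -q "^$1:" "$2"
}

passwd_sample=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'joey:x:1000:1000::/home/joey:/bin/bash' > "$passwd_sample"

# Anchoring on the name alone also matches longer names such as "joey".
prefix_only=$(grep -c '^joe' "$passwd_sample")

if user_in_file joe "$passwd_sample"; then exact_joe=yes; else exact_joe=no; fi
if user_in_file joey "$passwd_sample"; then exact_joey=yes; else exact_joey=no; fi

echo "prefix match count: $prefix_only, joe: $exact_joe, joey: $exact_joey"
rm -f "$passwd_sample"
```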

Next, we give some practical examples:

function find_ip () { ip addr|grep -Poi "inet ((192\.168\.\d+\.\d+)|(172\.\d+\.\d+\.\d+)|(10\.\d+\.\d+\.\d+))"|grep -Poi "\d+\.\d+\.\d+\.\d+"; }

function find_ip_uniq () { ip addr|grep -Poi "inet ((192\.168\.\d+\.\d+)|(172\.\d+\.\d+\.\d+)|(10\.\d+\.\d+\.\d+))"|grep -Poi "\d+\.\d+\.\d+\.\d+"|grep -v '\.255'|head -n1; }

genpasswd(){ strings /dev/urandom|grep -oP '[[:alnum:]]|[\#\%\@\&\^]'|head -n "${1:-16}"|tr -d '\n';echo;}

Concluding remarks

grep, awk, and sed are the three major text tools on Linux. To a large extent they embody the design philosophy of Linux: small, focused, and composable. The key skill in using a tool like grep is to decompose the target behavior: get the source text, filter the source text, and construct the output.

This article covers only basic usage. The rest is up to your own ingenuity.
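
As a rough illustration of that decomposition, the pipeline below gets a source text, filters it, and shapes the output in three separate steps. The input lines are invented passwd-style records, and treating a shell ending in nologin as a non-login account is an assumption made for this sketch.

```bash
# Step 1: the source text (invented passwd-style records).
passwd_sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync'

login_users=$(printf '%s\n' "$passwd_sample" \
    | grep -v 'nologin$' \
    | awk -F: '{print $1}' \
    | paste -sd, -)
# Step 2 was the grep filter, step 3 the awk and paste formatting.
echo "$login_users"
```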

Posted by Sealr0x on Wed, 15 May 2019 14:35:04 -0700
