Regular expression explanation of Shell script

option	explain
-f	Ignoring case, lowercase letters are converted to uppercase letters for comparison
-b	Ignore spaces before each line
-n	Sort by number
-r	Reverse sort
-u	Equivalent to uniq: duplicate lines are shown only once
-t	Specify the field separator; by default, fields are separated by runs of blanks (spaces or tabs)
-k	Specify sort field
-o <output file>	Write the sorted result to the specified file
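
The options above can be combined. The following is a minimal sketch on made-up `name:score` records (the data and variable names are assumptions for illustration only):

```shell
# Hypothetical sample data: name:score, one record per line
data=$(printf 'carol:9\nalice:100\nbob:10\nalice:100\n')

# -t sets the separator, -k2,2 limits the key to field 2, n compares it numerically
by_score=$(printf '%s\n' "$data" | sort -t: -k2,2n)

# r reverses the order; -u keeps only the first line of each run with an equal key
unique_desc=$(printf '%s\n' "$data" | sort -t: -k2,2nr -u)

printf '%s\n' "$by_score"
printf '%s\n' "$unique_desc"
```

Without `-n` the scores would be compared as text, so 100 would sort before 9.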

1.2 uniq command

The uniq command reports or removes repeated adjacent lines in a text file. Because it only compares neighboring lines, it is generally used in combination with the sort command
```
Format: uniq [option]  parameter
```
Common options:

option	explain
-c	Collapse repeated adjacent lines and prefix each line with the number of times it occurred
-d	Show only lines that are repeated consecutively, one copy per group
-u	Show rows that appear only once
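
A short sketch of the three options on invented input (the word list is an assumption). The input is sorted first because uniq only compares adjacent lines:

```shell
# Hypothetical input with duplicates that are not adjacent
words=$(printf 'b\na\nb\nc\na\nb\n')

# -c prefixes each distinct line with its count (leading padding stripped with sed)
counts=$(printf '%s\n' "$words" | sort | uniq -c | sed 's/^ *//')

# -d keeps only lines that occur more than once
dups=$(printf '%s\n' "$words" | sort | uniq -d)

# -u keeps only lines that occur exactly once
singles=$(printf '%s\n' "$words" | sort | uniq -u)

printf '%s\n' "$counts" "$dups" "$singles"
```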

1.3 tr command

Commonly used to replace, compress, and delete characters from standard input.
```
Format: tr [option] [parameter]
```

Common options

option	explain
-c	Use the complement of character set 1: every character not in character set 1 (including the newline \n) is replaced with character set 2
-d	Delete all characters belonging to character set 1
-s	Squeeze each run of a repeated character into a single occurrence; when both sets are given, character set 1 is first translated to character set 2
-t	Truncate character set 1 to the length of character set 2 before translating; when both sets have the same length the result is the same as without the option

Common parameters

parameter	explain
Character set 1	Specifies the original character set to convert or delete. When performing a conversion operation, you must specify the target character set for the conversion using the parameter "character set 2". However, the parameter "character set 2" is not required for deletion
Character set 2	Specifies the target character set to convert to
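
The options and character sets described above might be used as follows. This is a sketch on made-up strings, not output from a real system:

```shell
# Translate: character set 1 (a-z) is mapped to character set 2 (A-Z)
upper=$(echo 'hello world' | tr 'a-z' 'A-Z')

# -d deletes every character in character set 1; no second set is needed
no_digits=$(echo 'a1b2c3' | tr -d '0-9')

# -s squeezes runs of the listed character into one
squeezed=$(echo 'a    b   c' | tr -s ' ')

# -c takes the complement: everything except lowercase letters and newline becomes _
masked=$(echo 'a,b;c' | tr -c 'a-z\n' '_')

# -c combined with -d deletes everything that is NOT a digit (newline included)
only_digits=$(echo 'tel: 555-0100' | tr -cd '0-9')

printf '%s\n' "$upper" "$no_digits" "$squeezed" "$masked" "$only_digits"
```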

1.4 cut command

Prints the selected parts (bytes, characters, or fields) of each line; the input file itself is not modified
```
Format: cut [option]  parameter
```

Common options

option	explain
-f	Specify which fields to extract; cut uses TAB as the default field delimiter
-d	Specify a field delimiter other than the default TAB
--complement	Exclude the specified fields and print everything else
--output-delimiter	Change the delimiter used in the output
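
As an illustration, the options can be tried on a single passwd-style line. The sketch assumes GNU cut, since the two long options are GNU extensions:

```shell
# A passwd-style record used as sample input
line='root:x:0:0:root:/root:/bin/bash'

# -d sets the delimiter, -f picks the fields
user=$(echo "$line" | cut -d: -f1)
user_shell=$(echo "$line" | cut -d: -f1,7)

# --complement prints every field except the listed ones (GNU cut)
rest=$(echo "$line" | cut -d: -f2-6 --complement)

# --output-delimiter changes the separator in the output only (GNU cut)
pretty=$(echo "$line" | cut -d: -f1,7 --output-delimiter=' -> ')

printf '%s\n' "$user" "$user_shell" "$rest" "$pretty"
```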

1.5 examples

1.5.1 statistics of current host connection status

[root@yxp data]#ss -ant|cut -d " " -f1|sort -n|uniq -c|head -2
      2 ESTAB
     13 LISTEN

1.5.2 count the number of currently connected hosts

[root@yxp opt]#ss -ant|tr -s " "|cut -d" " -f5|cut -d":" -f1|sort|uniq -c|tail -n +3
      3 192.168.59.1
      1 192.168.59.118
      1 Address
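
The same counting pipeline can be tried without a live system. The sketch below uses invented lines standing in for ss output (the two-column layout is an assumption, so the address is field 2 here rather than field 5):

```shell
# Hypothetical "state address:port" lines, padded with a variable number of spaces
sample=$(printf 'ESTAB  192.168.59.1:5000\nLISTEN 0.0.0.0:22\nESTAB  192.168.59.1:5001\nESTAB  192.168.59.118:443\n')

# tr -s squeezes the padding so that cut sees exactly one space between fields;
# the second cut strips the port, then sort | uniq -c counts each address
counted=$(printf '%s\n' "$sample" | tr -s ' ' | cut -d' ' -f2 | cut -d: -f1 | sort | uniq -c)

# Most frequent address: sort the counts in descending numeric order and keep the first
top=$(printf '%s\n' "$counted" | sort -rn | head -1 | tr -s ' ' | cut -d' ' -f3)

echo "$top"
```

Without `tr -s`, each extra space would count as an empty field and the field numbers passed to cut would shift.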

2, Regular expression

2.1 definition of regular expression

A regular expression is also called a regex or regexp.
It uses a pattern string to describe and match a set of strings that follow a rule.
A regular expression is composed of:
- Ordinary characters include upper and lower case letters, numbers, punctuation marks and some other symbols.
- Metacharacters are special characters with special meaning in regular expressions

2.2 common metacharacters (supported tools: find, grep, egrep, sed and awk)

Match character	Express meaning
.	Matches any single character
[ ]	Matches any single character listed in the brackets
[^ ]	Matches any single character not listed in the brackets
\	Escape character; cancels the special meaning of the symbol that follows
^	Matches the start of the line
$	Matches the end of the line
{n}	Match the previous subexpression n times
{n,}	Match the previous subexpression no less than n times
{n,m}	Match the previous subexpression n to m times
[:alnum:]	Match any letters and numbers
[:alpha:]	Matches any letter, uppercase or lowercase
[:lower:]	Lowercase characters a-z
[:upper:]	Uppercase characters A-Z
[:blank:]	Spaces and TAB characters
[:space:]	All white space characters (new lines, spaces, tabs)
[:digit:]	Number 0-9
[:xdigit:]	Hexadecimal digit
[:cntrl:]	Control character
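
One detail about the repetition counts in this table: in basic regular expressions (plain grep) the braces have to be escaped, while extended regular expressions (grep -E) use them bare. A small sketch, assuming GNU grep for the `-o` option:

```shell
# Basic regular expression: braces are escaped as \{ \}
bre=$(echo 'gooogle' | grep -o 'o\{3\}')

# Extended regular expression: braces are written as-is
ere=$(echo 'gooogle' | grep -oE 'o{3}')

# In a basic pattern, unescaped braces are ordinary characters
literal=$(echo 'o{3}' | grep -o 'o{3}')

printf '%s\n' "$bre" "$ere" "$literal"
```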

Example 1: . represents any single character

#Represents any character
[root@yxp data]#echo abc|grep "a.c"
abc
#To match a literal dot, escape it with \
[root@yxp data]#echo abc|grep "a\.c"

#The pattern should be enclosed in '' or ""
[root@yxp data]#echo abc a.c|grep "a\.c"
abc a.c

[root@yxp data]#echo abc adc|grep "a.c"
abc adc

Example 2: [ ] matches any single character listed in the brackets

#[yxp]
[root@yxp opt]#ls |grep "[yxp].txt"
p.txt
x.txt
y.txt
yxp.txt


#[0-9]
[root@yxp opt]#ls |grep "[0-9].txt"
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt

#{a..z}: brace expansion creates a.txt through z.txt
[root@yxp aa]#touch {a..z}.txt
[root@yxp aa]#ls
a.txt  e.txt  i.txt  m.txt  q.txt  u.txt  y.txt
b.txt  f.txt  j.txt  n.txt  r.txt  v.txt  z.txt
c.txt  g.txt  k.txt  o.txt  s.txt  w.txt
d.txt  h.txt  l.txt  p.txt  t.txt  x.txt

#{A..Z}
[root@yxp bb]#touch {A..Z}.txt
[root@yxp bb]#ls
A.txt  E.txt  I.txt  M.txt  Q.txt  U.txt  Y.txt
B.txt  F.txt  J.txt  N.txt  R.txt  V.txt  Z.txt
C.txt  G.txt  K.txt  O.txt  S.txt  W.txt
D.txt  H.txt  L.txt  P.txt  T.txt  X.txt


#[a-d] as a shell glob: matches lowercase a through d and uppercase A through C, but not D (the range follows the locale's collation order)
[root@yxp opt]#ls [a-d].txt
a.txt  A.txt  b.txt  B.txt  c.txt  C.txt  d.txt
##To match only lowercase, filter with grep
[root@yxp opt]#ls |grep '[a-d].txt'
a.txt
b.txt
c.txt
d.txt

#[A-D] as a shell glob: matches uppercase A through D and lowercase b through d, but not a
[root@yxp opt]#ls [A-D].txt
A.txt  b.txt  B.txt  c.txt  C.txt  d.txt  D.txt
##To match only uppercase, filter with grep
[root@yxp opt]#ls |grep '[A-D].txt'
A.txt
B.txt
C.txt
D.txt

Example 3: [^ ] matches any single character that is not listed in the brackets

[root@yxp opt]#ls |grep "[^yxp].txt"
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt
a.txt
...(remaining output omitted)

[root@yxp opt]#echo 12txt|grep "[^az].txt"
12txt

Example 4: [:alnum:] matches any letter or digit

##Note: be sure to put another [] on the outside
[root@yxp opt]#ls |grep '[[:alnum:]].txt'
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
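
The other POSIX classes are used the same way, always inside an outer pair of brackets. A brief sketch on made-up strings, assuming GNU grep for `-o`:

```shell
# Hypothetical log-like text
text='User42 logged in at 09:15'

# [[:digit:]] matches one digit at a time; tr joins the matches into one string
digits=$(echo "$text" | grep -o '[[:digit:]]' | tr -d '\n')

# [[:upper:]] matches uppercase letters only
uppers=$(echo "$text" | grep -o '[[:upper:]]')

# A class can be negated like any other bracket content: [^[:alnum:]]
non_alnum=$(echo 'a-b_c' | grep -o '[^[:alnum:]]' | tr -d '\n')

printf '%s\n' "$digits" "$uppers" "$non_alnum"
```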

Example 5: the dot (.) inside a bracket expression

#Inside [ ] the dot is literal, so this matches rc. and rc0 through rc6
[root@yxp opt]#ls /etc/ |grep 'rc[.0-6]'
rc0.d
rc1.d
rc2.d
rc3.d
rc4.d
rc5.d
rc6.d
rc.d
rc.local

#r..t: the two dots match any two characters
[root@yxp opt]#grep "r..t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

2.3 extended regular expressions

Supported tools: egrep, awk or grep -E and sed -r

qualifier	explain
*	Match the preceding subexpression 0 or more times
.*	Match any characters of any length
?	Match the preceding subexpression 0 or 1 times, that is, it is optional
+	Match the preceding subexpression 1 or more times (like the asterisk, but at least one occurrence is required, >= 1)
{n,m}	Match the preceding subexpression n to m times
{n}	Match the preceding subexpression exactly n times
{n,}	Match the preceding subexpression at least n times, >= n
{,n}	Match the preceding subexpression at most n times, <= n
|	Alternation (logical OR): match the pattern on either side
()	Grouping: treat the expression in parentheses as a single unit
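
To see exactly which strings each qualifier accepts, it helps to put one candidate per line and anchor the pattern with ^ and $. The word list below is made up for illustration; `grep -c` prints the number of matching lines:

```shell
# One candidate word per line, with 0 to 4 letters o
words=$(printf 'ggle\ngogle\ngoogle\ngooogle\ngoooogle\n')

star=$(printf '%s\n' "$words" | grep -Ec '^go*gle$')       # 0 or more o: all five
opt=$(printf '%s\n' "$words" | grep -Ec '^go?gle$')        # 0 or 1 o: ggle, gogle
plus=$(printf '%s\n' "$words" | grep -Ec '^go+gle$')       # 1 or more o: all but ggle
range=$(printf '%s\n' "$words" | grep -Ec '^go{3,5}gle$')  # 3 to 5 o: gooogle, goooogle

echo "$star $opt $plus $range"
```

When several words share one line, as in the echo examples in this section, grep prints the whole line as soon as any single word matches, which hides which word actually matched.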

Example 1: * matches the preceding subexpression 0 or more times

[root@yxp opt]#echo google ggle|grep "go*gle"
google ggle

[root@yxp opt]#echo google ggle gggle|grep "go*gle"
google ggle gggle

Example 2: {n,m} matches the previous subexpression n to m times

[root@yxp opt]#echo goooogle goole gggle|egrep "go{3,5}gle"
goooogle goole gggle

Example 3: {n,} matches the previous subexpression at least n times, >= n

[root@yxp opt]#echo goooogle gooogle gggle|egrep "go{3,}gle"
goooogle gooogle gggle

Example 4: {,n} matches the previous subexpression at most n times, <= n

[root@yxp opt]#echo goooogle gooogle gggle|egrep "go{,3}gle"
goooogle gooogle gggle

Example 5: * matches the preceding subexpression 0 or more times

[root@yxp opt]#echo gggggggggggdadasgle|grep 'g*gle'
gggggggggggdadasgle

Example 6: .* matches any characters of any length

[root@yxp opt]#echo gggggggggggdadasgle|grep '.*gle'
gggggggggggdadasgle

Example 7: ? matches the preceding subexpression 0 or 1 times, that is, it is optional

[root@yxp opt]#echo goole gogle ggle|egrep "go?gle"
goole gogle ggle

Example 8: + is similar to the asterisk, but the preceding character must appear at least once, >= 1

[root@yxp opt]#echo google gogle ggle gooogle|egrep "go+gle"
google gogle ggle gooogle

Example 9: | alternation (logical OR) matches the pattern on either side

[root@yxp opt]#echo 1ee 1abc 2abc|egrep "1|2abc"
1ee 1abc 2abc

Example 10: () string grouping, taking the string in parentheses as a whole.

[root@yxp opt]#echo 1ee 1abc 2abc|egrep "(1|2)abc"
1ee 1abc 2abc
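
Examples 9 and 10 print the same line, so the effect of the parentheses is easier to see with `-o`, which prints only the matched text. A sketch assuming GNU grep (paste joins the matches onto one line):

```shell
line='1ee 1abc 2abc'

# Without grouping the alternatives are "1" and "2abc"
ungrouped=$(echo "$line" | grep -oE '1|2abc' | paste -sd' ' -)

# With grouping the alternatives are "1" and "2", each followed by abc
grouped=$(echo "$line" | grep -oE '(1|2)abc' | paste -sd' ' -)

echo "$ungrouped"
echo "$grouped"
```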

Example 11: extract ip address

#Method 1
[root@yxp opt]#ifconfig ens33|grep "netmask"|grep -o -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"|head -1
192.168.59.102

#Method 2: grouping is used
[root@yxp opt]#ifconfig ens33|grep "netmask"|egrep -o '([0-9]{1,3}\.){3}[0-9]{1,3}'|head -1
192.168.59.102
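
The grouped pattern can be tested without a network interface. The sketch below runs it on a made-up ifconfig-style line and shows why the dot should be escaped (the sample strings are assumptions):

```shell
# Hypothetical line in the style of ifconfig output
sample='inet 192.168.59.102  netmask 255.255.255.0  broadcast 192.168.59.255'

# Three groups of "1-3 digits and a literal dot", then a final 1-3 digits
ip=$(echo "$sample" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -1)

# With an unescaped dot, a plain run of digits is accepted as well
loose=$(echo 'build 1234567' | grep -oE '([0-9]{1,3}.){3}[0-9]{1,3}')

# With the escaped dot the same input does not match (grep exits non-zero, hence || true)
strict=$(echo 'build 1234567' | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' || true)

echo "ip=$ip loose=$loose strict=$strict"
```

Note that the pattern only checks the shape of the address; it does not verify that each number is at most 255.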

2.4 position anchoring

Position qualifier	explain
^	Start-of-line anchor, placed at the leftmost side of the pattern
$	End-of-line anchor, placed at the rightmost side of the pattern
^PATTERN$	Match a line that consists entirely of PATTERN
^$	Empty line
^[[:space:]]*$	Blank line (empty or containing only whitespace)
\< or \b	Start-of-word anchor, placed at the left side of the word pattern (a word is a run of letters, digits, and underscores)
\> or \b	End-of-word anchor, placed at the right side of the word pattern
\<PATTERN\>	Match an entire word

Example 1: $ is the end-of-line anchor, placed at the rightmost side of the pattern

[root@yxp opt]#grep "bash$" /etc/passwd
root:x:0:0:root:/root:/bin/bash
yxp:x:1000:1000:yxp:/home/yxp:/bin/bash

Example 2: ^ is the start-of-line anchor, placed at the leftmost side of the pattern

[root@yxp opt]#grep "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash

Example 3: ^PATTERN$ matches the entire line, so the line must contain only the pattern

[root@yxp opt]#echo root|grep "^root$" 
root

Example 4: \< anchors the pattern to the start of a word

[root@yxp opt]#echo hello-123|grep "\<123"
hello-123

Example 5: \> anchors the pattern to the end of a word

[root@yxp opt]#echo hello-123 222|grep "hello\>"
hello-123 222
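
The anchors can be compared side by side by counting matching lines in a small made-up text. This sketch assumes GNU grep, since `\<` and `\>` are GNU extensions:

```shell
# Hypothetical input: five lines, one empty and one containing only spaces
text=$(printf 'root\nroot user\n\nchroot\n   \n')

whole=$(printf '%s\n' "$text" | grep -c '^root$')           # only the line "root"
starts=$(printf '%s\n' "$text" | grep -c '^root')           # "root" and "root user"
empty=$(printf '%s\n' "$text" | grep -c '^$')               # the truly empty line
blank=$(printf '%s\n' "$text" | grep -c '^[[:space:]]*$')   # empty or whitespace-only
word=$(printf '%s\n' "$text" | grep -c '\<root\>')          # root as a word, not chroot

echo "$whole $starts $empty $blank $word"
```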

Example 6: show non-empty lines that do not start with #

[root@yxp opt]#grep "^[^#]" /etc/fstab 
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=183ca7c7-1989-4f43-9e81-d2676192f5a4 /boot                   xfs     defaults        0 0
/dev/mapper/centos-home /home                   xfs     defaults        0 0
/dev/mapper/centos-swap swap                    swap    defaults        0 0
/dev/sdb1 /mnt xfs defaults 0 0
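
The pattern `^[^#]` only looks at the first character, so indented comments and whitespace-only lines still get through. A possible stricter variant, sketched on an invented config snippet (the content is an assumption):

```shell
# Hypothetical config with an indented comment and a whitespace-only line
conf=$(printf '# main config\n\n  # indented comment\nport=22\n   \nuser=root\n')

# First character is not #: also keeps the indented comment and the line of spaces
simple_count=$(printf '%s\n' "$conf" | grep -c '^[^#]')

# -v inverts the match: drop lines that are blank or whose first non-blank character is #
strict=$(printf '%s\n' "$conf" | grep -Ev '^[[:space:]]*(#|$)')

echo "$simple_count"
printf '%s\n' "$strict"
```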

Posted by delmardata on Tue, 26 Oct 2021 07:04:13 -0700

Programmer Group