Regular expressions in shell scripts explained

Keywords: Linux Operation & Maintenance shell regex

Contents

1, Common pipeline commands

1.1 sort command

1.2 uniq command

1.3 tr command

1.4 cut command

1.5 examples

1.5.1 statistics of current host connection status

1.5.2 count the number of currently connected hosts

2, Regular expression

2.1 definition of regular expression

2.2 common metacharacters (supported tools: find, grep, egrep, sed and awk)

2.3 extended regular expressions

2.4 position anchoring

1, Common pipeline commands

1.1 sort command

  • The sort command sorts the contents of a text file line by line.

    Format: sort [option]  parameter
  • Common options:

Option              Description
-f                  Ignore case: lowercase letters are treated as uppercase for comparison
-b                  Ignore leading blanks on each line
-n                  Sort by numeric value
-r                  Reverse the sort order
-u                  Output only one line for each set of equal lines (similar to piping through uniq)
-t                  Specify the field separator; by default fields are delimited by blanks (spaces and tabs)
-k                  Specify the sort field (key)
-o <output file>    Write the sorted result to the specified file
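
As a rough illustration of how these options combine, the sketch below sorts a few made-up colon-separated records by their third field, numerically and in reverse. The records and the variable names are hypothetical sample data, not taken from a real system.

```shell
# Hypothetical sample data in the form name:shell:uid
data='carol:/bin/bash:1002
alice:/bin/bash:1000
bob:/sbin/nologin:11'

# -t ':' sets the field separator, -k3,3 limits the sort key to field 3,
# -n compares numerically, -r reverses the order (largest first).
sorted=$(printf '%s\n' "$data" | sort -t ':' -k3,3 -n -r)
printf '%s\n' "$sorted"
```

Without -n the third field would be compared as text, so 11 would sort after 1002.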

1.2 uniq command

  • The uniq command reports or removes adjacent repeated lines in a text file. Because it only compares neighboring lines, it is generally used in combination with the sort command.

    Format: uniq [option]  parameter
  • Common options:

    Option    Description
    -c        Prefix each output line with the number of times it occurred
    -d        Show only lines that are repeated consecutively, once per group
    -u        Show only lines that are not repeated
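
The reason uniq is usually paired with sort can be seen in a small sketch. The state names below are hypothetical sample data; the point is only that uniq compares adjacent lines.

```shell
# Hypothetical sample data with repeated lines that are not all adjacent
states='LISTEN
ESTAB
LISTEN
LISTEN
ESTAB
TIME-WAIT'

# Without sort, only the adjacent pair of LISTEN lines is collapsed.
unsorted_count=$(printf '%s\n' "$states" | uniq | wc -l | tr -d ' ')

# sort groups identical lines together so uniq can compare them.
dupes=$(printf '%s\n' "$states" | sort | uniq -d)    # lines that repeat
singles=$(printf '%s\n' "$states" | sort | uniq -u)  # lines that occur once

printf 'lines left after plain uniq: %s\n' "$unsorted_count"
printf 'repeated lines:\n%s\n' "$dupes"
printf 'lines occurring once:\n%s\n' "$singles"
```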

1.3 tr command

  • Commonly used to replace, compress, and delete characters from standard input.

    Format: tr [option] [parameter]
  • Common options

    Option    Description
    -c        Use the complement of character set 1: every character not in set 1 (including the newline \n) is replaced with character set 2
    -d        Delete all characters belonging to character set 1
    -s        Squeeze each run of a repeated character into a single occurrence; with two sets, translate set 1 to set 2 first and then squeeze
    -t        Truncate character set 1 to the length of character set 2 before translating
  • Common parameters

    Parameter          Description
    Character set 1    The source character set to translate or delete. Translation also requires the target set, given as "character set 2"; deletion does not
    Character set 2    The target character set to translate to
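
A minimal sketch of the options above, using made-up input strings. The variable names are only for illustration.

```shell
# Translate: map lowercase letters (set 1) to uppercase letters (set 2)
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# -d: delete every character that belongs to set 1
no_digits=$(printf 'a1b2c3\n' | tr -d '0-9')

# -s: squeeze runs of the same character into a single one
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

# -c with -d: delete everything NOT in set 1 (digits and newline are kept)
only_digits=$(printf 'tel: 555-0100\n' | tr -cd '0-9\n')

printf '%s\n' "$upper" "$no_digits" "$squeezed" "$only_digits"
```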

1.4 cut command

  • Extracts the selected parts (fields) of each line and prints them; the input file itself is not modified

    Format: cut [option]  parameter
  • Common options

    Option                Description
    -f                    Select the fields to extract; TAB is the default field delimiter
    -d                    Use the given character as the field delimiter instead of TAB
    --complement          Output everything except the selected fields
    --output-delimiter    Change the delimiter used in the output
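
The options can be tried on a single passwd-style line. This sketch assumes GNU cut, since --complement and --output-delimiter are GNU extensions and may be missing elsewhere.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# -d sets the input delimiter, -f selects fields 1 and 7
name_shell=$(printf '%s\n' "$line" | cut -d ':' -f 1,7)

# --complement (GNU cut) prints every field except the selected one
without_pw=$(printf '%s\n' "$line" | cut -d ':' -f 2 --complement)

# --output-delimiter (GNU cut) changes the separator used in the output
spaced=$(printf '%s\n' "$line" | cut -d ':' -f 1,3 --output-delimiter=' ')

printf '%s\n' "$name_shell" "$without_pw" "$spaced"
```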

1.5 examples

1.5.1 statistics of current host connection status

[root@yxp data]#ss -ant|cut -d " " -f1|sort -n|uniq -c|head -2
      2 ESTAB
     13 LISTEN

 

1.5.2 count the number of currently connected hosts

[root@yxp opt]#ss -ant|tr -s " "|cut -d" " -f5|cut -d":" -f1|sort|uniq -c|tail -n +3
      3 192.168.59.1
      1 192.168.59.118
      1 Address
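
The "Address" row in the output above most likely comes from the header line of ss, whose fifth squeezed field is "Address:Port". The sketch below replays the pipeline on canned text so it can be followed step by step. The sample lines are hypothetical and only imitate the layout of `ss -ant`. Filtering on ESTAB first is one possible way to skip the header and listening sockets.

```shell
# Hypothetical text imitating the layout of `ss -ant`
sample='State  Recv-Q Send-Q Local Address:Port   Peer Address:Port
LISTEN 0      128    *:22                 *:*
ESTAB  0      0      192.168.59.102:22    192.168.59.1:51234
ESTAB  0      36     192.168.59.102:22    192.168.59.1:51240
ESTAB  0      0      192.168.59.102:22    192.168.59.118:40022'

# What the header line contributes when it is not filtered out
header_field=$(printf '%s\n' "$sample" | head -n 1 |
  tr -s ' ' | cut -d ' ' -f 5 | cut -d ':' -f 1)

# Keep established connections, squeeze spaces, take the peer column,
# strip the port, then count connections per peer (busiest first).
peers=$(printf '%s\n' "$sample" |
  grep '^ESTAB' |
  tr -s ' ' |
  cut -d ' ' -f 5 |
  cut -d ':' -f 1 |
  sort | uniq -c | sort -rn)

printf 'header yields: %s\n' "$header_field"
printf '%s\n' "$peers"
```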

 

2, Regular expression

2.1 definition of regular expression

  • A regular expression is also called a regex or regexp

  • It is a string that describes and matches a set of strings that follow a given rule

  • A regular expression is made up of:

    • Ordinary characters: upper- and lowercase letters, digits, punctuation marks, and some other symbols.

    • Metacharacters: characters that carry a special meaning in regular expressions.

2.2 common metacharacters (supported tools: find, grep, egrep, sed and awk)

Metacharacter    Meaning
.                Matches any single character
[ ]              Matches any one of the characters inside the brackets
[^ ]             Matches any one character that is not listed inside the brackets
\                Escape character; removes the special meaning of the following symbol
^                Matches the start of the line
$                Matches the end of the line
{n}              Matches the previous subexpression exactly n times (written \{n\} in basic syntax such as plain grep)
{n,}             Matches the previous subexpression at least n times
{n,m}            Matches the previous subexpression n to m times
[:alnum:]        Any letter or digit
[:alpha:]        Any letter, uppercase or lowercase
[:lower:]        Lowercase characters a-z
[:upper:]        Uppercase characters A-Z
[:blank:]        Spaces and TAB characters
[:space:]        All whitespace characters (newlines, spaces, tabs)
[:digit:]        Digits 0-9
[:xdigit:]       Hexadecimal digits
[:cntrl:]        Control characters
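
A short sketch of the interval and character-class entries, using made-up input. It shows that plain grep (basic syntax) needs the braces escaped, while grep -E does not, and that a POSIX class is written inside a bracket expression.

```shell
# Basic syntax (plain grep): interval braces are escaped
bre=$(printf 'ab\naab\naaab\n' | grep '^a\{2\}b$')

# Extended syntax (grep -E): braces are written without backslashes
ere=$(printf 'ab\naab\naaab\n' | grep -E '^a{2,}b$')

# A POSIX class only works inside a bracket expression: [[:digit:]]
digits=$(printf 'abc\n123\na1\n' | grep '^[[:digit:]]*$')

printf 'exactly two a: %s\n' "$bre"
printf 'two or more a:\n%s\n' "$ere"
printf 'digits only: %s\n' "$digits"
```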

Example 1: . matches any single character

#. matches any single character
[root@yxp data]#echo abc|grep "a.c"
abc
#A literal dot must be escaped with \ (no output here, because abc contains no literal dot)
[root@yxp data]#echo abc|grep "a\.c"

#Good practice: quote the pattern with '' or ""
[root@yxp data]#echo abc a.c|grep "a\.c"
abc a.c

[root@yxp data]#echo abc adc|grep "a.c"
abc adc

 

 

 

Example 2: [] matches any one of the characters inside the brackets

#[yxp]
[root@yxp opt]#ls |grep "[yxp].txt"
p.txt
x.txt
y.txt
yxp.txt


#[0-9]
[root@yxp opt]#ls |grep "[0-9].txt"
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt

#{a..z}
[root@yxp aa]#touch {a..z}.txt
[root@yxp aa]#ls
a.txt  e.txt  i.txt  m.txt  q.txt  u.txt  y.txt
b.txt  f.txt  j.txt  n.txt  r.txt  v.txt  z.txt
c.txt  g.txt  k.txt  o.txt  s.txt  w.txt
d.txt  h.txt  l.txt  p.txt  t.txt  x.txt

#{A..Z}
[root@yxp bb]#touch {A..Z}.txt
[root@yxp bb]#ls
A.txt  E.txt  I.txt  M.txt  Q.txt  U.txt  Y.txt
B.txt  F.txt  J.txt  N.txt  R.txt  V.txt  Z.txt
C.txt  G.txt  K.txt  O.txt  S.txt  W.txt
D.txt  H.txt  L.txt  P.txt  T.txt  X.txt


#[a-d] as a shell glob: matches lowercase a to d and also uppercase A to C, but not D (the range follows the locale's collation order)
[root@yxp opt]#ls [a-d].txt
a.txt  A.txt  b.txt  B.txt  c.txt  C.txt  d.txt
##To match only lowercase letters, use grep
[root@yxp opt]#ls |grep '[a-d].txt'
a.txt
b.txt
c.txt
d.txt

#[A-D] as a shell glob: matches uppercase A to D and also lowercase b to d, but not a
[root@yxp opt]#ls [A-D].txt
A.txt  b.txt  B.txt  c.txt  C.txt  d.txt  D.txt
##To match only uppercase letters with [A-D], use grep
[root@yxp opt]#ls |grep '[A-D].txt'
A.txt
B.txt
C.txt
D.txt
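
The mixed-case results of the ls globs above are most likely caused by the shell expanding ranges in the locale's collation order. One way to get plain ASCII ranges is to switch to the C locale. The sketch below uses a `case` statement, which applies the same pattern syntax as filename globbing, so no files are needed. It assumes a shell such as bash that applies a changed LC_ALL to pattern matching right away.

```shell
# Force the C locale so that ranges are compared by byte value
LC_ALL=C
export LC_ALL

matched=''
for name in a A c C d D; do
  case $name in
    [a-d]) matched="$matched$name " ;;  # same pattern syntax as a filename glob
  esac
done
printf 'matched in the C locale: %s\n' "$matched"
```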

 

 

 

 

Example 3: [^] matches any one character that is not listed inside the brackets

[root@yxp opt]#ls |grep "[^yxp].txt"
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt
7.txt
8.txt
9.txt
a.txt
......... (remaining output omitted)

[root@yxp opt]#echo 12txt|grep "[^az].txt"
12txt

 

 

Example 4: [:alnum:] matches any letter or digit

##Note: the class must be wrapped in an outer pair of [], as in [[:alnum:]]
[root@yxp opt]#ls |grep '[[:alnum:]].txt'
0.txt
1.txt
2.txt
3.txt
4.txt
5.txt
6.txt

 

Example 5: the . metacharacter

#Matches rc. and rc0 through rc6 (inside [], the dot is a literal dot)
[root@yxp opt]#ls /etc/ |grep 'rc[.0-6]'
rc0.d
rc1.d
rc2.d
rc3.d
rc4.d
rc5.d
rc6.d
rc.d
rc.local

#r..t: each dot matches any one character, so any two characters may appear between r and t
[root@yxp opt]#grep "r..t" /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin

 

 

 

2.3 extended regular expressions

  • Supported tools: egrep, awk or grep -E and sed -r

Qualifier    Description
*            Matches the preceding subexpression 0 or more times
.*           Any characters, of any length
?            Matches the preceding subexpression 0 or 1 times, that is, it is optional
+            Matches the preceding subexpression 1 or more times (>= 1); unlike *, it must appear at least once
{n,m}        Matches the preceding subexpression n to m times
{n}          Matches the preceding subexpression exactly n times
{n,}         Matches the preceding subexpression at least n times (>= n)
{,n}         Matches the preceding subexpression at most n times (<= n)
|            Alternation (logical OR) between patterns
()           Grouping: the pattern inside the parentheses is treated as a single unit
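
Because grep prints the whole line when any part of it matches, it can be hard to see what a qualifier actually matched. The sketch below adds -o, which prints only the matched parts. The `matches` helper and the word list are hypothetical and exist only for this illustration.

```shell
words='ggle gogle google gooogle goooogle'

# Hypothetical helper: print every match of an extended regex in $words,
# joined onto one line. -o prints only the matched text, -E enables ERE.
matches() {
  printf '%s\n' "$words" | grep -oE "$1" | paste -sd ' ' -
}

star=$(matches 'go*gle')      # zero or more o
plus=$(matches 'go+gle')      # one or more o
opt=$(matches 'go?gle')       # zero or one o
range=$(matches 'go{2,3}gle') # two to three o

printf '*     : %s\n' "$star"
printf '+     : %s\n' "$plus"
printf '?     : %s\n' "$opt"
printf '{2,3} : %s\n' "$range"
```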

Example 1: * matches the preceding subexpression 0 or more times

[root@yxp opt]#echo google ggle|grep "go*gle"
google ggle

[root@yxp opt]#echo google ggle gggle|grep "go*gle"
google ggle gggle

 

Example 2: {n,m} matches the previous subexpression n to m times

[root@yxp opt]#echo goooogle goole gggle|egrep "go{3,5}gle"
goooogle goole gggle

 

Example 3: {n,} matches the previous subexpression at least n times (>= n)

[root@yxp opt]#echo goooogle gooogle gggle|egrep "go{3,}gle"
goooogle gooogle gggle

 

Example 4: {,n} matches the previous subexpression at most n times (<= n)

[root@yxp opt]#echo goooogle gooogle gggle|egrep "go{,3}gle"
goooogle gooogle gggle

 

Example 5: * matches the preceding subexpression 0 or more times

[root@yxp opt]#echo gggggggggggdadasgle|grep 'g*gle'
gggggggggggdadasgle

 

Example 6: .* matches any characters, of any length

[root@yxp opt]#echo gggggggggggdadasgle|grep '.*gle'
gggggggggggdadasgle

 

Example 7: ? matches the preceding subexpression 0 or 1 times, that is, it is optional

[root@yxp opt]#echo goole gogle ggle|egrep "go?gle"
goole gogle ggle

 

Example 8: + is similar to the asterisk, but the preceding character must appear at least once (>= 1)

[root@yxp opt]#echo google gogle ggle gooogle|egrep "go+gle"
google gogle ggle gooogle

 

Example 9: | specifies alternative patterns (logical OR)

[root@yxp opt]#echo 1ee 1abc 2abc|egrep "1|2abc"
1ee 1abc 2abc

 

Example 10: () groups a pattern, so the part in parentheses is treated as a single unit

[root@yxp opt]#echo 1ee 1abc 2abc|egrep "(1|2)abc"
1ee 1abc 2abc
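
Examples 9 and 10 print the same line, so the effect of the parentheses is not visible in their output. A small sketch with -o, which prints only the matched parts, makes the difference explicit. The variable names are only for illustration.

```shell
text='1ee 1abc 2abc'

# Without parentheses, | splits the whole pattern: "1" OR "2abc"
loose=$(printf '%s\n' "$text" | grep -oE '1|2abc')

# With parentheses, only the digit alternates: "1abc" OR "2abc"
grouped=$(printf '%s\n' "$text" | grep -oE '(1|2)abc')

printf 'without grouping:\n%s\n' "$loose"
printf 'with grouping:\n%s\n' "$grouped"
```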

 

Example 11: extract ip address

#Method 1
[root@yxp opt]#ifconfig ens33|grep "netmask"|grep -o -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"|head -1
192.168.59.102

#Method 2: grouping is used
[root@yxp opt]#ifconfig ens33|grep "netmask"|egrep -o '([0-9]{1,3}\.){3}[0-9]{1,3}'|head -1
192.168.59.102
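
When the grouped form is used, the dot inside the group needs a backslash. An unescaped dot matches any character, so a plain run of digits can also pass as an "address". The sketch below compares both spellings on a made-up input line.

```shell
# Hypothetical input: one real address and one plain digit string
sample='addr 192.168.59.102 serial 1234567'

# Unescaped dot: "." matches any character, including a digit
unescaped=$(printf '%s\n' "$sample" | grep -oE '([0-9]{1,3}.){3}[0-9]{1,3}')

# Escaped dot: only a literal "." separates the groups
escaped=$(printf '%s\n' "$sample" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}')

printf 'unescaped dot:\n%s\n' "$unescaped"
printf 'escaped dot:\n%s\n' "$escaped"
```

Note that neither pattern checks that each number is at most 255; both only check the shape of the string.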

 

 

 

2.4 position anchoring

Anchor            Description
^                 Start-of-line anchor, placed at the leftmost side of the pattern
$                 End-of-line anchor, placed at the rightmost side of the pattern
^PATTERN$         The pattern must match the entire line
^$                Empty line
^[[:space:]]*$    Blank line (empty or containing only whitespace)
\< or \b          Start-of-word anchor, placed on the left side of the word pattern (a word is a run of letters, digits, and underscores)
\> or \b          End-of-word anchor, placed on the right side of the word pattern
\<PATTERN\>       Matches an entire word

Example 1: $ is the end-of-line anchor, placed at the rightmost side of the pattern

[root@yxp opt]#grep "bash$" /etc/passwd
root:x:0:0:root:/root:/bin/bash
yxp:x:1000:1000:yxp:/home/yxp:/bin/bash

 

Example 2: ^ is the start-of-line anchor, placed at the leftmost side of the pattern

[root@yxp opt]#grep "^root" /etc/passwd
root:x:0:0:root:/root:/bin/bash

 

Example 3: ^PATTERN$ matches only when the pattern makes up the entire line

[root@yxp opt]#echo root|grep "^root$" 
root

 

Example 4: \< anchors the match to the start of a word

[root@yxp opt]#echo hello-123|grep "\<123"
hello-123

 

Example 5: \> anchors the match to the end of a word

[root@yxp opt]#echo hello-123 222|grep "hello\>"
hello-123 222
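
The anchors in this section can be exercised together on a few made-up lines. This is a minimal sketch that assumes GNU grep, because \< and \> are GNU extensions. It uses -c to count matching lines and -o to print only the matched text.

```shell
# Hypothetical input: five lines, one empty and one holding only spaces
lines=$(printf 'root\nroot:x\n\n   \n# note\n')

# ^PATTERN$ must match the entire line, so "root:x" is not counted
whole=$(printf '%s\n' "$lines" | grep -c '^root$')

# ^[[:space:]]*$ matches empty lines and whitespace-only lines
blank=$(printf '%s\n' "$lines" | grep -c '^[[:space:]]*$')

# \<hello\> matches hello only as a whole word: "hello-123" qualifies
# (the hyphen is not a word character), "hello123" does not
word_hits=$(printf 'hello-123 hello123\n' | grep -o '\<hello\>' | wc -l | tr -d ' ')

printf 'whole-line matches: %s\n' "$whole"
printf 'blank lines: %s\n' "$blank"
printf 'whole-word matches: %s\n' "$word_hits"
```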

 

Example 6: show only non-empty lines that do not start with #

[root@yxp opt]#grep "^[^#]" /etc/fstab 
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=183ca7c7-1989-4f43-9e81-d2676192f5a4 /boot                   xfs     defaults        0 0
/dev/mapper/centos-home /home                   xfs     defaults        0 0
/dev/mapper/centos-swap swap                    swap    defaults        0 0
/dev/sdb1 /mnt xfs defaults 0 0
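
The pattern "^[^#]" keeps any line whose first character is not #, which includes whitespace-only lines and comments that are indented. A possible stricter variant is sketched below on hypothetical configuration text. It inverts the match with -v and drops lines that are blank or whose first non-blank character is #.

```shell
# Hypothetical configuration text standing in for a file such as /etc/fstab
conf=$(printf '# header comment\n\n   \n  # indented comment\n/dev/sdb1 /mnt xfs defaults 0 0\n')

# The pattern from the example keeps whitespace-only and indented-comment lines
loose_count=$(printf '%s\n' "$conf" | grep -c '^[^#]')

# -v inverts the match: drop blank lines and lines whose first
# non-blank character is #
kept=$(printf '%s\n' "$conf" | grep -Ev '^[[:space:]]*(#|$)')

printf 'lines kept by ^[^#]: %s\n' "$loose_count"
printf 'lines kept by the stricter filter:\n%s\n' "$kept"
```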

 

Posted by delmardata on Tue, 26 Oct 2021 07:04:13 -0700