Regular expressions in Shell scripts

Keywords: Linux Google shell vim

Contents
Definition of Regular Expression
2. Extended Regular Expression Metacharacters
3. Text Processor

Definition of Regular Expression

Regular expressions are often abbreviated as regex, regexp, or RE in code. A regular expression is a single string that describes and matches a set of strings conforming to a syntax rule. Simply put, it is a way of matching strings: with a few special symbols you can quickly find, delete, or replace particular strings.
Regular expressions are text patterns made up of ordinary characters and metacharacters. A pattern describes one or more strings to match when searching text, and the regular expression acts as a template that is compared against the string being searched. Ordinary characters include uppercase and lowercase letters, digits, punctuation, and some other symbols. Metacharacters are characters with a special meaning in regular expressions; they can specify how the preceding character (the character immediately before the metacharacter) may appear in the target text.
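
As a small illustration of the difference between an ordinary character and a metacharacter, the sketch below runs grep on three made-up lines. The sample text and the variable names are illustrative assumptions and are not part of the test file used later. An unescaped '.' matches any single character, while '\.' matches only a literal dot.

```shell
# Hypothetical sample input: three short lines
sample='a.c
abc
a+c'

# '.' is a metacharacter that matches any single character,
# so all three lines match the pattern a.c
any_char=$(printf '%s\n' "$sample" | grep -c 'a.c')

# '\.' is an ordinary (literal) dot, so only the line "a.c" matches
literal_dot=$(printf '%s\n' "$sample" | grep -c 'a\.c')

printf 'unescaped dot: %s lines, escaped dot: %s line\n' "$any_char" "$literal_dot"
```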

1. Basic Regular Expressions

Regular expression syntax is divided into basic and extended regular expressions according to strictness and functionality. Basic regular expressions are the most fundamental part of the commonly used syntax. Among the common file processing tools on Linux systems, grep and sed support basic regular expressions, while egrep and awk support extended regular expressions.

Prepare a test file named test.txt in advance, which is as follows:

[root@centos01 ~]# vim test.txt
he was short and fat.
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
The year ahead will test our political establishment to the limit.
PI=3.14148223023840-2382924893980--2383892948
a wood cross!
Actions speak louder than words

#wooood #
#woooood #
AxyzxyzxyzxyzxyzC
I bet this place is really spooky late at night!
Misfortunes never come alone/single.
I shouldn't have lett so tast.

1) Examples of basic regular expressions:

[root@centos01 ~]# grep -n 'the' test.txt <!--Find a specific string; -n displays line numbers-->
4:the tongue is boneless but it breaks bones.12!
5:google is the best tools for search keyword.
6:The year ahead will test our political establishment to the limit.
[root@centos01 ~]# grep -in 'the' test.txt <!--Find a specific string; -i ignores case, -n displays line numbers-->
3:The home of Football on BBC Sport online.
4:the tongue is boneless but it breaks bones.12!
5:google is the best tools for search keyword.
6:The year ahead will test our political establishment to the limit.
[root@centos01 ~]# grep -vn 'the' test.txt <!--Find lines that do not contain the string; -v inverts the match-->
1:he was short and fat.
2:He was wearing a blue polo shirt with black pants.
3:The home of Football on BBC Sport online.
7:PI=3.14148223023840-2382924893980--2383892948
8:a wood cross!
9:Actions speak louder than words
10:
11:
12:#wooood #
13:#woooood #
14:AxyzxyzxyzxyzxyzC
15:I bet this place is really spooky late at night!
16:Misfortunes never come alone/single.
17:I shouldn't have lett so tast.

2) Use brackets '[]' with grep to match a set of characters

[root@centos01 ~]# grep -n 'sh[io]rt' test.txt <!--Use brackets to match a set of characters.
However many characters '[]' contains, it matches exactly one character,
so '[io]' matches either 'i' or 'o'-->
1:he was short and fat.
2:He was wearing a blue polo shirt with black pants.
[root@centos01 ~]# grep -n 'oo' test.txt <!--Find a repeated character ('oo')-->
3:The home of Football on BBC Sport online.
5:google is the best tools for search keyword.
8:a wood cross!
12:#wooood #
13:#woooood #
15:I bet this place is really spooky late at night!
[root@centos01 ~]# grep -n '[^w]oo' test.txt <!--Find 'oo' that is not preceded by 'w',
using the negated set '[^]'-->
3:The home of Football on BBC Sport online.
5:google is the best tools for search keyword.
12:#wooood #
13:#woooood #
15:I bet this place is really spooky late at night!
[root@centos01 ~]# grep -n '[^a-z]oo' test.txt <!--Find 'oo' that is not preceded by a lowercase letter-->
3:The home of Football on BBC Sport online.
[root@centos01 ~]# grep -n '[0-9]' test.txt <!--Find lines containing digits-->
4:the tongue is boneless but it breaks bones.12!
7:PI=3.14148223023840-2382924893980--2383892948
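
The bracket rules above can be sketched on a few made-up words. The word list and variable names are assumptions chosen for illustration. The point is that both '[io]' and '[^io]' consume exactly one character, and that the POSIX class '[[:digit:]]' can be used in place of '[0-9]'.

```shell
# Hypothetical sample words, one per line
words='short
shirt
shart
sh1rt'

# [io] matches exactly one character that is either i or o
in_set=$(printf '%s\n' "$words" | grep 'sh[io]rt')

# [^io] still matches exactly one character, but one that is neither i nor o
not_in_set=$(printf '%s\n' "$words" | grep 'sh[^io]rt')

# [[:digit:]] is the POSIX character class equivalent to [0-9]
with_digit=$(printf '%s\n' "$words" | grep 'sh[[:digit:]]rt')

printf 'in set:\n%s\nnot in set:\n%s\nwith digit:\n%s\n' "$in_set" "$not_in_set" "$with_digit"
```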

3) Use grep to match the beginning '^' and end '$' of a line

[root@centos01 ~]# grep -n '^the' test.txt <!--Find lines starting with the string 'the'-->
4:the tongue is boneless but it breaks bones.12!
[root@centos01 ~]# grep -n '^[a-z]' test.txt <!--Find lines starting with a lowercase letter-->
1:he was short and fat.
4:the tongue is boneless but it breaks bones.12!
5:google is the best tools for search keyword.
8:a wood cross!
[root@centos01 ~]# grep -n '^[A-Z]' test.txt <!--Find lines starting with an uppercase letter-->
2:He was wearing a blue polo shirt with black pants.
3:The home of Football on BBC Sport online.
6:The year ahead will test our political establishment to the limit.
7:PI=3.14148223023840-2382924893980--2383892948
9:Actions speak louder than words
14:AxyzxyzxyzxyzxyzC
15:I bet this place is really spooky late at night!
16:Misfortunes never come alone/single.
17:I shouldn't have lett so tast.
[root@centos01 ~]# grep -n '^[^a-zA-Z]' test.txt <!--Find lines that do not start with a letter-->
12:#wooood #
13:#woooood #
[root@centos01 ~]# grep -n 'w..d' test.txt <!--'.' matches any single character: find 'w' and 'd' with exactly two characters between them-->
5:google is the best tools for search keyword.
8:a wood cross!
9:Actions speak louder than words
[root@centos01 ~]# grep -n 'ooo*' test.txt <!--'*' repeats the preceding character zero or more times: find strings with two or more consecutive 'o's-->
3:The home of Football on BBC Sport online.
5:google is the best tools for search keyword.
8:a wood cross!
11:#woood #
13:#woooooood #
19:I bet this place is really spooky late at night!
[root@centos01 ~]# grep -n 'woo*d' test.txt <!--Find strings that start with 'w', end with 'd', and have at least one 'o' in between-->
8:a wood cross!
11:#woood #
13:#woooooood #
[root@centos01 ~]# grep -n '[0-9][0-9]*' test.txt <!--Find lines containing one or more digits-->
4:the tongue is boneless but it breaks bones.12!
7:PI=3.141592653589793238462643383249901429
[root@centos01 ~]# grep -n 'o\{2\}' test.txt <!--Find two consecutive 'o's with the interval '\{\}'; in basic regular expressions the braces must be escaped-->
3:The home of Football on BBC Sport online.
5:google is the best tools for search keyword.
8:a wood cross!
11:#woood #
13:#woooooood #
19:I bet this place is really spooky late at night!
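
To make the repetition operators easier to compare, here is a minimal sketch that counts matching lines in a made-up word list. The data and the helper function `count` are assumptions introduced only for this illustration. Because each pattern is bounded by 'w' and 'd', the interval applies to the whole run of 'o's rather than to any two 'o's in the line.

```shell
# Hypothetical sample data: "w" and "d" with zero to four o's in between
data='wd
wod
wood
woood
wooood'

# Helper (illustrative): count the lines matching a basic regular expression
count() { printf '%s\n' "$data" | grep -c "$1"; }

zero_or_more=$(count 'wo*d')        # o* allows zero o's, so "wd" matches too
one_or_more=$(count 'woo*d')        # oo* requires at least one o
exactly_two=$(count 'wo\{2\}d')     # only "wood"
two_or_more=$(count 'wo\{2,\}d')    # "wood", "woood", "wooood"
two_to_three=$(count 'wo\{2,3\}d')  # "wood", "woood"

printf '%s %s %s %s %s\n' "$zero_or_more" "$one_or_more" "$exactly_two" "$two_or_more" "$two_to_three"
```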

2. Metacharacter Summary

2. Extended Regular Expression Metacharacters
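
As a hedged sketch of the most common extended metacharacters, the example below uses `grep -E` (equivalent to egrep) on made-up data. The sample lines and the helper `count` are assumptions for illustration only. In extended syntax, '+', '?', '|', and '()' are used without backslashes.

```shell
# Hypothetical sample data
data='wd
wod
wood
he
the
she
AxyzxyzC'

# Helper (illustrative): count the lines matching an extended regular expression
count() { printf '%s\n' "$data" | grep -E -c "$1"; }

plus=$(count 'wo+d')           # + : one or more o -> "wod", "wood"
optional=$(count 'wo?d')       # ? : zero or one o -> "wd", "wod"
either=$(count '^(he|she)$')   # | : alternation -> "he", "she"
group=$(count 'A(xyz)+C')      # (): repeat a whole group -> "AxyzxyzC"

printf '%s %s %s %s\n' "$plus" "$optional" "$either" "$group"
```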

3. Text Processor

Linux/UNIX systems have many text processors and text editors, including the vim editor and grep. grep, sed, and awk are the text processing tools most commonly used in shell programming and are known as the three swordsmen of shell programming.

1. sed tool

sed (Stream EDitor) is a powerful yet simple text parsing and transformation tool. It reads text, edits the content according to specified conditions (delete, replace, add, move, and so on), and finally outputs all lines or only the processed lines. sed can also perform quite complex text processing without interaction and is widely used in shell scripts to accomplish a variety of automated tasks.

sed's workflow consists of three steps: read, execute, and display:

  • Read: sed reads one line from the input stream (file, pipe, or standard input) and stores it in a temporary buffer, known as the pattern space.
  • Execute: By default, all sed commands are executed in order on the pattern space. Unless a line address is specified, the commands are applied to every line in turn.
  • Display: The modified content is sent to the output stream. After the data is sent, the pattern space is emptied. This process repeats until all of the file's content has been processed.
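
The read, execute, and display cycle can be observed with a minimal sketch on two made-up lines (the input and variable names are assumptions). Without -n, the display step prints the pattern space automatically at the end of each cycle, so an explicit p produces every line twice. With -n, only the explicit p output remains.

```shell
# Hypothetical two-line input
input='one
two'

# No -n: 'p' prints the pattern space, then the display step prints it again
doubled=$(printf '%s\n' "$input" | sed 'p')

# -n suppresses the automatic display step, so each line appears once
single=$(printf '%s\n' "$input" | sed -n 'p')

# With a line address, the command runs only on line 2
addressed=$(printf '%s\n' "$input" | sed -n '2p')

printf 'doubled:\n%s\nsingle:\n%s\naddressed:\n%s\n' "$doubled" "$single" "$addressed"
```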

2. Common usage of sed command

sed [options] 'action' file
sed [options] -f scriptfile file

Common sed command options include the following:

  • -e or --expression=: Process the input text file with the specified command or script.
  • -f or --file=: Process the input text file with the specified script file.
  • -h or --help: Display help.
  • -n, --quiet, or --silent: Suppress automatic printing, so that only explicitly printed results are displayed.
  • -i: Edit the text file in place.

An action specifies what to do with the file, that is, the sed command itself. It usually takes the form '[n1[,n2]]action'. n1 and n2 are optional line numbers that select the lines to operate on; for example, to operate on lines 5 through 20, write '5,20action'. Common actions include the following:
  • a: Append, adding the specified content as a new line below the current line.
  • c: Change, replacing the selected lines with the specified content.
  • d: Delete the selected lines.
  • i: Insert, adding the specified content as a new line above the selected line.
  • p: Print. If lines are specified, those lines are printed; if no lines are specified, all content is printed. It is often used with the '-n' option. (To display non-printing characters in an escaped form, sed provides the separate l command.)
  • s: Substitute, replacing the specified string.
  • y: Transliterate, converting characters one for one.
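
The actions a, i, c, and y are not covered by the examples that use test.txt, so here is a minimal sketch of them on made-up input (the text and variable names are assumptions). It uses the portable form in which the action letter is followed by a backslash and the new text on the next line.

```shell
# Hypothetical three-line input
input='first
second
third'

# a: append a new line below line 2
appended=$(printf '%s\n' "$input" | sed '2a\
added below')

# i: insert a new line above line 2
inserted=$(printf '%s\n' "$input" | sed '2i\
added above')

# c: replace line 2 with new content
changed=$(printf '%s\n' "$input" | sed '2c\
replaced')

# y: convert characters one for one (a->x, b->y, c->z)
converted=$(printf 'aabbcc\n' | sed 'y/abc/xyz/')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$appended" "$inserted" "$changed" "$converted"
```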

3. Example usage

1) Output lines that match a condition (p prints the lines)

[root@centos01 ~]# sed -n '3p' test.txt <!--Output the third line-->
The home of Football on BBC Sport online.
[root@centos01 ~]# sed -n '3,5p' test.txt <!--Output lines 3 to 5-->
The home of Football on BBC Sport online.
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
[root@centos01 ~]# sed -n 'p;n' test.txt <!--Output all odd lines-->
he was short and fat.
The home of Football on BBC Sport online.
google is the best tools for search keyword.
PI=3.141592653589793238462643383249901429
Actions speak louder than words
#woood #
#woooooood #

I bet this place is really spooky late at night!
I shouldn't have lett so tast.
[root@centos01 ~]# sed -n 'n;p' test.txt <!--Output all even lines-->
He was wearing a blue polo shirt with black pants.
the tongue is boneless but it breaks bones.12!
The year ahead will test our political establishment to the limit.
a wood cross!

AxyzxyzxyzxyzC

Misfortunes never come alone/single.
[root@centos01 ~]# sed -n '1,5{p;n}' test.txt <!--Output the odd lines between lines 1 and 5-->
he was short and fat.
The home of Football on BBC Sport online.
google is the best tools for search keyword.

[root@centos01 ~]# sed -n '10,${n;p}' test.txt <!--Output every second line from line 10 to the end of the file, counting line 10 as the first (that is, lines 11, 13, ...)-->
#woood #
#woooooood #

I bet this place is really spooky late at night!
I shouldn't have lett so tast.

2) Combine the sed command with regular expressions

[root@centos01 ~]# sed -n '/the/p' test.txt <!--Output lines containing 'the'-->
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
The year ahead will test our political establishment to the limit.
[root@centos01 ~]# sed -n '4,/the/p' test.txt <!--Output from line 4 through the next line that contains 'the'-->
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
[root@centos01 ~]# sed -n '/the/=' test.txt <!--Output the line numbers of lines containing 'the';
the equal sign (=) outputs the line number-->
4
5
6
[root@centos01 ~]# sed -n '/^PI/p' test.txt <!--Output lines starting with PI-->
PI=3.141592653589793238462643383249901429
[root@centos01 ~]# sed -n '/\<wood\>/p' test.txt <!--Output lines containing the word wood;
\< and \> represent word boundaries-->
a wood cross!

3) Delete lines that match a condition (d)

[root@centos01 ~]# nl test.txt | sed '3d' <!--Delete line 3-->
     1  he was short and fat.
     2  He was wearing a blue polo shirt with black pants.
     4  the tongue is boneless but it breaks bones.12!
     5  google is the best tools for search keyword.
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     8  a wood cross!
     9  Actions speak louder than words
    10  
    11  #woood #
    12  
    13  #woooooood #
    14  
    15  
    16  AxyzxyzxyzxyzC
    17  
    18  
    19  I bet this place is really spooky late at night!
    20  Misfortunes never come alone/single.
    21  I shouldn't have lett so tast.
[root@centos01 ~]# nl test.txt | sed '3,5d' <!--Delete lines 3 to 5-->
     1  he was short and fat.
     2  He was wearing a blue polo shirt with black pants.
     6  The year ahead will test our political establishment to the limit.
     7  PI=3.141592653589793238462643383249901429
     8  a wood cross!
     9  Actions speak louder than words
    10  
    11  #woood #
    12  
    13  #woooooood #
    14  
    15  
    16  AxyzxyzxyzxyzC
    17  
    18  
    19  I bet this place is really spooky late at night!
    20  Misfortunes never come alone/single.
    21  I shouldn't have lett so tast.
[root@centos01 ~]# sed '/^[a-z]/d' test.txt <!--Delete lines starting with a lowercase letter-->
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
The year ahead will test our political establishment to the limit.
PI=3.141592653589793238462643383249901429
Actions speak louder than words

#woood #

#woooooood #

AxyzxyzxyzxyzC

I bet this place is really spooky late at night!
Misfortunes never come alone/single.
I shouldn't have lett so tast.

4) Replace text that matches a condition

[root@centos01 ~]# sed 's/the/THE/' test.txt <!--Replace the first 'the' in each line with 'THE'-->
he was short and fat.
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
THE tongue is boneless but it breaks bones.12!
google is THE best tools for search keyword.
The year ahead will test our political establishment to THE limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #

#woooooood #

AxyzxyzxyzxyzC

I bet this place is really spooky late at night!
Misfortunes never come alone/single.
I shouldn't have lett so tast.
[root@centos01 ~]# sed 's/l/L/2' test.txt <!--Replace the second 'l' in each line with 'L'-->
he was short and fat.
He was wearing a blue poLo shirt with black pants.
The home of FootbalL on BBC Sport online.
the tongue is boneless but it breaks bones.12!
google is the best tooLs for search keyword.
The year ahead wilL test our political establishment to the limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #

#woooooood #

AxyzxyzxyzxyzC

I bet this place is reaLly spooky late at night!
Misfortunes never come alone/singLe.
I shouldn't have Lett so tast.
[root@centos01 ~]# sed 's/^/#/' test.txt <!--Insert a '#' at the beginning of each line-->
#he was short and fat.
#He was wearing a blue polo shirt with black pants.
#The home of Football on BBC Sport online.
#the tongue is boneless but it breaks bones.12!
#google is the best tools for search keyword.
#The year ahead will test our political establishment to the limit.
#PI=3.141592653589793238462643383249901429
#a wood cross!
#Actions speak louder than words
#
##woood #
#
##woooooood #
#
#
#AxyzxyzxyzxyzC
#
#
#I bet this place is really spooky late at night!
#Misfortunes never come alone/single.
#I shouldn't have lett so tast.
[root@centos01 ~]# sed '/the/s/o/0/g' test.txt <!--Replace every 'o' with '0' in all lines containing 'the'-->
he was short and fat.
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
the t0ngue is b0neless but it breaks b0nes.12!
g00gle is the best t00ls f0r search keyw0rd.
The year ahead will test 0ur p0litical establishment t0 the limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #

#woooooood #

AxyzxyzxyzxyzC

I bet this place is really spooky late at night!
Misfortunes never come alone/single.
I shouldn't have lett so tast.
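
The -e and -i options listed earlier can be combined, as in the following sketch. It works on a temporary file so that test.txt is not modified. The file contents and variable names are assumptions, and the `-i.bak` form, which keeps a backup copy with the given suffix, is assumed to be supported by the installed sed (GNU sed accepts it).

```shell
# Create a throwaway file with hypothetical content
tmpfile=$(mktemp)
printf 'the cat\nthe dog\n' > "$tmpfile"

# -e may be repeated; the expressions are applied in order to each line.
# -i.bak edits the file in place and keeps the original as "$tmpfile.bak".
sed -i.bak -e 's/the/THE/' -e 's/dog/DOG/' "$tmpfile"

edited=$(cat "$tmpfile")
original=$(cat "$tmpfile.bak")
rm -f "$tmpfile" "$tmpfile.bak"

printf 'edited:\n%s\noriginal:\n%s\n' "$edited" "$original"
```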

3. awk tools

In Linux/UNIX systems, awk is a powerful text processing tool. It reads the input text line by line, searches it according to a specified pattern, and formats and outputs, or filters, the matching content. It can perform quite complex text operations without interaction and is widely used in shell scripts to complete various automated configuration tasks.

1) Common use of awk

awk usually uses the command formats below, where the single quotes together with the braces '{}' set the processing actions to perform on the data. awk can process the target file directly, or it can process the target file with a script read through the '-f' option.

awk [options] 'pattern or condition {actions}' file1 file2 ...
awk -f scriptfile file1 file2 ...
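
A minimal sketch of the two invocation forms follows. The script content, the sample input, and the variable names are assumptions made for illustration. The same program is run once from a script file with -f and once inline, and both produce the same output.

```shell
# Write a small awk program to a temporary script file
script=$(mktemp)
cat > "$script" <<'EOF'
# Print the line number and the first field of every line containing "the"
/the/ { print NR ": " $1 }
EOF

# Hypothetical three-line input
input='he was short
the tongue
google is the best'

# Form 2: read the program from a script file with -f
from_file=$(printf '%s\n' "$input" | awk -f "$script")

# Form 1: pass the same program inline in single quotes
inline=$(printf '%s\n' "$input" | awk '/the/ { print NR ": " $1 }')

rm -f "$script"
printf '%s\n' "$from_file"
```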

awk contains several special built-in variables (which can be used directly) as follows:

  • NF: The number of fields in the line being processed.
  • FS: The field separator for each line of text, defaulting to a space or tab.
  • NR: The line number (ordinal number) of the line being processed.
  • $0: The entire content of the line being processed.
  • FILENAME: The name of the file being processed.
  • RS: The record separator, defaulting to \n, which means each line is one record.
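
The built-in variables can be tried out with a short sketch on made-up, colon-separated records (the data and variable names are assumptions). The -F option sets FS on the command line, and RS is changed in a BEGIN block so that commas rather than newlines separate the records.

```shell
# Hypothetical colon-separated records
records='root:x:0:0
daemon:x:1:1:extra'

# -F ':' sets FS; NR is the line number, NF the number of fields on that line
summary=$(printf '%s\n' "$records" | awk -F ':' '{ print NR, NF, $1 }')

# Setting RS to a comma makes awk treat each comma-separated item as a record
items=$(printf 'a,b,c' | awk 'BEGIN { RS = "," } { print NR ": " $0 }')

printf 'summary:\n%s\nitems:\n%s\n' "$summary" "$items"
```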

2) Example usage

[root@centos01 ~]# awk '{print}' test.txt <!--Output everything-->
he was short and fat.
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
the tongue is boneless but it breaks bones.12!
google is the best tools for search keyword.
The year ahead will test our political establishment to the limit.
PI=3.141592653589793238462643383249901429
a wood cross!
Actions speak louder than words

#woood #

#woooooood #

AxyzxyzxyzxyzC

I bet this place is really spooky late at night!
Misfortunes never come alone/single.
I shouldn't have lett so tast.
[root@centos01 ~]# awk 'NR==1,NR==3{print}' test.txt <!--Output lines 1 to 3-->
he was short and fat.
He was wearing a blue polo shirt with black pants.
The home of Football on BBC Sport online.
[root@centos01 ~]# awk '(NR%2)==1{print}' test.txt <!--Output all odd lines-->
he was short and fat.
The home of Football on BBC Sport online.
google is the best tools for search keyword.
PI=3.141592653589793238462643383249901429
Actions speak louder than words
#woood #
#woooooood #

I bet this place is really spooky late at night!
I shouldn't have lett so tast.
[root@centos01 ~]# awk '(NR%2)==0{print}' test.txt <!--Output all even lines-->
He was wearing a blue polo shirt with black pants.
the tongue is boneless but it breaks bones.12!
The year ahead will test our political establishment to the limit.
a wood cross!

AxyzxyzxyzxyzC

Misfortunes never come alone/single.
[root@centos01 ~]# awk '/^root/{print}' /etc/passwd <!--Output lines starting with root-->
root:x:0:0:root:/root:/bin/bash
[root@centos01 ~]# awk '{print $1 $3}' test.txt <!--Output the first and third fields of each line, joined with no separator-->
heshort
Hewearing
Theof
theis
googlethe
Theahead
PI=3.141592653589793238462643383249901429
across!
Actionslouder

#woood

#woooooood

AxyzxyzxyzxyzC

Ithis
Misfortunescome
Ihave
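
The output above runs the two fields together because `print $1 $3` concatenates them. As a small sketch, using the first line of test.txt as input, a comma between the fields makes awk insert its output field separator. The variable names are assumptions, and OFS is a standard awk variable that is not covered in the list of built-in variables given earlier.

```shell
# First line of the test file, used as sample input
line='he was short and fat.'

# No comma: the two fields are concatenated
glued=$(printf '%s\n' "$line" | awk '{ print $1 $3 }')

# Comma: the fields are joined with the output field separator (a space by default)
spaced=$(printf '%s\n' "$line" | awk '{ print $1, $3 }')

# OFS changes the separator inserted by the comma
custom=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "-" } { print $1, $3 }')

printf '%s\n%s\n%s\n' "$glued" "$spaced" "$custom"
```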


Posted by codebuilder on Mon, 11 Nov 2019 11:58:45 -0800