Using sed and awk with regular expressions

Keywords: Linux ascii less

* sed can do some of what grep does, but it is a little more cumbersome. Its strength is deleting and replacing file contents.

Using sed for grep-style searches:

1. Retrieve by keyword:

[root@localhost ~]# sed -n '/root/'p passwd.txt 

* Searching with sed requires the -n option before the pattern, the p command after it, and the keyword enclosed in / /

2. For keywords containing special symbols, add the -r option (extended regex) or escape the symbols with a backslash:

[root@localhost ~]# sed -nr '/o+t/'p passwd.txt 

3. Print a specified line: (write the line number directly, without enclosing it in / /)

[root@localhost ~]# sed -n '5'p passwd.txt 
[root@localhost ~]# sed -n '5,$'p passwd.txt   #Print from line 5 to the last line

4. -e option: use multiple expressions:

[root@localhost ~]# sed -e '1'p -e '/root/'p -n passwd.txt 

* Print the first line and every line containing root. If the first line also contains root, it is printed twice

5. Case-insensitive matching: (append an uppercase I after the pattern)

[root@localhost ~]# sed -n '/testword/'Ip passwd.txt 
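
As a quick check of the rules above, the sketch below compares sed -n '/root/p' with grep root on a small sample file. The sample contents and variable names are made up for illustration, and the I flag assumes GNU sed.

```shell
# Hypothetical sample data; not the passwd.txt used in this post.
sample=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'bin:x:1:1:bin:/bin:/sbin/nologin' \
              'ROOT2:x:0:0::/tmp:/bin/sh' > "$sample"

# -n suppresses the default output; p prints only the matching lines.
sed_out=$(sed -n '/root/p' "$sample")
grep_out=$(grep 'root' "$sample")

# The I flag (a GNU sed extension) makes the match case-insensitive.
sed_ci=$(sed -n '/root/Ip' "$sample")
grep_ci=$(grep -i 'root' "$sample")

echo "$sed_out"
echo "$sed_ci"
rm -f "$sample"
```

Both pairs of commands should select the same lines, which is why sed can stand in for grep here, only with more typing.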

sed delete function:

1. Delete the specified lines from the printed output:

[root@localhost ~]# wc -l passwd.txt 
22 passwd.txt
#Count the lines in the file

[root@localhost ~]# sed '1,20'd passwd.txt 
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash
#Delete lines 1-20 and print the remaining lines

[root@localhost ~]# wc -l passwd.txt 
22 passwd.txt
#This does not actually delete anything from the file; it only removes the lines from the printed output

2. -i option: delete the specified lines from the file itself (with -i, the file contents are actually changed)

[root@localhost ~]# wc -l passwd.txt 
22 passwd.txt
[root@localhost ~]# sed -i '1,20'd passwd.txt 
[root@localhost ~]# wc -l passwd.txt 
2 passwd.txt

*Commonly used when deleting large log file contents
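
Because -i changes the file itself, it can be worth keeping a copy of the original. A minimal sketch, assuming GNU sed (which accepts a backup suffix directly after -i); the scratch file and variable names are made up for illustration:

```shell
# Hypothetical scratch file standing in for a large log.
log=$(mktemp)
seq 1 22 > "$log"            # 22 numbered lines

# -i.bak edits the file in place and keeps the original as "$log.bak".
sed -i.bak '1,20d' "$log"

remaining=$(wc -l < "$log" | tr -d ' ')
backup=$(wc -l < "$log.bak" | tr -d ' ')
echo "lines left: $remaining, lines in backup: $backup"
rm -f "$log" "$log.bak"
```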

sed replacement function:

1. Replace by keyword:

[root@localhost ~]# cat passwd.txt 
chrony:x:998:996::/var/lib/chrony:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed 's/chrony/sed_test/g' passwd.txt 
sed_test:x:998:996::/var/lib/sed_test:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

* Format: 's/keyword to replace/replacement text/g'

2. Add the -r option to match special symbols:

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed -r 's/n+y/sed_test/g' passwd.txt 
sed_test:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

3. Swap parts of a line with capture groups:

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed -r 's/([^:]+):(.*):([^:]+)/\3:\2:\1/g' passwd.txt 
/sbin/nologin:x:998:996::/var/lib/chronwy:nnnnny
/bin/bash:x:1000:1000::/home/linux01:linux01

* Split each line at the colons into three parts (first field, everything in between, last field), then swap the first and the last

4. When the keyword is a directory path, escape the slashes or use a different delimiter:
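
To see what each group captures, the sketch below runs the same expression on one line and also prints the middle group alone. The variable names are only for illustration, and -r assumes GNU sed.

```shell
line='nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin'

# \1 = first field, \2 = everything in between (greedy .*), \3 = last field.
swapped=$(printf '%s\n' "$line" | sed -r 's/([^:]+):(.*):([^:]+)/\3:\2:\1/')

# Print only the middle group to see what the greedy .* captured.
middle=$(printf '%s\n' "$line" | sed -r 's/([^:]+):(.*):([^:]+)/\2/')

echo "$swapped"
echo "$middle"
```

The greedy .* takes as much as it can, so the third group ends up being only the text after the last colon.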

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed 's/\/bin\/bash/AAAAAAA/g' passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:AAAAAAA

[root@localhost ~]# sed 's#/bin/bash#AAAAAAA#g' passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:AAAAAAA
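
An alternative delimiter is especially handy when the replacement comes from a shell variable that contains slashes. A small sketch, where new_shell and its value are hypothetical:

```shell
line='linux01:x:1000:1000::/home/linux01:/bin/bash'
new_shell='/usr/bin/zsh'     # hypothetical replacement value

# Escaping every slash works but is hard to read.
escaped=$(printf '%s\n' "$line" | sed 's/\/bin\/bash/\/usr\/bin\/zsh/')

# With # as the delimiter the path can be used as-is, even from a variable.
hashed=$(printf '%s\n' "$line" | sed "s#/bin/bash#$new_shell#")

echo "$hashed"
```

This only holds as long as the variable itself does not contain the chosen delimiter.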

5. Delete all letters:

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed 's/[a-zA-Z]//g' passwd.txt 
::998:996::///://
01::1000:1000:://01://

6. Add content before each line:

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed -r 's/.*/sed_test:&/g' passwd.txt 
sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

* .* matches everything on each line; the & symbol stands for the text that .* matched

7. Actually replacing the contents of the file: the -i option (none of the 6 examples above changes the file; they only print the result of the replacement)

[root@localhost ~]# cat passwd.txt 
nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# sed -i 's/.*/sed_test:&/g' passwd.txt 

[root@localhost ~]# cat passwd.txt 
sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

Supplement:
Capitalize the first lowercase letter of each word:
sed 's/\b[a-z]/\u&/g' filename

Convert all lowercase letters to uppercase:
sed 's/[a-z]/\u&/g' filename

Uppercase to lowercase:
sed 's/[A-Z]/\l&/g' filename

Append a number to the end of each line that starts with a:
sed -r 's/(^a.*)/\1 12/' test
sed -r 's/^a.*/& 12/' test

Print the lines among lines 1 to 100 that contain a string:
sed -n '1,100{/abc/p}' 1.txt
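
The case-conversion one-liners above can be tried on a single string. This sketch assumes GNU sed: \u and \l change one character, while \U and \L (also GNU extensions, not used in the commands above) change the whole match. The sample text is made up.

```shell
text='hello sed world'

# \u uppercases only the next character, so \b[a-z] capitalizes each word.
title=$(printf '%s\n' "$text" | sed 's/\b[a-z]/\u&/g')

# \U and \L convert the entire matched text in one substitution.
upper=$(printf '%s\n' "$text" | sed 's/.*/\U&/')
lower=$(printf '%s\n' 'HELLO' | sed 's/.*/\L&/')

echo "$title"
echo "$upper"
echo "$lower"
```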

* awk is more powerful than grep/egrep/sed. It supports extended regular expressions by default, whereas grep needs the -E option and sed needs the -r option

1. Split each line into fields and print a specified field:

[root@localhost ~]# cat test.txt 
zhangsan 100
lisi     92
wangwu   95
user1    88
user2    93
[root@localhost ~]# awk '{print $1}' test.txt 
zhangsan
lisi
wangwu
user1
user2

* The default delimiter is whitespace; $N selects the number of the field to print

2. -F option: specify the separator:

[root@localhost ~]# cat passwd.txt 
AAAA:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '{print $3}' passwd.txt 
nnnnny
linux01
#Separate multiple fields with commas when printing
[root@localhost ~]# awk -F ':' '{print $1,$3}' passwd.txt 
AAAA nnnnny
AAAA linux01
#Specify the separator to print between the fields
[root@localhost ~]# awk -F ':' '{print $1"-->"$3}' passwd.txt 
AAAA-->nnnnny
AAAA-->linux01

* awk '{print $0}' prints the whole line, so it is equivalent to cat

3. Searching with awk: (equivalent to grep)
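
One detail worth checking is how the two kinds of delimiter treat repeats. The sketch below uses made-up input: with the default whitespace splitting, a run of spaces counts as one separator, while with -F ':' two adjacent colons enclose an empty field.

```shell
# Default splitting: any run of spaces or tabs separates two fields.
ws=$(printf 'zhangsan 100\nlisi     92\n' | awk '{print $2, $1}')
space_nf=$(printf 'a  b\n' | awk '{print NF}')

# With -F ':' every colon counts, so "a::b" has an empty second field.
colon_nf=$(printf 'a::b\n' | awk -F ':' '{print NF}')

echo "$ws"
echo "fields with spaces: $space_nf, fields with colons: $colon_nf"
```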

[root@localhost ~]# cat passwd.txt 
AAAA:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk '/nnn/' passwd.txt 
AAAA:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin

4. Match lines whose specified field contains a keyword:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$1 ~/AAA/' passwd.txt 
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

5. Combining multiple expressions:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '/linux/ {print $1,$3} /nnn/ {print $2,$3}' passwd.txt 
sed_test nnnnny
AAAA linux01

* Print fields 1 and 3 of the lines containing the keyword linux, and fields 2 and 3 of the lines containing the keyword nnn

6. Match lines containing any of several keywords and print specified fields:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '/linux|nnn/ {print $1}'  passwd.txt 
ABCD
AAAA

7. Matching with comparison operators:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$5==1000' passwd.txt 
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$5==1000 {print $1}' passwd.txt 
AAAA

8. When comparing numbers, do not put the number in double quotes; otherwise the comparison is made as strings instead of numbers (compared as strings, 998 is greater than 1000, because the character 9 sorts after 1 in ASCII):

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$5<1000' passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
#With double quotes, 1000 is treated as a string, not a number
[root@localhost ~]# awk -F ':' '$5<"1000"' passwd.txt 
[root@localhost ~]# 
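
The difference can be reproduced on two bare numbers. A minimal sketch with made-up input; adding 0 to both sides is one common way to force a numeric comparison even when a quoted value is involved:

```shell
data='998
1000'

# Number constant: numeric comparison, 998 < 1000 is true.
numeric=$(printf '%s\n' "$data" | awk '$1 < 1000')

# String constant: compared character by character, "9" sorts after "1".
string=$(printf '%s\n' "$data" | awk '$1 < "1000"')

# Adding 0 converts both sides back to numbers.
forced=$(printf '%s\n' "$data" | awk '$1+0 < "1000"+0')

echo "numeric: [$numeric] string: [$string] forced: [$forced]"
```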

9. String comparison: (enclose strings in double quotes)

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$9!="/sbin/nologin"' passwd.txt 
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

10. Comparing two fields and combining multiple conditions:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$5>$6' passwd.txt  
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin

[root@localhost ~]# awk -F ':' '$5>$6 {print $5,$6}' passwd.txt  
998 996

[root@localhost ~]# awk -F ':' '$5>999' passwd.txt  
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '$5>900 && $6<999' passwd.txt  
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin

[root@localhost ~]# awk -F ':' '$5>999 || $9=="/sbin/nologin"' passwd.txt  
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

* In the last example, the string test can also be written with ~ (regex match) instead of ==

11. OFS: specify the separator for the printed output:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '{OFS="***"} {print $1,$2,$3}' passwd.txt  
ABCD***sed_test***nnnnny
AAAA***sed_test***linux01

[root@localhost ~]# awk -F ':' '{OFS="***"} $5<1000 {print $1,$2,$3}' passwd.txt  
ABCD***sed_test***nnnnny

12. NR: show the line number in front of each line:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '{print NR":" $0}' passwd.txt 
1:ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
2:AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

13. NF: show the number of fields in front of each line:

[root@localhost ~]# awk -F ':' '{print NF":" $0}' passwd.txt 
9:ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
9:AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

14. Select lines by line number or field count:
1) Show lines with line number less than 2:

[root@localhost ~]# awk -F ':' '{print NR":" $0}' passwd.txt 
1:ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
2:AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk 'NR<2' passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin

2) Among the first 9 lines (NR<10), show those whose first field matches AAAA:

[root@localhost ~]# awk -F ':' 'NR<10 && $1 ~/AAAA/' passwd.txt 
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

15. The difference between = and ==: (== compares numbers or strings, = is an assignment)

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ":" '$1=="ABCD"' passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin

[root@localhost ~]# awk -F ":" '$1="hello"' passwd.txt 
hello sed_test nnnnny x 998 996  /var/lib/chronwy /sbin/nologin
hello sed_test linux01 x 1000 1000  /home/linux01 /bin/bash
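
Two things happen in the assignment case: the assigned value is used as the condition, and the line is rebuilt with the output separator (a space by default). A sketch on made-up data that shows both effects:

```shell
data='ABCD:x:998
AAAA:x:1000'

# == only tests; the matching line is printed unchanged.
compared=$(printf '%s\n' "$data" | awk -F ':' '$1=="ABCD"')

# = assigns; "hello" is non-empty, so every line prints, rebuilt with OFS.
assigned=$(printf '%s\n' "$data" | awk -F ':' -v OFS=':' '$1="hello"')

# Assigning 0 makes the condition false, so nothing is printed at all.
zeroed=$(printf '%s\n' "$data" | awk -F ':' '$3=0')

echo "$compared"
echo "$assigned"
echo "zeroed output: [$zeroed]"
```

Setting OFS to the same character as the input separator keeps the colons when a field is modified.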

16. Sum:

[root@localhost ~]# cat passwd.txt 
ABCD:sed_test:nnnnny:x:998:996::/var/lib/chronwy:/sbin/nologin
AAAA:sed_test:linux01:x:1000:1000::/home/linux01:/bin/bash

[root@localhost ~]# awk -F ':' '{(tot=tot+$5)}; END {print tot}' passwd.txt 
1998
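
The same pattern extends to other aggregates. A minimal sketch on made-up two-field data that prints the total and, using NR in the END block, the average:

```shell
data='ABCD:998
AAAA:1000'

# tot starts out empty (treated as 0) and accumulates field 2 on every line.
total=$(printf '%s\n' "$data" | awk -F ':' '{tot += $2} END {print tot}')

# In END, NR holds the number of lines read; guard against empty input.
average=$(printf '%s\n' "$data" | awk -F ':' '{tot += $2} END {if (NR > 0) print tot / NR}')

echo "total: $total, average: $average"
```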

Supplement:

. Represents any single character

* The character before it appears 0 or more times
abc* ===> ab, abccc

.* Greedy match (matches as many characters as possible)

? Extended regex: the character before it appears 0 or 1 times
a1? ==> a or a1

+ Extended regex: the character before it appears 1 or more times

| Extended regex: or
egrep 'abc|123' 1.txt

[] Denotes any one of the characters within the square brackets
[a-zA-Z0-9] Represents all uppercase and lowercase letters and digits
[abc] means a or b or c
[a|@] means a or | or @
[^] Negation: matches any character not listed in the brackets

^ Indicates the beginning of a line

$ Indicates the end of a line

{} Denotes a repetition count or range
a{1,5} ===> a or aa or aaa or aaaa or aaaaa
b{3} ===> bbb

() Treats the characters in parentheses as a whole
(abc){2} ==> abcabc
(abc)+ ==> abc or abcabc or abc repeated n times
abc{2} ==> abcc

Extended regex symbols: + ? | {} () require grep -E or sed -r
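
A few of these rules can be checked directly with grep -E. In this sketch matches and check are hypothetical helper functions written for the example; -x makes grep require that the whole string matches the pattern.

```shell
# matches REGEX STRING: succeeds when the whole STRING matches REGEX.
# -E enables extended regex, -q suppresses output, -x anchors to the full line.
matches() { printf '%s\n' "$2" | grep -Eqx -- "$1"; }

check() {
    if matches "$1" "$2"; then
        echo "$1 matches $2"
    else
        echo "$1 does not match $2"
    fi
}

check 'abc*'     'abccc'     # * repeats only the c
check 'a1?'      'a'         # the 1 is optional
check 'a{1,5}'   'aaaaaa'    # six a's is outside the 1 to 5 range
check '(abc){2}' 'abcabc'    # the group is repeated as a whole
check 'abc{2}'   'abcabc'    # without parentheses only the c is repeated
check '[a|@]'    '|'         # inside brackets | is a literal character
```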

Posted by koray on Thu, 19 Sep 2019 10:10:41 -0700