Introduction to awk (report generator), grep (text filter), sed (stream editor)

Keywords: shell Linux vim


Three swordsmen

The three swordsmen of text processing on Linux: grep, sed, awk

grep

grep, egrep, fgrep
Text search
grep: searches text for a pattern and prints the lines that match it.
pattern: a matching condition built from literal characters and regular-expression metacharacters

grep [option] "pattern" file 
grep root /etc/passwd

-i: ignore case
--color: highlight the matched text; usually enabled through an alias
alias grep='grep --color'
-v: invert the match (print the lines that do not match)
-o: print only the string matched by the pattern, not the whole line
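The options above can be tried on a small file. This is a minimal sketch; the sample data and the variable names (tmp, insensitive, inverted, only) are made up for illustration.

```shell
# Hypothetical sample data, written to a temporary file
tmp=$(mktemp)
printf 'root:x:0:0\nRoot user\ndaemon:x:1:1\n' > "$tmp"

# -i: match "root" regardless of case
insensitive=$(grep -i 'root' "$tmp")
# -v: lines that do NOT contain "root" (case-sensitive)
inverted=$(grep -v 'root' "$tmp")
# -o: only the matched text, not the whole line
only=$(grep -o 'root' "$tmp")

printf '%s\n' "-i:" "$insensitive" "-v:" "$inverted" "-o:" "$only"
rm -f "$tmp"
```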

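globbing (shell file-name wildcards, not regular expressions)

*: any characters, of any length
?: any single character
[]: any single character within the specified range
[^]: any single character outside the specified range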

Regular expression: REGEXP

Metacharacters:
.: matches any single character
[]: matches any single character within the specified range
[^]: matches any single character outside the specified range
Character classes, used inside brackets as in [[:digit:]]: [:digit:] [:lower:] [:upper:]

Repetition:
*: matches the previous character any number of times (0 or more)
   a*b
   a.*b
.*: any characters, of any length
Matching is greedy: as much text as possible is matched.
\?: matches the previous character 0 or 1 times
   The pattern only has to match part of the line.
   a\?b
\{m,n\}: matches the previous character at least m times and at most n times
   \{1,\}
   \{0,3\}
   a\{1,3\}
   a.\{1,3\}
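A short sketch of how the repetition operators differ. The sample strings and variable names are hypothetical, and -x (match the whole line) is used only so that partial matches do not hide the differences. \? in a basic regular expression is assumed to be available, as it is in GNU grep.

```shell
# Hypothetical sample lines, one candidate string per line
tmp=$(mktemp)
printf 'b\nab\naaab\naxyzb\nac\n' > "$tmp"

# -x requires the whole line to match the pattern
star=$(grep -x 'a*b' "$tmp" | paste -sd ' ' -)            # zero or more "a", then "b"
dotstar=$(grep -x 'a.*b' "$tmp" | paste -sd ' ' -)        # "a", anything, then "b"
optional=$(grep -x 'a\?b' "$tmp" | paste -sd ' ' -)       # at most one "a", then "b"
interval=$(grep -x 'a\{1,3\}b' "$tmp" | paste -sd ' ' -)  # one to three "a", then "b"

echo "a*b        -> $star"
echo "a.*b       -> $dotstar"
echo "a\\?b       -> $optional"
echo "a\\{1,3\\}b  -> $interval"
rm -f "$tmp"
```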

Location anchoring:

^: anchors the beginning of a line; whatever follows it must appear at the start of the line.
grep "^root" /etc/passwd

$: anchors the end of a line; whatever precedes it must appear at the end of the line.
grep "bash$" /etc/passwd

^$: a blank line
grep '^$' /etc/passwd
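The line anchors can be checked against a few made-up lines. In this sketch the file contents and variable names are assumptions, and -c (count matching lines) is used to keep the output short.

```shell
# Hypothetical input: five lines, two of them blank
tmp=$(mktemp)
printf 'root:/bin/bash\n\nuser:/root:/bin/sh\n\nbash-user:/sbin/nologin\n' > "$tmp"

begins=$(grep -c '^root' "$tmp")   # lines that start with "root"
ends=$(grep -c 'bash$' "$tmp")     # lines that end with "bash"
blank=$(grep -c '^$' "$tmp")       # blank lines

echo "start with root: $begins, end with bash: $ends, blank: $blank"
rm -f "$tmp"
```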

Digits:

[0-9]: any single digit

grep "[[:space:]][[:digit:]]$" 

r555t 

Word anchors:

\< or \b: whatever follows must appear at the beginning of a word
\> or \b: whatever precedes it must appear at the end of a word

This is root.
The user is mroot
rooter is dogs name.
chroot is a command.
grep "root\>" test.txt 
grep "\<root" test.txt 
grep "\<root\>" test.txt  
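To see which of the four sample lines each command selects, the lines can be written to a temporary file and counted. This is a sketch that assumes a grep supporting \< and \> (GNU grep does); the variable names are illustrative.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
This is root.
The user is mroot
rooter is dogs name.
chroot is a command.
EOF

word_end=$(grep -c 'root\>' "$tmp")     # "root" at the end of a word: root. / mroot / chroot
word_start=$(grep -c '\<root' "$tmp")   # "root" at the start of a word: root. / rooter
whole_word=$(grep '\<root\>' "$tmp")    # "root" as a complete word

echo "ends a word: $word_end, starts a word: $word_start"
echo "whole word: $whole_word"
rm -f "$tmp"
```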

Grouping:

\(\)
\(ab\)*: ab is treated as a single unit, repeated any number of times

Back-references
  
He love his lover.
She like her liker.
He  love his liker.
She like her lover.

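grep 'l..e.*l..e' text.txt
grep '\(l..e\).*\1' text.txt
grep '\(l..e\).*\1r' text.txt

\1: refers to the text matched by the group between the first left parenthesis and its matching right parenthesis.
\2: the text matched by the second group
\3: the text matched by the third group

Lines of /etc/inittab that end with a digit that also appears earlier in the line:
grep '\([0-9]\).*\1$' /etc/inittab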
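The difference between repeating a pattern and repeating the matched text can be seen on the four sample sentences. This is a minimal sketch; the temporary file and variable names are hypothetical.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
He love his lover.
She like her liker.
He  love his liker.
She like her lover.
EOF

# Pattern written twice: "love ... like" also matches, so all four lines are selected
any_pair=$(grep -c 'l..e.*l..e' "$tmp")
# Back-reference: \1 must repeat exactly what the group matched (love...love or like...like)
same_pair=$(grep '\(l..e\).*\1' "$tmp")

echo "pattern repeated: $any_pair lines"
echo "text repeated:"
echo "$same_pair"
rm -f "$tmp"
```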

REGEXP: regular expression

pattern: text filtering condition

Regular expressions:
Basic REGEXP: basic regular expressions
Extended REGEXP: extended regular expressions

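Basic regular expressions

Character matching:
.: any single character
[]: any single character within the range
[^]: any single character outside the range

Repetition:
*: the previous character any number of times
\?: 0 or 1 times
\{m,n\}: at least m times, at most n times
.*: any characters, of any length

Anchoring:
^: beginning of a line
$: end of a line
\<, \b: beginning of a word
\>, \b: end of a word

Grouping:
\(\)
\1, \2, ...: back-references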

grep: filters text using patterns written as basic regular expressions

-i: ignore case
-v
-o
--color

-E: use extended regular expressions
-A n: print each matching line and the n lines after it (after)
-B n: print each matching line and the n lines before it (before)
-C n: print each matching line and the n lines before and after it (context)
grep -A 2 "pattern" file
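A small sketch of the three context options on a five-line file. The file contents and variable names are invented for the example.

```shell
# Hypothetical five-line input
tmp=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp"

after=$(grep -A 1 'three' "$tmp" | paste -sd ' ' -)    # match plus 1 line after
before=$(grep -B 1 'three' "$tmp" | paste -sd ' ' -)   # match plus 1 line before
around=$(grep -C 1 'three' "$tmp" | paste -sd ' ' -)   # match plus 1 line on each side

echo "-A 1: $after"
echo "-B 1: $before"
echo "-C 1: $around"
rm -f "$tmp"
```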


Extended regular expressions:
   Greedy mode

Character matching:
.
[]
[^]

Repetition:
*
?: 0 or 1 times
+: matches the previous character at least once
{m,n}

Anchoring:
^
$
\<
\>

Grouping:
(): grouping
\1, \2, \3, ...

Or:
a|b: a or b

C|cat: matches C or cat
(C|c)at: matches Cat or cat

grep --color -E '^[[:space:]]+' /boot/grub/grub.conf 

grep -E is equivalent to egrep

Integers from 1 to 255:
egrep --color '\<([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>'

IPv4 address:
egrep --color '(\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>\.){3}\<([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\>'
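A range pattern like this is easy to get subtly wrong (for example, 2[0-5][0-9] would also accept 256 to 259), so it helps to test it against boundary values. The sketch below is one way to do that; it anchors with ^ and $ instead of \< and \> so each input line is tested as a whole, and the variable names (o, ipv4, accepted, addresses) are hypothetical.

```shell
# One octet, 0-255, written as an extended regular expression
o='([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])'
# Three "octet." groups followed by a final octet, anchored to the whole line
ipv4="^(${o}\\.){3}${o}\$"

# Boundary values: only 0..255 should pass
accepted=$(printf '0\n9\n99\n199\n249\n255\n256\n259\n300\n' \
  | grep -E "^${o}\$" | paste -sd ' ' -)

# Candidate addresses: an out-of-range octet, too few and too many parts should be rejected
addresses=$(printf '192.168.1.1\n256.1.1.1\n10.0.0\n255.255.255.255\n1.2.3.4.5\n' \
  | grep -E "$ipv4" | paste -sd ' ' -)

echo "octets accepted:    $accepted"
echo "addresses accepted: $addresses"
```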

IPv4 address classes:
Five classes: A B C D E
A: 1-127
B: 128-191
C: 192-223

First octet of a class A, B, or C address (1-223):
\<([1-9]|[1-9][0-9]|1[0-9]{2}|2[01][0-9]|22[0-3])\>

Sed (Stream Editor)

The basic usage of sed is:

sed: stream editor
Line editor: a text editor that processes text line by line
Full-screen editor: vim

Memory space: the pattern space
sed reads each line into the pattern space, applies the commands to the lines that match, and then prints the pattern space to the screen. Only the copy in the pattern space is processed.

By default sed does not edit the original file; it only processes the data in the pattern space.

sed [option] 'script' file ...

option:
-n: silent mode; the pattern space is not printed by default, only what the commands print explicitly
-i: edit the original file in place
-e script -e script: several scripts can be given and are all executed
-f /path/to/sed_scripts: read the commands from a script file
  sed -f /path/to/scripts file
-r: use extended regular expressions

command:

address: specifies the range of lines to process

sed 'AddressCommand' file ...
The command is applied to the lines selected by the address.
Address:
1. StartLine,EndLine
   For example, 1,100
   $: the last line
2. /RegExp/
   /^root/
3. /pattern1/,/pattern2/
   From the first line matched by pattern1 to the first following line matched by pattern2, inclusive.
4. LineNumber
   The specified line
5. StartLine,+N
   From StartLine through the N lines after it.
 
Command:
d: delete the lines selected by the address
     sed '3,$d' /etc/fstab
     sed '/oot/d' /etc/fstab
Note: a pattern used as an address must be enclosed in //
     sed '1d' file
p: print the lines selected by the address
     sed '/^\//d' /etc/fstab
     sed '/^\//p' /etc/fstab
   Matching lines are displayed twice: once by p, and once more when the pattern space is printed by default. With -n only the lines printed by p appear.
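The double output of p, and the address forms listed earlier, can be tried on a five-line file. This is a sketch with invented data and variable names; the StartLine,+N form is assumed to be available, as it is in GNU sed.

```shell
# Hypothetical five-line input
tmp=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmp"

twice=$(sed '2p' "$tmp" | paste -sd ' ' -)             # line 2 appears twice
once=$(sed -n '2p' "$tmp")                             # -n: only what p prints
range=$(sed -n '2,4p' "$tmp" | paste -sd ' ' -)        # StartLine,EndLine
plus=$(sed -n '2,+1p' "$tmp" | paste -sd ' ' -)        # StartLine,+N
regex=$(sed -n '/^t/,/^f/p' "$tmp" | paste -sd ' ' -)  # /pattern1/,/pattern2/
last=$(sed -n '$p' "$tmp")                             # $: last line

printf '%s\n' "$twice" "$once" "$range" "$plus" "$regex" "$last"
rm -f "$tmp"
```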
a \string: adds a new line with the content "string" after the specified line
sed '/^\//a \# hello world' /etc/fstab
Add two lines:
sed '/^\//a \#hello world\n#hi' /etc/fstab

i \string: adds a new line with the content "string" before the specified line

r file: Adds the contents of the specified file after the specified line.
  sed '2r /etc/issue'   /etc/fstab 
  sed '$r /etc/issue' /etc/fstab 

w file: Save the contents of the range specified by the address in another file.
 sed '/oot/w /tmp/oot.txt' /etc/fstab 
 
s/pattern/string/: find and replace
     sed 's/oot/OOT/' /etc/fstab
     sed 's/^\//#/' /etc/fstab
By default s/// replaces only the first string matched by the pattern on each line.
  Modifiers:
   g: global substitution, replaces every match on the line
   i: ignore case
     sed 's/\//#/g' /etc/fstab

The delimiter of s/// can be replaced by another character: s###, s@@@

sed 's#+##'

Back-references

l..e: like -----> liker
      love -----> lover

sed 's#l..e#&r#' file
&: refers to the whole string matched by the pattern

The same result with a group and \1:
sed 's#\(l..e\)#\1r#' file

like---->Like
love---->Love 
sed 's#l\(..e\)#L\1#g' file 
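A quick check of & and \1 on a single input line. This is a sketch; the input string and the variable names are made up.

```shell
# & inserts the whole match; g repeats the substitution for every match on the line
with_amp=$(echo 'like love' | sed 's#l..e#&r#g')
# \1 inserts what the first \( \) group matched
with_group=$(echo 'like love' | sed 's#\(l..e\)#\1r#g')
# The group covers only "..e", so the leading l can be replaced by L
capital=$(echo 'like love' | sed 's#l\(..e\)#L\1#g')
# Without g only the first match on the line is replaced
first_only=$(echo 'like love' | sed 's#l..e#&r#')

printf '%s\n' "$with_amp" "$with_group" "$capital" "$first_only"
```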


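Delete every whitespace character from the output of history:
history | sed 's#[[:space:]]##g'
Delete only the leading whitespace of each line:
history | sed 's#^[[:space:]]*##'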

sed ''dirname

Example

 1. Delete the leading whitespace of each line in /etc/grub.conf
 sed -r 's/^[[:space:]]+//' /etc/grub.conf
 2. Replace the 3 in the line "id:3:initdefault:" of /etc/inittab with 5
 sed 's#id:3:init#id:5:init#' /etc/inittab
 sed 's@\(id:\)[0-9]\(:initdefault:\)@\15\2@g' /etc/inittab
 3. Delete the blank lines in /etc/inittab
 sed '/^$/d' /etc/inittab
 4. Delete the # at the beginning of lines in /etc/inittab
 sed 's/^#//' /etc/inittab
 5. Delete the leading # of a file's lines together with the whitespace after it
 sed -r 's/^#[[:space:]]+//' file
 6. For lines of a file that begin with whitespace followed by #, delete the leading whitespace and the #
 sed -r 's/^[[:space:]]+#//' file
 7. Extract the directory name from a file path
 echo '/etc/rc.d' | sed -r 's@^(/.*/)[^/]+/?@\1@'
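Some of these exercises can be verified without touching system files by feeding sed a line with echo or printf. The sketch below uses hypothetical input strings and variable names, and assumes a sed that accepts -r (GNU sed does). It also shows why + needs -r: in a basic regular expression an unescaped + is an ordinary character.

```shell
# Exercise 1: with -r, + means "one or more"
stripped=$(printf '   title\n' | sed -r 's/^[[:space:]]+//')
# Without -r the + is literal, nothing matches, and the line is unchanged
unchanged=$(printf '   title\n' | sed 's/^[[:space:]]+//')

# Exercise 2: \15 is group 1 followed by the character 5
runlevel=$(echo 'id:3:initdefault:' | sed 's@\(id:\)[0-9]\(:initdefault:\)@\15\2@')

# Exercise 7: keep everything up to and including the last / before the final component
dir1=$(echo '/etc/rc.d' | sed -r 's@^(/.*/)[^/]+/?@\1@')
dir2=$(echo '/etc/rc.d/' | sed -r 's@^(/.*/)[^/]+/?@\1@')

printf '[%s]\n' "$stripped" "$unchanged" "$runlevel" "$dir1" "$dir2"
```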

awk (report generator)

grep: text filter
sed: stream editor

grep [option] pattern file
sed 'AddressCommand' file
sed 'command/pattern/' file

awk displays text in a defined format.
Implementations: nawk, gawk (GNU awk)
awk option 'script' file file2 
awk [option] 'pattern {action}' file file2 

print
printf: custom display format

awk reads one line at a time and splits it into fields; each field can be referenced through a variable.
$0: the whole line
$1: the first field
$2: the second field

awk '{print $1}' text.txt 
awk '{print $1,$2}' text.txt

Options:

-F: specifies the input field separator
awk -F 'separator'

awk 'BEGIN{OFS="#"}{print $1,$2}' test.txt
BEGIN{OFS=""}: sets the output separator
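A short sketch of the input and output separators working together. The input string and variable names are hypothetical.

```shell
# -F: splits the input on ":"; the comma in print inserts OFS between the items
default_ofs=$(echo 'a:b:c' | awk -F: '{print $1,$2}')
custom_ofs=$(echo 'a:b:c' | awk -F: 'BEGIN{OFS="#"}{print $1,$2}')
# $NF is the last field, $0 the whole line as read
last_and_whole=$(echo 'a:b:c' | awk -F: '{print $NF, $0}')

printf '%s\n' "$default_ofs" "$custom_ofs" "$last_and_whole"
```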

Print literal strings:
awk '{print $1,"hello",$2,$3,$4,$5}' file

awk 'BEGIN{print "line one\nline two\nline three"}'

print format:
print item1,item2...

awk -F: sets the input separator
OFS="#" sets the output separator

awk variables

awk built-in variables
FS: field separator, used to split each line into fields when reading text
RS: record separator, the line separator used for the input text
OFS: output field separator
ORS: output record separator

awk -F:
OFS="#"
FS=":"

Built-in variables that hold data

NR: number of records; the number of lines awk has processed so far, counted across all files when there are several
FNR: the number of lines processed so far in the current file
NF: the number of fields in the current line


awk '{print NF}' file 
awk '{print $NF}' file 
awk '{print NR}' file 
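The difference between NR and FNR only shows up with more than one input file. The sketch below uses two hypothetical temporary files; the names f1, f2 and report are made up.

```shell
# Two hypothetical input files: two lines and one line
f1=$(mktemp); f2=$(mktemp)
printf 'a b c\nd e\n' > "$f1"
printf 'x y z w\n' > "$f2"

# NR keeps counting across files, FNR restarts for each file,
# NF is the number of fields, $NF the content of the last field
report=$(awk '{print NR, FNR, NF, $NF}' "$f1" "$f2")

echo "$report"
rm -f "$f1" "$f2"
```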

-v: defines a variable

awk -v test="hello awk" '{print test}' file
awk -v test="hello awk" 'BEGIN{print test}'

A variable can also be assigned inside the script:
awk 'BEGIN{test="hello awk"; print test}'

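printf: formatted display

printf format,item1,item2...

awk 'BEGIN{printf "%c", "a"}'
Note: printf does not add a newline automatically; write \n in the format.

%c: a single character
%d: decimal integer
%e: scientific notation
%f: floating point
%g: the shorter of %e and %f
%s: string

Modifiers
-: left alignment
%nd: n is the display width
awk '{printf "%-10s%-10s\n",$1,$2}' file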
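A minimal sketch of the width and alignment modifiers. The | characters are added only to make the padding visible; the input and variable names are hypothetical.

```shell
# %-10s: left-aligned in 10 columns; %5d: right-aligned in 5 columns
row=$(echo 'root 0' | awk '{printf "%-10s|%5d|\n", $1, $2}')
# printf adds no newline by itself, so the two calls print on one line
joined=$(awk 'BEGIN{printf "a"; printf "b\n"}')

printf '%s\n' "$row" "$joined"
```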

awk operators
Arithmetic operators
String operators
Boolean expressions

x < y
x <= y
x > y
x >= y
x == y
x != y
x ~ y: x matches the regular expression y
x !~ y: x does not match y

Logical operators between expressions

&&: and
||: or

Conditional expression

selector ? if-true-exp : if-false-exp
a>b ? (a=1) : (b=2)

awk patterns

1. Regular expression: /pattern/
2. Expression
3. Range: pattern1,pattern2
4. BEGIN/END
5. Empty: matches every line


awk -F: '/^r/ {print $1}' /etc/passwd
awk -F: '$3>=500{print $1,$3}' /etc/passwd
awk -F: '$3+1>=500{print $1,$3}' /etc/passwd

Match test: field 7 ends with bash
awk -F: '$7~"bash$"{print $1,$7}' /etc/passwd
Negated match test
awk -F: '$7!~"bash$"{print $1,$7}' /etc/passwd

awk -F: '/^r/,/^m/{print $1,$7}' /etc/passwd

awk -F: '$3==0,$7~"bash"{print $1,$3,$7}' /etc/passwd

awk -F: '{printf "%-10s%-10s%-20s\n",$1,$2,$3}' /etc/passwd

BEGIN ,END 

awk -F: 'BEGIN{print "Username  ID        Shell"} $3==0,$7~"nologin"{printf "%-10s%-10s%-20s\n",$1,$3,$7} END{print "ending"}' /etc/passwd
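The order in which BEGIN, the main rule and END run can be seen on a small passwd-style file. This is a sketch: the three records, the counter n and the other variable names are invented, and the main rule uses a simple match pattern rather than a range.

```shell
# Hypothetical passwd-style records
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
alice:x:1000:1000::/home/alice:/bin/bash
EOF

report=$(awk -F: '
BEGIN { printf "%-10s%-6s%s\n", "Username", "ID", "Shell" }   # once, before any input
$7 ~ /bash$/ { printf "%-10s%-6s%s\n", $1, $3, $7; n++ }      # for each matching line
END { print n, "bash users" }                                 # once, after all input
' "$tmp")

echo "$report"
rm -f "$tmp"
```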

action

1. Expressions
2. Control statements
3. Compound statements
4. Input statements
5. Output statements

Control statements

if-else

if (condition) {then-body} [else {else-body}]
eg:
awk -F:
while
while (condition) {statement1; statement2; ...}
Loop over each field
length([string]): returns the length of the string

awk -F: '{i=1; while (i<=NF) {if (length($i)>4) {print $i}; i++}}' /etc/passwd

df -hP | awk '{if ($4 >= ...) print $0}'


do while
do {statement1; statement2; ...} while (condition)

for
for (initialization; condition; increment) {statement1; statement2; ...}

awk -F: '{for(i=1;i<=NF;i++){if(length($i)>=4){print $i}}}' /etc/passwd
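The for loop and the equivalent while loop can be compared on a single record. This is a sketch with a made-up passwd-style line and hypothetical variable names; both loops print the fields that are at least four characters long.

```shell
# Hypothetical passwd-style record
line='root:x:0:0:/root:/bin/bash'

with_for=$(echo "$line" \
  | awk -F: '{for (i=1; i<=NF; i++) {if (length($i)>=4) {print $i}}}' \
  | paste -sd ' ' -)

# Same traversal with while: the counter is initialised and incremented by hand
with_while=$(echo "$line" \
  | awk -F: '{i=1; while (i<=NF) {if (length($i)>=4) {print $i}; i++}}' \
  | paste -sd ' ' -)

printf '%s\n' "$with_for" "$with_while"
```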

switch-case
switch (expression) {case value or /regexp/: statement1; statement2; ... default: statement; ...}

break and continue
Used inside loops, for example while traversing the fields: break leaves the loop, continue skips to the next iteration.

next
Ends the processing of the current line early and moves on to the next line.

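array

Arrays are associative. Subscripts can be strings; numeric subscripts conventionally start at 1 (as in the arrays that split() builds).
array["mon"]=1
array["tue"]=2

for (var in array) {statement; ...}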

awk -F: '{shell[$NF]++}END {for(A in shell) {print A,shell[A]}}' /etc/passwd 

netstat -tan

netstat -tan | awk '/^tcp/{STATE[$NF]++}END{for (S in STATE){print S,STATE[S]}}'

awk '{count[$1]++}END{for (ip in count){printf "%-20s:%d\n",ip,count[ip]}}' access_log
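The counting idiom can be tested on a few made-up log lines before running it on a real access_log. In this sketch the log contents and variable names are hypothetical; the output is piped through sort because the order in which for (var in array) visits the subscripts is not defined.

```shell
# Hypothetical access log: the first field is the client address
tmp=$(mktemp)
printf '10.0.0.1 GET /\n10.0.0.2 GET /a\n10.0.0.1 GET /b\n' > "$tmp"

# count[$1]++ creates the entry on first use and increments it afterwards
counts=$(awk '{count[$1]++} END{for (ip in count) {printf "%s:%d\n", ip, count[ip]}}' "$tmp" | sort)

echo "$counts"
rm -f "$tmp"
```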

Posted by AP81 on Thu, 23 May 2019 10:41:23 -0700