Linux script introduction

#List all files starting with a or o
[root@sh02-hap-bss-prod-consul03 ~]# ls
anaconda-ks.cfg  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg  tools
[root@sh02-hap-bss-prod-consul03 ~]# ls [ao]*
anaconda-ks.cfg  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg

#[0-9] matches any single digit
#[!0-9] matches any single character that is not a digit, so [!0-9]* matches names that do not start with a digit
[root@sh02-hap-bss-prod-consul03 ~]# ls
1  3   anaconda-ks.cfg                          openldap-clients-2.4.44-21.el7_6.x86_64.rpm  tools
2  44  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  original-ks.cfg
[root@sh02-hap-bss-prod-consul03 ~]# ls [!0-9]*
anaconda-ks.cfg  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg

tools:
libnss-cache  nsscache
[root@sh02-hap-bss-prod-consul03 ~]# ls [0-9]*
1  2  3  44

#Delete files starting with a number
rm -f [0-9]*

#Delete files that do not start with a number
[root@sh02-hap-bss-prod-consul03 test]# ls
1  2  3  4  a  aa  b  bb
[root@sh02-hap-bss-prod-consul03 test]# rm -f [!0-9]*
[root@sh02-hap-bss-prod-consul03 test]# ls
1  2  3  4

echo

By default, echo appends a line break to the end of the text; the -n option suppresses it

[root@host1 src]# echo abc ddd
abc ddd
[root@host1 src]# echo -n abc ddd
abc ddd[root@host1 src]#

echo -e

echo -e interprets backslash escape sequences

When the following sequences appear in the string, they are interpreted instead of being printed as ordinary text:

\a sound an alert (bell);
\b backspace (move the cursor back one character);
\c suppress all further output, including the line break at the end;
\f form feed (move down one line while the cursor stays in the same column);
\n new line (move the cursor to the beginning of the next line);
\r carriage return (move the cursor to the beginning of the line without advancing);
\t insert a tab;
\v vertical tab, which behaves like \f on most terminals;
\\ insert a \ character;
\0nnn insert the character whose octal code is nnn;

Here is an example:

$ echo -e "a\bdddd"   #The preceding a is erased
dddd

$ echo -e "a\adddd"   #An alert sounds while the text is printed
adddd

$ echo -e "a\ndddd"   #Line break
a
dddd

variable

String length: ${#var}

[root@host1 src]# echo ${NODE_HOME}
/usr/local/node
[root@host1 src]# echo ${#NODE_HOME}
15
#15 characters in length

Using shell for mathematical calculation

When using let, variable names do not need a leading $

[root@host1 src]# nod1=3
[root@host1 src]# nod2=5
[root@host1 src]# abc=$[nod1+nod2]
[root@host1 src]# echo $abc
8
[root@host1 src]# let def=nod1+nod2
[root@host1 src]# echo $def
8
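
The same sum can also be written with the arithmetic expansion $(( )), which is the form usually recommended over the older $[ ]. A minimal sketch, reusing the variable names from the session above (sum and product are names chosen here for illustration):

```shell
#!/bin/bash
nod1=3
nod2=5

# $(( )) evaluates integer arithmetic; variable names need no leading $
sum=$((nod1 + nod2))
product=$((nod1 * nod2))

echo "$sum"
echo "$product"
```

Like let, $(( )) only handles integers, which is why bc is used for the decimal examples that follow.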

[root@host3 2056]# echo "4*0.56" |bc
2.24
[root@host3 2056]# no=54
[root@host3 2056]# res=`echo "$no*1.5"|bc`
[root@host3 2056]# echo $res
81.0
[root@host3 2056]#

Other settings can be placed before the operation itself and passed to bc through stdin, separated by semicolons

For example, setting the decimal precision

[root@host3 2056]# echo "scale=2;3/8" | bc
.37

File descriptor

0---stdin standard input
1---stdout standard output
2---stderr standard error

When a command fails, it returns a non-zero exit status; after successful execution it returns 0. The exit status is stored in $? and can be printed with echo $?

Standard output to out.txt, standard error to the terminal
ls  > out.txt

Standard error to out.txt, standard output to the terminal
ls  2> out.txt

All output redirected to out.txt
ls  &> out.txt

The two can be combined
find /etc -name passwd > find.txt 2> find.err

Discard the error output and show only the standard output on the screen
find /etc -name passwd 2> /dev/null

Discard all results
find /etc -name passwd &> /dev/null

Standard error does not pass through a pipe, so if it is needed downstream it must first be redirected to standard output
 That is: find /etc -name passwd 2>&1 | less

For example, find /etc -name passwd |wc -l
 counts only the lines on standard output; lines on standard error are not counted

find /etc -name passwd 2>&1 |wc -l
 This one counts the error lines together with the normal ones

/sbin/service vsftpd stop > /dev/null 2>&1
 This stops the service and discards both outputs: standard output goes to /dev/null, and 2>&1 sends standard error to the same place
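
The redirections above can be checked with a small experiment. The sketch below assumes a hypothetical helper, emit, that writes one line to each stream, and captures what survives each redirection:

```shell
#!/bin/bash
# emit is a hypothetical helper: one line to stdout, one line to stderr
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

out_only=$(emit 2>/dev/null)     # stderr discarded, stdout captured
both=$(emit 2>&1)                # stderr merged into stdout, both captured
none=$(emit >/dev/null 2>&1)     # both streams discarded, nothing captured

echo "out_only=[$out_only]"
echo "both=[$both]"
echo "none=[$none]"
```

The order matters in the last case: > /dev/null is applied first, and 2>&1 then points stderr at wherever stdout currently goes.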

Arrays and associative arrays

There are many ways to define an array. The most common is to list the values on a single line:

[root@host3 ~]# array_var=(1 2 3 4 5 6 6 6)
[root@host3 ~]# echo ${array_var[*]}  #Print all values in array, mode 1
1 2 3 4 5 6 6 6
[root@host3 ~]# echo ${array_var[@]}  #Print all values in array, mode 2
1 2 3 4 5 6 6 6
[root@host3 ~]# echo ${#array_var[*]} #Print array length
8

Associative arrays are similar to dictionaries. You can define your own keys and list the keys of the array
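
A minimal sketch of an associative array, assuming bash 4 or later (the array name fruit_price and its contents are made up for illustration):

```shell
#!/bin/bash
# Associative arrays must be declared with -A before use
declare -A fruit_price
fruit_price[apple]=100
fruit_price[orange]=150

echo "apple costs ${fruit_price[apple]}"

# ${!array[@]} lists the keys; their order is not guaranteed
for key in "${!fruit_price[@]}"; do
  echo "$key -> ${fruit_price[$key]}"
done

key_count=${#fruit_price[@]}
echo "number of keys: $key_count"
```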

Get terminal information

tput sc #Store cursor position
tput rc #Restore cursor
tput ed #Clear everything from the cursor to the end of the screen

Generate delay in script

Count down:

#!/bin/bash
echo -n Count:
tput sc

count=11;
while true;
do
  if [ $count -gt 0 ];
  then
    let count--;
    sleep 1;
    tput rc
    tput ed
    echo -n $count;
  else exit 0;
  fi
done

#In this example, the variable count starts at 11 and is decremented by 1 on every iteration. tput sc stores the cursor position. On each iteration the stored position is restored with tput rc, tput ed clears everything from the cursor to the end of the screen so that the old value disappears, and the new count is printed in its place.

Functions and parameters

Defining a function

function fname()
{
statements;
}

//Or:

fname()
{
statements;
}

To call a function, just use its name
```
fname; #Execution function
```
Arguments can be passed to a function and accessed inside it
```
fname arg1 arg2;
```

Methods of accessing function parameters

fname()
{
echo $1,$2; #Access parameters 1 and 2
echo "$@"; #Print all parameters as a list, each one a separate word
echo "$*"; #Similar to $@, but the parameters are joined into a single string
echo "$#"; #$# is the number of parameters passed to the script or function
return 0;  #Return value
}
#"$@" is used far more often than "$*", because "$*" merges all parameters into one string
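
The difference between "$@" and "$*" shows up as soon as an argument contains a space. A small sketch, assuming two hypothetical helpers, count_args and demo:

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many arguments it received
count_args() { echo "$#"; }

demo() {
  at_count=$(count_args "$@")     # each parameter stays a separate word
  star_count=$(count_args "$*")   # all parameters are joined into one word
}

demo "first arg" second third
echo "\"\$@\" passed on $at_count arguments"
echo "\"\$*\" passed on $star_count argument"
```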

Function recursion
In bash, functions also support recursion (a function can call itself), for example
```
F() { echo $1; F hello; sleep 1; }
```
fork bomb
```
:(){ :|:& };:

#This recursive function calls itself endlessly, spawning new processes until it causes a denial of service. Each call is put in the background with &. This dangerous code forks a huge number of processes, hence the name fork bomb

[root@host3 ~]#  :(){ :|:& };:
[1] 2526
[root@host3 ~]# 
[1]+ Done

The system crashed.
```
This is hard to read. It can be reformatted as follows:
```
:()
{
:|:&
};
:
```
Renaming the function makes it even clearer:
```
bomb()
{
bomb|bomb&
};
bomb
```
Because the function keyword can be omitted in the shell, the 13 characters above define a function and then call it. The function is named :, and its core is :|:&, a recursive call to the function itself. The pipe makes each call start two more copies, so the number of processes grows geometrically, and & runs them in the background. The final : calls the function and detonates the bomb. Within a few seconds the system stops responding because it cannot handle so many processes. Usually the only way out is to restart

Prevention

Of course, the fork bomb is nothing special. You can write one in minutes in other languages, for example, a Python version:

  import os
  while True: 
      os.fork()

The essence of the fork bomb is simply to exhaust system resources by creating processes. In Linux, the ulimit command restricts certain behaviors of users. Running ulimit -a shows which limits can be set:

[root@host3 ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 7675
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 655350
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 100
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

As you can see, the -u parameter limits the number of processes a user can create, so ulimit -u 100 allows a user at most 100 processes. This stops the bomb, but not permanently: the setting is lost once the terminal is closed. For lasting protection, edit /etc/security/limits.conf and add the following lines to the file

*       soft    nproc   100
*       hard    nproc   100

Read command return value (status)

$? gives the return value of the last command

The return value is called the exit status. It can be used to analyze whether the command is executed successfully. If it is successful, the exit status is 0, otherwise it is not 0
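
A short sketch of reading the exit status, assuming a hypothetical function is_even whose status reports whether its argument is even:

```shell
#!/bin/bash
# is_even is a hypothetical function: exit status 0 if the argument is even
is_even() {
  [ $(( $1 % 2 )) -eq 0 ]   # the status of the last command becomes the function's status
}

is_even 4
status_even=$?

# || runs its right-hand side only when the command fails, so $? is the failure status
is_even 7 || status_odd=$?

echo "is_even 4 -> $status_even"
echo "is_even 7 -> $status_odd"

# The status can also drive if directly, without reading $?
if is_even 10; then
  echo "10 is even"
fi
```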

Read the output of the command sequence into a variable
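
A minimal sketch of capturing command output in a variable, using command substitution (the variable names and sample data are made up for illustration):

```shell
#!/bin/bash
# $( ) runs the pipeline and captures its standard output
smallest=$(printf '3\n1\n2\n' | sort -n | head -n 1)

# Backquotes are the older, equivalent form
upper=`echo hello | tr 'a-z' 'A-Z'`

echo "smallest: $smallest"
echo "upper: $upper"
```

The $( ) form nests more easily than backquotes, which is why it is usually preferred in new scripts.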

Generating an independent process by using a subshell

The subshell is a separate process. You can use the () operator to define a subshell:

pwd;
(cd /bin; ls);
pwd;

#When a command runs in a subshell, it has no effect on the current shell, and all changes are limited to the inside of the subshell. For example, when cd changes the current directory of the subshell, this change will not be reflected in the main shell environment
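
The isolation applies to variables as well as to the working directory. A small sketch (start_dir and counter are names chosen here for illustration):

```shell
#!/bin/bash
start_dir=$PWD
counter=1

# Everything inside ( ) runs in a subshell
(
  cd / || exit 1
  counter=99
)

echo "directory is still $PWD"
echo "counter is still $counter"
```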

The read command

read reads text from the keyboard or from standard input, which lets a script take input from the user interactively.
Input libraries in most programming languages read from the keyboard, but the input is only complete once the Enter key is pressed.
read provides ways to finish the input without the Enter key

Read n characters and store them in the variable name

read -p "Enter input:" var
#Prompt read

read -n number_of_chars name

read -n 3 var
echo $var

Use a specific delimiter as the end of the input line

read -d ":" var
echo $var

#The input line ends at the first colon instead of at Enter

Run command until successful execution

Define the function as follows:

repeat() { while true; do "$@" && return; done }

#We created the repeat function, which contains an infinite loop that runs the command passed in as arguments (accessed through "$@"). If the command succeeds, the function returns and the loop ends

A faster approach:

On most modern systems, true is also implemented as a binary. A shell that has to run that binary spawns a process on every iteration of the while loop. To avoid this, use the shell built-in ":", which always returns exit code 0:

repeat() { while :; do "$@" && return; done }

#It is less readable, but it is never slower than the previous method

Increase delay

Suppose you need to download a file from the internet that is temporarily unavailable but will be after a while. The method is as follows:

repeat wget -c http://abc.test.com/software.tar.gz

#In this form the request is re-sent without any pause, which generates a lot of traffic and may affect the server. We can modify the function and add a short delay

repeat() { while :; do "$@" && return; sleep 30; done }

#This makes the command run every 30 seconds
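
An endless loop is not always wanted. The sketch below is a hypothetical variant, repeat_max, that gives up after a fixed number of attempts; the name and the argument order are assumptions made for this example:

```shell
#!/bin/bash
# repeat_max is a hypothetical variant of repeat with an upper bound on retries.
# Usage: repeat_max ATTEMPTS DELAY_SECONDS command [args...]
repeat_max() {
  local attempts=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0                 # success: stop retrying
    if ((i < attempts)); then
      sleep "$delay"                 # wait before the next attempt
    fi
  done
  return 1                           # every attempt failed
}

# 'false' never succeeds, so this reports failure after 3 tries
if repeat_max 3 0 false; then
  echo "succeeded"
else
  echo "gave up after 3 attempts"
fi
```

With the download above this would be called as repeat_max 10 30 wget -c followed by the URL.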

Field separator and iterator

IFS (internal field separator) is an important concept in shell scripting. It is the environment variable that stores the delimiters, that is, the default delimiter string used in the current shell environment

The default value of IFS is whitespace (line break, tab, or space). For example, the shell splits the following string on whitespace by default

[root@host3 ~]# data="abc eee ddd fff"
[root@host3 ~]# for item in $data; do echo ITEM: $item; done
ITEM: abc
ITEM: eee
ITEM: ddd
ITEM: fff

//Implementation:
list1="1 2 3 3 4 4"
for line in $list1
do
echo $line;
done

//Output:
1
2
3
3
4
4

//Implementation:
for line in 1 2 3 3 4 4 #If the list after in is enclosed in quotation marks, it is treated as a single string
do
echo $line;
done

//Same output:
1
2
3
3
4
4

Next, we can change IFS to a comma:

#IFS has not been modified yet, so it is still whitespace and the data is printed as a single string
[root@host3 ~]# data1="eee,eee,111,222"
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee,eee,111,222
[root@host3 ~]# oldIFS=$IFS  #Back up the current IFS as oldIFS so that it can be restored later
[root@host3 ~]# IFS=,  #Change IFS to a comma after the backup; the comma is now the separator
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee
ITEM: eee
ITEM: 111
ITEM: 222
[root@host3 ~]# IFS=$oldIFS #Restore IFS to its original value
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee,eee,111,222

So after changing IFS, remember to restore it to its original value
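
When the goal is only to split one string, the backup and restore steps can be avoided by setting IFS for a single read command. A minimal sketch, reusing the comma-separated sample string (the array name fields is made up for illustration):

```shell
#!/bin/bash
data1="eee,eee,111,222"

# Prefixing the assignment to read limits the IFS change to that one command,
# so the shell's own IFS never needs to be saved and restored
IFS=, read -r -a fields <<< "$data1"

for item in "${fields[@]}"; do
  echo "ITEM: $item"
done
field_count=${#fields[@]}
echo "fields: $field_count"
```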

for loop

for var in list;
do
  commands;
done

list can be a string or a sequence
{1..50} generates the list of numbers from 1 to 50
{a..z}, {A..Z} or {a..h} generates a list of letters

for can also take the form of the for loop in C

for ((i=0;i<10;i++))
do
  commands; #Use variable $i
done

while Loop

while condition
do
  commands;
done

until loop

It keeps looping until the given condition becomes true

x=0;
until [ $x -eq 9 ];
do
  let x++; echo $x;
done

Comparison and testing

Flow control in a program is handled by comparison and test statements. We can test with if, if else and logical operators, and compare data with comparison operators. In addition, there is a test command for testing

if condition;
then
  commands;
fi

if condition;
then
  commands;
elif condition; then
  commands;
else
  commands;
fi

if and else statements can be nested, which quickly becomes very long. Logical operators can shorten them

[ condition ] && action; #If the condition is true, execute action;
[ condition ] || action; #If the condition is false, execute action;

Arithmetic comparison

Conditions are usually placed in square brackets. Note that there must be a space between [ or ] and the operands. If the space is missing, an error is reported

Arithmetic tests:

[ $var -eq 0 ]  #Returns true when $var equals 0
[ $var -ne 0 ]  #Returns true when $var is non-zero

Other operators:
-gt: greater than
-lt: less than
-ge: greater than or equal to
-le: less than or equal to

Multi-condition tests:
[ $var1 -ne 0 -a $var2 -gt 2 ]  #Logical AND: -a
[ $var1 -ne 0 -o $var2 -gt 2 ]  #Logical OR: -o

File system related tests

We can use different condition flags to test different file system properties:

[ -f file ]  #True if file is the path or name of a regular file
[ -x file ]  #True if file is executable
[ -d file ]  #True if file is a directory
[ -e file ]  #True if file exists
[ -w file ]  #True if file is writable
[ -r file ]  #True if file is readable
[ -L file ]  #True if file is a symbolic link

The method of use is as follows:

fpath="/etc/passwd"
if [ -e $fpath ];then
  echo File exists;
else
  echo Does not exist;
fi
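
Several of these tests can be chained with elif. The sketch below assumes a hypothetical helper, describe_path, that reports what kind of object a path refers to; the sample paths are only illustrations:

```shell
#!/bin/bash
# describe_path is a hypothetical helper that applies the file tests in turn
describe_path() {
  local path=$1
  if [ -L "$path" ]; then          # check symlinks first: -d and -f follow links
    echo "symlink"
  elif [ -d "$path" ]; then
    echo "directory"
  elif [ -f "$path" ]; then
    echo "regular file"
  elif [ -e "$path" ]; then
    echo "other"
  else
    echo "missing"
  fi
}

describe_path /etc/passwd
describe_path /
describe_path /no/such/path
```

Quoting "$path" keeps the tests correct for names that contain spaces.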

string comparison

When comparing strings, it is better to use double brackets, because single brackets sometimes cause errors (for example with empty or unquoted variables)

You can test the two strings to see if they are the same

[[ $str1 = $str2 ]]
Or:
[[ $str1 == $str2 ]]
Conversely:
[[ $str1 != $str2 ]]

[[ -z $str1 ]]  #Returns true if str1 is an empty string
[[ -n $str1 ]]  #Returns true if str1 is a non-empty string

Note that there must be a space before and after =. Without the spaces it is not a comparison but an assignment
Multiple conditions are easily combined with the logical operators && and ||
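
A brief sketch of combining string tests inside double brackets (the variable names and values are made up for illustration):

```shell
#!/bin/bash
str1="hello"
str2=""

# && inside [[ ]] requires both conditions to hold
if [[ -n $str1 && -z $str2 ]]; then
  result="str1 is set and str2 is empty"
fi

# || requires at least one of them to hold
if [[ $str1 == "world" || $str1 == "hello" ]]; then
  match="yes"
else
  match="no"
fi

echo "$result"
echo "$match"
```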

cat

General form:

cat file1 file2 file3 ...
This command concatenates the contents of the files given as command line arguments

The tac command prints in reverse order, the opposite of cat

Similarly, cat can concatenate standard input with the contents of a file, as follows:

echo "111111" |cat  - /etc/passwd
 In the code above, - stands for the file name of the stdin text

Show tabs as ^I

For example, in a Python program, indentation with tabs differs from indentation with spaces. If a tab is used where spaces are expected, an indentation error occurs, and such an error is hard to spot in a text editor alone

In this case, the -T option displays tabs, marked as ^I

[root@host3 ~]# cat bbb.sh 
for line in "1 2 3 3 4 4"
do
    echo $line;
done

[root@host3 ~]# cat -T bbb.sh 
for line in "1 2 3 3 4 4"
do
^Iecho $line;
done

Line numbers: cat -n

#-n also numbers blank lines; to skip blank lines, use the -b option

[root@host3 ~]# cat bbb.sh 
for line in "1 2 3 3 4 4"

do
    echo $line;
done
[root@host3 ~]# cat -n bbb.sh 
     1  for line in "1 2 3 3 4 4"
     2  
     3  do
     4      echo $line;
     5  done
[root@host3 ~]# cat -b bbb.sh 
     1  for line in "1 2 3 3 4 4"

     2  do
     3      echo $line;
     4  done

find

The find command works by traversing down the file hierarchy, matching qualified files, and performing corresponding operations

find /etc #List all files and folders in the directory, including hidden files

find -iname matches names while ignoring case

#To match any of several patterns, use an OR condition, for example to find all .txt and .conf files under /etc
find /etc  \( -name "*.txt" -o -name "*.conf" \)
find /etc  \( -name "*.txt" -o -name "*.conf" \) -print
#\( and \) group -name "*.txt" -o -name "*.conf" into a single expression

#-name matches file names, -path matches file paths; wildcards are available in both
find  / -path "*/etc/*" -print
#Prints every path that contains /etc/

#-regex matches with a regular expression, which is more powerful. For example, an email address has the form name@host.root, which translates roughly into:
#[a-z0-9]+@[a-z0-9]+\.[a-z0-9]+
#The symbol + means that the character class before it can occur one or more times.
find /etc -regex ".*\(\.py\|\.sh\)$"
#Find all files ending in .py or .sh (in find's default regex syntax, alternation is written \|)
#-iregex ignores case, just like -iname

#-regex is also a test. One thing to note: -regex matches not the bare file name but the whole path. For example, if the current directory contains a file "abar9", the pattern "ab.*9" finds nothing. The correct pattern is ".*ab.*9" or ".*/ab.*9".

find . -regex ".*/[0-9]*/.c" -print

Negating a condition

find /etc ! -name "*.conf" -print

Search based on directory depth

We can use the depth options -maxdepth and -mindepth to limit the directory depth traversed by the find command

[root@host3 ~]# find /etc -maxdepth 1 -name "*.conf" -print
/etc/resolv.conf
/etc/dracut.conf
/etc/host.conf

[root@host3 ~]# find /etc -maxdepth 2 -name "*.conf" -print
/etc/resolv.conf
/etc/depmod.d/dist.conf

[root@host3 ~]# find /etc -mindepth 4 -name "*.conf" -print
/etc/openldap/slapd.d/openldap/ldap.conf
/etc/openldap/slapd.d/openldap/schema/schema_convert.conf
/etc/openldap/slapd.d/openldap/slapd.conf

Time based search

-atime: last access time
-mtime: last modification time
-ctime: last change time of the file metadata (such as permissions or ownership)
These are all measured in days
There are also minute-based versions:
-amin
-mmin
-cmin

-newer takes a reference file and compares timestamps, finding files that are newer than the reference
[root@host3 ~]# find /etc -type f -newer /etc/passwd -print
/etc/resolv.conf
/etc/shadow
/etc/ld.so.cache
/etc/cni/net.d/calico-kubeconfig

Search based on file size

find /etc -type f -size +2k  #Greater than 2k
find /etc -type f -size -2k  #Less than 2k
find /etc -type f -size 2k  #Equal to 2k

Delete matching files

find ./ -type f -name "*.txt" -delete

Search based on file permissions and ownership

find /etc -type f -perm 644

find /etc -type f -name "*.conf"  ! -perm 644

Search based on users

find /etc -type f -user USER

To execute a command or action

find /etc -type f -user root -exec chown mysql {} \;
#Change the owner of every file owned by root to mysql

#{} is a special string used with the -exec option. For each matching file, {} is replaced with the file name.

Another example is concatenating the contents of all files in a given directory into a single file. We can find all the .conf files with find and then run cat through -exec:

find /etc/ -type f -name "*.conf" -exec cat {} \;>all.txt
#Write the contents of all .conf files to all.txt
#> is enough here, and >> is not needed, because the whole output of find is a single stream (stdout); >> is only necessary when several streams are appended to one file

#The following command copies the .txt files last modified more than 10 days ago to the OLD directory:
find /etc -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

Let find skip some directories

Sometimes, to improve performance, you need to skip some directories. With git, for example, every repository contains a .git directory, and these should be skipped.

find /etc \( -name ".git" -prune \) -o \( -type f -print  \)

#\( -name ".git" -prune \) does the exclusion, while \( -type f -print \) specifies the action to perform.

Playing with xargs

The xargs command reformats the data received from stdin and supplies it as arguments to other commands

As an alternative, xargs can do the same job as -exec in the find command

Converting multi-line input to single-line output only requires replacing each line break with a space, which is what xargs does by default

[root@host3 ~]# cat 123.txt 
1 2 3 4 5
6 7 8 9
10 11 12 13 14

[root@host3 ~]# cat 123.txt |xargs
1 2 3 4 5 6 7 8 9 10 11 12 13 14

To convert single-line input to multi-line output, specify the maximum number of arguments per line with -n. Any text from stdin can then be split into lines of n arguments each. An argument is a string delimited by spaces; space is the default delimiter.

[root@host3 ~]# cat 123.txt 
1 2 3 4 5
6 7 8 9
10 11 12 13 14

[root@host3 ~]# cat 123.txt |xargs -n 3
1 2 3
4 5 6
7 8 9
10 11 12
13 14

[root@host3 ~]# echo  1 3 4 5 6 7 8 |xargs -n 3
1 3 4
5 6 7
8

Custom delimiters can be used to split arguments. Specify a custom delimiter for the input with the -d option

[root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T
abc dslfj dshfs 1111 fd222
#Use letter T as separator

#The number of arguments per output line can be set together with the delimiter
[root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T -n 2
abc dslfj
dshfs 1111
fd222

#Output one parameter per line
[root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T -n 1
abc
dslfj
dshfs
1111
fd222

Sub shell

cmd0 | (cmd1;cmd2;cmd3) | cmd4

The part in parentheses is the subshell. Any changes made by the commands inside it take effect only in the subshell

The difference between -print and -print0

-print appends a line break after each result, while -print0 terminates each result with a NUL character instead, so the names run together on the terminal.
[root@AaronWong shell_test]# find /home/AaronWong/ABC/ -type f -print
/home/AaronWong/ABC/libcvaux.so
/home/AaronWong/ABC/libgomp.so.1
/home/AaronWong/ABC/libcvaux.so.4
/home/AaronWong/ABC/libcv.so
/home/AaronWong/ABC/libhighgui.so.4
/home/AaronWong/ABC/libcxcore.so
/home/AaronWong/ABC/libhighgui.so
/home/AaronWong/ABC/libcxcore.so.4
/home/AaronWong/ABC/libcv.so.4
/home/AaronWong/ABC/libgomp.so
/home/AaronWong/ABC/libz.so
/home/AaronWong/ABC/libz.so.1
[root@AaronWong shell_test]# find /home/AaronWong/ABC/ -type f -print0
/home/AaronWong/ABC/libcvaux.so/home/AaronWong/ABC/libgomp.so.1/home/AaronWong/ABC/libcvaux.so.4/home/AaronWong/ABC/libcv.so/home/AaronWong/ABC/libhighgui.so.4/home/AaronWong/ABC/libcxcore.so/home/AaronWong/ABC/libhighgui.so/home/AaronWong/ABC/libcxcore.so.4/home/AaronWong/ABC/libcv.so.4/home/AaronWong/ABC/libgomp.so/home/AaronWong/ABC/libz.so/home/AaronWong/ABC/libz.so.1
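
The NUL terminator is meant to be consumed by programs rather than read on screen. A common pairing is xargs -0, sketched below on a throwaway directory (the file names are made up for illustration):

```shell
#!/bin/bash
# Work in a temporary directory so that nothing real is touched
workdir=$(mktemp -d)
touch "$workdir/plain.txt" "$workdir/with space.txt"

# With newline-separated names, xargs would split "with space.txt" into two arguments.
# With NUL-separated names, each file name arrives as exactly one argument.
count=$(find "$workdir" -type f -name "*.txt" -print0 | xargs -0 -n 1 echo | grep -c 'txt$')
echo "files seen by xargs -0: $count"

rm -rf "$workdir"
```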

tr

tr only reads from stdin (standard input); it cannot take its input from command line arguments. Its call format is:
tr [option] set1 set2

Convert tabs to spaces: tr '\t' ' ' < file.txt

[root@host3 ~]# cat -T 123.txt 
1 2 3 4 5
6 7 8 9
^I10 11 12 13 14

[root@host3 ~]# tr '\t' '    ' < 123.txt 
1 2 3 4 5
6 7 8 9
 10 11 12 13 14

Delete characters with tr

tr has an option -d that removes specific characters from stdin; the characters to delete are given as a set:

cat  file.txt |tr -d '[set1]'
#Only set1 is used, not set2

#Delete digits
[root@host3 ~]# echo "Hello 123 world 456" |tr -d '0-9'
Hello  world 

#Delete letters
[root@host3 ~]# echo "Hello 123 world 456" |tr -d 'A-Za-z'
 123  456

#Delete H
[root@host3 ~]# echo "Hello 123 world 456" |tr -d 'H'
ello 123 world 456
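
Besides deleting, tr maps each character of set1 to the character at the same position in set2, which gives a simple case conversion. A short sketch on the same sample string (upper and letters are names chosen here):

```shell
#!/bin/bash
# Map every lowercase letter in set1 to the uppercase letter at the same position in set2
upper=$(echo "Hello 123 world 456" | tr 'a-z' 'A-Z')
echo "$upper"

# Combine with -d: delete the digits first, then convert the rest
letters=$(echo "Hello 123 world 456" | tr -d '0-9' | tr 'a-z' 'A-Z')
echo "[$letters]"
```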

Sorting, unique and duplicate

sort sorts text files and stdin. It usually works with other commands to produce the required output. uniq is often used together with sort; its job is to extract the unique lines from text or stdin.

#A group of files (such as file1.txt file2.txt) can be sorted in one go as follows:
[root@host3 ~]# sort /etc/passwd /etc/group 
adm:x:3:4:adm:/var/adm:/sbin/nologin
adm:x:4:
apache:x:48:
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
audio:x:63:
bin:x:1:
bin:x:1:1:bin:/bin:/sbin/nologin
caddy:x:996:
caddy:x:997:996:Caddy web server:/var/lib/caddy:/sbin/nologin
...

#You can also merge and sort and redirect to a new file
sort /etc/passwd /etc/group > abc.txt

#Sort by number
sort -n

#Reverse sorting
sort -r

#Sort by month
sort -M month.txt

#Merge two sorted files
sort -m sorted1 sorted2

#Remove duplicate lines from the sorted output
sort file1.txt file2.txt |uniq

Check whether a file is sorted:

To check whether a file is already sorted, use the following method. If the file is sorted, sort -C returns the exit code ($?) 0; otherwise it returns non-zero

#!/bin/bash
sort -C filename;
if [ $? -eq 0 ]; then
  echo Sorted;
else
  echo Unsorted;
fi

The sort command has many options. When uniq is used, sort matters even more, because uniq requires sorted input

More complex tasks with sort

#-k specifies which column to sort by, -r reverse, -n number
sort -nrk 1 data.txt
sort -k 2 data.txt

uniq

uniq can only be applied to the data input in the sorted order

[root@host3 ~]# cat data.txt 
1010hellothis
3333
  2189ababbba
333
 7464dfddfdfd
333

#Duplicate removal
[root@host3 ~]# sort data.txt |uniq
1010hellothis
  2189ababbba
333
3333
 7464dfddfdfd

#Remove duplicates and count the occurrences of each line
[root@host3 ~]# sort data.txt |uniq -c
      1 1010hellothis
      1   2189ababbba
      2 333
      1 3333
      1  7464dfddfdfd

#Show only lines that do not have duplicates in the text
[root@host3 ~]# sort data.txt |uniq -u
1010hellothis
  2189ababbba
3333
 7464dfddfdfd

#Show only duplicate lines in text
[root@host3 ~]# sort data.txt |uniq -d
333
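
Chaining a second sort after uniq -c turns the counts into a frequency ranking. A minimal sketch on made-up input (ranking and most_common are names chosen here):

```shell
#!/bin/bash
# Count each distinct line, then order the counts from highest to lowest
ranking=$(printf '%s\n' b a c a b a | sort | uniq -c | sort -nr)
echo "$ranking"

# The first line holds the most frequent value; awk drops the count column
most_common=$(echo "$ranking" | head -n 1 | awk '{print $2}')
echo "most common: $most_common"
```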

Temporary file name and random number

When writing shell scripts, we often need to store temporary data. The best place for it is /tmp (the contents of this directory are cleared when the system restarts). mktemp generates standard names for temporary files and directories

[root@host3 ~]# file1=`mktemp`
[root@host3 ~]# echo $file1
/tmp/tmp.P9var0Jjdw
[root@host3 ~]# cd /tmp/
[root@host3 tmp]# ls
add_user_ldapsync.ldif     create_module_config.ldif.bak   globalconfig.ldif       overlay.ldif
create_module_config.ldif  databaseconfig_nosyncrepl.ldif  initial_structure.ldif  tmp.P9var0Jjdw
#The above code creates a temporary file and prints out the file name

[root@host3 tmp]# dir1=`mktemp -d`
[root@host3 tmp]# echo $dir1
/tmp/tmp.UqEfHa389N
[root@host3 tmp]# ll
total 28
-r--------. 1 root root  130 Feb 12  2019 add_user_ldapsync.ldif
-r--------. 1 root root  329 Feb 14  2019 create_module_config.ldif
-r--------. 1 root root  329 Feb 12  2019 create_module_config.ldif.bak
-r--------. 1 root root 2458 Feb 14  2019 databaseconfig_nosyncrepl.ldif
-r--------. 1 root root  239 Feb 12  2019 globalconfig.ldif
-r--------. 1 root root  795 Feb 12  2019 initial_structure.ldif
-r--------. 1 root root  143 Feb 12  2019 overlay.ldif
-rw-------  1 root root    0 Sep 27 13:06 tmp.P9var0Jjdw
drwx------  2 root root    6 Sep 27 13:09 tmp.UqEfHa389N
#The above code creates a temporary directory and prints the directory name

[root@host3 tmp]# mktemp test1.XXX
test1.mBX
[root@host3 tmp]# mktemp test1.XXX
test1.wj1
[root@host3 tmp]# ll
total 28
-r--------. 1 root root  130 Feb 12  2019 add_user_ldapsync.ldif
-r--------. 1 root root  329 Feb 14  2019 create_module_config.ldif
-r--------. 1 root root  329 Feb 12  2019 create_module_config.ldif.bak
-r--------. 1 root root 2458 Feb 14  2019 databaseconfig_nosyncrepl.ldif
-r--------. 1 root root  239 Feb 12  2019 globalconfig.ldif
-r--------. 1 root root  795 Feb 12  2019 initial_structure.ldif
-r--------. 1 root root  143 Feb 12  2019 overlay.ldif
-rw-------  1 root root    0 Sep 27 13:12 test1.mBX
-rw-------  1 root root    0 Sep 27 13:12 test1.wj1
-rw-------  1 root root    0 Sep 27 13:06 tmp.P9var0Jjdw
drwx------  2 root root    6 Sep 27 13:09 tmp.UqEfHa389N
#The above creates temporary files from a template name. The XXX must be uppercase, and each X is replaced by a random letter or digit. Note that mktemp only works if the template contains at least three X's
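
Temporary files are easy to leave behind. One common pattern is to pair mktemp with a trap that removes the file on exit. The sketch below assumes a hypothetical function, with_tempfile, whose body runs in a subshell so that the trap stays local to it:

```shell
#!/bin/bash
# with_tempfile is a hypothetical helper; the ( ) body runs in a subshell,
# so the EXIT trap fires when the function finishes and affects nothing else
with_tempfile() (
  tmpfile=$(mktemp) || exit 1
  trap 'rm -f "$tmpfile"' EXIT

  echo "intermediate data" > "$tmpfile"
  cat "$tmpfile"
)

saved=$(with_tempfile)
echo "read back: $saved"
```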

split files and data

A large file can be split into several smaller files of a fixed size, for example a 100 KB data.txt into 10 KB pieces. In the session below, a 992K tarball is split into 100k pieces

[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# du -sh nginx-1.14.2.tar.gz 
992K    nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz 
[root@host3 src]# ll
total 1984
drwxr-xr-x 9 postgres mysql     186 Aug 15 19:50 nginx-1.14.2
-rw-r--r-- 1 root     root  1015384 Aug 16 10:44 nginx-1.14.2.tar.gz
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xaa
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xab
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xac
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xad
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xae
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xaf
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xag
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xah
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xai
-rw-r--r-- 1 root     root    93784 Sep 29 12:36 xaj
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz  xaa  xab  xac  xad  xae  xaf  xag  xah  xai  xaj
[root@host3 src]# du -sh *
32M nginx-1.14.2
992K    nginx-1.14.2.tar.gz
100K    xaa
100K    xab
100K    xac
100K    xad
100K    xae
100K    xaf
100K    xag
100K    xah
100K    xai
92K xaj
#As shown above, the 992K nginx tarball is split into 100k pieces; the last piece is only 92k

As can be seen above, letters are used as the suffix by default. To use numeric suffixes, add the -d option; -a length sets the suffix length

[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz -d -a 5
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz  x00000  x00001  x00002  x00003  x00004  x00005  x00006  x00007  x00008  x00009
[root@host3 src]# du -sh *
32M nginx-1.14.2
992K    nginx-1.14.2.tar.gz
100K    x00000
100K    x00001
100K    x00002
100K    x00003
100K    x00004
100K    x00005
100K    x00006
100K    x00007
100K    x00008
92K x00009
#The prefix is x and the suffix is 5 digits

Specify file name prefix

The files produced so far all use the default prefix x. We can supply our own prefix as the last argument of the split command

[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz -d -a 4 nginxfuck
[root@host3 src]# ls
nginx-1.14.2         nginxfuck0000  nginxfuck0002  nginxfuck0004  nginxfuck0006  nginxfuck0008
nginx-1.14.2.tar.gz  nginxfuck0001  nginxfuck0003  nginxfuck0005  nginxfuck0007  nginxfuck0009
#As above, the last parameter specifies a prefix

If we don't want to split by size, we can split by number of lines with -l

[root@host3 test]# ls
data.txt
[root@host3 test]# wc -l data.txt 
7474 data.txt
[root@host3 test]# split -l 1000 data.txt -d -a 4 conf
[root@host3 test]# ls
conf0000  conf0001  conf0002  conf0003  conf0004  conf0005  conf0006  conf0007  data.txt
[root@host3 test]# du -sh *
40K conf0000
48K conf0001
48K conf0002
36K conf0003
36K conf0004
36K conf0005
36K conf0006
20K conf0007
288K    data.txt
#The 7474-line file above is split into pieces of 1000 lines each (the last piece holds the remaining 474 lines). The file names start with conf, followed by 4 digits
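
The pieces produced by split can be joined again with cat. The following is a minimal sketch, assuming GNU coreutils; the scratch directory, the file name data.bin and the prefix part are made up for illustration.

```shell
#!/bin/bash
# Sketch: split a file into pieces, reassemble them, and compare with the original.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 250 KiB test file filled with zero bytes
head -c 256000 /dev/zero > data.bin

# 100k pieces, numeric 4-digit suffixes, prefix "part"
split -b 100k -d -a 4 data.bin part

# The shell expands part* in sorted order, which matches the split order
cat part* > rebuilt.bin

pieces=$(ls part* | wc -l)
if cmp -s data.bin rebuilt.bin; then
  rebuilt=identical
else
  rebuilt=different
fi
echo "$pieces pieces, rebuilt file is $rebuilt"
```

Fixed-width suffixes (-a) matter here: they keep the glob order equal to the split order even when there are many pieces.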

Splitting files with csplit

csplit is a variant of split that divides a file (a log file, for example) according to its context, such as lines matching a pattern

split can only divide a file by data size or number of lines, while csplit divides it according to the content of the file itself. The presence of a particular word or piece of text can serve as the split condition

[root@host3 test]# ls
data.txt
[root@host3 test]# cat data.txt 
SERVER-1
[conection] 192.168.0.1 success
[conection] 192.168.0.2 failed
[conection] 192.168.0.3 success
[conection] 192.168.0.4 success
SERVER-2
[conection] 192.168.0.5 success
[conection] 192.168.0.5 failed
[conection] 192.168.0.5 success
[conection] 192.168.0.5 success
SERVER-3
[conection] 192.168.0.6 success
[conection] 192.168.0.7 failed
[conection] 192.168.0.8 success
[conection] 192.168.0.9 success
[root@host3 test]# csplit data.txt /SERVER/ -n 2 -s {*} -f server -b "%02d.log";rm server00.log
rm: remove regular empty file 'server00.log'? y
[root@host3 test]# ls
data.txt  server01.log  server02.log  server03.log

Detailed description:

/SERVER/ is the pattern used to match lines; each split happens at a matching line
/[regex]/ is the general pattern form. csplit copies from the current line (initially the first line) up to, but not including, the next line that matches the pattern, here a line containing 'SERVER'
{*} repeats the split at every matching line until the end of the file. A specific number of repetitions can be given as {integer}
-s puts the command in silent mode so that it prints no other messages
-n specifies the number of digits used in the output file names
-f specifies the prefix of the output file names
-b specifies the suffix format, such as %02d.log, similar to printf in C

Because the first file after segmentation has no content (the matching word is on the first line of the file), we delete server00.log
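
As an alternative to deleting the empty file by hand, GNU csplit has a -z (--elide-empty-files) option. The sketch below assumes GNU csplit and uses a shortened, made-up copy of data.txt in a scratch directory.

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Small stand-in for the data.txt shown above
printf '%s\n' \
  'SERVER-1' '[conection] 192.168.0.1 success' \
  'SERVER-2' '[conection] 192.168.0.5 failed' \
  'SERVER-3' '[conection] 192.168.0.6 success' > data.txt

# -z drops the empty piece that would otherwise precede the first match
csplit -z -s -f server -b '%02d.log' data.txt /SERVER/ '{*}'

count=$(ls server*.log | wc -l)
set -- server*.log
first_line=$(head -n 1 "$1")
echo "$count files, the first one starts with: $first_line"
```

With -z no rm step is needed, which also avoids the interactive confirmation prompt.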

Splitting file names by extension

Some scripts work on file names. We may need to change the file name while keeping the extension, convert the file format (change the extension while keeping the name), or extract part of the file name. Several shell built-in features can split file names for these cases

The % operator makes it easy to extract the name part from a string of the form name.extension.

[root@host3 ~]# file_jpg="test.jpg"
[root@host3 ~]# name=${file_jpg%.*}
[root@host3 ~]# echo $name
test
#That is, the file name part is extracted

With the help of the # operator, the extension part of the file name can be extracted.

[root@host3 ~]# file_jpg="test.jpg"
[root@host3 ~]# exten=${file_jpg#*.}
[root@host3 ~]# echo $exten
jpg
#Extract the extension. Extracting the name above used the pattern .* while extracting the extension here uses *.

Explanation of the syntax above

Meaning of ${VAR%.*}:

Remove from $VAR the string matched by the wildcard pattern to the right of %, matching from right to left
Suppose VAR=test.jpg. The pattern .* then matches .jpg from the right, and removing that match from $VAR leaves test

% is a non-greedy operation: it removes the shortest match of the pattern, working from right to left. The %% operator is similar but greedy, meaning it removes the longest string that matches the pattern. Take VAR=hack.fun.book.txt as an example

Use the % operator:
[root@host3 ~]# VAR=hack.fun.book.txt
[root@host3 ~]# echo ${VAR%.*}
hack.fun.book

Use the %% operator:
[root@host3 ~]# echo ${VAR%%.*}
hack

Similarly, the # operator has a greedy counterpart, ##. Both match from left to right

Use the # operator:
[root@host3 ~]# echo ${VAR#*.}
fun.book.txt

Use the ## operator:
[root@host3 ~]# echo ${VAR##*.}
txt
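
The four operators can be combined to take a path apart. This is a small sketch; the path and the variable names are hypothetical.

```shell
#!/bin/bash
path=/tmp/backup/archive.tar.gz   # hypothetical path

dir=${path%/*}        # shortest match of /* removed from the right
base=${path##*/}      # longest match of */ removed from the left
name=${base%%.*}      # longest match of .* removed from the right
ext_all=${base#*.}    # shortest match of *. removed from the left
ext_last=${base##*.}  # longest match of *. removed from the left

echo "$dir $base $name $ext_all $ext_last"
```

For a name with several dots, ${base#*.} keeps everything after the first dot, while ${base##*.} keeps only what follows the last one.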

Bulk rename and move

We can do a lot by combining find, rename and mv

A simple way to rename the image files in the current directory to a specific format is the following script

#!/bin/bash
count=1;
for img in `find . -iname '*.png' -o -iname '*.jpg' -type f -maxdepth 1`
do
  new=image-$count.${img##*.}
  echo "Rename $img to $new"
  mv $img $new
  let count++
done

Execute the above script

[root@host3 ~]# ll
total 24
-rw-r--r--  1 root root    0 Oct  8 14:22 aaaaaa.jpg
-rw-r--r--  1 root root  190 Aug  9 13:51 aaa.sh
-rw-r--r--  1 root root 2168 Sep 24 10:15 abc.txt
-rw-r--r--  1 root root 3352 Sep 20 09:58 all.txt
-rw-------. 1 root root 1228 Jan  8  2019 anaconda-ks.cfg
-rw-r--r--  1 root root    0 Oct  8 14:22 bbbb.jpg
-rw-r--r--  1 root root   48 Sep 18 10:27 bbb.sh
-rw-r--r--  1 root root    0 Oct  8 14:22 cccc.png
drwxr-xr-x  2 root root  333 Apr 11 19:21 conf
-rw-r--r--  1 root root    0 Oct  8 14:22 dddd.png
-rw-r--r--  1 root root  190 Oct  8 14:22 rename.sh
[root@host3 ~]# sh rename.sh
find: warning: you have specified the -maxdepth option after a non-option argument -iname, but options are not positional (-maxdepth affects tests specified before it as well as those specified after it).  Please specify options before other arguments.

Rename ./aaaaaa.jpg to image-1.jpg
Rename ./bbbb.jpg to image-2.jpg
Rename ./cccc.png to image-3.png
Rename ./dddd.png to image-4.png
[root@host3 ~]# ls
aaa.sh  abc.txt  all.txt  anaconda-ks.cfg  bbb.sh  conf  image-1.jpg  image-2.jpg  image-3.png  image-4.png  rename.sh
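
The warning appears because -maxdepth is given after the tests. In addition, without parentheses -type f applies only to the '*.jpg' alternative. A possible variant that addresses both points, and also copes with spaces in file names, is sketched below. It assumes bash and GNU find/sort, and it runs in a scratch directory with made-up file names.

```shell
#!/bin/bash
# Sketch only: runs in a scratch directory with hypothetical empty image files
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'a b.jpg' c.PNG notes.txt

# Collect the names first so files renamed below are not picked up again.
# -maxdepth comes first; the -o alternatives are grouped with \( \)
files=()
while IFS= read -r -d '' img; do
  files+=("$img")
done < <(find . -maxdepth 1 -type f \( -iname '*.png' -o -iname '*.jpg' \) -print0 | sort -z)

count=1
for img in "${files[@]}"; do
  new="image-$count.${img##*.}"
  echo "Rename $img to $new"
  mv -- "$img" "$new"
  count=$((count + 1))
done
renamed=$(ls image-* | wc -l)
```

Quoting "$img" and "$new" is what keeps a name such as 'a b.jpg' from being split into two arguments.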

Interactive input automation

First, write a script that reads interactive input

#!/bin/bash
#File name: test.sh
read -p "Enter number:" no
read -p "Enter name:" name
echo $no,$name

Automatically send input to the script as follows:

[root@host3 ~]# ./test.sh 
Enter number:2
Enter name:rong
2,rong
[root@host3 ~]# echo -e "2\nrong\n" |./test.sh  
2,rong

# \n stands for a newline (the Enter key). We use echo -e to generate the input sequence; -e tells echo to interpret escape sequences. If there is a lot of input, it can be kept in a separate file and supplied with the redirection operator, as follows:
[root@host3 ~]# echo -e "2\nrong\n" > input.data
[root@host3 ~]# cat input.data 
2
rong

[root@host3 ~]# ./test.sh < input.data 
2,rong

#This method feeds the interactive input from a file
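
A here-document is a third way to supply the answers without a separate file. In this sketch the function ask is a hypothetical stand-in for test.sh.

```shell
#!/bin/bash
# Stand-in for test.sh: reads two answers from standard input
ask() {
  read -r no
  read -r name
  echo "$no,$name"
}

# The here-document supplies one answer per line
answer=$(ask <<'EOF'
2
rong
EOF
)
echo "$answer"
```

The same form works with a real script, for example ./test.sh <<'EOF' followed by the answers and a closing EOF line.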

If you work in reverse engineering, you have probably dealt with buffer overflows. To exploit one, we need to redirect shellcode given in hexadecimal form (for example "\xeb\x1a\x5e\x31\xc0\x88\x46"). These bytes cannot be typed directly because the keyboard has no corresponding keys. So we use:

echo -e "\xeb\x1a\x5e\x31\xc0\x88\x46"

This command lets us redirect the shellcode to the vulnerable executable. To handle dynamic input, that is, to supply input according to what the program asks for at run time, we need an excellent tool: expect.

The expect command supplies the appropriate input in response to the prompts it sees

Automation with expect

Most Linux distributions do not include expect by default, so you have to install it yourself: yum -y install expect

#!/usr/bin/expect
# File name expect.sh
spawn ./test.sh
expect "Enter number:"
send "2\n"
expect "Enter name:"
send "rong\n"
expect eof

#Run it
[root@host3 ~]# ./expect.sh 
spawn ./test.sh
Enter number:2
Enter name:rong
2,rong

spawn specifies the command or script to run
expect specifies the message to wait for
send specifies the message to send
expect eof marks the end of the command interaction

Using parallel processes to speed up command execution

Take the md5sum command as an example. Because of the computation involved, it is CPU intensive. If checksums are needed for multiple files, we can run them with the following script.

#!/bin/bash
PIDARRAY=()
for file in `find /etc/ -name "*.conf"`
do
  md5sum $file &
  PIDARRAY+=("$!")
done
wait ${PIDARRAY[@]}

#Run it:
[root@host3 ~]# sh expect.sh 
72688131394bcce818f818e2bae98846  /etc/modprobe.d/tuned.conf
77304062b81bc20cffce814ff6bf8ed5  /etc/modprobe.d/firewalld-sysctls.conf
649f5bf7c0c766969e40b54949a06866  /etc/dracut.conf
d0f5f705846350b43033834f51c9135c  /etc/prelink.conf.d/nss-softokn-prelink.conf
0335aabf8106f29f6857d74c98697542  /etc/prelink.conf.d/fipscheck.conf
0b501d6d547fa5bb989b9cb877fee8cb  /etc/modprobe.d/dccp-blacklist.conf
d779db0cc6135e09b4d146ca69d39c2b  /etc/rsyslog.d/listen.conf
4eaff8c463f8c4b6d68d7a7237ba862c  /etc/resolv.conf
321ec6fd36bce09ed68b854270b9136c  /etc/prelink.conf.d/grub2.conf
3a6a059e04b951923f6d83b7ed327e0e  /etc/depmod.d/dist.conf
7cb6c9cab8ec511882e0e05fceb87e45  /etc/systemd/bootchart.conf
2ad769b57d77224f7a460141e3f94258  /etc/systemd/coredump.conf
f55c94d000b5d62b5f06d38852977dd1  /etc/dbus-1/system.d/org.freedesktop.hostname1.conf
7e2c094c5009f9ec2748dce92f2209bd  /etc/dbus-1/system.d/org.freedesktop.import1.conf
5893ab03e7e96aa3759baceb4dd04190  /etc/dbus-1/system.d/org.freedesktop.locale1.conf
f0c4b315298d5d687e04183ca2e36079  /etc/dbus-1/system.d/org.freedesktop.login1.conf
···

#Since multiple md5sum commands run at the same time, the script finishes faster on a multi-core processor

How it works:

The bash operator & makes the shell put the command in the background and continue executing the script. This means that once the loop ends, the script would exit while the md5sum commands are still running in the background. To avoid this, we use $! to get the process PID. In bash, $! holds the PID of the most recent background process. We put these PIDs into an array and then use the wait command to wait for those processes to finish.
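
The script above starts one background process per file, which can be too many for a large file list. A possible alternative is to let xargs cap the number of parallel jobs. The sketch assumes an xargs that supports -P (GNU and BSD versions do) and uses made-up files in a scratch directory instead of /etc.

```shell
#!/bin/bash
workdir=$(mktemp -d)
for i in 1 2 3 4 5; do
  echo "setting $i" > "$workdir/file$i.conf"
done

# -print0 / -0 keep unusual file names intact; -n 1 starts one md5sum per
# file; -P 4 runs at most 4 of them at the same time
sums=$(find "$workdir" -name '*.conf' -print0 | xargs -0 -n 1 -P 4 md5sum)
sum_count=$(printf '%s\n' "$sums" | wc -l)
echo "$sum_count checksums computed"
```

The order of the output lines is not guaranteed, because the jobs finish independently.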

Intersection and difference of text files

The comm command can be used to compare two files

Intersection: print the lines common to both files
Difference: print the lines that are in one of the files but not in both
Set difference: print the lines contained in file A but not in the other file

Note that comm must be given sorted files as input

[root@host3 ~]# cat a.txt 
apple
orange
gold
silver
steel
iron
[root@host3 ~]# cat b.txt 
orange
gold
cookies
carrot
[root@host3 ~]# sort a.txt -o A.txt
[root@host3 ~]# vim A.txt 
[root@host3 ~]# sort b.txt -o B.txt
[root@host3 ~]# comm A.txt B.txt 
apple
      carrot
      cookies
              gold
iron
              orange
silver
steel
#You can see that the result has three columns. The first column contains the lines that exist only in A.txt, the second column the lines that exist only in B.txt, and the third column the lines that exist in both A.txt and B.txt. The columns are separated by the tab character (\t)

#To get the intersection, we suppress the first and second columns and display only the third
[root@host3 ~]# comm A.txt B.txt -1 -2
gold
orange

#Print only the lines that differ
[root@host3 ~]# comm A.txt B.txt -3
apple
      carrot
      cookies
iron            
silver
steel

#To make the result readable, remove the leading tab (\t)
[root@host3 ~]# comm A.txt B.txt -3 |sed 's/^\t//'
apple
carrot
cookies
iron
silver
steel
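
The intermediate sorted files can be avoided with bash process substitution. This is a sketch under that assumption; it recreates a.txt and b.txt in a scratch directory.

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf '%s\n' apple orange gold silver steel iron > "$workdir/a.txt"
printf '%s\n' orange gold cookies carrot > "$workdir/b.txt"

# comm and sort must agree on the collation order, so fix the locale for both
export LC_ALL=C

# -12 suppresses columns 1 and 2, leaving the lines common to both files
common=$(comm -12 <(sort "$workdir/a.txt") <(sort "$workdir/b.txt"))

# -23 suppresses columns 2 and 3, leaving the lines found only in a.txt
only_a=$(comm -23 <(sort "$workdir/a.txt") <(sort "$workdir/b.txt"))

echo "$common"
echo "$only_a"
```

The second command is the set difference described earlier: lines in the first file that do not appear in the second.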

Create a file that cannot be modified

Make a file immutable with chattr +i file

[root@host3 ~]# chattr +i passwd 
[root@host3 ~]# rm -rf passwd 
rm: cannot remove 'passwd': Operation not permitted
[root@host3 ~]# chattr -i passwd 
[root@host3 ~]# rm -rf passwd 
[root@host3 ~]#

grep

grep can search multiple files

[root@host3 ~]# grep root /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:

By default the grep command treats only some special characters in the pattern as special (basic regular expressions). To use extended regular expressions, add the -E option, or use the egrep command, which enables them by default (in the author's tests, simple patterns also work without -E)
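
The difference shows up as soon as the pattern uses grouping or alternation. A small sketch with made-up input:

```shell
#!/bin/bash
names=$(printf '%s\n' root docker nobody)

# Extended syntax: ( ) group and | means "or"
with_E=$(printf '%s\n' "$names" | grep -cE '^(root|docker)$' || true)

# Basic syntax: the same ( ) and | are ordinary characters, so nothing matches
without_E=$(printf '%s\n' "$names" | grep -c '^(root|docker)$' || true)

echo "with -E: $with_E, without -E: $without_E"
```

A plain word such as root contains no special characters, which is why it behaves the same with and without -E.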

#Count the number of lines in the text that contain a matching string
[root@host3 ~]# grep -c root /etc/passwd
3

#Print line number
[root@host3 ~]# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin
27:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin

#Search multiple files and list which files contain matching text with -l
[root@host3 ~]# grep root /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:

[root@host3 ~]# grep root /etc/passwd /etc/group -l
/etc/passwd
/etc/group

#-L does the opposite and lists the names of files without a match

#Ignore case: -i
#Match multiple patterns: -e
grep -e "pattern1" -e "pattern2"  #Match lines containing pattern1 or pattern2

[root@host3 ~]# grep -e root -e docker /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
/etc/group:docker:x:992:

#There is another way to specify multiple patterns: put them in a pattern file, one per line, and pass the file with -f. Note that pat.file must not contain blank lines (for example at the end), because an empty pattern matches every line
[root@host3 ~]# cat pat.file 
root
docker
[root@host3 ~]# grep -f pat.file /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
/etc/group:docker:x:992:

Specify or exclude some files in grep search

grep can include or exclude files in a search. We use wildcards to specify the files to include or exclude

#Recursively search all .c and .cpp files in the directory
grep root . -r --include='*.c' --include='*.cpp'

[root@host3 ~]# grep root /etc/ -r -l --include '*.conf'  # -l here means list file names only
/etc/systemd/logind.conf
/etc/dbus-1/system.d/org.freedesktop.hostname1.conf
/etc/dbus-1/system.d/org.freedesktop.import1.conf
/etc/dbus-1/system.d/org.freedesktop.locale1.conf
/etc/dbus-1/system.d/org.freedesktop.login1.conf
/etc/dbus-1/system.d/org.freedesktop.machine1.conf
/etc/dbus-1/system.d/org.freedesktop.systemd1.conf
/etc/dbus-1/system.d/org.freedesktop.timedate1.conf
/etc/dbus-1/system.d/wpa_supplicant.conf

#Exclude all README files from the search
grep root . -r --exclude "README"
#To exclude directories, use --exclude-dir. To read the list of exclusion patterns from a file, use --exclude-from FILE
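
The globs given to --include and --exclude are best quoted so that the shell passes them to grep unchanged. The following sketch assumes a grep with -r, --include and --exclude (GNU grep has them) and uses a made-up directory tree.

```shell
#!/bin/bash
workdir=$(mktemp -d)
mkdir "$workdir/src"
echo 'root' > "$workdir/src/a.c"
echo 'root' > "$workdir/src/b.cpp"
echo 'root' > "$workdir/README"

# Quoted globs reach grep unchanged; one --include per extension
matches=$(grep -rl root "$workdir" --include='*.c' --include='*.cpp' | sort)
match_count=$(printf '%s\n' "$matches" | wc -l)

# --exclude skips files whose base name matches the glob
without_readme=$(grep -rl root "$workdir" --exclude='README' | wc -l)
echo "$match_count included, $without_readme after exclusion"
```

An unquoted glob is expanded by the shell first whenever a matching file exists in the current directory, and grep then receives file names instead of the pattern.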

cut (omitted)

sed

#Remove blank lines
sed '/^$/d' file  # /pattern/d deletes the lines that match the pattern

#Edit the file in place, replacing every 3-digit number in the file with the specified text
[root@host3 ~]# cat sed.data 
11 abc 111 this 9 file contains 111 11 888 numbers 0000

[root@host3 ~]# sed -i 's/\b[0-9]\{3\}\b/NUMBER/g' sed.data 
[root@host3 ~]# cat sed.data 
11 abc NUMBER this 9 file contains NUMBER 11 NUMBER numbers 0000
#The command above replaces every 3-digit number. The regular expression \b[0-9]\{3\}\b matches a 3-digit number; [0-9] is the range of digits from 0 to 9
# \{3\} matches the preceding character class exactly 3 times; the \ characters escape the braces
# \b marks a word boundary

sed -i.bak 's/abc/def/' file
#In this case, sed not only replaces the content in the file but also creates a file named file.bak containing a copy of the original content. With GNU sed the suffix must follow -i directly, without a space
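
A quick way to see the backup behaviour, sketched with a throwaway file in a scratch directory:

```shell
#!/bin/bash
workdir=$(mktemp -d)
echo 'abc 123' > "$workdir/file"

# Edit in place and keep the original content as file.bak
sed -i.bak 's/abc/def/' "$workdir/file"

edited=$(cat "$workdir/file")
backup=$(cat "$workdir/file.bak")
echo "edited: $edited, backup: $backup"
```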

The matched-string marker &

In sed, & stands for the string matched by the pattern, so the matched text can be reused in the replacement

[root@host3 ~]# echo this is my sister |sed 's/\w\+/<&>/g' #Wrap every word in angle brackets
<this> <is> <my> <sister>
[root@host3 ~]# echo this is my sister |sed 's/\w\+/[&]/g' #Wrap every word in square brackets
[this] [is] [my] [sister]

#The regular expression \w\+ matches each word, and we replace it with [&], where & stands for the matched word

Quoting

sed expressions are usually enclosed in single quotes, but double quotes can also be used. Double quotes are useful when we want the shell to expand variables inside the sed expression

[root@host3 ~]# text=hello
[root@host3 ~]# echo hello world |sed "s/$text/HELLO/"
HELLO world
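
One thing to watch with variables in a sed expression is a value that contains the delimiter. A sketch with hypothetical variable names, using | as the delimiter:

```shell
#!/bin/bash
old_prefix=/usr/local
new_prefix=/opt

# With / as the delimiter the slashes inside the variables would break the
# expression, so | is used as the delimiter instead
result=$(echo 'PATH=/usr/local/bin' | sed "s|$old_prefix|$new_prefix|")
echo "$result"
```

Any character that does not occur in the values can serve as the delimiter of the s command.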

awk

Special variables:

NR: the number of records, which corresponds to the current line number during execution
NF: the number of fields in the current line during execution
$0: the text of the current line during execution

Usage rules:

Make sure the entire awk program is enclosed in single quotes
Make sure all quotes in the command appear in pairs
Make sure action statements are enclosed in curly braces and conditions in parentheses

awk -F: '{print NR}' /etc/passwd #Print the line number of each line
awk -F: '{print NF}' /etc/passwd #Print the number of fields in each line

[root@host3 ~]# cat passwd 
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
elk:x:1000:1000::/home/elk:/bin/bash
ntp:x:38:38::/etc/ntp:/sbin/nologin
saslauth:x:998:76:Saslauthd user:/run/saslauthd:/sbin/nologin
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
[root@host3 ~]# awk -F: '{print NR}' passwd 
1
2
3
4
5
6
7
8
[root@host3 ~]# awk -F: '{print NF}' passwd 
7
7
7
7
7
7
7
7
[root@host3 ~]# awk -F: 'END{print NF}' passwd 
7
[root@host3 ~]# awk -F: 'END{print NR}' passwd 
8
#Only the END block is used. awk updates NR each time it reads a line, so when END runs NR holds the number of the last line, which is the number of lines in the file

awk 'BEGIN{ print "start" } pattern { commands } END{ print "end" }'
An awk program can be enclosed in single or double quotes
awk 'BEGIN{ statements } { statements } END{ statements }'
An awk script usually consists of three parts: a BEGIN block, an END block, and a main block with an optional pattern. All three parts are optional

[root@host3 ~]# awk 'BEGIN{ i=0 } { i++ } END{ print i }' passwd 
8

String concatenation in awk:

[root@mgmt-k8smaster01 deployment]# docker images|grep veh
192.168.1.74:5000/veh/zuul                           0.0.1-SNAPSHOT.34        41e9c323b825        26 hours ago        172MB
192.168.1.74:5000/veh/vehicleanalysis                0.0.1-SNAPSHOT.38        bca9981ac781        26 hours ago        210MB
192.168.1.74:5000/veh/masterveh                      0.0.1-SNAPSHOT.88        265e448020f3        26 hours ago        209MB
192.168.1.74:5000/veh/obugateway                     0.0.1-SNAPSHOT.18        a4b3309beccd        8 days ago          182MB
192.168.1.74:5000/veh/frontend                       1.0.33                   357b20afec08        11 days ago         131MB
192.168.1.74:5000/veh/rtkconsumer                    0.0.1-SNAPSHOT.12        4c2e63b5b2f6        2 weeks ago         200MB
192.168.1.74:5000/veh/user                           0.0.1-SNAPSHOT.14        015fc6516533        2 weeks ago         186MB
192.168.1.74:5000/veh/rtkgw                          0.0.1-SNAPSHOT.12        a17a3eed4d28        2 months ago        173MB
192.168.1.74:5000/veh/websocket                      0.0.1-SNAPSHOT.7         a1af778846e6        2 months ago        179MB
192.168.1.74:5000/veh/vehconsumer                    0.0.1-SNAPSHOT.20        4a763860a5c5        2 months ago        200MB
192.168.1.74:5000/veh/dfconsumer                     0.0.1-SNAPSHOT.41        2e3471d6ca27        2 months ago        200MB
192.168.1.74:5000/veh/auth                           0.0.1-SNAPSHOT.4         be5c86dd285b        3 months ago        185MB
[root@mgmt-k8smaster01 deployment]# docker images |grep veh |awk '{a=$1;b=$2;c=(a":"b);print c}'
192.168.1.74:5000/veh/zuul:0.0.1-SNAPSHOT.34
192.168.1.74:5000/veh/vehicleanalysis:0.0.1-SNAPSHOT.38
192.168.1.74:5000/veh/masterveh:0.0.1-SNAPSHOT.88
192.168.1.74:5000/veh/obugateway:0.0.1-SNAPSHOT.18
192.168.1.74:5000/veh/frontend:1.0.33
192.168.1.74:5000/veh/rtkconsumer:0.0.1-SNAPSHOT.12
192.168.1.74:5000/veh/user:0.0.1-SNAPSHOT.14
192.168.1.74:5000/veh/rtkgw:0.0.1-SNAPSHOT.12
192.168.1.74:5000/veh/websocket:0.0.1-SNAPSHOT.7
192.168.1.74:5000/veh/vehconsumer:0.0.1-SNAPSHOT.20
192.168.1.74:5000/veh/dfconsumer:0.0.1-SNAPSHOT.41
192.168.1.74:5000/veh/auth:0.0.1-SNAPSHOT.4

awk works as follows:

1. Execute the BEGIN { commands } block
2. Read a line from the input and execute the main block pattern { commands }. Repeat until all input has been read
3. When the end of the input stream is reached, execute the END { commands } block

We can add up the values of the first field in each line, that is, sum a column

[root@host3 ~]# cat sum.data 
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3
5 5 5 6 6 6
[root@host3 ~]# cat sum.data |awk 'BEGIN{ sum=0 }  { print $1; sum+=$1 } END { print sum }'
1
2
3
5
11

[root@host3 ~]# awk '{if($2==3)print $0}' sum.data 
3 3 3 3 3 3
[root@host3 ~]# awk '{if($2==5)print $0}' sum.data 
5 5 5 6 6 6

#Add 1 to each value
[root@host2 ~]# cat passwd 
1:2:3:4
5:5:5:5
3:2:3:5
[root@host2 ~]# cat passwd |awk -F: '{for(i=1;i<=NF;i++){$i+=1}}{print $0}'
2 3 4 5
6 6 6 6
4 3 4 6

[root@host2 ~]# cat passwd |awk -F: '{$2=$2+1;print $0}' 
1 3 3 4
5 6 5 5
3 3 3 5
[root@host2 ~]# cat passwd |awk -F: '{if($2==2) $2=$2+1;print $0}'
1 3 3 4
5:5:5:5
3 3 3 5

#Replace the second field with a string wherever it equals 2. For stricter style, the assignment can also be enclosed in parentheses
[root@host2 ~]#  cat passwd |awk -F: '{if($2==2) $2="jack fuck";print $0}'       
1 jack fuck 3 4
5:5:5:5
3 jack fuck 3 5
[root@host2 ~]#  cat passwd |awk -F: '{if($2==2) ($2="jack fuck");print $0}'
1 jack fuck 3 4
5:5:5:5
3 jack fuck 3 5
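
In the outputs above, modified lines are printed with spaces while the untouched line 5:5:5:5 keeps its colons. This happens because awk rebuilds $0 with the output field separator OFS (a space by default) only when a field is assigned. A minimal sketch that keeps the colon everywhere by setting OFS:

```shell
#!/bin/bash
# -v OFS=: makes awk rebuild modified records with ":" instead of a space
result=$(printf '%s\n' '1:2:3:4' '5:5:5:5' |
  awk -F: -v OFS=: '{ if ($2 == 2) $2 = $2 + 1; print $0 }')
echo "$result"
```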

Pass external variables to awk

#With the -v option we can pass external values to awk
[root@host3 ~]# VAR1=10000
[root@host3 ~]# echo |awk -v VAR=$VAR1 '{print VAR}'
10000
#awk reads from standard input, so echo is used to supply an (empty) line

#There is another, more flexible way to pass several external variables to awk
[root@host3 ~]# VAR1=10000
[root@host3 ~]# VAR2=20000
[root@host3 ~]# echo |awk '{ print v1,v2 }' v1=$VAR1 v2=$VAR2
10000 20000
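
The two ways of passing variables differ in one respect: values given with -v are already set when the BEGIN block runs, while assignments placed after the program are not. A small sketch of the difference:

```shell
#!/bin/bash
# -v assignments happen before the program starts, so BEGIN can see them
from_option=$(awk -v v1=10000 'BEGIN { print "[" v1 "]" }')

# Trailing assignments are processed only when awk reaches them in the
# argument list, which is after BEGIN has already run
from_operand=$(awk 'BEGIN { print "[" v1 "]" }' v1=10000 /dev/null)

echo "$from_option $from_operand"
```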

Use patterns to filter the lines that awk processes

[root@host3 ~]# cat sum.data 
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3
5 5 5 6 6 6

#Lines with line number less than 3
[root@host3 ~]# awk 'NR<3' sum.data 
1 2 3 4 5 6
2 2 2 2 2 2

#Lines with line numbers from 1 to 3
[root@host3 ~]# awk 'NR==1,NR==3' sum.data 
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3

#Lines matching the pattern linux
awk '/linux/'

#Lines not matching the pattern linux
awk '!/linux/'

Merge multiple files by column

paste

wget

Download multiple files: wget URL1 URL2 URL3
wget sets the number of retries with -t; -t 0 retries indefinitely: wget -t 0 URL
Speed limit: wget --limit-rate 20k http://www.baidu.com
A quota can be set when downloading multiple files; once the quota is used up, the download stops: wget --quota 100M URL1 URL2 URL3
Resume an interrupted download: wget -c URL
To access an HTTP or FTP page that requires authentication: wget --user username --password password URL. To enter the password interactively instead of putting it on the command line, replace --password with --ask-password

Copy entire site (crawler)

wget has an option to recursively traverse all URL links on a web page like a crawler and download them one by one. So we can get all the pages of this site

wget --mirror --convert-links www.chinanews.com
[root@host3 tmp]# ls
www.chinanews.com
[root@host3 tmp]# cd www.chinanews.com/
[root@host3 www.chinanews.com]# ls
allspecial  auto   cj             common   gangao  hb  huaren      js          m     piaowu  robots.txt   sh      society  taiwan  tp
app         china  cns2012.shtml  fileftp  gn      hr  index.html  live.shtml  part  pv      scroll-news  shipin  stock    theory
[root@host3 www.chinanews.com]# ll
total 260
drwxr-xr-x 2 root root     25 Oct 12 14:11 allspecial
drwxr-xr-x 3 root root     23 Oct 12 14:11 app
drwxr-xr-x 3 root root     18 Oct 12 14:11 auto
drwxr-xr-x 2 root root     24 Oct 12 14:11 china
drwxr-xr-x 3 root root     18 Oct 12 14:11 cj
-rw-r--r-- 1 root root  15799 Oct 12 14:11 cns2012.shtml
drwxr-xr-x 3 root root     46 Oct 12 14:11 common
drwxr-xr-x 6 root root     54 Oct 12 14:11 fileftp
drwxr-xr-x 2 root root     24 Oct 12 14:11 gangao
drwxr-xr-x 4 root root     27 Oct 12 14:11 gn
drwxr-xr-x 2 root root     24 Oct 12 14:11 hb
drwxr-xr-x 3 root root     18 Oct 12 14:11 hr
drwxr-xr-x 2 root root     24 Oct 12 14:11 huaren
-rw-r--r-- 1 root root 184362 Oct 12 14:11 index.html
drwxr-xr-x 2 root root     26 Oct 12 14:11 js

#--convert-links tells wget to convert the links in the pages to local addresses

Download Web page as plain text

Web pages are downloaded as HTML by default and need a browser to view. lynx is a command-line browser with a lot to explore, and it can be used to get a web page as plain text

#Use the -dump option of lynx to store the content of a web page in a text file as ASCII text
[root@host3 tmp]# yum -y install lynx
[root@host3 tmp]# lynx www.chinanews.com -dump > abc.txt
[root@host3 tmp]# cat abc.txt
 ...
 1.   http://www.chinanews.com/kong/2019/10-12/8976714.shtml
 2.   http://www.chinanews.com/kong/2019/10-12/8976812.shtml
 3.   http://www.chinanews.com/kong/2019/10-12/8976721.shtml
 4.   http://www.chinanews.com/kong/2019/10-12/8976690.shtml
 5.   http://www.chinanews.com/kong/2019/10-12/8976817.shtml
 6.   http://www.chinanews.com/kong/2019/10-12/8976794.shtml
 7.   http://www.chinanews.com/kong/2019/10-12/8976853.shtml
 8.   http://www.chinanews.com/kong/2019/10-12/8976803.shtml
 9.   http://www.chinanews.com/sh/2019/10-12/8976754.shtml
 10.  http://www.chinanews.com/tp/chart/index.shtml
 11.  http://www.chinanews.com/tp/hd2011/2019/10-12/907641.shtml
 12.  http://www.chinanews.com/tp/hd2011/2019/10-12/907637.shtml
 13.  http://www.chinanews.com/tp/hd2011/2019/10-12/907651.shtml
 14.  http://www.chinanews.com/tp/hd2011/2019/10-12/907644.shtml
 15.  http://www.chinanews.com/tp/hd2011/2019/10-12/907675.shtml
 16.  http://www.chinanews.com/tp/hd2011/2019/10-12/907683.shtml
 17.  http://www.chinanews.com/tp/hd2011/2019/10-12/907656.shtml
 18.  http://www.ecns.cn/video/2019-10-12/detail-ifzpuyxh5816910.shtml
 19.  http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815962.shtml
 20.  http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815122.shtml
 21.  http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815100.shtml

curl

Set cookie

To specify cookies, use the --cookie "COOKIES" option

Cookies are given in the form name=value. Multiple cookies are separated by semicolons. For example: --cookie "user=slynux;pass=hack"

To save cookies to a file, use the --cookie-jar option. For example: --cookie-jar cookie_file

Set user agent string

If the user agent is not specified, some websites that check the user agent will not display their pages. You have probably met sites that work only under IE; with any other browser they tell you that they can only be viewed with IE. This is because these sites check the user agent. You can set the user agent with curl

The --user-agent or -A option sets the user agent: curl URL --user-agent "Mozilla/5.0"
-H passes header information and can be repeated for multiple headers: curl -H "Host: www.baidu.com" -H "Accept-Language: en" URL

Print only headers

-I or --head

[root@host3 tmp]# curl -I www.chinanews.com
HTTP/1.1 200 OK
Date: Sat, 12 Oct 2019 08:47:31 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Sat, 12 Oct 2019 08:48:22 GMT
Server: nginx/1.12.2
Cache-Control: max-age=120
Age: 69
X-Via: 1.1 PSbjwjBGP2ih137:5 (Cdn Cache Server V2.0), 1.1 shx92:3 (Cdn Cache Server V2.0), 1.1 PSjsczsxrq176:3 (Cdn Cache Server V2.0), 1.1 iyidong70:11 (Cdn Cache Server V2.0)

Analyze website data

lynx is a command-line web browser. Instead of printing raw HTML code, it displays the text version of the site, laid out the same way as the page we see in a graphical browser. This saves us from stripping the HTML tags ourselves. The -nolist option of lynx is used here because we do not need the links to be numbered automatically.

[root@host3 tmp]# lynx  www.chinanews.com -dump -nolist 
 ...
 Friendship link
   The Ministry of foreign affairs, the office of overseas Chinese, the supervision department of the Central Commission for Discipline Inspection, the office of Taiwan Affairs, the people's court network, the people's network, the Xinhua network, the China network, the CCTV network, the international online network, the China Youth Network, the China economic network, the Taiwan network, the CCTV network|
   Tibet network, China Youth Online, Guangming network, China military network, legal system network, China network, new Beijing News, Beijing News Network, Jinghua network, Sichuan Radio and television station, Qianlong Network, Hualong network, Hongwang network, Shunwang network, Jiaodong Online|
   Northeast news network | northeast network | Qilu hotline | Sichuan news network | Great Wall network | South Network | North network | East network | Sina | Sohu | Netease | Tencent | China Jingwei | East wealth network | financial sector | Huike | real estate world

   About us | about us | contact us | advertising service | contribution service | legal statement | Recruitment Information | website map
   |Message feedback

   The information published on this website does not represent the views of China News Agency and China news network. Articles published on this website shall be authorized in writing.

   Unauthorized reprint, excerpt, copy and establishment of image are prohibited. Violators will be prosecuted according to law.

   [license for online dissemination of audio-visual programs (0106168)] [jingicp Certificate No. 040655] [ghs.png] Jinggong network security
   11000002003042] [Jing ICP Bei 05004340-1] switchboard: 86-10-87826688
   Tel. of illegal and bad information report: 1569978800 email: jubao@chinanews.com.cn administrative measures for report acceptance and handling

   Copyright ©1999- 2019 chinanews.com. All Rights Reserved

                             [_1077593327_3.gif]

                  [U194P4T47D45262F978DT20190920162854.jpg]

                             [_1077593327_3.gif]

                  [U194P4T47D45262F979DT20190920162854.jpg]

case

case $variable in
    "value 1")
        If the value of the variable equals value 1, execute program 1
        ;;
    "value 2")
        If the value of the variable equals value 2, execute program 2
        ;;
    ... other branches omitted
    *)
        If the variable matches none of the values above, execute this program
        ;;
esac

#!/bin/bash
#Judge user input
read -p "Please choose yes/no: " -t 30 cho
#Print "Please choose yes/no: " on the screen, wait up to 30 seconds, and assign the user's answer to the variable cho
case "$cho" in
#Judge the value of variable cho
    "yes")
    #If it is yes
        echo "Your choice is yes!"
        #Then execute program 1
        ;;
    "no")
    #If it is no
        echo "Your choice is no!"
        #Then execute program 2
        ;;
    *)
    #If it is neither yes nor no
        echo "Your choice is invalid!"
        #Then execute this program
        ;;
esac
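
The patterns in a case branch are shell glob patterns, the same kind used with ls at the start of this article, and several patterns can share one branch when separated by |. The following is a minimal sketch of that idea, not part of the original script; the function name classify_answer is hypothetical and chosen only for illustration.

```bash
#!/bin/bash
# classify_answer: hypothetical helper that maps a user answer to yes/no/error.
# Each branch lists glob patterns; | separates alternatives within one branch.
classify_answer() {
    case "$1" in
        [Yy]|[Yy][Ee][Ss])
            echo "yes"      # matches y, Y, yes, YES, Yes, ...
            ;;
        [Nn]|[Nn][Oo])
            echo "no"       # matches n, N, no, NO, No, ...
            ;;
        *)
            echo "error"    # anything else, including an empty answer
            ;;
    esac
}

classify_answer "Yes"   # prints: yes
```

Because the bracket expressions accept either letter case, the user does not have to type the answer exactly as "yes" or "no".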

Find invalid links in Web site

Manually checking every page of a site for invalid links is tedious. The script below collects the links on a site and reports the invalid ones.

[root@host3 tmp]# cat find_broken.sh 
#!/bin/bash
if [ $# -ne 1 ];
then
  echo -e "Usage: $0 URL\n"
  exit 1;
fi

echo Broken links:

# $$ is the PID of the running script
mkdir /tmp/$$.lynx
cd /tmp/$$.lynx

lynx -traversal "$1" > /dev/null
count=0;

sort -u reject.dat > links.txt

while read link;
do
  output=`curl -I "$link" -s | grep "HTTP/.*OK"`
  if [[ -z $output ]];
  then
    echo "$link";
    let count++
  fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.

#lynx -traversal URL generates several files in the working directory, including reject.dat, which contains all the links in the site. sort -u is used to create a list without duplicates. We can then check the response headers of each link with curl -I

#sort -u sorts and removes duplicate lines, similar to sort | uniq
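
As a small illustration of the de-duplication step, the sketch below compares sort -u with the two-step sort | uniq pipeline. The URLs are made-up sample data, not output from lynx.

```bash
#!/bin/bash
# Sample input with one repeated line (made-up URLs, for illustration only)
links=$(printf '%s\n' http://example.com/b http://example.com/a http://example.com/b)

# sort -u sorts the lines and drops duplicates in a single step
unique_a=$(printf '%s\n' "$links" | sort -u)

# uniq only removes adjacent duplicates, so the input has to be sorted first
unique_b=$(printf '%s\n' "$links" | sort | uniq)

echo "$unique_a"
```

Both variables end up holding the same two lines, which is why the script can use sort -u alone.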

Judging from its name, the reject.dat file written by lynx -traversal should contain a list of invalid URLs. In fact, this is not the case: lynx puts all URLs in this file.
lynx also generates a traverse.errors file that contains the URLs that caused problems during browsing. However, lynx only records URLs that return HTTP 404 and misses URLs with other types of errors, which is why we need to check the return status ourselves.
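
One possible way to check the status yourself is to ask curl for the numeric status code rather than searching the headers for the word "OK". The sketch below is only an illustration under some assumptions: the helper names is_broken and check_link are hypothetical, the 10-second timeout is arbitrary, and treating every code outside 2xx/3xx as broken is a policy choice, not something the original script does.

```bash
#!/bin/bash
# is_broken: hypothetical helper; succeeds (returns 0) when the status code
# is anything other than 2xx or 3xx.
is_broken() {
    case "$1" in
        2[0-9][0-9]|3[0-9][0-9])
            return 1    # reachable, or a redirect
            ;;
        *)
            return 0    # 4xx, 5xx, or 000 (curl got no HTTP response)
            ;;
    esac
}

# check_link: hypothetical helper; prints the link and its status code
# when the link looks broken.
check_link() {
    local code
    # -I sends a HEAD request, -s silences progress output,
    # -o /dev/null discards the headers, -w prints only the status code
    code=$(curl -I -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
    if is_broken "$code"; then
        echo "$1 ($code)"
    fi
}
```

Inside the loop of find_broken.sh, check_link "$link" could stand in for the curl and grep pair. Some servers reject HEAD requests even for pages that load normally, so a reported link may still deserve a second look.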

Posted by hostcord on Thu, 09 Jan 2020 01:33:25 -0800
