ls
# List all files starting with a or o
[root@sh02-hap-bss-prod-consul03 ~]# ls
anaconda-ks.cfg  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg  tools
[root@sh02-hap-bss-prod-consul03 ~]# ls [ao]*
anaconda-ks.cfg  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg
# [0-9] matches any single digit
# [!0-9] matches a name that does not start with a digit
[root@sh02-hap-bss-prod-consul03 ~]# ls
1  2  3  44  anaconda-ks.cfg  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg  tools
[root@sh02-hap-bss-prod-consul03 ~]# ls [!0-9]*
anaconda-ks.cfg  nss-pam-ldapd-0.9.8-1.gf.el7.x86_64.rpm  openldap-clients-2.4.44-21.el7_6.x86_64.rpm  original-ks.cfg

tools:
libnss-cache  nsscache
[root@sh02-hap-bss-prod-consul03 ~]# ls [0-9]*
1  2  3  44
rm
# Delete files starting with a digit
rm -f [0-9]*
# Delete files that do not start with a digit
[root@sh02-hap-bss-prod-consul03 test]# ls
1  2  3  4  a  aa  b  bb
[root@sh02-hap-bss-prod-consul03 test]# rm -f [!0-9]*
[root@sh02-hap-bss-prod-consul03 test]# ls
1  2  3  4
echo
- By default, echo appends a newline to the end of the text; this can be suppressed with -n
[root@host1 src]# echo abc ddd
abc ddd
[root@host1 src]# echo -n abc ddd
abc ddd[root@host1 src]#
echo -e
echo -e interprets escape sequences. If the following sequences appear in the string, they are treated specially rather than printed as plain text:
\a  sound a warning bell
\b  delete the previous character
\c  suppress everything after it, including the trailing newline
\f  form feed: advance a line but keep the cursor in the same column
\n  newline: move the cursor to the beginning of the next line
\r  carriage return: move the cursor to the beginning of the line without wrapping
\t  insert a horizontal tab
\v  insert a vertical tab (similar to \f)
\\  insert a literal \ character
\nnn  insert the ASCII character represented by nnn (octal)
Here is an example:
$ echo -e "a\bdddd"    # the preceding a is erased
dddd
$ echo -e "a\adddd"    # the output also sounds the alarm bell
adddd
$ echo -e "a\ndddd"    # wrap the line
a
dddd
variable
String length: ${#var}
[root@host1 src]# echo ${NODE_HOME}
/usr/local/node
[root@host1 src]# echo ${#NODE_HOME}
15    # 15 characters long
Using shell for mathematical calculation
When using let, variable names do not need the $ prefix
[root@host1 src]# nod1=3
[root@host1 src]# nod2=5
[root@host1 src]# abc=$[nod1+nod2]
[root@host1 src]# echo $abc
8
[root@host1 src]# let def=nod1+nod2
[root@host1 src]# echo $def
8
bc
[root@host3 2056]# echo "4*0.56" |bc 2.24 [root@host3 2056]# no=54 [root@host3 2056]# res=`echo "$no*1.5"|bc` [root@host3 2056]# echo $res 81.0 [root@host3 2056]#
Additional parameters can be placed before the actual operation and passed to bc through stdin, using a semicolon as the delimiter
For example, setting decimal precision
[root@host3 2056]# echo "scale=2;3/8" | bc .37
File descriptor
- 0---stdin standard input
- 1---stdout standard output
- 2---stderr standard error
When a command exits on an error, it returns a non-zero exit status; after successful execution it returns 0. The exit status can be obtained with echo $?
# Correct output to out.txt, errors to the terminal
ls > out.txt
# Error output to out.txt, correct output to the terminal
ls 2> out.txt
# All output redirected to out.txt
ls &> out.txt
# The two can be combined
find /etc -name passwd > find.txt 2> find.err
# Discard the errors and show only the correct output on the screen
find /etc -name passwd 2> /dev/null
# Discard all output
find /etc -name passwd &> /dev/null
# Since error output does not go through a pipe, it must be merged into stdout when needed:
find /etc -name passwd 2>&1 | less
# For example:
find /etc -name passwd | wc -l
# counts only the correct lines; the error output is not counted
find /etc -name passwd 2>&1 | wc -l
# counts the error lines together with the correct ones
/sbin/service vsftpd stop > /dev/null 2>&1
# This stops the service and discards the correct output; because 2>&1 follows the redirection, the error output also goes to /dev/null
Arrays and associative arrays
There are many ways to define an array. The usual way is a list of values on a single line:
[root@host3 ~]# array_var=(1 2 3 4 5 6 6 6)
[root@host3 ~]# echo ${array_var[*]}    # print all values in the array, method 1
1 2 3 4 5 6 6 6
[root@host3 ~]# echo ${array_var[@]}    # print all values in the array, method 2
1 2 3 4 5 6 6 6
[root@host3 ~]# echo ${#array_var[*]}   # print the array length
8
Associative arrays are similar to dictionaries: you can use custom strings as keys, and list the array's index keys
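The section gives no example, so here is a minimal sketch of a bash (4+) associative array; the fruit names and prices are made up for illustration:

#!/bin/bash
declare -A fruit_price                      # declare the associative array
fruit_price=([apple]='100' [orange]='150')  # assign several key-value pairs at once
fruit_price[banana]='80'                    # or assign one key at a time
echo ${fruit_price[apple]}                  # value for a single key: 100
echo ${!fruit_price[@]}                     # list all index keys
echo ${fruit_price[@]}                      # list all values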
Get terminal information
tput sc    # store the cursor position
tput rc    # restore the cursor position
tput ed    # clear everything from the cursor to the end of the line
Generate delay in script
Count down:
#!/bin/bash
echo -n Count:
tput sc

count=11;
while true;
do
    if [ $count -gt 0 ]; then
        let count--;
        sleep 1;
        tput rc
        tput ed
        echo -n $count;
    else exit 0;
    fi
done
# In this example, the variable count starts at 11 and is decremented by 1 in each cycle. tput sc stores the cursor position. In each cycle a new count value is printed in the terminal by restoring the previously stored cursor position with tput rc. tput ed clears everything from the current cursor position to the end of the line, so the old count value can be cleared and the new one written.
Functions and parameters
- Defining a function
function fname() { statements; }
# or:
fname() { statements; }
- Calling: just use the function name
fname;    # execute the function
- Parameters can be passed to functions and accessed by the script
fname arg1 arg2;
- Accessing function parameters
fname() {
    echo $1,$2;    # access parameters 1 and 2
    echo "$@";     # print all parameters at once, as a list
    echo "$*";     # similar to $@, but the parameters are treated as a single entity
    echo "$#";     # $# is the number of parameters passed to the script or function
    return 0;      # return value
}
# $@ is used more often than $*, because $* treats all parameters as a single string, so it is rarely used
- Function recursion
In bash, functions also support recursion (a function can call itself), for example: F() { echo $1; F hello; sleep 1; }
fork bomb
:(){ :|:& };:
# This recursive function calls itself endlessly, pipes into another copy of itself, and puts the call in the background with &, spawning new processes until it causes a denial of service. This dangerous code forks a huge number of processes and is known as a fork bomb.
[root@host3 ~]# :(){ :|:& };:
[1] 2526
[root@host3 ~]# [1]+ completed
Crashed.
This does not seem to be well understood. We can change the following format:
:() { :|:& }; :
A better understanding is this:
bomb() { bomb|bomb& }; bomb
Because the function keyword can be omitted in the shell, the 13 characters above define a function and then call it. The function's name is :, and the core code is :|:&. This is a recursive call of the function to itself; the & starts each new invocation as a background process, the pipe makes the number of processes grow geometrically, and the final : calls the function to detonate the bomb. Within a few seconds the system crashes because it cannot handle so many processes; the only remedy is a reboot
Ways of prevention
Of course, the fork bomb is not that terrifying: you can write one in minutes in other languages. For example, a Python version:
import os
while True:
    os.fork()
The essence of the fork bomb is simply to seize system resources by creating processes. On Linux, we can use the ulimit command to restrict certain user behaviors. Running ulimit -a shows which limits we can set:
[root@host3 ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 7675
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 655350
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 100
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
As you can see, the -u parameter limits the number of processes a user can create, so ulimit -u 100 allows a user to create at most 100 processes. This prevents the bomb, but it is not thorough: the setting is lost once the terminal is closed. For further prevention we can edit the /etc/security/limits.conf file and add the following lines:
*  soft  nproc  100
*  hard  nproc  100
- Reading the command return value (status)
$? gives the return value of the command
The return value is called the exit status. It can be used to determine whether a command succeeded: on success the exit status is 0, otherwise it is non-zero
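A quick illustration:

ls /etc > /dev/null
echo $?                      # 0: the command succeeded
ls /nonexistent 2> /dev/null
echo $?                      # non-zero: the command failed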
Read the output of the command sequence into a variable
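The heading above has no example in the original; a minimal sketch of command substitution, which runs a command sequence and captures its output in a variable:

cmd_output=$(ls | cat -n)    # $() form
echo "$cmd_output"
cmd_output=`ls | cat -n`     # equivalent backtick form
echo "$cmd_output"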
- Generating an independent process with a subshell
The subshell is a separate process. You can use the () operator to define a subshell:
pwd; (cd /bin; ls); pwd;
# When a command runs in a subshell it has no effect on the current shell; all changes are limited to the inside of the subshell. For example, when cd changes the subshell's current directory, that change is not reflected in the main shell environment
read
read is used to read text from the keyboard or standard input, taking input from the user interactively.
The input libraries of most languages read input from the keyboard; however, input is only complete when the Enter key is pressed.
read provides a way to do this without the Enter key
- Read n characters into a variable
read -p "Enter input:" var #Prompt read read -n number_of_chars name read -n 3 var echo $var
- Use a specific delimiter as the end of the input line
read -d ":" var echo $var #End of input line with colon
Run command until successful execution
Define the function as follows:
repeat() { while true; do $@ && return; done }
# We created the repeat function, which contains an infinite loop that executes the command passed in as parameters (accessed through $@). If the command succeeds, return exits the loop
A faster approach:
On most modern systems, true is implemented as a binary. This means the shell has to spawn a process for every iteration of the while loop. To avoid that, use the shell's builtin ":" command, which always returns exit code 0:
repeat() { while :; do $@ && return; done }
# Although less readable, this is certainly faster than the previous method
Increase delay
Say you need to download a file from the internet that is temporarily unavailable, but will become available after a while. The method:
repeat wget -c http://abc.test.com/software.tar.gz
# In this form we send a lot of requests to the server, which may have an impact on it. We can modify the function to add a short delay:
repeat() { while :; do $@ && return; sleep 30; done }
# This makes the command run once every 30 seconds
Field separator and iterator
IFS is an important concept in shell scripting. It is the environment variable that stores the delimiter: the default field-separator string used by the current shell environment
The default value of IFS is whitespace (newline, tab, or space)
[root@host3 ~]# data="abc eee ddd fff" [root@host3 ~]# for item in $data; do echo ITEM: $item; done ITEM: abc ITEM: eee ITEM: ddd ITEM: fff //Implementation: list1="1 2 3 3 4 4" for line in $list1 do echo $line; done //Output: 1 2 3 3 4 4 //Implementation: for line in 1 2 3 3 4 4 #If you enclose the in with quotation marks, it will be treated as a string do echo $line; done //Same output: 1 2 3 3 4 4
Next, we can change IFS to Comma:
# IFS has not been modified yet; it is still the default whitespace, so data1 is printed as a single string
[root@host3 ~]# data1="eee,eee,111,222"
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee,eee,111,222
[root@host3 ~]# oldIFS=$IFS    # back up the current IFS as oldIFS, to be restored later
[root@host3 ~]# IFS=,          # after the backup, change IFS to a comma; printing again shows the comma has become the separator
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee
ITEM: eee
ITEM: 111
ITEM: 222
[root@host3 ~]# IFS=$oldIFS    # restore IFS to its original value
[root@host3 ~]# for item in $data1; do echo ITEM: $item; done
ITEM: eee,eee,111,222
So whenever we change IFS, we must remember to restore it to its original value afterwards
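A practical sketch: splitting one line of /etc/passwd on ":" to pull out the login shell (the 7th field in the standard passwd layout):

line="root:x:0:0:root:/root:/bin/bash"
oldIFS=$IFS
IFS=":"
count=0
for item in $line;
do
    [ $count -eq 6 ] && echo Shell: $item    # field 7 is the login shell
    let count++
done
IFS=$oldIFS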
for loop
for var in list;
do
    commands;
done
# list can be a string or a sequence
# {1..50} generates a list of numbers from 1 to 50
# {a..z}, {A..Z} or {a..h} generate lists of letters
for can also use the C-style loop form:
for (i=0;i<10;i++) { commands; #Use variable $i }
while loop
while condition
do
    commands;
done
until loop
It keeps looping until the condition becomes true
x=0; until [ $x -eq 9 ]; do let x++; echo $x; done
Comparison and testing
Process control in a program is handled by comparison and test statements. We can test with if, if-else, and logical operators, and compare data with comparison operators. In addition, there is a test command for testing
if condition; then
    commands;
fi

if condition; then
    commands;
elif condition; then
    commands;
else
    commands;
fi
if and else statements can be nested, which gets very long. Logical operators can simplify them:
[ condition ] && action; #If the former is true, execute action; [ condition ] || action; #If the former is false, execute action;
Arithmetic comparison
Conditions are usually enclosed in square brackets. Be sure to note that there must be a space between [ or ] and the operands; forgetting this space raises an error
Arithmetic judgment:
[ $var -eq 0 ]    # returns true when $var equals 0
[ $var -ne 0 ]    # returns true when $var is non-zero
Others:
-gt: greater than
-lt: less than
-ge: greater than or equal to
-le: less than or equal to
Multi-condition tests:
[ $var1 -ne 0 -a $var2 -gt 2 ]    # logical AND, -a
[ $var1 -ne 0 -o $var2 -gt 2 ]    # logical OR, -o
File system related tests
We can use different condition flags to test different file system related properties:
[ -f file ]    # true if the given variable holds the path or filename of a regular file
[ -x file ]    # true if the file is executable
[ -d file ]    # true if it is a directory
[ -e file ]    # true if the file exists
[ -w file ]    # true if the file is writable
[ -r file ]    # true if the file is readable
[ -L file ]    # true if the path is a symbolic link
The method of use is as follows:
fpath="/etc/passwd" if [ -e $fpath ];then echo File exists; else echo Dose not exists; fi
string comparison
For string comparison it is better to use double square brackets, because a single bracket can sometimes cause errors, so it is best avoided
You can test the two strings to see if they are the same
[[ $str1 = $str2 ]]
# or:
[[ $str1 == $str2 ]]
# Conversely:
[[ $str1 != $str2 ]]
[[ -z $str1 ]]    # returns true if $str1 is an empty string
[[ -n $str1 ]]    # returns true if $str1 is a non-empty string
- Note that there is a space before and after =; if you forget the space, it is not a comparison but an assignment statement
- Using the logical operators && and ||, it is easy to combine multiple conditions
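For instance, combining the string tests above (a small sketch):

str1="not empty"
str2=""
if [[ -n $str1 ]] && [[ -z $str2 ]]; then
    echo "str1 is non-empty and str2 is empty"
fi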
cat
General writing:
Reverse print command tac, which is the opposite of cat
cat file1 file2 file3 ... This command splices the file contents of the command line parameters together
Similarly, we can use cat to splice content from an input file with standard input, combining stdin with the data in another file, as follows:
echo "111111" |cat - /etc/passwd In the above code, - is the filename of the stdin text
Showing tabs as ^I
For example, when writing a Python program, tab and space indentation are different. If a tab is used where a space belongs, an indentation error occurs, and this error is hard to find in a plain text editor
In this case we can use the -T option to display tabs, marked as ^I
[root@host3 ~]# cat bbb.sh
for line in "1 2 3 3 4 4"
do
	echo $line;
done
[root@host3 ~]# cat -T bbb.sh
for line in "1 2 3 3 4 4"
do
^Iecho $line;
done
Line number cat -n
# -n numbers blank lines as well; to skip blank lines, use the -b option
[root@host3 ~]# cat bbb.sh
for line in "1 2 3 3 4 4"

do
echo $line;
done
[root@host3 ~]# cat -n bbb.sh
     1	for line in "1 2 3 3 4 4"
     2	
     3	do
     4	echo $line;
     5	done
[root@host3 ~]# cat -b bbb.sh
     1	for line in "1 2 3 3 4 4"

     2	do
     3	echo $line;
     4	done
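cat also has -s, which squeezes runs of consecutive blank lines into one (a standard cat option the section skips):

printf "line1\n\n\n\nline2\n" | cat -s
# line1
#
# line2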
find
The find command works by traversing down the file hierarchy, matching qualified files, and performing corresponding operations
find /etc    # list all files and folders under the directory, including hidden files
find /etc -iname "*.txt"    # -iname matches while ignoring case

# When matching more than one pattern, use an OR condition, e.g. find all .txt and .conf files under /etc:
find /etc \( -name "*.txt" -o -name "*.conf" \)
find /etc \( -name "*.txt" -o -name "*.conf" \) -print
# \( and \) treat -name "*.txt" -o -name "*.conf" as a single unit

# -name matches filenames; -path matches file paths, and wildcards are allowed:
find / -path "*/etc/*" -print    # prints every path that contains /etc/

# The -regex parameter is more powerful still. For example, an email address takes the form name@host.root, so it can generally be matched as:
# [a-z0-9]+@[a-z0-9]+\.[a-z0-9]+
# The + sign means the character class before it may appear one or more times.
find /etc -regex ".*\(\.py\|\.sh\)$"    # find all files ending in .py or .sh
# -iregex ignores case, just like -iname
# -regex is also a test. One thing to note: -regex matches not the filename but the full path. For example, if the current directory contains a file "abar9", matching with "ab.*9" finds nothing; the correct pattern is ".*ab.*9" or "*/ab.*9".
find . -regex ".*/[0-9]*\.c" -print
Negating a match
find /etc ! -name "*.conf" -print
Search based on directory depth
We can use the depth options -maxdepth and -mindepth to limit the directory depth traversed by the find command
[root@host3 ~]# find /etc -maxdepth 1 -name "*.conf" -print
/etc/resolv.conf
/etc/dracut.conf
/etc/host.conf
[root@host3 ~]# find /etc -maxdepth 2 -name "*.conf" -print
/etc/resolv.conf
/etc/depmod.d/dist.conf
[root@host3 ~]# find /etc -mindepth 4 -name "*.conf" -print
/etc/openldap/slapd.d/openldap/ldap.conf
/etc/openldap/slapd.d/openldap/schema/schema_convert.conf
/etc/openldap/slapd.d/openldap/slapd.conf
Time based search
-atime: last access time
-mtime: last modification time
-ctime: last change time of the file metadata (such as permissions or ownership)
All of these are measured in days. Minute-based versions also exist:
-amin
-mmin
-cmin
-newer: takes a reference file and compares timestamps, matching files newer than the reference
[root@host3 ~]# find /etc -type f -newer /etc/passwd -print
/etc/resolv.conf
/etc/shadow
/etc/ld.so.cache
/etc/cni/net.d/calico-kubeconfig
Search based on file size
find /etc -type f -size +2k    # larger than 2k
find /etc -type f -size -2k    # smaller than 2k
find /etc -type f -size 2k     # exactly 2k
Delete matching files
find ./ -type f -name "*.txt" -delete
Search based on permissions and ownership
find /etc -type f -perm 644
find /etc -type f -name "*.conf" ! -perm 644
Search based on users
find /etc -type f -user USER
To execute a command or action
find /etc -type f -user root -exec chown mysql {} \;
# Change files whose owner is root to be owned by mysql
# {} is a special string used with the -exec option: for each matching file, {} is replaced with the corresponding filename.
Another example is to splice all the file contents in a given directory and write them to a single file. We can find all the. conf files with find, and then use the cat command with exec:
find /etc/ -type f -name "*.conf" -exec cat {} \; > all.txt
# Concatenates the contents of all .conf files into the all.txt file
# Appending with >> is not needed because find's output forms a single stream (stdout); appending is only necessary when multiple streams write to a single file
# The following command copies .txt files older than 10 days to the OLD directory:
find /etc -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
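-exec accepts only a single command; a common workaround, sketched here, is to hand each filename to a shell so several commands can run per file:

find /etc -type f -name "*.conf" -exec sh -c 'echo "== $1 =="; wc -l "$1"' _ {} \;
# For every match, sh receives the filename as $1 and runs both echo and wc on it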
Let find skip some directories
Sometimes in order to improve performance, you need to skip some directories, such as git. Each subdirectory will contain a. git directory. You need to skip these directories.
find /etc \( -name ".git" -prune \) -o \( -type f -print \) #\(- name "/etc/rabbitmq" -prune \) is used for exclusion, while \ (- type f -print \) indicates the action to be performed.
Fun xargs
The xargs command reformats the data received from stdin and supplies it as a parameter to other commands
xargs, as an alternative, works like - exec in the find command
- Converting multi-line input to single-line output: just remove the newlines and replace them with spaces. With xargs we can replace newlines with spaces, turning multiple lines into a single line
[root@host3 ~]# cat 123.txt
1 2 3
4 5 6
7 8 9
10 11 12
13 14
[root@host3 ~]# cat 123.txt | xargs
1 2 3 4 5 6 7 8 9 10 11 12 13 14
- Converting single-line input to multi-line output: specify the maximum number of arguments per line with -n, and any text from stdin can be split into multiple lines of n arguments each. Arguments are strings separated by spaces, the default delimiter.
[root@host3 ~]# cat 123.txt
1 2 3
4 5 6
7 8 9
10 11 12
13 14
[root@host3 ~]# cat 123.txt | xargs -n 3
1 2 3
4 5 6
7 8 9
10 11 12
13 14
[root@host3 ~]# echo 1 3 4 5 6 7 8 | xargs -n 3
1 3 4
5 6 7
8
- Custom delimiters to split parameters. Specify a custom delimiter for the input with the - d option
[root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T abc dslfj dshfs 1111 fd222 #Use letter T as separator #We can define how many parameters to output per line while defining the resolver [root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T -n 2 abc dslfj dshfs 1111 fd222 #Output one parameter per line [root@host3 ~]# echo "abcTdslfjTdshfsT1111Tfd222" |xargs -d T -n 1 abc dslfj dshfs 1111 fd222
- Subshells
cmd0 | (cmd1;cmd2;cmd3) | cmd4
The part in the middle is a subshell; commands inside it take effect only within the subshell
The difference between print and print0
-print adds a newline after each output; -print0 does not:
[root@AaronWong shell_test]# find /home/AaronWong/ABC/ -type f -print
/home/AaronWong/ABC/libcvaux.so
/home/AaronWong/ABC/libgomp.so.1
/home/AaronWong/ABC/libcvaux.so.4
/home/AaronWong/ABC/libcv.so
/home/AaronWong/ABC/libhighgui.so.4
/home/AaronWong/ABC/libcxcore.so
/home/AaronWong/ABC/libhighgui.so
/home/AaronWong/ABC/libcxcore.so.4
/home/AaronWong/ABC/libcv.so.4
/home/AaronWong/ABC/libgomp.so
/home/AaronWong/ABC/libz.so
/home/AaronWong/ABC/libz.so.1
[root@AaronWong shell_test]# find /home/AaronWong/ABC/ -type f -print0
/home/AaronWong/ABC/libcvaux.so/home/AaronWong/ABC/libgomp.so.1/home/AaronWong/ABC/libcvaux.so.4/home/AaronWong/ABC/libcv.so/home/AaronWong/ABC/libhighgui.so.4/home/AaronWong/ABC/libcxcore.so/home/AaronWong/ABC/libhighgui.so/home/AaronWong/ABC/libcxcore.so.4/home/AaronWong/ABC/libcv.so.4/home/AaronWong/ABC/libgomp.so/home/AaronWong/ABC/libz.so/home/AaronWong/ABC/libz.so.1
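The practical reason -print0 exists: filenames may contain spaces or newlines, which break the default whitespace splitting of xargs. NUL-delimited output pairs with xargs -0 (standard find/xargs usage):

find /home -type f -name "*.log" -print0 | xargs -0 rm -f
# Deletes the matches safely even if a filename contains spaces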
tr
tr can only receive input through stdin, not through command-line arguments. Its invocation format is:
tr [option] set1 set2
Converting tabs to spaces: tr '\t' ' ' < file.txt
[root@host3 ~]# cat -T 123.txt
1 2 3 4 5 6 7 8 9 ^I10 11 12 13 14
[root@host3 ~]# tr '\t' ' ' < 123.txt
1 2 3 4 5 6 7 8 9  10 11 12 13 14
- Delete characters with tr
tr has an option - d to clear the specific characters that appear in stdin by specifying the character set to be deleted:
cat file.txt | tr -d '[set1]'    # only set1 is used, not set2
# Delete digits:
[root@host3 ~]# echo "Hello 123 world 456" | tr -d '0-9'
Hello  world
# Delete letters:
[root@host3 ~]# echo "Hello 123 world 456" | tr -d 'A-Za-z'
 123 456
# Delete H:
[root@host3 ~]# echo "Hello 123 world 456" | tr -d 'H'
ello 123 world 456
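Two more tr options worth a sketch (standard flags not shown above): -c complements the character set and -s squeezes repeated characters:

echo "Hello 123 world 456" | tr -cd '0-9\n'     # -c with -d deletes everything NOT in the set: prints 123456
echo "GNU    is     not    UNIX" | tr -s ' '    # -s collapses repeats: prints GNU is not UNIX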
Sorting, unique and duplicate
sort can sort text files and stdin for us. It usually works with other commands to generate the required output. uniq is a command often used with sort; its purpose is to extract unique lines from text or stdin.
# We can easily sort a group of files (such as file1.txt and file2.txt) like this:
[root@host3 ~]# sort /etc/passwd /etc/group
adm:x:3:4:adm:/var/adm:/sbin/nologin
adm:x:4:
apache:x:48:
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
audio:x:63:
bin:x:1:
bin:x:1:1:bin:/bin:/sbin/nologin
caddy:x:996:
caddy:x:997:996:Caddy web server:/var/lib/caddy:/sbin/nologin
...
# Merge, sort, and redirect to a new file:
sort /etc/passwd /etc/group > abc.txt
# Sort numerically:
sort -n
# Reverse sort:
sort -r
# Sort by month:
sort -M month.txt
# Merge two already-sorted files:
sort -m sorted1 sorted2
# Find the non-duplicated lines of sorted files:
sort file1.txt file2.txt | uniq
Check that the files have been sorted:
To check whether a file is already sorted, use the following method: if the file is sorted, sort returns an exit status ($?) of 0, otherwise non-zero
#!/bin/bash
sort -C filename;
if [ $? -eq 0 ]; then
    echo Sorted;
else
    echo Unsorted;
fi
The sort command has many options. If uniq is to be used, sort is all the more necessary, because uniq's input data must be sorted
sort completes some more complex tasks
# -k specifies the column to sort by, -r reverses, -n sorts numerically
sort -nrk 1 data.txt
sort -k 2 data.txt
uniq
uniq can only be applied to input data that is already sorted
[root@host3 ~]# cat data.txt
1010hellothis
3333
2189ababbba
333
7464dfddfdfd
333
# Remove duplicates:
[root@host3 ~]# sort data.txt | uniq
1010hellothis
2189ababbba
333
3333
7464dfddfdfd
# Remove duplicates and count occurrences:
[root@host3 ~]# sort data.txt | uniq -c
      1 1010hellothis
      1 2189ababbba
      2 333
      1 3333
      1 7464dfddfdfd
# Show only the lines that are never duplicated:
[root@host3 ~]# sort data.txt | uniq -u
1010hellothis
2189ababbba
3333
7464dfddfdfd
# Show only the duplicated lines:
[root@host3 ~]# sort data.txt | uniq -d
333
Temporary file name and random number
When writing shell scripts, we often need to store temporary data. The best place for it is /tmp (the contents of this directory are cleared when the system reboots). There are two ways to generate standard filenames for temporary data
[root@host3 ~]# file1=`mktemp`
[root@host3 ~]# echo $file1
/tmp/tmp.P9var0Jjdw
[root@host3 ~]# cd /tmp/
[root@host3 tmp]# ls
add_user_ldapsync.ldif     create_module_config.ldif.bak   globalconfig.ldif       overlay.ldif
create_module_config.ldif  databaseconfig_nosyncrepl.ldif  initial_structure.ldif  tmp.P9var0Jjdw
# The above creates a temporary file and prints its name
[root@host3 tmp]# dir1=`mktemp -d`
[root@host3 tmp]# echo $dir1
/tmp/tmp.UqEfHa389N
[root@host3 tmp]# ll
total 28
-r--------. 1 root root  130 Feb 12  2019 add_user_ldapsync.ldif
-r--------. 1 root root  329 Feb 14  2019 create_module_config.ldif
-r--------. 1 root root  329 Feb 12  2019 create_module_config.ldif.bak
-r--------. 1 root root 2458 Feb 14  2019 databaseconfig_nosyncrepl.ldif
-r--------. 1 root root  239 Feb 12  2019 globalconfig.ldif
-r--------. 1 root root  795 Feb 12  2019 initial_structure.ldif
-r--------. 1 root root  143 Feb 12  2019 overlay.ldif
-rw------- 1 root root     0 Sep 27 13:06 tmp.P9var0Jjdw
drwx------ 2 root root     6 Sep 27 13:09 tmp.UqEfHa389N
# The above creates a temporary directory and prints its name
[root@host3 tmp]# mktemp test1.XXX
test1.mBX
[root@host3 tmp]# mktemp test1.XXX
test1.wj1
[root@host3 tmp]# ls
add_user_ldapsync.ldif     create_module_config.ldif.bak   globalconfig.ldif       overlay.ldif  test1.mBX  tmp.P9var0Jjdw
create_module_config.ldif  databaseconfig_nosyncrepl.ldif  initial_structure.ldif  test1.wj1     tmp.UqEfHa389N
# The above creates temporary files based on a template name: the uppercase X's are replaced with random letters or digits. Note that mktemp only works if the template contains at least three X's.
split files and data
Suppose a test file data.txt of size 100kb; it can be split into several files of 10kb each
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# du -sh nginx-1.14.2.tar.gz
992K	nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz
[root@host3 src]# ll
total 1984
drwxr-xr-x 9 postgres mysql     186 Aug 15 19:50 nginx-1.14.2
-rw-r--r-- 1 root     root  1015384 Aug 16 10:44 nginx-1.14.2.tar.gz
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xaa
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xab
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xac
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xad
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xae
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xaf
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xag
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xah
-rw-r--r-- 1 root     root   102400 Sep 29 12:36 xai
-rw-r--r-- 1 root     root    93784 Sep 29 12:36 xaj
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz  xaa  xab  xac  xad  xae  xaf  xag  xah  xai  xaj
[root@host3 src]# du -sh *
32M	nginx-1.14.2
992K	nginx-1.14.2.tar.gz
100K	xaa
100K	xab
100K	xac
100K	xad
100K	xae
100K	xaf
100K	xag
100K	xah
100K	xai
92K	xaj
# As shown, the 992K nginx tarball is split into 100k chunks, and the final chunk, smaller than 100k, is only 92k
As seen above, letters are used as the suffix by default. To use numeric suffixes, use the -d parameter; -a length specifies the suffix length
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz -d -a 5
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz  x00000  x00001  x00002  x00003  x00004  x00005  x00006  x00007  x00008  x00009
[root@host3 src]# du -sh *
32M	nginx-1.14.2
992K	nginx-1.14.2.tar.gz
100K	x00000
100K	x00001
100K	x00002
100K	x00003
100K	x00004
100K	x00005
100K	x00006
100K	x00007
100K	x00008
92K	x00009
# The filename prefix is x and the suffix is 5 digits
Specify file name prefix
The files split above all have the filename prefix x. We can also supply our own prefix; the last argument to the split command is the PREFIX
[root@host3 src]# ls
nginx-1.14.2  nginx-1.14.2.tar.gz
[root@host3 src]# split -b 100k nginx-1.14.2.tar.gz -d -a 4 nginxfuck
[root@host3 src]# ls
nginx-1.14.2         nginxfuck0000  nginxfuck0002  nginxfuck0004  nginxfuck0006  nginxfuck0008
nginx-1.14.2.tar.gz  nginxfuck0001  nginxfuck0003  nginxfuck0005  nginxfuck0007  nginxfuck0009
# As above, the last argument specifies the prefix
If we don't want to split by size, we can split by line count with -l
[root@host3 test]# ls
data.txt
[root@host3 test]# wc -l data.txt
7474 data.txt
[root@host3 test]# split -l 1000 data.txt -d -a 4 conf
[root@host3 test]# ls
conf0000  conf0001  conf0002  conf0003  conf0004  conf0005  conf0006  conf0007  data.txt
[root@host3 test]# du -sh *
40K	conf0000
48K	conf0001
48K	conf0002
36K	conf0003
36K	conf0004
36K	conf0005
36K	conf0006
20K	conf0007
288K	data.txt
# The 7474-line file above is split into chunks of 1000 lines each. The filenames start with conf, followed by 4 digits.
File split csplit
csplit can split the log file according to the specified conditions and string matching options, which is a variation of split tool
split can only divide by data size or line count, while csplit can divide based on the contents of the file itself: the presence of a word or particular text can be used as the condition for splitting the file
[root@host3 test]# ls
data.txt
[root@host3 test]# cat data.txt
SERVER-1
[conection] 192.168.0.1 success
[conection] 192.168.0.2 failed
[conection] 192.168.0.3 success
[conection] 192.168.0.4 success
SERVER-2
[conection] 192.168.0.5 success
[conection] 192.168.0.5 failed
[conection] 192.168.0.5 success
[conection] 192.168.0.5 success
SERVER-3
[conection] 192.168.0.6 success
[conection] 192.168.0.7 failed
[conection] 192.168.0.8 success
[conection] 192.168.0.9 success
[root@host3 test]# csplit data.txt /SERVER/ -n 2 -s {*} -f server -b "%02d.log"; rm server00.log
rm: remove regular empty file "server00.log"? y
[root@host3 test]# ls
data.txt  server01.log  server02.log  server03.log
detailed description:
- /SERVER/ matches the lines at which the splitting happens
- /[REGEX]/ denotes the text pattern: copy from the current (first) line up to, but not including, the next line matching 'SERVER'
- {*} repeats the split on each matched line until the end of the file; a fixed number of splits can be given as {integer}
- -s causes the command to enter silent mode without printing other messages.
- -n specifies the number of digits in the suffix of the split files (the filename prefix itself is set with -f)
- -b specifies the suffix format, e.g. %02d.log, similar to printf in C
Because the first file after splitting has no content (the matched word sits on the first line of the file), we delete server00.log
Splitting file names by extension
Some scripts are processed according to the file name. We may need to modify the file name while retaining the extension, transform the file format (modify the extension while retaining the file name), or extract part of the file name. Some built-in functions of shell can be used to segment file names according to different situations
The % symbol makes it easy to extract the name part from a name.extension string.
[root@host3 ~]# file_jpg="test.jpg"
[root@host3 ~]# name=${file_jpg%.*}
[root@host3 ~]# echo $name
test
# the filename part has been extracted
With the help of the # symbol, the extension part of the filename can be extracted.
[root@host3 ~]# file_jpg="test.jpg"
[root@host3 ~]# exten=${file_jpg#*.}
[root@host3 ~]# echo $exten
jpg
# Extract the extension: above, the name part was extracted with %.*; here the extension is extracted with #*.
Interpretation of the syntax above
Meaning of ${VAR%.*}:
- Remove from $VAR the string matched by the wildcard to the right of %; matching proceeds from right to left
- Suppose VAR=test.jpg; the wildcard then matches .jpg from right to left, so removing the match from $VAR gives test
% is a non-greedy operation: it finds the shortest wildcard match from right to left. There is also %%, which is similar but greedy, meaning it matches the longest string satisfying the condition. For example, with VAR=hack.fun.book.txt:
# Using the % operator:
[root@host3 ~]# VAR=hack.fun.book.txt
[root@host3 ~]# echo ${VAR%.*}
hack.fun.book
# Using the %% operator:
[root@host3 ~]# echo ${VAR%%.*}
hack
Similarly, the # operator has a greedy counterpart, ##:
# Using the # operator:
[root@host3 ~]# echo ${VAR#*.}
fun.book.txt
# Using the ## operator:
[root@host3 ~]# echo ${VAR##*.}
txt
Bulk rename and move
We can do a lot with find, rename and mv
The easiest way to rename the image files in the current directory to a specific format is with the following script
#!/bin/bash
count=1;
for img in `find . -maxdepth 1 -type f \( -iname '*.png' -o -iname '*.jpg' \)`
do
    new=image-$count.${img##*.}
    echo "Rename $img to $new"
    mv $img $new
    let count++
done
Execute the above script
[root@host3 ~]# ll
total 24
-rw-r--r-- 1 root root    0 Oct  8 14:22 aaaaaa.jpg
-rw-r--r-- 1 root root  190 Aug  9 13:51 aaa.sh
-rw-r--r-- 1 root root 2168 Sep 24 10:15 abc.txt
-rw-r--r-- 1 root root 3352 Sep 20 09:58 all.txt
-rw-------. 1 root root 1228 Jan  8  2019 anaconda-ks.cfg
-rw-r--r-- 1 root root    0 Oct  8 14:22 bbbb.jpg
-rw-r--r-- 1 root root   48 Sep 18 10:27 bbb.sh
-rw-r--r-- 1 root root    0 Oct  8 14:22 cccc.png
drwxr-xr-x 2 root root  333 Apr 11 19:21 conf
-rw-r--r-- 1 root root    0 Oct  8 14:22 dddd.png
-rw-r--r-- 1 root root  190 Oct  8 14:22 rename.sh
[root@host3 ~]# sh rename.sh
Rename ./aaaaaa.jpg to image-1.jpg
Rename ./bbbb.jpg to image-2.jpg
Rename ./cccc.png to image-3.png
Rename ./dddd.png to image-4.png
[root@host3 ~]# ls
aaa.sh  abc.txt  all.txt  anaconda-ks.cfg  bbb.sh  conf  image-1.jpg  image-2.jpg  image-3.png  image-4.png  rename.sh
Interactive input automation
Write a script to read the interactive input first
#!/bin/bash
# Filename: test.sh
read -p "Enter number:" no
read -p "Enter name:" name
echo $no,$name
Automatically send input to the script as follows:
[root@host3 ~]# ./test.sh
Enter number:2
Enter name:rong
2,rong
[root@host3 ~]# echo -e "2\nrong\n" | ./test.sh
2,rong
# \n stands for a newline; we use echo -e to generate the input sequence, where -e tells echo to interpret escape sequences.
# If there is a large amount of input, a separate input file combined with the redirection operator can supply the input, as follows:
[root@host3 ~]# echo -e "2\nrong\n" > input.data
[root@host3 ~]# cat input.data
2
rong
[root@host3 ~]# ./test.sh < input.data
2,rong
# This imports the interactive input data from a file
If you're a reverse engineer, you've probably dealt with buffer overflows. To exploit one, we need to redirect shellcode in hexadecimal form (for example "\xeb\x1a\x5e\x31\xc0\x88\x46"). These characters cannot be typed directly, since the keyboard has no keys for them, so we should use:
echo -e "\xeb\x1a\x5e\x31\xc0\x88\x46"
Use this command to redirect the shellcode to a vulnerable executable. To handle dynamic input, providing input based on what the program asks for at runtime, we need an excellent tool: expect.
The expect command can provide appropriate input according to input requirements
Automation with expect
Most default Linux distributions do not include expect; you have to install it yourself: yum -y install expect
#!/usr/bin/expect
# Filename: expect.sh
spawn ./test.sh
expect "Enter number:"
send "2\n"
expect "Enter name:"
send "rong\n"
expect eof
# Execution:
[root@host3 ~]# ./expect.sh
spawn ./test.sh
Enter number:2
Enter name:rong
2,rong
- The spawn parameter specifies which command or script to execute
- The expect parameter provides a message to wait for
- Send is the message to send
- expect eof indicates the end of command interaction
Using parallel process to speed up command execution
Take the md5sum command as an example. Because of the computation involved, this command is CPU-intensive. If checksums must be generated for multiple files, we can run them with the following script.
#!/bin/bash
PIDARRAY=()
for file in `find /etc/ -name "*.conf"`
do
    md5sum $file &
    PIDARRAY+=("$!")
done
wait ${PIDARRAY[@]}

# Execution:
[root@host3 ~]# sh expect.sh
72688131394bcce818f818e2bae98846  /etc/modprobe.d/tuned.conf
77304062b81bc20cffce814ff6bf8ed5  /etc/modprobe.d/firewalld-sysctls.conf
649f5bf7c0c766969e40b54949a06866  /etc/dracut.conf
d0f5f705846350b43033834f51c9135c  /etc/prelink.conf.d/nss-softokn-prelink.conf
0335aabf8106f29f6857d74c98697542  /etc/prelink.conf.d/fipscheck.conf
0b501d6d547fa5bb989b9cb877fee8cb  /etc/modprobe.d/dccp-blacklist.conf
d779db0cc6135e09b4d146ca69d39c2b  /etc/rsyslog.d/listen.conf
4eaff8c463f8c4b6d68d7a7237ba862c  /etc/resolv.conf
321ec6fd36bce09ed68b854270b9136c  /etc/prelink.conf.d/grub2.conf
3a6a059e04b951923f6d83b7ed327e0e  /etc/depmod.d/dist.conf
7cb6c9cab8ec511882e0e05fceb87e45  /etc/systemd/bootchart.conf
2ad769b57d77224f7a460141e3f94258  /etc/systemd/coredump.conf
f55c94d000b5d62b5f06d38852977dd1  /etc/dbus-1/system.d/org.freedesktop.hostname1.conf
7e2c094c5009f9ec2748dce92f2209bd  /etc/dbus-1/system.d/org.freedesktop.import1.conf
5893ab03e7e96aa3759baceb4dd04190  /etc/dbus-1/system.d/org.freedesktop.locale1.conf
f0c4b315298d5d687e04183ca2e36079  /etc/dbus-1/system.d/org.freedesktop.login1.conf
···
# Because multiple md5sum commands run simultaneously, on a multi-core processor this runs faster
working principle:
Bash's & operator makes the shell put a command in the background and continue executing the script. This means that once the loop ends, the script would exit while the md5sum commands are still running in the background. To avoid this, we use $!, which in bash holds the PID of the most recent background process. We collect these PIDs in an array and then use the wait command to wait for those processes to finish.
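The same pattern in miniature, with sleep standing in for any slow command:

#!/bin/bash
pids=()
for n in 3 2 1; do
    sleep $n &          # run each task in the background
    pids+=("$!")        # $! holds the PID of the most recent background job
done
wait ${pids[@]}         # block until every collected PID has finished
echo "all background jobs done"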
Intersection and difference of text files
comm command can be used to compare two files
- Intersection: print out the common lines of two files
- Difference: print the lines that the given files do not have in common
- Set difference: print the lines contained in file A but not in the other file
Note that comm must be given sorted files as input
[root@host3 ~]# cat a.txt
apple
orange
gold
silver
steel
iron
[root@host3 ~]# cat b.txt
orange
gold
cookies
carrot
[root@host3 ~]# sort a.txt -o A.txt
[root@host3 ~]# sort b.txt -o B.txt
[root@host3 ~]# comm A.txt B.txt
apple
	carrot
	cookies
		gold
iron
		orange
silver
steel
# The result has three columns: the first holds lines only in A.txt, the second lines only in B.txt, and the third lines present in both A.txt and B.txt. The columns are delimited with the tab character (\t)
# To get the intersection, delete the first and second columns and show only the third:
[root@host3 ~]# comm A.txt B.txt -1 -2
gold
orange
# Print only the lines that differ:
[root@host3 ~]# comm A.txt B.txt -3
apple
	carrot
	cookies
iron
silver
steel
# To make the result readable, strip the leading \t tabs:
[root@host3 ~]# comm A.txt B.txt -3 | sed 's/^\t//'
apple
carrot
cookies
iron
silver
steel
Create a file that cannot be modified
Make the file unmodifiable chattr +i file
[root@host3 ~]# chattr +i passwd
[root@host3 ~]# rm -rf passwd
rm: cannot remove "passwd": Operation not permitted
[root@host3 ~]# chattr -i passwd
[root@host3 ~]# rm -rf passwd
[root@host3 ~]#
grep
grep can search multiple files
[root@host3 ~]# grep root /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
The grep command interprets only some special characters in the matched text. To use extended regular expressions you need to add the -E option, or use the egrep command, which enables them by default (in practice the patterns here also work without -E)
# Count the lines in the text that contain the matching string:
[root@host3 ~]# grep -c root /etc/passwd
3
# Print line numbers:
[root@host3 ~]# grep -n root /etc/passwd
1:root:x:0:0:root:/root:/bin/bash
10:operator:x:11:0:operator:/root:/sbin/nologin
27:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
# Search multiple files and list which files contain the match with -l:
[root@host3 ~]# grep root /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
[root@host3 ~]# grep root /etc/passwd /etc/group -l
/etc/passwd
/etc/group
# -L does the opposite: it lists the filenames that do not match
# Ignore case: -i
# Multiple patterns: -e
grep -e "pattern1" -e "pattern2"    # matches lines containing pattern1 or pattern2
[root@host3 ~]# grep -e root -e docker /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
/etc/group:docker:x:992:
# Another way to supply multiple patterns is a pattern file, specified with -f. Note that pat.file must not end with blank lines:
[root@host3 ~]# cat pat.file
root
docker
[root@host3 ~]# grep -f pat.file /etc/passwd /etc/group
/etc/passwd:root:x:0:0:root:/root:/bin/bash
/etc/passwd:operator:x:11:0:operator:/root:/sbin/nologin
/etc/passwd:dockerroot:x:996:994:Docker User:/var/lib/docker:/sbin/nologin
/etc/group:root:x:0:
/etc/group:dockerroot:x:994:
/etc/group:docker:x:992:
Specify or exclude some files in grep search
grep can specify (include) or exclude some files in the search. We use wildcards to specify the include file or the exclude file
# Recursively search all .c and .cpp files in the directory:
grep root . -r --include *.{c,cpp}
[root@host3 ~]# grep root /etc/ -r -l --include *.conf    # -l here means list only the filenames
/etc/systemd/logind.conf
/etc/dbus-1/system.d/org.freedesktop.hostname1.conf
/etc/dbus-1/system.d/org.freedesktop.import1.conf
/etc/dbus-1/system.d/org.freedesktop.locale1.conf
/etc/dbus-1/system.d/org.freedesktop.login1.conf
/etc/dbus-1/system.d/org.freedesktop.machine1.conf
/etc/dbus-1/system.d/org.freedesktop.systemd1.conf
/etc/dbus-1/system.d/org.freedesktop.timedate1.conf
/etc/dbus-1/system.d/wpa_supplicant.conf
# Exclude all README files from the search:
grep root . -r --exclude "README"
# To exclude directories, use --exclude-dir; to read the list of files to exclude from a file, use --exclude-from FILE
cut (omitted)
sed
# Remove blank lines:
sed '/^$/d' file    # /pattern/d deletes the lines matching the pattern
# Replace directly in the file, substituting all 3-digit numbers with a marker:
[root@host3 ~]# cat sed.data
11 abc 111 this 9 file contains 111 11 888 numbers 0000
[root@host3 ~]# sed -i 's/\b[0-9]\{3\}\b/NUMBER/g' sed.data
[root@host3 ~]# cat sed.data
11 abc NUMBER this 9 file contains NUMBER 11 NUMBER numbers 0000
# The command above replaces all three-digit numbers. The regular expression \b[0-9]\{3\}\b matches 3 digits: [0-9] is the digit range,
# \{3\} matches the preceding class three times (\ is used for escaping), and \b marks a word boundary
sed -i.bak 's/abc/def/' file
# Here sed not only performs the replacement in the file but also creates file.bak, containing a copy of the original contents
String flag matched&
In sed, & marks the string matched by the pattern, so the matched content can be reused in the replacement string
[root@host3 ~]# echo this is my sister | sed 's/\w\+/<&>/g'    # wrap every word in angle brackets
<this> <is> <my> <sister>
[root@host3 ~]# echo this is my sister | sed 's/\w\+/[&]/g'    # wrap every word in square brackets
[this] [is] [my] [sister]
# The regular expression \w\+ matches each word, which we replace with [&]; & corresponds to the previously matched word
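Related to &, sed can also capture sub-matches with \( \) and refer back to them as \1, \2, and so on (standard sed back-references, sketched here):

echo "this is an example" | sed 's/\(\w\+\) \(\w\+\)/\2 \1/'
# prints: is this an example   (the first two captured words are swapped)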
Quote
sed expressions are usually quoted with single quotes, but double quotes can also be used; they are useful when we want to use shell variables inside a sed expression
[root@host3 ~]# text=hello
[root@host3 ~]# echo hello world | sed "s/$text/HELLO/"
HELLO world
awk
Special variable:
- NR: indicates the number of records, which corresponds to the current line number during execution
- NF: indicates the number of fields, which corresponds to the current number of fields during execution
- $0: text content of current line during execution
Principle of use:
- Make sure the entire awk command is enclosed in single quotes
- Make sure all quotes in the command appear in pairs
- Make sure action statements are enclosed in curly braces; condition patterns are not enclosed in braces
awk -F: '{print NR}' /etc/passwd    # print the line number of each line
awk -F: '{print NF}' /etc/passwd    # print the number of fields of each line
[root@host3 ~]# cat passwd
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
elk:x:1000:1000::/home/elk:/bin/bash
ntp:x:38:38::/etc/ntp:/sbin/nologin
saslauth:x:998:76:Saslauthd user:/run/saslauthd:/sbin/nologin
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
[root@host3 ~]# awk -F: '{print NR}' passwd
1
2
3
4
5
6
7
8
[root@host3 ~]# awk -F: '{print NF}' passwd
7
7
7
7
7
7
7
7
[root@host3 ~]# awk -F: 'END{print NF}' passwd
7
[root@host3 ~]# awk -F: 'END{print NR}' passwd
8
# With only an END block: each time a line is read, awk updates NR to the current line number. At the last line, NR holds the final line number, hence the file's line count
- awk 'BEGIN{ print "start" } pattern { commands } END{ print "end" }'
- Use single and double quotation marks to enclose awk
- awk 'BEGIN{ statements } { statements } END{ statements }'
- awk scripts usually consist of three parts, BEGIN, END, and common statement blocks with pattern matching options. All three parts are optional
[root@host3 ~]# awk 'BEGIN{ i=0 } { i++ } END{ print i }' passwd 8
awk splicing:
[root@mgmt-k8smaster01 deployment]# docker images | grep veh
192.168.1.74:5000/veh/zuul              0.0.1-SNAPSHOT.34   41e9c323b825   26 hours ago   172MB
192.168.1.74:5000/veh/vehicleanalysis   0.0.1-SNAPSHOT.38   bca9981ac781   26 hours ago   210MB
192.168.1.74:5000/veh/masterveh         0.0.1-SNAPSHOT.88   265e448020f3   26 hours ago   209MB
192.168.1.74:5000/veh/obugateway        0.0.1-SNAPSHOT.18   a4b3309beccd   8 days ago     182MB
192.168.1.74:5000/veh/frontend          1.0.33              357b20afec08   11 days ago    131MB
192.168.1.74:5000/veh/rtkconsumer       0.0.1-SNAPSHOT.12   4c2e63b5b2f6   2 weeks ago    200MB
192.168.1.74:5000/veh/user              0.0.1-SNAPSHOT.14   015fc6516533   2 weeks ago    186MB
192.168.1.74:5000/veh/rtkgw             0.0.1-SNAPSHOT.12   a17a3eed4d28   2 months ago   173MB
192.168.1.74:5000/veh/websocket         0.0.1-SNAPSHOT.7    a1af778846e6   2 months ago   179MB
192.168.1.74:5000/veh/vehconsumer       0.0.1-SNAPSHOT.20   4a763860a5c5   2 months ago   200MB
192.168.1.74:5000/veh/dfconsumer        0.0.1-SNAPSHOT.41   2e3471d6ca27   2 months ago   200MB
192.168.1.74:5000/veh/auth              0.0.1-SNAPSHOT.4    be5c86dd285b   3 months ago   185MB
[root@mgmt-k8smaster01 deployment]# docker images | grep veh | awk '{a=$1;b=$2;c=(a":"b);print c}'
192.168.1.74:5000/veh/zuul:0.0.1-SNAPSHOT.34
192.168.1.74:5000/veh/vehicleanalysis:0.0.1-SNAPSHOT.38
192.168.1.74:5000/veh/masterveh:0.0.1-SNAPSHOT.88
192.168.1.74:5000/veh/obugateway:0.0.1-SNAPSHOT.18
192.168.1.74:5000/veh/frontend:1.0.33
192.168.1.74:5000/veh/rtkconsumer:0.0.1-SNAPSHOT.12
192.168.1.74:5000/veh/user:0.0.1-SNAPSHOT.14
192.168.1.74:5000/veh/rtkgw:0.0.1-SNAPSHOT.12
192.168.1.74:5000/veh/websocket:0.0.1-SNAPSHOT.7
192.168.1.74:5000/veh/vehconsumer:0.0.1-SNAPSHOT.20
192.168.1.74:5000/veh/dfconsumer:0.0.1-SNAPSHOT.41
192.168.1.74:5000/veh/auth:0.0.1-SNAPSHOT.4
awk works as follows:
- 1. Execute the contents of begin {commands} statement block
- 2. Execute the middle block, pattern { commands }, repeatedly until the entire input has been read
- 3. When reading to the end of the input stream, execute the end {commands} statement block
We can add the value of the first field in each row, that is, sum the columns
[root@host3 ~]# cat sum.data
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3
5 5 5 6 6 6
[root@host3 ~]# cat sum.data | awk 'BEGIN{ sum=0 } { print $1; sum+=$1 } END { print sum }'
1
2
3
5
11
[root@host3 ~]# awk '{if($2==3)print $0}' sum.data
3 3 3 3 3 3
[root@host3 ~]# awk '{if($2==5)print $0}' sum.data
5 5 5 6 6 6
# Add 1 to every value:
[root@host2 ~]# cat passwd
1:2:3:4
5:5:5:5
3:2:3:5
[root@host2 ~]# cat passwd | awk -F: '{for(i=1;i<=NF;i++){$i+=1}}{print $0}'
2 3 4 5
6 6 6 6
4 3 4 6
[root@host2 ~]# cat passwd | awk -F: '{$2=$2+1;print $0}'
1 3 3 4
5 6 5 5
3 3 3 5
[root@host2 ~]# cat passwd | awk -F: '{if($2==2) $2=$2+1;print $0}'
1 3 3 4
5:5:5:5
3 3 3 5
# Replace every 2 in field 2 with "jack fuck". To be more standard, the expression should also be wrapped in parentheses:
[root@host2 ~]# cat passwd | awk -F: '{if($2==2) $2="jack fuck";print $0}'
1 jack fuck 3 4
5:5:5:5
3 jack fuck 3 5
[root@host2 ~]# cat passwd | awk -F: '{if($2==2) ($2="jack fuck");print $0}'
1 jack fuck 3 4
5:5:5:5
3 jack fuck 3 5
Pass external variables to awk
# With the -v option we can pass external values to awk:
[root@host3 ~]# VAR1=10000
[root@host3 ~]# echo | awk -v VAR=$VAR1 '{print VAR}'
10000
# The input comes from standard input, hence the echo
# There is another, more flexible way to pass multiple external variables to awk:
[root@host3 ~]# VAR1=10000
[root@host3 ~]# VAR2=20000
[root@host3 ~]# echo | awk '{ print v1,v2 }' v1=$VAR1 v2=$VAR2
10000 20000
Use filter mode to filter the rows processed by awk
[root@host3 ~]# cat sum.data
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3
5 5 5 6 6 6
# Lines with line number less than 3:
[root@host3 ~]# awk 'NR<3' sum.data
1 2 3 4 5 6
2 2 2 2 2 2
# Lines with line numbers from 1 to 3:
[root@host3 ~]# awk 'NR==1,NR==3' sum.data
1 2 3 4 5 6
2 2 2 2 2 2
3 3 3 3 3 3
# Lines matching the pattern linux:
awk '/linux/'
# Lines not matching the pattern linux:
awk '!/linux/'
Merge multiple files by column
paste
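The original leaves paste without an example, so here is a minimal sketch (the two input files are assumed to hold 1/2/3 and slynux/gnu/bash, one item per line):

paste file1.txt file2.txt           # default delimiter is a tab
# 1	slynux
# 2	gnu
# 3	bash
paste file1.txt file2.txt -d ","    # -d sets the delimiter
# 1,slynux
# 2,gnu
# 3,bash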
wget
- Download multiple files wget URL1 URL2 URL3
- wget specifies the retry count with -t; wget -t 0 URL retries indefinitely
- Speed limiting: wget --limit-rate 20k http://www.baidu.com
- A download quota can be set when downloading multiple files; the download stops once the quota is exhausted: wget --quota 100M URL1 URL2 URL3
- Resuming an interrupted download: wget -c URL
- To access HTTP or FTP pages that require authentication: wget --user username --password password URL; to enter the password manually instead of putting it on the command line, replace --password with --ask-password
Copy entire site (crawler)
wget has an option to recursively traverse all URL links on a web page like a crawler and download them one by one. So we can get all the pages of this site
wget --mirror --convert-links www.chinanews.com
[root@host3 tmp]# ls
www.chinanews.com
[root@host3 tmp]# cd www.chinanews.com/
[root@host3 www.chinanews.com]# ls
allspecial  auto  cj             common   gangao  hb  huaren      js          m     piaowu  robots.txt   sh      society  taiwan  tp
app         china  cns2012.shtml  fileftp  gn      hr  index.html  live.shtml  part  pv      scroll-news  shipin  stock    theory
[root@host3 www.chinanews.com]# ll
total 260
drwxr-xr-x 2 root root     25 Oct 12 14:11 allspecial
drwxr-xr-x 3 root root     23 Oct 12 14:11 app
drwxr-xr-x 3 root root     18 Oct 12 14:11 auto
drwxr-xr-x 2 root root     24 Oct 12 14:11 china
drwxr-xr-x 3 root root     18 Oct 12 14:11 cj
-rw-r--r-- 1 root root  15799 Oct 12 14:11 cns2012.shtml
drwxr-xr-x 3 root root     46 Oct 12 14:11 common
drwxr-xr-x 6 root root     54 Oct 12 14:11 fileftp
drwxr-xr-x 2 root root     24 Oct 12 14:11 gangao
drwxr-xr-x 4 root root     27 Oct 12 14:11 gn
drwxr-xr-x 2 root root     24 Oct 12 14:11 hb
drwxr-xr-x 3 root root     18 Oct 12 14:11 hr
drwxr-xr-x 2 root root     24 Oct 12 14:11 huaren
-rw-r--r-- 1 root root 184362 Oct 12 14:11 index.html
drwxr-xr-x 2 root root     26 Oct 12 14:11 js
# --convert-links instructs wget to rewrite the page's links to local addresses
Download Web page as plain text
Web pages are HTML by default and need a browser to view them. lynx is a command-line browser with many tricks up its sleeve; it can be used to obtain a plain-text rendering of a web page
# The lynx -dump option stores the web page content as ASCII-encoded text in a file:
[root@host3 tmp]# yum -y install lynx
[root@host3 tmp]# lynx www.chinanews.com -dump > abc.txt
[root@host3 tmp]# cat abc.txt
...
1. http://www.chinanews.com/kong/2019/10-12/8976714.shtml
2. http://www.chinanews.com/kong/2019/10-12/8976812.shtml
3. http://www.chinanews.com/kong/2019/10-12/8976721.shtml
4. http://www.chinanews.com/kong/2019/10-12/8976690.shtml
5. http://www.chinanews.com/kong/2019/10-12/8976817.shtml
6. http://www.chinanews.com/kong/2019/10-12/8976794.shtml
7. http://www.chinanews.com/kong/2019/10-12/8976853.shtml
8. http://www.chinanews.com/kong/2019/10-12/8976803.shtml
9. http://www.chinanews.com/sh/2019/10-12/8976754.shtml
10. http://www.chinanews.com/tp/chart/index.shtml
11. http://www.chinanews.com/tp/hd2011/2019/10-12/907641.shtml
12. http://www.chinanews.com/tp/hd2011/2019/10-12/907637.shtml
13. http://www.chinanews.com/tp/hd2011/2019/10-12/907651.shtml
14. http://www.chinanews.com/tp/hd2011/2019/10-12/907644.shtml
15. http://www.chinanews.com/tp/hd2011/2019/10-12/907675.shtml
16. http://www.chinanews.com/tp/hd2011/2019/10-12/907683.shtml
17. http://www.chinanews.com/tp/hd2011/2019/10-12/907656.shtml
18. http://www.ecns.cn/video/2019-10-12/detail-ifzpuyxh5816910.shtml
19. http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815962.shtml
20. http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815122.shtml
21. http://www.ecns.cn/video/2019-10-11/detail-ifzpuyxh5815100.shtml
curl
Set cookie
To specify a cookie, use the -- cookie "COOKIES" option
Cookies are given in the form name=value, with multiple cookies separated by semicolons, for example: --cookie "user=slynux;pass=hack"
To save the cookies to a file, use the --cookie-jar option, for example: --cookie-jar cookie_file
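Putting the two cookie options together (a sketch; example.com and the cookie names are placeholders):

curl http://example.com --cookie-jar cookies.txt            # save cookies sent by the site into cookies.txt
curl http://example.com --cookie "user=slynux;pass=hack"    # send cookies with the request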
Set user agent string
If the user agent is not specified, some web pages that validate the user agent will not display. You must have come across old websites that only work under IE: with any other browser they report that the site must be viewed in IE, because they check the user agent. curl can set the user agent:
- --user-agent or -A sets the user agent: curl URL --user-agent "Mozilla/5.0"
- -H passes header information, and multiple headers are allowed: curl -H "Host: www.baidu.com" -H "Accept-Language: en" URL
Print only headers
-I or -- head
[root@host3 tmp]# curl -I www.chinanews.com
HTTP/1.1 200 OK
Date: Sat, 12 Oct 2019 08:47:31 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Sat, 12 Oct 2019 08:48:22 GMT
Server: nginx/1.12.2
Cache-Control: max-age=120
Age: 69
X-Via: 1.1 PSbjwjBGP2ih137:5 (Cdn Cache Server V2.0), 1.1 shx92:3 (Cdn Cache Server V2.0), 1.1 PSjsczsxrq176:3 (Cdn Cache Server V2.0), 1.1 iyidong70:11 (Cdn Cache Server V2.0)
Analyze website data
lynx is a command-line web browser. Rather than dumping a pile of raw HTML, it renders the text version of the site exactly as we would see it in a browser, which saves us from stripping HTML tags. The -nolist option is used here because there is no need to number every link.
[root@host3 tmp]# lynx www.chinanews.com -dump -nolist ... Friendship link The Ministry of foreign affairs, the office of overseas Chinese, the supervision department of the Central Commission for Discipline Inspection, the office of Taiwan Affairs, the people's court network, the people's network, the Xinhua network, the China network, the CCTV network, the international online network, the China Youth Network, the China economic network, the Taiwan network, the CCTV network| Tibet network, China Youth Online, Guangming network, China military network, legal system network, China network, new Beijing News, Beijing News Network, Jinghua network, Sichuan Radio and television station, Qianlong Network, Hualong network, Hongwang network, Shunwang network, Jiaodong Online| Northeast news network | northeast network | Qilu hotline | Sichuan news network | Great Wall network | South Network | North network | East network | Sina | Sohu | Netease | Tencent | China Jingwei | East wealth network | financial sector | Huike | real estate world About us | about us | contact us | advertising service | contribution service | legal statement | Recruitment Information | website map |Message feedback The information published on this website does not represent the views of China News Agency and China news network. Articles published on this website shall be authorized in writing. Unauthorized reprint, excerpt, copy and establishment of image are prohibited. Violators will be prosecuted according to law. [license for online dissemination of audio-visual programs (0106168)] [jingicp Certificate No. 040655] [ghs.png] Jinggong network security 11000002003042] [Jing ICP Bei 05004340-1] switchboard: 86-10-87826688 Tel. of illegal and bad information report: 1569978800 email: jubao@chinanews.com.cn administrative measures for report acceptance and handling Copyright ©1999- 2019 chinanews.com. All Rights Reserved [_1077593327_3.gif] [U194P4T47D45262F978DT20190920162854.jpg] [_1077593327_3.gif] [U194P4T47D45262F979DT20190920162854.jpg]
case
case $variable in
"value 1")
    # If the variable equals value 1, execute program 1
    ;;
"value 2")
    # If the variable equals value 2, execute program 2
    ;;
# ... other branches omitted
*)
    # If the variable has none of the values above, execute this program
    ;;
esac

#!/bin/bash
# Check the user's input
read -p "Please choose yes/no: " -t 30 cho
# Print "Please choose yes/no: " on the screen, then assign the user's choice to the variable cho
case $cho in                          # check the value of the variable cho
"yes")                                # if it is yes
    echo "Your choose is yes!"        # execute program 1
    ;;
"no")                                 # if it is no
    echo "Your choose is no!"         # execute program 2
    ;;
*)                                    # if it is neither yes nor no
    echo "Your choose is error!"      # execute this program
    ;;
esac
Find invalid links in Web site
Manually checking every page of a website for invalid links is laborious. Instead, we can identify all the links automatically and test which of them are broken:
[root@host3 tmp]# cat find_broken.sh
#!/bin/bash
if [ $# -ne 1 ]; then
    echo -e "Usage: $0 URL\n"
    exit 1;
fi

echo Broken links:

# $$ is the PID of the running script
mkdir /tmp/$$.lynx
cd /tmp/$$.lynx

lynx -traversal $1 > /dev/null
count=0;

sort -u reject.dat > links.txt

while read link;
do
    output=`curl -I $link -s | grep "HTTP/.*OK"`
    if [[ -z $output ]]; then
        echo $link;
        let count++
    fi
done < links.txt

[ $count -eq 0 ] && echo No broken links found.

# lynx -traversal URL generates several files in the working directory, including reject.dat, which contains all of the site's links. sort -u builds a list without duplicates. We then check each link's header with curl -I
# sort -u removes duplicates, similar to uniq
- Judging by its name, the reject.dat produced by lynx -traversal should contain a list of invalid URLs; in fact it does not: it holds all the URLs, which is why we test them ourselves
- lynx also generates a traverse.errors file containing all the URLs that had problems during browsing. But lynx only records URLs returning HTTP 404 and misses URLs with other types of errors, which is why we check the return status manually