Shell distribution script

Keywords: Linux shell

Principle

In a cluster we often need to copy the same files to the same directory on every node. Running the copy command by hand for one node after another is inefficient, so we wrap it in a script that loops over the nodes.

Core idea
A wrapper around rsync

rsync command analysis

Characteristics

rsync is a remote synchronization tool.
It is mainly used for backup and mirroring. It is fast, avoids copying content that is already identical on the destination, and supports symbolic links.

Difference between rsync and scp
rsync: faster, because it only updates files that differ. The first synchronization copies everything, so it behaves like scp.
scp: copies all files every time.

Basic syntax

rsync -av (options) <path of the file to copy> <destination user>@<host>:<destination path>

-a archive copy
-v displays the copy process
-r handles subdirectories recursively (already implied by -a)

Implementation

Requirements

Loop over all nodes and copy files to the same directory on each of them.

Command: xsync <name of the file to synchronize>

In this version the cluster nodes are hard-coded in the script. Another approach is to pass the target nodes on the command line, and a parameter could also decide whether files or directories are copied.

Here we do the simplest implementation, with the cluster nodes listed in the script. If other requirements come up later, improve on this basis.
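For reference, the alternative of passing the nodes on the command line might look roughly like the sketch below. It is not part of the script in this article: the -n option and the parse_hosts function are hypothetical names chosen for illustration, and the default host list is the one used later in this article.

```shell
#!/bin/bash
# Hypothetical helper: choose target hosts from an optional -n "host1 host2" flag,
# falling back to the hard-coded cluster list used in this article.
DEFAULT_HOSTS="hadoop102 hadoop103 hadoop104"

parse_hosts() {
    local hosts="$DEFAULT_HOSTS"
    local opt
    local OPTIND=1          # reset so the function can be called more than once
    while getopts "n:" opt; do
        case "$opt" in
            n) hosts="$OPTARG" ;;   # host list given by the caller
            *) return 1 ;;          # unknown option
        esac
    done
    echo "$hosts"
}

parse_hosts -n "hadoop103 hadoop104" a.txt   # prints: hadoop103 hadoop104
parse_hosts a.txt                            # prints: hadoop102 hadoop103 hadoop104
```

The returned list could then replace the fixed host names in the for loop of the script.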

Implementing requirements using rsync

rsync -av /opt/module/ ranan@hadoop103:/opt/module

So the script needs three pieces of information: the current path, the current user name, and the name of the target host.
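As a minimal sketch of how those three values fit together, the function below only prints the rsync command it would run instead of executing it. The name print_rsync_cmd is hypothetical, and using id -un for the current user is an assumption; the article's script simply lets ssh and rsync default to the current user.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) the rsync command for one file and one host.
print_rsync_cmd() {
    local file="$1" host="$2"
    local user pdir fname
    user=$(id -un)                               # current user name
    pdir=$(cd -P "$(dirname "$file")" && pwd)    # absolute parent directory
    fname=$(basename "$file")                    # last path component
    echo "rsync -av $pdir/$fname $user@$host:$pdir"
}

print_rsync_cmd /etc/hosts hadoop103
```

Printing the command first is a cheap way to check the paths before anything is actually transferred.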

Environment variable

The script should be usable from any path.

To make that possible, place the script in a directory that is listed in the global PATH environment variable.

You can view the directories listed in PATH with echo $PATH
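A small sketch for checking whether a given directory is already listed in PATH. The function name in_path is hypothetical, and $HOME/bin is only an assumed location for the script; substitute whichever directory you actually use.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given directory is one of the PATH entries.
# Wrapping PATH in colons lets the first and last entries match the same pattern.
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if in_path "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH"
fi
```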

Script implementation

#!/bin/bash
# The line above selects bash as the interpreter. It must be the first line of the script.

#1. $# is the number of arguments.
# -lt means "less than": at least one file to transfer is required.
if [ $# -lt 1 ]
then
    echo "Not Enough Arguments!"
    exit 1
fi

#2. Traverse all machines in the cluster (hadoop102, hadoop103, hadoop104)
for host in hadoop102 hadoop103 hadoop104
do
    # $host expands to the current value of the loop variable
    echo "====================  $host  ===================="
    #3. "$@" expands to all command-line arguments, that is, the files to transfer (one or more)
    for file in "$@"
    do
        #4. -e tests whether the file exists
        if [ -e "$file" ]
        then
            #5. Get the absolute parent directory
            # dirname prints the directory that contains the given path. For example, dirname /home/xu prints /home.
            # $(command) is equivalent to `command`: the shell runs the command and substitutes its standard output.
            # cd -P: if the directory is a link, switch to the actual path, so pwd prints the actual path instead of the link path.
            pdir=$(cd -P "$(dirname "$file")" && pwd)

            #6. Get the name of the current file
            fname=$(basename "$file")

            # Create the directory first so that the destination path exists when the file is transferred
            # ssh xxx.xxx.xxx.xxx "command" executes the command remotely
            # mkdir -p creates missing parent directories as needed and reports no error if the directory already exists
            ssh "$host" "mkdir -p $pdir"
            rsync -av "$pdir/$fname" "$host:$pdir"
        else
            # The file does not exist
            echo "$file does not exist!"
        fi
    done
done

Knowledge points

Get the parent directory of a path: dirname

pdir=$(cd -P $(dirname $file); pwd)

dirname $file prints the directory that contains the path. For example, dirname /home/xu prints /home.
$(command) is replaced by the standard output of the command.
pwd -P: if the directory is a link, the actual path is displayed instead of the link path.
cd -P: if the directory is a link, switch to the directory of the actual path.
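To see the effect of cd -P, the sketch below builds a throwaway directory containing a symbolic link and resolves the parent directory of a file both with and without -P. The directory and file names are made up for the demonstration.

```shell
#!/bin/bash
# Build a temporary tree with a symlink, then resolve a file's parent directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/real/sub"
touch "$tmp/real/sub/a.txt"
ln -s "$tmp/real" "$tmp/link"                    # link -> real

file="$tmp/link/sub/a.txt"

logical=$(cd "$(dirname "$file")" && pwd)        # keeps the link name: .../link/sub
physical=$(cd -P "$(dirname "$file")" && pwd)    # resolves the link:   .../real/sub

echo "logical:  $logical"
echo "physical: $physical"
rm -rf "$tmp"
```

Using the physical path matters for the distribution script because the link itself may not exist on the other nodes.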

Get the file name of a path: basename

basename command format:
basename [pathname] [suffix]
basename [string] [suffix]

suffix is optional. If it is specified, basename removes it from the end of the pathname or string.
Example:

$ basename /tmp/test/file.txt
file.txt
$ basename /tmp/test/file.txt .txt
file
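The script relies on dirname and basename being complementary: joining their results with a slash gives back the original path. A short sketch using the example path from above:

```shell
#!/bin/bash
path=/tmp/test/file.txt
dir=$(dirname "$path")      # /tmp/test
name=$(basename "$path")    # file.txt
echo "$dir/$name"           # /tmp/test/file.txt

# A trailing slash is ignored by both commands.
echo "$(dirname /opt/module/) $(basename /opt/module/)"   # /opt module
```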

Remote command execution from the shell

ssh ranan@xxx.xxx.xxx.xxx "pwd; cat hello.txt"   # Execute two commands remotely as user ranan
ssh xxx.xxx.xxx.xxx "command"

mkdir without an error when the directory exists

mkdir dir: if dir already exists, an error is reported.
mkdir -p dir: creates a multi-level directory, including missing parent directories. If the directory already exists, no error is reported.
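The difference can be checked locally with a small sketch in a temporary directory. The variable names are chosen only for this example.

```shell
#!/bin/bash
tmp=$(mktemp -d)

mkdir -p "$tmp/a/b/c"        # creates a, a/b and a/b/c in one call
first_status=$?
mkdir -p "$tmp/a/b/c"        # already exists: still succeeds
second_status=$?

plain_status=0
mkdir "$tmp/a" 2>/dev/null || plain_status=$?   # plain mkdir fails: a already exists

echo "mkdir -p (new):      $first_status"
echo "mkdir -p (existing): $second_status"
echo "mkdir    (existing): $plain_status"
rm -rf "$tmp"
```

This is why the distribution script can call mkdir -p on every node without first checking whether the directory is there.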

Posted by grantf on Sun, 21 Nov 2021 23:16:16 -0800