centos7 hadoop+hive installation

Keywords: hive Hadoop vim MySQL

Prepare four virtual machines

Virtual Machine Installation

1. Create a new virtual machine
2. Choose Typical Installation (Recommended)
3. Select Chinese as the language and choose to partition the disk manually
# Partition configuration used here
/boot 200M
swap 512M  # swap space, used when physical memory runs out
/          # root partition, remaining space
4. Configure the remaining options and complete the installation

Update yum

yum update -y

IPs of the four hosts

One master and three slaves
 172.20.10.9   hadoop01  (master)  password: hadoop01
 172.20.10.10  hadoop02  (slave)   password: hadoop02
 172.20.10.11  hadoop03  (slave)   password: hadoop03
 172.20.10.12  hadoop04  (slave)   password: hadoop04

#Reset the password for root
passwd root
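The guide refers to the machines by these hostnames throughout, so set each node's hostname to match the plan above. A minimal sketch using hostnamectl (available on CentOS 7); run it once per node with that node's own name:

# On hadoop01 (use hadoop02/03/04 on the other nodes)
hostnamectl set-hostname hadoop01
# Confirm
hostname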

hadoop installation

https://www.cnblogs.com/shireenlee4testing/p/10472018.html

Configure DNS

Configure this on every node

vim /etc/hosts

172.20.10.9   hadoop01
172.20.10.10  hadoop02
172.20.10.11  hadoop03
172.20.10.12  hadoop04
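A quick check that the new entries resolve from this node (a simple sketch; assumes ICMP is not blocked):

for h in hadoop01 hadoop02 hadoop03 hadoop04; do ping -c 1 $h; done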

Close Firewall

# Close Firewall
systemctl stop firewalld

# Disable firewalld from starting on boot
systemctl disable firewalld

Configure passwordless SSH login

https://www.cnblogs.com/shireenlee4testing/p/10366061.html

Generate ssh key
# Generate ssh key
ssh-keygen -t rsa

cd /root/.ssh
ls 

# Copy the public key to a specific file authorized_keys on the primary node (hadoop01)
cp id_rsa.pub authorized_keys

# Copy authorized_keys to hadoop02
scp authorized_keys root@hadoop02:/root/.ssh/

# Log on to hadoop02 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copy authorized_keys to hadoop03
scp authorized_keys root@hadoop03:/root/.ssh/

# Log on to the hadoop03 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copying authorized_keys to hadoop04
scp authorized_keys root@hadoop04:/root/.ssh/

# Log on to hadoop04 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copy the generated authorized_keys to hadoop01, hadoop02, hadoop03
scp authorized_keys root@hadoop01:/root/.ssh/
scp authorized_keys root@hadoop02:/root/.ssh/
scp authorized_keys root@hadoop03:/root/.ssh/

# Verify passwordless login
# Use ssh <username>@<hostname> or ssh <ip address>; no password prompt should appear
ssh root@hadoop02
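To check every node in one pass, a small loop like the following can be used; BatchMode makes ssh fail instead of prompting, so any node that still requires a password shows up as an error:

for h in hadoop01 hadoop02 hadoop03 hadoop04; do
  ssh -o BatchMode=yes root@$h hostname
done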

Download JDK 8 with wget

https://blog.csdn.net/u014700139/article/details/89960494
# Copy the downloaded jdk to hadoop02, hadoop03, hadoop04
scp -r -P 22 jdk.tar.gz root@hadoop02:~/
scp -r -P 22 jdk.tar.gz root@hadoop03:~/
scp -r -P 22 jdk.tar.gz root@hadoop04:~/

Configuring the JDK environment

tar -zxvf jdk.tar.gz
mv jdk1.8.0_241 /opt/
# Create a symbolic link
ln -s /opt/jdk1.8.0_241 /opt/jdk

# Configure the java environment
vim /etc/profile
# Java
export JAVA_HOME=/opt/jdk
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin

# Make environment variables valid
source /etc/profile

# Verify java installation
java -version
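With JDK 1.8.0_241 the output should look roughly like the following (exact build numbers may differ):

java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)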

Setting up a fully distributed Hadoop cluster

Hadoop download

http://mirror.bit.edu.cn/apache/hadoop/common/
# Download Hadoop (the rest of this guide uses 3.2.0)
wget http://us.mirrors.quenda.co/apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
1. Configure hadoop environment variables (per node)
# Unzip under opt
tar -zxvf hadoop-3.2.0.tar.gz -C /opt/

vim /etc/profile
# hadoop
export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

# Reload the profile to apply the changes
source /etc/profile
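A quick sanity check that the environment variables took effect (should print the Hadoop 3.2.0 version banner):

hadoop version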
2. Configure JAVA_HOME parameters in the Hadoop environment script file
cd /opt/hadoop-3.2.0/etc/hadoop
#Add or modify the following parameters in the hadoop-env.sh, mapred-env.sh, yarn-env.sh files, respectively
vim hadoop-env.sh
vim mapred-env.sh
vim yarn-env.sh
export JAVA_HOME="/opt/jdk"
3. Modify Hadoop Profile

cd /opt/hadoop-3.2.0/etc/hadoop

In the etc/hadoop directory under the Hadoop installation directory, edit the core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and workers files, adjusting the configuration as appropriate.

Create the temporary directory

mkdir -p /opt/hadoop/tmp

core-site.xml (Configure Common Component Properties)
<configuration>
  <property>
      <!-- To configure hdfs address -->
      <name>fs.defaultFS</name>
      <value>hdfs://hadoop01:9000</value>
  </property>
  <property>
      <!-- Temporary file directory; the tmp directory must be created under /opt/hadoop first -->
      <name>hadoop.tmp.dir</name>
     <value>/opt/hadoop/tmp</value>
 </property>
 </configuration>
hdfs-site.xml (configure HDFS component properties)
<configuration>
      <property>
         <!-- NameNode web UI address -->
          <name>dfs.namenode.http-address</name>
          <value>hadoop01:50070</value>
      </property>
      <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/opt/hadoop/dfs/name</value>
     </property>
     <property>
         <name>dfs.datanode.data.dir</name>
         <value>file:/opt/hadoop/dfs/data</value>
     </property>
     <property>
        <!-- Replication factor; the default is 3 -->
        <name>dfs.replication</name>
         <value>3</value>
     </property>
        <property> 
              <name>dfs.webhdfs.enabled</name> 
              <value>true</value> 
         </property>
      <property>
            <name>dfs.permissions</name>
            <value>false</value>
            <description>Setting this to false disables HDFS permission checks, which makes it easier to create files on DFS, but be careful not to delete files by mistake.</description>
       </property>
 </configuration>
mapred-site.xml (configure Map-Reduce component properties)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <!-- Run MapReduce on YARN -->
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop01:19888</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>
yarn-site.xml (Configure Resource Scheduling Properties)
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <!-- Hostname of YARN's ResourceManager; if not set, the active node count always shows 0 -->
        <value>hadoop01</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <!-- How reducers fetch data -->
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop01:8088</value>
        <description>For access from an external network, replace this with the node's real external IP; otherwise it defaults to localhost:8088</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>Maximum memory that can be allocated to a single container, in MB; the default is 8192 MB</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Disable virtual-memory checks. Useful when installing on virtual machines; it avoids containers being killed by spurious virtual-memory limits later on.</description>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
workers
vim workers
# Add the worker node hostnames
hadoop02
hadoop03
hadoop04
4. Copy the configured folders to the other slave nodes (hadoop02, hadoop03, hadoop04)
scp -r /opt/hadoop-3.2.0 root@hadoop02:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop03:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop04:/opt/

scp -r /opt/hadoop root@hadoop02:/opt/
scp -r /opt/hadoop root@hadoop03:/opt/
scp -r /opt/hadoop root@hadoop04:/opt/
5. Configure startup scripts, add HDFS and Yarn permissions
# Add HDFS user settings: edit the following scripts and add the variables below near the top (right after the first line)
cd /opt/hadoop-3.2.0/sbin
vim start-dfs.sh 
vim stop-dfs.sh

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
# Add YARN user settings: edit the following scripts and add the variables below near the top (right after the first line)
cd /opt/hadoop-3.2.0/sbin
vim start-yarn.sh 
vim stop-yarn.sh 

YARN_RESOURCEMANAGER_USER=root
HDFS_DATANODE_SECURE_USER=yarn
YARN_NODEMANAGER_USER=root
6. Initialization & Startup
cd /opt/hadoop-3.2.0
# Initialize: format the NameNode (the trailing argument is an optional cluster name)
bin/hdfs namenode -format wmqhadoop

# Start HDFS and YARN
sbin/start-dfs.sh
sbin/start-yarn.sh

# Alternatively, start everything at once
sbin/start-all.sh

# Stop everything
sbin/stop-all.sh
7. Verify Hadoop started successfully
jps
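If startup succeeded, jps on the master (hadoop01) should list roughly the processes below, and each worker should show DataNode and NodeManager (process IDs omitted here):

NameNode
SecondaryNameNode
ResourceManager
Jps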
Open the ResourceManager page in the browser: http://hadoop01:8088

Open the Hadoop NameNode page in the browser: http://hadoop01:50070

mysql-5.7 installation

download

wget http://repo.mysql.com/yum/mysql-5.7-community/el/7/x86_64/mysql57-community-release-el7-10.noarch.rpm

rpm -ivh mysql57-community-release-el7-10.noarch.rpm

Use the yum command to complete the installation

1. Install:
yum -y install mysql-community-server

2. Start MySQL:
systemctl start mysqld

3. Get the temporary password generated during installation (needed for the first login):
grep 'temporary password' /var/log/mysqld.log
# Example output fragment (yours will differ): sGpt=V+8f,qv

4. Enable MySQL to start on boot:
systemctl enable mysqld

Log in

mysql -uroot -p
# Enter the temporary password obtained above

Change Password

ALTER USER 'root'@'localhost' IDENTIFIED BY 'Mysql123!';

Set to allow remote login

1. Execute authorization commands

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'Mysql123!' WITH GRANT OPTION;

2. Exit the mysql console

exit

3. Open port 3306

Open Firewall

sudo systemctl start firewalld.service

Open port 3306 permanently

sudo firewall-cmd --add-port=3306/tcp --permanent

Reload

sudo firewall-cmd --reload

Close Firewall

sudo systemctl stop firewalld.service

Set default encoding to utf8

View mysql encoding before modification

show variables like '%chara%';

Edit /etc/my.cnf and add the following two lines under the [mysqld] section

vim /etc/my.cnf
character_set_server=utf8
init_connect='SET NAMES utf8'
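For reference, the relevant part of /etc/my.cnf then looks roughly like this (other options in the file stay unchanged):

[mysqld]
character_set_server=utf8
init_connect='SET NAMES utf8'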

After modification, restart mysql

sudo systemctl restart mysqld

hive installation

https://blog.csdn.net/qq_39315740/article/details/98626518 #Recommended

https://blog.csdn.net/weixin_43207025/article/details/101073351

hive Download

http://mirror.bit.edu.cn/apache/hive/

hive installation

tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/

Configuring environment variables

vim /etc/profile
# hive
export HIVE_HOME=/opt/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profile

Create hive-site.xml file

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-default.xml.template hive-site.xml

For the following HDFS-related settings in hive-site.xml, the corresponding directories need to be created in HDFS first

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>

Create HDFS Folder

hadoop fs -mkdir -p /user/hive/warehouse   # create folder
hadoop fs -mkdir -p /tmp/hive    # create folder
hadoop fs -chmod -R 777 /user/hive/warehouse   # Grant privileges
hadoop fs -chmod -R 777 /tmp/hive   # Grant privileges

# Check to see if the creation was successful
hadoop fs -ls /

Hive-related configuration

Change ${system:java.io.tmpdir} in hive-site.xml to Hive's local temporary directory and ${system:user.name} to the user name.
Create temp directory

cd /opt/apache-hive-3.1.2-bin
mkdir temp
chmod -R 777 temp
# Replace ${system:java.io.tmpdir} with /opt/apache-hive-3.1.2-bin/temp
# Replace ${system:user.name} with root
vim hive-site.xml
# Run these substitutions in vim command mode:
:%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
:%s/${system:user.name}/root/g

Database Related Configuration

# JDBC URL of the metastore database; change localhost in the value to the MySQL host's IP if MySQL runs on another machine
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value>
  </property>
  
# JDBC driver class name for the database
# The newer 8.0 driver class is com.mysql.cj.jdbc.Driver
# The older 5.x driver class is com.mysql.jdbc.Driver
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  
# Database User Name
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
 
# Database Password
   <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>Mysql123!</value> <!-- change to your own MySQL password -->
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>

Configure hive-log4j2.properties

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-log4j2.properties.template hive-log4j2.properties

vim hive-log4j2.properties
# Modify the following property
property.hive.log.dir = /opt/apache-hive-3.1.2-bin/temp/root

Configure hive-env.sh file

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-env.sh.template hive-env.sh
vim hive-env.sh

Add the following:

export JAVA_HOME=/opt/jdk
export HADOOP_HOME=/opt/hadoop-3.2.0
export HIVE_CONF_DIR=/opt/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/opt/apache-hive-3.1.2-bin/lib

Hive Start

Download the MySQL 5.7 JDBC driver
https://blog.csdn.net/qq_41950447/article/details/90085170
# Download the MySQL Connector/J driver

wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.48.tar.gz

# Unpack it and copy the driver jar into Hive's lib directory
tar -zxvf mysql-connector-java-5.1.48.tar.gz
cp mysql-connector-java-5.1.48/mysql-connector-java-5.1.48-bin.jar /opt/apache-hive-3.1.2-bin/lib/
Initialization
schematool -dbType mysql -initSchema
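If initialization succeeds, a quick way to confirm the metastore is usable is to open the Hive CLI and list databases (assumes HDFS and YARN are already running):

hive
hive> show databases;
# should list at least the built-in "default" database
hive> quit;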
Problems
# hive initialization error
http://www.lzhpo.com/article/98
# Compare the guava jar versions shipped with Hadoop and Hive
cd /opt/hadoop-3.2.0/share/hadoop/common/lib
ll | grep guava

cd /opt/apache-hive-3.1.2-bin/lib
ll | grep guava

# Fix: remove Hive's lower-version guava-19.0.jar and replace it with Hadoop's higher-version guava-27.0-jre.jar
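Concretely, the swap amounts to something like the following (a sketch; check the exact file names with the ls commands above):

rm /opt/apache-hive-3.1.2-bin/lib/guava-19.0.jar
cp /opt/hadoop-3.2.0/share/hadoop/common/lib/guava-27.0-jre.jar /opt/apache-hive-3.1.2-bin/lib/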

# There are also questions to refer to
https://blog.csdn.net/qq_39315740/article/details/98626518

Hadoop 3.1.2 + Hive 3.1.1 Installation

https://www.cnblogs.com/weavepub/p/11130869.html

other

Modify vim comment color

Create a .vimrc file in the user's home directory
vim ~/.vimrc
# Add the following line and save
hi Comment ctermfg=blue

vim replacement

:%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
:%s/${system:user.name}/root/g
Posted by Derek on Fri, 31 Jan 2020 20:33:17 -0800