centos7 hadoop+hive installation

Keywords: hive Hadoop vim MySQL

Prepare four virtual machines

Virtual Machine Installation

1. Create a new virtual machine
2. Choose Typical Installation (Recommended)
3. Select Chinese as the language and choose to partition the disk manually
# Partition configuration used here
/boot 200M
swap 512M  # swap space, used when physical memory runs out
/          # root partition, remaining space
4. Configure the remaining options and complete the installation

Update yum

yum update -y

IPs of the four hosts

One master and three slaves
 172.20.10.9   hadoop01  (master)  password: hadoop01
 172.20.10.10  hadoop02  (slave)   password: hadoop02
 172.20.10.11  hadoop03  (slave)   password: hadoop03
 172.20.10.12  hadoop04  (slave)   password: hadoop04

#Reset the password for root
passwd root
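The guide refers to the machines by these hostnames throughout, so set each node's hostname to match the plan above. A minimal sketch using hostnamectl (available on CentOS 7); run it once per node with that node's own name:

# On hadoop01 (use hadoop02/03/04 on the other nodes)
hostnamectl set-hostname hadoop01
# Confirm
hostname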

hadoop installation

https://www.cnblogs.com/shireenlee4testing/p/10472018.html

Configure DNS

Configure this on every node

vim /etc/hosts

172.20.10.9   hadoop01
172.20.10.10  hadoop02
172.20.10.11  hadoop03
172.20.10.12  hadoop04
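A quick check that the new entries resolve from this node (a simple sketch; assumes ICMP is not blocked):

for h in hadoop01 hadoop02 hadoop03 hadoop04; do ping -c 1 $h; done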

Close Firewall

# Close Firewall
systemctl stop firewalld

# Disable firewalld from starting on boot
systemctl disable firewalld

Configure passwordless SSH login

https://www.cnblogs.com/shireenlee4testing/p/10366061.html

Generate ssh key
# Generate ssh key
ssh-keygen -t rsa

cd /root/.ssh
ls 

# Copy the public key to a specific file authorized_keys on the primary node (hadoop01)
cp id_rsa.pub authorized_keys

# Copy authorized_keys to hadoop02
scp authorized_keys root@hadoop02:/root/.ssh/

# Log on to hadoop02 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copy authorized_keys to hadoop03
scp authorized_keys root@hadoop03:/root/.ssh/

# Log on to the hadoop03 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copying authorized_keys to hadoop04
scp authorized_keys root@hadoop04:/root/.ssh/

# Log on to hadoop04 host
cd .ssh/
cat id_rsa.pub >> authorized_keys
# Copy the generated authorized_keys to hadoop01, hadoop02, hadoop03
scp authorized_keys root@hadoop01:/root/.ssh/
scp authorized_keys root@hadoop02:/root/.ssh/
scp authorized_keys root@hadoop03:/root/.ssh/

# Verify passwordless login
# Use ssh <username>@<hostname> or ssh <ip address>; no password prompt should appear
ssh root@hadoop02
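To check every node in one pass, a small loop like the following can be used; BatchMode makes ssh fail instead of prompting, so any node that still requires a password shows up as an error:

for h in hadoop01 hadoop02 hadoop03 hadoop04; do
  ssh -o BatchMode=yes root@$h hostname
done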

Download JDK 8 with wget

https://blog.csdn.net/u014700139/article/details/89960494
# Copy the downloaded jdk to hadoop02, hadoop03, hadoop04
scp -r -P 22 jdk.tar.gz root@hadoop02:~/
scp -r -P 22 jdk.tar.gz root@hadoop03:~/
scp -r -P 22 jdk.tar.gz root@hadoop04:~/

Configuring the JDK environment

tar -zxvf jdk.tar.gz
mv jdk1.8.0_241 /opt/
# Create a symbolic link
ln -s /opt/jdk1.8.0_241 /opt/jdk

# Configure the java environment
vim /etc/profile
# Java
export JAVA_HOME=/opt/jdk
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$PATH:$JAVA_HOME/bin

# Make environment variables valid
source /etc/profile

# Verify java installation
java -version
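With JDK 1.8.0_241 the output should look roughly like the following (exact build numbers may differ):

java version "1.8.0_241"
Java(TM) SE Runtime Environment (build 1.8.0_241-b07)
Java HotSpot(TM) 64-Bit Server VM (build 25.241-b07, mixed mode)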

Setting up a fully distributed Hadoop cluster

Hadoop download

http://mirror.bit.edu.cn/apache/hadoop/common/
# Download Hadoop (the rest of this guide uses 3.2.0)
wget http://us.mirrors.quenda.co/apache/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz
1. Configure hadoop environment variables (per node)
# Unzip under opt
tar -zxvf hadoop-3.2.0.tar.gz -C /opt/

vim /etc/profile
# hadoop
export HADOOP_HOME=/opt/hadoop-3.2.0
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

# Reload the profile to apply the changes
source /etc/profile
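A quick sanity check that the environment variables took effect (should print the Hadoop 3.2.0 version banner):

hadoop version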
2. Configure JAVA_HOME parameters in the Hadoop environment script file
cd /opt/hadoop-3.2.0/etc/hadoop
#Add or modify the following parameters in the hadoop-env.sh, mapred-env.sh, yarn-env.sh files, respectively
vim hadoop-env.sh
vim mapred-env.sh
vim yarn-env.sh
export JAVA_HOME="/opt/jdk"
3. Modify Hadoop Profile

cd /opt/hadoop-3.2.0/etc/hadoop

In the etc/hadoop directory under the Hadoop installation directory, edit the core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and workers files, adjusting the configuration as appropriate.

Create the temporary directory

mkdir -p /opt/hadoop/tmp

core-site.xml (Configure Common Component Properties)
<configuration>
  <property>
      <!-- To configure hdfs address -->
      <name>fs.defaultFS</name>
      <value>hdfs://hadoop01:9000</value>
  </property>
  <property>
      <!-- Temporary file directory; the tmp directory must be created under /opt/hadoop first -->
      <name>hadoop.tmp.dir</name>
     <value>/opt/hadoop/tmp</value>
 </property>
 </configuration>
hdfs-site.xml (configure HDFS component properties)
<configuration>
      <property>
         <!-- NameNode web UI address -->
          <name>dfs.namenode.http-address</name>
          <value>hadoop01:50070</value>
      </property>
      <property>
          <name>dfs.namenode.name.dir</name>
          <value>file:/opt/hadoop/dfs/name</value>
     </property>
     <property>
         <name>dfs.datanode.data.dir</name>
         <value>file:/opt/hadoop/dfs/data</value>
     </property>
     <property>
        <!-- Replication factor; the default is 3 -->
        <name>dfs.replication</name>
         <value>3</value>
     </property>
        <property> 
              <name>dfs.webhdfs.enabled</name> 
              <value>true</value> 
         </property>
      <property>
            <name>dfs.permissions</name>
            <value>false</value>
            <description>Setting this to false disables HDFS permission checks, which makes it easier to create files on DFS, but be careful not to delete files by mistake.</description>
       </property>
 </configuration>
mapred-site.xml (configure Map-Reduce component properties)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <!-- Run MapReduce on YARN -->
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop01:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop01:19888</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>
yarn-site.xml (Configure Resource Scheduling Properties)
<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <!-- Hostname of YARN's ResourceManager; if not set, the active node count always shows 0 -->
        <value>hadoop01</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <!-- How reducers fetch data -->
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop01:8088</value>
        <description>For access from an external network, replace this with the node's real external IP; otherwise it defaults to localhost:8088</description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>Maximum memory that can be allocated to a single container, in MB; the default is 8192 MB</description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
        <description>Disable virtual-memory checks. Useful when installing on virtual machines; it avoids containers being killed by spurious virtual-memory limits later on.</description>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
workers
vim workers
# Add the worker node hostnames
hadoop02
hadoop03
hadoop04
4. Copy the configured folders to the other slave nodes (hadoop02, hadoop03, hadoop04)
scp -r /opt/hadoop-3.2.0 root@hadoop02:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop03:/opt/
scp -r /opt/hadoop-3.2.0 root@hadoop04:/opt/

scp -r /opt/hadoop root@hadoop02:/opt/
scp -r /opt/hadoop root@hadoop03:/opt/
scp -r /opt/hadoop root@hadoop04:/opt/
5. Configure startup scripts, add HDFS and Yarn permissions
# Add HDFS user settings: edit the following scripts and add the variables below near the top (right after the first line)
cd /opt/hadoop-3.2.0/sbin
vim start-dfs.sh 
vim stop-dfs.sh

HDFS_DATANODE_USER=root
HDFS_DATANODE_SECURE_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
# Add YARN user settings: edit the following scripts and add the variables below near the top (right after the first line)
cd /opt/hadoop-3.2.0/sbin
vim start-yarn.sh 
vim stop-yarn.sh 

YARN_RESOURCEMANAGER_USER=root
HDFS_DATANODE_SECURE_USER=yarn
YARN_NODEMANAGER_USER=root
6. Initialization & Startup
cd /opt/hadoop-3.2.0
# Initialize: format the NameNode (the trailing argument is an optional cluster name)
bin/hdfs namenode -format wmqhadoop

# Start HDFS and YARN
sbin/start-dfs.sh
sbin/start-yarn.sh

# Alternatively, start everything at once
sbin/start-all.sh

# Stop everything
sbin/stop-all.sh
7. Verify Hadoop started successfully
jps
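If startup succeeded, jps on the master (hadoop01) should list roughly the processes below, and each worker should show DataNode and NodeManager (process IDs omitted here):

NameNode
SecondaryNameNode
ResourceManager
Jps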
Open the ResourceManager page in the browser: http://hadoop01:8088

Open the Hadoop NameNode page in the browser: http://hadoop01:50070

mysql-5.7 installation

download

wget http://repo.mysql.com/yum/mysql-5.7-community/el/7/x86_64/mysql57-community-release-el7-10.noarch.rpm

rpm -ivh mysql57-community-release-el7-10.noarch.rpm

Use the yum command to complete the installation

1. Install:
yum -y install mysql-community-server

2. Start MySQL:
systemctl start mysqld

3. Get the temporary password generated during installation (needed for the first login):
grep 'temporary password' /var/log/mysqld.log
# Example output fragment (yours will differ): sGpt=V+8f,qv

4. Enable MySQL to start on boot:
systemctl enable mysqld

Log in

mysql -uroot -p
# Enter the temporary password obtained above

Change Password

ALTER USER 'root'@'localhost' IDENTIFIED BY 'Mysql123!';

Set to allow remote login

1. Execute authorization commands

GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'Mysql123!' WITH GRANT OPTION;

2. Exit the mysql console

exit

3. Open port 3306

Open Firewall

sudo systemctl start firewalld.service

Open port 3306 permanently

sudo firewall-cmd --add-port=3306/tcp --permanent

Reload

sudo firewall-cmd --reload

Close Firewall

sudo systemctl stop firewalld.service

Set default encoding to utf8

View mysql encoding before modification

show variables like '%chara%';

Edit /etc/my.cnf and add the following two lines under the [mysqld] section

vim /etc/my.cnf
character_set_server=utf8
init_connect='SET NAMES utf8'
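For reference, the relevant part of /etc/my.cnf then looks roughly like this (other options in the file stay unchanged):

[mysqld]
character_set_server=utf8
init_connect='SET NAMES utf8'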

After modification, restart mysql

sudo systemctl restart mysqld

hive installation

https://blog.csdn.net/qq_39315740/article/details/98626518 #Recommended

https://blog.csdn.net/weixin_43207025/article/details/101073351

hive Download

http://mirror.bit.edu.cn/apache/hive/

hive installation

tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /opt/

Configuring environment variables

vim /etc/profile
# hive
export HIVE_HOME=/opt/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin

source /etc/profile

Create hive-site.xml file

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-default.xml.template hive-site.xml

For the following HDFS-related settings in hive-site.xml, the corresponding directories need to be created in HDFS first

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>

Create HDFS Folder

hadoop fs -mkdir -p /user/hive/warehouse   # create folder
hadoop fs -mkdir -p /tmp/hive    # create folder
hadoop fs -chmod -R 777 /user/hive/warehouse   # Grant privileges
hadoop fs -chmod -R 777 /tmp/hive   # Grant privileges

# Check to see if the creation was successful
hadoop fs -ls /

Hive-related configuration

Change ${system:java.io.tmpdir} in hive-site.xml to Hive's local temporary directory and ${system:user.name} to the user name.
Create temp directory

cd /opt/apache-hive-3.1.2-bin
mkdir temp
chmod -R 777 temp
# Replace ${system:java.io.tmpdir} with /opt/apache-hive-3.1.2-bin/temp
# Replace ${system:user.name} with root
vim hive-site.xml
# Run these substitutions in vim command mode:
:%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
:%s/${system:user.name}/root/g

Database Related Configuration

# JDBC URL of the metastore database; change localhost in the value to the MySQL host's IP if MySQL runs on another machine
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value>
  </property>
  
# JDBC driver class name for the database
# The newer 8.0 driver class is com.mysql.cj.jdbc.Driver
# The older 5.x driver class is com.mysql.jdbc.Driver
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  
# Database User Name
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
 
# Database Password
   <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>Mysql123!</value> <!-- change to your own MySQL password -->
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>

Configure hive-log4j2.properties

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-log4j2.properties.template hive-log4j2.properties

vim hive-log4j2.properties
# Modify the following property
property.hive.log.dir = /opt/apache-hive-3.1.2-bin/temp/root

Configure hive-env.sh file

cd /opt/apache-hive-3.1.2-bin/conf
cp hive-env.sh.template hive-env.sh
vim hive-env.sh

Add the following:

export JAVA_HOME=/opt/jdk
export HADOOP_HOME=/opt/hadoop-3.2.0
export HIVE_CONF_DIR=/opt/apache-hive-3.1.2-bin/conf
export HIVE_AUX_JARS_PATH=/opt/apache-hive-3.1.2-bin/lib

Hive Start

Download the MySQL 5.7 JDBC driver
https://blog.csdn.net/qq_41950447/article/details/90085170
# Download the MySQL Connector/J driver

wget https://cdn.mysql.com//Downloads/Connector-J/mysql-connector-java-5.1.48.tar.gz

# Unpack it and copy the driver jar into Hive's lib directory
tar -zxvf mysql-connector-java-5.1.48.tar.gz
cp mysql-connector-java-5.1.48/mysql-connector-java-5.1.48-bin.jar /opt/apache-hive-3.1.2-bin/lib/
Initialization
schematool -dbType mysql -initSchema
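If initialization succeeds, a quick way to confirm the metastore is usable is to open the Hive CLI and list databases (assumes HDFS and YARN are already running):

hive
hive> show databases;
# should list at least the built-in "default" database
hive> quit;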
Problems
# hive initialization error
http://www.lzhpo.com/article/98
# Compare the guava jar versions shipped with Hadoop and Hive
cd /opt/hadoop-3.2.0/share/hadoop/common/lib
ll | grep guava

cd /opt/apache-hive-3.1.2-bin/lib
ll | grep guava

# Fix: remove Hive's lower-version guava-19.0.jar and replace it with Hadoop's higher-version guava-27.0-jre.jar
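Concretely, the swap amounts to something like the following (a sketch; check the exact file names with the ls commands above):

rm /opt/apache-hive-3.1.2-bin/lib/guava-19.0.jar
cp /opt/hadoop-3.2.0/share/hadoop/common/lib/guava-27.0-jre.jar /opt/apache-hive-3.1.2-bin/lib/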

# There are also questions to refer to
https://blog.csdn.net/qq_39315740/article/details/98626518

Hadoop 3.1.2 + Hive 3.1.1 Installation

https://www.cnblogs.com/weavepub/p/11130869.html

other

Modify vim comment color

Create a .vimrc file in the user's home directory
vim ~/.vimrc
# Add the following line and save
hi Comment ctermfg=blue

vim replacement

:%s/${system:java.io.tmpdir}/\/opt\/apache-hive-3.1.2-bin\/temp/g
:%s/${system:user.name}/root/g
Posted by Derek on Fri, 31 Jan 2020 20:33:17 -0800