hive-1.2.1 Installation and Simple Use

Keywords: Big Data hive MySQL Hadoop JDBC

Hive only needs to be installed on one node

1. Upload tar package

2. Extract the package

	tar -zxvf hive-1.2.1.tar.gz -C /root/apps/

3. Install the MySQL database (switch to the root user). It can be installed on any node, as long as that node can connect to the Hadoop cluster

4. Configure hive

  • (a) Configure the HIVE_HOME environment variable, then edit the files under conf/ so that Hive can find Hadoop (HADOOP_HOME)
1. Configure the Hive environment variables by editing /etc/profile:
vi /etc/profile
#set hive env
export HIVE_HOME=/root/apps/hive-1.2.1
export PATH=${HIVE_HOME}/bin:$PATH
Then reload the profile so the change takes effect:
source /etc/profile
2. Configure the Hadoop environment variables (already done when Hadoop was installed)

cd /root/apps/hive-1.2.1/conf

4.1 cp


Write the following to the file
export JAVA_HOME=/usr/local/java-1.8.231
export HADOOP_HOME=/root/apps/hadoop-2.6.5
export HIVE_HOME=/root/apps/hive-1.2.1
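
Step 4.1 does not say which file is copied. A plausible reading, assuming the stock layout of Hive's conf directory, is that hive-env.sh is created from the bundled hive-env.sh.template and the three exports are then appended to it. The sketch below does that; the function name write_hive_env is hypothetical and the template file name is an assumption.

```shell
# Sketch: create hive-env.sh in the given conf directory and append the exports.
# write_hive_env is a hypothetical helper, not something Hive ships.
write_hive_env() {
  conf_dir="$1"
  env_file="$conf_dir/hive-env.sh"
  # Start from the bundled template when it exists (assumed file name).
  if [ ! -f "$env_file" ] && [ -f "$conf_dir/hive-env.sh.template" ]; then
    cp "$conf_dir/hive-env.sh.template" "$env_file"
  fi
  # Append the settings listed in this step.
  cat >> "$env_file" <<'EOF'
export JAVA_HOME=/usr/local/java-1.8.231
export HADOOP_HOME=/root/apps/hadoop-2.6.5
export HIVE_HOME=/root/apps/hive-1.2.1
EOF
}

# Usage: write_hive_env /root/apps/hive-1.2.1/conf
```

Calling the function twice appends the exports twice, so run it once per installation.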

4.2 Configuration metadata

vi hive-site.xml

Add the following:

<description>JDBC connect string for a JDBC metastore</description>

<description>Driver class name for a JDBC metastore</description>

<description>username to use against metastore database</description>

<description>password to use against metastore database</description>
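
Only the <description> elements survive in the listing above; the property names and values are missing. The four descriptions match Hive's standard metastore connection properties (javax.jdo.option.ConnectionURL, ConnectionDriverName, ConnectionUserName and ConnectionPassword), so a minimal version of the file could be generated as sketched below. The MySQL host argument, port 3306, the database name hive and the driver class for MySQL Connector/J 5.x are assumptions; the user root and password mysql are taken from the grant statement in step 5. The function name write_hive_site is hypothetical.

```shell
# Sketch: write a minimal hive-site.xml that points the metastore at MySQL.
# Overwrites any existing hive-site.xml in the given directory.
write_hive_site() {
  conf_dir="$1"
  mysql_host="$2"   # host running MySQL (assumption: reachable on port 3306)
  cat > "$conf_dir/hive-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://${mysql_host}:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mysql</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
EOF
}

# Usage: write_hive_site /root/apps/hive-1.2.1/conf <mysql-host>
```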

5. After installing Hive and MySQL, copy the MySQL JDBC connector jar into the lib directory of the Hive installation

If Hive is denied access to MySQL, grant privileges in MySQL (run this on the machine where MySQL is installed)
mysql -uroot -p
 # Run the following statement. *.* means all tables in all databases; % means any IP address or host may connect
	grant all privileges on *.* to root@"%" identified by "mysql" with grant option;

[Note] The environment variables for Hadoop and Hive must be configured, and Hadoop's HDFS and YARN must be running, before Hive can be started

6. Hive and Hadoop ship inconsistent jline versions. Replace the jline jar in Hadoop with jline-2.12.jar copied from Hive's lib directory
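
The jline replacement can be sketched as a small shell function. It assumes the older jline jar sits in share/hadoop/yarn/lib under the Hadoop installation, which is where Hadoop 2.x distributions usually keep it; verify the location on your own cluster first. The function name replace_jline is hypothetical.

```shell
# Sketch: copy Hive's jline-2.12.jar into Hadoop and move older jline jars aside.
replace_jline() {
  hive_home="$1"
  hadoop_home="$2"
  new_jar="$hive_home/lib/jline-2.12.jar"
  yarn_lib="$hadoop_home/share/hadoop/yarn/lib"   # assumed location

  [ -f "$new_jar" ] || { echo "missing $new_jar" >&2; return 1; }
  [ -d "$yarn_lib" ] || { echo "missing $yarn_lib" >&2; return 1; }

  # Rename older jline jars instead of deleting them, so the change is reversible.
  for old in "$yarn_lib"/jline-*.jar; do
    [ -e "$old" ] || continue
    [ "$(basename "$old")" = "jline-2.12.jar" ] && continue
    mv "$old" "$old.bak"
  done
  cp "$new_jar" "$yarn_lib/"
}

# Usage: replace_jline /root/apps/hive-1.2.1 /root/apps/hadoop-2.6.5
```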


7.1 Start hive program


[Note] Hive is highly portable. Copy the configured hive-1.2.1 directory to another machine and it can be started there directly, with no further modification, for example:

scp -r hive-1.2.1/ hadoop02:/root/apps/

7.2 Show the current database in the prompt and print column headers in query results

set hive.cli.print.current.db=true;
set hive.cli.print.header=true;
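
These settings last only for the current session. One way to keep them, assuming the Hive CLI reads a .hiverc file from the user's home directory at startup, is to append them to that file. The helper name persist_cli_settings is hypothetical.

```shell
# Sketch: add the two CLI settings to .hiverc unless they are already there.
persist_cli_settings() {
  rc_file="${1:-$HOME/.hiverc}"
  touch "$rc_file"
  for line in \
    'set hive.cli.print.current.db=true;' \
    'set hive.cli.print.header=true;'
  do
    # -x matches whole lines, -F treats the text as a fixed string.
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
  done
}

# Usage: persist_cli_settings            (writes to ~/.hiverc)
```

Because each line is checked before it is written, running the helper again does not duplicate the settings.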

8.0 Starting Hive as a service and connecting with a client

# Start the hiveserver2 service (listens on port 10000)
 bin/hiveserver2 # runs in the foreground
 nohup bin/hiveserver2 1>/dev/null 2>&1 & # runs in the background
 # Open the beeline client
beeline> !connect jdbc:hive2://hadoop1:10000
 # Enter root as the user name, then press Enter at the password prompt
 # The beeline client prints query results as neatly formatted tables
# Exit the client
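
For scripted use, the same connection can be made non-interactively. The sketch below uses Beeline's -u (JDBC URL), -n (user name) and -e (statement) options; the host hadoop1, port 10000 and user root come from the steps above, while the function names and the show databases statement are illustrative. Inside an interactive session, !quit leaves Beeline.

```shell
# Build the HiveServer2 JDBC URL for a host and an optional port (default 10000).
hs2_url() {
  printf 'jdbc:hive2://%s:%s' "$1" "${2:-10000}"
}

# Run one statement through Beeline as the given user.
beeline_query() {
  url="$1"
  user="$2"
  sql="$3"
  if ! command -v beeline >/dev/null 2>&1; then
    echo "beeline not found on PATH" >&2
    return 127
  fi
  beeline -u "$url" -n "$user" -e "$sql"
}

# Usage: beeline_query "$(hs2_url hadoop1)" root 'show databases;'
```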

9.0 Create an internal (managed) table, which is the default table type, with a field delimiter

create table trade_detail(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by ',';

9.1 Building External Tables

An external table can point to any HDFS directory; it does not have to be under the Hive warehouse directory (/user/hive/warehouse by default). When an external table is dropped, its data directory in HDFS is not deleted.

create external table td_ext(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by ',' location '/lod/20190202/';

10. Create partition tables

10.1 Difference between normal and partitioned tables: a partitioned table is needed when large amounts of new data keep arriving, so that each batch is stored and queried separately

create table log (id bigint, url string) partitioned by (daytime string) row format delimited fields terminated by ',';

10.2 Import local data into the partitioned table's directory in the Hive warehouse on HDFS

1. Manual upload
2. Hive command (running it again appends the data); this example loads into partition 20190904
hive> load data local inpath '/root/log1.log' into table log partition(daytime='20190904');
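
The two approaches listed above can be sketched from the shell as follows. The manual route assumes the table lives in the default database under the default warehouse path /user/hive/warehouse, and it registers the partition afterwards because files copied in by hand are not visible to Hive until the partition exists in the metastore. All three function names are hypothetical.

```shell
# Print the LOAD DATA statement for one local file and one partition value.
load_stmt() {
  printf "load data local inpath '%s' into table log partition(daytime='%s');" "$1" "$2"
}

# Way 1: upload the file by hand, then register the partition.
manual_upload() {
  file="$1"
  day="$2"
  dir="/user/hive/warehouse/log/daytime=$day"   # assumed warehouse location
  hadoop fs -mkdir -p "$dir" &&
  hadoop fs -put "$file" "$dir/" &&
  hive -e "alter table log add if not exists partition (daytime='$day');"
}

# Way 2: let Hive copy the file and register the partition in one step.
hive_load() {
  hive -e "$(load_stmt "$1" "$2")"
}

# Usage: hive_load /root/log1.log 20190904
```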

10.3 Query the data in a partition

select * from log where daytime='20190904';

Posted by gyash on Thu, 26 Mar 2020 21:12:15 -0700