Hive only needs to be installed on a single node
1. Upload the tar package
2. Extract it
tar -zxvf hive-1.2.1.tar.gz -C /apps/
3. Install the MySQL database (switch to the root user). There is no restriction on which node it is installed on, as long as that node can connect to the Hadoop cluster.
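A minimal install sketch (assuming a CentOS-style node with yum; package and service names vary by distribution):

# Install and start the MySQL server (assumed yum-based system)
yum install -y mysql-server
service mysqld start
# Set the root password that hive-site.xml will use later (123456, matching section 4.2)
mysqladmin -uroot password '123456'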
4. Configure hive
- (a) Configure the HIVE_HOME environment variable, and edit conf/hive-env.sh (vi conf/hive-env.sh) to set HADOOP_HOME
1. Configure the Hive environment variables: edit /etc/profile

# set hive env
export HIVE_HOME=/root/apps/hive-1.2.1
export PATH=${HIVE_HOME}/bin:$PATH

then reload it:

source /etc/profile

2. Configure the Hadoop environment variables (already done when Hadoop was installed)
cd apps/hive-1.2.1/conf
4.1 cp hive-env.sh.template hive-env.sh
vi hive-env.sh
Write the following to the hive-env.sh file:

export JAVA_HOME=/usr/local/java-1.8.231
export HADOOP_HOME=/root/apps/hadoop-2.6.5
export HIVE_HOME=/root/apps/hive-1.2.1
4.2 Configure the metastore
vi hive-site.xml
Add the following:
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.52.200:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>
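With createDatabaseIfNotExist=true, Hive creates the metastore database on first use. Alternatively, the metastore schema can be initialized up front with the schematool that ships with Hive (run from the Hive installation directory):

bin/schematool -dbType mysql -initSchema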
5. After installing Hive and MySQL, copy the MySQL JDBC connector jar into the lib/ directory under the Hive installation directory
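For example (the driver jar name and version below are an assumption; use whichever mysql-connector-java jar you downloaded):

cp mysql-connector-java-5.1.39.jar /root/apps/hive-1.2.1/lib/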
If you hit a permissions problem, grant access in MySQL (execute on the machine where MySQL is installed):

mysql -uroot -p

Then execute the statements below. Here *.* means all tables in all databases, and % means connections may come from any IP address or host.
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'mysql' WITH GRANT OPTION;
FLUSH PRIVILEGES;
grant all privileges on *.* to root@"192.168.52.200" identified by "mysql" with grant option;
FLUSH PRIVILEGES;
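To check that the grants took effect, list the accounts from the mysql client:

mysql> SELECT host, user FROM mysql.user;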
[Note] The environment variables for Hadoop and Hive must be configured, and Hadoop's HDFS and YARN must be running, before Hive can be started
6. The jline package versions conflict: replace Hadoop's old jline jar with jline-2.12.jar copied from Hive's lib directory. The old jar is:
/apps/hadoop-2.6.4/share/hadoop/yarn/lib/jline-0.9.94.jar
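A sketch of the replacement, using the paths from this guide (adjust to your actual Hadoop and Hive directories):

# Remove the old jline bundled with Hadoop
rm /apps/hadoop-2.6.4/share/hadoop/yarn/lib/jline-0.9.94.jar
# Copy the newer jline from Hive's lib directory
cp /apps/hive-1.2.1/lib/jline-2.12.jar /apps/hadoop-2.6.4/share/hadoop/yarn/lib/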
7.1 Start the hive program
bin/hive
[Note] Hive is very portable: just copy the configured hive-1.2.1 directory to another machine and it can be started there directly, with no further modification needed. For example:
scp -r hive-1.2.1/ hadoop02:/root/apps/
7.2 Display the current database name and print column headers
set hive.cli.print.current.db=true;
set hive.cli.print.header=true;
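To make these settings permanent instead of per-session, the same properties can be added to hive-site.xml:

<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
</property>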
8.0 How do you start the Hive server and client?
# Start the hiveserver2 service (listens on port 10000)
bin/hiveserver2                               # runs in the foreground
nohup bin/hiveserver2 1>/dev/null 2>&1 &      # runs in the background

# Open the beeline client
bin/beeline
beeline> !connect jdbc:hive2://hadoop1:10000
# Enter user name root, then press Enter at the password prompt; the client connects

# Exit the client
beeline> !quit
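Beeline can also connect in a single step from the shell (-u gives the JDBC URL, -n the user name):

bin/beeline -u jdbc:hive2://hadoop1:10000 -n root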
9.0 Create an internal table (the default table type) with delimited fields
create table trade_detail(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by ',';
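A quick usage sketch (the local file /root/trade.log and its contents are made up for illustration; fields are comma-separated to match the table definition):

# Sample line in /root/trade.log:
# 1,zhangsan,5000.0,1000.0,2019-09-04
hive> load data local inpath '/root/trade.log' into table trade_detail;
hive> select * from trade_detail;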
9.1 Building External Tables
An external table can point at any HDFS directory; it does not have to live under /user/hive/warehouse. When an external table is dropped, its data directory in HDFS is not deleted.
create external table td_ext(id bigint, account string, income double, expenses double, time string) row format delimited fields terminated by ',' location '/lod/20190202/';
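Because the table just points at a directory, data can be dropped into the location and queried immediately (the file /root/td.log is hypothetical):

hdfs dfs -mkdir -p /lod/20190202/
hdfs dfs -put /root/td.log /lod/20190202/
hive> select * from td_ext;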
10. Create partition tables
10.1 Difference between normal and partitioned tables: a partitioned table is used when data grows rapidly, so that queries filtering on the partition column scan only the matching partition instead of the whole table
create table log (id bigint, url string) partitioned by (daytime string) row format delimited fields terminated by ',';
10.2 Load local data into the partitioned table's directory in HDFS
There are two ways:
1. Upload the file to the partition directory in HDFS manually.
2. Use the hive load command (running it again appends the data):

hive> load data local inpath '/root/log1.log' into table log partition(daytime='20190904');    # loads into the specified partition 20190904
10.3 Query a partitioned table
select * from log where daytime='20190904';
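To list the partitions that exist for the table:

hive> show partitions log;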