Friendly Tip: To reduce the chance of Maven build errors on the server, you can first open the source locally in IDEA, modify it, and compile it there. The IDEA build may still fail in the end, but this step ensures that the modified code is correct and that the local Maven repository contains almost all of the required jars. Then package the local repository and upload it to the server, and use the modified code to replace the corresponding classes in the unpacked source on the server.
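As a minimal sketch of the "package the local repository" step, the helper below archives a Maven repository directory with tar. The function name pack_maven_repo and the output file name are hypothetical, and the example assumes Maven's default repository location ~/.m2/repository; adjust the path if your settings.xml points elsewhere.

```shell
# Hypothetical helper: archive a local Maven repository for upload.
# Usage: pack_maven_repo <dir-containing-repository> <output.tar.gz>
pack_maven_repo() {
  m2_dir="$1"
  out="$2"
  # -C switches into the parent directory first, so the archive
  # contains a single top-level "repository/" folder.
  tar -czf "$out" -C "$m2_dir" repository
}

# Example, assuming the default location ~/.m2/repository:
#   pack_maven_repo "$HOME/.m2" /tmp/m2-repository.tar.gz
# After uploading, unpack on the server with:
#   tar -xzf m2-repository.tar.gz -C ~/.m2/
```

Keeping the top-level repository/ folder in the archive means it can be unpacked straight into ~/.m2/ on the server.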
1. Download the source code
My Hadoop environment is hadoop-2.6.0-cdh5.7.0, so I downloaded the matching Hive source (hive-1.1.0-cdh5.7.0) from the CDH component repository.
2. Compile Hive with UDF support
The build uses Maven; Maven installation and configuration are omitted here. I originally wanted to compile Hive directly in IDEA, but after several hours of trying, one class still failed to compile, so in the end I simply compiled with mvn on the server.
2.1 Upload and Unzip
#upload
[hadoop@hadoop001 source]$ rz
[hadoop@hadoop001 source]$ ll
-rw-r--r--. 1 hadoop hadoop 14652104 Apr 18 2019 hive-1.1.0-cdh5.7.0-src.tar.gz
#decompression
[hadoop@hadoop001 source]$ tar -zxvf hive-1.1.0-cdh5.7.0-src.tar.gz -C ~/source/
2.2 Add UDF Function Class
HelloUDF.java is a UDF class that I wrote earlier; see the previous blog post for UDF programming details. Add the class to the hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf directory, and remember to change the package name in the class.
#Add previously written HelloUDF classes
[hadoop@hadoop001 udf]$ ll
total 344
-rw-r--r--. 1 hadoop hadoop 567 Apr 20 2019 AddPre.java
drwxrwxr-x. 2 hadoop hadoop 12288 Mar 24 2016 generic
-rw-r--r--. 1 hadoop hadoop 409 Apr 20 2019 HelloUDF.java
drwxrwxr-x. 2 hadoop hadoop 4096 Mar 24 2016 ptf
-rw-r--r--. 1 hadoop hadoop 649 Apr 20 2019 RemovePre.java
#Modify the package name of HelloUDF.java to org.apache.hadoop.hive.ql.udf
[hadoop@hadoop001 udf]$ cd ~/source/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf/
[hadoop@hadoop001 udf]$ vim HelloUDF.java
package org.apache.hadoop.hive.ql.udf;
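If you prefer not to edit the file by hand in vim, the package change can also be scripted. This is only a sketch: set_java_package is a hypothetical helper name, and it assumes the file has a single ordinary "package ...;" declaration at the start of a line.

```shell
# Hypothetical helper: rewrite the package declaration of a Java source file.
# Usage: set_java_package <file> <package>
set_java_package() {
  file="$1"
  pkg="$2"
  # Replace any line that starts with "package " with the new declaration.
  # Writing to a temporary file avoids relying on the non-portable "sed -i".
  sed "s/^package .*;/package ${pkg};/" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Example, using the path from the step above:
#   set_java_package ~/source/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/udf/HelloUDF.java \
#     org.apache.hadoop.hive.ql.udf
```

The rest of the file, including the class body, is left untouched.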
2.3 Registration Functions
Modify FunctionRegistry.java to register the class as a built-in function named say_hell0.
[hadoop@hadoop001 exec]$ cd ~/source/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/
#Add registration information in the static code block
[hadoop@hadoop001 exec]$ vim FunctionRegistry.java
system.registerUDF("say_hell0", HelloUDF.class,false);
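Because the full build takes a long time, it can be worth confirming that the registration line was actually saved before starting Maven. The check below is a small sketch using grep; is_udf_registered is a hypothetical helper name, and it only looks for the literal text registerUDF("<name>" in the file rather than parsing the Java source.

```shell
# Hypothetical helper: succeed only if <file> contains a registerUDF call
# for the function name <name>.
# Usage: is_udf_registered <file> <name>
is_udf_registered() {
  # In a basic regular expression "(" is a literal character,
  # so this matches the text registerUDF("<name>" exactly.
  grep -q "registerUDF(\"$2\"" "$1"
}

# Example, using the file edited in the step above:
#   is_udf_registered ~/source/hive-1.1.0-cdh5.7.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java say_hell0 \
#     && echo "say_hell0 is registered"
```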
2.4 Compile hive
mvn clean package -DskipTests -Phadoop-2 -Pdist
[hadoop@hadoop001 ~]$ cd ~/source/hive-1.1.0-cdh5.7.0
#Compile; the first compilation takes a long time, so wait patiently until BUILD SUCCESS appears
[hadoop@hadoop001 hive-1.1.0-cdh5.7.0]$ mvn clean package -DskipTests -Phadoop-2 -Pdist
#Look at the compiled packages; apache-hive-1.1.0-cdh5.7.0-bin.tar.gz is the one I need
[hadoop@hadoop001 target]$ cd ~/source/hive-1.1.0-cdh5.7.0/packaging/target/
[hadoop@hadoop001 target]$ ll
total 129092
drwxrwxr-x. 2 hadoop hadoop 4096 Apr 15 10:19 antrun
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 15 10:20 apache-hive-1.1.0-cdh5.7.0-bin
-rw-rw-r--. 1 hadoop hadoop 105725582 Apr 15 10:21 apache-hive-1.1.0-cdh5.7.0-bin.tar.gz
-rw-rw-r--. 1 hadoop hadoop 12610961 Apr 15 10:21 apache-hive-1.1.0-cdh5.7.0-jdbc.jar
-rw-rw-r--. 1 hadoop hadoop 13826134 Apr 15 10:21 apache-hive-1.1.0-cdh5.7.0-src.tar.gz
drwxrwxr-x. 2 hadoop hadoop 4096 Apr 15 10:20 archive-tmp
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 15 10:19 maven-shared-archive-resources
drwxrwxr-x. 3 hadoop hadoop 4096 Apr 15 10:19 tmp
drwxrwxr-x. 2 hadoop hadoop 4096 Apr 15 10:19 warehouse
3. Deployment and installation
Omitted here; please refer to the article "Hive Quick Start and Installation Deployment".
During deployment, take care not to conflict with an already installed Hive. In particular, make sure that the hive command you run is not actually starting the previously installed Hive.
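One way to check which installation the hive command resolves to is to compare the result of command -v with the directory of the new build. This is a minimal sketch: hive_resolves_to is a hypothetical helper name, the directory in the example is an assumed install location, and the comparison is a plain string match, so it will not follow symlinks or shell aliases.

```shell
# Hypothetical helper: succeed only if the "hive" found on PATH is
# <hive_home>/bin/hive.
# Usage: hive_resolves_to <hive_home>
hive_resolves_to() {
  # command -v prints the path of the executable the shell would run;
  # it fails if no "hive" is found on PATH at all.
  resolved="$(command -v hive)" || return 1
  [ "$resolved" = "$1/bin/hive" ]
}

# Example, assuming the new build was unpacked to ~/app/apache-hive-1.1.0-cdh5.7.0-bin
# (an assumed location, not taken from the steps above):
#   hive_resolves_to "$HOME/app/apache-hive-1.1.0-cdh5.7.0-bin" \
#     || echo "hive on PATH is not the newly compiled one"
```

If the check fails, inspect PATH and HIVE_HOME in your shell profile for entries left over from the old installation.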
4. Testing UDF
#View Functions
hive> show functions;
//At this point, a function named say_hell0 appears in the list, so the custom function can be used without adding a jar package
#Run Function
hive> select say_hell0("666");
OK
hello:666
Time taken: 0.888 seconds, Fetched: 1 row(s)
hive> select rm_pre("3_wsk");
OK
wsk
Time taken: 0.183 seconds, Fetched: 1 row(s)
Combined with the previous post on UDF programming, this completes the whole process: writing a UDF, registering it as a built-in function, and compiling Hive from source.