In Hive, you sometimes need to write custom functions (UDFs) to meet business requirements. The steps for creating one are as follows.
1. Create a new Maven project and add the hive-exec dependency to the project's pom file
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>3.1.2</version>
</dependency>
2. Create a new class that extends UDF and implements the evaluate() method. The following example adds a prefix to a field value; see the code below for the implementation
// Package taken from the class name registered in step 4.
package com.wxx.bigdata.hive.udf;

import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;

import java.util.Random;

@Description(
        name = "add_prefix",
        value = "_FUNC_(expr) - add a number and '_' before the expr"
)
public class AddPrefixUDF extends UDF {

    // Hive calls evaluate() once per row; the result is a random digit (0-9),
    // an underscore, and then the original value.
    public String evaluate(String input) {
        Random random = new Random();
        int num = random.nextInt(10);
        return num + "_" + input;
    }

    // Quick local check before packaging the jar.
    public static void main(String[] args) {
        AddPrefixUDF addPrefixUDF = new AddPrefixUDF();
        String result = addPrefixUDF.evaluate("test");
        System.out.println(result);
    }
}
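The prefix logic can also be checked without any Hive classes on the classpath. The following is a minimal sketch under stated assumptions: PrefixLogic, addPrefix and removePrefix are hypothetical names, not part of the original project. addPrefix mirrors evaluate() above but takes the Random as a parameter so a test can seed it. removePrefix is only a guess at what the RemovePrefixUDF registered in step 6.2 might do, since that class is not shown here.

```java
import java.util.Random;

// Plain-Java sketch of the prefix logic, independent of Hive.
// All names in this class are hypothetical and used for illustration only.
class PrefixLogic {

    // Same behaviour as evaluate(): a digit 0-9, an underscore, then the input.
    // The Random is passed in so that a caller can use a fixed seed.
    static String addPrefix(String input, Random random) {
        int num = random.nextInt(10);
        return num + "_" + input;
    }

    // Assumed inverse: strip a leading "<digit>_" if present.
    // Null and values without such a prefix are returned unchanged.
    static String removePrefix(String input) {
        if (input == null) {
            return null;
        }
        boolean hasPrefix = input.length() >= 2
                && Character.isDigit(input.charAt(0))
                && input.charAt(1) == '_';
        return hasPrefix ? input.substring(2) : input;
    }

    public static void main(String[] args) {
        String prefixed = addPrefix("Android", new Random());
        System.out.println(prefixed);               // a digit, "_", then "Android"
        System.out.println(removePrefix(prefixed)); // the original value again
    }
}
```

Inside a real UDF, the body of evaluate() would simply delegate to one of these methods.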
3. Package the project with Maven and upload the jar file to a path on the Linux server.
4. In the Hive command line, add the jar file and create the function
hive (ruozedata_ba)> add jar /home/hadoop/lib/hadoop-project-1.0.jar;
Added [/home/hadoop/lib/hadoop-project-1.0.jar] to class path
Added resources: [/home/hadoop/lib/hadoop-project-1.0.jar]
hive (ruozedata_ba)> create TEMPORARY function add_prefix as 'com.wxx.bigdata.hive.udf.AddPrefixUDF';
OK
Time taken: 0.101 seconds
hive (ruozedata_ba)> show functions;
OK
tab_name
!
!=
%
&
*
+
-
/
<
<=
<=>
<>
=
==
>
>=
^
abs
acos
add_months
add_prefix
...
hive (ruozedata_ba)> select add_prefix(platform) from platform_stat;
OK
_c0
4_Android
6_MAC os
1_WIN
4_iOS
8_windows mobile
6_windows phone
Time taken: 0.332 seconds, Fetched: 6 row(s)
5. The function added above is temporary and only takes effect in the current Hive session. If you switch to a new session and run show functions;, the temporary functions can no longer be found.
6. Add a permanent function.
6.1 Upload jar files to HDFS.
[hadoop@hadoop000 lib]$ hdfs dfs -mkdir /lib
[hadoop@hadoop000 lib]$ hdfs dfs -put /home/hadoop/lib/hadoop-project-1.0.jar /lib
[hadoop@hadoop000 lib]$ hdfs dfs -ls /lib
Found 1 items
-rw-r--r--   1 hadoop supergroup      50187 2019-09-25 17:37 /lib/hadoop-project-1.0.jar
[hadoop@hadoop000 lib]$
6.2 Create permanent functions in hive
CREATE FUNCTION add_prefix_new AS "com.wxx.bigdata.hive.udf.AddPrefixUDF" USING JAR "hdfs://hadoop000:8020/lib/hadoop-project-1.0.jar";
CREATE FUNCTION remove_prefix_new AS "com.wxx.bigdata.hive.udf.RemovePrefixUDF" USING JAR "hdfs://hadoop000:8020/lib/hadoop-project-1.0.jar";
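When one jar contains several UDF classes, the statements differ only in the function name and the class name. As an illustration, the sketch below generates them from those parts. CreateFunctionSql and createFunction are hypothetical names introduced here; the jar URI and class names are the ones used above. It uses single-quoted string literals, which Hive accepts in the same way as the double-quoted ones above.

```java
// Hypothetical helper that builds CREATE FUNCTION ... USING JAR statements.
// It only assembles strings; running them in Hive is still a manual step.
class CreateFunctionSql {

    static String createFunction(String functionName, String className, String jarUri) {
        return "CREATE FUNCTION " + functionName
                + " AS '" + className + "'"
                + " USING JAR '" + jarUri + "';";
    }

    public static void main(String[] args) {
        String jar = "hdfs://hadoop000:8020/lib/hadoop-project-1.0.jar";
        String pkg = "com.wxx.bigdata.hive.udf.";
        System.out.println(createFunction("add_prefix_new", pkg + "AddPrefixUDF", jar));
        System.out.println(createFunction("remove_prefix_new", pkg + "RemovePrefixUDF", jar));
    }
}
```

The printed statements can then be pasted into the Hive command line as in this step.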
6.3 Look up the metastore database connection settings in the Hive configuration file hive-site.xml, log in to that database, and check that the permanent functions were registered (they are stored in the metastore's FUNCS table).
6.4 Test the function. Once a permanent function is created successfully, it also takes effect in newly opened sessions.
hive (ruozedata_ba)> select add_prefix_new(platform) from platform_stat;
OK
_c0
6_Android
7_MAC os
1_WIN
9_iOS
6_windows mobile
3_windows phone
Time taken: 0.658 seconds, Fetched: 6 row(s)
hive (ruozedata_ba)>