Spark reads Hive data (Java)

Requirement: read data from Hive and write it to ES. Environment: Spark 2.0.2. 1. enableHiveSupport() is set on the SparkSession: SparkConf conf = new SparkConf().setAppName("appName").setMaster("local[*]"); SparkSession spark = SparkSession .builder() .appName("Java Spark SQL basic exam ...
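The code above is cut off; a minimal end-to-end sketch of what such a job might look like, assuming Spark 2.0.2 with the elasticsearch-spark (ES-Hadoop) connector on the classpath, and a hypothetical Hive table, ES node address and index name:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class HiveToEsSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-to-es")
                    .master("local[*]")
                    .enableHiveSupport()   // required so spark.sql() can see the Hive metastore
                    .getOrCreate();

            Dataset<Row> rows = spark.sql("SELECT * FROM db.some_hive_table"); // hypothetical table

            rows.write()
                    .format("org.elasticsearch.spark.sql")   // provided by the ES-Hadoop connector
                    .option("es.nodes", "127.0.0.1:9200")     // hypothetical ES address
                    .mode("append")
                    .save("index/type");                      // hypothetical index/type
            spark.stop();
        }
    }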

Posted by NSW42 on Tue, 10 Dec 2019 14:30:32 -0800

Analyzing logs with AWS Athena

In AWS, Athena can be used to analyze the logs saved in S3. It can map the logs to the format of database tables so that they can be queried with SQL statements. This function is similar to using LogParser to analyze Exchange or IIS logs on a Windows server. Let's do a demonstration: record the management log with CloudTrail, and ...
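A sketch of kicking off such a query programmatically, assuming the AWS SDK for Java v1 (aws-java-sdk-athena) and a CloudTrail log table already defined in Athena; the database, table and S3 result location are placeholders:

    import com.amazonaws.services.athena.AmazonAthena;
    import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
    import com.amazonaws.services.athena.model.QueryExecutionContext;
    import com.amazonaws.services.athena.model.ResultConfiguration;
    import com.amazonaws.services.athena.model.StartQueryExecutionRequest;

    public class AthenaQuerySketch {
        public static void main(String[] args) {
            AmazonAthena athena = AmazonAthenaClientBuilder.defaultClient();
            StartQueryExecutionRequest request = new StartQueryExecutionRequest()
                    // hypothetical query against a CloudTrail log table
                    .withQueryString("SELECT eventname, count(*) AS cnt FROM cloudtrail_logs "
                            + "GROUP BY eventname ORDER BY cnt DESC LIMIT 10")
                    .withQueryExecutionContext(new QueryExecutionContext().withDatabase("default"))
                    // Athena writes query results back to S3; this bucket is a placeholder
                    .withResultConfiguration(new ResultConfiguration()
                            .withOutputLocation("s3://my-athena-results/"));
            String queryExecutionId = athena.startQueryExecution(request).getQueryExecutionId();
            System.out.println("Started Athena query: " + queryExecutionId);
        }
    }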

Posted by stubarny on Wed, 04 Dec 2019 09:10:56 -0800

Hive Installation, Configuration, and Use

Overview of Hive: Hive is a Hadoop-based data warehouse tool that maps structured data files to tables and provides SQL-like query capabilities. Hive essentially converts HQL into MapReduce programs. The data processed by Hive is stored in HDFS, and the underlying execution engine for analyzing the data can be MapReduce, Tez, or Spark, with its executo ...
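As a small illustration of the "SQL-like query" side, a sketch that submits HQL to a running HiveServer2 over JDBC; the host, credentials and table are hypothetical, and the hive-jdbc driver is assumed to be on the classpath:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveJdbcSketch {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://localhost:10000/default", "hive", "");
                 Statement stmt = conn.createStatement();
                 // the HQL below is compiled by Hive into MapReduce/Tez/Spark jobs
                 ResultSet rs = stmt.executeQuery("SELECT id, name FROM demo_table LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1) + "\t" + rs.getString(2));
                }
            }
        }
    }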

Posted by Wayniac on Tue, 03 Dec 2019 17:47:05 -0800

Hive-2.3.4 installation (upgrade from Hive-1.2.2)

Download the Hive installation package from the official website. Modify hive-site.xml (version 2.3.4 does not ship with this file, so create it directly): <?xml version="1.0" encoding="UTF-8" standalone="no"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>javax.jdo.option. ...
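A quick sanity check, not from the article: once the new hive-site.xml is on the classpath, HiveConf should report the metastore connection URL that was just configured:

    import org.apache.hadoop.hive.conf.HiveConf;

    public class HiveSiteCheck {
        public static void main(String[] args) {
            HiveConf conf = new HiveConf();   // picks up hive-site.xml from the classpath
            System.out.println(conf.get("javax.jdo.option.ConnectionURL"));
        }
    }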

Posted by ryadex on Tue, 03 Dec 2019 14:14:21 -0800

Data service analysis in a Spark project

Business logic: peers. To judge whether two objects are peers, we can use longitude and latitude to check whether they have passed through several of the same places. Each monitoring device can also be given a mark; when an object passes a monitoring device, it is captured by that device. Tra ...
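A minimal sketch of that peer judgment in plain Java; the device IDs and the threshold are made-up values for illustration. Two objects count as peers if they were captured by at least N of the same monitoring devices:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public class PeerCheckSketch {
        public static void main(String[] args) {
            Set<String> devicesOfA = new HashSet<>(Arrays.asList("cam01", "cam07", "cam12"));
            Set<String> devicesOfB = new HashSet<>(Arrays.asList("cam07", "cam12", "cam33"));

            Set<String> common = new HashSet<>(devicesOfA);
            common.retainAll(devicesOfB);        // devices both objects passed through

            int threshold = 2;                   // "multiple identical places", arbitrary here
            System.out.println("peers = " + (common.size() >= threshold));
        }
    }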

Posted by aztec on Sun, 17 Nov 2019 08:26:39 -0800

Spark SQL: using beeline to access the Hive warehouse

I. Add hive-site.xml. Add the hive-site.xml configuration file under $SPARK_HOME/conf so that Hive metadata can be accessed normally: vim hive-site.xml <configuration> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true ...
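A sketch of verifying the setup, not from the article: with the same hive-site.xml visible to a Hive-enabled SparkSession, the session should list the same databases that beeline sees through the Thrift server:

    import org.apache.spark.sql.SparkSession;

    public class HiveMetadataCheck {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-metadata-check")
                    .master("local[*]")
                    .enableHiveSupport()
                    .getOrCreate();
            spark.sql("SHOW DATABASES").show();  // databases stored in the MySQL-backed metastore
            spark.stop();
        }
    }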

Posted by mrodrigues on Wed, 06 Nov 2019 14:06:19 -0800

I. HBase: basic principles and usage

HBase data hot-spot issues: the solution is to preprocess the rowkey of the hot data, add a prefix, and spread the hot data across multiple regions. Pre-merging? Dynamic partitioning? When the data is first written, it should be partitioned, stored in different regions, and load balanced. Example: for example, it is easy to divide ...
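A minimal sketch of the prefix (salting) idea in plain Java; the bucket count and the sample rowkey are hypothetical, and the table would be pre-split into the same number of regions so that each prefix lands in a different region:

    public class RowKeySaltSketch {
        static String salt(String rowKey, int buckets) {
            int bucket = (rowKey.hashCode() & Integer.MAX_VALUE) % buckets; // stable, non-negative
            return String.format("%02d_%s", bucket, rowKey);
        }

        public static void main(String[] args) {
            // prints something like 03_device001_20191104 (the prefix depends on the hash)
            System.out.println(salt("device001_20191104", 8));
        }
    }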

Posted by daniel_mintz on Mon, 04 Nov 2019 16:20:41 -0800

Sqoop import | Hive | HBase

Import data (with the cluster as the reference point). In Sqoop, "import" refers to transferring data from a non-big-data cluster (an RDBMS) into the big data cluster (HDFS, Hive, HBase); it is called "import" and uses the import keyword. 1. RDBMS to HDFS: 1) make sure the MySQL service is running normally; 2) create a new table i ...
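A sketch only: Sqoop is normally driven from the command line, but Sqoop 1.x also exposes org.apache.sqoop.Sqoop.runTool for programmatic use (assumed here); the JDBC URL, credentials, table and target directory are placeholders:

    import org.apache.sqoop.Sqoop;

    public class SqoopImportSketch {
        public static void main(String[] args) {
            String[] importArgs = {
                    "import",                                      // tool name, as on the CLI
                    "--connect", "jdbc:mysql://localhost:3306/company",
                    "--username", "root",
                    "--password", "123456",
                    "--table", "staff",
                    "--target-dir", "/user/company/staff",
                    "--delete-target-dir",
                    "--num-mappers", "1",
                    "--fields-terminated-by", "\t"
            };
            int exitCode = Sqoop.runTool(importArgs);              // assumed entry point
            System.out.println("sqoop import exit code: " + exitCode);
        }
    }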

Posted by Anders_Scales on Mon, 04 Nov 2019 07:26:50 -0800

[Hive] Install Hive on Mac

I. Installation. Hive installation: brew install hive. MySQL installation: brew install mysql. Start MySQL: mysql.server start. II. Metastore database configuration. Hive uses Derby as the metastore database by default. Here we use MySQL to store the metadata and do some initialization configuration below. Log in to MySQL: mysql -u root ...

Posted by jd307 on Sat, 02 Nov 2019 21:50:35 -0700

Window functions in Spark

I. Introduction. The window function row_number() groups by one field and then takes the first values sorted by another field, which is equivalent to a grouped top-N. If a window function is used in the SQL statement, the SQL statement must be executed with HiveContext. II. Code practice [using HiveContext] package big.dat ...
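The article's code is truncated above; a minimal sketch of the same grouped top-N written against Spark 2.x (SparkSession with Hive support rather than the Spark 1.x HiveContext the article uses), with hypothetical table and column names:

    import org.apache.spark.sql.SparkSession;

    public class RowNumberTopN {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("row-number-topn")
                    .master("local[*]")
                    .enableHiveSupport()
                    .getOrCreate();

            // Top 3 rows per category ordered by amount: the window function assigns a rank
            // inside each group, and the outer query filters on it.
            spark.sql(
                    "SELECT category, item, amount FROM (" +
                    "  SELECT category, item, amount, " +
                    "         row_number() OVER (PARTITION BY category ORDER BY amount DESC) AS rn " +
                    "  FROM sales" +
                    ") t WHERE rn <= 3").show();
            spark.stop();
        }
    }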

Posted by nezbo on Fri, 01 Nov 2019 16:08:24 -0700