Using ibis, impyla, pyhive and pyspark to connect to Kerberos-secured Hive and Impala in Python

There are many ways to connect to Hive and Impala from Python, including pyhive, impyla, pyspark, and ibis. This article introduces how to use these packages to connect to Hive or Impala and how to pass Kerberos authentication. Kerberos: if the cluster has not enabled Kerberos authentication, the ...

Posted by RunningUtes on Mon, 08 Jun 2020 23:22:07 -0700
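As a rough illustration of the kind of connection the entry above describes, here is a minimal sketch using pyhive. It assumes a Kerberos ticket has already been obtained (for example with `kinit`), that HiveServer2 listens on port 10000, and that the Kerberos service name is `hive`; the helper names and the host name in the usage note are hypothetical and not taken from the article.

```python
def kerberos_hive_kwargs(host, port=10000, service_name="hive"):
    """Build keyword arguments for pyhive.hive.Connection on a Kerberos cluster.

    The port and service name defaults are assumptions; check your cluster.
    """
    return {
        "host": host,
        "port": port,
        "auth": "KERBEROS",
        "kerberos_service_name": service_name,
    }


def connect_hive(host, **overrides):
    """Open a pyhive connection using the ticket already in the credential cache."""
    # Imported lazily so the helper above can be used without pyhive installed.
    from pyhive import hive

    return hive.Connection(**kerberos_hive_kwargs(host, **overrides))
```

Calling `connect_hive("hs2.example.com")` would then return a DB-API connection whose `cursor()` can run queries. impyla offers a similar entry point, `impala.dbapi.connect`, where the Kerberos mechanism is selected with `auth_mechanism="GSSAPI"`.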

[Teacher Zhao Qiang] Using the LOAD statement to load data in Hive

1, Syntax description of the LOAD statement in Hive. Hive's LOAD statement does not perform any conversion when loading data; it simply copies or moves the data file to the location corresponding to the Hive table. The syntax is as follows: LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename \ [PARTITION (partcol1=val1, partcol2= ...

Posted by whizzykid on Thu, 21 May 2020 21:08:45 -0700
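To make the optional parts of the LOAD syntax quoted in the entry above more concrete, the following sketch assembles a statement from Python arguments. The function name, file paths, and table name are hypothetical, partition values are assumed to be strings, and no escaping is performed, so it is for illustration only.

```python
def build_load_statement(filepath, table, local=False, overwrite=False, partition=None):
    """Assemble a Hive LOAD DATA statement from its optional clauses."""
    parts = ["LOAD DATA"]
    if local:
        # LOCAL reads from the client's file system instead of HDFS.
        parts.append("LOCAL")
    parts.append("INPATH '{}'".format(filepath))
    if overwrite:
        # OVERWRITE replaces the existing contents of the table or partition.
        parts.append("OVERWRITE")
    parts.append("INTO TABLE {}".format(table))
    if partition:
        spec = ", ".join("{}='{}'".format(k, v) for k, v in partition.items())
        parts.append("PARTITION ({})".format(spec))
    return " ".join(parts)


print(build_load_statement("/tmp/emp.csv", "emp", local=True, overwrite=True))
print(build_load_statement("/data/emp.csv", "emp", partition={"dt": "2020-05-21"}))
```

The resulting string can be passed to any Hive client's `execute` call.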

A Python data analyst analyzes his own future, and it looks miserable

Preface: The text and pictures in this article come from the Internet and are for learning and communication only, not for any commercial purpose. Copyright belongs to the original author. If you have any questions, please contact us promptly. Analysis background: With the popularity of artificial intelligence and big data in recent years ...

Posted by POG1 on Sat, 09 May 2020 02:40:47 -0700

Hive quick start series (12) | Introduction to and use of Hive data compression

Catalog 1, Hadoop source compilation supports Snappy compression 1.1 Resource preparation 1.2 jar package installation 1.3 Compiling source code 2, Hadoop compression configuration 2 ...

Posted by pavanpuligandla on Tue, 05 May 2020 23:29:55 -0700

A demonstration of Hive lateral view

Lateral view is used with UDTFs such as split and explode to turn one row of data into multiple rows, after which the resulting rows can be aggregated. Lateral view first calls the UDTF for each row of the original table, and the UDTF splits that row into one or more rows. Lateral view then combines the results to produce a virtual table that supports alia ...

Posted by ayzee01 on Sat, 02 May 2020 19:41:55 -0700
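The row-splitting behaviour described in the lateral view entry above can be mimicked in plain Python. This is only a sketch of the idea, not how Hive implements it: the function name, column names, and sample rows are made up, and Python's `str.split` takes a literal separator whereas Hive's `split` takes a regular expression.

```python
from collections import Counter


def lateral_view_explode(rows, column, sep=","):
    """Mimic LATERAL VIEW explode(split(column, sep)) v AS item."""
    out = []
    for row in rows:
        # The "UDTF" step: one input row yields one output row per element.
        for item in row[column].split(sep):
            new_row = dict(row)  # keep the original columns alongside the new one
            new_row["item"] = item
            out.append(new_row)
    return out


rows = [
    {"user": "a", "tags": "x,y"},
    {"user": "b", "tags": "y"},
]
exploded = lateral_view_explode(rows, "tags")

# Once exploded, the rows can be aggregated like any other table.
tag_counts = Counter(r["item"] for r in exploded)
print(exploded)
print(tag_counts)
```

Each output row carries the original columns plus the new `item` column, which is what allows the exploded values to be grouped and counted afterwards.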

How window functions are implemented in Spark and Hive

Window functions are often used at work and often asked about in interviews. Do you know the implementation principle behind them? Starting from a problem encountered in a real business case, this article discusses how data flows through a window function in hsql and gives a solution to the problem at the end. 1, Business background Fi ...

Posted by moiseszaragoza on Mon, 06 Apr 2020 04:05:56 -0700

Hive's wait is finally over: Flink

When did Apache Spark start supporting Hive integration? Anyone who has used Spark will probably say that was a long time ago. And when did Apache Flink start supporting integration with Hive? Readers may be puzzled: is it still unsupported, have they just never used it, or does only the latest version support it, but the fu ...

Posted by elfeste on Fri, 27 Mar 2020 03:09:45 -0700

hive-1.2.1 Installation and Simple Use

Hive only needs to be installed on one node. 1. Upload the tar package 2. Decompress it: tar -zxvf hive-1.2.1.tar.gz -C /apps/ 3. Install the MySQL database (switch to the root user) (it can be installed anywhere, as long as the node can connect to the hadoop cluster) 4. Configure hive (a) Configure the HIVE_HOME environment variable vi conf/hive-e ...

Posted by gyash on Thu, 26 Mar 2020 21:12:15 -0700

Flume installation, deployment, and examples

1. Installation address 1) Flume official website: http://flume.apache.org/ 2) Documentation: http://flume.apache.org/FlumeUserGuide.html 3) Download: http://archive.apache.org/dist/flume/ 2. Installation and deployment 1) Upload apache-flume-1.7.0-bin.tar.gz to the /opt/softwa ...

Posted by The14thGOD on Fri, 13 Mar 2020 19:40:57 -0700

Hive installation (incomplete)

1. Three installation methods of Hive. The official Hive website describes three installation methods, corresponding to different application scenarios. Ultimately, they differ in where the metadata is stored. Embedded mode (metadata is saved in the embedded derby database, al ...

Posted by akreation on Thu, 27 Feb 2020 02:23:22 -0800