spark-2.4.2.tgz Download and Compile

Preparation: creating folders. Create five folders in the user directory, including app (application software), a folder for application packages, data (test data), lib (jar pa ...

Posted by xshanelarsonx on Wed, 15 May 2019 08:05:26 -0700

Learning Hadoop Combiners and Writing a Custom Combiner

Combiner concept: A Combiner is often called a local Reduce, and when one is used, Reduce's input is the Combiner's output. In MapReduce, when the map phase generates too much data, network bandwidth becomes a bottleneck. How can the volume of data sent to Reduce be cut without affecting the final result? One way is to use a Combiner, which is known ...
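The idea can be sketched outside Hadoop. The following is a minimal pure-Python simulation, not Hadoop API code: the function names `map_phase`, `combine`, and `reduce_phase` and the sample splits are hypothetical, chosen only to show that pre-aggregating each map task's output shrinks the number of records shuffled while leaving the reduce result unchanged. It assumes a word-count job, where the reduce operation (summation) is associative and commutative.

```python
from collections import Counter, defaultdict

def map_phase(split):
    """Emit one (word, 1) pair per word, as a word-count mapper would."""
    return [(word, 1) for line in split for word in line.split()]

def combine(pairs):
    """Local reduce: sum counts per key within a single map task's output."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def reduce_phase(shuffled):
    """Global reduce: sum counts per key across all map tasks."""
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals)

# Two hypothetical input splits, one per map task.
splits = [["a b a", "b a"], ["a c", "c c a"]]

# Records sent to the reducer without and with a combiner.
without_combiner = [pair for split in splits for pair in map_phase(split)]
with_combiner = [pair for split in splits for pair in combine(map_phase(split))]

print(len(without_combiner), len(with_combiner))
print(reduce_phase(without_combiner) == reduce_phase(with_combiner))
```

This only works as shown because summation gives the same answer however the input is grouped. An operation such as an average cannot reuse the reduce logic as a combiner without carrying extra state such as a (sum, count) pair.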

Posted by frobak on Fri, 10 May 2019 11:06:38 -0700

Compiling Hive from Source with UDF Support

Table of contents: 1. Download the source code 2. Compile with UDF support 2.1 Upload and unzip 2.2 Add the UDF function class 2.3 Register the function 2.4 Compile Hive 3. Deployment and installation 4. Testing the UDF. Tip: To reduce the chance of Maven compilation errors on the server, you can choose to open the so ...

Posted by alant on Thu, 09 May 2019 08:26:40 -0700

Big Data Basics: Hadoop 3.1.1 High Availability (HA) Installation and Pitfall Notes

Recently I have been responsible for preparing storage for a big data platform, built mainly around Hadoop. Although we plan to use the CDH distribution of Hadoop, for development convenience we will start with stock Apache Hadoop and prepare a better environment for expansion later. Environment preparation: The three server system environm ...

Posted by adeelahmad on Tue, 07 May 2019 02:10:39 -0700

Big Data Learning 06_Hadoop: An Overview of MapReduce

Contents: Overview of MapReduce; MapReduce Core; MapReduce Programming Specification; MapReduce Case Practice 1: WordCount; Hadoop Serialization; MapReduce Case Practice 2: FlowCount. Overview of MapReduce, MapReduce Core: A MapReduce program is divided into at least two phases ...

Posted by sufian on Fri, 03 May 2019 04:50:38 -0700

Using LZO Compression with Split Support in Hadoop

1. Introduction. Install LZO: lzo is not native to Linux systems, so you need to download and install packages. At least three packages are needed here: lzo, lzop, and hadoop-gpl-packaging. Add an index: The main function of hadoop-gpl-packaging is to create an index for compressed lzo files. Otherwise, whether the compressed file is la ...

Posted by oaf357 on Tue, 23 Apr 2019 17:00:34 -0700

Hadoop 3.2.0 Source Analysis: DefaultContainerExecutor and LinuxContainerExecutor

The container executor is selectable in YARN. This article introduces two implementations: DefaultContainerExecutor and LinuxContainerExecutor. The choice is controlled by the configuration parameter yarn.nodemanager.container-executor.class. It is loaded when the NodeManager initializes, in org.apache.hadoop.yarn.server.nodemanager.NodeManager#serviceInit // ...

Posted by Atanu on Mon, 22 Apr 2019 22:00:35 -0700

Converting Hive Database Storage Format to ORC

Hive storage formats. TextFile: Hive's default storage format. Storage: row-based. High disk overhead and data parsing overhead. Hive cannot merge or split compressed text files. SequenceFile: a binary format that serializes data to file as key-value pairs. Storage: row-based. Splittable and compressible; block compression is generally selected ...

Posted by WendyLady on Sun, 21 Apr 2019 18:30:35 -0700

Strict ID Card Number Validation with a Hive UDF

At work, the ID numbers in a Hive table needed strict validation. Because the last character is a check digit, a simple regular expression cannot do the job, so a UDF is used to implement the check. I have only implemented the function and have not optimized it in depth; comments are welcome, under the guidance of h ...
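To show why a regular expression alone falls short, here is a minimal Python sketch of a check-digit test. It is not the post's UDF, which is not shown in the excerpt. It assumes the numbers are 18-character PRC resident identity numbers, whose last character is computed with the ISO 7064 MOD 11-2 scheme defined in GB 11643-1999; the names `check_char` and `is_valid_id` are hypothetical. It checks only the format and the check character, not the region code or birth date.

```python
import re

# Assumption: 18-character PRC resident ID numbers (GB 11643-1999).
# Weights for the first 17 digits and the check-character lookup table.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"
ID_PATTERN = re.compile(r"[0-9]{17}[0-9X]")

def check_char(first17):
    """Compute the expected check character from the first 17 digits."""
    total = sum(int(digit) * weight for digit, weight in zip(first17, WEIGHTS))
    return CHECK_CHARS[total % 11]

def is_valid_id(id_number):
    """Return True if the format matches and the check character agrees."""
    candidate = id_number.strip().upper()
    if not ID_PATTERN.fullmatch(candidate):
        return False
    return check_char(candidate[:17]) == candidate[17]

print(is_valid_id("11010519491231002X"))
print(is_valid_id("110105194912310021"))
```

Both sample numbers pass the format pattern, but only the first has a check character consistent with its leading 17 digits, which is the part a plain pattern match cannot verify.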

Posted by carlg on Wed, 17 Apr 2019 20:06:33 -0700

Common SQL Statements in Hive

Databases. Create a database: hive> CREATE DATABASE financials; hive> CREATE DATABASE IF NOT EXISTS financials; Display existing databases: hive> SHOW DATABASES; default financials hive> CREATE DATABASE human_resources; hive> SHOW DATABASES; default financials human_resources Query databases by condition: hive> SHOW DATABASE ...

Posted by johnsonzhang on Mon, 15 Apr 2019 14:42:32 -0700