spark-2.4.2.tgz Download and Compile
Preparation
Creating folders
# Create five folders in the user's home directory
app # Installed applications
software # Application packages
data # Test data
lib # JAR pa ...
Posted by xshanelarsonx on Wed, 15 May 2019 08:05:26 -0700
combiner Learning and Customized combiner of Hadoop
Combiner concept
A Combiner is often described as a local Reduce: it runs on the map side, and its final output becomes the input of Reduce.
In MapReduce, when the map phase produces a large amount of data, network bandwidth becomes a bottleneck. How can the data sent to Reduce be shrunk without affecting the final result? One way is to use a Combiner, which is known ...
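The effect of a combiner can be illustrated with a small simulation. The following is a minimal sketch in plain Python, not Hadoop code: the function names (`map_phase`, `combine`, `reduce_phase`, `run_job`) and the sample splits are hypothetical, and the "shuffle" is simply a list that collects what each mapper would send over the network.

```python
from collections import defaultdict

def map_phase(split):
    """Emit one (word, 1) pair per word in the input split."""
    return [(word, 1) for line in split for word in line.split()]

def combine(pairs):
    """Local reduce over a single mapper's output: sum counts per word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def reduce_phase(all_pairs):
    """Final reduce over the pairs received from every mapper."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)

def run_job(splits, use_combiner):
    """Return (number of pairs shuffled to the reducer, final counts)."""
    shuffled = []
    for split in splits:  # each split stands in for one mapper
        pairs = map_phase(split)
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return len(shuffled), reduce_phase(shuffled)

# Hypothetical input: two splits, one per mapper.
SPLITS = [["a b a", "a c"], ["b b a"]]

if __name__ == "__main__":
    print(run_job(SPLITS, use_combiner=False))
    print(run_job(SPLITS, use_combiner=True))
```

In this sketch the final counts are identical with and without the combiner, but fewer pairs reach the reducer when it is enabled. That only holds because summation is associative and commutative; an operation such as a plain average cannot be reused as a combiner without changing what the mappers emit.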
Posted by frobak on Fri, 10 May 2019 11:06:38 -0700
Compiling Hive from source with UDF support
Table of Contents
1. Download the source code
2. Compile with UDF support
2.1 Upload and unzip
2.2 Add the UDF function class
2.3 Register the function
2.4 Compile Hive
3. Deployment and installation
4. Testing the UDF
Tip: To reduce the chance of Maven build errors on the server, you can choose to open the so ...
Posted by alant on Thu, 09 May 2019 08:26:40 -0700
Big Data Basics: Hadoop 3.1.1 High Availability (HA) Installation and Pitfalls
Recently I have been responsible for preparing the storage layer of a big data platform for a project, built mainly around Hadoop. Although we plan to use the CDH distribution of Hadoop, for development convenience we are using stock Apache Hadoop until a better environment is ready for expansion.
Environment preparation
The three server system environm ...
Posted by adeelahmad on Tue, 07 May 2019 02:10:39 -0700
Big Data Learning 06_Hadoop: An Overview of MapReduce
Overview of MapReduce
MapReduce Core
MapReduce Programming Specification
MapReduce Case Practice 1: WordCount
Hadoop serialization
MapReduce Case Practice 2: FlowCount
Overview of MapReduce
MapReduce Core
A MapReduce program is divided into at least two phases
...
Posted by sufian on Fri, 03 May 2019 04:50:38 -0700
Using LZO compression with split support in Hadoop
1. Introduction:
Install LZO:
LZO is not included with Linux systems by default, so you need to download and install it. At least three packages are required:
lzo, lzop, hadoop-gpl-packaging.
Add index:
The main function of hadoop-gpl-packaging is to create indexes for LZO-compressed files. Otherwise, whether the compressed file is la ...
Posted by oaf357 on Tue, 23 Apr 2019 17:00:34 -0700
Hadoop 3.2.0 Source Analysis: DefaultContainerExecutor and LinuxContainerExecutor
The container executor is selectable in YARN. This article introduces the following two types:
DefaultContainerExecutor
LinuxContainerExecutor
The choice is controlled by the configuration parameter yarn.nodemanager.container-executor.class
The executor is loaded when NodeManager initializes, in
org.apache.hadoop.yarn.server.nodemanager.NodeManager#serviceInit
// ...
Posted by Atanu on Mon, 22 Apr 2019 22:00:35 -0700
Converting Hive database storage format to ORC
Hive storage formats
textfile
Hive's default storage format
Storage: row-oriented
High disk overhead and high data-parsing overhead
Hive cannot merge or split compressed text files
SequenceFile
A binary file; data is serialized into the file as key-value pairs
Storage: row-oriented
Splittable and compressible
Block compression is generally chosen
...
Posted by WendyLady on Sun, 21 Apr 2019 18:30:35 -0700
Strict ID Card Number Validation with a Hive UDF
At work, the ID numbers in a Hive table needed strict validation. Because the last character is a check digit, a simple regular expression is not enough, so a UDF is used to implement the check.
I have only implemented the basic functionality without in-depth optimization; comments are welcome, under the guidance of h ...
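The check-digit rule that a regular expression cannot express can be sketched as follows. This is an illustration in plain Python rather than the author's Hive UDF, and it assumes the 18-character Chinese resident identity number, whose last character is computed with the ISO 7064 MOD 11-2 scheme. The function name `is_valid_id` is hypothetical, and the sketch checks only length, digits, and the check character, not the region code or birth date.

```python
# Weights for the first 17 digits, as used by the MOD 11-2 scheme.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
# Check character for each possible remainder 0..10.
CHECK_CHARS = "10X98765432"

def is_valid_id(id_number):
    """Return True if the 18-character ID has a correct check character."""
    if len(id_number) != 18:
        return False
    body, check = id_number[:17], id_number[17].upper()
    # The first 17 characters must be ASCII digits.
    if not (body.isascii() and body.isdigit()):
        return False
    total = sum(int(digit) * weight for digit, weight in zip(body, WEIGHTS))
    return CHECK_CHARS[total % 11] == check

if __name__ == "__main__":
    # A commonly cited sample number, not a real person's ID.
    print(is_valid_id("11010519491231002X"))
```

A stricter check would also validate the embedded birth date and the region code; those steps are left out here because the excerpt does not describe them.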
Posted by carlg on Wed, 17 Apr 2019 20:06:33 -0700
Common SQL statements in Hive
Databases
Create a database
hive> CREATE DATABASE financials;
hive> CREATE DATABASE IF NOT EXISTS financials;
List existing databases
hive> SHOW DATABASES;
default
financials
hive> CREATE DATABASE human_resources;
hive> SHOW DATABASES;
default
financials
human_resources
Query databases by condition
hive> SHOW DATABASE ...
Posted by johnsonzhang on Mon, 15 Apr 2019 14:42:32 -0700