Hadoop Part 2: MapReduce
MapReduce (3)
Project address: https://github.com/KingBobTitan/hadoop.git
MR's Shuffle explanation and Join implementation
First, a review
1. MapReduce's history monitoring service: JobHistoryServer
Function: monitors the information of all MapReduce programs that have run on YARN
Configure log ...
Posted by nick1 on Tue, 14 Jan 2020 02:21:13 -0800
Waterdrop: filtering and processing log files
Use Waterdrop to filter and process log files and store the resulting data
Installing waterdrop
Download the installation package of waterdrop using wget
wget xxxxx
Extract to the directory you need
unzip XXX (package location) to XXX (extraction directory)
If unzip reports an error, install the unzip command yourself first.
Set the dependency env ...
Posted by PhantomCube on Mon, 13 Jan 2020 01:04:18 -0800
Grouping control of several common window functions in Hive
Brief introduction
There is not much to say about regular window functions; they are simple. This post introduces grouping, focusing on how ROWS BETWEEN is used after grouping and sorting.
The key is to understand the meaning of the keywords inside ROWS BETWEEN (a short example follows the table below):
Keyword      Meaning
preceding    the rows before the current row
following    the rows after the current row
current ...
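To make the frame semantics concrete, here is a minimal sketch (in Java, like the other code in this collection) that runs such a query through a Hive-enabled SparkSession; the table and column names (orders, user_id, ts, amount) are placeholders, not taken from the original post.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class RowsBetweenDemo {
    public static void main(String[] args) {
        // Hive support lets spark.sql() query tables in the Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("RowsBetweenDemo")
                .enableHiveSupport()
                .getOrCreate();

        // Running sum over the 2 preceding rows plus the current row,
        // computed per user_id after sorting by ts.
        Dataset<Row> result = spark.sql(
            "SELECT user_id, ts, amount, " +
            "       SUM(amount) OVER (PARTITION BY user_id ORDER BY ts " +
            "                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_sum " +
            "FROM orders");
        result.show();

        spark.stop();
    }
}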
Posted by skyxmen on Thu, 09 Jan 2020 07:26:16 -0800
The Hadoop learning path: completing wordcount with a MapReduce program
Test text data used by the program:
Dear River
Dear River Bear Spark
Car Dear Car Bear Car
Dear Car River Car
Spark Spark Dear Spark
1. Main classes
(1) Mapper class
First, the custom Mapper class code:
public class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
public void map(LongWritable key, Text val ...
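The excerpt cuts off here; for reference, a minimal self-contained Mapper of this kind could look like the sketch below (the whitespace tokenization is an assumption, not necessarily what the original post uses):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split each input line on whitespace and emit (word, 1) for every token.
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}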
Posted by SundayDriver on Fri, 27 Dec 2019 02:19:33 -0800
Scala functional programming: functional data structures
Previously on:
Guide to functional programming in Scala
Scala functional programming (2) introduction to scala basic syntax
Scala functional programming (3) Scala sets and functions
Scala functional programming (4) functional data structures
1. List code analysis
Today's content mainly supplements the Scala functional data st ...
Posted by stev979 on Thu, 19 Dec 2019 03:45:09 -0800
Real-time log analysis of IP access counts with Flume+Kafka+SparkStreaming+Redis+MySQL
I am a novice still learning; if there are mistakes, please point them out, thank you!
1. Start ZooKeeper and Kafka and create a topic named test fkss. For the convenience of observation, I added it through Kafka Manager.
2. Configure Flume and start it. It listens to the file /home/czh/docker-public-file/testplume.log and sends its contents to Kafka.
a ...
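The Flume configuration excerpt is cut off above. For the Spark Streaming side of the pipeline, a minimal Java sketch of consuming the Kafka topic and counting IP accesses could look like the following; the broker address, the exact topic name, and the assumption that the IP is the first field of each log line are all placeholders, and the real project writes the counts to Redis/MySQL rather than printing them.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class IpAccessCount {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("IpAccessCount").setMaster("local[*]");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "ip-access-count");
        kafkaParams.put("auto.offset.reset", "latest");

        // Topic name is assumed; the post calls it "test fkss".
        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("test-fkss"), kafkaParams));

        // Count each IP per batch, assuming the IP is the first field of every log line.
        stream.map(record -> record.value().split(" ")[0])
              .countByValue()
              .print();   // the original pipeline stores these counts in Redis/MySQL instead

        jssc.start();
        jssc.awaitTermination();
    }
}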
Posted by ksduded on Sat, 14 Dec 2019 10:51:38 -0800
Using docker to install Hadoop and Spark
Use Docker to configure and install Hadoop and Spark
Install the Hadoop and Spark images separately
Install the Hadoop image
The selected Docker image address: the Hadoop version provided by this image is relatively new and comes with JDK 8, so it can support installing the latest version of Spark.
docker pull uhopp ...
Posted by shantred on Tue, 10 Dec 2019 22:46:53 -0800
Spark reads Hive data (Java)
Requirement: read the data from Hive and write it into ES.
Environment: Spark 2.0.2
1. Set enableHiveSupport() on the SparkSession
SparkConf conf = new SparkConf().setAppName("appName").setMaster("local[*]");
SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL basic exam ...
Posted by NSW42 on Tue, 10 Dec 2019 14:30:32 -0800
Spark: upgraded JDBC data source (2)
Spark's JDBC data source only supports the Append, Overwrite, ErrorIfExists and Ignore save modes, but almost all of our online businesses need upsert: rows that already exist should be updated rather than the whole table being overwritten. In MySQL we use ON DUPLICATE KEY UPDATE. Is there such an implementation? Official: sorry, no. dounine: I have one. You can ...
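For reference, a common hand-rolled workaround (this is not the dounine library mentioned above) is to run the upsert from foreachPartition with plain JDBC and MySQL's ON DUPLICATE KEY UPDATE. A sketch, with made-up connection details, table and columns:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import org.apache.spark.api.java.function.ForeachPartitionFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class UpsertExample {
    // Upserts a DataFrame with columns (id, name, cnt) into a MySQL table by hand.
    static void upsert(Dataset<Row> df) {
        df.foreachPartition((ForeachPartitionFunction<Row>) rows -> {
            String url = "jdbc:mysql://localhost:3306/test";   // placeholder connection info
            try (Connection conn = DriverManager.getConnection(url, "user", "password");
                 PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO target(id, name, cnt) VALUES (?, ?, ?) " +
                     "ON DUPLICATE KEY UPDATE name = VALUES(name), cnt = VALUES(cnt)")) {
                while (rows.hasNext()) {
                    Row r = rows.next();
                    ps.setLong(1, r.getLong(0));
                    ps.setString(2, r.getString(1));
                    ps.setLong(3, r.getLong(2));
                    ps.addBatch();
                }
                ps.executeBatch();     // one batched round trip per partition
            }
        });
    }
}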
Posted by smordue on Fri, 22 Nov 2019 07:03:24 -0800
Data service analysis of Spark project
Business logic processing
Peers
To judge whether an object is a peer, we can use longitude and latitude to check whether two objects have passed through several of the same places. Alternatively, each monitoring device can be tagged: when an object passes a monitoring device, it is captured by that device.
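As a toy in-memory illustration of this idea (not the project's actual Spark code), peers can be found by counting how many capture points two objects share; the data layout and threshold below are hypothetical:

import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PeerDetection {
    // captures: objectId -> the "deviceId@timeBucket" points where it was captured.
    // Two objects count as peers when they share at least `threshold` capture points,
    // i.e. they passed the same monitored places at roughly the same time.
    static Set<String> findPeers(Map<String, List<String>> captures,
                                 String targetId, int threshold) {
        Set<String> targetPlaces =
            new HashSet<>(captures.getOrDefault(targetId, Collections.emptyList()));
        Set<String> peers = new HashSet<>();
        for (Map.Entry<String, List<String>> e : captures.entrySet()) {
            if (e.getKey().equals(targetId)) continue;
            long shared = e.getValue().stream().filter(targetPlaces::contains).count();
            if (shared >= threshold) peers.add(e.getKey());
        }
        return peers;
    }
}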
Tra ...
Posted by aztec on Sun, 17 Nov 2019 08:26:39 -0800