Principle analysis of Apache Flink CDC batch stream fusion technology
This article is reproduced from the official account of "good future technology". It introduces the use of Flink CDC 2 through a Flink SQL case and interprets the core design of CDC. The main contents are as follows: case, core design, code explanation. In August, Flink CDC released version 2.0.0. Compared with version 1.0, it supports ...
Posted by wxflint on Wed, 10 Nov 2021 23:25:41 -0800
Spark common RDD operators for big data development
map
map takes in one data item and returns one data item. It applies a function to the elements of an RDD one by one and maps them into another RDD: each data item in the source RDD is transformed into a new element by the function passed to map. Input partition and o ...
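The element-by-element behavior described above can be illustrated without a cluster. The following is a minimal pure-Python sketch, not Spark code: the nested list stands in for the partitions of an RDD, and map_elementwise is a hypothetical helper name introduced only for illustration. In Spark itself the corresponding call is rdd.map(f).

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_elementwise(partitions: List[List[T]], fn: Callable[[T], U]) -> List[List[U]]:
    """Apply fn to every element, keeping each result in its original partition."""
    return [[fn(item) for item in partition] for partition in partitions]


# Hypothetical input: one dataset split across two partitions.
partitions = [[1, 2, 3], [4, 5]]
doubled = map_elementwise(partitions, lambda x: x * 2)
print(doubled)  # [[2, 4, 6], [8, 10]]
```

Each input element produces exactly one output element, and the number of partitions is unchanged.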
Posted by axo on Tue, 09 Nov 2021 10:56:02 -0800
Day79_ Flink (V) FlinkSQL and CEP
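Syllabus
Course content | Learning effect | Mastery goal
FlinkSQL | FlinkTable | master
FlinkSQL | FlinkSQL | master
FlinkCEP | FlinkCEP | master
Task performance optimization | operator chain | master
Task performance optimization | slot sharing | master
Task performance optimization | Flink asynchronous IO | master
Task performance optimization | Checkpoint optimization | master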
1. Table & SQL
(1) Overview
Table API is a ...
Posted by stereo on Mon, 08 Nov 2021 11:18:42 -0800
HA high availability + hive+hbase+sqoop+kafka+flume+spark installation and deployment
Contents
Preface
Data
HA high availability deployment
Hive installation deployment
Hbase installation and deployment
sqoop installation deployment
Unzip the installation package
Modify the configuration file
Environment variables
sqoop-env.sh
Copy JDBC Driver
Test whether Sqoop can successfully connect to the database
kafka installation and depl ...
Posted by anthonyv on Mon, 08 Nov 2021 04:33:27 -0800
Review of big data development (MapReduce)
2. MapReduce
2.1. Introduction to MapReduce
The core idea of MapReduce is "divide and conquer", which is suitable for a large number of complex task processing scenarios (large-scale data processing scenarios).
Map is responsible for "dividing", that is, dividing complex tasks into several "simple tasks" for ...
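The "divide and conquer" idea can be sketched with a word count in plain Python. This is an illustration only, not Hadoop code: the function names map_phase, shuffle, and reduce_phase and the sample lines are assumptions made for this sketch, and the reduce step shown is the standard "combine the partial results" stage of the model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Divide: turn one input line into independent (word, 1) pairs."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all values that share the same key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Conquer: combine the partial results for one key."""
    return key, sum(values)


# Hypothetical input lines.
lines = ["big data big", "data flow"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)  # {'big': 2, 'data': 2, 'flow': 1}
```

Because each line is mapped independently, the map calls could run in parallel on different machines; only the grouping step needs to see all the pairs.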
Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800
Data type of clickhouse
data type
To improve performance, ClickHouse provides composite data types in addition to those found in traditional databases. ClickHouse implements Update and Delete as variants of ALTER.
1 Integer
Integer types include signed and unsigned integers.
1.1 signed integer
type | Range | Bytes
int8 | [-128, 127] | 1
int16 | [-32768, 32767] | 2
int ...
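The ranges in the table follow directly from the byte width. A minimal sketch, assuming two's-complement storage; signed_range is a hypothetical helper name, not part of ClickHouse:

```python
from typing import Tuple


def signed_range(num_bytes: int) -> Tuple[int, int]:
    """Return (min, max) of a two's-complement signed integer of the given width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for width in (1, 2):
    low, high = signed_range(width)
    print(f"{width} byte(s): [{low}, {high}]")
# 1 byte(s): [-128, 127]
# 2 byte(s): [-32768, 32767]
```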
Posted by dmayo2 on Sat, 06 Nov 2021 12:52:23 -0700
Machine learning competition: come and explore happiness together
Come and explore happiness together (complete)
These study notes cover material from the machine learning training camp of the Alibaba Cloud Tianchi Longzhu Plan. Study link: AI training camp machine learning - Alibaba Cloud Tianchi
Introduction to the competition task (although there is already an introduction to the competition task in the a ...
Posted by PhilVaz on Wed, 03 Nov 2021 18:16:15 -0700
Chaos engineering with ChaosMesh: simulating network packet duplication
Preface
Today, let's use ChaosMesh to simulate network packet duplication and look at its direct impact on the application.
Goal
Simulate duplicated network packets.
Configuration
YAML file configuration
[root@s5 ChaosMesh]# cat network-duplicate.yaml
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkC ...
Posted by neiltaylormade on Tue, 02 Nov 2021 22:49:42 -0700
CDH6.3.2 integration with Apache Flink1.12.0
Versions: CentOS 7.6, JDK 1.8, Scala 2.11, Python 2.7, Git 1.8.3.1, Apache Maven 3.6.3, CDH 6.3.2, Apache Flink 1.12.0
All of the above software must be installed in advance.
1. Compile Flink
1. Download the Flink source code
git clone https://github.com/apache/flink.git
cd flink
git checkout release-1.12.0
2. Add Maven mirrors
Add the following mirrors in the m ...
Posted by mustatin on Mon, 01 Nov 2021 04:52:28 -0700
[Hive] Chapter 5 DML data operation
Data import
1. Load data into the table (load)
1) Syntax
hive> load data [local] inpath 'Data path' [overwrite] into table \
student [partition (partcol1=val1,...)];
(1) load data: indicates loading data
(2) local: indicates loading data from the local file system into the Hive table; otherwise, data is loaded from HDFS into the Hive table
(3) inpath: indicates the pa ...
Posted by PseudoEvolution on Sun, 31 Oct 2021 14:38:44 -0700