Principle analysis of Apache Flink CDC batch stream fusion technology

This article is reproduced from the "good future technology" official account. It introduces the use of Flink CDC 2 through a Flink SQL case and interprets the core design of CDC. The main contents are: a case, the core design, and a code explanation. In August, Flink CDC released version 2.0.0. Compared with version 1.0, it supports ...

Posted by wxflint on Wed, 10 Nov 2021 23:25:41 -0800

Spark common RDD operators for big data development

map: takes in one piece of data and returns one piece of data. map applies a function to the elements of an RDD one by one and maps them to another RDD; each data item in the RDD is transformed into a new element by the function passed to map. Input partition and o ...

Posted by axo on Tue, 09 Nov 2021 10:56:02 -0800

Day79_ Flink (V) FlinkSQL and CEP

Syllabus (course content, learning effect, mastery goal): FlinkSQL: FlinkTable (master), FlinkSQL (master). FlinkCEP: FlinkCEP (master). Task performance optimization: operator chain (master), slot sharing (master), Flink asynchronous IO (master), Checkpoint optimization (master). 1. Table & SQL (1) Overview: Table API is a ...

Posted by stereo on Mon, 08 Nov 2021 11:18:42 -0800

HA high availability + hive+hbase+sqoop+kafka+flume+spark installation and deployment

Contents: Preface; Data; HA high availability deployment; Hive installation and deployment; HBase installation and deployment; Sqoop installation and deployment; Unzip the installation package; Modify the configuration file; Environment variables; sqoop-env.sh; Copy the JDBC driver; Test whether Sqoop can successfully connect to the database; Kafka installation and depl ...

Posted by anthonyv on Mon, 08 Nov 2021 04:33:27 -0800

Review of big data development (MapReduce)

2. MapReduce 2.1 Introduction to MapReduce: The core idea of MapReduce is "divide and conquer", which suits large-scale data processing scenarios with many complex tasks. Map is responsible for "dividing", that is, splitting a complex task into several "simple tasks" for ...

Posted by JCBarry on Sun, 07 Nov 2021 16:18:05 -0800

Data type of clickhouse

Data types: To improve performance, ClickHouse provides composite data types in addition to those found in traditional databases. ClickHouse's Update and Delete are implemented as variants of Alter. 1 Integer: integers include signed and unsigned integers. 1.1 Signed integers (type, range, bytes): int8, [-128, 127], 1; int16, [-32768, 32767], 2; int ...

Posted by dmayo2 on Sat, 06 Nov 2021 12:52:23 -0700

Machine learning competition: come and explore happiness together

Come and explore happiness together (complete). This study note covers the content of the machine learning training camp of the Alibaba Cloud Tianchi Longzhu plan. The learning link is: AI training camp machine learning - Alibaba Cloud Tianchi. Introduction to the competition task (although there is already an introduction to the competition task in the a ...

Posted by PhilVaz on Wed, 03 Nov 2021 18:16:15 -0700

Chaos engineering: simulating network duplicate packets with ChaosMesh

Preface: Today, let's use ChaosMesh to simulate network duplicate packets and look at the direct impact on the application. Goal: simulate network duplicate packets. Configuration: YAML file configuration: [root@s5 ChaosMesh]# cat network-duplicate.yaml apiVersion: chaos-mesh.org/v1alpha1 kind: NetworkC ...

Posted by neiltaylormade on Tue, 02 Nov 2021 22:49:42 -0700

CDH6.3.2 integration with Apache Flink1.12.0

Versions: CentOS 7.6, JDK 1.8, Scala 2.11, Python 2.7, Git 1.8.3.1, Apache Maven 3.6.3, CDH 6.3.2, Apache Flink 1.12.0. The above software needs to be installed in advance! 1. Compile Flink. (1) Download the Flink source code: git clone https://github.com/apache/flink.git git checkout release-1.12.0 (2) Add a Maven mirror: add the following mirrors in the m ...

Posted by mustatin on Mon, 01 Nov 2021 04:52:28 -0700

[Hive] Chapter 5 DML data operation

Data import. 1. Load data into a table (load). 1) Syntax: hive> load data [local] inpath 'Data path' [overwrite] into table student [partition (partcol1=val1,...)]; (1) load data: indicates loading data. (2) local: indicates loading data from the local file system into the Hive table; otherwise, data is loaded from HDFS into the Hive table. (3) inpath: indicates the pa ...

Posted by PseudoEvolution on Sun, 31 Oct 2021 14:38:44 -0700