Flink Learning Notes: Process Function
I expect to take about a week to finish this article and to work through every part of the getting-started material provided by the Flink documentation (with the goal of taking part in Alibaba's Flink programming competition):
(Figure: Flink provides different ...)
Posted by kulikedat on Fri, 16 Aug 2019 20:07:04 -0700
Spark Learning Examples (Python): Loading Data Sources
We mainly use Spark to process large volumes of data quickly. So what data sources do we actually encounter in development and production? I summarize them as follows:
text
csv
json
parquet
jdbc
hive
kafka
elasticsearch
Next, all the tests are based on Spark local mode, be ...
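As a minimal sketch of what such tests can look like (the file paths and JDBC connection details below are hypothetical placeholders, not taken from the original post), several of these sources can be loaded with PySpark as follows:

# Minimal PySpark sketch of loading several of the data sources listed above.
# All paths and connection settings are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("load-sources").getOrCreate()

df_text = spark.read.text("data/sample.txt")              # text
df_csv = spark.read.csv("data/sample.csv", header=True)   # csv
df_json = spark.read.json("data/sample.json")             # json
df_parquet = spark.read.parquet("data/sample.parquet")    # parquet
df_jdbc = (spark.read.format("jdbc")                      # jdbc
           .option("url", "jdbc:mysql://localhost:3306/test")
           .option("dbtable", "people")
           .option("user", "root")
           .option("password", "root")
           .load())

df_csv.show()
spark.stop()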
Posted by Revlet on Thu, 08 Aug 2019 23:40:57 -0700
Chapter 2 RDD Programming (2.1-2.2)
Chapter 2 RDD Programming
2.1 Programming Model
In Spark, RDDs are represented as objects, and transformations are expressed as method calls on those objects. After a series of transformations defines an RDD, actions can be invoked to trigger the RDD computation, either by returning results to the application (count, collect, etc.) or by saving data to the stor ...
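As a quick illustration of this model (a generic PySpark sketch, not an example from the chapter itself): the transformations only define the RDD, and nothing is computed until an action runs.

# Transformations (map, filter) build the RDD lazily; the actions at the end
# trigger the actual computation. A generic sketch, not taken from the chapter.
from pyspark import SparkContext

sc = SparkContext("local[*]", "rdd-model")

nums = sc.parallelize(range(1, 11))             # create an RDD
squares = nums.map(lambda x: x * x)             # transformation: nothing runs yet
evens = squares.filter(lambda x: x % 2 == 0)    # another transformation

print(evens.count())                    # action: returns a result to the application
print(evens.collect())                  # action: returns the data to the application
evens.saveAsTextFile("output/evens")    # action: saves data to the storage system
sc.stop()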
Posted by Chizzad on Sun, 04 Aug 2019 10:52:20 -0700
Apache Spark Progressive Learning Tutorial: Spark Cluster Deployment and Running
Contents
I. Preface
1.1 Cluster Planning
1.2 Prerequisites
1.3 Installation Package Download
II. Installation and Deployment
2.1 Unzip and Modify the Configuration Files
2.2 Copy the Files to the Other Two Machines
III. Operation and Testing
3.1 Start the Cluster
3.2 Start spark-shell and Connect to the Cluster
3. ...
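Once the cluster is up, an application can connect to the standalone master rather than to local mode. The sketch below is a hypothetical smoke test in PySpark (the master URL spark://master:7077 is a placeholder; the article itself uses spark-shell):

# Hypothetical smoke test against the standalone cluster started above.
# "spark://master:7077" is a placeholder for the real master URL.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("spark://master:7077")
         .appName("cluster-smoke-test")
         .getOrCreate())

# A tiny job to confirm the executors are reachable.
print(spark.sparkContext.parallelize(range(100)).sum())
spark.stop()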
Posted by zuhalter223 on Fri, 02 Aug 2019 02:32:40 -0700
Application of Scrapy and MongoDB
Link to the original text: http://www.cnblogs.com/JackQ/p/4843701.html
Scrapy is a fast, high-level screen scraping and web crawling framework written in Python. It is used to crawl web sites and extract structured data from their pages. ...
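A common way to combine the two is a Scrapy item pipeline that writes scraped items into MongoDB with pymongo. The sketch below is a generic pattern (the URI, database name, and pipeline setting are assumptions for illustration, not taken from the linked article):

# Generic Scrapy item pipeline that stores scraped items in MongoDB via pymongo.
# The URI, database name, and collection choice are illustrative assumptions.
import pymongo

class MongoPipeline:
    def __init__(self, mongo_uri="mongodb://localhost:27017", mongo_db="scrapy_demo"):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db

    def open_spider(self, spider):
        # Called when the spider starts: open one client per crawl.
        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        # One collection per spider; store the item as a plain dict.
        self.db[spider.name].insert_one(dict(item))
        return item

# Enable it in settings.py, for example:
# ITEM_PIPELINES = {"myproject.pipelines.MongoPipeline": 300}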
Posted by pollysal on Tue, 30 Jul 2019 18:00:21 -0700
Apache Spark Progressive Learning Tutorial: Spark Single Node Installation and Quick Start Demo
I. Download Spark
The first step in using Spark is to download and extract it. Let's start by downloading a precompiled version of Spark. Visit http://spark.apache.org/downloads.html to download the Spark installation package. The version used in this article is spark-2.4.3-bin-hadoop2.7.tgz.
II. Install Spark
cd ~
tar -xf spark-2.4.3-bin-hadoop2.7.tgz ...
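After extracting the archive, a quick way to check the installation (a hypothetical sketch; the original article's quick-start demo may differ) is to run a small PySpark script with the local master:

# Hypothetical quick-start check after extracting spark-2.4.3-bin-hadoop2.7:
# count the lines of Spark's own README using the local master.
# Run it with: bin/spark-submit quick_start.py (from the extracted Spark directory).
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("quick-start").getOrCreate()
readme = spark.read.text("README.md")
print(readme.count())
spark.stop()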
Posted by culprit on Mon, 29 Jul 2019 04:21:08 -0700
ROS One-Click Deployment of Spark Distributed Cluster
Apache Spark is a fast and general-purpose computing engine designed for large-scale data processing. It can handle a wide range of workloads, including SQL queries, text processing, machine learning, and so on. Before the advent of Spark, we generally needed to learn a variety of separate engines to handle these requirements. The main purpose of this a ...
Posted by scottb1 on Mon, 08 Jul 2019 09:48:51 -0700
Summary of Kafka Learning Points
I. The Kafka configuration file server.properties
#Globally unique broker id; must not be duplicated
broker.id=0
#Port to listen on; producers and consumers connect to this port
port=9092
#Number of threads for handling network requests
num.network.threads=3
#Number of threads for handling disk I/O
nu ...
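With a broker listening on port 9092 as configured above, a minimal producer/consumer pair can be written with the kafka-python package (the package choice and the topic name "test" are assumptions, not part of the original notes):

# Minimal producer/consumer sketch against the broker configured above (port 9092).
# Uses the kafka-python package; the topic name "test" is an arbitrary choice.
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("test", b"hello kafka")
producer.flush()
producer.close()

consumer = KafkaConsumer("test",
                         bootstrap_servers="localhost:9092",
                         auto_offset_reset="earliest",
                         consumer_timeout_ms=5000)
for message in consumer:
    print(message.topic, message.offset, message.value)
consumer.close()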
Posted by fellixombc on Thu, 27 Jun 2019 14:32:08 -0700
Spark Examples
Spark Streaming is a quasi-real-time stream processing framework; the delay in processing real-time data is at the second level. Storm is a real-time stream processing framework with millisecond-level response times. So the choice of streaming framework depends on the spec ...
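The micro-batch model behind this second-level latency is easiest to see in a small example; the sketch below is a generic socket word count (not taken from the original post) with a 5-second batch interval:

# Generic Spark Streaming word count over a socket source; the 5-second batch
# interval illustrates the micro-batch (quasi-real-time) model described above.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "streaming-wordcount")
ssc = StreamingContext(sc, 5)                      # 5-second micro-batches

lines = ssc.socketTextStream("localhost", 9999)    # e.g. fed by: nc -lk 9999
counts = (lines.flatMap(lambda line: line.split(" "))
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()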
Posted by Frederick on Fri, 21 Jun 2019 14:37:32 -0700
Spark SQL Learning Notes
Spark SQL is the module in Spark for processing structured data. Unlike the underlying Spark RDD API, the interfaces provided by Spark SQL give Spark more information about the structure of the data and the computation being performed. Spark SQL currently has three different APIs: SQL statements, the DataFrame API, and the newest Dataset API. One use of Spark SQL is to ex ...
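As a short sketch of two of these APIs over the same data (the column and table names are made up; the Dataset API is Scala/Java-only, so it is not shown in this Python example):

# Querying the same data through the DataFrame API and a SQL statement.
# Column and table names are made up for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("sql-demo").getOrCreate()

df = spark.createDataFrame([("Alice", 30), ("Bob", 25)], ["name", "age"])

# DataFrame API
df.filter(df.age > 26).select("name").show()

# SQL statement over a temporary view
df.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age > 26").show()

spark.stop()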
Posted by arunmj82 on Sun, 16 Jun 2019 17:27:38 -0700