Handling a slightly more complex business scenario in Spark by reading HDFS files with a custom InputFormat
Link to the original text: https://www.oipapio.com/cn/article-2689341
Business scenario
Spark relies on an InputFormat to decide how to read files. By default, it reads files one line at a time. In some specific cases, Spark's default Inpu ...
Posted by garyb_44 on Sun, 06 Oct 2019 17:05:10 -0700
Spark Learning 02 - Ways to Create a DStream
Spark Streaming provides two categories of built-in streaming sources.
Basic sources: sources available directly in the StreamingContext API, for example file systems and socket connections.
Advanced sources: sources such as Kafka, Flume, and Kinesis, which are available through additional utility classes.
...
Posted by leebo on Thu, 03 Oct 2019 10:21:03 -0700
Writing Spark output to an existing directory
Link to the original text: https://my.oschina.net/asparagus/blog/699814
Spark saves the output data to an existing directory
Saving with the RDD's saveAsTextFile method
String input = ...
Posted by webguync on Thu, 03 Oct 2019 07:17:07 -0700
Scala: the Akka Concurrent Programming Framework
Article Directory
Introduction to Akka Concurrent Programming Framework
Introduction to Akka
Akka characteristics
Akka communication process
Create Actor
Introduction to API
Getting Started Cases
Implementation Steps
1. Create Maven modules
2. Create and load Actors
3. Send/receive messages
Akka ...
Posted by Stonewall on Fri, 20 Sep 2019 20:24:53 -0700
Big Data Series: Learning Notes on a First Look at Spark
1. Introduction to Spark
In 2009, Spark was born at the AMPLab at the University of California, Berkeley. At that stage Spark was an experimental project with very little code, a lightweight framework.
In 2010, UC Berkeley officially open-sourced th ...
Posted by Pozor on Wed, 11 Sep 2019 19:29:10 -0700
Structured Streaming Simple Data Processing - Read CSV and extract column keywords
Preface
Recently, when I searched Baidu to learn Spark's newer Structured Streaming, every result was the same monotonous wordcount example, which left me rather speechless. You have to figure out for yourself what you can do with the select and filter operations of the DataFrame. Because I use Python and am used to Pandas, I tried converting to Pandas for processing, but readStream doe ...
Posted by bigphpn00b on Wed, 11 Sep 2019 16:56:03 -0700
Apache Flink from Scratch: Getting Started with Flink Data Stream Programming
Data sources can be created with StreamExecutionEnvironment.addSource(sourceFunction). Flink also provides some built-in data sources for convenience, such as readTextFile(path) and readFile(). You can of course also write a custom data source by implementing the SourceFunction interface, although such a source cannot run in parallel. Or impl ...
Posted by McInfo on Tue, 10 Sep 2019 02:57:29 -0700
Integrating Spark with Kafka and manually maintaining offsets
Two Modes of Spark-Kafka Integration
In development, we often use Spark Streaming to read and process data from Kafka in real time. Since Spark Streaming 1.3, KafkaUtils has provided two methods for creating a DStream:
Receiver-based: KafkaUtils.c ...
Posted by zilem on Wed, 04 Sep 2019 20:35:09 -0700
Beautiful HTML5 web page effects, learning notes: a flame that follows the mouse, built with canvas
Effect:
Lifelike flames follow the mouse and sparks appear to illuminate the background text.
Drawing with canvas
JavaScript is used, but there is no complex logic. Difficulty: easy
Welcome to my blog to read this article: https://clatterrr.com/archive...
Source code:
Demonstration address: https://clatterrr.github.io/f...
The sou ...
Posted by rich___ on Tue, 27 Aug 2019 03:44:52 -0700
Common operations of ArrayList and LinkedList
This article covers the following:
1. Common operations of ArrayList
2. A Brief Introduction to Random Numbers
3. Custom equality comparison in Java
4. A Brief Introduction to Iterators
5. The relationship between Iterable and foreac ...
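The topics listed above can be illustrated with a minimal sketch. It is not the article's own code: the class name ListOpsSketch and the helpers dropEvens and sum are hypothetical names chosen for illustration, and only standard java.util classes are used. It shows a few common ArrayList and LinkedList operations, removal through an explicit Iterator, and a for-each loop over any Iterable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
public class ListOpsSketch {

    // Returns a copy of the list without even numbers. Removal goes through
    // the Iterator, because calling list.remove(...) inside a for-each loop
    // can throw ConcurrentModificationException.
    public static List<Integer> dropEvens(List<Integer> source) {
        List<Integer> copy = new ArrayList<>(source);
        Iterator<Integer> it = copy.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return copy;
    }

    // for-each works on anything that implements Iterable,
    // so this accepts ArrayList and LinkedList alike.
    public static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        arrayList.add(5);                      // append at the end
        arrayList.set(0, 10);                  // replace the element at index 0
        arrayList.remove(Integer.valueOf(3));  // remove by value, not by index

        LinkedList<Integer> linked = new LinkedList<>(arrayList);
        linked.addFirst(0);                    // insert at the head
        linked.removeLast();                   // drop the tail element

        System.out.println(arrayList);             // [10, 2, 4, 5]
        System.out.println(linked);                // [0, 10, 2, 4]
        System.out.println(dropEvens(arrayList));  // [5]
        System.out.println(sum(linked));           // 16
    }
}
```

Note that remove(Integer.valueOf(3)) removes the element equal to 3, whereas remove(3) on a List<Integer> would remove the element at index 3.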
Posted by Altec on Tue, 20 Aug 2019 04:58:28 -0700