Spark implements a slightly more complex business scenario by customizing InputFormat to read HDFS files

Link to the original text: https://www.oipapio.com/cn/article-2689341 Business scenario: Spark decides how to read files according to the InputFormat. By default, it reads files one line at a time. In some specific cases, Spark's default Inpu ...

Posted by garyb_44 on Sun, 06 Oct 2019 17:05:10 -0700

Spark Learning 02 - A Method to Create DStream

Spark Streaming provides two categories of built-in streaming sources. Basic sources: sources directly available in the StreamingContext API, for example file systems and socket connections. Advanced sources: Kafka, Flume, Kinesis and other sources that are available through additional utility classes. ...

Posted by leebo on Thu, 03 Oct 2019 10:21:03 -0700

Spark: saving output to an existing directory

Link to the original text: https://my.oschina.net/asparagus/blog/699814 Spark saves the output data to an existing directory. Saving with the RDD's saveAsTextFile method: String input =  ...

Posted by webguync on Thu, 03 Oct 2019 07:17:07 -0700

scala_Akka Concurrent Programming Framework

Article Directory: Introduction to Akka Concurrent Programming Framework; Introduction to Akka; Akka characteristics; Akka communication process; Create Actor; Introduction to API; Getting Started Cases; Implementation Steps: 1. Create Maven modules 2. Create and load Actors 3. Send/receive messages; Akka ...

Posted by Stonewall on Fri, 20 Sep 2019 20:24:53 -0700

Big Data Series: Spark's Initial Knowledge of Learning Notes

1. Introduction to Spark: Spark was born in 2009 at the AMPLab at the University of California, Berkeley. It started as an experimental project with very little code, a lightweight framework. In 2010, the University of California, Berkeley officially opened up th ...

Posted by Pozor on Wed, 11 Sep 2019 19:29:10 -0700

Structured Streaming Simple Data Processing - Read CSV and extract column keywords

Preface: Recently, when I searched Baidu to learn Spark's newer Structured Streaming, every example I found was a monotonous wordcount, which left me rather speechless. You have to figure out for yourself what you can do with the select and filter operations of the DataFrame. Because I was using Python and Pandas, and tried to hand the data to Pandas for processing, readStream doe ...

Posted by bigphpn00b on Wed, 11 Sep 2019 16:56:03 -0700

Apache Flink Zero Foundation Initial Flink Data Stream Programming

Data sources can be created with StreamExecutionEnvironment.addSource(sourceFunction). Flink also provides some built-in data sources for convenience, such as readTextFile(path) and readFile(), and of course you can also write a custom data source by implementing the SourceFunction interface (a source written this way cannot run in parallel). Or impl ...

Posted by McInfo on Tue, 10 Sep 2019 02:57:29 -0700

Spark integrates Kafka and manually maintains offset

Two modes of integrating Spark with Kafka: In development, we often use Spark Streaming to read and process data from Kafka in real time. Since version 1.3 of Spark Streaming, KafkaUtils provides two ways to create a DStream. Receiver-based: KafkaUtils.c ...

Posted by zilem on Wed, 04 Sep 2019 20:35:09 -0700

Beautiful HTML 5 Web page special effects learning notes _canvas to achieve flame following mouse

Effect: lifelike flames follow the mouse, and sparks light up the background text. It is drawn with canvas and JavaScript, but there is no complex logic. Difficulty: simple. Welcome to my blog to read this article: https://clatterrr.com/archive... Source code: Demo address: https://clatterrr.github.io/f... The sou ...

Posted by rich___ on Tue, 27 Aug 2019 03:44:52 -0700

Common operations of ArrayList and LinkedList

The main contents of this article are as follows: 1. Common operations of ArrayList 2. A brief introduction to random numbers 3. Custom equality comparison in Java 4. A brief introduction to iterators 5. The relationship between Iterable and foreac ...

Posted by Altec on Tue, 20 Aug 2019 04:58:28 -0700