RDD common operations in pyspark

Preparation:
import pyspark
from pyspark import SparkContext
from pyspark import SparkConf
conf = SparkConf().setAppName("lg").setMaster("local[4]")  # local[4] runs Spark locally with 4 worker threads
sc = SparkContext.getOrCreate(conf)
1. parallelize and collect
The parallelize function converts the list obj ...

Posted by moomsdad on Fri, 21 Feb 2020 02:13:19 -0800

MasterSlave cluster deployment of ActiveMQ high availability solution (HA)

In the previous articles, we demonstrated how to build an ActiveMQ cluster using shared files and a shared database; see MasterSlave cluster deployment of ActiveMQ high availability solution (HA) (I). In this section, we demonstrate how to implement clustering with LevelDB + ZooKeeper. O ...

Posted by gjdunga on Fri, 14 Feb 2020 05:31:00 -0800

Scala learning day 1: Variables

Learning objectives: syntax; defining a variable in the interpreter; val and var variables; defining variables with type inference; lazy assignment. Syntax: in Java, a variable is defined as int a = 0; In Scala, you can use val or var to define variables. The syntax is as follows: ...

Posted by thor erik on Sun, 09 Feb 2020 02:40:44 -0800

maven common plug-ins

1. maven-compiler-plugin 1. It sets the JDK version that Maven uses when compiling and packaging. Maven is a Java build tool, so this plugin applies only to Java; Scala must be configured separately. 2. Usage 2.1. Configuring the plugin <plugin> ...

Posted by Rohan Shenoy on Sun, 09 Feb 2020 00:46:26 -0800

Find the number of adjacent words in a large amount of data

This problem is similar to some of the search problems on LeetCode. The task is to count how often each word occurs next to its neighboring words. Given the words w1, w2, w3, w4, w5, w6, the final output has the form (word, neighbor, frequency). We implement it in five ways: MapReduce, Spark, Spark SQL, Scala, and Spark SQL for Scala. MapReduce ...
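As a rough illustration of the (word, neighbor, frequency) output described above, here is a minimal single-machine sketch in plain Python. It is not one of the five implementations the post lists. The function name `neighbor_counts`, the `window` parameter (how many positions on each side count as "adjacent"), and the sample input are all assumptions made for this example.

```python
from collections import Counter


def neighbor_counts(words, window=1):
    """Count (word, neighbor) pairs that lie within `window` positions of each other.

    `window` is a hypothetical parameter: the excerpt does not say how far
    apart two words may be and still count as adjacent.
    """
    counts = Counter()
    for i, word in enumerate(words):
        start = max(0, i - window)
        end = min(len(words), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own neighbor
                counts[(word, words[j])] += 1
    return counts


if __name__ == "__main__":
    sample = ["w1", "w2", "w3", "w4", "w5", "w6"]  # placeholder input
    for (word, neighbor), freq in sorted(neighbor_counts(sample).items()):
        print((word, neighbor, freq))
```

A distributed version would emit the same (word, neighbor) keys from each record and sum the counts per key, which is the step a MapReduce or Spark job parallelizes.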

Posted by olechka on Sun, 02 Feb 2020 08:18:59 -0800

Spark SQL/DataFrame/DataSet operation ----- read data

1. Reading data sources (1) Read JSON with spark.read. Note: paths are resolved against HDFS by default. To read a local file, prefix the path with file://, as follows:
scala> val people = spark.read.format("json").load("file:///opt/software/data/people.json")
people: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
scal ...

Posted by Pie on Sun, 02 Feb 2020 08:18:33 -0800

Big data learning: Flink

Contents: 1: Introduction 2: Why Flink 3: Which industries need it 4: Features of Flink 5: Differences from Spark Streaming 6: Preliminary development 7: Flink configuration description 8: Environment 9: Running components. 1: Introduction Flink is a framework and distributed com ...

Posted by stodge on Fri, 17 Jan 2020 01:18:24 -0800

Scala functional programming under functional data structure

Previously: Guide to functional programming in Scala; Scala functional programming (2): introduction to basic Scala syntax; Scala functional programming (3): Scala collections and functions; Scala functional programming (4): functional data structures. 1. List code analysis. Today's content mainly supplements the Scala functional data st ...

Posted by stev979 on Thu, 19 Dec 2019 03:45:09 -0800

Real-time log analysis of IP access counts with Flume + Kafka + Spark Streaming + Redis + MySQL

I am a beginner; if you find mistakes, please point them out. Thank you! 1. Start ZooKeeper and Kafka, and create a topic named test fkss. To make it easier to observe, I added it through Kafka Manager. 2. Configure Flume and start it. The monitored file is /home/czh/docker-public-file/testplume.log, which is sent to Kafka a ...

Posted by ksduded on Sat, 14 Dec 2019 10:51:38 -0800

[4. Twirl template engine] 3. Common templates

Now let's look at the typical usage of templates.
Layout
Now declare a view/main.scala.html template as the main template:
@(title: String)(content: Html)
<!DOCTYPE html>
<html>
<head>
<title>@title</title>
</head>
<body>
<section class="content">@content</section>
...

Posted by Tryweryn on Sun, 08 Dec 2019 07:13:13 -0800