Big data development review: Scala

Keywords: Scala


10.1 Introduction to Scala

Scala is a multi-paradigm programming language that runs on the JVM. It supports both object-oriented and functional programming.

10.2 Scala interpreter

To start the Scala interpreter, follow these steps:

  • Press the Windows key + R
  • Enter scala

  • To exit the interpreter, run :quit at the scala> prompt

10.3 Basic syntax of Scala

10.3.1 Declaring variables

In Scala, variables are defined with val or var. The syntax format is as follows:

val/var Variable name: Variable type = Initial value


  • val defines a variable that cannot be reassigned
  • var defines a variable that can be reassigned


  • In Scala, the variable type is written after the variable name
  • Scala statements do not need a semicolon at the end

Question: what is the difference between a variable declared with val and one declared with var?
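To make the question concrete, here is a minimal sketch that contrasts the two. The object name ValVarDemo and the values are invented for illustration.

```scala
object ValVarDemo {
  // val: cannot be reassigned after initialization
  val name: String = "tom"
  // name = "jim"   // would not compile: reassignment to val

  // var: can be reassigned
  var age: Int = 20

  def main(args: Array[String]): Unit = {
    age = 21                    // allowed because age is a var
    println(s"$name is $age")
  }
}
```

A common habit is to start with val and switch to var only when reassignment is really needed.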

10.3.2 Strings

Scala provides several ways to define strings; choose whichever is most convenient for the situation.

  • Use double quotation marks

    val/var Variable name = "String"
  • Use an interpolation expression

    val/var Variable name = s"${variable/expression}String"
  • Use triple quotation marks

    val/var Variable name = """String 1
     String 2"""
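The three forms can be sketched as follows. The names and the SQL text are made up for the example, and stripMargin is a standard-library helper that the text above does not mention.

```scala
object StringDemo {
  // Plain double-quoted string
  val name = "zhangsan"
  val age = 20

  // Interpolation: s"..." evaluates ${...} and substitutes the result
  val info = s"name=${name}, age next year=${age + 1}"

  // Triple quotes keep line breaks; stripMargin removes everything
  // up to and including the leading '|' on each line
  val sql =
    """select *
      |from t_user
      |where name = 'zhangsan'""".stripMargin

  def main(args: Array[String]): Unit = {
    println(info)
    println(sql)
  }
}
```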

10.3.3 Data types

Basic type    Description
Byte          8-bit signed integer
Short         16-bit signed integer
Int           32-bit signed integer
Long          64-bit signed integer
Char          16-bit unsigned Unicode character
String        Sequence of Char values (string)
Float         32-bit single-precision floating-point number
Double        64-bit double-precision floating-point number
Boolean       true or false

Note the differences between Scala types and Java types:


  1. All types in Scala begin with an uppercase letter
  2. Integers use Int rather than Integer
  3. The type can be omitted when defining a variable; the Scala compiler infers it automatically

Scala type hierarchy

10.3.4 Expressions

Conditional expression

A conditional expression is an if expression. It chooses which operation to perform depending on whether the given condition evaluates to true or false.

Scala conditional expressions have the same syntax as in Java, with the following differences:


  • In Scala, a conditional expression has a return value
  • Scala has no ternary expression; use an if expression instead

Block expression

  • In Scala, {} denotes a block expression
  • Like if expressions, block expressions have values
  • The value of a block is the value of its last expression


What is the value of variable a in the following code?

scala> val a = {
     | println("1 + 1")
     | 1 + 1
     | }
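The following sketch shows both kinds of expression used as values. The variable names sex and code are invented for illustration; the block is the one from the question above.

```scala
object ExpressionDemo {
  val sex = "male"

  // if is an expression, so it can stand in for Java's ternary operator
  val code = if (sex == "male") 1 else 0

  // The block prints first, then evaluates to its last expression (1 + 1)
  val a = {
    println("1 + 1")
    1 + 1
  }

  def main(args: Array[String]): Unit =
    println(s"code=$code, a=$a")
}
```

The println inside the block is only a side effect; the value bound to a is the Int produced by the final line.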

10.3.5 Loops

In Scala you can use both for and while, but for expressions are generally recommended because their syntax is more concise.

for loop


for(i <- expression/array/collection) {
    // expression
}

Nested for loop

Use a for expression to print the following characters

*****
*****
*****

  1. Use a for expression to print 3 rows and 5 columns of stars
  2. Start a new line after every 5 stars

Reference code

for(i <- 1 to 3; j <- 1 to 5) {print("*");if(j == 5) println("")}
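As a variation on the reference code, the sketch below builds the grid as a string instead of printing directly, which makes the result easy to check. The object name ForDemo and the method stars are hypothetical.

```scala
object ForDemo {
  // Two generators in one for expression behave like a nested loop:
  // j runs through all columns for each row i
  def stars(rows: Int, cols: Int): String = {
    val sb = new StringBuilder
    for (i <- 1 to rows; j <- 1 to cols) {
      sb.append("*")
      if (j == cols) sb.append("\n")   // wrap after the last column
    }
    sb.toString
  }

  def main(args: Array[String]): Unit = {
    // Iterating over an array works the same way as over a range
    for (word <- Array("hadoop", "spark")) println(word)
    print(stars(3, 5))
  }
}
```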

while loop

The while loop in Scala works the same way as in Java.


Print the numbers 1 to 10

Reference code

scala> var i = 1
i: Int = 1

scala> while(i <= 10) {
     | println(i)
     | i = i+1
     | }

10.3.6 Methods


def methodName(Parameter name: Parameter type, Parameter name: Parameter type): [return type] = {
    // Method body: a series of statements
}


  • The parameter types in the parameter list cannot be omitted
  • The return type can be omitted and is then inferred by the Scala compiler
  • return may be omitted; by default the method returns the value of the {} block expression


  1. Define a method that adds two integer values and returns the result
  2. Call this method

Reference code

scala> def add(a:Int, b:Int) = a + b
add: (a: Int, b: Int)Int

scala> add(1,2)
res10: Int = 3

Method parameters

Method parameters in Scala are flexible. The following kinds are supported:

  • Default parameters
  • Named parameters
  • Variable-length parameters

Default parameters

When defining a method, you can give a parameter a default value.


  1. Define a method that adds two values, each defaulting to 0
  2. Call this method without passing any arguments

Reference code

// x and y both default to 0
def add(x:Int = 0, y:Int = 0) = x + y
add()

Named parameters

When calling a method, you can pass an argument by specifying the name of its parameter.


  1. Define a method that adds two values, each defaulting to 0
  2. Call this method, setting only the value of the first parameter

Reference code

def add(x:Int = 0, y:Int = 0) = x + y
add(x = 1)

Variable-length parameters

If the number of arguments a method takes is not fixed, declare the parameter as a variable-length parameter.

Syntax format:

def Method name(Parameter name: Parameter type*): return type = {
    Method body
}


A * after the parameter type indicates that zero or more arguments may be passed


  1. Define a method that adds an arbitrary number of values
  2. Call the method and pass in the following data: 1,2,3,4,5

Reference code

scala> def add(num:Int*) = num.sum
add: (num: Int*)Int

scala> add(1,2,3,4,5)
res1: Int = 15
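The three kinds of parameters can also be combined. The sketch below is illustrative only: the object name ParamDemo and the method name sum are invented, and the `: _*` expansion is standard Scala syntax that the text above does not cover.

```scala
object ParamDemo {
  // Both parameters default to 0
  def add(x: Int = 0, y: Int = 0): Int = x + y

  // Variable-length parameter: zero or more Int values
  def sum(nums: Int*): Int = nums.sum

  def main(args: Array[String]): Unit = {
    println(add(y = 5))          // only y is set, x keeps its default
    println(add(y = 5, x = 2))   // named arguments may appear in any order
    println(sum())               // no arguments at all
    val data = Seq(1, 2, 3, 4, 5)
    println(sum(data: _*))       // expand an existing sequence into varargs
  }
}
```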

10.3.7 Functions


val Function variable name = (Parameter name: Parameter type, Parameter name: Parameter type, ...) => Function body


  • A function is an object (it can be assigned to a variable)
  • Like methods, functions have input parameters and a return value
  • Functions are not defined with def
  • The return type does not need to be specified


  1. Define a function that adds two values
  2. Call this function

Reference code

scala> val add = (x:Int, y:Int) => x + y
add: (Int, Int) => Int = <function2>

scala> add(1,2)
res3: Int = 3
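Because a function is a value, it can be stored and passed to other methods. The sketch below also shows a method being used where a function is expected; the names FunctionDemo, double and doubleFn are hypothetical.

```scala
object FunctionDemo {
  // A function: a value of type (Int, Int) => Int
  val add = (x: Int, y: Int) => x + y

  // A method, defined with def
  def double(x: Int): Int = x * 2

  // With an expected function type, the compiler converts the method
  // into a function value
  val doubleFn: Int => Int = double

  def main(args: Array[String]): Unit = {
    println(add(1, 2))
    println(List(1, 2, 3).map(doubleFn))
  }
}
```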

10.4 Data structures

10.4.1 Arrays

Fixed-length array

  • A fixed-length array is one whose length cannot be changed
  • The elements of the array can be changed


// Define an array by specifying its length
val/var Variable name = new Array[Element type](Array length)

// Initialize an array directly with elements
val/var Variable name = Array(Element 1, Element 2, Element 3...)


  • In Scala, the generic type of an array is specified with []
  • Use () to access an element

Variable-length array

To create a variable-length array, first import the ArrayBuffer class:

import scala.collection.mutable.ArrayBuffer


  • Create an empty variable-length array. Syntax format:

    val/var a = ArrayBuffer[Element type]()
  • Create an ArrayBuffer with initial elements

    val/var a = ArrayBuffer(Element 1, Element 2, Element 3....)

Array operations

  • Use += to add an element
  • Use -= to delete an element
  • Use ++= to append an array to a variable-length array
  • Use for to traverse the array
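The operations above can be sketched as follows. The element values and the names ArrayDemo, fixed and growing are made up for the example.

```scala
import scala.collection.mutable.ArrayBuffer

object ArrayDemo {
  // Fixed-length array: the length is fixed, the elements can change
  def fixed(): Array[Int] = {
    val arr = new Array[Int](3)   // three elements, all initially 0
    arr(0) = 110                  // () reads or updates an element
    arr
  }

  // Variable-length array with the operators listed above
  def growing(): ArrayBuffer[String] = {
    val buf = ArrayBuffer("hadoop", "spark", "flink")
    buf += "flume"                   // add one element
    buf -= "hadoop"                  // delete one element
    buf ++= Array("hive", "sqoop")   // append a whole array
    buf
  }

  def main(args: Array[String]): Unit = {
    println(fixed().mkString(","))
    for (s <- growing()) println(s)  // traverse with for
  }
}
```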

10.4.2 Tuples

A tuple holds a group of values of different types, for example name, age, gender, and date of birth. The elements of a tuple are immutable.


Use parentheses to define a tuple

val/var tuple = (Element 1, Element 2, Element 3....)

Use an arrow to define a tuple (only for tuples with exactly two elements)

val/var tuple = Element 1->Element 2


Define a tuple that contains the following data for a student

id    full name    age    address
1     zhangsan     20     beijing

Reference code

scala> val a = (1, "zhangsan", 20, "beijing")
a: (Int, String, Int, String) = (1,zhangsan,20,beijing)

Accessing a tuple

Use _1, _2, _3, ... to access the elements of a tuple. _1 accesses the first element, and so on.


  • Define a tuple containing a student's name and gender: "zhangsan", "male"
  • Get the student's name and gender separately

Reference code

scala> val a = "zhangsan" -> "male"
a: (String, String) = (zhangsan,male)

// Get the first element
scala> a._1
res41: String = zhangsan

// Get the second element
scala> a._2
res42: String = male
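Besides positional access, a tuple can be unpacked into separate variables in one step. This is a small sketch; the object name TupleDemo and the variable names are invented.

```scala
object TupleDemo {
  val student = (1, "zhangsan", 20, "beijing")

  // Pattern-style assignment unpacks all elements at once
  val (id, name, age, address) = student

  def main(args: Array[String]): Unit = {
    println(student._2)                    // positional access, 1-based
    println(s"$id $name $age $address")
  }
}
```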

10.4.3 Lists

Immutable list


Use List(Element 1, Element 2, Element 3, ...) to create an immutable list. Syntax format:

val/var Variable name = List(Element 1, Element 2, Element 3...)

Use Nil to create an empty immutable list

val/var Variable name = Nil

Use the :: method to create an immutable list

val/var Variable name = Element 1 :: Element 2 :: Nil


When building a list with ::, the chain must end with Nil

Mutable list

A mutable list is one whose elements and length can both change.

To use a mutable list, first import scala.collection.mutable.ListBuffer


  • Mutable collections are all in the scala.collection.mutable package
  • Immutable collections are all in the scala.collection.immutable package (imported by default)


Create an empty mutable list with ListBuffer[Element type](). Syntax format:

val/var Variable name = ListBuffer[Element type]()

Use ListBuffer(Element 1, Element 2, Element 3...) to create a mutable list with initial elements. Syntax format:

val/var Variable name = ListBuffer(Element 1, Element 2, Element 3...)

Mutable list operations

  • Get an element (using parentheses with the index)
  • Add an element (+=)
  • Append a list (++=)
  • Change an element (access it with parentheses and assign a new value)
  • Delete an element (-=)
  • Convert to List (toList)
  • Convert to Array (toArray)
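A short sketch of the list operations described above. The numbers and the names ListDemo and build are chosen only for illustration.

```scala
import scala.collection.mutable.ListBuffer

object ListDemo {
  // Immutable list built with :: ; Nil terminates the chain
  val immutable = -2 :: -1 :: Nil

  def build(): ListBuffer[Int] = {
    val buf = ListBuffer(1, 2, 3)
    buf += 4                 // add an element
    buf ++= List(5, 6, 7)    // append a list
    buf(0) = 10              // change an element by index
    buf -= 7                 // delete an element
    buf
  }

  def main(args: Array[String]): Unit = {
    val buf = build()
    println(buf(0))                      // get an element by index
    println(buf.toList)                  // convert to an immutable List
    println(buf.toArray.mkString(","))   // convert to an Array
  }
}
```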


10.4.4 Set

A Set is a collection with no duplicate elements. A Set has the following properties:

  1. Elements are unique
  2. Insertion order is not guaranteed

Scala has two kinds of Set: immutable and mutable.

Immutable Set


Create an empty immutable Set. Syntax format:

val/var Variable name = Set[Type]()

Create an immutable Set from given elements. Syntax format:

val/var Variable name = Set(Element 1, Element 2, Element 3...)

Mutable Set


A mutable Set is created in the same way as an immutable Set, except that the mutable Set class must be imported first.

Manual import: import scala.collection.mutable.Set
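The difference between the two kinds of Set shows up when elements are added or removed. In this sketch the mutable Set is referenced by its full package name rather than imported, so that both kinds can appear side by side; the names are hypothetical.

```scala
object SetDemo {
  // Duplicates are dropped on creation
  val s = Set(1, 1, 3, 2, 4, 8)

  // Immutable: + and - return a new Set, the original is unchanged
  val removed = s - 1
  val added = s + 9

  def mutableDemo(): scala.collection.mutable.Set[Int] = {
    val m = scala.collection.mutable.Set(1, 2, 3)
    m += 4        // modifies the Set in place
    m -= 1
    m
  }

  def main(args: Array[String]): Unit = {
    println(s.size)
    println(removed.contains(1))
    println(mutableDemo())
  }
}
```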

10.4.5 Map

A Map, also called a mapping, is a collection of key-value pairs. In Scala, Map comes in immutable and mutable variants.

Immutable Map



val/var map = Map(key->value, key->value, key->value...)	// Recommended, more readable
val/var map = Map((key, value), (key, value), (key, value), (key, value)...)


  1. Define a Map that contains the following student names and ages

    "zhangsan", 30
    "lisi", 40
  2. Get zhangsan's age
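One possible solution to the exercise above is sketched here. The key "wangwu" and the fallback value -1 are invented to show what happens with a missing key.

```scala
object MapDemo {
  val ages = Map("zhangsan" -> 30, "lisi" -> 40)

  // map(key) returns the value and throws if the key is missing
  val zhangsanAge = ages("zhangsan")

  // getOrElse supplies a fallback for missing keys
  val unknownAge = ages.getOrElse("wangwu", -1)

  def main(args: Array[String]): Unit = {
    println(zhangsanAge)
    println(unknownAge)
  }
}
```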

Mutable Map


The definition syntax is the same as for the immutable Map, but you must first import scala.collection.mutable.Map manually.


  1. Define a Map that contains the following student names and ages

    "zhangsan", 30
    "lisi", 40
  2. Change zhangsan's age to 20

Reference code

scala> import scala.collection.mutable.Map

scala> val map = Map("zhangsan"->30, "lisi"->40)
map: scala.collection.mutable.Map[String,Int] = Map(lisi -> 40, zhangsan -> 30)

// Modify the value
scala> map("zhangsan") = 20

10.5 Functional programming

  • Traversal (foreach)
  • Mapping (map)
  • Flat mapping (flatMap)
  • Filtering (filter)
  • Existence check (exists)
  • Sorting (sorted, sortBy, sortWith)
  • Grouping (groupBy)
  • Aggregation (reduce)
  • Folding (fold)
10.6 Companion objects

When a class and an object have the same name, the object is called the companion object of the class, and the class is called the companion class of the object.

  • A companion object must have the same name as its companion class
  • A companion object and its companion class must be defined in the same Scala source file
  • A companion object and its companion class can access each other's private members

Reference code

object _11ObjectDemo {

  class CustomerService {
    def save() = {
      // SERVICE_NAME is private, but the companion class may read it
      println(s"${CustomerService.SERVICE_NAME}:Save customer")
    }
  }

  // Companion object of CustomerService
  object CustomerService {
    private val SERVICE_NAME = "CustomerService"
  }

  def main(args: Array[String]): Unit = {
    val customerService = new CustomerService()
    customerService.save()
  }
}

10.7 Case classes

A case class is a special kind of class for quickly defining classes that hold data (similar to a Java POJO).

Syntax format

case class Case class name(var/val Member variable name 1: Type 1, Member variable name 2: Type 2, Member variable name 3: Type 3)
  • To make a member variable modifiable, declare it with var
  • The default is val, which can be omitted


  • Define a Person case class that contains name and age member variables
  • Create an instance of the case class ("Zhang San", 20) and print it

Reference code

object _01CaseClassDemo {
  case class Person(name:String, age:Int)

  def main(args: Array[String]): Unit = {
    val zhangsan = Person("Zhang San", 20)
    println(zhangsan)
  }
}
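A case class also gets several generated members. The sketch below, using a hypothetical CaseClassFeatures object, shows the generated toString, value-based equality, the copy method, and a var member being modified.

```scala
object CaseClassFeatures {
  // name is a val by default; age is declared var so it can be modified
  case class Person(name: String, var age: Int)

  def main(args: Array[String]): Unit = {
    val a = Person("Zhang San", 20)
    val b = Person("Zhang San", 20)
    println(a)                      // generated toString
    println(a == b)                 // compares field values, not references
    val older = a.copy(age = 21)    // copy with one field changed
    a.age = 30                      // allowed because age is a var
    println(s"$older $a")
  }
}
```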


10.8 Case objects

Case objects are mainly used in two places:

  1. As a message without any parameters

Use case object to create a case object. A case object is a singleton and has no primary constructor.

Syntax format

case object Case object name
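As an illustration of the message use case, the sketch below defines two parameterless messages as case objects next to one message that carries data. All names here (Message, Start, Stop, Text, handle) are invented for the example.

```scala
object CaseObjectDemo {
  // Hypothetical message types
  sealed trait Message
  case object Start extends Message                // no parameters
  case object Stop extends Message                 // no parameters
  case class Text(body: String) extends Message    // carries data

  def handle(msg: Message): String = msg match {
    case Start      => "starting"
    case Stop       => "stopping"
    case Text(body) => s"received: $body"
  }

  def main(args: Array[String]): Unit = {
    println(handle(Start))
    println(handle(Text("hello")))
  }
}
```

Because a case object is a singleton, every reference to Start refers to the same instance.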

10.9 Pattern matching

10.9.1 Simple pattern matching

Syntax format

variable match {
    case "Constant 1" => Expression 1
    case "Constant 2" => Expression 2
    case "Constant 3" => Expression 3
    case _ => Expression 4		// Default case
}


Requirement description

  1. Read a word from the console (using the StdIn.readLine method)
  2. Check whether the word matches one of the following words; if it does, return the corresponding sentence
  3. Print the sentence

Word         Sentence
hadoop       Big data distributed storage and computing framework
zookeeper    Big data distributed coordination service framework
spark        Big data distributed memory computing framework

Reference code

import scala.io.StdIn

println("Please enter a word:")
// StdIn.readLine reads a line of text from the console
val name = StdIn.readLine()

val result = name match {
    case "hadoop" => "Big data distributed storage and computing framework"
    case "zookeeper" => "Big data distributed coordination service framework"
    case "spark" => "Big data distributed memory computing framework"
    case _ => "Unmatched"
}

println(result)


10.9.2 Matching case classes

Scala can use pattern matching on case classes, which makes it quick to extract the member data of a case class instance. We will use this later when developing the Akka case.


Requirement description

  • Create two case classes, Customer and Order
    • Customer contains name and age fields
    • Order contains an id field
  • Create an instance of each case class and declare both as type Any
  • Use pattern matching on the two objects and print their member variable values

Reference code

// 1. Create two case classes
case class Customer(name:String, age:Int)
case class Order(id:String)

def main(args: Array[String]): Unit = {
    // 2. Create case class instances and declare them as type Any
    val zhangsan:Any = Customer("Zhang San", 20)
    val order1:Any = Order("001")

    // 3. Use a match expression for pattern matching
    // and extract the member variables of the case class
    order1 match {
        case Customer(name, age) => println(s"Name: ${name} Age: ${age}")
        case Order(id1) => println(s"ID is: ${id1}")
        case _ => println("Unmatched")
    }
}

10.10 Higher-order functions

Higher-order functions cover:

  • Functions as values
  • Anonymous functions
  • Closures
  • Currying, and so on

Functions as values

Example description

Convert each element of an integer list into the corresponding number of stars

List(1, 2, 3...) => *, **, ***


  1. Create a function that converts a number into that many stars
  2. Create a list and call its map method
  3. Print the converted list

Reference code

val func = (num:Int) => "*" * num

println((1 to 10).map(func))

Anonymous functions

A function that is not assigned to a variable is an anonymous function.

Reference code

println((1 to 10).map(num => "*" * num))
// Because num is used only once and in a simple calculation, the parameter list can be omitted and _ used in place of the parameter
println((1 to 10).map("*" * _))


Closures

A closure is a function whose return value depends on variables declared outside the function.

Put simply, a closure is a function that can access variables that are not defined in its own scope.


Define a closure

val y=10

// add uses y, which is declared outside the function body
val add=(x:Int)=>{
    x+y
}

println(add(5)) // Result 15

The add function is a closure


Currying

Currying is the process of converting a method that takes several parameters in one parameter list into a method with multiple parameter lists.


Example description

  • Write a method that performs a calculation on two numbers
  • Pass the calculation itself in as a function
  • Use currying to implement the above

Reference code

// Currying: perform a calculation on two numbers
def calc_carried(x:Double, y:Double)(func_calc:(Double, Double)=>Double) = {
    func_calc(x, y)
}

def main(args: Array[String]): Unit = {
    println(calc_carried(10.1, 10.2){
        (x,y) => x + y
    })
    println(calc_carried(10, 10)(_ + _))
    println(calc_carried(10.1, 10.2)(_ * _))
    println(calc_carried(100.2, 10)(_ - _))
}

Posted by cuvaibhav on Thu, 18 Nov 2021 06:30:09 -0800