Scala functional programming (4): functional data structures, part 2

Keywords: Java Scala Programming Spark

Previously in this series

Guide to functional programming in Scala

Scala functional programming (2) introduction to Scala basic syntax

Scala functional programming (3) Scala sets and functions

Scala functional programming (4) functional data structure

1. List code analysis

Today's post mainly supplements the Scala functional data structures introduced in the previous article, and it is mostly about code. If you haven't yet, take a look at the previous installment, Scala functional programming (4) functional data structure, which focuses on the functional List. I've put all of the code, including a functional List and a functional binary search tree, on my public account. Follow the public account "Hal's data castle" and reply "scala tree data structure" to get it directly (writing articles isn't easy, so a follow is appreciated).

In the previous article I mainly introduced the basics of List: defining the basic structure with the node Cons and the terminator Nil, and using an object List to define the basic List operations.

import scala.language.implicitConversions

// Define List as a trait, with Nil as the end marker and Cons as a node
sealed trait List[+A]

case object Nil extends List[Nothing]

case class Cons[+A](head: A, tail: List[A]) extends List[A] {
  override def toString: String = s"$head :: ${tail.toString}"
}


// Operations on List. An object is roughly what a class with only static members is in Java; its methods can be called directly
object List {

  def sum(ints: List[Int]): Int = ints match {
    case Nil         => 0
    case Cons(x, xs) => x + sum(xs)
  }

  def map[A, B](l: List[A], f: A => B): List[B] = l match {
    case Nil              => Nil
    case Cons(head, tail) => Cons(f(head), map(tail, f))
  }

  // Build a List from a variable number of arguments
  def apply[A](a: A*): List[A] =
    if (a.isEmpty) Nil
    else Cons(a.head, apply(a.tail: _*))

  def empty[A]: List[A] = Nil


  object ops {
    // Implicit conversion used to extend List with extra operations; how it is used is shown further down
    implicit def listOps[A](list: List[A]): ListOps[A] = new ListOps(list)
  }
}

The definitions of the nodes Cons and Nil are the same as in the previous section, except that Cons now also overrides toString.

In short, the object List defines the apply method, which builds a List, along with the sum and map methods mentioned in the previous section. If any of this is unclear, have a look at the previous section.

But in this case, when we want to call the sum method, we can only call it through the object List, like the following:

//Use the apply method in the object List to initialize and generate the List
scala> val numList = List(1,2,3,4)
numList: List[Int] = 1 :: 2 :: 3 :: 4 :: Nil

//Use sum method in object List
scala> List.sum(numList)
res0: Int = 10

However, this is not how we normally use a list day to day. We are more used to something like this:

//Use the apply method in the object List to initialize and generate the List
scala> val numList = List(1,2,3,4)
numList: List[Int] = 1 :: 2 :: 3 :: 4 :: Nil

//Call the method directly on numList
scala> numList.sum
res0: Int = 10

The more common style is to call methods on the List itself, as seen above. Normally you would add the method to Cons directly, but since Cons extends trait List[+A], the method has to be declared on the trait and then implemented in Nil as well, so you end up defining a pile of methods in Nil too. Is there any other way?
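For comparison, here is roughly what the direct approach looks like. This is only a sketch under my own assumptions: IntList, Empty and Cell are made-up names (not part of the article's code), and I fixed the element type to Int to keep sum simple.

```scala
object DirectMethodSketch {
  // Hypothetical variant: sum is declared on the trait itself,
  // so every subtype has to provide its own implementation.
  sealed trait IntList {
    def sum: Int
  }

  // The "Nil" case must implement the method too.
  case object Empty extends IntList {
    def sum: Int = 0
  }

  // The "Cons" case adds its head to the sum of its tail.
  final case class Cell(head: Int, tail: IntList) extends IntList {
    def sum: Int = head + tail.sum
  }

  // The method is now called on the value itself: 1 + 2 + 0
  val total: Int = Cell(1, Cell(2, Empty)).sum
}
```

Every new operation added this way means touching the trait and both cases, which is the cost the question above is about.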

Yes, there is: another piece of Scala syntactic sugar, the implicit conversion, which is what ops in the object List above is for.

  object ops {
    // Implicit conversion used to extend List with extra operations; how it is used is shown further down
    implicit def listOps[A](list: List[A]): ListOps[A] = new ListOps(list)
  }

An implicit conversion is defined with the keyword implicit. Implicits have other uses as well, but the usage here is the most common one.

What matters for an implicit conversion function is its parameter type and its return type. The function name (listOps in this case) is not important.
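To make that concrete, here is a tiny sketch of my own (Doubler and anyNameWorks are made-up names, not from the article's code) showing that the compiler picks the conversion by its types, not by its name:

```scala
import scala.language.implicitConversions

object ImplicitSketch {
  // Hypothetical wrapper that adds a method Int does not have.
  final class Doubler(n: Int) {
    def doubled: Int = n * 2
  }

  // The name is arbitrary; the compiler selects this conversion because
  // it takes an Int and returns something that has a `doubled` method.
  implicit def anyNameWorks(n: Int): Doubler = new Doubler(n)

  val three: Int = 3

  // Int has no `doubled`, so this is compiled as anyNameWorks(three).doubled
  val result: Int = three.doubled
}
```

Renaming anyNameWorks to anything else leaves the last line working unchanged.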

I won't explain implicit conversions in detail here; a quick search will turn up plenty of material. In short, an implicit conversion converts one type to another when needed. Here its job is to convert the List we defined into a ListOps under specific circumstances. The ListOps class is given below.

// Extension methods for List
private[list] final class ListOps[A](list: List[A]) {
  // Import the implicit conversion, because the methods below rely on it as well
  import List.ops._

  // Plain recursive implementation of foldRight, kept under its own name so it can be reused
  // Code reuse is an important feature of the functional style
  // (the append method is another example; it is not shown in this excerpt)
  def foldRightAsPrimary[B](z: B)(f: (A, B) => B): B = list match {
    case Nil              => z
    case Cons(head, tail) => f(head, tail.foldRightAsPrimary(z)(f))
  }

  // Delegates to foldRightViaFoldLeft, which is not shown in this excerpt
  def foldRight[B](z: B)(f: (A, B) => B): B = foldRightViaFoldLeft(z)(f)

  def map[B](f: A => B): List[B] = list match {
    case Nil              => Nil
    case Cons(head, tail) => Cons(f(head), tail.map(f))
  }

}

With this code in place, when we need map we can call it on the list directly instead of going through the object List, like this:

//Use the apply method in the object List to initialize and generate the List
scala> val numList = List(1,2,3,4)
numList: List[Int] = 1 :: 2 :: 3 :: 4 :: Nil

//Use the built-in method of numList instead of List.map(numList,function)
scala> numList.map(function)

When the compiler sees map called on a List, but List itself has no map method, the implicit conversion kicks in (as long as it is in scope, for example through import List.ops._) and converts the List to a ListOps. The map method of ListOps is then called and returns a List as the result. It takes a detour, but the caller never notices; it feels as though map belongs to List itself. Spark uses this kind of trick in many places.

As in the code above, we can now call numList.map(function) directly, as if List itself had a map method.
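Since the ListOps excerpt above leaves some helpers out, here is a self-contained sketch that puts the pieces together so it can be run as is. Note the assumptions: foldLeft, reverse and the body of foldRightViaFoldLeft are my own guesses at what the omitted code might look like, map is rewritten on top of foldRight to show the reuse, and everything is wrapped in a hypothetical object ListSketch so it does not depend on a package named list.

```scala
import scala.language.implicitConversions

object ListSketch {
  sealed trait List[+A]
  case object Nil extends List[Nothing]
  case class Cons[+A](head: A, tail: List[A]) extends List[A] {
    override def toString: String = s"$head :: ${tail.toString}"
  }

  object List {
    def apply[A](a: A*): List[A] =
      if (a.isEmpty) Nil else Cons(a.head, apply(a.tail: _*))

    def empty[A]: List[A] = Nil

    object ops {
      implicit def listOps[A](list: List[A]): ListOps[A] = new ListOps(list)
    }
  }

  final class ListOps[A](list: List[A]) {
    import List.ops._

    // Assumed helper: a tail-recursive left fold.
    def foldLeft[B](z: B)(f: (B, A) => B): B = {
      @annotation.tailrec
      def loop(rest: List[A], acc: B): B = rest match {
        case Nil        => acc
        case Cons(h, t) => loop(t, f(acc, h))
      }
      loop(list, z)
    }

    // Assumed helper: reverse by prepending each element to the accumulator.
    def reverse: List[A] =
      foldLeft(List.empty[A])((acc, h) => Cons(h, acc))

    // Assumed body: fold the reversed list from the left.
    def foldRightViaFoldLeft[B](z: B)(f: (A, B) => B): B =
      reverse.foldLeft(z)((acc, a) => f(a, acc))

    def foldRight[B](z: B)(f: (A, B) => B): B = foldRightViaFoldLeft(z)(f)

    // map expressed through foldRight, reusing the code above.
    def map[B](f: A => B): List[B] =
      foldRight(List.empty[B])((a, acc) => Cons(f(a), acc))
  }

  // Usage: the import brings the conversion into scope, after which
  // map and foldRight look like methods of List itself.
  def demo: (List[Int], Int) = {
    import List.ops._
    val nums = List(1, 2, 3, 4)
    (nums.map(_ * 2), nums.foldRight(0)(_ + _))
  }
}
```

Calling ListSketch.demo gives the doubled list (printed as 2 :: 4 :: 6 :: 8 :: Nil) and the sum 10. Without the import List.ops._ line the calls to map and foldRight would not compile, because the conversion lives in the nested ops object.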

2. Binary search tree

At the end of the last article, I left an unfinished data structure, the binary search tree, as an exercise. Let's go through it in this section.

In fact, if you understood the List above, the binary search tree poses no real difficulty.

A binary search tree is a tree, so it naturally has internal nodes and leaf nodes (the ends of branches). This time, unlike List, no implicit conversion is used, so we start by defining an abstract class rather than a trait, and then have the internal node and the leaf node extend it.

  // Abstract class for the binary tree
  sealed abstract class TreeMap[+A] extends AbstractMap[Int, A] {

    def add[B >: A](kv: (Int, B)): TreeMap[B] = ???
    def deleteMin: ((Int, A), TreeMap[A]) = ???
    def delete(key: Int): TreeMap[A] = ???
    def get(key: Int): Option[A] = ???
    def +[A1 >: A](kv: (Int, A1)): TreeMap[A1] = ???
    def -(k: Int): TreeMap[A] = ???
    override def toList: List[(Int, A)] = ???
    def iterator: Iterator[(Int, A)] = ???
  }

  // Leaf: the end of each branch; extends the abstract class above
  case class Leaf() extends TreeMap[Nothing]
  // Internal node: holds a key, a value and the left and right subtrees; extends the abstract class above
  case class Node[+A](key: Int, value: A,
                      left: TreeMap[A], right: TreeMap[A]) extends TreeMap[A]

The binary tree has the basic add and delete operations, and two operators are defined as well: + and - stand for adding and deleting respectively. By the way, the ??? here plays much the same role as pass in Python: it is a placeholder that lets the code compile before the body is written. Unlike pass, though, ??? throws a NotImplementedError if the method is actually called.
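A quick sketch of how ??? behaves (PlaceholderSketch and its method names are made up for this illustration):

```scala
object PlaceholderSketch {
  // Compiles because ??? has type Nothing, which conforms to Int.
  def notWrittenYet(x: Int): Int = ???

  // Calling the stub throws scala.NotImplementedError at runtime;
  // this helper returns true when that happens.
  def callFails: Boolean =
    try { notWrittenYet(1); false }
    catch { case _: NotImplementedError => true }
}
```

So a class full of ??? bodies type-checks, which is handy for laying out an interface first, but each stub fails as soon as it is called.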

What remains is mainly to fill in the code left blank in the binary tree. Anyone familiar with tree structures knows that recursion is in a tree's genes, so doing this recursively comes naturally. Before writing it, though, note that functional programming generally avoids mutable variables (var) and mutable data structures (such as ListBuffer).
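As a small illustration of that difference, here is the same function written both ways. This is my own example, using the standard library List rather than the List defined earlier, and the names are hypothetical.

```scala
import scala.collection.mutable.ListBuffer

object ImmutableSketch {
  // Imperative style: a var counter and a mutable buffer.
  def squaresMutable(n: Int): List[Int] = {
    val buf = ListBuffer.empty[Int]
    var i = 1
    while (i <= n) {
      buf += i * i
      i += 1
    }
    buf.toList
  }

  // Functional style: no var and no mutable collection;
  // recursion builds a new list instead of updating one in place.
  def squaresImmutable(n: Int): List[Int] = {
    def loop(i: Int): List[Int] =
      if (i > n) Nil else (i * i) :: loop(i + 1)
    loop(1)
  }
}
```

Both return the same result for the same n; the second version is the style the tree code follows.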

The implementation is hard to describe in words; it comes down to recursion plus Scala pattern matching. When the recursion reaches a leaf, it stops there and returns. Let's go straight to the code. Here we mainly look at the add method; the other methods are broadly similar:

  sealed abstract class TreeMap[+A] extends AbstractMap[Int, A] {
    ......
    // Recurse with pattern matching to find the right position and insert the data
    def add[B >: A](kv: (Int, B)): TreeMap[B] = {

      val (key, value) = kv
      // `this` is the current tree: either an internal node or a leaf
      this match {
        case Node(nodeKey, nodeValue, left, right) => {
          // Recurse according to the binary search tree rule
          if (nodeKey > key)
            Node(nodeKey, nodeValue, left.add((key, value)), right)
          else if (nodeKey < key)
            Node(nodeKey, nodeValue, left, right.add((key, value)))
          else
            // Same key: keep the position, replace the value
            Node(nodeKey, value, left, right)
        }
        // At a leaf, return a new node holding the key and value, with two leaves below it
        case Leaf() => {
          Node(key, value, Leaf(), Leaf())
        }
      }
    }

    ......
  }

Following the binary search tree rule, a new key larger than the node's key is inserted to the right, and a smaller one to the left; if the keys are equal, the node's value is replaced. The base case is the leaf: when a leaf is reached, a new node is returned in its place. That completes the insertion. Deleting and searching follow the same idea.
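To show that searching really does follow the same idea, here is a sketch of get next to add. To keep it runnable on its own I use a simplified Tree that does not extend AbstractMap; Tree and SearchSketch are my own names, and this get is one possible way to fill in the ??? from earlier, not necessarily what the full source does.

```scala
object SearchSketch {
  sealed abstract class Tree[+A] {
    // Same logic as the add method above.
    def add[B >: A](kv: (Int, B)): Tree[B] = {
      val (key, value) = kv
      this match {
        case Node(k, v, l, r) =>
          if (k > key) Node(k, v, l.add(kv), r)
          else if (k < key) Node(k, v, l, r.add(kv))
          else Node(k, value, l, r)
        case Leaf() => Node(key, value, Leaf(), Leaf())
      }
    }

    // Search: go left or right by comparing keys; a leaf means "not found".
    def get(key: Int): Option[A] = this match {
      case Node(k, v, l, r) =>
        if (k > key) l.get(key)
        else if (k < key) r.get(key)
        else Some(v)
      case Leaf() => None
    }
  }

  case class Leaf() extends Tree[Nothing]
  case class Node[+A](key: Int, value: A,
                      left: Tree[A], right: Tree[A]) extends Tree[A]

  // Each add returns a new tree; the previous one is left untouched.
  val tree: Tree[String] =
    Leaf().add((5, "five")).add((2, "two")).add((8, "eight"))
}
```

With this tree, tree.get(2) returns Some("two"), and tree.get(7) returns None because the search runs into a leaf.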

The operator methods, such as +, simply call the add method above, that is, they reuse it directly. Next, look at the object TreeMap.

  object TreeMap {

    def empty[A]: TreeMap[A] = Leaf()

    def apply[A](kvs: (Int, A)*): TreeMap[A] = {
      kvs.toSeq.foldLeft(empty[A])(_ + _)
    }
  }

This object does two main things: it creates an empty tree (a Leaf), and it initializes a tree from key/value pairs (note that this is the apply method). Like List, it takes a variable number of arguments. The difference is that instead of recursing, it turns the arguments into a sequence and uses foldLeft to add them one by one, which builds the initial tree.
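Here is a small runnable sketch of that construction, again on a simplified Tree of my own that does not extend AbstractMap. The in-order toList is an assumption about how the ??? could be filled in, added only so the result can be inspected.

```scala
object BuildSketch {
  sealed abstract class Tree[+A] {
    def add[B >: A](kv: (Int, B)): Tree[B] = this match {
      case Node(k, v, l, r) =>
        if (k > kv._1) Node(k, v, l.add(kv), r)
        else if (k < kv._1) Node(k, v, l, r.add(kv))
        else Node(k, kv._2, l, r)
      case Leaf() => Node(kv._1, kv._2, Leaf(), Leaf())
    }

    // The operator simply reuses add.
    def +[B >: A](kv: (Int, B)): Tree[B] = add(kv)

    // In-order traversal: left subtree, this node, right subtree,
    // which yields the entries sorted by key.
    def toList: List[(Int, A)] = this match {
      case Node(k, v, l, r) => l.toList ::: ((k, v) :: r.toList)
      case Leaf()           => Nil
    }
  }

  case class Leaf() extends Tree[Nothing]
  case class Node[+A](key: Int, value: A,
                      left: Tree[A], right: Tree[A]) extends Tree[A]

  object Tree {
    def empty[A]: Tree[A] = Leaf()

    // Start from the empty tree and add the pairs one by one.
    def apply[A](kvs: (Int, A)*): Tree[A] =
      kvs.foldLeft(empty[A])(_ + _)
  }

  // The second entry for key 2 replaces the first one.
  val sorted: List[(Int, String)] =
    Tree((5, "e"), (2, "b"), (8, "h"), (2, "B")).toList
}
```

Here sorted comes out as List((2,"B"), (5,"e"), (8,"h")): the keys are in order and the later value for key 2 wins.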

OK, that's it. Finally, I hope you'll try writing the tree code yourself and then check it with test cases. Programming skill is built one step at a time.

3. Summary

This is the end of the functional data structure chapter. I hope that by now you understand how functional data structures differ from the implementations of data structures we first learned, and why we spend so much effort implementing them in a functional way!!

Many Scala tutorials start from here: in short, Scala's default data structures are immutable, and if you want mutable ones you do such-and-such, blah blah. That makes it easy to end up knowing the what without knowing the why.

At the same time, I've always felt that syntax is the most superficial part of learning a language. To really learn a language in depth, you need to gradually understand the trade-offs made in its design, and even its design philosophy, such as Python's minimalism.

In the process of digging into these things, syntax gets mastered naturally, including the more obscure parts such as implicit conversion. Now we know that implicit conversion is used in this way, and that it shows up all over Spark!!!

The next chapter will introduce error handling in Scala, again in a functional way. For example, try {} catch {} in Java is certainly not functional. How does Scala do it? That's for the next chapter :)

If you have any questions, please leave a message.

That's all~

Posted by stev979 on Thu, 19 Dec 2019 03:45:09 -0800