Learning R -- how to use the apply family of functions

This part includes:

1. What the apply family of functions is for

2. Members of the apply family and what each one does (8 functions, divided into 4 groups)

3. Specific usage

 

1. What the apply family is for: these functions let you avoid writing explicit loops. Explicit loops in R are verbose and can be slow, especially when a result is grown one element at a time. The apply family expresses the same iteration in a single call
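To make the comparison concrete, here is a minimal sketch of the same calculation written as a loop and as an apply-family call. The vector x and both result names are illustrative and not from the original post.

```r
# Squares of each element: explicit loop versus sapply()
x <- 1:5

# Loop version: pre-allocate the result, then fill it in by index
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# apply-family version: one call, no index bookkeeping
squares_sapply <- sapply(x, function(v) v^2)

identical(squares_loop, squares_sapply)  # TRUE
```

Both versions give the same values; the sapply() form simply hides the iteration.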

 

2. Members of the apply family

 

  (1) Perform group calculations - tapply() and apply()

1. tapply() function:

(1) Purpose: apply one function to each group of data, where the groups are defined by one or more factors, and return the results for all groups in one batch. Grouping by several factors is supported, but only one variable can be processed per call; several variables cannot be processed at the same time.

(2) Option parameters:

tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)

X: the input data to be split into groups by the factor(s). Only a single variable is supported, typically a vector

INDEX: the grouping basis for the data in X, either one factor (a vector) or several factors (a list), each the same length as X. With 2 grouping factors the result is a two-dimensional table (a matrix); with 3 or more factors the result is an array with one dimension per factor

FUN: the function applied to each group, that is, the calculation actually performed. It can be a named function, such as sum, mean or length, or an anonymous function

Anonymous function usage: function(i) i+2*5

default: the value given to cells that contain no data (combinations of factor levels that do not occur). It can be set to 0, NA (the default) or another value

simplify: controls the form of the result. The values are TRUE and FALSE. With TRUE, if FUN always returns a single value, the result is a simple array of those values (vector-like for one factor); with FALSE, the result is always an array of mode list.
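The parameters above can be tried on a small example. This is only a sketch: the vectors value, site, year and depth are hypothetical toy data, not the lake data used in the case that follows.

```r
# Hypothetical toy data: six measurements with three possible grouping factors
value <- c(10, 20, 30, 40, 50, 60)
site  <- factor(c("a", "a", "b", "b", "c", "c"))
year  <- factor(c("y1", "y2", "y1", "y2", "y1", "y1"))
depth <- factor(c("s", "d", "s", "d", "s", "d"))

# One grouping factor: a named one-dimensional array of group means
by_site <- tapply(value, site, mean)
by_site                                  # a = 15, b = 35, c = 55

# Two grouping factors: a site-by-year matrix of group sums.
# The (c, y2) cell has no data, so it takes the value of `default`.
by_site_year <- tapply(value, list(site, year), sum, default = 0)
by_site_year

# Three grouping factors: a three-dimensional array (3 x 2 x 2 here)
by_three <- tapply(value, list(site, year, depth), length, default = 0)
dim(by_three)

# An anonymous function in place of a named one: spread within each site
spread <- tapply(value, site, function(v) max(v) - min(v))
spread                                   # a = 10, b = 10, c = 10
```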

 

(3) Case: count the lakes in each water quality category for every first-class water resources area in the country

Input variable: lake name (counted to give the number of lakes)

Grouping factors: 2 in total, the first-class water resources area and the lake water quality category

① Read the input data from the clipboard

> data_t <- read.table("clipboard",header=TRUE,sep="\t")

  The entered data are as follows:

 

 

 

② Construct the INDEX factors. There are two in this case: the first-class water resources area and the water quality category. The column names here contain spaces, so they are wrapped in backticks (this assumes the names were kept as-is when the data were read, for example with check.names=FALSE)

> wrr1 <- factor(data_t$`Class I water resources area`,levels=c("Songhuajiang District","Liaohe District","Haihe District","Yellow River Region","Huaihe District","Yangtze River Region","Southeast Zhuhe District","Pearl River Region","Southwest Zhuhe District","Northwest Zhuhe District"))

> wql <- factor(data_t$`Water quality category`)

> wrr1
  [1] Songhuajiang District Songhuajiang District Songhuajiang District Songhuajiang District Songhuajiang District Songhuajiang District Songhuajiang District Songhuajiang District Liaohe District
 [10] Songhuajiang District Songhuajiang District Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District
 [19] Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District Haihe District
 ... (entries [28] to [270] omitted) ...
[271] Yangtze River Region Yangtze River Region Yangtze River Region Yangtze River Region Yangtze River Region Northwest Zhuhe District
Levels: Songhuajiang District Liaohe District Haihe District Yellow River Region Huaihe District Yangtze River Region Southeast Zhuhe District Pearl River Region Southwest Zhuhe District Northwest Zhuhe District


> wql
  [1] Ⅴ   inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ Ⅴ   inferiorⅤ Ⅲ   Ⅱ   Ⅲ   Ⅴ   Ⅱ   Ⅲ   Ⅲ   Ⅳ   Ⅱ   Ⅱ   Ⅳ   Ⅳ   Ⅲ   inferiorⅤ Ⅱ   Ⅳ   Ⅳ  
 [27] Ⅳ   Ⅱ   Ⅱ   Ⅱ   Ⅲ   Ⅱ   Ⅲ   Ⅱ   inferiorⅤ inferiorⅤ Ⅳ   Ⅲ   inferiorⅤ inferiorⅤ Ⅲ   Ⅱ   inferiorⅤ Ⅱ   Ⅱ   Ⅱ   Ⅳ   inferiorⅤ inferiorⅤ Ⅰ   Ⅱ   Ⅱ  
 [53] Ⅳ   inferiorⅤ Ⅱ   Ⅰ   Ⅳ   Ⅰ   Ⅲ   inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ Ⅳ   Ⅱ   Ⅴ   Ⅳ   Ⅲ   Ⅳ   Ⅳ   Ⅳ   Ⅳ   Ⅳ   Ⅳ   Ⅴ   Ⅳ   Ⅳ   Ⅳ  
 [79] Ⅳ   Ⅴ   Ⅲ   Ⅳ   Ⅳ   Ⅲ   Ⅳ   Ⅳ   Ⅴ   Ⅴ   Ⅳ   Ⅳ   inferiorⅤ Ⅳ   Ⅴ   Ⅴ   Ⅴ   inferiorⅤ Ⅳ   Ⅳ   Ⅴ   inferiorⅤ Ⅳ   Ⅴ   inferiorⅤ Ⅲ  
[105] Ⅳ   Ⅴ   Ⅴ   Ⅱ   Ⅳ   Ⅳ   Ⅴ   Ⅴ   Ⅲ   Ⅴ   Ⅱ   Ⅴ   Ⅴ   Ⅴ   Ⅴ   Ⅴ   inferiorⅤ Ⅲ   Ⅳ   Ⅱ   Ⅴ   Ⅲ   Ⅱ   inferiorⅤ Ⅳ   Ⅳ  
[131] Ⅳ   Ⅳ   Ⅳ   Ⅴ   Ⅴ   Ⅴ   Ⅴ   Ⅳ   inferiorⅤ Ⅲ   Ⅴ   Ⅴ   Ⅴ   Ⅴ   inferiorⅤ Ⅳ   Ⅳ   Ⅴ   Ⅴ   Ⅱ   Ⅲ   Ⅴ   Ⅳ   Ⅳ   Ⅲ   Ⅲ  
[157] Ⅳ   inferiorⅤ Ⅲ   Ⅴ   Ⅴ   inferiorⅤ Ⅲ   Ⅲ   Ⅴ   Ⅰ   Ⅳ   Ⅳ   inferiorⅤ Ⅴ   Ⅴ   Ⅱ   Ⅴ   Ⅴ   Ⅲ   Ⅲ   Ⅳ   Ⅳ   Ⅴ   Ⅲ   Ⅲ   Ⅰ  
[183] Ⅳ   Ⅳ   Ⅴ   Ⅴ   Ⅴ   Ⅳ   Ⅳ   Ⅴ   Ⅲ   Ⅳ   Ⅴ   Ⅴ   Ⅲ   Ⅰ   Ⅳ   Ⅲ   Ⅲ   inferiorⅤ Ⅴ   inferiorⅤ Ⅴ   Ⅳ   inferiorⅤ Ⅴ   inferiorⅤ Ⅴ  
[209] inferiorⅤ Ⅳ   Ⅴ   inferiorⅤ Ⅳ   Ⅳ   Ⅱ   Ⅱ   Ⅴ   Ⅱ   Ⅱ   Ⅲ   inferiorⅤ Ⅴ   inferiorⅤ inferiorⅤ Ⅳ   Ⅳ   Ⅱ   Ⅳ   Ⅳ   Ⅳ   Ⅲ   Ⅴ   inferiorⅤ Ⅴ  
[235] Ⅲ   Ⅴ   Ⅲ   Ⅲ   Ⅴ   Ⅳ   Ⅴ   Ⅴ   Ⅲ   inferiorⅤ Ⅲ   inferiorⅤ Ⅲ   Ⅳ   Ⅴ   Ⅴ   Ⅳ   Ⅳ   Ⅴ   Ⅱ   Ⅱ   inferiorⅤ Ⅱ   inferiorⅤ Ⅳ   Ⅲ  
[261] Ⅱ   inferiorⅤ Ⅳ   inferiorⅤ Ⅳ   Ⅴ   inferiorⅤ Ⅲ   inferiorⅤ Ⅳ   inferiorⅤ inferiorⅤ Ⅴ   Ⅲ   Ⅴ   inferiorⅤ
Levels: Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ inferiorⅤ

  

③ Use the tapply() function for the statistics: choose the counting function length as FUN, and set the value for empty cells to 0

> result <- tapply(data_t$`Lake name`,list(wrr1,wql),FUN=length,default=0)
> result
                         Ⅰ Ⅱ  Ⅲ  Ⅳ  Ⅴ inferiorⅤ
Songhuajiang District    0 1  1  0  2         6
Liaohe District          0 0  0  0  0         1
Haihe District           0 8  6  6  1         1
Yellow River Region      0 4  1  5  0         4
Huaihe District          0 0  3 19  8         2
Yangtze River Region     1 5 24 38 54        22
Southeast Zhuhe District 0 4  4  2  2         0
Pearl River Region       0 4  1  3  1         4
Southwest Zhuhe District 2 1  1  1  0         2
Northwest Zhuhe District 3 6  2  0  0        10

④ Add the row and column totals to obtain the final result table

> result <- cbind("total"=rowSums(result),result)
> result
                         total Ⅰ Ⅱ  Ⅲ  Ⅳ  Ⅴ inferiorⅤ
Songhuajiang District       10 0 1  1  0  2         6
Liaohe District              1 0 0  0  0  0         1
Haihe District              22 0 8  6  6  1         1
Yellow River Region         14 0 4  1  5  0         4
Huaihe District             32 0 0  3 19  8         2
Yangtze River Region       144 1 5 24 38 54        22
Southeast Zhuhe District    12 0 4  4  2  2         0
Pearl River Region          13 0 4  1  3  1         4
Southwest Zhuhe District     7 2 1  1  1  0         2
Northwest Zhuhe District    21 3 6  2  0  0        10
> result <- rbind(result,"whole country"=colSums(result))
> result
                         total Ⅰ  Ⅱ  Ⅲ  Ⅳ  Ⅴ inferiorⅤ
Songhuajiang District       10 0  1  1  0  2         6
Liaohe District              1 0  0  0  0  0         1
Haihe District              22 0  8  6  6  1         1
Yellow River Region         14 0  4  1  5  0         4
Huaihe District             32 0  0  3 19  8         2
Yangtze River Region       144 1  5 24 38 54        22
Southeast Zhuhe District    12 0  4  4  2  2         0
Pearl River Region          13 0  4  1  3  1         4
Southwest Zhuhe District     7 2  1  1  1  0         2
Northwest Zhuhe District    21 3  6  2  0  0        10
whole country              276 6 33 43 74 68        52
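Since the lake table itself is not reproduced in this post, the four steps can be replayed on a small stand-in. The following is a sketch only: the data frame lakes, its column names and all of its values are made up for illustration.

```r
# Hypothetical miniature of the lake table (names and values are made up)
lakes <- data.frame(
  region  = c("North", "North", "South", "South", "South", "East"),
  quality = c("II", "III", "II", "II", "V", "III"),
  lake    = paste0("lake_", 1:6)
)

# Fix the level order so rows and columns print in a chosen order
region  <- factor(lakes$region,  levels = c("North", "South", "East"))
quality <- factor(lakes$quality, levels = c("II", "III", "V"))

# Count lakes per region and quality class; empty cells become 0
counts <- tapply(lakes$lake, list(region, quality), FUN = length, default = 0)

# Add a row-total column first, then a column-total row
counts <- cbind(total = rowSums(counts), counts)
counts <- rbind(counts, all = colSums(counts))
counts
```

Because the total column is added before the total row, colSums() also sums the row totals, so the bottom-left cell is the overall count. For pure counting, table(region, quality) gives the same cell counts without the totals.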

  

2. apply() function:

(1) Purpose: apply a function in batch over a chosen dimension (rows, columns and so on) of a matrix or array.

(2) Option parameters:

 apply(X, MARGIN, FUN, ..., simplify = TRUE)

X: the input data, a matrix or array

MARGIN: the dimension(s) of X over which FUN is applied. 1 means rows, 2 means columns, c(1, 2) means rows and columns, and 3 or higher refers to further dimensions of an array

FUN: the function to apply. It can be a named function or an anonymous function

simplify: as in tapply(), controls whether the results are simplified to a vector, matrix or array where possible (TRUE) or returned as a list (FALSE)

(3) Examples

> x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
> dimnames(x)[[1]] <- letters[1:8]
> x
  x1 x2
a  3  4
b  3  3
c  3  2
d  3  1
e  3  2
f  3  3
g  3  4
h  3  5

> col.sums <- apply(x, 2, sum)
> col.sums
x1 x2 
24 24 

> row.sums <- apply(x, 1, sum)
> row.sums
a b c d e f g h 
7 6 5 4 5 6 7 8 
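The matrix example above only uses MARGIN values 1 and 2 with a named function. As a further sketch, with the array a and the matrix m invented for illustration, the same call also handles a third dimension, several margins at once, and an anonymous function:

```r
# A 2 x 3 x 2 array filled with 1..12
a <- array(1:12, dim = c(2, 3, 2))

# MARGIN = 3: one sum per slice along the third dimension
slice_sums <- apply(a, 3, sum)
slice_sums                       # 21 57

# MARGIN = c(1, 2): sum across the third dimension, keeping rows and columns
cell_sums <- apply(a, c(1, 2), sum)
cell_sums

# An anonymous function: max minus min for each row of a matrix
m <- matrix(c(1, 5, 2, 8, 3, 4), nrow = 2)
row_spread <- apply(m, 1, function(r) max(r) - min(r))
row_spread                       # 2 4
```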

 

(2) Apply a Function over a List or Vector (including lapply, sapply, vapply, rapply)

1. lapply() function:

(1) Purpose: apply a function to each element of a list or vector, doing the work of a loop. The result is always a list

(2) Option parameters

lapply(X, FUN, ...)

  

X: the input, a list or vector

FUN: the function to apply to each element

...: additional arguments passed on to FUN

(3) Examples

We want to sum the total lake area column and the nutrition score value column of the data

> lapply(list(data$`Total Lake area`,data$`Score value`),sum,na.rm=TRUE)
[[1]]
[1] 39792.29

[[2]]
[1] 14284.51

  

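When the elements of X are not meant to be the first parameter of FUN, it is best to use an anonymous function, which also makes the call easier to read. Here each element is used as the mean, the second parameter of rnorm()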

> lapply(list(4,5,6),function(x) rnorm(3,x,0.1))
[[1]]
[1] 4.053170 4.182545 4.057495

[[2]]
[1] 5.127469 5.119175 4.962460

[[3]]
[1] 5.995226 6.065272 5.890950

The code above creates three groups of random numbers with means 4, 5 and 6 and standard deviation 0.1, with 3 numbers in each group
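An alternative to the anonymous function is to name the other arguments in `...`, so that each element of X falls into the remaining slot. This is a sketch of that idea, not the author's method; set.seed() is added only so the two forms can be compared, and the names draws_a and draws_b are illustrative.

```r
set.seed(1)  # make the random draws repeatable

# Anonymous-function form: x is explicitly used as the mean
draws_a <- lapply(list(4, 5, 6), function(x) rnorm(3, mean = x, sd = 0.1))

set.seed(1)  # reset so the second form draws the same numbers

# Named-argument form: n and sd are fixed by name in `...`, so each
# element of X fills the first remaining argument of rnorm(), which is mean
draws_b <- lapply(list(4, 5, 6), rnorm, n = 3, sd = 0.1)

identical(draws_a, draws_b)  # TRUE
```

The anonymous function is usually easier to read, because the role of each element is stated in the call itself.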

 2. sapply, vapply, rapply

(1) sapply: a simplifying version of lapply. The option parameters are basically the same as those of lapply, but where possible the result is returned in simplified form, as a vector or matrix rather than a list

sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
> sapply(list(4,5,6),function(x) rnorm(3,x,0.1))
         [,1]     [,2]     [,3]
[1,] 4.114730 5.005572 6.111872
[2,] 3.916513 5.088405 5.869742
[3,] 3.925176 5.034011 5.859441
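What form sapply() returns depends on what FUN returns for each element. The sketch below, with illustrative variable names, walks through the usual cases:

```r
# Each call returns one value: the result is a plain vector
doubled <- sapply(1:3, function(i) i * 2)
doubled                                   # 2 4 6

# Each call returns a vector of the same length: the result is a matrix
# with one column per element of X
min_max <- sapply(1:3, function(i) c(min = i, max = i^2))
min_max

# Results of different lengths cannot be simplified: a list comes back
ragged <- sapply(1:3, function(i) seq_len(i))
is.list(ragged)                           # TRUE

# Character input supplies the names when USE.NAMES = TRUE
n_chars <- sapply(c("a", "bb", "ccc"), nchar)
n_chars                                   # a = 1, bb = 2, ccc = 3
```

Because the return type can change with the data, sapply() is convenient at the console but less predictable inside functions.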

(2) vapply: a stricter version of sapply, which adds the required option FUN.VALUE to declare the type and length of each result

vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)

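FUN.VALUE: a template for the value returned by FUN, for example numeric(1) for a single number. vapply() stops with an error if any result does not match the type and length of the template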
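A short sketch of how FUN.VALUE behaves, using a hypothetical list x; the error branch only returns a fixed marker string chosen for this example:

```r
x <- list(a = c(1, 2, 3), b = c(4, 5, 6))

# Each result must be a single number, as declared by numeric(1)
means <- vapply(x, mean, FUN.VALUE = numeric(1))
means                                     # a = 2, b = 5

# A template of length 2 gives a matrix with 2 rows, one column per element
ranges <- vapply(x, range, FUN.VALUE = numeric(2))
ranges

# A mismatch between the template and the actual result is an error:
# range() returns two values, but the template allows only one
bad <- tryCatch(
  vapply(x, range, FUN.VALUE = numeric(1)),
  error = function(e) "vapply raised an error"
)
bad
```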

(3) rapply: a recursive version of lapply, which walks through every element of every sub-list in a list and applies the function to each
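The post gives no signature or example for rapply(), so here is a minimal sketch on a hypothetical nested list. It uses the classes and how arguments of rapply() to pick which elements are processed and what shape comes back.

```r
nested <- list(a = 1:3, b = list(c = 4:6, d = "text"))

# how = "unlist" (the default): apply FUN to the integer leaves only
# and flatten the results; the character leaf is dropped
totals <- rapply(nested, sum, classes = "integer", how = "unlist")
totals                                    # a = 6, b.c = 15

# how = "replace": keep the list structure and replace only the
# matching leaves; the character leaf is left unchanged
scaled <- rapply(nested, function(v) v * 10, classes = "integer",
                 how = "replace")
str(scaled)
```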

 

  (3) Multi-parameter calculation -- mapply()

1. Purpose: apply a function to several variables at once, element by element - Apply a Function to Multiple List or Vector Arguments

2. Option parameters

mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)

  

 

 

> mapply(length,data)
          Class I water resources area          Class II water resources area         Class III water resources area 
                                   276                                    276                                    276 
Provincial-level administrative region Prefecture-level administrative region     County-level administrative region 
                                   276                                    276                                    276 
                             Lake name                        Total Lake area                        Annual category 
                                   276                                    276                                    276 
                 Flood season category              Non-flood season category                            Score value 
                                   276                                    276                                    276 
                    Nutritional degree 
                                   276 
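The call above passes only one object, so it behaves much like sapply() over the columns of the data frame. The following sketch, with made-up input vectors, shows the case mapply() is designed for, where several arguments are stepped through together:

```r
# Element-wise over two vectors: rep(1, 3), rep(2, 2), rep(3, 1).
# The results differ in length, so a list is returned.
reps <- mapply(rep, 1:3, 3:1)
reps

# Equal-length results are simplified to a vector
sums <- mapply(function(x, y) x + y, c(1, 2, 3), c(10, 20, 30))
sums                                      # 11 22 33

# MoreArgs holds arguments that stay fixed for every call
scaled <- mapply(function(x, y, scale) (x + y) * scale,
                 c(1, 2, 3), c(10, 20, 30),
                 MoreArgs = list(scale = 2))
scaled                                    # 22 44 66
```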

  

  (4) Environments -- eapply()

eapply(env, FUN, ..., all.names = FALSE, USE.NAMES = TRUE)

env: the environment whose objects are processed

FUN: the function to apply. It can be a user-defined function or a named function

all.names: TRUE or FALSE. When TRUE, the function is applied to all objects in the environment, including those whose names begin with a dot; when FALSE, those objects are skipped

USE.NAMES: whether the returned list carries the names of the objects
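A minimal sketch of eapply() on a throwaway environment. The environment env and the objects in it are invented for this example. The order of the returned list is not guaranteed, so the results are looked up by name.

```r
# A small environment with two visible objects and one dot-prefixed object
env <- new.env()
assign("a", 1:10, envir = env)
assign("b", c(2.5, 3.5), envir = env)
assign(".hidden", 100, envir = env)

# By default, objects whose names begin with a dot are skipped
res <- eapply(env, mean)
res$a                                     # 5.5
res$b                                     # 3

# all.names = TRUE includes them
res_all <- eapply(env, mean, all.names = TRUE)
length(res_all)                           # 3
```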

Posted by BZorch on Sun, 31 Oct 2021 19:42:51 -0700