This part covers:
1. The purpose of the apply family of functions
2. The members of the apply family and what each one does (8 functions, divided into 4 groups)
3. Specific usage
1, Purpose of the apply family: these functions replace explicit loops. Loops in R can be slow and verbose, and the apply family expresses the same iteration in a single call
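As a minimal sketch of the idea (the vector `x` and the squaring function are illustrative choices, not part of the case study below), the same computation can be written once as a loop and once with sapply():

```r
# Squares of 1..5 computed two ways (illustrative values).
x <- 1:5

# Explicit loop: preallocate the result, then fill it element by element.
squares_loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares_loop[i] <- x[i]^2
}

# Same computation with sapply(): no index bookkeeping.
squares_apply <- sapply(x, function(v) v^2)

identical(squares_loop, squares_apply)  # TRUE
```

For arithmetic this simple, the vectorized expression x^2 would be shorter still; the apply family is most useful when the function is not already vectorized.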
2, apply family members
(1) Group calculations: tapply() and apply()
1. tapply() function:
(1) Purpose: apply one function to each group of a variable that has been split by one or more factors, and return the results for all groups at once. Grouping by several factors is supported, but only one variable can be processed per call.
(2) Option parameters:
tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE)
X: the input data, an object that can be split by the grouping factors; typically a single variable, i.e. a vector
INDEX: the grouping basis for X, either one factor (vector) or a list of factors, each the same length as X. With 1 factor the result is a one-dimensional array; with 2 factors it is a matrix laid out as a two-way table; with 3 or more factors it is an array with one dimension per factor
FUN: the function applied to each group. It can be a named function, such as sum, mean or length, or an anonymous function
Anonymous function usage: function(i) i+2*5
default: the value placed in cells (combinations of factor levels) that contain no data. It is NA by default and can be set to 0 or another value
simplify: controls the form of the result. The values are TRUE and FALSE. With TRUE, if FUN always returns a single value, the result is an array of those values; with FALSE, the result is always an array of mode list, with one list element per group.
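The arguments above can be tried on a small example. This is a sketch on invented toy data (`value`, `region` and `grade` are hypothetical names, not columns of the lake data set):

```r
# Hypothetical toy data: six measurements with two grouping factors.
value  <- c(10, 20, 30, 40, 50, 60)
region <- factor(c("A", "A", "B", "B", "B", "C"))
grade  <- factor(c("I", "II", "I", "I", "II", "II"))

# One factor: a named one-dimensional array of group means.
by_region <- tapply(value, region, mean)

# Two factors: a matrix; default = 0 fills the empty cell (C, I) instead of NA.
by_both <- tapply(value, list(region, grade), FUN = sum, default = 0)

# Anonymous function as FUN.
doubled <- tapply(value, region, function(v) sum(v) * 2)

# simplify = FALSE keeps each group's result as a list element.
as_list <- tapply(value, region, range, simplify = FALSE)

by_both
```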
(3) Case: count the lakes in each water quality category within each first-class water resources region of the country
Input variable: number of lakes
Grouping factors: 2 in total, the first-class water resources region and the lake water quality category
① Read input data from the clipboard
> data_t <- read.table("clipboard", header = TRUE, sep = "\t")
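The data contain 276 rows, one per lake, with columns for the water resources regions, the administrative regions, the lake name, the total lake area, the water quality categories, the score value and the nutritional degree.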
② Construct the INDEX factors. This case has two: the first-class water resources region and the water quality category. By default read.table() converts the spaces in column names to dots, so the columns are accessed with dotted names
> wrr1 <- factor(data_t$Class.I.water.resources.area,
+                levels = c("Songhuajiang District", "Liaohe District", "Haihe District",
+                           "Yellow River Region", "Huaihe District", "Yangtze River Region",
+                           "Southeast Zhuhe District", "Pearl River Region",
+                           "Southwest Zhuhe District", "Northwest Zhuhe District"))
> wql <- factor(data_t$Water.quality.category)
> wrr1
  [1] Songhuajiang District Songhuajiang District Songhuajiang District ...
  (276 values in total; the rest of the printout is omitted)
Levels: Songhuajiang District Liaohe District Haihe District Yellow River Region Huaihe District Yangtze River Region Southeast Zhuhe District Pearl River Region Southwest Zhuhe District Northwest Zhuhe District
> wql
  [1] Ⅴ inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ inferiorⅤ Ⅴ inferiorⅤ Ⅲ Ⅱ ...
  (276 values in total; the rest of the printout is omitted)
Levels: Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ inferiorⅤ
③ Run the statistics with tapply(): use length as the counting function, and set default = 0 so that region/category combinations with no lakes show 0 instead of NA
> result <- tapply(data_t$Lake.name, list(wrr1, wql), FUN = length, default = 0)
> result
                           Ⅰ   Ⅱ   Ⅲ   Ⅳ   Ⅴ inferiorⅤ
Songhuajiang District      0   1   1   0   2         6
Liaohe District            0   0   0   0   0         1
Haihe District             0   8   6   6   1         1
Yellow River Region        0   4   1   5   0         4
Huaihe District            0   0   3  19   8         2
Yangtze River Region       1   5  24  38  54        22
Southeast Zhuhe District   0   4   4   2   2         0
Pearl River Region         0   4   1   3   1         4
Southwest Zhuhe District   2   1   1   1   0         2
Northwest Zhuhe District   3   6   2   0   0        10
④ Add the row and column totals to obtain the final result table
> result <- cbind("total" = rowSums(result), result)
> result
                         total   Ⅰ   Ⅱ   Ⅲ   Ⅳ   Ⅴ inferiorⅤ
Songhuajiang District       10   0   1   1   0   2         6
Liaohe District              1   0   0   0   0   0         1
Haihe District              22   0   8   6   6   1         1
Yellow River Region         14   0   4   1   5   0         4
Huaihe District             32   0   0   3  19   8         2
Yangtze River Region       144   1   5  24  38  54        22
Southeast Zhuhe District    12   0   4   4   2   2         0
Pearl River Region          13   0   4   1   3   1         4
Southwest Zhuhe District     7   2   1   1   1   0         2
Northwest Zhuhe District    21   3   6   2   0   0        10
> result <- rbind(result, "whole country" = colSums(result))
> result
                         total   Ⅰ   Ⅱ   Ⅲ   Ⅳ   Ⅴ inferiorⅤ
Songhuajiang District       10   0   1   1   0   2         6
Liaohe District              1   0   0   0   0   0         1
Haihe District              22   0   8   6   6   1         1
Yellow River Region         14   0   4   1   5   0         4
Huaihe District             32   0   0   3  19   8         2
Yangtze River Region       144   1   5  24  38  54        22
Southeast Zhuhe District    12   0   4   4   2   2         0
Pearl River Region          13   0   4   1   3   1         4
Southwest Zhuhe District     7   2   1   1   1   0         2
Northwest Zhuhe District    21   3   6   2   0   0        10
whole country              276   6  33  43  74  68        52
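Since the lake data itself is not reproduced here, the whole workflow (count with tapply(), then append totals) can be replayed on a miniature stand-in. The data frame `lakes` and its column names are invented for this sketch:

```r
# Hypothetical miniature data set; the column names are invented for this sketch.
lakes <- data.frame(
  lake    = c("L1", "L2", "L3", "L4", "L5"),
  region  = c("North", "North", "South", "South", "South"),
  quality = c("II", "III", "II", "II", "III")
)

# Count lakes per region and quality class; empty cells would become 0.
counts <- tapply(lakes$lake, list(lakes$region, lakes$quality),
                 FUN = length, default = 0)

# Add a row-total column, then a column-total row.
counts <- cbind(total = rowSums(counts), counts)
counts <- rbind(counts, all = colSums(counts))
counts
```

Because the totals column is added before the totals row, the bottom-left cell ends up holding the grand total. For pure counting, table() followed by addmargins() is a possible alternative to this tapply() approach.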
2. apply() function:
(1) Purpose: apply a function along a chosen dimension (rows, columns, and so on) of a matrix or array.
(2) Option parameters:
apply(X, MARGIN, FUN, ..., simplify = TRUE)
X: the input data, a matrix or array
MARGIN: the dimension(s) of X over which FUN is applied: 1 for rows, 2 for columns, 3 for the third dimension of an array, or a vector such as c(1, 2) for several dimensions at once
FUN: the function applied. It can be a named function or an anonymous function
simplify: same meaning as in tapply(); determines whether the results are simplified to a vector or array, or returned as a list
(3) Examples
> x <- cbind(x1 = 3, x2 = c(4:1, 2:5))
> dimnames(x)[[1]] <- letters[1:8]
> x
  x1 x2
a  3  4
b  3  3
c  3  2
d  3  1
e  3  2
f  3  3
g  3  4
h  3  5
> col.sums <- apply(x, 2, sum)
> col.sums
x1 x2
24 24
> row.sums <- apply(x, 1, sum)
> row.sums
a b c d e f g h
7 6 5 4 5 6 7 8
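Beyond plain row and column sums, a short sketch can show the other MARGIN and FUN options. It reuses the matrix from the example above; the array `a` and the variable names are illustrative additions:

```r
# Same matrix as in the example above.
x <- cbind(x1 = 3, x2 = c(4:1, 2:5))

# Anonymous function per column: spread (max - min).
col_spread <- apply(x, 2, function(v) max(v) - min(v))

# Extra arguments after FUN are passed on to it (here: probs for quantile).
col_q <- apply(x, 2, quantile, probs = 0.5)

# 3-d array: MARGIN = 3 applies the function to each 2 x 2 slice.
a <- array(1:8, dim = c(2, 2, 2))
slice_sums <- apply(a, 3, sum)

# MARGIN = c(1, 2) keeps rows and columns and sums over the third dimension.
cell_sums <- apply(a, c(1, 2), sum)
```

For sums and means of rows or columns specifically, rowSums(), colSums(), rowMeans() and colMeans() do the same job directly.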
(2) Apply a function over a list or vector (lapply, sapply, vapply, rapply)
1. lapply() function:
(1) Purpose: apply a function to each element of a list or vector, taking the place of a loop
(2) Option parameters
lapply(X, FUN, ...)
X: the input data, a list or vector
FUN: function
...: additional arguments to the function
(3) Examples
We want the total lake area and the total nutrition score value in the data
> lapply(list(data_t$Total.Lake.area, data_t$Score.value), sum, na.rm = TRUE)
[[1]]
[1] 39792.29

[[2]]
[1] 14284.51
When the elements of X should be passed to an argument of FUN other than the first, wrap the call in an anonymous function; this also improves readability
> lapply(list(4, 5, 6), function(x) rnorm(3, x, 0.1))
[[1]]
[1] 4.053170 4.182545 4.057495

[[2]]
[1] 5.127469 5.119175 4.962460

[[3]]
[1] 5.995226 6.065272 5.890950

The code above draws three groups of random numbers with means 4, 5 and 6 and standard deviation 0.1, with 3 numbers in each group. Because the numbers are random, the values differ from run to run
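To make the argument matching explicit, here is a hedged sketch of two equivalent ways to write that call. The seed value 42 is arbitrary and only makes the draws repeatable:

```r
set.seed(42)  # arbitrary seed, only to make the draws repeatable

# Each element of X becomes the 'mean' argument of rnorm().
draws <- lapply(list(4, 5, 6), function(m) rnorm(3, mean = m, sd = 0.1))

# Equivalent without an anonymous function: name the other arguments so
# that each element of X falls into the first unmatched one, 'mean'.
set.seed(42)
draws2 <- lapply(list(4, 5, 6), rnorm, n = 3, sd = 0.1)

identical(draws, draws2)  # TRUE
```

The second form relies on R's argument matching rules, which is why the anonymous function is usually the easier one to read.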
2. sapply(), vapply() and rapply()
(1) sapply: the simplifying version of lapply. Its arguments are basically the same as those of lapply, but the result is simplified where possible: instead of a list it tends to return a vector or matrix
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
> sapply(list(4, 5, 6), function(x) rnorm(3, x, 0.1))
         [,1]     [,2]     [,3]
[1,] 4.114730 5.005572 6.111872
[2,] 3.916513 5.088405 5.869742
[3,] 3.925176 5.034011 5.859441
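What sapply() returns depends on the shape of the individual results. The following sketch, on made-up inputs, walks through the main cases:

```r
# Length-1 results collapse to a vector; character input supplies the names.
n_chars <- sapply(c("lake", "river"), nchar)

# Equal-length results (longer than 1) become the columns of a matrix.
rng <- sapply(list(a = 1:3, b = 4:9), range)

# Results of different lengths cannot be simplified: a list comes back.
seqs <- sapply(1:3, seq_len)

# simplify = FALSE always returns a list, like lapply().
as_list <- sapply(1:3, function(i) i * 2, simplify = FALSE)
```

This shape-dependent return type is convenient interactively, but it means the caller cannot always know in advance whether a vector, matrix or list will come back.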
(2) vapply: a stricter version of sapply with one additional required argument, FUN.VALUE
vapply(X, FUN, FUN.VALUE, ..., USE.NAMES = TRUE)
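FUN.VALUE: a template for the value returned by FUN, e.g. numeric(1) or character(2). Every result must match the template's type and length, otherwise vapply stops with an error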
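A short sketch of how the FUN.VALUE template behaves, using an invented list `scores`:

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = numeric(0))

# FUN.VALUE = numeric(1): every call must return exactly one number.
means <- vapply(scores, function(v) if (length(v)) mean(v) else NA_real_,
                FUN.VALUE = numeric(1))

# A named template of length 2 gives a matrix with matching row names.
rng <- vapply(scores[c("a", "b")], function(v) c(min = min(v), max = max(v)),
              FUN.VALUE = c(min = 0, max = 0))

# A result that does not fit the template is an error, not a silent list.
bad <- tryCatch(vapply(scores, length, FUN.VALUE = character(1)),
                error = function(e) "type mismatch")
```

Here length() returns an integer while the template asks for a character value, so the last call fails and `bad` holds the fallback string.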
(3) rapply: the recursive version of lapply. It walks through every element of every sub-list of a list, however deeply nested, and applies the function to each
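As an illustration of the recursion, here is a sketch on a small invented nested list, using the `classes` and `how` arguments of rapply():

```r
nested <- list(a = 1:3, b = list(c = 4:6, d = "text"))

# how = "unlist" (the default): flatten the results into a vector.
# classes = "integer" restricts FUN to integer leaves; others are dropped.
totals <- rapply(nested, sum, classes = "integer", how = "unlist")

# how = "replace": keep the nested shape, leave non-matching leaves as is.
doubled <- rapply(nested, function(v) v * 2L, classes = "integer",
                  how = "replace")

totals
```

With how = "unlist" the names of the result are built from the path to each leaf, so the sum of the inner element c appears under the name b.c.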
(3) Multi-parameter calculation: mapply()
1. Purpose: apply a function to several variables in parallel, element by element (Apply a Function to Multiple List or Vector Arguments)
2. Option parameters
mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
> mapply(length, data_t)
  Water resources first level district  water resources second level district
                                   276                                    276
  water resources third level district provincial level administrative region
                                   276                                    276
prefecture level administrative region     county level administrative region
                                   276                                    276
                          Name of Lake                     total area of Lake
                                   276                                    276
                       Annual category                  Flood season category
                                   276                                    276
             non flood season category                            Score value
                                   276                                    276
                    nutritional degree
                                   276
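The call above passes only one object to mapply(), so it behaves like sapply(data_t, length). The sketch below, on made-up vectors, shows the multi-argument use that mapply() is meant for, including MoreArgs:

```r
# Element-wise over two vectors: rep(1, 3), rep(2, 2), rep(3, 1).
reps <- mapply(rep, 1:3, 3:1)

# Equal-length results simplify to a vector; names come from the first argument.
area <- mapply(function(w, h) w * h, c(a = 2, b = 3), c(4, 5))

# MoreArgs holds arguments that stay fixed across all calls.
joined <- mapply(function(x, y, sep) paste(x, y, sep = sep),
                 c("a", "b"), c("x", "y"), MoreArgs = list(sep = "-"))
```

Note that FUN comes first in mapply(), unlike in the other members of the family, because the data arguments are passed through `...`.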
(4) Environments: eapply()
eapply(env, FUN, ..., all.names = FALSE, USE.NAMES = TRUE)
env: the environment whose objects are processed
FUN: the function applied to each object; it can be a user-defined function or a named function
all.names: whether to include every object. The value is TRUE or FALSE. With FALSE (the default), objects whose names begin with a dot are skipped; with TRUE, the function is applied to all objects in the environment
USE.NAMES: whether the returned list carries the object names
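A minimal sketch of eapply() on a freshly created environment; the object names `a`, `b` and `.hidden` are invented for the illustration:

```r
# A fresh environment with three objects; .hidden starts with a dot.
env <- new.env()
assign("a", 1:10, envir = env)
assign("b", c(2.5, 3.5), envir = env)
assign(".hidden", 100, envir = env)

# Default all.names = FALSE skips dot-prefixed names.
visible <- eapply(env, mean)

# all.names = TRUE includes them.
everything <- eapply(env, mean, all.names = TRUE)

# The order of the returned list is not guaranteed, so sort the names.
sort(names(visible))
```

Because the order of the result is arbitrary, it is safer to access the elements by name (for example visible$a) than by position.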