Principal Component Analysis and Factor Analysis

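#Load packages
library(corrplot)
library(psych)
library(GPArotation)
library(gplots) #heatmap.2()
library(RColorBrewer)

Principal Component Analysis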

Principal Component Analysis (PCA) extracts a small number of uncorrelated variables from a large number of correlated variables. These uncorrelated variables are called principal components.
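
To make the definition concrete, here is a minimal sketch in base R on simulated data. The data set and the names `latent`, `x`, and `pc` are illustrative assumptions and are not part of the original analysis. It shows that the component scores returned by `prcomp()` are mutually uncorrelated even though the input variables are strongly correlated.

```r
set.seed(42)
n <- 200
latent <- rnorm(n)                       # one hidden driver shared by all variables
x <- data.frame(
  v1 = latent + rnorm(n, sd = 0.5),
  v2 = latent + rnorm(n, sd = 0.5),
  v3 = latent + rnorm(n, sd = 0.5)
)
input_cor <- cor(x)                      # the inputs are strongly correlated
round(input_cor, 2)

pc <- prcomp(x, scale. = TRUE)           # PCA on standardized variables
score_cor <- cor(pc$x)                   # correlations between the component scores
round(score_cor, 2)                      # off-diagonal entries are numerically zero
```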

Data exploration

#This section uses a questionnaire data set on consumer brand perception.

pca <- read.csv("")

#Apart from one character variable (brand), all variables are numeric ratings from 1 to 10.
#Check the pairwise correlations between the variables
corrplot(cor(pca[,1:9]), order = "FPC")
#"FPC" orders the variables by the first principal component.
The variables fall roughly into three clusters: latest/trendy/fun, serious/leader/perform, and bargain/rebuy/value. The analysis below checks this grouping.

Extraction of Principal Components

#Data scaling
<- pca[,1:9] <- scale([,1:9])
#Extract the principal components
pca.pc <- prcomp([,1:9])
#Judge the number of components with a scree plot
plot(pca.pc, type = "l")
The scree plot shows that after the third component, each additional component explains little additional variance, which suggests retaining three components.
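
The quantity plotted in a scree plot can also be computed directly. The following is a rough sketch on simulated data with three underlying dimensions; the data and the names `f`, `x`, `var_explained`, and `cum_explained` are assumptions made for illustration. It derives the share of variance explained by each component from the `sdev` element of a `prcomp()` result.

```r
set.seed(1)
n <- 300
f <- matrix(rnorm(n * 3), ncol = 3)                 # three simulated underlying dimensions
x <- f[, c(1, 1, 1, 2, 2, 2, 3, 3, 3)] +            # nine observed variables, three per dimension
  matrix(rnorm(n * 9, sd = 0.7), ncol = 9)

pc <- prcomp(x, scale. = TRUE)
var_explained <- pc$sdev^2 / sum(pc$sdev^2)         # share of variance per component
cum_explained <- cumsum(var_explained)              # cumulative share
round(rbind(var_explained, cum_explained), 2)
```

In data built this way, the share should drop sharply after the third component, which is the pattern the scree plot is read for.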

Principal Component Score Acquisition

#The psych package also gives good PCA output
fa.parallel([,1:9],fa = "pc") #principal() needs the number of components in advance; fa.parallel() suggests it and draws the scree plot.
pca.psy <- principal([,1:9], nfactors = 3,rotate = "none")
round(unclass(pca.psy$weights),2) #Obtain the principal component score coefficients
The score coefficients obtained here express each principal component as a linear combination of the variables, for example: PC1 = 0.14*perform + 0.12*leader - 0.21*latest + ...
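
The idea that a component score is a weighted sum of the standardized variables can be checked by hand. The sketch below uses base R `prcomp()` on simulated data rather than `psych::principal()`, whose weights are scaled differently; the data and the names `z` and `manual_scores` are illustrative assumptions.

```r
set.seed(7)
x <- matrix(rnorm(100 * 4), ncol = 4,
            dimnames = list(NULL, paste0("v", 1:4)))
x[, 2] <- x[, 1] + 0.5 * x[, 2]          # make v1 and v2 correlated

pc <- prcomp(x, center = TRUE, scale. = TRUE)
z <- scale(x)                            # standardize the variables
manual_scores <- z %*% pc$rotation       # weighted sums of the standardized variables

round(pc$rotation[, 1], 2)               # coefficients that define PC1
all.equal(unname(manual_scores), unname(pc$x))
```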

Brand Perception Map

Another important application of principal component analysis is to visualize the relationships between categories (brands) with a biplot.

#Projecting every respondent onto the first two components produces too many points to read.
#Therefore the data are aggregated by category (brand) and the brand means are mapped instead.
pca.mean <- aggregate(.~ brand,, mean)
rownames(pca.mean) <- pca.mean[,1]
pca.mean.pc <- prcomp(pca.mean[,-1], scale. = TRUE)
biplot(pca.mean.pc) #Brand Perception Map
The map shows where each category (brand) is positioned and which brands are perceived as similar.

The differences between individual brands can be examined further.

pca.mean["a",-1] - pca.mean["j",-1] #drop the brand column before subtracting
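
The aggregation and comparison steps can be sketched end to end on simulated ratings. Everything in this example is assumed for illustration: the data frame `ratings`, its four brands, and its three attributes are not the survey data used in this article.

```r
set.seed(3)
ratings <- data.frame(
  brand  = rep(c("a", "b", "c", "d"), each = 25),
  value  = rnorm(100, mean = rep(c(3, 5, 7, 4), each = 25)),
  trendy = rnorm(100, mean = rep(c(8, 4, 3, 6), each = 25)),
  leader = rnorm(100, mean = rep(c(5, 7, 2, 6), each = 25))
)

# Mean rating per brand, with brand names as row names
brand_mean <- aggregate(. ~ brand, data = ratings, FUN = mean)
rownames(brand_mean) <- brand_mean$brand

# PCA on the brand means; the brand column is excluded
brand_pc <- prcomp(brand_mean[, -1], scale. = TRUE)
# biplot(brand_pc)                       # uncomment to draw the perception map

# Attribute-by-attribute difference between two brands
brand_diff <- brand_mean["a", -1] - brand_mean["c", -1]
brand_diff
```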
Exploratory Factor Analysis

Exploratory factor analysis (EFA) is a method for discovering the latent structure of a group of variables. It explains the observable (manifest) variables in terms of a smaller number of unobservable factors.


# Determine the number of factors from the scree plot and the eigenvalues
# Numbers suggested by the different scree criteria

#   noc naf nparallel nkaiser
# 1   3   2         3       3

# Eigenvalues (three are greater than 1)

#[1] 2.9792956 2.0965517 1.0792549 0.7272110 0.6375459 0.5348432 0.3901044
#[8] 0.3120464 0.2431469
These results suggest two or three factors: the acceleration factor (naf) suggests two, while the other three criteria suggest three.
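
The eigenvalue criterion can be reproduced with base R alone. This is a minimal sketch on simulated data with two underlying factors; the data and the names `ev` and `n_kaiser` are assumptions for illustration. It counts the eigenvalues of the correlation matrix that exceed 1 (the Kaiser rule).

```r
set.seed(11)
n <- 400
f <- matrix(rnorm(n * 2), ncol = 2)                 # two simulated factors
x <- f[, c(1, 1, 1, 2, 2, 2)] +                     # six observed variables, three per factor
  matrix(rnorm(n * 6, sd = 0.8), ncol = 6)

ev <- eigen(cor(x))$values                          # eigenvalues of the correlation matrix
round(ev, 2)
n_kaiser <- sum(ev > 1)                             # number of eigenvalues above 1
n_kaiser
```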

#Extract common factors with the psych package
#Two-factor solution
fa([,1:9], nfactors = 2, rotate = "none",fm = "ml") #fm = "ml" selects maximum likelihood estimation
The two-factor solution explains only 44% of the variance.

#Three-factor solution
fa([,1:9], nfactors = 3, rotate = "none",fm = "ml") 
The three-factor solution raises the explained variance to 57%, so it is preferred over the two-factor solution.
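
The proportion of variance explained by a factor solution is the sum of the squared loadings divided by the number of variables. The sketch below illustrates the comparison with base R `factanal()` instead of `psych::fa()`, on simulated data with three underlying factors; the data and the helper `prop_var()` are hypothetical and only serve the illustration.

```r
set.seed(5)
n <- 500
f <- matrix(rnorm(n * 3), ncol = 3)                 # three simulated factors
x <- f[, c(1, 1, 1, 2, 2, 2, 3, 3, 3)] +            # nine observed variables
  matrix(rnorm(n * 9, sd = 0.8), ncol = 9)
colnames(x) <- paste0("v", 1:9)

fit2 <- factanal(x, factors = 2, rotation = "none")
fit3 <- factanal(x, factors = 3, rotation = "none")

# Hypothetical helper: total squared loadings divided by the number of variables
prop_var <- function(fit) sum(unclass(fit$loadings)^2) / nrow(fit$loadings)

round(c(two_factors = prop_var(fit2), three_factors = prop_var(fit3)), 2)
```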

EFA rotation

Two methods of rotation

#Orthogonal rotation using psych's fa
fa.vaf <- fa([,1:9], nfactors = 3, rotate = "varimax",fm = "ml")
#Oblique rotation
#Using the oblimin rotation from GPArotation; the factanal output is relatively concise.
fa.ob <- factanal([,1:9], factors = 3, rotation = "oblimin")
#factanal does not list the factor structure matrix; it is the factor pattern matrix %*% the factor correlation matrix.
fsm <- function(oblique) {
  if (!inherits(oblique, "factanal") || is.null(oblique$rotmat)) {
    warning("Object doesn't look like oblique EFA")
  } else {
    P <- unclass(oblique$loadings)   #factor pattern matrix
    Tinv <- solve(oblique$rotmat)
    Phi <- Tinv %*% t(Tinv)          #factor correlation matrix, as in print.factanal
    P %*% Phi                        #factor structure matrix
  }
}
Orthogonal rotation

Oblique rotation

Orthogonal rotation forces the factors to be uncorrelated and focuses on the correlations between the variables and the factors. Oblique rotation allows the factors to correlate and involves three matrices:
* Factor pattern matrix: the standardized regression coefficients of each variable on the factors.
* Factor correlation matrix: the correlations between the factors.
* Factor structure matrix (the factor loading matrix): the correlations between the variables and the factors.
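
The three matrices listed above can be obtained from a base R `factanal()` fit, as in the following sketch. It assumes simulated data with two correlated factors and uses the built-in `promax` rotation; the names `pattern`, `phi`, and `struct_mat` are illustrative. The factor correlation matrix is derived from the `rotmat` element in the same way `print.factanal` does.

```r
set.seed(9)
n <- 500
f1 <- rnorm(n)
f2 <- 0.5 * f1 + sqrt(0.75) * rnorm(n)              # factors correlated at about 0.5
f <- cbind(f1, f2)
x <- f[, c(1, 1, 1, 2, 2, 2)] +                     # six observed variables, three per factor
  matrix(rnorm(n * 6, sd = 0.8), ncol = 6)
colnames(x) <- paste0("v", 1:6)

fit <- factanal(x, factors = 2, rotation = "promax")

pattern <- unclass(fit$loadings)                    # factor pattern matrix
t_inv <- solve(fit$rotmat)
phi <- t_inv %*% t(t_inv)                           # factor correlation matrix
struct_mat <- pattern %*% phi                       # factor structure matrix

round(phi, 2)
round(struct_mat, 2)
```

With an orthogonal rotation, `phi` is the identity matrix, so the pattern and structure matrices coincide.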

#Corresponding alternatives
#Orthogonal rotation
<- factanal([,1:9], factors = 3, rotation = "varimax") 
#Oblique rotation
fa.promax <- fa([,1:9], nfactors = 3, rotate = "promax",fm = "ml") 

fsmfa <- function(oblique) {
  if (!inherits(oblique, "fa") || is.null(oblique$Phi)) {
    warning("Object doesn't look like oblique EFA")
  } else {
    P <- unclass(oblique$loadings)   #factor pattern matrix
    P %*% oblique$Phi                #factor structure matrix
  }
}

Visualization of EFA Rotation Results

A path diagram shows the relationships between the latent factors and the individual variables.

fa.diagram(fa.promax,simple = T,digits = 2)

A heat map shows the relationships between the latent factors and the variables more intuitively.

          col=brewer.pal(9, "GnBu"), trace="none", key=FALSE, dend="none",
           Colv=FALSE, cexCol = 1.5,
           main="\n\n\nFactor loadings for brand adjectives")
Factor scores by brand

#Compute the mean of the three factor scores for each brand
fa.ob <- factanal([,1:9], factors = 3, rotation = "oblimin",scores = "Bartlett") 
fa.score <- data.frame(fa.ob$scores)
fa.score$brand <-$brand
fa.score.mean <- aggregate(.~ brand, fa.score, mean)
#Draw a heat map of the mean factor scores
rownames(fa.score.mean) <- fa.score.mean[, 1]           # brand names
fa.score.mean <- fa.score.mean[, -1]

heatmap.2(as.matrix(fa.score.mean),
           col=brewer.pal(9, "GnBu"), trace="none", key=FALSE, dend="none",
           cexCol=1.2, main="\n\n\n\nMean factor score by brand")
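
The score aggregation above can be tried out on simulated data. In this sketch the three brands, the shift between them, and the names `brands`, `scores`, and `score_mean` are all assumptions for illustration; it uses base R `factanal()` with Bartlett scores and averages them by brand.

```r
set.seed(21)
brands <- rep(c("a", "b", "c"), each = 100)
shift <- rep(c(-1, 0, 1), each = 100)               # brands differ on both factors
f <- cbind(rnorm(300) + shift, rnorm(300) - shift)
x <- f[, c(1, 1, 1, 2, 2, 2)] +                     # six observed variables, three per factor
  matrix(rnorm(300 * 6, sd = 0.8), ncol = 6)
colnames(x) <- paste0("v", 1:6)

fit <- factanal(x, factors = 2, rotation = "varimax", scores = "Bartlett")

scores <- data.frame(fit$scores, brand = brands)    # one row of factor scores per respondent
score_mean <- aggregate(. ~ brand, data = scores, FUN = mean)
score_mean
```

The resulting table has one row per brand and one column per factor, which is the shape a heat map function expects after the brand column is moved to the row names.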
