R language basic questions and answers - R language and Statistical Analysis Chapter 5 after class exercises (Tang Yincai)

Keywords: R Language

Question-1

Let the population $X$ be the error made when measuring distance with a radio rangefinder, and suppose $X$ follows the uniform distribution on $(\alpha, \beta)$. In 200 measurements, the error $X_i$ occurred $n_i$ times:

| $X_i$ | 3 | 5 | 7 | 9 | 11 | 13 | 15 | 17 | 19 | 21 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $n_i$ | 21 | 16 | 15 | 26 | 22 | 14 | 21 | 22 | 18 | 25 |

Find the method-of-moments estimates of $(\alpha, \beta)$. (Note: a measurement error of $X_i$ here is the representative value for errors falling in the interval $(X_i - 1, X_i + 1)$.)

# Construct X_i,n_i sequence
X_i<-seq(3,21,by=2)
n_i<-c(21,16,15,26,22,14,21,22,18,25)

# Restore the X sample from the X_i, n_i table: repeat each value X_i exactly n_i times
X<-rep(X_i,n_i)

# Mean and standard deviation
mu<-mean(X)
sigma<-sd(X)

# Find alpha and beta for the uniform distribution on (a, b)
# (a+b)/2=E(X), (b-a)^2/12=D(X)
# a+b=2E(X), b-a=2*sqrt(3)*sd(X)
# a=E(X)-sqrt(3)*sd(X)
# b=E(X)+sqrt(3)*sd(X)
alpha<-mu-sqrt(3)*sigma; print(alpha)
beta<-mu+sqrt(3)*sigma; print(beta)

[1] 2.217
[1] 22.4
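
The moment estimates can also be computed straight from the frequency table, without expanding it into 200 observations. The sketch below is one possible version: it uses the second central moment (divisor n) instead of `sd()` (divisor n - 1), which is an assumption about which variance the method of moments should use, and the names `m1`, `m2`, `alpha_mm` and `beta_mm` are introduced here only for illustration.

```r
# Grouped data from the problem statement
X_i <- seq(3, 21, by = 2)
n_i <- c(21, 16, 15, 26, 22, 14, 21, 22, 18, 25)
n <- sum(n_i)                           # total number of measurements

# First moment and second central moment, weighted by the frequencies
m1 <- sum(X_i * n_i) / n
m2 <- sum((X_i - m1)^2 * n_i) / n

# For U(alpha, beta): mean = (alpha + beta) / 2, variance = (beta - alpha)^2 / 12
alpha_mm <- m1 - sqrt(3 * m2)
beta_mm  <- m1 + sqrt(3 * m2)
print(c(alpha = alpha_mm, beta = beta_mm))
```

With 200 observations the two variance conventions differ only slightly, so both versions should give estimates close to the observed range of the data.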

Question-2

To test the effect of a tap water disinfection device, 50 L of disinfected water were sampled at random and the number of Escherichia coli in each liter was counted (assuming that the number of Escherichia coli in 1 L of water follows a Poisson distribution). The test results are as follows:

| Number of Escherichia coli per L | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Litres of water | 17 | 20 | 10 | 2 | 1 | 0 | 0 |

What average number of Escherichia coli per liter of water makes the observed result most probable?

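# NUM: number of Escherichia coli per liter, v: liters of water with that count
NUM<-0:6
v<-c(17,20,10,2,1,0,0)

# The Poisson maximum likelihood estimate is the sample mean:
# total count divided by total liters (not mean(NUM*v), which divides by 7)
E<-sum(NUM*v)/sum(v); print(E)

[1] 1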
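
The question asks for the value that maximizes the probability of the observed counts, so the answer can be checked by maximizing the Poisson log-likelihood numerically. This is a minimal sketch; `loglik`, `fit`, `lambda_hat` and `lambda_closed` are names chosen here, and the search interval `c(0.01, 10)` is an assumption.

```r
NUM <- 0:6
v <- c(17, 20, 10, 2, 1, 0, 0)

# Log-likelihood of the grouped sample as a function of the Poisson mean
loglik <- function(lambda) sum(v * dpois(NUM, lambda, log = TRUE))

# Numerical maximizer over an assumed search interval
fit <- optimize(loglik, interval = c(0.01, 10), maximum = TRUE)
lambda_hat <- fit$maximum

# Closed form: the sample mean of the 50 one-liter samples
lambda_closed <- sum(NUM * v) / sum(v)
print(c(numerical = lambda_hat, closed_form = lambda_closed))
```

The two values should agree up to the tolerance of `optimize()`.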

Question-3

It is known that the resistance of a certain wood to transverse-grain pressure follows $N(\mu, \sigma^2)$. A transverse-grain pressure test is carried out on ten specimens, and the data (in $kg/cm^2$) are as follows:

482, 493, 457, 471, 510, 446, 435, 418, 394, 469

  1. Find the confidence interval for $\mu$ at confidence level 0.95.
  2. Find the confidence interval for $\sigma$ at confidence level 0.90.

# Pressure values (446 as in the problem statement, not 466)
P<-c(482,493,457,471,510,446,435,418,394,469)

# 95% confidence interval for mu
t.test(P)$conf.int

# 90% confidence interval for sigma
# chisq.var.test(), based on the function provided in the textbook
chisq.var.test<-function(x,var,alpha,alternative="two.sided"){
  options(digits=4)
  result<-list()
  n<-length(x)
  v<-var(x)
  result$var<-v
  chi2<-(n-1)*v/var
  result$chi2<-chi2
  p<-pchisq(chi2,n-1)
  if(alternative=="less"){
    result$p.value<-p
  } else if(alternative=="greater"){
    # upper-tail probability for the alternative sigma^2 > var
    result$p.value<-1-p
  } else if(alternative=="two.sided"){
    if(p>.5)
      p<-1-p
    p<-2*p
    result$p.value<-p
  } else return("your input is wrong")
  # Confidence interval for the variance sigma^2
  result$conf.int<-c(
    (n-1)*v/qchisq(alpha/2, df=n-1, lower.tail=FALSE),
    (n-1)*v/qchisq(alpha/2, df=n-1, lower.tail=TRUE))
  result
}

# chisq.var.test() returns the interval for sigma^2; take square roots for sigma
ci.var<-chisq.var.test(P,var(P),0.1)$conf.int
ci.var
sqrt(ci.var)

# The following lines check the two calls on another sample:
# x<-c(175,176,173,175,174,173,173,176,173,179)
# t.test(x)$conf.int
# chisq.var.test(x,var(x),0.05)$conf.int

[1] 432.3 482.7
attr(,"conf.level")
[1] 0.95
[1] 659.8 3357.0
[1] 25.69 57.94
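
Both intervals can also be written out from their formulas, which makes it easier to see where the t and chi-square quantiles enter. The sketch below uses the ten values as listed in the problem statement; `mu_ci`, `var_ci` and `sigma_ci` are names introduced here for illustration.

```r
# Data as listed in the problem statement
P <- c(482, 493, 457, 471, 510, 446, 435, 418, 394, 469)
n <- length(P)

# 95% interval for mu: xbar -/+ t_{0.975, n-1} * s / sqrt(n)
mu_ci <- mean(P) + c(-1, 1) * qt(0.975, df = n - 1) * sd(P) / sqrt(n)

# 90% interval for sigma^2: (n-1) s^2 divided by the 0.95 and 0.05 chi-square quantiles
var_ci <- (n - 1) * var(P) / qchisq(c(0.95, 0.05), df = n - 1)

# The interval for sigma is obtained by taking square roots of both endpoints
sigma_ci <- sqrt(var_ci)

print(mu_ci)
print(sigma_ci)
```

The interval for the mean should match `t.test(P)$conf.int`, which gives a quick consistency check.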

Question-4

A cigarette factory produces two kinds of cigarettes, A and B. The nicotine content of each kind was measured 6 times, with the following results:

| Cigarette A | 25 | 28 | 23 | 26 | 29 | 22 |
| --- | --- | --- | --- | --- | --- | --- |
| Cigarette B | 28 | 23 | 30 | 35 | 21 | 27 |

Assuming the nicotine content of the cigarettes follows a normal distribution:

  1. Are the variances of the nicotine content of the two kinds of cigarettes equal?

  2. Find the 95% confidence interval for the difference between the mean nicotine contents of the two kinds of cigarettes.

# Cigarette A/B data
A<-c(25,28,23,26,29,22)
B<-c(28,23,30,35,21,27)

# Variance test:
# p-value > 0.05, so equality of the variances is not rejected
var.test(A,B)

# 95% confidence interval for the difference in mean content
# (the samples are A and B, not x and y)
t.test(A,B,var.equal = TRUE)$conf.int

F test to compare two variances

data: A and B
F = 0.3, num df = 5, denom df = 5, p-value = 0.2
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.04187 2.13821
sample estimates:
ratio of variances
0.2992

[1] -7.024 3.358
attr(,"conf.level")
[1] 0.95
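
Under the equal-variance assumption, the interval for the mean difference comes from the pooled variance and a t quantile with nA + nB - 2 degrees of freedom. The following sketch spells that out; `sp2`, `se` and `ci` are illustrative names, not part of the original solution.

```r
A <- c(25, 28, 23, 26, 29, 22)
B <- c(28, 23, 30, 35, 21, 27)
nA <- length(A)
nB <- length(B)

# Pooled variance estimate under the equal-variance assumption
sp2 <- ((nA - 1) * var(A) + (nB - 1) * var(B)) / (nA + nB - 2)

# Standard error of the difference in means
se <- sqrt(sp2 * (1 / nA + 1 / nB))

# 95% confidence interval for mean(A) - mean(B)
ci <- (mean(A) - mean(B)) + c(-1, 1) * qt(0.975, df = nA + nB - 2) * se
print(ci)
```

The result should coincide with `t.test(A, B, var.equal = TRUE)$conf.int`. The interval contains 0, so the data do not show a difference in mean nicotine content at this level.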

Question-5

To compare the yields of two wheat varieties, 22 experimental fields with similar conditions were selected and farmed in the same way. The yields per unit area of the 12 experimental fields sown with variety A and the 12 experimental fields sown with variety B were as follows:

| A variety | 628 | 583 | 510 | 554 | 612 | 523 | 530 | 615 | 573 | 603 | 334 | 564 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| B variety | 535 | 433 | 398 | 470 | 567 | 480 | 498 | 560 | 503 | 426 | 338 | 547 |

Assuming that the yield per unit area of each variety follows a normal distribution, with the variance of the yield of variety A equal to 2140 and that of variety B equal to 3250, find the upper confidence limit at confidence level 0.95 and the lower confidence limit at confidence level 0.90 for the difference between the two mean yields.

# X_A: variety A, X_B: variety B
X_A<-c(628,583,510,554,612,523,530,615,573,603,334,564)
X_B<-c(535,433,398,470,567,480,498,560,503,426,338,547)

# Known variances (sigma^2, not standard deviations)
sigma_A<-2140
sigma_B<-3250

# The function two.sample.ci() provided in the textbook;
# sigma1 and sigma2 are the known population variances
two.sample.ci<-function(x,y,conf.level=0.95,sigma1,sigma2){
  options(digits=4)
  m<-length(x); n<-length(y)
  xbar<-mean(x)-mean(y)
  alpha<-1-conf.level
  zstar<-qnorm(1-alpha/2)*(sigma1/m+sigma2/n)^(1/2)
  xbar+c(-zstar,+zstar)
}

# Upper endpoint of the two-sided 0.95 interval and
# lower endpoint of the two-sided 0.90 interval
two.sample.ci(X_A,X_B,conf.level=0.95,sigma_A,sigma_B)[2]
two.sample.ci(X_A,X_B,conf.level=0.90,sigma_A,sigma_B)[1]

[1] 114.4
[1] 37.97
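
The calls above take endpoints of two-sided intervals. If the requested limits are instead read as one-sided confidence bounds, the normal quantile is taken at the full confidence level rather than at 1 - alpha/2. The sketch below shows that alternative reading; it is an interpretation, not the textbook's stated solution, and `var_A`, `var_B`, `d`, `se`, `upper_95` and `lower_90` are names introduced here.

```r
X_A <- c(628, 583, 510, 554, 612, 523, 530, 615, 573, 603, 334, 564)
X_B <- c(535, 433, 398, 470, 567, 480, 498, 560, 503, 426, 338, 547)
var_A <- 2140   # known variance of variety A
var_B <- 3250   # known variance of variety B

# Point estimate and standard error of the difference in means
d  <- mean(X_A) - mean(X_B)
se <- sqrt(var_A / length(X_A) + var_B / length(X_B))

# One-sided bounds: the whole error probability sits in one tail
upper_95 <- d + qnorm(0.95) * se
lower_90 <- d - qnorm(0.90) * se
print(c(upper_95 = upper_95, lower_90 = lower_90))
```

Under this reading the one-sided 0.95 upper bound equals the upper endpoint of a two-sided 0.90 interval, so it is tighter than the upper endpoint of the two-sided 0.95 interval.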

Question-6

Two machine tools produce balls of the same model. According to previous experience, the diameters of the balls produced by both machine tools follow normal distributions. Now 7 and 9 balls are randomly selected from the output of the two machine tools respectively, and their diameters are measured as follows (unit: mm):

| Machine tool A | 15.2 | 14.5 | 15.5 | 14.8 | 15.1 | 15.6 | 14.7 | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Machine tool B | 15.2 | 15.0 | 14.8 | 15.2 | 15.0 | 14.9 | 15.1 | 14.8 | 15.3 |

Is the variance of the ball diameters produced by machine tool B smaller than that of the ball diameters produced by machine tool A?

# x: machine tool A, y: machine tool B
x=c(15.2,14.5,15.5,14.8,15.1,15.6,14.7)
y=c(15.2,15.0,14.8,15.2,15,14.9,15.1,14.8,15.3)
var.test(x,y)
# ratio of variances: 5.216
# i.e. the estimate of σx^2/σy^2 is 5.216, and the p-value is below 0.05,
# so the variance of machine tool B is smaller than that of machine tool A

F test to compare two variances

data: x and y
F = 5.2, num df = 6, denom df = 8, p-value = 0.04
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
1.121 29.208
sample estimates:
ratio of variances
5.216
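
Because the question is directional, a one-sided F test is a natural alternative to the two-sided call. The sketch below tests the alternative that the variance ratio var(x)/var(y) is greater than 1, that is, that machine tool B has the smaller variance; the name `res` is introduced here for illustration.

```r
x <- c(15.2, 14.5, 15.5, 14.8, 15.1, 15.6, 14.7)            # machine tool A
y <- c(15.2, 15.0, 14.8, 15.2, 15.0, 14.9, 15.1, 14.8, 15.3) # machine tool B

# H0: var(x)/var(y) = 1 against H1: var(x)/var(y) > 1
res <- var.test(x, y, alternative = "greater")
print(res$statistic)
print(res$p.value)
```

Since the observed ratio is above 1, the one-sided p-value is half of the two-sided one, so the conclusion that machine tool B has the smaller variance is, if anything, strengthened.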

Question-7

To learn about the sales of its two bicycle models A and B, a company randomly selected 400 people and asked which of A and B they preferred; 224 of them preferred A. Find the confidence interval for the proportion p of customers who prefer A at confidence level 0.99.

binom.test(224,400,conf.level=0.99)

Exact binomial test
data: 224 and 400
number of successes = 224, number of trials = 400, p-value = 0.02
alternative hypothesis: true probability of success is not equal to 0.5
99 percent confidence interval:
0.4944 0.6241
sample estimates:
probability of success
0.56
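
With n = 400 the exact interval can be compared with the usual normal-approximation (Wald) interval. This is only a cross-check under the large-sample approximation; `p_hat` and `wald` are names introduced here.

```r
x <- 224   # customers who prefer A
n <- 400   # customers asked

p_hat <- x / n

# Wald interval: p_hat -/+ z_{0.995} * sqrt(p_hat * (1 - p_hat) / n)
wald <- p_hat + c(-1, 1) * qnorm(0.995) * sqrt(p_hat * (1 - p_hat) / n)
print(wald)
```

The approximate interval is roughly (0.496, 0.624), close to the exact binomial interval reported by `binom.test()`.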

Question-8

A company has produced a batch of new products whose weight follows a normal distribution. To estimate the average weight of this batch, the maximum allowable error is 1 and the sample standard deviation is s = 10. At a confidence level of 0.95, at least how many products should be sampled?

# The definition of size.norm2() in the textbook
size.norm2<-function(s,alpha,d,m){
  t0<-qt(alpha/2,m,lower.tail=FALSE)
  n0<-(t0*s/d)^2
  t1<-qt(alpha/2,n0,lower.tail=FALSE)
  n1<-(t1*s/d)^2
  while(abs(n1-n0)>0.5){
    n0<-(qt(alpha/2,n1,lower.tail=FALSE)*s/d)^2
    n1<-(qt(alpha/2,n0,lower.tail=FALSE)*s/d)^2
  }
  n1
}

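# The last argument m is a large number given in advance (as the textbook says);
# d=1 is the maximum allowable error from the problem statement
size.norm2(10,0.05,1,1000)
# Round up: at least 387 products

[1] 386.6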
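
A quick sanity check for the iterative t-based answer is the normal-approximation formula n = (z * s / d)^2. The sketch below takes the maximum allowable error from the problem statement, d = 1; `n_z` is a name introduced here.

```r
s <- 10        # sample standard deviation
d <- 1         # maximum allowable error, as stated in the problem
alpha <- 0.05  # 1 - confidence level

# Normal-approximation sample size, rounded up to a whole product
n_z <- (qnorm(1 - alpha / 2) * s / d)^2
print(ceiling(n_z))
```

The t-based size is slightly larger than this value, because t quantiles exceed the corresponding normal quantile at any finite number of degrees of freedom.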

Question-9

According to previous experience, the breakage rate of glassware shipped in bulk does not exceed 5%. To estimate the breakage rate of the glassware on one ship, the difference between the estimate and the true value must not exceed 1%, at a confidence level of 0.90. How many items should be sampled for acceptance inspection to meet these requirements?

# Definition of size.bin() in the textbook
size.bin=function(d,p,conf.level){
  alpha=1-conf.level
  ((qnorm(1-alpha/2))/d)^2*p*(1-p)
}

size.bin(0.01,0.05,0.90)

[1] 1285
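
A sample size has to be a whole number, so the result is rounded up, and the answer is sensitive to the assumed breakage rate. The sketch below repeats the textbook formula and compares the prior bound p = 0.05 with the most conservative choice p = 0.5; `n_prior` and `n_worst` are names introduced here.

```r
# Same formula as size.bin() above
size.bin <- function(d, p, conf.level) {
  alpha <- 1 - conf.level
  (qnorm(1 - alpha / 2) / d)^2 * p * (1 - p)
}

# Using the prior knowledge that the breakage rate is at most 5%
n_prior <- ceiling(size.bin(0.01, 0.05, 0.90))

# Without prior knowledge, p * (1 - p) is largest at p = 0.5
n_worst <- ceiling(size.bin(0.01, 0.50, 0.90))

print(c(n_prior = n_prior, n_worst = n_worst))
```

Rounding up gives 1286 items with the 5% bound, against 6764 in the worst case, which shows how much the prior information reduces the required sample.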

Posted by josephferris on Fri, 17 Sep 2021 15:47:59 -0700