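I have a homework problem that asks me to use simulation in R to test the coverage probability of a confidence interval (found as part of a previous question).
My code attempts to generate 1000 random samples (with replacement) from my sample data, effectively treating my original sample as my new population. The random samples are the same size as my population. I then want to find the 95% confidence interval for each random sample and see how many contain the 'true mean' (given in the problem statement) versus the 'population mean' (the mean of my original sample).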
set.seed(1987)
iq <- rnorm(1000,91.08065,14.40393)
pop_mean <- mean(iq) #the mean of my sample is now considered the population mean
true_mean <- 100 #the true mean is 100, specified in question
sampSEs <- numeric() #create an empty vector to put the sample SEs in
sampMeans <- numeric() #create an empty vector to put the sample means in
get_conf_interval <- function(sample_measurements) {
  iqSE_samp <- 15/sqrt(length(iq)) #find the SE using an sd of 15
  iqMean_samp <- mean(sample_measurements) #take the mean of each sample
  upper <- iqMean_samp + 1.96*iqSE_samp #find the upper bound for a 95% CI
  lower <- iqMean_samp - 1.96*iqSE_samp #find the lower bound for a 95% CI
  list(lower=lower, upper=upper)
}
interval_contains_true_mean <- function(interval) { #check if the interval contains the true mean
  true_mean >= interval$lower && true_mean <= interval$upper
}
interval_contains_population_mean <- function(interval) { #check if the interval contains the population mean
  pop_mean >= interval$lower && pop_mean <= interval$upper
}
samples <- replicate(1000, sample(iq, size = 124, replace = T)) #take 1000 samples with replacement from my iq data
for(i in 1:1000) { #for each sample taken
  sampMeans[i] <- mean(samples[i]) #put the mean of it in the vector created previously
  sampSEs[i] <- 15/sqrt(length(iq)) #put the SE in a vector... these are all the same bc not finding the sample sd
}
intervals <- apply(samples, FUN=get_conf_interval, MARGIN=2) #call the function to find the confidence intervals
sampMeans #just check if worked
#sampSEs #ditto
percent_intervals_with_true_mean <- mean(sapply(intervals, FUN=interval_contains_true_mean)) * 100
cat("% Intervals Containing True Mean: ", percent_intervals_with_true_mean, "%\n")
percent_intervals_with_pop_mean <- mean(sapply(intervals, FUN=interval_contains_population_mean)) * 100
cat("% Intervals Containing Population Mean: ", percent_intervals_with_pop_mean, "%")
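This code reports that 0% of the confidence intervals from my samples contain the true mean. That is not correct; I have looked at the sample means, and several of them are equal to the true mean.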
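1.- I have two solutions. The first is simply to add the missing comma, 'mean(samples[, i])', so that the whole column (one sample) is averaged rather than a single element. I use a small example (m = 10, samples of size 4) and only check the population mean:

set.seed(1987)
sigma_M = 14.40393
mu_M = 91.08065
m = 10
iq <- rnorm(m, mu_M, sigma_M)
pop_mean <- mean(iq) # the mean of my sample is now considered the population mean
samples <- replicate(m, sample(iq, size = 4, replace = T)) # take m samples with replacement from my iq data
sampSEs <- numeric() # create an empty vector to put the sample SEs in
sampMeans <- numeric() # create an empty vector to put the sample means in
for(i in 1:m) { # for each sample
  sampMeans[i] <- mean(samples[, i]) # put the mean of column i in the vector created previously
  sampSEs[i] <- 15 / sqrt(length(iq)) # put the SE in a vector... these are all the same bc not finding the sample sd
}
get_conf_interval <- function(sample_measurements) {
  iqSE_samp <- 15 / sqrt(length(iq)) # find the SE using an sd of 15
  iqMean_samp <- mean(sample_measurements) # take the mean of each sample
  upper <- iqMean_samp + 1.96 * iqSE_samp # find the upper bound for a 95% CI
  lower <- iqMean_samp - 1.96 * iqSE_samp # find the lower bound for a 95% CI
  list(lower = lower, upper = upper)
}
interval_contains_population_mean <- function(interval) { # check if the interval contains the population mean
  pop_mean >= interval$lower && pop_mean <= interval$upper
}
interval <- apply(samples, FUN = get_conf_interval, MARGIN = 2) # call the function to find the confidence intervals
sampMeans # just check if it worked
percent_intervals_with_pop_mean <- mean(sapply(interval, FUN = interval_contains_population_mean)) * 100
cat("% Intervals Containing Population Mean: ", percent_intervals_with_pop_mean, "%")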
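To see why the comma matters, here is a minimal sketch on a made-up 3 x 2 matrix (the name `toy` and its values are purely illustrative, not taken from the question's data). Without the comma, R treats the index as a linear, column-major position and returns one element; with the comma, it returns the whole column.

```r
# Hypothetical toy matrix: each column plays the role of one resample
toy <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3)

single_element <- toy[2]    # linear index: second element in column-major order
second_column  <- toy[, 2]  # matrix index: every row of column 2

mean(single_element)  # mean of one number is just that number
mean(second_column)   # mean of the three values in column 2
```

In the question's loop, `mean(samples[i])` therefore stores individual IQ values rather than sample means, which would explain why some entries of `sampMeans` looked like they hit the true mean.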
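2.- The second solution is to rewrite the code in vectorized form, although I only do it for 'pop_mean', the population mean (and compute the standard error once):

set.seed(1987)
sigma_M = 14.40393
mu_M = 91.08065
m = 10
iq <- rnorm(m, mu_M, sigma_M)
pop_mean <- mean(iq) # the mean of my sample is now considered the population mean
samples <- replicate(m, sample(iq, size = 4, replace = T)) # take m samples with replacement from my iq data
sampMeans = apply(samples, 2, mean) # mean of each column, i.e. of each sample
iqSE_samp <- 15 / sqrt(length(iq)) # find the SE using an sd of 15
iqMean_samp <- sampMeans # the mean of each sample
upper <- iqMean_samp + 1.96 * iqSE_samp # find the upper bound for a 95% CI
lower <- iqMean_samp - 1.96 * iqSE_samp # find the lower bound for a 95% CI
interval = cbind(lower, upper) # one row per sample: lower and upper bound
# findInterval(x = pop_mean, vec = c(lower, upper)) returns 1 when lower <= pop_mean < upper
percent_intervals_with_pop_mean = mean(apply(interval, 1, findInterval, x = pop_mean) == 1) * 100
cat("% Intervals Containing Population Mean: ", percent_intervals_with_pop_mean, "%")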
The final result for me is 80%.
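The vectorized approach can also be wrapped in a small helper so that the same check works for either target mean. This is only a sketch: `coverage_pct` and its arguments are hypothetical names introduced here, and the standard error is passed in explicitly so the caller decides how it is computed (the code above uses 15 / sqrt(length(iq))).

```r
# Hypothetical helper: percentage of z-intervals (mean +/- z * se) that contain `target`.
# `samples` is a matrix with one sample per column, as returned by replicate().
coverage_pct <- function(samples, target, se, z = 1.96) {
  means <- colMeans(samples)  # one mean per column (per sample)
  lower <- means - z * se
  upper <- means + z * se
  mean(lower <= target & target <= upper) * 100
}

# Usage with the same small setup as in the answer (no output shown, it depends on the draws)
set.seed(1987)
iq <- rnorm(10, 91.08065, 14.40393)
samples <- replicate(10, sample(iq, size = 4, replace = TRUE))
se <- 15 / sqrt(length(iq))  # same SE formula as in the code above
coverage_pct(samples, target = mean(iq), se = se)  # coverage of the population mean
coverage_pct(samples, target = 100, se = se)       # coverage of the true mean
```

Note that this check uses a closed interval, whereas findInterval treats the upper bound as open; for continuous data the two should agree in practice.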