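In the attached data frame, I have 10 subjects (5 male and 5 female). Each subject has three analytes (A, B, C), and each analyte has a value for each of three visits (visit = 1, 2, 3). I want to run a two-group comparison of males versus females for each analyte and each visit. I used a nested loop, with i over the analytes and j over the visits. The desired output format is attached (9 rows and 7 columns). I expected nine rows, but I only get three. I think the output of the i loop is not being stored correctly, but I am not sure how to include i properly. Any suggestions? Thank you very much!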
df1 = data.frame(id = c(1:10), gender = c(rep(c("F","M"),5)))
df2 = data.frame(id = c(1:10), analyte = c(rep(c("A","B","C"), 10)))
df3 = data.frame(id = rep((1:10),each=3), visit = rep(c("day1","day2","day3"),10))
set.seed(123)
df4 = data.frame(id = rep((1:10),each=9), val=rnorm(n = 90, mean = 0, sd = 1))
df5 = Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "id", all.x = TRUE), list(df1,df2,df3))
df = cbind(df5,df4)[,-5]
mk1=unique(df$analyte)
mk2=unique(df$visit)
out=matrix(NA, ncol=7, nrow=9)
for(i in 1:length(mk1)){
  for (j in 1:length(mk2)){
    dd = df[as.character(df$analyte)==mk1[i]&as.character(df$visit)==mk2[j],]
    x = as.vector(dd$val[dd$gender=="F"])
    y = as.vector(dd$val[dd$gender=="M"])
    med1=as.numeric(quantile(x, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
    med2=as.numeric(quantile(y, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
    ci=wilcox.test(x, y, conf.int = TRUE, exact=FALSE)$conf.int
    out[j,] = c(mk1[i], mk2[j],length(x),length(y),
                med1, med2, wilcox.test(x, y, conf.int = TRUE,
                                        exact=FALSE)$p.value)
  }
}
colnames(out)=c("Analyte", "VISIT", "Female (N)", "Male (N)",
"Median of Female", "Median of Male", "P_wilxon")
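Your immediate problem is that you keep reassigning to the same rows of the output matrix. Because j only runs from 1 to 3 and never reaches 9, the assignment below overwrites rows 1 to 3 on every pass of the outer loop, so only the three rows from the last analyte are kept and rows 4 to 9 stay NA: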
out[j,] <- ...
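If you want to keep the nested loops, one minimal fix is to compute the row index from both i and j. The sketch below is an illustration under assumptions, not your exact code: toy is a simplified stand-in for your df, k is a hypothetical name for the row index, and median() replaces the quantile() call for brevity.

```r
# Hypothetical stand-in for the question's df:
# 10 subjects x 3 analytes x 3 visits, gender alternating by id
set.seed(123)
toy <- expand.grid(id = 1:10, analyte = c("A", "B", "C"),
                   visit = c("day1", "day2", "day3"),
                   stringsAsFactors = FALSE)
toy$gender <- ifelse(toy$id %% 2 == 1, "F", "M")
toy$val <- rnorm(nrow(toy))

mk1 <- unique(toy$analyte)
mk2 <- unique(toy$visit)
# Size the matrix from the data instead of hard-coding 9 rows
out <- matrix(NA, ncol = 7, nrow = length(mk1) * length(mk2))

for (i in seq_along(mk1)) {
  for (j in seq_along(mk2)) {
    dd <- toy[toy$analyte == mk1[i] & toy$visit == mk2[j], ]
    x <- dd$val[dd$gender == "F"]
    y <- dd$val[dd$gender == "M"]
    wx <- wilcox.test(x, y, exact = FALSE)
    # The row index depends on both i and j, so every
    # analyte/visit pair is written to its own row
    k <- (i - 1) * length(mk2) + j
    out[k, ] <- c(mk1[i], mk2[j], length(x), length(y),
                  median(x), median(y), wx$p.value)
  }
}
```

Note that mixing strings and numbers in c() coerces the whole matrix to character, which is one more reason to prefer building data frames.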
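However, rather than using nested for loops that iteratively assign output to a predefined matrix with hard-coded dimensions, consider a more dynamic approach. Use by() to subset the data frame by analyte and visit, and pass each subset into the required operation. Finally, row-bind the resulting list of data frames into the final object: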
run_comparison <- function(dd) {
  x <- as.vector(dd$val[dd$gender=="F"])
  y <- as.vector(dd$val[dd$gender=="M"])
  med1 <- as.numeric(quantile(x, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
  med2 <- as.numeric(quantile(y, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
  wx <- wilcox.test(x, y, conf.int = TRUE, exact=FALSE)
  data.frame(ANALYTE = dd$analyte[[1]], Visit = dd$visit[[1]],
             Female_N = length(x), Male_N = length(y),
             Female_Median = med1, Male_Median= med2,
             P_Wilcox = wx$p.value)
}
df_list <- by(df, df[c("analyte", "visit")], run_comparison)
final_df <- do.call(rbind, df_list)
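To see how by() and do.call(rbind, ...) fit together, here is a small sketch on made-up data. The names toy, pieces, and res are hypothetical, and the summary function is a simplified stand-in for run_comparison, so treat it as an illustration rather than a drop-in replacement.

```r
# Hypothetical toy data: 2 analytes x 2 visits, 4 values per cell
toy <- expand.grid(obs = 1:4, analyte = c("A", "B"),
                   visit = c("day1", "day2"),
                   stringsAsFactors = FALSE)
toy$val <- seq_len(nrow(toy))

# by() calls the function once per analyte/visit combination,
# passing the matching rows as a data frame
pieces <- by(toy, toy[c("analyte", "visit")], function(dd) {
  data.frame(analyte = dd$analyte[[1]], visit = dd$visit[[1]],
             n = nrow(dd), median_val = median(dd$val))
})

# Row-bind the per-group data frames into a single data frame
res <- do.call(rbind, pieces)
rownames(res) <- NULL
print(res)
```

Because the number of rows follows from the groups present in the data, nothing needs to be changed if an analyte or a visit is added later.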