Running R t.test in a nested loop

Problem description

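I am new to R Studio. For a class, I have cleaned up the 2016 US Census election data set and would like to run a series of t-tests on it. Some details about the data set: the data is coded, with the values 1 to 4 representing citizenship status. I want to see whether various factors affect the likelihood of voting (1 = yes, 2 = no).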

Here is the code:

factor <- c("Age", "Fathers_country_of_birth", "Mothers_country_of_birth","Highest_level_of_School_completed", "Country_of_birth")
citizen <- c("NATIVE, BORN IN THE UNITED STATES", "NATIVE, BORN IN PUERTO RICO OR OTHER U.S. ISLAND AREAS", "NATIVE, BORN ABROAD OF AMERICAN PARENT OR PARENTS", "FOREIGN BORN, U.S. CITIZEN BY NATURALIZATION")

for (f in factor) {
  print(f)
  for (i in 1:4) {
    print(paste("Citizenship is", citizen[i]))
    query <- paste("select * from result2 where Citizenship = ", i)
    sample <- sqldf(query)
    print(t.test(f ~ Vote_in_Election, data = sample, var.equal = FALSE))
  }
}

It throws a "variable lengths differ" error:

> [1] "Age"
> [1] "Citizenship is NATIVE, BORN IN THE UNITED STATES"
> Error in model.frame.default(formula = f ~ Vote_in_Election, data = sample) :
>   variable lengths differ (found for 'Vote_in_Election')

If I take out the outer loop, it runs just fine, but then of course I have to type in the values of 'factor' one at a time.

I am running R Studio version 1.1.463 on Windows 10; R is 3.5.2.

Because the number of data rows differs as I iterate over i, I tried setting paired = FALSE, but it still yells at me.

I have searched SO but have not found a solution. What am I missing?

r nested-loops t-test
1 Answer

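Inside the loop, f is a character string, so f ~ Vote_in_Election refers to a variable literally named f (a length-one character vector) rather than to the column it names, which is why the lengths differ. To build the formula dynamically, paste it together as a string and convert that string with as.formula: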

t.test(as.formula(paste(f, "~ Vote_in_Election")), data=sample, var.equal = FALSE) 
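
To see the difference without any data, the two formulas can be inspected with all.vars. This is only a small illustration; the variable names `literal` and `built` are made up for the example.

```r
f <- "Age"

# Written literally, the left-hand side is the symbol `f`, not the string it holds.
literal <- f ~ Vote_in_Election
all.vars(literal)   # "f" "Vote_in_Election"

# Pasting first and converting afterwards substitutes the value of `f`.
built <- as.formula(paste(f, "~ Vote_in_Election"))
all.vars(built)     # "Age" "Vote_in_Election"
```
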

Or use reformulate:

t.test(reformulate("Vote_in_Election", response=f), data=sample, var.equal = FALSE)
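
Putting it together, the nested loop might look roughly like the sketch below. It makes several assumptions: the `result2` data frame here is synthetic (random values, only two of the predictor columns), the sqldf query is swapped for plain base R subsetting so the example runs on its own, and the names `predictors`, `subset_i`, `fml` and `results` are invented for illustration. Renaming `factor` and `sample` also avoids masking the base R functions of the same name.

```r
# Synthetic stand-in for the asker's `result2` table (all values are made up).
set.seed(42)
n <- 400
result2 <- data.frame(
  Citizenship      = rep(1:4, each = n / 4),
  Vote_in_Election = rep(c(1, 2), times = n / 2),  # 1 = yes, 2 = no
  Age              = round(runif(n, 18, 90)),
  Highest_level_of_School_completed = sample(31:46, n, replace = TRUE)
)

predictors <- c("Age", "Highest_level_of_School_completed")

results <- list()
for (f in predictors) {
  for (i in 1:4) {
    # Base R subsetting in place of the sqldf query from the question
    subset_i <- result2[result2$Citizenship == i, ]
    # Build `<f> ~ Vote_in_Election` from the column name held in f
    fml <- reformulate("Vote_in_Election", response = f)
    results[[paste(f, i, sep = "_")]] <-
      t.test(fml, data = subset_i, var.equal = FALSE)
  }
}

# One p-value per predictor/citizenship combination
p_values <- sapply(results, function(res) res$p.value)
print(p_values)
```

Storing the htest objects in a list rather than only printing them makes it easy to pull out p-values or confidence intervals afterwards.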