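I have 284 data sets, each with 8 inputs and 8 outputs. For each set, I want to fit every possible linear regression that uses at least 3 (and at most 8) of the points, and find the one that gives the highest R-squared value. I'm new to R, so I've been fiddling with the data to get it into a workable format, but I'm struggling to write a loop that gives me the information I need. I'd like a table with the highest R-squared value for each data set (I also tried grouping them, but I confused myself).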
structure(list(Compound = c("Metaldehyde", "Ethiolate", "Dichlorvos",
"EPTC", "Dichlorobenzonitrile, 2,6- (Dichlobenil)", "Biphenyl"
), `0.001` = c("0.525801378811413", "1.09086320757922", "0.757598798609689",
"1.33007381591862", "0.809505580520483", "1.00786299782037"),
`0.002` = c("0.81382522294068", "2.21908840206009", "1.42422433130107",
"2.51320619666535", "1.45846969693775", "1.62715080909114"
), `0.008` = c("3.82235874643909", "8.73143928047307", "5.45664826653415",
"9.34670555134185", "6.07409045418359", "7.12064468663931"
), `0.01` = c("5.37597220248941", "10.9630450790647", "6.94974722612339",
"11.1622834351669", "7.6513486235631", "9.30809628488874"
), `0.02` = c("11.0397242837903", "21.7336744487985", "13.8840844920987",
"22.6791609253705", "14.5124382242381", "18.6193248982877"
), `0.05` = c("29.6510452842719", "54.5684493286241", "35.8144245415805",
"54.1600611247882", "37.6386671274583", "46.441493535114"
), `0.1` = c("63.976241274172", "108.825303544669", "73.6698676717224",
"109.005619728961", "75.2699800944843", "91.4888857826015"
), `0.2` = c("138.727387333375", "211.326086519903", "154.804651104695",
"216.030074774139", "145.529789433938", "181.182166279407"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
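You can use split on the Compound variable together with lapply. split breaks the large data frame into a list with one element per Compound, and lapply then iterates your model over that list.
If this is your data: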
n <- 1e6  # number of rows; n is not defined in the original, so this value is an assumption
df <- data.frame(Compound = sample(LETTERS[1:3], n, replace = TRUE),
                 StdConc = sample(c("Apples", "Oranges"), n, replace = TRUE),
                 CalcConc = runif(n))
You can use:
lapply(split(df, df$Compound), \(x) lm(CalcConc ~ StdConc, data = x))
Output:
$A
Call:
lm(formula = CalcConc ~ StdConc, data = x)
Coefficients:
(Intercept) StdConcOranges
0.4995875 -0.0009936
$B
Call:
lm(formula = CalcConc ~ StdConc, data = x)
Coefficients:
(Intercept) StdConcOranges
4.997e-01 1.657e-05
$C
Call:
lm(formula = CalcConc ~ StdConc, data = x)
Coefficients:
(Intercept) StdConcOranges
0.4996074 -0.0001163
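To get from this to the table of highest R-squared values asked for in the question, the same split/lapply pattern could be combined with combn to enumerate the point subsets. What follows is only a rough sketch under some assumptions: the wide layout mirrors the question's data (two of its compounds, with responses rounded to 4 significant digits), the column names StdConc and CalcConc are borrowed from the example above, and best_subset is a hypothetical helper, not an existing function.

```r
# Two compounds in the question's wide layout (character columns,
# responses rounded to 4 significant digits for brevity).
wide <- data.frame(
  Compound = c("Metaldehyde", "Ethiolate"),
  `0.001` = c("0.5258", "1.091"),
  `0.002` = c("0.8138", "2.219"),
  `0.008` = c("3.822", "8.731"),
  `0.01`  = c("5.376", "10.96"),
  `0.02`  = c("11.04", "21.73"),
  `0.05`  = c("29.65", "54.57"),
  `0.1`   = c("63.98", "108.8"),
  `0.2`   = c("138.7", "211.3"),
  check.names = FALSE
)

# Reshape to long format: one row per (Compound, concentration) pair.
conc_cols <- setdiff(names(wide), "Compound")
long <- data.frame(
  Compound = rep(wide$Compound, times = length(conc_cols)),
  StdConc  = rep(as.numeric(conc_cols), each = nrow(wide)),
  CalcConc = as.numeric(unlist(wide[conc_cols], use.names = FALSE))
)

# Hypothetical helper: try every subset of at least `min_points` rows
# and keep the one whose simple linear fit has the highest R-squared.
best_subset <- function(x, min_points = 3) {
  n <- nrow(x)
  best <- list(r2 = -Inf, rows = NULL)
  for (k in min_points:n) {
    for (idx in combn(n, k, simplify = FALSE)) {
      r2 <- summary(lm(CalcConc ~ StdConc, data = x[idx, ]))$r.squared
      if (r2 > best$r2) best <- list(r2 = r2, rows = idx)
    }
  }
  data.frame(
    Compound     = x$Compound[1],
    n_points     = length(best$rows),
    StdConc_used = paste(x$StdConc[best$rows], collapse = ", "),
    r_squared    = best$r2
  )
}

# One row per compound with its best-scoring subset.
result <- do.call(rbind, lapply(split(long, long$Compound), best_subset))
rownames(result) <- NULL
print(result)
```

With 8 points there are 219 subsets of size 3 to 8 per compound, so brute force should stay manageable for 284 compounds. Because fewer points are easier to fit, the winning subset will often be one of the 3-point ones, so it may be worth raising min_points or also comparing the R-squared of the full 8-point fit.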