Smartly extracting linear model coefficients from a nested list

Question

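I'm a long-time R user who usually hates working with nested lists because they seem tricky... but I'm not sure I can avoid them here. In this case I can generate the output I want, but I don't know how to unpack the list in a smart way.

I'm trying to fit n linear models from a dataset, one for each level of a class. After running all of the models, I want a simple table with the slope, intercept, and class for each level. The example below shows what I'm after: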

# Dummy data
d <- data.frame(x=rnorm(50, 10, 1), 
                y=rnorm(50, 0, 2), 
                class=c(rep('A',10),rep('B',10),rep('C',10),rep('D',10),rep('E',10)))

# Split the data by grouping variable
d.s <-  split(d, d$class)

# Create a linear model from y~x in each class
coeffs <- function(df) {
  # Return the coefficient vector explicitly rather than via an invisible assignment
  coef(lm(y ~ x, data = df))
}

m.s <- lapply(d.s, coeffs)

# How do I neatly get a data frame that looks like below out of m.s??

wanted <- data.frame(class=as.character(), slope=as.numeric(), intercept=as.numeric())

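Please forgive my dislike of, and inexperience with, nested lists! After many lines of unlisting and picking apart row names I can get what I want, but there has to be a better way. I'm trying to change...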

Answer 1

You could use sapply in the first place and transpose the result:

> t(sapply(d.s, coeffs))
  (Intercept)          x
A   9.4390072 -0.8430022
B  -5.2018384  0.4988027
C  -7.9531678  0.8298487
D  -0.9505984  0.1192621
E  -1.8155237  0.1522270

> data.frame(sort(unique(d$class)), t(sapply(d.s, coeffs))) |> 
+   setNames(c('class', 'intercept', 'slope'))
  class  intercept      slope
A     A  9.4390072 -0.8430022
B     B -5.2018384  0.4988027
C     C -7.9531678  0.8298487
D     D -0.9505984  0.1192621
E     E -1.8155237  0.1522270

Or, if you rely on lapply, rbind the result.

> data.frame(names(m.s), do.call('rbind', m.s)) |> 
+   setNames(c('class', 'intercept', 'slope'))
  class  intercept      slope
A     A  9.4390072 -0.8430022
B     B -5.2018384  0.4988027
C     C -7.9531678  0.8298487
D     D -0.9505984  0.1192621
E     E -1.8155237  0.1522270

If you really want the slope before the intercept, do this:

> data.frame(class=names(m.s), do.call('rbind', m.s)[, 2:1]) |> 
+   setNames(c('class', 'slope', 'intercept'))
  class      slope  intercept
A     A -0.8430022  9.4390072
B     B  0.4988027 -5.2018384
C     C  0.8298487 -7.9531678
D     D  0.1192621 -0.9505984
E     E  0.1522270 -1.8155237
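The steps in this answer can be collected into one small function. The following is only a sketch: coef_table is a hypothetical helper name, the model formula y ~ x is hard-coded as in the question, and the seed is an assumption added so the example is reproducible.

```r
# Illustrative data in the shape used in the question (seed is an assumption)
set.seed(123)
d <- data.frame(x = rnorm(50, 10, 1),
                y = rnorm(50, 0, 2),
                class = rep(c("A", "B", "C", "D", "E"), each = 10))

# coef_table() is a hypothetical helper, not part of any package.
# It fits y ~ x within each group and returns one row per group.
coef_table <- function(data, group) {
  fits <- lapply(split(data, data[[group]]),
                 function(df) coef(lm(y ~ x, data = df)))
  m <- do.call(rbind, fits)  # matrix with group names as row names
  data.frame(class     = rownames(m),
             slope     = unname(m[, "x"]),
             intercept = unname(m[, "(Intercept)"]),
             row.names = NULL)  # use plain integer row names
}

res <- coef_table(d, "class")
res
```

Selecting the matrix columns by name ("x", "(Intercept)") rather than by position keeps the slope and intercept from being swapped if the column order ever changes.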

Answer 2

1) lmList. The nlme package comes preinstalled with R.

library(nlme)

fm <- lmList(y ~ x | class, d)
coef(fm)
##   (Intercept)          x
## A   -1.195093  0.1626329
## B   -1.260145  0.1475529
## C    6.971637 -0.8038765
## D    5.533810 -0.4754503
## E    4.297987 -0.3426785

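2) lm. lm itself can compute these coefficients given a suitable formula. Here class / x expands to class + class:x, and + 0 drops the global intercept, so every class gets its own intercept and its own slope.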

lm(y ~ class / x + 0, d) |> coef() |> matrix(ncol = 2)
##           [,1]       [,2]
## [1,] -1.195093  0.1626329
## [2,] -1.260145  0.1475529
## [3,]  6.971637 -0.8038765
## [4,]  5.533810 -0.4754503
## [5,]  4.297987 -0.3426785
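The matrix from approach 2 carries no labels. As a sketch, and assuming the same input as in the note at the end of this answer, the rows and columns can be named by relying on the order of the coefficients: the five per-class intercepts come first, followed by the five per-class slopes. The names fit, lev, and m below are illustrative only.

```r
# Same input as in the note at the end of this answer
set.seed(123)
d <- data.frame(x = rnorm(50, 10, 1),
                y = rnorm(50, 0, 2),
                class = rep(c("A", "B", "C", "D", "E"), each = 10))

fit <- lm(y ~ class / x + 0, data = d)

# lm() converts the character column to a factor; reproduce its level order
lev <- levels(factor(d$class))

# matrix() fills column by column: intercepts first, then slopes
m <- matrix(coef(fit), ncol = 2,
            dimnames = list(lev, c("intercept", "slope")))
m
```

Each row should agree with a separate lm(y ~ x) fit on that class alone, up to floating-point tolerance.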

3) simplify2array. If it is important to start from m.s, use simplify2array, or transpose its result if you prefer the transposed representation.

simplify2array(m.s)
##                      A          B          C          D          E
## (Intercept) -1.1950932 -1.2601447  6.9716368  5.5338101  4.2979871
## x            0.1626329  0.1475529 -0.8038765 -0.4754503 -0.3426785

Note

The input used:

set.seed(123)

d <- data.frame(x=rnorm(50, 10, 1), 
                y=rnorm(50, 0, 2), 
                class=c(rep('A',10),rep('B',10),rep('C',10),rep('D',10),rep('E',10)))
