Creating a model matrix when the response variable is missing, so that matrix multiplication reproduces predict()

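I want to create a model matrix for a test dataset that has no response variable, such that predictions built by matrix multiplication exactly replicate the result of calling predict() on the model. See the code below for an example.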

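I have code that does this (again, see the example below), but it requires me to create a placeholder response variable in the test data. That doesn't seem very clean, and I'd like to know whether there is a way to make the code work without this workaround.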

# Make data, fit model
set.seed(1); df_train = data.frame(y = rnorm(10), x = rnorm(10), z = rnorm(10))
set.seed(2); df_test = data.frame(x = rnorm(10), z = rnorm(10))
fit = lm(y ~ poly(x) + poly(z), data = df_train)

# Make model matrices. The call on the test data errors because 'y' isn't found
mm_train = model.matrix(terms(fit), df_train)
mm_test = model.matrix(terms(fit), df_test) #"Error in eval(predvars, data, env) : object 'y' not found"

# Make fake y variable for test data then build model matrix. I want to know if there's a less hacky way to do this
df_test$y = 1
mm_test = model.matrix(terms(fit), df_test) 

# Check that predict() and matrix multiplication give identical results on the test data.
# NB: this is not the case if the model matrix is constructed using (e.g.)
# mm_test = model.matrix(formula(fit)[-2], df_test), for the reason outlined here:
# https://stackoverflow.com/questions/59462820/why-are-predict-lm-and-matrix-multiplication-giving-different-predictions
preds_1 = round(predict(fit, df_test), 5) 
preds_2 = round(mm_test %*% fit$coefficients, 5)
all(preds_1 == preds_2)  #TRUE
Tags: r, predict, model.matrix

1 Answer

Building on this question, you can extract the formula from the model, set the response to NULL, and then pass it to model.matrix:

mm_test = model.matrix(update(formula(fit), NULL ~ .), data = df_test)

It's still not a "built-in" feature, but at least it's a cleaner one-liner.
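One caveat is worth checking with this model. A formula rebuilt with update() or formula() no longer carries the "predvars" attribute of the fitted terms, so poly() would be re-evaluated on df_test rather than with the training-data coefficients. This is the same issue the question links to. As a sketch of an alternative, delete.response() can be applied to terms(fit) directly, which as far as I can tell is also what predict.lm does internally. The names tt, preds_predict and preds_matmul below are only illustrative.

```r
# Same data and model as in the question (df_test has no 'y' column)
set.seed(1); df_train = data.frame(y = rnorm(10), x = rnorm(10), z = rnorm(10))
set.seed(2); df_test = data.frame(x = rnorm(10), z = rnorm(10))
fit = lm(y ~ poly(x) + poly(z), data = df_train)

# Drop the response from the fitted terms object. The "predvars" attribute,
# which stores the poly() coefficients estimated on df_train, is retained.
tt = delete.response(terms(fit))
mm_test = model.matrix(tt, df_test)

# Compare with predict(); unname() and drop() strip the differing names/dims
preds_predict = unname(predict(fit, df_test))
preds_matmul = unname(drop(mm_test %*% coef(fit)))
print(all.equal(preds_predict, preds_matmul))
```

For a model with factor predictors, it may also be necessary to pass xlev = fit$xlevels and contrasts.arg = fit$contrasts to model.matrix so that the columns line up with the fitted coefficients.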
