I'm trying to use predict.lm inside a data.table and I'm getting a strange error. The first part, the data preparation, runs fine.
# (1) Load data
library(data.table)
homeprice = fread('https://vincentarelbundock.github.io/Rdatasets/csv/mosaicData/SaratogaHouses.csv')
# (2) Data Prep: Convert character variables into factors.
myvars = c('heating','fuel','sewer','waterfront','newConstruction','centralAir')
for (var in myvars) {
  homeprice[, (var) := as.factor(get(var))]  # (var) assigns to the column named by var
}
# (3) Split data into training and test sets
# install.packages('caTools')  # run once if caTools is not installed yet
library(caTools)
homeprice[, split := sample.split(V1, SplitRatio = 0.5)]
train = homeprice[split == TRUE,] # Create training data
test = homeprice[split == FALSE,] # Create test data
# Train OLS model with training data.
reg1 = lm(price ~ . - V1 - split, train)
summary(reg1) # Displays the results from "reg1"
OK, here is the part that gives me trouble:
# In-sample prediction: predict prices for the training set
z = predict(reg1, newdata = train)
train[, price_pred := z] # Works perfectly
train[, price_pred := predict(reg1, newdata = train)] # Gives error
Please advise.
I don't know what causes the error, but using dplyr
library(dplyr)
train <- train %>%
  mutate(price_pred = predict(reg1, newdata = train))
seems to give the same result as the two-step version that works.
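If you would rather stay in data.table, one option is data.table's set(), which is an ordinary function call, so predict() is evaluated before the assignment rather than inside `[.data.table`. This is only a minimal sketch: the built-in mtcars data, the model `fit`, and the column name `mpg_pred` are stand-ins I made up for illustration, and I have not confirmed that evaluation inside `[.data.table` is what triggers the error in the question. It should behave like the two-step `z = predict(...)` version that already works.

```r
library(data.table)

# Illustrative data only: mtcars stands in for the `train` table.
dt <- as.data.table(mtcars)
fit <- lm(mpg ~ wt + hp, data = dt)

# predict() runs in the calling environment; set() then adds the
# new column to dt by reference.
set(dt, j = "mpg_pred", value = predict(fit, newdata = dt))

head(dt[, .(mpg, mpg_pred)])
```

With the objects from the question, the equivalent call would be set(train, j = "price_pred", value = predict(reg1, newdata = train)).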