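I am running an experiment to predict energy demand using three machine learning methods: ANN-MLP, SVM with an RBF kernel, and k-NN. I am using the Execute R Script module to run the R code. My question is how to output the trained model so that it can be used to make predictions on another dataset. Basically, I have two datasets: a June dataset and a July dataset. The predictive model is built using the June dataset, and I want to test that model on the July dataset.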
predictor <- maml.mapInputPort(1)
datafull <- maml.mapInputPort(2)
library(caret)
#data splitting
datasplit <- createDataPartition(y = predictor$demand, p = 0.7, list = FALSE)
datatrain <- predictor[datasplit,]
datatest <- predictor[-datasplit,]
#repeated CV: 10-fold cross-validation, repeated 5 times
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
knnFit <- train(demand ~ ., data = datatrain, method = "knn", trControl = ctrl, preProcess = c("center", "scale"), tuneLength = 20)
knnFit
plot(knnFit)
#Prediction
datapredict <- predict(knnFit, predictor)
plot(datafull$cdate, datafull$demand, xlab = "Time", ylab = "Demand", col = "#0441d9")
# lines() draws on the existing plot; axis labels are already set by plot()
lines(datafull$cdate, datafull$demand, col = "#0441d9")
lines(datafull$cdate, datapredict, col = "#cc0000")
Right now, the code only outputs the original dataset with a new column containing the predicted values.
datafull$data.predict <- datapredict
str(datafull)
# Select data.frame to be sent to the output Dataset port
maml.mapOutputPort('datafull');
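You can convert the experiment into a predictive experiment. Then save the trained model, and you can use the predictive experiment to score any data you need.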
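If the model has to stay inside Execute R Script modules, one possible workaround is to serialize it into a data frame, since the output port only accepts a data frame. The following is a minimal sketch of that idea, not a confirmed recipe for this experiment. The names `june`, `july`, `fit`, `model_df`, and `payload` are hypothetical, the data is synthetic, and a base R `lm` model stands in for `knnFit` so the example runs without caret. The sketch assumes a caret `train` object can be packed the same way, because `serialize` works on arbitrary R objects.

```r
# Synthetic stand-in for the June training data (hypothetical values);
# in the experiment this would come from maml.mapInputPort(1).
set.seed(42)
june <- data.frame(temp = runif(50, 15, 35))
june$demand <- 100 + 3 * june$temp + rnorm(50)

# Stand-in model. Assumption: knnFit from caret could be used here instead.
fit <- lm(demand ~ temp, data = june)

# Pack the model into a one-column data frame of byte values (0-255).
# In the training module this data frame would be sent out with
# maml.mapOutputPort("model_df").
model_df <- data.frame(payload = as.integer(serialize(fit, connection = NULL)))

# In a second module, read the data frame back and rebuild the model.
restored <- unserialize(as.raw(model_df$payload))

# Score the (hypothetical) July data with the restored model.
july <- data.frame(temp = c(20, 25, 30))
july_pred <- predict(restored, newdata = july)
print(july_pred)
```

The second module would also need `library(caret)` loaded before calling `predict` on a restored caret model, and the July data must contain the same predictor columns that were used for training.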