After training SMO with this code:
Evaluation eval = new Evaluation(ClassData);
SMO svm = new SMO();
eval.crossValidateModel(svm, ClassData, 10, new Random(1));
System.out.println(eval.toSummaryString("\nResults\n\n",false));
eval.evaluateModel(svm,ClassData);
I get this error:
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "weka.filters.unsupervised.attribute.ReplaceMissingValues.input(weka.core.Instance)" because "this.m_Missing" is null
at weka.classifiers.functions.SMO.distributionForInstance(SMO.java:1443)
at weka.classifiers.evaluation.Evaluation.evaluationForSingleInstance(Evaluation.java:2208)
at weka.classifiers.evaluation.Evaluation.evaluateModelOnceAndRecordPrediction(Evaluation.java:2246)
at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:2122)
at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:689)
at ClassCleanSMO.main(ClassCleanSMO.java:78)
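As you can see, ClassData works fine for cross-validation but not for evaluation. I believe ClassData contains no missing values, only 0 values. Evaluating a different test set gives me the same error. Using the J48 classifier gives me the same error.
traininput2.arff holds 3600 numeric instances with 198 + 1 attributes, converted from a Matlab double array. ClassData is produced by this code: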
import weka.core.Instances;
import weka.core.converters.ArffLoader;
import weka.filters.unsupervised.attribute.NumericToNominal;
import weka.filters.unsupervised.attribute.ClassAssigner;
import weka.filters.supervised.attribute.PartitionMembership;
import weka.filters.supervised.instance.ClassBalancer;
import weka.classifiers.Evaluation;
import java.util.Random;
import weka.classifiers.functions.SMO;

public class ClassCleanSMO {
    public static void main(String[] args) throws Exception{
        // Loader
        ArffLoader loader = new weka.core.converters.ArffLoader();
        loader.setFile(new java.io.File("C:/Users/redmello/Desktop/traininput2.arff"));
        Instances ClassData= loader.getDataSet();
        // FILTERS
        // Nun2Nominal
        NumericToNominal N2N = new weka.filters.unsupervised.attribute.NumericToNominal();
        N2N.setInputFormat(ClassData);
        String[] options = new String[2];
        options[0] = "-R";
        options[1] = "last";
        N2N.setOptions(options);
        ClassData = weka.filters.Filter.useFilter(ClassData, N2N);
        // ClassAssigner
        ClassAssigner MyClass= new weka.filters.unsupervised.attribute.ClassAssigner();
        MyClass.setInputFormat(ClassData);
        String[] options3 = new String[2];
        options3[0] = "-C";
        options3[1] = "last";
        MyClass.setOptions(options3);
        ClassData = weka.filters.Filter.useFilter(ClassData, MyClass);
        // PartitionMembership
        PartitionMembership PartMemb = new weka.filters.supervised.attribute.PartitionMembership();
        PartMemb.setInputFormat(ClassData);
        ClassData = weka.filters.Filter.useFilter(ClassData, PartMemb);
        // ClassBalancer
        ClassBalancer ClassBal = new weka.filters.supervised.instance.ClassBalancer();
        ClassBal.setInputFormat(ClassData);
        String[] options4 = new String[1];
        options4[0] = "10";
        ClassBal.setOptions(options4);
        ClassData = weka.filters.Filter.useFilter(ClassData, ClassBal);
        // DATA MINING
        // crossValidationFoldMaker + J48
        Evaluation eval = new Evaluation(ClassData);
        SMO svm = new SMO();
        eval.crossValidateModel(svm, ClassData, 10, new Random(1));
        System.out.println(eval.toSummaryString("\nResults\n\n",false));
        eval.evaluateModel(svm,ClassData);
    }
}
Any suggestions? I found nothing online.
Thanks!
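The crossValidateModel method of the Evaluation class creates copies of the template classifier you provide, then trains and evaluates those copies.
This means that when you try to evaluate the SMO instance (variable svm) on the full training set, it has never been trained.
I am not sure what you want to achieve with that last evaluation call, since you already have the cross-validation statistics.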
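The copy semantics described above can be illustrated with a small toy sketch that does not depend on Weka. ToyModel, its methods, and crossValidate are hypothetical names made up for this illustration, and the fold splitting is deliberately simplistic. The sketch only assumes that the cross-validation routine trains copies of the template and never the template itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TemplateCopyDemo {

    /** Hypothetical stand-in for a classifier; this is not a Weka class. */
    static class ToyModel {
        // Stays null until train() is called, much like an internal
        // filter field that is only created during training.
        private Double mean;

        void train(List<Double> values) {
            double sum = 0;
            for (double v : values) {
                sum += v;
            }
            mean = sum / values.size();
        }

        boolean isTrained() {
            return mean != null;
        }

        double predict() {
            // Unboxing a null Double throws NullPointerException.
            return mean;
        }

        /** Returns a fresh, untrained model, as a copy of an untrained template would be. */
        ToyModel copy() {
            return new ToyModel();
        }
    }

    /** Trains one copy of the template per fold; the template itself is never trained. */
    static int crossValidate(ToyModel template, List<Double> data, int folds) {
        int trainedCopies = 0;
        for (int f = 0; f < folds; f++) {
            List<Double> train = new ArrayList<>();
            for (int i = 0; i < data.size(); i++) {
                if (i % folds != f) {
                    train.add(data.get(i));
                }
            }
            ToyModel copy = template.copy();
            copy.train(train);
            if (copy.isTrained()) {
                trainedCopies++;
            }
        }
        return trainedCopies;
    }

    public static void main(String[] args) {
        List<Double> data = Arrays.asList(1.0, 2.0, 3.0, 4.0);
        ToyModel template = new ToyModel();

        crossValidate(template, data, 2);
        System.out.println(template.isTrained()); // prints "false"

        // Training the template explicitly makes it usable afterwards.
        template.train(data);
        System.out.println(template.predict());   // prints "2.5"
    }
}
```

In Weka terms, the analogous step would be calling buildClassifier on the classifier instance with the training data before passing it to evaluateModel, assuming you really do need an evaluation on a model trained on the full dataset.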
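Your code also has a few other issues:
The setInputFormat call needs to happen after all options have been set on the filter, because the filter initializes its internal data structures based on the options set at that point. Options set afterwards have no effect.
Filters should be used in conjunction with the weka.classifiers.meta.FilteredClassifier meta-classifier to avoid any leakage of class information when using supervised filters.
If you have multiple filters, use weka.filters.MultiFilter to turn them into a filter pipeline.
I have simplified your code to: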
import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.classifiers.meta.FilteredClassifier;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.MultiFilter;
import weka.filters.supervised.attribute.PartitionMembership;
import weka.filters.supervised.instance.ClassBalancer;
import weka.filters.unsupervised.attribute.NumericToNominal;
import java.util.Random;

public class ClassCleanSMO2 {
    public static void main(String[] args) throws Exception{
        // load data
        Instances data = DataSource.read("C:/Users/redmello/Desktop/traininput2.arff");
        // make class attribute nominal
        // (only filter outside FilteredClassifier, as the Evaluation
        // class needs to be set up with correct class attribute)
        NumericToNominal n2n = new NumericToNominal();
        n2n.setOptions(new String[]{"-R", "last"});
        n2n.setInputFormat(data);
        data = Filter.useFilter(data, n2n);
        data.setClassIndex(data.numAttributes() - 1);
        // PartitionMembership
        PartitionMembership partMemb = new PartitionMembership();
        // ClassBalancer
        ClassBalancer classBal = new ClassBalancer();
        classBal.setOptions(new String[]{"-num-intervals", "10"});
        // combine filters
        MultiFilter multi = new MultiFilter();
        multi.setFilters(new Filter[]{
            partMemb,
            classBal,
        });
        // classifier
        // FilteredClassifier combines filter and base classifier
        SMO svm = new SMO();
        FilteredClassifier fc = new FilteredClassifier();
        fc.setClassifier(svm);
        fc.setFilter(multi);
        // cross-validate on data
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(fc, data, 10, new Random(1));
        System.out.println(eval.toSummaryString("\nResults\n\n",false));
    }
}