Deeplearning4j - Creating a model for the strings employee, worker, co-worker, etc.

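I want to build and train a model with deeplearning4j. The model should understand all employee-related terms; for example, it should treat words such as "employee", "worker", and "co-worker" as the same word.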

I used the code snippet below to create the dataset and obtain an iterator:

    DataSet allData;
    try (RecordReader recordReader = new CSVRecordReader(0, ',')) {
        recordReader.initialize(new StringSplit("Employee ID,0"));

        DataSetIterator iterator = new RecordReaderDataSetIterator(recordReader, 150, FEATURES_COUNT, CLASSES_COUNT);
        allData = iterator.next();
    }

    allData.shuffle(42);

and ended up with the following error:

Exception in thread "main" java.lang.NumberFormatException: For input string: "Employee ID"
    at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
    at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
    at java.base/java.lang.Double.parseDouble(Double.java:651)
    at org.datavec.api.writable.Text.toDouble(Text.java:590)
    at org.datavec.api.util.ndarray.RecordConverter.toMinibatchArray(RecordConverter.java:207)
    at org.deeplearning4j.datasets.datavec.RecordReaderMultiDataSetIterator.next(RecordReaderMultiDataSetIterator.java:153)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:346)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:421)
    at org.deeplearning4j.datasets.datavec.RecordReaderDataSetIterator.next(RecordReaderDataSetIterator.java:53)
    at com.test.dl4jtest.dl4jtest.TrainModel.main(TrainModel.java:46)

Please suggest how I can create a model that takes terms such as worker, co-worker, and employee into account.

machine-learning deep-learning deeplearning4j

1 Answer

I suggest you look at the data pipeline examples on the deeplearning4j examples page, in particular CSVMixedDataTypesLocal.java for your case.
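
For context, the stack trace shows `Text.toDouble` calling `Double.parseDouble` on the field "Employee ID", so the iterator appears to expect every CSV field to be numeric. The plain-Java sketch below illustrates the general idea of turning string fields into integer indices before they reach a numeric pipeline. It does not use the DataVec API; the class name `TermEncodingSketch`, its methods, and the sample terms are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of deeplearning4j or DataVec.
public class TermEncodingSketch {

    // Returns true if the field parses as a double, which is the same
    // check that fails for "Employee ID" in the stack trace above.
    public static boolean isNumeric(String field) {
        try {
            Double.parseDouble(field.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Assigns each distinct term (case-insensitive) an integer index
    // in first-seen order.
    public static Map<String, Integer> buildVocabulary(List<String> terms) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String term : terms) {
            vocab.putIfAbsent(term.trim().toLowerCase(), vocab.size());
        }
        return vocab;
    }

    // Replaces every non-numeric field of a comma-separated row with its
    // vocabulary index, leaving numeric fields unchanged.
    public static String encodeRow(String csvRow, Map<String, Integer> vocab) {
        String[] fields = csvRow.split(",");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            String field = fields[i].trim();
            if (i > 0) {
                out.append(',');
            }
            if (isNumeric(field)) {
                out.append(field);
            } else {
                Integer index = vocab.get(field.toLowerCase());
                if (index == null) {
                    throw new IllegalArgumentException("Unknown term: " + field);
                }
                out.append(index);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> vocab =
                buildVocabulary(List.of("Employee", "Worker", "Co-worker"));
        System.out.println(isNumeric("Employee ID"));      // false
        System.out.println(encodeRow("Worker,0", vocab));  // 1,0
    }
}
```

Note that an index encoding like this only makes the row parseable as numbers. It does not by itself teach a model that "employee", "worker", and "co-worker" mean the same thing; that still depends on how the labels and training data are set up.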
