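In my project I use Spring Batch and read files with FlatFileItemReader / FieldSetMapper. Some of the input files are malformed: for a few records the line is cut off and continues on the next line. Suppose the input file has 4 columns; in a few rows the columns are not formed properly. Can anyone help me solve this? (I can explain more if needed.)
FILE.CSV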
"id","name","age","salary"
"1","user1","28","1000"
"2","user2","27","2000"
"3","user3","26
","3000"
"4","user4","25","
4000"
"5","
user5","24","5000"
"6","user6","23","6000"
"7","user7","22","7000"
"8","user8","21","8000"
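I faced a similar issue while reading malformed lines with FlatFileItemReader. In this case you can use DefaultRecordSeparatorPolicy as the RecordSeparatorPolicy of your FlatFileItemReader. After reading a line it checks whether the record has ended (isEndOfRecord). If the line contains an unterminated quote, it reads another line and appends it, so the record is normalized before it is tokenized. You can also override this behavior.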
flatFileItemReader.setRecordSeparatorPolicy(new DefaultRecordSeparatorPolicy());
For more information, see the DefaultRecordSeparatorPolicy API doc.
@Bean
public FlatFileItemReader<YourClassName> itemReader(@Value("${input}") Resource resource) {
    FlatFileItemReader<YourClassName> flatFileItemReader = new FlatFileItemReader<>();
    flatFileItemReader.setResource(resource);
    flatFileItemReader.setName("CSV-Reader");
    // skip the header row
    flatFileItemReader.setLinesToSkip(1);
    // clear the default comment prefix so lines starting with '#' are not skipped
    flatFileItemReader.setComments(new String[] {});
    // keep reading physical lines until quotes are balanced, so a record split across lines is read as one
    flatFileItemReader.setRecordSeparatorPolicy(new DefaultRecordSeparatorPolicy());
    flatFileItemReader.setLineMapper(lineMapper());
    return flatFileItemReader;
}

@Bean
public LineMapper<YourClassName> lineMapper() {
    DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
    lineTokenizer.setDelimiter(DelimitedLineTokenizer.DELIMITER_COMMA);
    lineTokenizer.setQuoteCharacter(DelimitedLineTokenizer.DEFAULT_QUOTE_CHARACTER);
    // non-strict: tolerate lines with fewer or more tokens than names
    lineTokenizer.setStrict(false);
    lineTokenizer.setNames(COLUMN_NAMES);

    BeanWrapperFieldSetMapper<YourClassName> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(YourClassName.class);

    DefaultLineMapper<YourClassName> defaultLineMapper = new DefaultLineMapper<>();
    defaultLineMapper.setLineTokenizer(lineTokenizer);
    defaultLineMapper.setFieldSetMapper(fieldSetMapper);
    return defaultLineMapper;
}
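To see what the quote check described above amounts to, here is a minimal plain-Java sketch of the idea: keep appending physical lines until the number of quote characters is even. It is an illustration only, not Spring Batch's implementation. The class name QuoteAwareJoinSketch and its methods are made up for this example, and it makes two simplifying assumptions: it drops the line break when joining, and it ignores escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only: NOT Spring Batch's DefaultRecordSeparatorPolicy.
// The class and method names are hypothetical.
class QuoteAwareJoinSketch {

    // A record is treated as complete when it holds an even number of quote characters.
    static boolean isEndOfRecord(String record) {
        int quotes = 0;
        for (int i = 0; i < record.length(); i++) {
            if (record.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 == 0;
    }

    // Append following physical lines to the current record until its quotes balance.
    // Assumption of this sketch: the line break between joined lines is dropped.
    static List<String> joinRecords(List<String> lines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            current.append(line);
            if (isEndOfRecord(current.toString())) {
                records.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            // Input ended while a quote was still open: keep what was read, unmodified.
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // A few rows from FILE.CSV, two of them broken across lines.
        List<String> lines = List.of(
            "\"2\",\"user2\",\"27\",\"2000\"",
            "\"3\",\"user3\",\"26",
            "\",\"3000\"",
            "\"5\",\"",
            "user5\",\"24\",\"5000\"");
        joinRecords(lines).forEach(System.out::println);
    }
}
```

With the real policy, it is worth checking whether a joined field still contains the embedded line break (for example "26" followed by a newline) and trimming it in your FieldSetMapper if so.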