Asynchronous XML parsing with SAX in Kotlin


I have a SAX parser that reads an XML file (specifically an .xlsx file) and returns its content as a list of Row objects. It looks roughly like this:

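import java.io.InputStream
import org.apache.poi.openxml4j.opc.OPCPackage
import org.apache.poi.util.XMLHelper
import org.apache.poi.xssf.eventusermodel.XSSFReader
import org.xml.sax.InputSource

fun readExcelContent(data: InputStream): List<Row> {
    // Open the package from the stream that was passed in
    val pkg = OPCPackage.open(data)
    val reader = XSSFReader(pkg)
    val sst = reader.sharedStringsTable
    val parser = XMLHelper.newXMLReader()
    val handler = ExcelSheetHandler(sst)
    parser.contentHandler = handler
    // Only the first sheet is read
    val sheet = reader.sheetsData.next()
    val source = InputSource(sheet)
    // Blocks until the whole sheet has been parsed
    parser.parse(source)

    return handler.content
}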

ExcelSheetHandler is a class that extends DefaultHandler and is responsible for filling the list:

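import org.apache.poi.xssf.model.SharedStringsTable
import org.xml.sax.SAXException
import org.xml.sax.helpers.DefaultHandler

class ExcelSheetHandler(sst: SharedStringsTable): DefaultHandler() {

    // Readable from outside so that readExcelContent can return it
    val content = mutableListOf<Row>()

    @Throws(SAXException::class)
    override fun endElement(uri: String?, localName: String?, name: String) {
        // If it's the end of a content element, add a row to content
    }
}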

This is basically a slight modification of the event model example in the Apache POI how-to.

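I would like to know whether there is a way to have readExcelContent return an asynchronous object (for example, a stream) and send rows to its client as soon as they are read, instead of waiting for the whole file to be processed.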

kotlin apache-poi sax kotlin-coroutines
1 Answer

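In this use case, kotlinx.coroutines.Channel is preferable to kotlinx.coroutines.Flow, because this is a hot stream of data that is triggered by the parse() method. Here is what the Kotlin Language Guide states: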

Flows are cold streams similar to sequences: the code inside a flow builder does not run until the flow is collected.
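
To make the cold/hot distinction concrete, here is a minimal sketch (not part of the original answer; the function name `coldVersusHot` and the log messages are made up for illustration). It records the order in which things happen: the flow builder body runs only when the flow is collected, whereas a channel accepts a value before any receiver exists.

```kotlin
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical helper: returns a log of the order in which events happened.
fun coldVersusHot(): List<String> = runBlocking {
    val log = mutableListOf<String>()

    // Cold: creating the flow does not run the builder body.
    val cold = flow {
        log += "flow builder ran"
        emit(1)
    }
    log += "flow created"
    val values = cold.toList()          // the builder body runs here
    log += "collected $values"

    // Hot: the channel takes the value even though nobody is receiving yet.
    val hot = Channel<Int>(Channel.UNLIMITED)
    hot.trySend(1)
    log += "channel already holds a value"
    hot.close()                         // buffered values can still be received
    for (value in hot) log += "received $value"

    log
}

fun main() {
    coldVersusHot().forEach(::println)
}
```

A SAX parser behaves like the second half: once parse() is called it pushes events at its own pace, which is why the answer models the rows as a channel.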

Here is a quick implementation you can try.

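import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import org.xml.sax.SAXException
import org.xml.sax.helpers.DefaultHandler

class ExcelSheetHandler : DefaultHandler() {

    private val scope = CoroutineScope(Dispatchers.Default)
    // Rendezvous channel: each send waits for a matching receive
    private val rows = Channel<Row>()

    override fun endDocument() {
        // To avoid suspending forever!
        rows.close()
    }

    @Throws(SAXException::class)
    override fun endElement(uri: String?, localName: String?, name: String) {
        readRow(uri, localName, name)
    }

    private fun readRow(uri: String?, localName: String?, name: String) = runBlocking {
        // If it's the end of a content element, build the row (elided here) and send it.
        // send() suspends until a consumer receives, so the parser thread waits for the client.
        rows.send(row)
    }

    // Client code - if it needs to be somewhere else
    // you can expose a reference to Channel object
    private fun processRows() = scope.launch {
        for (row in rows) {
            // Do something
            println(row)
        }
    }
}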
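
If the consumer lives outside the handler, one possible way to wire things together is sketched below. This is only an illustration under stated assumptions, not the answer's code: it uses the JDK's built-in SAX parser and a tiny made-up XML string instead of Apache POI, rows are plain strings, and `RowHandler`, `streamRows`, and `readAll` are hypothetical names. The parser runs in its own coroutine on Dispatchers.IO, the handler uses `trySendBlocking` to hand each row over from the non-suspending SAX callback, and the channel is closed when parsing ends so the consumer's loop terminates.

```kotlin
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.channels.Channel
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.channels.trySendBlocking
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import org.xml.sax.Attributes
import org.xml.sax.InputSource
import org.xml.sax.helpers.DefaultHandler

// Hypothetical handler: sends the text of every <row> element as soon as it closes.
class RowHandler(private val rows: Channel<String>) : DefaultHandler() {
    private val text = StringBuilder()

    override fun startElement(uri: String?, localName: String?, qName: String?, attributes: Attributes?) {
        text.setLength(0)                       // start collecting text for the new element
    }

    override fun characters(ch: CharArray, start: Int, length: Int) {
        text.append(String(ch, start, length))
    }

    override fun endElement(uri: String?, localName: String?, qName: String?) {
        // Blocks the parser thread until the consumer has taken the row.
        if (qName == "row") rows.trySendBlocking(text.toString())
    }
}

// Hypothetical entry point: starts parsing in the background and returns the channel at once.
fun CoroutineScope.streamRows(xml: String): ReceiveChannel<String> {
    val rows = Channel<String>()
    launch(Dispatchers.IO) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(InputSource(StringReader(xml)), RowHandler(rows))
        } finally {
            rows.close()                        // lets the consumer's for-loop finish
        }
    }
    return rows
}

// Consumes rows one by one while the parser is still running.
fun readAll(xml: String): List<String> = runBlocking {
    val received = mutableListOf<String>()
    for (row in streamRows(xml)) received += row
    received
}

fun main() {
    readAll("<rows><row>a</row><row>b</row></rows>").forEach(::println)
}
```

Closing the channel in a finally block, rather than only in endDocument(), means the consumer is also released when parsing fails partway through. With Apache POI the same shape should apply by passing the sheet's InputSource to the XMLReader instead of the sample string.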