Parsing Java code with ANTLR in Python


I want to build a Java parser in Python using ANTLR.

I downloaded the grammars from the ANTLR repository:

Lexer: https://github.com/antlr/grammars-v4/blob/master/java/java/JavaLexer.g4

Parser: https://github.com/antlr/grammars-v4/blob/master/java/java/JavaParser.g4

Then I used a script.bat to generate the Python code I need:

java -jar antlr-4.8-complete.jar -Dlanguage=Python3 Java8Lexer.g4
java -jar antlr-4.8-complete.jar -Dlanguage=Python3 Java8Parser.g4

antlr-4.8-complete.jar can be downloaded here: https://www.antlr.org/download/antlr-4.8-complete.jar

This generates the following files:

  • Java8Lexer.interp
  • Java8Lexer.py
  • Java8Lexer.tokens
  • Java8Parser.interp
  • Java8Parser.py
  • Java8Parser.tokens
  • Java8ParserListener.py

Then I wrote this code to parse a Java file:

import antlr4
from antlr4 import *
from java.antlr_unit2 import Java8Parser, Java8Lexer

def main():
    code = open('test.txt', 'r').read()
    lexer = Java8Lexer.Java8Lexer(antlr4.InputStream(code))
    stream = antlr4.CommonTokenStream(lexer)
    parser = Java8Parser.Java8Parser(stream)
    tree = parser.expression()
    print(tree)

if __name__ == '__main__':
    main()

My test Java code, test.txt, looks like this:

package org.jabref.gui.fieldeditors;
import java.util.ArrayList;
/**
 * This class contains some code
 */
public class TextInputControlBehavior {

    private static final boolean SHOW_HANDLES = Properties.IS_TOUCH_SUPPORTED && !OS.OS_X;

}

When I run this code, I get:

line 1:0 extraneous input 'package' expecting {'boolean', 'byte', 'char', 'double', 'float', 'int', 'long', 'new', 'short', 'super', 'this', 'void', IntegerLiteral, FloatingPointLiteral, BooleanLiteral, CharacterLiteral, StringLiteral, 'null', '(', '!', '~', '++', '--', '+', '-', Identifier, '@'}
[]

Am I doing something wrong? I did not write the grammar; I just took it from the ANTLR repository.

python python-3.x antlr antlr4 python-3.8
1 Answer
Your text file contains a compilationUnit, not an expression, which is what you are trying to parse with:

tree = parser.expression()

Looking closely at the parser rules, the rule you need is

compilationUnit : packageDeclaration? importDeclaration* typeDeclaration* EOF ;

so the call must be

tree = parser.compilationUnit()
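
Putting the fix together, here is a minimal sketch of the corrected script. It assumes the generated modules are importable under the same package path used in the question (java.antlr_unit2) and that the ANTLR Python 3 runtime is installed. The helper names build_parser, parse_with_rule and parse_java_file are made up for illustration and are not part of ANTLR. Printing a parse tree context directly tends to show only the rule invocation stack (hence the [] in the question), so the sketch uses toStringTree to get a readable tree.

```python
def build_parser(code):
    """Build a Java8Parser over the given source text."""
    # Imported lazily so this module loads even without the ANTLR runtime.
    # The package path is the one from the question; adjust it to wherever
    # your generated Java8Lexer.py / Java8Parser.py actually live.
    import antlr4
    from java.antlr_unit2 import Java8Lexer, Java8Parser

    lexer = Java8Lexer.Java8Lexer(antlr4.InputStream(code))
    stream = antlr4.CommonTokenStream(lexer)
    return Java8Parser.Java8Parser(stream)


def parse_with_rule(parser, rule_name="compilationUnit"):
    """Invoke the named start rule on the parser and return the parse tree."""
    # Every grammar rule becomes a method of the generated parser class,
    # so the start rule can be looked up by name.
    rule = getattr(parser, rule_name, None)
    if not callable(rule):
        raise ValueError(f"parser has no rule named {rule_name!r}")
    return rule()


def parse_java_file(path):
    """Parse a whole Java source file and return its tree as LISP-style text."""
    with open(path, "r", encoding="utf-8") as f:
        parser = build_parser(f.read())
    # A complete .java file matches compilationUnit, not expression.
    tree = parse_with_rule(parser, "compilationUnit")
    return tree.toStringTree(recog=parser)
```

With the generated files in place, print(parse_java_file('test.txt')) should print the full tree for the test file instead of the extraneous input error. Passing "expression" as the rule name only makes sense when the input is a bare expression rather than a complete source file.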
