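I am using ANTLR4 and trying to generate a parse tree for a Python file I have. I used the grammar file Python3.g4 from the ANTLR4 documentation. I installed antlr4-python3-runtime and ran the following command: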
antlr4 -Dlanguage=Python3 Python3.g4
This generated my parser and lexer files.
In Python3Lexer.py, I got an error on the following line:
from typing.io import TextIO
So I changed it to:
from typing import TextIO
I also created a file named pythonparser.py, in the same folder as the parser and lexer files, to call the parser:
import sys
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser
def main(argv):
    input_stream = FileStream(argv[1])
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()

if __name__ == '__main__':
    main(sys.argv)
I also made a test.py file, in the same folder as the ANTLR grammar, containing:
print("hello world")
I tried to parse that file with the grammar by running:
python3 pythonparser.py test.py
It does not work, and I don't know what to do next.
I get this error message:
Traceback (most recent call last):
  File "/Users/Fari/Developer/PRJ/project/antlr/pythonparser.py", line 3, in <module>
    from Python3Lexer import Python3Lexer
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Lexer.py", line 19, in <module>
    LanguageParser = getattr(importlib.import_module('{}Parser'.format(module_path)), '{}Parser'.format(language_name))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Parser.py", line 446, in <module>
    class Python3Parser ( Parser ):
  File "/Users/Fari/Developer/PRJ/project/antlr/Python3Parser.py", line 450, in Python3Parser
    atn = ATNDeserializer().deserialize(serializedATN())
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 60, in deserialize
    self.reset(data)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 90, in reset
    temp = [ adjust(c) for c in data ]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 90, in <listcomp>
    temp = [ adjust(c) for c in data ]
             ^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/antlr4/atn/ATNDeserializer.py", line 88, in adjust
    v = ord(c)
        ^^^^^^
TypeError: ord() expected string of length 1, but int found
I don't know where I went wrong.
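There are many Python grammars. The ones you need are Python3Lexer.g4 and Python3Parser.g4 from the python/python3 folder of the antlr/grammars-v4 repository.
After downloading these two grammars, you need to preprocess them by running transformGrammar.py in the same folder as the two grammar files.
Now download the two base classes, Python3LexerBase.py and Python3ParserBase.py, into the same folder.
Once all that is done, generate the lexer and parser Python classes: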
java -jar antlr-4.11.1-complete.jar *.g4 -Dlanguage=Python3
If you now run this file:
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser
def main():
    input_stream = InputStream('print("hello world")\n')
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    main()
the following output is printed:
(single_input (simple_stmts (simple_stmt (expr_stmt (testlist_star_expr (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom (name print)) (trailer ( (arglist (argument (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom "hello world"))))))))))))))))) ))))))))))))))))))) \n))
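Note that I did not change anything else (there was no need to change typing.io to typing). I used ANTLR 4.11.1 (both the tool jar and antlr4-python3-runtime) with Python 3.10.9.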
When I paste the following into a file:
#!/usr/bin/env bash
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3Lexer.g4
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3Parser.g4
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/transformGrammar.py
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/Python3LexerBase.py
wget https://raw.githubusercontent.com/antlr/grammars-v4/master/python/python3/Python3/Python3ParserBase.py
wget https://www.antlr.org/download/antlr-4.11.1-complete.jar
python3 transformGrammar.py
pip install antlr4-python3-runtime
java -jar antlr-4.11.1-complete.jar *.g4 -Dlanguage=Python3
cat << EOF > main.py
from antlr4 import *
from Python3Lexer import Python3Lexer
from Python3Parser import Python3Parser
def main():
    input_stream = InputStream('print("hello world")\n')
    lexer = Python3Lexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = Python3Parser(stream)
    tree = parser.single_input()
    print(tree.toStringTree(recog=parser))

if __name__ == '__main__':
    main()
EOF
python3 --version
python3 main.py
and run that file, I get the following output:
...
antlr-4.11.1-complete.jar 100%[============================================================================>] 3,38M 9,33MB/s in 0,4s
2023-01-31 10:51:47 (9,33 MB/s) - ‘antlr-4.11.1-complete.jar’ saved [3547867/3547867]
Altering Python3Lexer.g4
Writing ...
Altering Python3Parser.g4
Writing ...
Requirement already satisfied: antlr4-python3-runtime in /opt/homebrew/lib/python3.10/site-packages (4.11.1)
Python 3.10.9
(single_input (simple_stmts (simple_stmt (expr_stmt (testlist_star_expr (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom (name print)) (trailer ( (arglist (argument (test (or_test (and_test (not_test (comparison (expr (xor_expr (and_expr (shift_expr (arith_expr (term (factor (power (atom_expr (atom "hello world"))))))))))))))))) ))))))))))))))))))) \n))
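The run above pairs the 4.11.1 tool jar with the 4.11.1 Python runtime. My assumption is that the `ord()` TypeError in the question is what a tool/runtime mismatch looks like (a parser generated by one ANTLR release, loaded by a runtime from a different one), so comparing the two versions is worth doing first. Below is a minimal sketch of such a check; the helper names are hypothetical, and treating "same major.minor" as compatible is my own simplification, not an official rule.

```python
from importlib import metadata
from typing import Optional


def installed_runtime_version(package: str = "antlr4-python3-runtime") -> Optional[str]:
    """Return the installed version of the given package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def same_release(tool_version: str, runtime_version: str) -> bool:
    """Compare only the major.minor part of two version strings (a simplification)."""
    return tool_version.split(".")[:2] == runtime_version.split(".")[:2]


if __name__ == "__main__":
    # Assumption: the tool version is read off the jar name antlr-4.11.1-complete.jar.
    tool_version = "4.11.1"
    runtime_version = installed_runtime_version()
    if runtime_version is None:
        print("antlr4-python3-runtime is not installed")
    elif same_release(tool_version, runtime_version):
        print(f"tool {tool_version} and runtime {runtime_version} look compatible")
    else:
        print(f"tool {tool_version} vs runtime {runtime_version}: make them match, then regenerate")
```

If the versions differ, installing the matching runtime (for example `pip install antlr4-python3-runtime==4.11.1`) and regenerating the lexer and parser is the first thing I would try.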
In my case, I copied the grammar from ANTLR's repository, and it has a name set in the grammar source, like this:
parser grammar PythonParser;
This name must match the grammar's file name, so the file should be named PythonParser.g4.
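To catch this before running the ANTLR tool, the declared name can be compared with the file name. The sketch below is only an illustration: the function names are hypothetical, and the regular expression is a simplification that takes the first `grammar X;` header it finds (it does not skip comments).

```python
import re
from pathlib import Path
from typing import Optional

# Matches headers such as "grammar Foo;", "parser grammar Foo;" or "lexer grammar Foo;".
GRAMMAR_HEADER = re.compile(
    r"^\s*(?:(?:lexer|parser)\s+)?grammar\s+([A-Za-z_]\w*)\s*;",
    re.MULTILINE,
)


def declared_grammar_name(grammar_text: str) -> Optional[str]:
    """Return the name declared in the grammar header, or None if no header is found."""
    match = GRAMMAR_HEADER.search(grammar_text)
    return match.group(1) if match else None


def name_matches_file(grammar_text: str, file_name: str) -> bool:
    """True when the declared grammar name equals the file name without its extension."""
    return declared_grammar_name(grammar_text) == Path(file_name).stem


if __name__ == "__main__":
    source = "parser grammar PythonParser;\n"
    for candidate in ("PythonParser.g4", "Python3Parser.g4"):
        status = "ok" if name_matches_file(source, candidate) else "rename needed"
        print(f"{candidate}: {status}")
```

With the header shown above, only PythonParser.g4 passes the check.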