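I am trying to write a parser for generic Python function calls that separates the args from the kwargs. I looked through the examples but could not find one that helps.
Here is an example of what I want to parse, along with the output I would like to get from parseString().asDict():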
example = "test(1, 2, 3, hello, a=4, stuff=there, d=5)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}
or
example = "test(a=4, stuff=there, d=5)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': '', 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}
or
example = "test(1, 2, 3, hello)"
results = xxx.parseString(example).asDict()
results
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': ''}
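Both the positional and the keyword arguments should be optional, and for now I am ignoring the fully general *args, **kwargs, nested lists as inputs, and so on. I managed to get something working when there are only args or only kwargs, but it fails when I have both.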
import pyparsing as pp
LPAR = pp.Suppress('(')
RPAR = pp.Suppress(')')
# define generic number (optional sign or ~, digits, optional fraction and exponent)
number = pp.Regex(r"[-+~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")
# define function arguments
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')) )
args = pp.Group(arglist).setResultsName('args')
# define function keyword arguments
key = pp.Word(pp.alphas) + pp.Suppress('=')
values = (number | pp.Word(pp.alphas))
keyval = pp.dictOf(key, values)
kwarglist = pp.delimitedList(keyval)
kwargs = pp.Group(kwarglist).setResultsName('kwargs')
# build generic function
fxn_args = pp.Optional(args, default='') + pp.Optional(kwargs, default='')
fxn_name = (pp.Word(pp.alphas)).setResultsName('name')
fxn = pp.Group(fxn_name + LPAR + fxn_args + RPAR)
And the results:
# parsing only kwargs
fxn.parseString('test(a=4, stuff=there, d=5)')[0].asDict()
{'name': 'test', 'args': '', 'kwargs': {'a': '4', 'stuff': 'there', 'd': '5'}}
# parsing only args
fxn.parseString('test(1, 2, 3, hello)')[0].asDict()
{'name': 'test', 'args': ['1', '2', '3', 'hello'], 'kwargs': ''}
# parsing both
fxn.parseString('test(1, 2, 3, hello, a=4, stuff=there, d=5)')[0].asDict()
...
ParseException: Expected ")", found ',' (at char 19), (line:1, col:20)
If I check the parsing of fxn_args on its own, I see that kwargs is lost entirely:
# parse only kwargs
fxn_args.parseString('c=4, stuff=there, d=5.234').asDict()
{'args': '', 'kwargs': {'c': '4', 'stuff': 'there', 'd': '5.234'}}
# parse both args and kwargs
fxn_args.parseString('1, 2, 3, hello, c=4, stuff=there, d=5.234').asDict()
{'args': ['1', '2', '3', 'hello'], 'kwargs': ''}
When both args and kwargs are present, your parser trips on the ',' between them.
You can see this for yourself with pyparsing's runTests method:
fxn.runTests("""\
# parsing only kwargs
test(a=4, stuff=there, d=5)
# parsing only args
test(1, 2, 3, hello)
# parsing both
test(1, 2, 3, hello, a=4, stuff=there, d=5)
""")
This prints:
# parsing only kwargs
test(a=4, stuff=there, d=5)
[['test', '', [['a', 4], ['stuff', 'there'], ['d', 5]]]]
[0]:
['test', '', [['a', 4], ['stuff', 'there'], ['d', 5]]]
- args: ''
- kwargs: [['a', 4], ['stuff', 'there'], ['d', 5]]
- a: 4
- d: 5
- stuff: 'there'
- name: 'test'
# parsing only args
test(1, 2, 3, hello)
[['test', [1, 2, 3, 'hello'], '']]
[0]:
['test', [1, 2, 3, 'hello'], '']
- args: [1, 2, 3, 'hello']
- kwargs: ''
- name: 'test'
# parsing both
test(1, 2, 3, hello, a=4, stuff=there, d=5)
                   ^
FAIL: Expected ")", found ',' (at char 19), (line:1, col:20)
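To see where the match stops, it can help to run the args expression on its own and look at what is left over. The sketch below rebuilds only the args part of the grammar from the question (with the number regex tidied up, which is an assumption about what was intended) and uses scanString to get the end position of the first match. The variable names text, matched and remainder are illustrative only.

```python
import pyparsing as pp

# Positional-argument part of the grammar from the question
number = pp.Regex(r"[-+~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')))
args = pp.Group(arglist).setResultsName('args')

text = '1, 2, 3, hello, c=4, stuff=there'

# scanString yields (tokens, start, end) for each match; take the first one
tokens, start, end = next(args.scanString(text))
matched = tokens[0].asList()
remainder = text[end:]

print(matched)          # the positional arguments that were consumed
print(repr(remainder))  # what the next expression in the grammar has to handle
```

The remainder starts with the ',' that separates the two groups. Nothing in the original fxn_args consumes it, so the optional kwargs cannot match and the parser then expects ")".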
The easiest fix:
fxn_args = args + ',' + kwargs | pp.Optional(args, default='') + pp.Optional(kwargs, default='')
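As a rough sketch, this is how the fix might look when combined with the grammar from the question. Two things here are assumptions rather than part of the answer: the separating comma is wrapped in pp.Suppress (named COMMA here) so that it stays out of the token list, and the number regex is tidied up.

```python
import pyparsing as pp

LPAR = pp.Suppress('(')
RPAR = pp.Suppress(')')
COMMA = pp.Suppress(',')  # assumption: suppress the comma between args and kwargs

number = pp.Regex(r"[-+~]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")

# positional arguments: numbers, or words that are not followed by '='
arglist = pp.delimitedList(number | (pp.Word(pp.alphanums + '-_') + pp.NotAny('=')))
args = pp.Group(arglist).setResultsName('args')

# keyword arguments: key=value pairs collected into a dict-like group
key = pp.Word(pp.alphas) + pp.Suppress('=')
values = number | pp.Word(pp.alphas)
kwargs = pp.Group(pp.delimitedList(pp.dictOf(key, values))).setResultsName('kwargs')

# try "args, kwargs" first, then fall back to either one alone (or neither)
fxn_args = (args + COMMA + kwargs
            | pp.Optional(args, default='') + pp.Optional(kwargs, default=''))
fxn_name = pp.Word(pp.alphas).setResultsName('name')
fxn = pp.Group(fxn_name + LPAR + fxn_args + RPAR)

both = fxn.parseString('test(1, 2, 3, hello, a=4, stuff=there, d=5)')[0].asDict()
only_args = fxn.parseString('test(1, 2, 3, hello)')[0].asDict()
only_kwargs = fxn.parseString('test(a=4, stuff=there, d=5)')[0].asDict()
print(both)
```

Because '+' binds more tightly than '|', the expression reads as "(args, comma, kwargs) or (optional args, optional kwargs)", so the two single-group cases from the question keep working.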
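You may also find that identifiers are not just Word(alphas) but can contain '_' and digits as well. The pyparsing_common namespace class that ships with pyparsing includes an identifier expression: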
ppc = pp.pyparsing_common
ident = ppc.identifier()
number = ppc.number()
number will also convert the parsed values to int or float automatically.
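A small sketch of what these two expressions do when used on their own. The input strings are made up for illustration.

```python
import pyparsing as pp

ppc = pp.pyparsing_common

# number converts the matched text to a Python int or float
int_value = ppc.number.parseString("4")[0]
float_value = ppc.number.parseString("5.234")[0]

# identifier accepts letters, digits and underscores (not starting with a digit)
name = ppc.identifier.parseString("my_var2")[0]

print(type(int_value).__name__, type(float_value).__name__, name)
```

This is why the runTests output in this answer shows 4 and 5 as integers rather than as the strings '4' and '5'.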