Output as a specific dict and list of dicts

Question (votes: 0, answers: 1)

I am trying to parse some commands (for example strings or file contents) into dict output, and I came across pyparsing.

Suppose I have the following input:

str = "p1 start a, {alias = b, for : 30}; c, d stop e"

To parse it, I am using this:

import pyparsing as pp



grammar = pp.Forward()

SEP = pp.one_of(", ;")
EQ = pp.Suppress(pp.one_of(': ='))
LBRACE, RBRACE = map(pp.Suppress,"{}")
CMD_KEYWORD = (pp.CaselessKeyword("start") | pp.CaselessKeyword("stop") | pp.CaselessKeyword("resume"))
platform = pp.one_of("p1 p2 p3")("platform")
alias = pp.Word(pp.alphanums)
prop = pp.Word(pp.alphanums)
value = pp.Word(pp.alphanums)
prop_value = pp.Dict(pp.Group(prop + EQ + value))
task_config = LBRACE + pp.delimitedList(prop_value, delim = SEP) + RBRACE
command = CMD_KEYWORD + pp.Group(pp.delimitedList(task_config | alias, delim = SEP))("tasks")
expr = platform + command[1, ...]("commands")

grammar <<= expr

res = grammar.parse_string(str)

print(res.as_dict())

print(res.as_list())

This produces the following dict and list:

{'platform': 'p1', 'tasks': ['e'], 'commands': ['start', {'alias': 'b', 'for': '30'}, 'stop', ['e']]}
['p1', 'start', ['a', ['alias', 'b'], ['for', '30'], 'c', 'd'], 'stop', ['e']]

What I am (still) trying to achieve, however, is output in a specific format, a list of dicts like this:

[
 {
  'platform': 'p1',
  'commands': [
               {'cmd': 'start', 'tasks': [{'alias': 'a'}, {'alias': 'b', 'for': '30'}, {'alias': 'c'}, {'alias': 'd'}]},
               {'cmd': 'stop', 'tasks': [{'alias': 'e'}]}
              ],
 }
]

Edit:

After some trial and error, and a few changes to the parser grammar, I managed to (almost) reach my goal:

import pyparsing as pp

def set_alias(t):
    return {"alias": t[0]}

grammar = pp.Forward()

SEP = pp.one_of(", ;")
EQ = pp.Suppress(pp.one_of(': ='))
LBRACE, RBRACE = map(pp.Suppress,"{}")
OPT_SEP = pp.Suppress(pp.Opt(SEP))
CMD_KEYWORD = (pp.CaselessKeyword("start") | pp.CaselessKeyword("stop") | pp.CaselessKeyword("resume"))("cmd")
platform = pp.one_of("p1 p2 p3")("platform")
alias = ~(CMD_KEYWORD | platform) + pp.Word(pp.alphanums)
prop = pp.Word(pp.alphanums)
value = pp.Word(pp.alphanums)
prop_value = pp.Dict(pp.Group(prop + EQ + value))
task_config = LBRACE + pp.Group(pp.delimitedList(prop_value, delim = SEP)) + RBRACE
command = pp.Group(CMD_KEYWORD + pp.Group(pp.OneOrMore((task_config | alias.set_parse_action(set_alias)) + OPT_SEP))("tasks"))
expr = platform + command[1, ...]("commands")

grammar <<= pp.OneOrMore(expr + OPT_SEP)

res = grammar.parse_string(str)

print(res.as_dict())

print(res.as_list())

But when I test it with the following input:

p1 start a, {alias = b, for : 30}; c, d stop e p2 resume f

I get:

{'platform': 'p2', 'commands': [{'cmd': 'resume', 'tasks': [{'alias': 'f'}]}]}

['p1', ['start', [{'alias': 'a'}, [['alias', 'b'], ['for', '30']], {'alias': 'c'}, {'alias': 'd'}]], ['stop', [{'alias': 'e'}]], 'p2', ['resume', [{'alias': 'f'}]]]

As you can see, res.as_list() returns all the expected tokens, but res.as_dict() returns only the 'platform': 'p2' part and is missing the 'platform': 'p1' one, and I don't understand why.
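A likely explanation is that results names behave like dict keys within one level of the parse results: when the same name is matched again without an enclosing Group, the later value replaces the earlier one in as_dict(), while as_list() still keeps every token. The sketch below illustrates this with a deliberately tiny, hypothetical grammar (the names item, flat and grouped are not from the question):

```python
import pyparsing as pp

# Hypothetical mini-grammar: a platform name followed by a single task word.
platform = pp.one_of("p1 p2 p3")("platform")
item = platform + pp.Word(pp.alphas)("task")

# Ungrouped: every repetition writes "platform" and "task" into the same
# top-level namespace, so only the last match is visible by name.
flat = pp.OneOrMore(item).parse_string("p1 a p2 b")
print(flat.as_list())
print(flat.as_dict())

# Grouped: each repetition gets its own namespace for results names.
grouped = pp.OneOrMore(pp.Group(item)).parse_string("p1 a p2 b")
print([g.as_dict() for g in grouped])
```

Wrapping each repetition in pp.Group is what keeps the per-platform names apart.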

Edit 2:

I have now solved this by changing the last part to:

expr = pp.Dict(pp.Group(platform + command[1, ...]("commands")))
    
grammar <<= pp.OneOrMore(expr + OPT_SEP)

and I get the following dict as output:

{'p1': {'platform': 'p1', 'commands': [{'cmd': 'start', 'tasks': [{'alias': 'a'}, {'alias': 'b', 'for': '30'}, {'alias': 'c'}, {'alias': 'd'}]}, {'cmd': 'stop', 'tasks': [{'alias': 'e'}]}]}, 'p2': {'platform': 'p2', 'commands': [{'cmd': 'resume', 'tasks': [{'alias': 'f'}]}]}}


[['p1', ['start', [{'alias': 'a'}, [['alias', 'b'], ['for', '30']], {'alias': 'c'}, {'alias': 'd'}]], ['stop', [{'alias': 'e'}]]], ['p2', ['resume', [{'alias': 'f'}]]]]
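The 'p1' and 'p2' keys appear because pp.Dict uses the first token of each group as the key for that group. A minimal sketch of that behaviour, using a hypothetical key=value grammar rather than the grammar above, also shows one caveat: a key that occurs twice keeps only its last value in as_dict().

```python
import pyparsing as pp

# Hypothetical key=value grammar, used only to show how pp.Dict builds keys.
key = pp.Word(pp.alphas)
val = pp.Word(pp.nums)
pair = pp.Group(key + pp.Suppress("=") + val)
pairs = pp.Dict(pair[1, ...])

# The first token of every group becomes the dict key.
unique = pairs.parse_string("a=1 b=2")
print(unique.as_dict())

# A repeated key is still present in the token list, but as_dict()
# reports only the last value stored under that key.
repeated = pairs.parse_string("a=1 a=2")
print(repeated.as_list())
print(repeated.as_dict())
```

If the same platform could appear more than once in the input, a dict keyed by platform would presumably merge those entries in the same way, which is one more reason to prefer a list of dicts.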

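Maybe I will answer my own question and mark it as solved later, because I realized I need to update my parser to return a list of dictionaries instead of just one big dict, since the order in which the parser output is processed is very important.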

python parsing pyparsing
1 Answer
0 votes

So, I will answer my own question. I tested it on the following input string:

p1 start a, {alias = b, for : 30}; c, d stop e; p2 resume f p3 start {name = g, at = 5}

import pyparsing as pp

def set_alias(t):
    return {"alias": t[0]}

def set_expr(t):
    for expre in t.as_dict().values():
        result.append(expre)

str = "p1 start a, {alias = b, for : 30}; c, d stop e; p2 resume f p3 start {name = g, at = 5}"


# result will represent the output as a List of Dictionaries
result = []

grammar = pp.Forward()

SEP = pp.one_of(", ;")
EQ = pp.Suppress(pp.one_of(': ='))
LBRACE, RBRACE = map(pp.Suppress,"{}")
OPT_SEP = pp.Suppress(pp.Opt(SEP))
CMD_KEYWORD = (pp.CaselessKeyword("start") | pp.CaselessKeyword("stop") | pp.CaselessKeyword("resume"))("cmd")
platform = pp.one_of("p1 p2 p3")("platform")
alias = ~(CMD_KEYWORD | platform) + pp.Word(pp.alphanums)
prop = pp.Word(pp.alphanums)
value = pp.Word(pp.alphanums)
prop_value = pp.Dict(pp.Group(prop + EQ + value))
task_config = LBRACE + pp.Group(pp.delimitedList(prop_value, delim = SEP)) + RBRACE
command = pp.Group(CMD_KEYWORD + pp.Group(pp.OneOrMore((task_config | alias.set_parse_action(set_alias)) + OPT_SEP))("tasks"))
expr = pp.Dict(pp.Group(platform + command[1, ...]("commands"))).set_parse_action(set_expr)

grammar <<= pp.OneOrMore(expr + OPT_SEP)

res = grammar.parse_string(str)

print('\nDict = ', res.as_dict())

print('\nList of Dict = ', result)

print('\nList of Tokens =', res.as_list())

The result is:

Dict =  {'p1': {'platform': 'p1', 'commands': [{'cmd': 'start', 'tasks': [{'alias': 'a'}, {'alias': 'b', 'for': '30'}, {'alias': 'c'}, {'alias': 'd'}]}, {'cmd': 'stop', 'tasks': [{'alias': 'e'}]}]}, 'p2': {'platform': 'p2', 'commands': [{'cmd': 'resume', 'tasks': [{'alias': 'f'}]}]}, 'p3': {'platform': 'p3', 'commands': [{'cmd': 'start', 'tasks': [{'name': 'g', 'at': '5'}]}]}}


List of Dict =  [{'platform': 'p1', 'commands': [{'cmd': 'start', 'tasks': [{'alias': 'a'}, {'alias': 'b', 'for': '30'}, {'alias': 'c'}, {'alias': 'd'}]}, {'cmd': 'stop', 'tasks': [{'alias': 'e'}]}]}, {'platform': 'p2', 'commands': [{'cmd': 'resume', 'tasks': [{'alias': 'f'}]}]}, {'platform': 'p3', 'commands': [{'cmd': 'start', 'tasks': [{'name': 'g', 'at': '5'}]}]}]


List of Tokens = [['p1', ['start', [{'alias': 'a'}, [['alias', 'b'], ['for', '30']], {'alias': 'c'}, {'alias': 'd'}]], ['stop', [{'alias': 'e'}]]], ['p2', ['resume', [{'alias': 'f'}]]], ['p3', ['start', [[['name', 'g'], ['at', '5']]]]]]
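As a possible alternative to collecting results in a module-level list from a parse action, each platform block can be wrapped in a plain pp.Group and converted after parsing. The sketch below uses a reduced, hypothetical grammar (no brace-delimited task configs and no separators) just to show the pattern. It is not a drop-in replacement for the grammar above.

```python
import pyparsing as pp

# Reduced, hypothetical grammar: platform, then one or more commands,
# each followed by one or more plain alias words.
platform = pp.one_of("p1 p2 p3")("platform")
cmd = pp.one_of("start stop resume")("cmd")
# Negative lookahead so an alias never swallows a command or platform name.
alias = ~(cmd | platform) + pp.Word(pp.alphanums)
command = pp.Group(cmd + pp.Group(alias[1, ...])("tasks"))
block = pp.Group(platform + command[1, ...]("commands"))
grammar = block[1, ...]

res = grammar.parse_string("p1 start a b stop c p2 resume d")

# One dict per platform block, in input order, with no global state.
blocks = [b.as_dict() for b in res]
print(blocks)
```

Because the list is built from the parse results themselves, parsing a second string does not append to results left over from the first.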