How can I read a Python tuple using PyYAML?

Question  Votes: 0  Answers: 6

I have the following YAML file named input.yaml:

cities:
  1: [0,0]
  2: [4,0]
  3: [0,4]
  4: [4,4]
  5: [2,2]
  6: [6,2]
highways:
  - [1,2]
  - [1,3]
  - [1,5]
  - [2,4]
  - [3,4]
  - [5,4]
start: 1
end: 4

I load it with PyYAML and print the result as follows:

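import yaml

# safe_load only builds plain Python objects; a bare yaml.load(f) without
# an explicit Loader argument raises a TypeError in PyYAML 6 and later.
with open("input.yaml", "r") as f:
    data = yaml.safe_load(f)

print(data)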

The result is the following data structure:

{ 'cities': { 1: [0, 0]
            , 2: [4, 0]
            , 3: [0, 4]
            , 4: [4, 4]
            , 5: [2, 2]
            , 6: [6, 2]
            }
, 'highways': [ [1, 2]
              , [1, 3]
              , [1, 5]
              , [2, 4]
              , [3, 4]
              , [5, 4]
              ]
, 'start': 1
, 'end': 4
}

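As you can see, each city and highway is represented as a list. However, I want them to be represented as tuples. So I convert them to tuples manually using comprehensions: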

import yaml

with open("input.yaml", "r") as f:
    data = yaml.safe_load(f)

# Convert each two-element list into a tuple after loading.
data["cities"] = {k: tuple(v) for k, v in data["cities"].items()}
data["highways"] = [tuple(v) for v in data["highways"]]

print(data)

However, this feels like a hack. Is there a way to tell PyYAML to read them directly as tuples instead of lists?

python yaml pyyaml
6 Answers
34
votes

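For what you are trying to do, I wouldn't call your approach a hack. As far as I understand, the alternative is to use Python-specific tags in the YAML file so that the data is constructed as the right type when the file is loaded. However, that requires you to modify the YAML file, which can be quite annoying and is not ideal if the file is large.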

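The PyYAML documentation illustrates this further. Ultimately, you want to place a !!python/tuple tag in front of each structure that should be represented as a tuple. For your sample data, that looks like this: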

YAML file:

cities:
  1: !!python/tuple [0,0]
  2: !!python/tuple [4,0]
  3: !!python/tuple [0,4]
  4: !!python/tuple [4,4]
  5: !!python/tuple [2,2]
  6: !!python/tuple [6,2]
highways:
  - !!python/tuple [1,2]
  - !!python/tuple [1,3]
  - !!python/tuple [1,5]
  - !!python/tuple [2,4]
  - !!python/tuple [3,4]
  - !!python/tuple [5,4]
start: 1
end: 4

Sample code:

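import yaml

# !!python/tuple is a Python-specific tag, so yaml.safe_load rejects it;
# FullLoader (or UnsafeLoader) is needed to construct it.
with open('y.yaml') as f:
    d = yaml.load(f, Loader=yaml.FullLoader)

print(d)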

This outputs:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]}
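As a minimal sketch of the tagged approach, the snippet below loads a shortened version of the sample data from a string instead of a file. The `TAGGED` string and the `safe_load_rejected` flag are illustrative names, not part of the original answer. It also shows that `yaml.safe_load` refuses the Python-specific tag, which is the trade-off of this approach.

```python
import yaml

# Shortened, illustrative version of the tagged YAML file above.
TAGGED = """
cities:
  1: !!python/tuple [0,0]
  2: !!python/tuple [4,0]
highways:
  - !!python/tuple [1,2]
"""

# FullLoader understands the Python-specific !!python/tuple tag.
data = yaml.load(TAGGED, Loader=yaml.FullLoader)
print(data)

# safe_load only knows the standard YAML tags, so it refuses the document.
try:
    yaml.safe_load(TAGGED)
    safe_load_rejected = False
except yaml.YAMLError:
    safe_load_rejected = True
print(safe_load_rejected)
```

If the YAML comes from an untrusted source, the post-load conversion from the question combined with `yaml.safe_load` is probably the safer choice.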

6
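votes

Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do this ¹: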

import pprint
import ruamel.yaml
from ruamel.yaml.constructor import SafeConstructor


def construct_yaml_tuple(self, node):
    seq = self.construct_sequence(node)
    # only make "leaf sequences" into tuples, you can add dict 
    # and other types as necessary
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    construct_yaml_tuple)

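# ruamel.yaml.safe_load() was removed in ruamel.yaml 0.18; use a YAML
# instance instead. pure=True selects the Python SafeConstructor patched above.
yaml = ruamel.yaml.YAML(typ='safe', pure=True)
with open('input.yaml') as fp:
    data = yaml.load(fp)
pprint.pprint(data, width=24)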

This prints:

{'cities': {1: (0, 0),
            2: (4, 0),
            3: (0, 4),
            4: (4, 4),
            5: (2, 2),
            6: (6, 2)},
 'end': 4,
 'highways': [(1, 2),
              (1, 3),
              (1, 5),
              (2, 4),
              (3, 4),
              (5, 4)],
 'start': 1}

If you then need to process more material in which sequences have to be "normal" lists again, use:

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    SafeConstructor.construct_yaml_seq)

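¹ This was done using ruamel.yaml, a YAML 1.2 parser of which I am the author. If you only need to support YAML 1.1 and/or cannot upgrade for some reason, you should be able to do the same with the older PyYAML.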
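The footnote suggests the same idea should carry over to PyYAML. A minimal sketch of that, assuming a `yaml.SafeLoader` subclass so the override stays local, could look like the following. The names `LeafTupleLoader`, `construct_leaf_tuple` and `DOC` are hypothetical and not taken from the answer.

```python
import yaml


class LeafTupleLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass; keeps yaml.SafeLoader itself untouched."""


def construct_leaf_tuple(loader, node):
    seq = loader.construct_sequence(node)
    # Only "leaf" sequences become tuples; a sequence of sequences stays a list.
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)


# Override the constructor for the standard YAML sequence tag on the subclass only.
LeafTupleLoader.add_constructor('tag:yaml.org,2002:seq', construct_leaf_tuple)

# Shortened, illustrative version of input.yaml.
DOC = """
cities:
  1: [0,0]
  2: [4,0]
highways:
  - [1,2]
  - [5,4]
start: 1
"""

data = yaml.load(DOC, Loader=LeafTupleLoader)
print(data)
```

Because the constructor is registered on a subclass, plain `yaml.safe_load` calls elsewhere in the program keep returning lists, so there is no need to restore the default constructor afterwards.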


4
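votes

I ran into the same problem as in the question and wasn't fully satisfied with either of the two answers. While browsing the PyYAML documentation I found two interesting functions:

yaml.add_constructor
yaml.add_implicit_resolver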

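The implicit resolver removes the need to tag every entry with !!python/tuple by matching scalar strings against a regular expression. I also wanted to use tuple syntax, that is, to write tuple: (10,120) rather than write a list tuple: [10,120] that then gets converted to a tuple, which I personally find annoying. I also didn't want to install an external library. Here is the code: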

import yaml
import re

# this is to convert the string written as a tuple into a python tuple
def yml_tuple_constructor(loader, node):
    # this little parse is really just for what I needed, feel free to change it!
    def parse_tup_el(el):
        # try to convert into int or float else keep the string
        if el.isdigit():
            return int(el)
        try:
            return float(el)
        except ValueError:
            return el

    value = loader.construct_scalar(node)
    # remove the ( ) from the string
    tup_elements = value[1:-1].split(',')
    # remove the last element if the tuple was written as (x,b,)
    if tup_elements[-1] == '':
        tup_elements.pop(-1)
    tup = tuple(map(parse_tup_el, tup_elements))
    return tup

# !tuple is my own tag name, I think you could choose anything you want
yaml.add_constructor(u'!tuple', yml_tuple_constructor)
# this is to spot the strings written as tuple in the yaml
yaml.add_implicit_resolver(u'!tuple', re.compile(r"\(([^,\W]{,},){,}[^,\W]*\)"))

Finally, run the following:

>>> yml = yaml.load("""
   ...: cities:
   ...:   1: (0,0)
   ...:   2: (4,0)
   ...:   3: (0,4)
   ...:   4: (4,4)
   ...:   5: (2,2)
   ...:   6: (6,2)
   ...: highways:
   ...:   - (1,2)
   ...:   - (1,3)
   ...:   - (1,5)
   ...:   - (2,4)
   ...:   - (3,4)
   ...:   - (5,4)
   ...: start: 1
   ...: end: 4""", Loader=yaml.FullLoader)
>>> yml['cities']
{1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}
>>> yml['highways']
[(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]

There may be drawbacks to using load compared with safe_load, which I have not tested.
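One way the same idea might be combined with safe loading is to register the constructor and the implicit resolver on a `yaml.SafeLoader` subclass. This is a sketch under assumptions: the names `TupleLoader`, `INT_TUPLE` and `construct_int_tuple` are hypothetical, and the pattern deliberately accepts only integer tuples, which is narrower than the regular expression in the answer.

```python
import re
import yaml


class TupleLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that also understands (1,2) scalars."""


# Assumption: only integer tuples such as (1,2) or (1, 2,) are recognised.
INT_TUPLE = re.compile(r'^\(\s*-?\d+(\s*,\s*-?\d+)*\s*,?\s*\)$')


def construct_int_tuple(loader, node):
    text = loader.construct_scalar(node)
    # Drop the surrounding parentheses, then convert each non-empty part.
    parts = text.strip()[1:-1].split(',')
    return tuple(int(p) for p in parts if p.strip())


# Register on the subclass only, so yaml.SafeLoader and yaml.safe_load
# keep their default behaviour.
TupleLoader.add_constructor('!tuple', construct_int_tuple)
# The third argument lists the first characters that can start a match.
TupleLoader.add_implicit_resolver('!tuple', INT_TUPLE, ['('])

doc = """
cities:
  1: (0,0)
  2: (4, 0)
highways:
  - (1,2)
  - (5,4)
label: (a,b)
start: 1
"""

data = yaml.load(doc, Loader=TupleLoader)
print(data)
```

Scalars that do not match the pattern, such as `(a,b)` here, are left as ordinary strings.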


0
votes

You write the tuple as a list:

params.yaml

foo:
  bar: ["a", "b", "c"]

Source
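To illustrate what treating a tuple as a list means in practice, here is a small round-trip sketch. The `params`, `text`, `restored` and `tagged` names are illustrative. `yaml.safe_dump` writes a tuple as a plain YAML sequence, so it comes back as a list and has to be converted again if a tuple is needed, whereas the default `yaml.dump` emits the Python-specific tag.

```python
import yaml

params = {"foo": {"bar": ("a", "b", "c")}}

# safe_dump writes the tuple as an ordinary YAML sequence, without any tag.
text = yaml.safe_dump(params)
print(text)

# Loading it back yields a list, so convert explicitly where a tuple is required.
restored = yaml.safe_load(text)
restored["foo"]["bar"] = tuple(restored["foo"]["bar"])

# The default dumper keeps the type by emitting a !!python/tuple tag instead.
tagged = yaml.dump(params)
print(tagged)
```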


0
votes

This worked for me:

config.yaml

cities:
    1: !!python/tuple [0,0]
    2: !!python/tuple [4,0]
    3: !!python/tuple [0,4]
    4: !!python/tuple [4,4]
    5: !!python/tuple [2,2]
    6: !!python/tuple [6,2]
highways:
    - !!python/tuple [1,2]
    - !!python/tuple [1,3]
    - !!python/tuple [1,5]
    - !!python/tuple [2,4]
    - !!python/tuple [3,4]
    - !!python/tuple [5,4]
start: 1
end: 4

main.py

import yaml

def tuple_constructor(loader, node):
    # Load the sequence of values from the YAML node
    values = loader.construct_sequence(node)
    # Return a tuple constructed from the sequence
    return tuple(values)

# Register the constructor with PyYAML
yaml.SafeLoader.add_constructor('tag:yaml.org,2002:python/tuple',
                                tuple_constructor)

# Load the YAML file
with open('config.yaml', 'r') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)

print(data)

Output:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)},
'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)], 
'start': 1, 
'end': 4}
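Registering the constructor on `yaml.SafeLoader` changes every safe load in the process. If that is not wanted, a variant that scopes the tag to a subclass might look like this sketch. `TupleTagLoader` and `doc` are hypothetical names.

```python
import yaml


class TupleTagLoader(yaml.SafeLoader):
    """Hypothetical subclass: SafeLoader plus support for !!python/tuple only."""


def tuple_constructor(loader, node):
    # Build the sequence as usual, then freeze it into a tuple.
    return tuple(loader.construct_sequence(node))


# add_constructor on a subclass copies the constructor table first,
# so yaml.SafeLoader itself is not modified.
TupleTagLoader.add_constructor('tag:yaml.org,2002:python/tuple', tuple_constructor)

doc = """
highways:
  - !!python/tuple [1,2]
  - !!python/tuple [5,4]
"""

data = yaml.load(doc, Loader=TupleTagLoader)
print(data)
```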

0
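votes

This may be unsafe in some situations, but a simple workaround for me was to store the list of tuples as its string representation. When reading it back, use eval() to convert the string into a list of tuples.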

atoms = [(7, 13, 14, 15)]  # list of tuple

# when creating dict for YAML dump
ddict[grp] = str(atoms)  # convert list of tuples to string

# then after reading the YAML file
ddict[grp] = list(eval(ddict[grp]))  # list() for slight safety
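If the string-representation workaround is used, `ast.literal_eval` from the standard library is a safer substitute for `eval()`, because it only accepts Python literals. The sketch below assumes a hypothetical key `"group1"` in place of `grp`.

```python
import ast
import yaml

atoms = [(7, 13, 14, 15)]  # list of tuples

# Store the list of tuples as its string representation.
text = yaml.safe_dump({"group1": str(atoms)})

# After reading the YAML back, parse the string as a Python literal.
loaded = yaml.safe_load(text)
restored = ast.literal_eval(loaded["group1"])
print(restored)

# Unlike eval(), literal_eval refuses anything that is not a plain literal.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```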