scipy.io.loadmat nested structures (i.e. dictionaries)

Question (votes: 0, answers: 6)

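Using the routines given in "How to load Matlab .mat files with scipy", I could not access deeper nested structures to recover them as dictionaries.

To show the problem in more detail, here is a toy example: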

import scipy.io as spio
a = {'b':{'c':{'d': 3}}}
# my dictionary: a['b']['c']['d'] = 3
spio.savemat('xy.mat',a)

Now I want to read the mat-File back into python. I tried the following:

vig=spio.loadmat('xy.mat',squeeze_me=True)

If I now try to access the fields, I get:

>> vig['b']
array(((array(3),),), dtype=[('c', '|O8')])
>> vig['b']['c']
array(array((3,), dtype=[('d', '|O8')]), dtype=object)
>> vig['b']['c']['d']
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)

/<ipython console> in <module>()

ValueError: field named d not found.

However, with the option struct_as_record=False the field can be accessed:

v=spio.loadmat('xy.mat',squeeze_me=True,struct_as_record=False)

It can now be accessed via:

>> v['b'].c.d
array(3)
6 Answers
66 votes

Here are some functions that reconstruct the dictionaries. Just use this loadmat instead of scipy.io's loadmat:

import scipy.io as spio

def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)

def _check_keys(d):
    '''
    checks if entries in dictionary are mat-objects. If yes
    todict is called to change them to nested dictionaries
    '''
    # on older SciPy releases this class is reachable as
    # spio.matlab.mio5_params.mat_struct
    for key in d:
        if isinstance(d[key], spio.matlab.mat_struct):
            d[key] = _todict(d[key])
    return d

def _todict(matobj):
    '''
    A recursive function which constructs from matobjects nested dictionaries
    '''
    # "d" instead of "dict" avoids shadowing the built-in type
    d = {}
    for strg in matobj._fieldnames:
        elem = matobj.__dict__[strg]
        if isinstance(elem, spio.matlab.mat_struct):
            d[strg] = _todict(elem)
        else:
            d[strg] = elem
    return d

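32 votes

Just an enhancement to mergen's answer, which unfortunately stops recursing if it reaches a cell array of objects. The following version builds lists from them instead and keeps recursing into the cell array elements where possible.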

import scipy.io as spio
import numpy as np


def loadmat(filename):
    '''
    this function should be called instead of direct spio.loadmat
    as it cures the problem of not properly recovering python dictionaries
    from mat files. It calls the function check keys to cure all entries
    which are still mat-objects
    '''
    def _check_keys(d):
        '''
        checks if entries in dictionary are mat-objects. If yes
        todict is called to change them to nested dictionaries
        '''
        for key in d:
            if isinstance(d[key], spio.matlab.mat_struct):
                d[key] = _todict(d[key])
        return d

    def _todict(matobj):
        '''
        A recursive function which constructs from matobjects nested dictionaries
        '''
        d = {}
        for strg in matobj._fieldnames:
            elem = matobj.__dict__[strg]
            if isinstance(elem, spio.matlab.mat_struct):
                d[strg] = _todict(elem)
            elif isinstance(elem, np.ndarray):
                d[strg] = _tolist(elem)
            else:
                d[strg] = elem
        return d

    def _tolist(ndarray):
        '''
        A recursive function which constructs lists from cellarrays
        (which are loaded as numpy ndarrays), recursing into the elements
        if they contain matobjects.
        '''
        elem_list = []
        for sub_elem in ndarray:
            if isinstance(sub_elem, spio.matlab.mat_struct):
                elem_list.append(_todict(sub_elem))
            elif isinstance(sub_elem, np.ndarray):
                elem_list.append(_tolist(sub_elem))
            else:
                elem_list.append(sub_elem)
        return elem_list
    data = spio.loadmat(filename, struct_as_record=False, squeeze_me=True)
    return _check_keys(data)
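
One side effect of the version above is that `_tolist` also turns plain numeric arrays into nested lists. Below is a minimal sketch of a variant that only descends into object arrays (which is how cell arrays are loaded) and leaves numeric arrays as they are. The helper name `to_python`, the file name `demo.mat`, and the sample data are made up for illustration, and the import fallback assumes `mat_struct` is exposed directly under `scipy.io.matlab` on recent SciPy releases.

```python
import os
import tempfile

import numpy as np
import scipy.io as spio

try:
    # Public location on recent SciPy releases (assumption, see lead-in)
    from scipy.io.matlab import mat_struct
except ImportError:
    # Location used by the original answer on older releases
    from scipy.io.matlab.mio5_params import mat_struct


def to_python(obj):
    """Hypothetical helper: turn mat_struct objects into nested dicts."""
    if isinstance(obj, mat_struct):
        return {name: to_python(getattr(obj, name)) for name in obj._fieldnames}
    if isinstance(obj, np.ndarray) and obj.dtype == object:
        # Object arrays are how cell arrays arrive; recurse into each element
        if obj.ndim == 0:
            return to_python(obj.item())
        return [to_python(item) for item in obj]
    # Numeric arrays and scalars are returned unchanged
    return obj


# A cell array of two structs, built as an object array of dicts
cells = np.empty(2, dtype=object)
cells[0] = {'x': 1}
cells[1] = {'x': 2}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.mat')
    spio.savemat(path, {'s': {'vals': np.array([1.0, 2.0, 3.0]), 'cells': cells}})
    raw = spio.loadmat(path, struct_as_record=False, squeeze_me=True)

result = to_python(raw['s'])
print(type(result['vals']).__name__)
print(result['cells'])
```

Whether lists or arrays are preferable for numeric data depends on what the caller does next, so treat this as one possible design choice rather than a fix.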

8 votes

As of scipy >= 1.5.0, this functionality is built in via the simplify_cells parameter.

from scipy.io import loadmat

mat_dict = loadmat(file_name, simplify_cells=True)
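
For reference, here is a small round trip of the toy dictionary from the question with `simplify_cells=True`. It is only a sketch: the temporary directory is used just to keep the example self-contained, and it assumes a SciPy version that has the parameter.

```python
import os
import tempfile

from scipy.io import loadmat, savemat

a = {'b': {'c': {'d': 3}}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy.mat')
    savemat(path, a)
    # simplify_cells=True returns structs as nested dicts
    mat_dict = loadmat(path, simplify_cells=True)

value = mat_dict['b']['c']['d']
print(value)
```

The nested value is then reachable with ordinary dictionary indexing, just like in the original `a`.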

4 votes

I was told on the scipy mailing list (https://mail.python.org/pipermail/scipy-user/) that there are two more ways to access this data.

This works:

import scipy.io as spio
vig=spio.loadmat('xy.mat')
print(vig['b'][0, 0]['c'][0, 0]['d'][0, 0])

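Output on my machine: 3

The reason for this kind of access: "For historical reasons, in Matlab everything is at least a 2D array, even scalars." So scipy.io.loadmat mimics the Matlab behaviour by default.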
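
To see the "everything is at least 2D" layout directly, one can inspect the shape and field names of what the default `loadmat` call returns. This is a sketch using the toy file from the question, written to a temporary directory only so that the snippet is self-contained:

```python
import os
import tempfile

import scipy.io as spio

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy.mat')
    spio.savemat(path, {'b': {'c': {'d': 3}}})
    vig = spio.loadmat(path)  # default options, no squeezing

b = vig['b']
shape = b.shape          # the struct is stored as a 2D structured array
names = b.dtype.names    # its fields are the struct's field names
d = b[0, 0]['c'][0, 0]['d'][0, 0]
print(shape, names, d)
```

Each `[0, 0]` picks the single element of a 1x1 array before the next field name is applied.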


2 votes

Found a solution: the contents of the "scipy.io.matlab.mio5_params.mat_struct object" can be accessed via:

v['b'].__dict__['c'].__dict__['d']

1 vote

Another method that works:

import scipy.io as spio
vig=spio.loadmat('xy.mat',squeeze_me=True)
print(vig['b']['c'].item()['d'])

Output:

3

I also learned this method on the scipy mailing list. I certainly don't understand (yet) why ".item()" has to be added, and why:

print(vig['b']['c']['d'])

raises an error:

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

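But I'll come back and add an explanation when I know. The explanation of numpy.ndarray.item (from the numpy reference): Copy an element of an array to a standard Python scalar and return it.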
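
A possible explanation, offered as a sketch rather than an authoritative answer: with squeeze_me=True, `vig['b']['c']` appears to be a 0-d array of dtype object whose single element is the inner struct array. A 0-d object array has no named fields, so indexing it with 'd' fails, while `.item()` pulls out the contained struct array, which does have the field. The snippet below checks this on the toy file (the temporary path is only there to keep it self-contained):

```python
import os
import tempfile

import scipy.io as spio

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'xy.mat')
    spio.savemat(path, {'b': {'c': {'d': 3}}})
    vig = spio.loadmat(path, squeeze_me=True)

c = vig['b']['c']
print(c.dtype, c.shape)   # a 0-d wrapper around the inner struct

try:
    c['d']                # the wrapper itself has no field 'd'
    failed = False
except (IndexError, ValueError):
    failed = True

inner = c.item()          # unwrap the inner struct array
print(inner.dtype.names)
value = inner['d']
print(value)
```

The exception type is caught broadly here because the question shows a ValueError while this answer shows an IndexError for the same access.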

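(Note that this answer is essentially the same as hpaulj's comment on the original question, but I felt the comment was not "visible" or understandable enough. I certainly did not notice it the first time I searched for a solution to this problem a few weeks ago.)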
