我正在Python 3.7中使用Pandas,以便从HDF5文件读取数据。 HDF5文件包含MSC Nastran的结果表。
HDF5文件名为'ave_01.h5'
HDF5位移表如下所示:
使用下面的方法就可以了:
import numpy as np
import pandas as pd
pd.read_hdf('./ave_01.h5', 'NASTRAN/RESULT/NODAL/DISPLACEMENT')
但是,我还有另一个压力结果表,看起来像这样:
因此,我希望以下代码可以工作,但不能:
pd.read_hdf('./ave_01.h5', '/NASTRAN/RESULT/ELEMENTAL/STRESS/QUAD_CN')
我收到以下错误:
ValueError:传递的项目数错误5,展示位置表示1
我已经注意到,第二个表在某些列中包含列表,而第一个表则没有。这些列表还包含5个元素。也许这是导致错误的原因,但我不知道这是否是正确的,也不知道如何纠正。
我要去哪里错了?
谢谢。
您是正确的,该问题与5个元素的列表有关。
我最终可以复制该问题。在我的情况下,该列表具有9个元素,但read_hdf函数每个表单元格只希望有一个值。
下面是我的Pandas的Python代码。不幸的是,我无法解决该问题。
通过使用h5py库,我成功地前进了。再往下是带有h5py库的Python代码。
Pandas
工作示例
import pandas as pd
test_output = pd.read_hdf('./nug_46.h5', '/NASTRAN/RESULT/NODAL/DISPLACEMENT')
print(test_output)
# returns
# ID X Y Z RX RY RZ DOMAIN_ID
# 0 3 -0.000561 -0.001269 0.001303 0.0 0.0 0.0 2
# 1 5 -0.001269 -0.000561 0.001303 0.0 0.0 0.0 2
# 2 6 -0.001342 -0.000668 0.001181 0.0 0.0 0.0 2
# 3 7 -0.001342 -0.000794 0.001162 0.0 0.0 0.0 2
# 4 8 -0.001335 -0.000893 0.001120 0.0 0.0 0.0 2
# ... ... ... ... ... ... ... ... ...
# 4878 20475 0.000000 0.000000 0.000000 0.0 0.0 0.0 2
# 4879 20478 0.000000 0.000000 0.000000 0.0 0.0 0.0 2
# 4880 100001 0.000000 0.000000 0.000000 0.0 0.0 0.0 2
# 4881 100002 0.000000 0.000000 0.000000 0.0 0.0 0.0 2
# 4882 100003 0.000000 0.000000 0.000000 0.0 0.0 0.0 2
非工作示例
test_output = pd.read_hdf('./nug_46.h5', 'NASTRAN/RESULT/ELEMENTAL/STRESS/HEXA')
print(test_output)
# returns an error
# Traceback (most recent call last):
# File "/home/apricot/PycharmProjects/python_hdf5_reader/venv/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 1654, in create_block_manager_from_blocks
# make_block(values=blocks[0], placement=slice(0, len(axes[0])))
# File "/home/apricot/PycharmProjects/python_hdf5_reader/venv/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 3041, in make_block
# return klass(values, ndim=ndim, placement=placement)
# File "/home/apricot/PycharmProjects/python_hdf5_reader/venv/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 125, in __init__
# f"Wrong number of items passed {len(self.values)}, "
# ValueError: Wrong number of items passed 9, placement implies 1
H5PY
工作示例
import h5py
file = h5py.File('./nug_46.h5', 'r')
# Open the dataset of compound type
dataset = file['/NASTRAN/RESULT/ELEMENTAL/STRESS/HEXA']
# Print the column names
column_names = dataset.dtype.names
print(column_names)
# returns
# ('EID', 'CID', 'CTYPE', 'NODEF', 'GRID', 'X', 'Y', 'Z', 'TXY', 'TYZ', 'TZX', 'DOMAIN_ID')
# Print the first ten rows of the dataset
# If you want to print the whole dataset, leave out the brackets and
# colon, e.g. enumerate(dataset)
for i, line in enumerate(dataset[0:10]):
print(line)
# returns
# (447, 0, b'GRID', 8, [ 0, 5, 6, 12, 11, 1716, 1340, 1346, 1345], ..., 2)
# (448, 0, b'GRID', 8, [ 0, 6, 7, 13, 12, 1340, 1341, 1347, 1346], ..., 2)
# (449, 0, b'GRID', 8, [ 0, 7, 8, 14, 13, 1341, 1342, 1348, 1347], ..., 2)
# (450, 0, b'GRID', 8, [ 0, 8, 9, 15, 14, 1342, 1343, 1349, 1348], ..., 2)
# (451, 0, b'GRID', 8, [ 0, 9, 10, 16, 15, 1343, 1344, 1350, 1349], ..., 2)
# (452, 0, b'GRID', 8, [ 0, 11, 12, 18, 17, 1345, 1346, 1352, 1714], ..., 2)
# (453, 0, b'GRID', 8, [ 0, 12, 13, 19, 18, 1346, 1347, 1353, 1352], ..., 2)
# (454, 0, b'GRID', 8, [ 0, 13, 14, 20, 19, 1347, 1348, 1354, 1353], ..., 2)
# (455, 0, b'GRID', 8, [ 0, 14, 15, 21, 20, 1348, 1349, 1355, 1354], ..., 2)
# (456, 0, b'GRID', 8, [ 0, 15, 16, 22, 21, 1349, 1350, 1356, 1355], ..., 2)
# Print the 2nd row, 1st column in the dataset
print(dataset[1][column_names[0]])
# returns
# 448
# Print the 2nd row, 5th column, 3rd element of the list in the dataset
print(dataset[1][column_names[4]][2])
# returns
# 7
# Same as above, but by using the column name
print(dataset[1]['GRID'][2])
# returns
# 7
为了快速澄清由MSC Nastran创建的HDF5文件中的数据格式。这些值不是Python列表,而是NumPy数组。我知道,这具有欺骗性,因为两种数据类型都使用[val1,val2,val3],并且都使用索引来访问单个元素。 但是它们不相同。您可以通过使用.dtype
属性检查每个字段的数据类型来确认这一点,如下所示。
每个数组在多个元素位置具有值。当您的Nastran压力请求为(BOTH)时,就会发生这种情况;您会在质心和角/网格处获得输出。这些位置与GRID字段中的网格ID匹配。
这是使用Quad4元素数据的简单示例。其他元素类型的过程与此类似:
In [1]: import h5py
In [2]: h5f = h5py.File('tube_a_mesh.h5', 'r')
In [3]: str_ds = h5f['/NASTRAN/RESULT/ELEMENTAL/STRESS/QUAD_CN']
In [4]: print (str_ds.dtype)
{'names' ['EID','TERM','GRID','FD1','X1','Y1','TXY1','FD2','X2','Y2','TXY2','DOMAIN_ID'],
'formats':['<i8','S4',('<i8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),('<f8', (5,)),'<i8'], 'offsets':[0,8,16,56,96,136,176,216,256,296,336,376],
'itemsize':384}
dytpe
显示GRID
为('<i8', (5,))
,X1
为('<f8', (5,))
(对于其他应力值,相同的dtype:Y1
,TXY1
等)。继续,这就是如何在Z1位置作为HDF5数据集对象提取Sx应力。
。In [5]: quad_sx_arr= str_ds['X1'] In [6]: print (quad_sx_arr.dtype, quad_sx_arr.dtype) float64 (4428, 5)
或者,这是如何提取Z1处的所有Sx应力作为NumPy数组
。] >>In [7]: quad_sx_arr= str_ds['X1'][:] In [8]: print (quad_sx_arr.dtype, quad_sx_arr.dtype) float64 (4428, 5)
最后,如果只需要质心值(每个X1数组的第一个元素),则为如何将其提取为NumPy数组
In [9]: quad_csx_arr = quad_sx_arr[:,0]
In [10]: print (quad_csx_arr.dtype, quad_csx_arr.shape)
float64 (4428,)