Multi-index DataFrame to a sequence of adjacency matrices (NumPy 3D array)


I want to convert a multi-index DataFrame into a sequence of adjacency matrices, or into a 3D NumPy array indexed by a time coordinate.

Here is the DataFrame:

import numpy as np
import pandas as pd

Boxes = {'Date': ['2016-01-01 00:00:00', '2016-01-01 00:00:00',
        '2016-01-01 00:00:00', '2016-01-01 12:00:00', '2016-01-01 12:00:00',
        '2016-01-01 12:00:00', '2016-01-01 17:54:00', '2016-01-01 22:44:00'],
         'From': ['Green','Green','Green','Blue','Blue','Red','Red','Red'],
         'To': ['Rectangle','Rectangle','Square','Rectangle','Square','Square','Square','Rectangle'],
         # Quantities must be numeric; as strings, summing would concatenate them
         'Qty': [12, 3, 43, 125, 34, 76, 9, 222]}

df = pd.DataFrame(Boxes, columns=['Date', 'From', 'To', 'Qty'])

I can create a multi-index DataFrame with:

dups = df.pivot_table(index=['Date'], columns=['From', 'To'], values=['Qty'], aggfunc='sum').fillna(0)
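To make the column structure of that pivot concrete, here is a minimal sketch that rebuilds the frame and inspects its column levels. It assumes `Qty` holds integers rather than strings; the `boxes` name is only for illustration.

```python
import pandas as pd

boxes = {
    'Date': ['2016-01-01 00:00:00', '2016-01-01 00:00:00', '2016-01-01 00:00:00',
             '2016-01-01 12:00:00', '2016-01-01 12:00:00', '2016-01-01 12:00:00',
             '2016-01-01 17:54:00', '2016-01-01 22:44:00'],
    'From': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
    'To': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square',
           'Square', 'Square', 'Rectangle'],
    'Qty': [12, 3, 43, 125, 34, 76, 9, 222],  # assumption: numeric quantities
}
df = pd.DataFrame(boxes)

dups = df.pivot_table(index=['Date'], columns=['From', 'To'],
                      values=['Qty'], aggfunc='sum').fillna(0)

# Passing values as a list adds a top column level holding 'Qty',
# so 'From' and 'To' end up as column levels 1 and 2.
print(dups.columns.nlevels)
print(list(dups.columns.names))
print(dups.shape)  # one row per timestamp, one column per observed From/To pair
```

The rows are the timestamps and each column is one `(Qty, From, To)` combination that occurs in the data.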

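What is the best way to convert this multi-index DataFrame into a sequence of adjacency matrices indexed by the time component? Alternatively, how can I create a 3D NumPy array like the following: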

[[[  0.   0.   0.  15.  43.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]]

 [[  0.   0.   0.   0.   0.]
  [  0.   0.   0. 125.  34.]
  [  0.   0.   0.   0.  76.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]]

 [[  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   9.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]]

 [[  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0. 222.   0.]
  [  0.   0.   0.   0.   0.]
  [  0.   0.   0.   0.   0.]]]

Since these matrices will be sparse, an adjacency list might be a more efficient answer. Thanks!
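For the adjacency-list idea, one possible sketch is to keep only the edges that actually occur and group them per timestamp. This is an illustration under assumptions, not a definitive approach: `Qty` is taken to be numeric, and the names `boxes`, `edges`, and `adjacency_lists` are hypothetical.

```python
import pandas as pd

boxes = {
    'Date': ['2016-01-01 00:00:00', '2016-01-01 00:00:00', '2016-01-01 00:00:00',
             '2016-01-01 12:00:00', '2016-01-01 12:00:00', '2016-01-01 12:00:00',
             '2016-01-01 17:54:00', '2016-01-01 22:44:00'],
    'From': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
    'To': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square',
           'Square', 'Square', 'Rectangle'],
    'Qty': [12, 3, 43, 125, 34, 76, 9, 222],  # assumption: numeric quantities
}
df = pd.DataFrame(boxes)

# Sum duplicate (Date, From, To) edges; absent edges are simply not stored.
edges = df.groupby(['Date', 'From', 'To'], as_index=False)['Qty'].sum()

# One edge list per timestamp: {date: [(source, target, quantity), ...]}
adjacency_lists = {
    date: list(zip(group['From'], group['To'], group['Qty']))
    for date, group in edges.groupby('Date')
}

for date, edge_list in adjacency_lists.items():
    print(date, edge_list)
```

Memory then scales with the number of observed edges rather than with the square of the number of nodes.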

pandas multi-index adjacency-matrix
1 Answer

Since you did not provide the expected output, I can only show a way to convert the frame to a 3D array:

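# Number of distinct 'From' and 'To' labels (column levels 1 and 2; level 0 is 'Qty')
d1 = len(dups.columns.get_level_values(1).unique())
d2 = len(dups.columns.get_level_values(2).unique())
# Works only when every From/To pair appears as a column, i.e. dups has d1 * d2 columns
a = dups.values.reshape((len(dups), d1, d2))
a.shape
Out[450]: (4, 3, 2)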
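The reshape above yields one 3 x 2 (From x To) block per timestamp and relies on every From/To pair being present. If square node-by-node matrices are wanted instead, a hedged alternative is to skip the pivot and reindex the summed edges onto a full Date x node x node grid. In this sketch the node order in `nodes` is an assumption inferred from the example array in the question, `Qty` is assumed numeric, and `full_index`, `qty`, and `adj` are illustrative names.

```python
import pandas as pd

boxes = {
    'Date': ['2016-01-01 00:00:00', '2016-01-01 00:00:00', '2016-01-01 00:00:00',
             '2016-01-01 12:00:00', '2016-01-01 12:00:00', '2016-01-01 12:00:00',
             '2016-01-01 17:54:00', '2016-01-01 22:44:00'],
    'From': ['Green', 'Green', 'Green', 'Blue', 'Blue', 'Red', 'Red', 'Red'],
    'To': ['Rectangle', 'Rectangle', 'Square', 'Rectangle', 'Square',
           'Square', 'Square', 'Rectangle'],
    'Qty': [12, 3, 43, 125, 34, 76, 9, 222],  # assumption: numeric quantities
}
df = pd.DataFrame(boxes)

# Assumed node order shared by rows and columns of each adjacency matrix.
nodes = ['Green', 'Blue', 'Red', 'Rectangle', 'Square']
dates = sorted(df['Date'].unique())

# Every (Date, From, To) cell of the target array, in row-major order.
full_index = pd.MultiIndex.from_product([dates, nodes, nodes],
                                        names=['Date', 'From', 'To'])

# Sum duplicate edges, then fill the cells that never occur with 0.
qty = (df.groupby(['Date', 'From', 'To'])['Qty']
         .sum()
         .reindex(full_index, fill_value=0))

# Axis 0: timestamp, axis 1: source node, axis 2: target node.
adj = qty.to_numpy().reshape(len(dates), len(nodes), len(nodes))
print(adj.shape)
print(adj[0])
```

Because the grid is built explicitly, the reshape does not depend on which From/To pairs happen to appear in the data, and `adj[i]` is the adjacency matrix for `dates[i]`.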