Pandas - Retrieve the original DataFrame from a transformed DataFrame

Question

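I built a DataFrame to store the constituent stocks of a stock index over time, using the following steps:

1) First, I download the raw data from a data provider and store it in a dict.

2) I convert it to a DataFrame to get: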

constituent_pd = pd.DataFrame.from_dict(constituent, orient='index')

index  col1     col2    col3  etc...
1/1/92 stockA  stockB  NA     etc...
2/1/92 stockB  stockC  stockD etc...

3) I convert it to a boolean DataFrame:

constituent_bol = pd.get_dummies(constituent_pd.stack()).max(level=0).astype(bool)

index  stockA  stockB  stockC etc...
1/1/92 True    True    False  etc...
2/1/92 False   True    True   etc...

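From there, I have been struggling to find a fast way to update my table. To do so, I need to convert constituent_bol back into the original dict form, merge it with a new dict (for the updated dates), and rerun the whole process.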

step1 = constituent_bol.astype('int32')
step2 = step1[step1 == 1].stack().reset_index().drop(0, axis=1).set_index('level_0')

1/1/92 stockA
1/1/92 stockB
etc...

But I don't know how to reshape this long DataFrame into the layout of constituent_pd so that I can get a dict from it later.

Thanks for any help!

Answer 1

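The problem is that max(level=0) loses the original column names, because it aggregates over the first index level.

So the closest you can get to what you need is probably to use GroupBy.cumcount as a counter for the new column names: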

print (constituent_pd)
          col1    col2    col3
index                         
1/1/92  stockA  stockB     NaN
2/1/92  stockB  stockC  stockD

print (pd.get_dummies(constituent_pd.stack()))
             stockA  stockB  stockC  stockD
index                                      
1/1/92 col1       1       0       0       0
       col2       0       1       0       0
2/1/92 col1       0       1       0       0
       col2       0       0       1       0
       col3       0       0       0       1

print (pd.get_dummies(constituent_pd.stack()).max(level=0))
        stockA  stockB  stockC  stockD
index                                 
1/1/92       1       1       0       0
2/1/92       0       1       1       1

constituent_bol = pd.get_dummies(constituent_pd.stack()).max(level=0).astype(bool)
print (constituent_bol)
        stockA  stockB  stockC  stockD
index                                 
1/1/92    True    True   False   False
2/1/92   False    True    True    True
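
As far as I know, the level argument of max was removed in pandas 2.0, so the lines above may fail on current releases. Here is a minimal sketch of the same forward transformation for recent pandas. The dict literal is made up to match the sample tables, and the explicit dropna() is an assumption to keep the result the same whether or not stack() drops missing values by itself:

```python
import pandas as pd

# Hypothetical sample data, made up to match the tables above.
constituent = {
    "1/1/92": ["stockA", "stockB", None],
    "2/1/92": ["stockB", "stockC", "stockD"],
}
constituent_pd = pd.DataFrame.from_dict(constituent, orient="index")
constituent_pd.columns = ["col1", "col2", "col3"]

# Long form: one row per (date, column) pair; dropna() removes empty slots.
stacked = constituent_pd.stack().dropna()

# One indicator column per stock, then collapse the (date, column) rows
# to one row per date. groupby(level=0).max() stands in for max(level=0).
constituent_bol = pd.get_dummies(stacked).groupby(level=0).max().astype(bool)
print(constituent_bol)
```

The col1/col2/col3 labels are gone after the groupby step, which is why the reverse direction cannot restore them exactly.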

step1 = constituent_bol.astype('int32')
step2 = step1[step1 == 1].stack().reset_index().drop(0, axis=1)
step2 = step2.set_index(['index', step2.groupby('index').cumcount()])['level_1'].unstack()
print (step2)
             0       1       2
index                         
1/1/92  stockA  stockB     NaN
2/1/92  stockB  stockC  stockD
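
Since the end goal in the question is a dict that can be merged with newer data, the same cumcount idea can be carried one step further. This is only a sketch under assumptions: the boolean table is rebuilt by hand from the sample output, the names long_df, slot, wide and new_constituent are invented for illustration, and the 3/1/92 entry is made-up data.

```python
import pandas as pd

# Hypothetical boolean table, same content as constituent_bol above.
constituent_bol = pd.DataFrame(
    [[True, True, False, False], [False, True, True, True]],
    index=["1/1/92", "2/1/92"],
    columns=["stockA", "stockB", "stockC", "stockD"],
)

# Long form: keep only the (date, stock) pairs flagged True.
flags = constituent_bol.stack()
long_df = flags[flags].reset_index()
long_df.columns = ["date", "stock", "member"]

# Number the stocks within each date; the counter becomes the column label.
long_df["slot"] = long_df.groupby("date").cumcount()
wide = long_df.pivot(index="date", columns="slot", values="stock")

# Back to a dict of lists, dropping the NaN padding.
constituent = {date: row.dropna().tolist() for date, row in wide.iterrows()}

# Merge a new dict for later dates (made-up entry) and rebuild the frame.
new_constituent = {"3/1/92": ["stockA", "stockD"]}
constituent.update(new_constituent)
constituent_pd = pd.DataFrame.from_dict(constituent, orient="index")
print(constituent_pd)
```

Note that the stocks within each date come back in the column order of constituent_bol, not in their original col1/col2/col3 positions, because that information is not stored in the boolean table.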
