I declared an empty DataFrame with global scope near the top of my file:
final_df = pd.DataFrame()
I have stats_df, which prints the correct values, but final_df does not change after I append stats_df to it:
stats_df = pd.DataFrame(X, columns=stats_feature_names).sum().to_frame().T
print('statsdf being appended: \n', stats_df)
print('final_df before append: \n', final_df)
final_df.append(stats_df)
print('final_df after append: \n', final_df)
The output of these print statements is:
statsdf being appended:
GF GA
0 14 33
final_df before append:
Empty DataFrame
Columns: []
Index: []
final_df after append:
Empty DataFrame
Columns: []
Index: []
It should be:
statsdf being appended:
GF GA
0 14 33
final_df before append:
Empty DataFrame
Columns: []
Index: []
final_df after append:
GF GA
0 14 33
Why is stats_df not being appended to final_df?
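You need to assign the result back to final_df, because DataFrame.append returns a new DataFrame instead of modifying the original in place, unlike pure Python list.append: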
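import pandas as pd

stats_feature_names = ['a','b']
final_df = pd.DataFrame()
X = [[1,2]]
stats_df = pd.DataFrame(X, columns=stats_feature_names).sum().to_frame().T
print('statsdf being appended: \n', stats_df)
print('final_df before append: \n', final_df)
# DataFrame.append returns a new DataFrame, so reassign the result.
# Note: DataFrame.append was removed in pandas 2.0; on newer versions use
# pd.concat([final_df, stats_df], ignore_index=True) instead.
final_df = final_df.append(stats_df, ignore_index=True)
print('final_df after append: \n', final_df)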
a b
0 1 2
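On pandas 2.0 and later, where DataFrame.append is no longer available, the same reassignment pattern can be sketched with pd.concat. This is a minimal illustration and not part of the original answer; the variable was_unchanged is introduced here only to show that concat, like the old append, leaves final_df untouched until the result is assigned back.

```python
import pandas as pd

stats_feature_names = ['a', 'b']
X = [[1, 2]]

final_df = pd.DataFrame()
stats_df = pd.DataFrame(X, columns=stats_feature_names).sum().to_frame().T

# pd.concat builds and returns a new DataFrame; final_df itself is not modified
result = pd.concat([final_df, stats_df], ignore_index=True)
was_unchanged = final_df.empty  # hypothetical helper name: final_df is still empty here

# Only the reassignment makes final_df refer to the combined data
final_df = result
print('final_df after concat: \n', final_df)
```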
But a better solution is to append each DataFrame to a list (pure Python append) inside the loop and call concat once, outside the loop:
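L = []
for x in iterator:  # iterator is a placeholder for whatever you loop over
    stats_df = pd.DataFrame([[1,2]], columns=stats_feature_names).sum().to_frame().T
    L.append(stats_df)  # list.append modifies L in place

# Concatenate once, after the loop has finished
final_df = pd.concat(L, ignore_index=True)
print('final_df after append: \n', final_df)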
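For a version that runs end to end, here is a small self-contained sketch of the list-then-concat pattern. The batches data and the name frames are made up for illustration and stand in for whatever X takes on each iteration in the real code; the column names GF and GA are taken from the question.

```python
import pandas as pd

stats_feature_names = ['GF', 'GA']

# Hypothetical input: one list of rows per loop iteration
batches = [
    [[1, 2], [3, 4]],
    [[10, 20]],
]

frames = []
for X in batches:
    # Sum each column, then turn the resulting Series into a one-row DataFrame
    stats_df = pd.DataFrame(X, columns=stats_feature_names).sum().to_frame().T
    frames.append(stats_df)  # plain list.append, which mutates the list in place

# A single concat at the end avoids copying the growing DataFrame on every iteration
final_df = pd.concat(frames, ignore_index=True)
print(final_df)
```

Each iteration contributes one row, so final_df ends up with as many rows as there are batches.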