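I have a data frame with missing dates. Is it possible to update it so that the first column contains every date in February and March? For the missing dates I would like the price to be 0 and the product to be Pencils or Pens, so that the result lists the full inventory. The order of the data frame isn't always right, so ffill shouldn't be used.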
              Price  Product
1/Feb/2020     4400  Pencils
2/Feb/2020        0  Pencils
3/Feb/2020        0  Pencils
27/Feb/2020       0  Pencils
28/Feb/2020       0  Pencils
1/Mar/2020    55000  Pencils
2/Mar/2020        0  Pencils
3/Mar/2020        0  Pencils
30/Mar/2020       0  Pencils
31/Mar/2020       0  Pencils
1/Feb/2020     4400  Pens
2/Feb/2020     4454  Pens
3/Feb/2020        0  Pens
27/Feb/2020   34534  Pens
28/Feb/2020  345345  Pens
1/Mar/2020   550345  Pens
2/Mar/2020      354  Pens
3/Mar/2020      454  Pens
30/Mar/2020     454  Pens
31/Mar/2020    4545  Pens
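One idea is to pivot the products into columns with unstack, add the missing days with asfreq, and then stack back to the long form: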
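import pandas as pd
# Parse the index strings (e.g. '1/Feb/2020') into datetimes
df.index = pd.to_datetime(df.index)
# Pivot products into columns, add every missing calendar day, then reshape back to long form
df = df.set_index('Product', append=True).unstack().asfreq('d').stack(dropna=False).reset_index(level=1)
print(df)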
            Product   Price
2020-02-01  Pencils  4400.0
2020-02-01  Pens     4400.0
2020-02-02  Pencils     0.0
2020-02-02  Pens     4454.0
2020-02-03  Pencils     0.0
...         ...         ...
2020-03-29  Pens        NaN
2020-03-30  Pencils     0.0
2020-03-30  Pens      454.0
2020-03-31  Pencils     0.0
2020-03-31  Pens     4545.0
[120 rows x 2 columns]
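The output above still has NaN for the dates that were missing (for example 2020-03-29), while the question asks for a price of 0. Below is a minimal sketch of one alternative that fills those rows with 0 directly. It swaps the unstack/asfreq/stack chain for a reindex against the full date × product grid. It assumes each (date, product) pair occurs at most once. The shortened sample frame, the hard-coded February to March 2020 range, and the names `all_days`, `full_index`, `out` and the index label `Date` are all illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Small illustrative sample in the same shape as the question's data
# (string dates as the index, one row per date and product).
df = pd.DataFrame(
    {
        "Price": [4400, 0, 55000, 4400, 4454, 4545],
        "Product": ["Pencils", "Pencils", "Pencils", "Pens", "Pens", "Pens"],
    },
    index=["1/Feb/2020", "2/Feb/2020", "1/Mar/2020",
           "1/Feb/2020", "2/Feb/2020", "31/Mar/2020"],
)

# Parse index strings such as '1/Feb/2020' into datetimes.
df.index = pd.to_datetime(df.index, format="%d/%b/%Y")

# Every calendar day in February and March 2020 (assumed range).
all_days = pd.date_range("2020-02-01", "2020-03-31", freq="D")

# Full grid: every day paired with every product seen in the data.
full_index = pd.MultiIndex.from_product(
    [all_days, sorted(df["Product"].unique())],
    names=["Date", "Product"],
)

out = (
    df.rename_axis("Date")
      .set_index("Product", append=True)   # index becomes (Date, Product)
      .reindex(full_index, fill_value=0)   # missing pairs get Price 0
      .reset_index(level="Product")        # move Product back to a column
)

print(out)
```

Because the result is built from the full grid rather than from neighbouring rows, the row order of the input does not matter and no forward fill is involved.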