Insert row differences into the original DataFrame

Question (votes: 0, answers: 1)

I have a DataFrame loaded with pd.read_csv from a file that keeps being updated: every time the script runs, a new row is appended. It looks like this:

import pandas as pd

df = pd.read_csv(file_path)  # file_path: path to the CSV that the script appends to
print(df.to_string(index=False))

  timestamp    Puts   Calls  PutCh  CallCh  ChDiff
09:41:12 AM 2027891 1820724 280101  200974   79127
09:48:51 AM 2075976 1862053 328186  242303   85883
09:58:48 AM 2091487 1885842 343697  266092   77605
10:08:21 AM 2091879 1918592 344089  298842   45247
02:26:00 PM 1995234 1941917 247444  322167  -74723
02:44:36 PM 1990071 1934874 242281  315124  -72843
02:56:17 PM 1970892 1938472 223102  318722  -95620

Now I want the difference between each row and the row before it. I read about df.diff(), so I dropped the timestamp column to get a new DataFrame, df1, and wrote:

df1.diff()

My output is:

    Puts   Calls    PutCh  CallCh    ChDiff
     NaN     NaN      NaN     NaN       NaN
 48085.0 41329.0  48085.0 41329.0    6756.0
 15511.0 23789.0  15511.0 23789.0   -8278.0
   392.0 32750.0    392.0 32750.0  -32358.0
-96645.0 23325.0 -96645.0 23325.0 -119970.0
 -5163.0 -7043.0  -5163.0 -7043.0    1880.0
-19179.0  3598.0 -19179.0  3598.0  -22777.0

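What I want is to add these difference values, in parentheses, to the original DataFrame (df) under each column. In more detail, my output should look like this (the timestamp column should also be there, as in my df):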

Puts    Calls   PutCh   CallCh  ChDiff
2027891 1820724 280101  200974  79127
2075976 1862053 328186  242303  85883
(48085) (41329) (48085) (41329) (6756)
2091487 1885842 343697  266092  77605
(15511) (23789) (15511) (23789) (-8278)
2091879 1918592 344089  298842  45247
(392)   (32750) (392)   (32750) (-32358)

Is there a way to do this?

python pandas dataframe diff
1 Answer
0 votes

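Convert the output of diff to strings, add the parentheses, concat the result back onto the original DataFrame, and finally call sort_index to put the rows back in order: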

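# Row-to-row differences of the numeric columns; skip the first row, which is all NaN
tmp = (df.drop(columns='timestamp').diff()
         .iloc[1:]
         .apply(lambda s: '('+s.astype(str)+')')  # wrap each value in parentheses
      )

# The default sort is not guaranteed to be stable; mergesort keeps each
# original row ahead of the diff row that shares its index label
out = pd.concat([df, tmp]).sort_index(kind='mergesort')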

Output:

     timestamp        Puts      Calls       PutCh     CallCh       ChDiff
0  09:41:12 AM     2027891    1820724      280101     200974        79127
1  09:48:51 AM     2075976    1862053      328186     242303        85883
1          NaN   (48085.0)  (41329.0)   (48085.0)  (41329.0)     (6756.0)
2  09:58:48 AM     2091487    1885842      343697     266092        77605
2          NaN   (15511.0)  (23789.0)   (15511.0)  (23789.0)    (-8278.0)
3  10:08:21 AM     2091879    1918592      344089     298842        45247
3          NaN     (392.0)  (32750.0)     (392.0)  (32750.0)   (-32358.0)
4  02:26:00 PM     1995234    1941917      247444     322167       -74723
4          NaN  (-96645.0)  (23325.0)  (-96645.0)  (23325.0)  (-119970.0)
5  02:44:36 PM     1990071    1934874      242281     315124       -72843
5          NaN   (-5163.0)  (-7043.0)   (-5163.0)  (-7043.0)     (1880.0)
6  02:56:17 PM     1970892    1938472      223102     318722       -95620
6          NaN  (-19179.0)   (3598.0)  (-19179.0)   (3598.0)   (-22777.0)
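
The output above shows the differences as floats (for example (48085.0)) and NaN in the timestamp column, while the question asks for (48085). One possible variation, offered only as a sketch, casts the differences back to int before adding the parentheses and fills the timestamp of the diff rows with an empty string. The inline csv_text sample (the first three rows from the question) and the names diffs and csv_text are assumptions made for illustration; the real script would read from file_path instead.

```python
import io
import pandas as pd

# Sample rows copied from the question, standing in for the real CSV file
csv_text = """timestamp,Puts,Calls,PutCh,CallCh,ChDiff
09:41:12 AM,2027891,1820724,280101,200974,79127
09:48:51 AM,2075976,1862053,328186,242303,85883
09:58:48 AM,2091487,1885842,343697,266092,77605
"""
df = pd.read_csv(io.StringIO(csv_text))

# Differences without the all-NaN first row, cast back to int
# so that "48085.0" becomes "48085"
diffs = df.drop(columns="timestamp").diff().iloc[1:].astype(int)

# Wrap every value in parentheses, column by column
tmp = diffs.apply(lambda s: "(" + s.astype(str) + ")")

# Give the diff rows an empty timestamp instead of NaN
tmp.insert(0, "timestamp", "")

# mergesort is stable, so each original row stays ahead of its diff row
out = pd.concat([df, tmp]).sort_index(kind="mergesort").reset_index(drop=True)
print(out.to_string(index=False))
```

The int cast assumes the source columns contain no missing values; if they might, the cast would raise and a nullable type such as Int64 would be a safer choice.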