Month-over-month difference per customer and product category in Python

Question

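I have the data frame below, and I need to calculate the month-over-month difference per customer and category. The problem with my code: in the fourth row, column DiffA, I get zero instead of 1782207 (which is the increase for that month). I suspect the code fails because the customer had no product with Category = A in the previous month. Any idea how to fix this?

I am trying the code below: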

df.sort_values(by=['Name','Category','Date'], inplace=True)
# Category values are strings, so they must be quoted
df['DiffB'] = df[df['Category'] == 'B'].groupby('Name')['amount'].diff().fillna(0)
df['DiffA'] = df[df['Category'] == 'A'].groupby('Name')['amount'].diff().fillna(0)

But my output is as follows:

Name  Category  Date     amount    DiffB     DiffA
ABC   B         2023-04  1829540   0         0
ABC   B         2023-05  1829540   0         0
ABC   B         2023-06  12087873  10258333  0
ABC   A         2023-07  1782207   0         0
ABC   B         2023-07  10258333  -1829540  0

Below is the desired output:

Name  Category  Date     amount    DiffB     DiffA
ABC   B         2023-04  1829540   0         0
ABC   B         2023-05  1829540   0         0
ABC   B         2023-06  12087873  10258333  0
ABC   A         2023-07  1782207   0         1782207
ABC   B         2023-07  10258333  -1829540  0
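
The guess about the missing previous month can be checked with a small sketch. It assumes a hypothetical one-row frame, `a_rows`, holding only the category A data from the table above: `diff()` has no earlier row to subtract from, so it returns NaN, and `fillna(0)` then turns that NaN into 0.

```python
import pandas as pd

# Hypothetical mini-frame: the only category A row from the question's data
a_rows = pd.DataFrame({
    "Name": ["ABC"],
    "Date": ["2023-07"],
    "amount": [1782207],
})

# The first row of each group has no predecessor, so diff() yields NaN
raw = a_rows.groupby("Name")["amount"].diff()
print(raw.isna().tolist())

# fillna(0) then replaces that NaN with 0, hiding the increase
print(raw.fillna(0).tolist())
```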
Tags: python pandas diff

1 Answer

Something like this:

import pandas as pd

data = [['ABC', 'B', '2023-04', 1829540],
        ['ABC', 'B', '2023-05', 1829540],
        ['ABC', 'B', '2023-06', 12087873],
        ['ABC', 'A', '2023-07', 1782207],
        ['ABC', 'B', '2023-07', 10258333]]

df = pd.DataFrame(data, columns=['Name', 'Category', 'Date', 'Amount'])

# Calculate differences for 'B' category
df['DiffB'] = df.where(df['Category'] == 'B')['Amount'].diff().fillna(df['Amount'])

# Calculate differences for 'A' category
df['DiffA'] = df.where(df['Category'] == 'A')['Amount'].diff().fillna(df['Amount'])

# For rows not in category 'B' or 'A', set 'DiffB' or 'DiffA' respectively to 0
df['DiffB'] = df.apply(lambda x: 0 if x['Category'] != 'B' else x['DiffB'], axis=1)
df['DiffA'] = df.apply(lambda x: 0 if x['Category'] != 'A' else x['DiffA'], axis=1)

print(df)

  Name Category     Date    Amount       DiffB      DiffA
0  ABC        B  2023-04   1829540   1829540.0        0.0
1  ABC        B  2023-05   1829540         0.0        0.0
2  ABC        B  2023-06  12087873  10258333.0        0.0
3  ABC        A  2023-07   1782207         0.0  1782207.0
4  ABC        B  2023-07  10258333  10258333.0        0.0
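
Note that this output differs from the desired output in rows 0 and 4: DiffB is 1829540 and 10258333 rather than 0 and -1829540. The `where` call leaves a NaN gap at the category A row, so the following B row has nothing to subtract from and falls back to its own amount. A possible alternative, sketched below, takes the difference within each (Name, Category) group instead. It rests on two assumptions inferred from the desired output rather than stated in the question: rows in a customer's first month get 0, and a category that first appears in a later month counts as a full increase. It also assumes `Date` strings in `YYYY-MM` form, which sort correctly as text.

```python
import pandas as pd

df = pd.DataFrame(
    [["ABC", "B", "2023-04", 1829540],
     ["ABC", "B", "2023-05", 1829540],
     ["ABC", "B", "2023-06", 12087873],
     ["ABC", "A", "2023-07", 1782207],
     ["ABC", "B", "2023-07", 10258333]],
    columns=["Name", "Category", "Date", "Amount"],
)

# Keep rows in chronological order so diff() compares consecutive months
df = df.sort_values(["Name", "Date", "Category"]).reset_index(drop=True)

# Change relative to the previous row of the same customer and category
delta = df.groupby(["Name", "Category"])["Amount"].diff()

# Assumption: a category first seen after the customer's first month is a
# full increase; rows in the customer's first month get 0
first_month = df.groupby("Name")["Date"].transform("min")
delta = delta.fillna(df["Amount"].where(df["Date"] > first_month, 0))

# One column per category, 0 for rows that belong to another category
for cat in ["A", "B"]:
    df[f"Diff{cat}"] = delta.where(df["Category"] == cat, 0).astype(int)

print(df)
```

This sketch compares each row with the previous row present for that category, so a category that skips a month and then returns is compared against its last recorded amount, not against zero.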