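I'm trying to create a new column by performing some operations on existing columns, but my code raises a KeyError. I tried debugging with df.columns and copy-pasted the exact column name, but I still get the same error. My code is below: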
def calculate_elasticity(group):
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    elasticity = sales_change / price_change
    return elasticity
df['Variant-based Elasticity'] = df.groupby('Variant').transform(calculate_elasticity)
The error shown is:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3801 try:
-> 3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
16 frames
pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()
KeyError: 'Primary Sales Quantity'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
3802 return self._engine.get_loc(casted_key)
3803 except KeyError as err:
-> 3804 raise KeyError(key) from err
3805 except TypeError:
3806 # If we have a listlike key, _check_indexing_error will raise
KeyError: 'Primary Sales Quantity'
I tried debugging; here is the result of df.columns:
Index(['Cal. year / month', 'Material', 'Product Name', 'MRP', 'Distribution Channel (Master)', 'Unnamed: 5', 'L1 Prod Category', 'L2 Prod Brand', 'L3 Prod Sub-Category', 'State', 'Primary Actual GSV Value', 'Primary Sales Qty (CS)', 'Secondary GSV', 'Secondary sales Qty(CS)', 'Primary Volume(MT/KL)', 'Secondary Volume(MT/KL)', 'Variant', 'Weight', 'Offers', 'Primary Sales Quantity'], dtype='object')
print(df['Primary Sales Quantity'])
gives:
0 155
1 16953
2 455
3 138
4 2653
...
14147 6
14148 1
14149 8428
14150 237
14151 24
Name: Primary Sales Quantity, Length: 14152, dtype: int64
I tried debugging with the column name. I can even access the column by that name; the error is only thrown inside this function.
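GroupBy.transform passes each column to the function separately as a Series, so it cannot work with 2 columns together. group['Primary Sales Quantity'] is then looked up in the index of that Series, which raises the KeyError. You need GroupBy.apply instead: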
def calculate_elasticity(group):
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    group['Variant-based Elasticity'] = sales_change / price_change
    return group
df = df.groupby('Variant', group_keys=False).apply(calculate_elasticity)
print (df)
Variant Primary Sales Quantity MRP Variant-based Elasticity
0 a 10 8 NaN
1 a 7 10 -1.200000
2 b 87 3 NaN
3 b 8 2 2.724138
Or use an alternative solution without a helper function:
g = df.groupby('Variant')
df['Variant-based Elasticity'] = (g['Primary Sales Quantity'].pct_change() /
                                  g['MRP'].pct_change())
print (df)
Variant Primary Sales Quantity MRP Variant-based Elasticity
0 a 10 8 NaN
1 a 7 10 -1.200000
2 b 87 3 NaN
3 b 8 2 2.724138
Alternative solution with a helper DataFrame df1:
df1 = df.groupby('Variant')[['Primary Sales Quantity', 'MRP']].pct_change()
df['Variant-based Elasticity'] = df1['Primary Sales Quantity'] / df1['MRP']
print (df)
Variant Primary Sales Quantity MRP Variant-based Elasticity
0 a 10 8 NaN
1 a 7 10 -1.200000
2 b 87 3 NaN
3 b 8 2 2.724138
Sample data:
import pandas as pd

df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'],
                   'Primary Sales Quantity': [10, 7, 87, 8],
                   'MRP': [8, 10, 3, 2]})
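To see why the original call fails, here is a minimal sketch that reuses the sample data. It first checks that a single column (a Series) does not contain the column name in its index, which is the lookup that fails inside transform. It then reproduces the KeyError and computes the same ratio column-wise. The names col, transform_failed and elasticity are only illustrative, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({'Variant': ['a', 'a', 'b', 'b'],
                   'Primary Sales Quantity': [10, 7, 87, 8],
                   'MRP': [8, 10, 3, 2]})

# A single column is a Series indexed by row labels (0, 1, 2, 3),
# so the column name is not a valid key for it.
col = df['Primary Sales Quantity']
print('Primary Sales Quantity' in col.index)

def calculate_elasticity(group):
    # Works only if `group` is a DataFrame that holds both columns.
    sales_change = group['Primary Sales Quantity'].pct_change()
    price_change = group['MRP'].pct_change()
    return sales_change / price_change

# The function from the question fails under transform with the
# same KeyError as in the traceback.
try:
    df.groupby('Variant').transform(calculate_elasticity)
    transform_failed = False
except KeyError as err:
    transform_failed = True
    print('KeyError:', err)

# Column-wise alternative: compute each per-group pct_change, then divide.
g = df.groupby('Variant')
elasticity = g['Primary Sales Quantity'].pct_change() / g['MRP'].pct_change()
print(elasticity)
```

The first row of each group is NaN because pct_change has no previous row to compare against within that group.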