Pandas - Merging an interval index to floats

Question

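I am using qcut to bin predicted values and compute the standard error for each bin. I then want to apply those standard errors to the predictions in another dataframe, by mapping that dataframe's predictions to the SE of the bin they fall into.

Below is the code I am working with; the last line is a made-up placeholder for the operation I need.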

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(1000, 2)), columns=['Pred', 'Error'])
df2 = pd.DataFrame(np.random.randint(0, 100, size=(1000, 1)), columns=['Pred'])
df['binned'] = pd.qcut(df['Pred'], 10)
binSEs = df.groupby(['binned'], observed=False)['Error'].std()

df2['binSE'] = unknownintervaljoin(df2['Pred'], binSEs)

Alternatively, if I could create a 'binned' column in df2 based on the bins in binSEs, I could merge in the binSEs series.

Answer 1

You can achieve this by first computing a 'binned' column in df2 using the bins derived from df['Pred'], and then merging that 'binned' column with binSEs:

import pandas as pd
import numpy as np

# Create the dataframes
df = pd.DataFrame(np.random.randint(0, 100, size=(1000, 2)), columns=['Pred', 'Error'])
df2 = pd.DataFrame(np.random.randint(0, 100, size=(1000, 1)), columns=['Pred'])

# Bin df['Pred'] into deciles; keep integer bin codes and the bin edges
df['binned'], edges = pd.qcut(df['Pred'], 10, labels=False, retbins=True, duplicates='drop')

# Calculate binSEs (indexed by integer bin code)
binSEs = df.groupby('binned')['Error'].std()

# Assign df2's predictions to the same bins by reusing df's edges
# (values outside the range of df['Pred'] fall in no bin and get NaN)
df2['binned'] = pd.cut(df2['Pred'], bins=edges, labels=False, include_lowest=True)

# Look up the SE of each row's bin
df2['binSE'] = df2['binned'].map(binSEs)

print(df2.head())

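In this code, we first compute the 'binned' column in df2 by assigning each prediction to a bin. We then map df2's 'binned' column onto binSEs, which looks up the standard error for each row's bin and stores it in a new 'binSE' column. The lookup only works if the bin labels in df2 match the index of binSEs, so both must be derived from the same bin edges (those computed from df['Pred']); binning df2 by its own quantiles would produce different bins.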

This gives you df2 with a 'binSE' column containing the standard errors based on the bins in df.
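If you would rather keep the interval-indexed binSEs from the question and join the floats against it directly, one possible approach is to go through pandas' IntervalIndex.get_indexer, which returns the position of the interval containing each value (or -1 if no interval contains it). The following is only a sketch on toy data: the values and the intermediate names `intervals`, `pos` and `se` are invented for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values, chosen so the bins are easy to check by hand)
df = pd.DataFrame({
    "Pred":  [1, 2, 3, 4, 5, 6, 7, 8],
    "Error": [1.0, 3.0, 2.0, 6.0, 0.0, 10.0, 5.0, 5.0],
})
df2 = pd.DataFrame({"Pred": [1.5, 4.0, 7.9, 50.0]})

# Same steps as in the question: interval bins and a per-bin std
df["binned"] = pd.qcut(df["Pred"], 4)
binSEs = df.groupby("binned", observed=False)["Error"].std()

# Build an IntervalIndex in the same order as the rows of binSEs
intervals = pd.IntervalIndex(list(binSEs.index))

# Position of the interval containing each prediction; -1 means "no bin"
pos = intervals.get_indexer(df2["Pred"].to_numpy())

# Pick the matching SE, leaving NaN where the prediction is outside all bins
se = binSEs.to_numpy()
df2["binSE"] = np.where(pos >= 0, se[pos], np.nan)

print(df2)
```

Here the prediction 50.0 lies above every bin built from df, so it gets NaN instead of being silently assigned to the last bin. Whether out-of-range predictions should instead be clipped to the nearest bin is a modelling choice that the question does not settle.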
