Assigning values to a column based on conditions in a pandas DataFrame

Question   Votes: 1   Answers: 3

I have the following dataset:

device_id   A   B   C   Current Class   
1           70  35  40     C                
2           45  90  34     B

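Each device has a score in each class (A, B, C) and currently belongs to one of them. Depending on which class has the highest score, a class change is either recommended or not.

For example, device 1 is in class C, but its highest score is in class A, so its recommended class is A.

Expected output: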

device_id   A   B   C   Current Class   Class Change    Recommended
1           70  35  40  C                   Yes             A
2           45  90  34  B                   No              B

Can someone help me with this?

python pandas condition
3 Answers
1
vote

A numpy solution :-)

df['Recommended']=np.array(list('ABC'))[np.argmax(df[list('ABC')].values,1)]
df
Out[172]: 
   device_id   A   B   C CurrentClass Recommended
0          1  70  35  40            C           A
1          2  45  90  34            B           B
# a change is needed when the current class differs from the recommended one
(df.CurrentClass!=df.Recommended).map({False:'No',True:'Yes'})
Out[173]: 
0    Yes
1     No
dtype: object
df['Class Change']=(df.CurrentClass!=df.Recommended).map({False:'No',True:'Yes'})
df
Out[175]: 
   device_id   A   B   C CurrentClass Recommended Class Change
0          1  70  35  40            C           A          Yes
1          2  45  90  34            B           B           No
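
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same numpy idea. It assumes the sample data from the question and a column named CurrentClass (no space), as in the output above. The comparison uses != so that 'Yes' marks a device whose class should change, which is what the question's expected output shows.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the column name CurrentClass is an assumption
df = pd.DataFrame({
    'device_id': [1, 2],
    'A': [70, 45],
    'B': [35, 90],
    'C': [40, 34],
    'CurrentClass': ['C', 'B'],
})

classes = np.array(['A', 'B', 'C'])
# argmax along axis 1 gives the position of the highest score in each row
best = np.argmax(df[classes.tolist()].to_numpy(), axis=1)
df['Recommended'] = classes[best]
# 'Yes' when the recommended class differs from the current one
df['Class Change'] = (df['CurrentClass'] != df['Recommended']).map({True: 'Yes', False: 'No'})
print(df)
```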

1
vote

I would first find the column holding the maximum to get the Recommended column, then compare it with Current Class to get the Class Change column, as shown below:

import pandas as pd

devices = pd.DataFrame({'A':[70, 45],
                       'B':[35, 90],
                       'C':[40, 34],
                       'Current Class':['C','B']})

devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(1)

# True where the recommended class differs from the current one
devices['Class Change'] = devices['Current Class'] != devices['Recommended']

print(devices)

Output:

    A   B   C Current Class Recommended  Class Change
0  70  35  40             C           A          True
1  45  90  34             B           B         False
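
The question asks for Yes/No labels, where Yes means the device should move to another class. A minimal sketch of producing those labels from the idxmax result follows, assuming the sample data from the question; the intermediate name needs_change is only illustrative.

```python
import pandas as pd

# Sample data from the question
devices = pd.DataFrame({'device_id': [1, 2],
                        'A': [70, 45],
                        'B': [35, 90],
                        'C': [40, 34],
                        'Current Class': ['C', 'B']})

# Column label of the row-wise maximum
devices['Recommended'] = devices[['A', 'B', 'C']].idxmax(axis=1)

# A change is recommended when the best class differs from the current one
needs_change = devices['Current Class'] != devices['Recommended']
devices['Class Change'] = needs_change.map({True: 'Yes', False: 'No'})

print(devices)
```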

1
vote

I think you need idxmax and numpy.where:

a = df[['A','B','C']].idxmax(axis=1)
#more general solution is select all columns without first and last
#a = df.iloc[:, 1:-1].idxmax(axis=1)
print (df.iloc[:, 1:-1])
    A   B   C
0  70  35  40
1  45  90  34

df['Class Change'] = np.where(df['Current Class'] == a, 'No', 'Yes')
df['Recommended'] = a
print (df)
   device_id   A   B   C Current Class Class Change Recommended
0          1  70  35  40             C          Yes           A
1          2  45  90  34             B           No           B

Details:

print (a)
0    A
1    B
dtype: object

If the order of the new columns does not matter and they can be swapped:

df['Recommended'] = df[['A','B','C']].idxmax(1)
df['Class Change'] = np.where(df['Current Class'] == df['Recommended'], 'No', 'Yes')
print (df)
   device_id   A   B   C Current Class Recommended Class Change
0          1  70  35  40             C           A          Yes
1          2  45  90  34             B           B           No
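
If this needs to be reused for other score columns, the two steps can be wrapped in a small helper. This is only a sketch: the function name recommend_class and its parameters are hypothetical, not part of pandas. Note that idxmax returns the first column holding the maximum, so on a tie the earliest listed class wins.

```python
import numpy as np
import pandas as pd

def recommend_class(df, score_cols, current_col='Current Class'):
    """Return a copy of df with Recommended and Class Change columns added."""
    out = df.copy()
    # Label of the highest-scoring column per row (first one on ties)
    out['Recommended'] = out[score_cols].idxmax(axis=1)
    # 'No' when the device is already in its best class, 'Yes' otherwise
    out['Class Change'] = np.where(out[current_col] == out['Recommended'], 'No', 'Yes')
    return out

# Sample data from the question
df = pd.DataFrame({'device_id': [1, 2],
                   'A': [70, 45],
                   'B': [35, 90],
                   'C': [40, 34],
                   'Current Class': ['C', 'B']})

result = recommend_class(df, ['A', 'B', 'C'])
print(result)
```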