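I have a DataFrame and I'm trying to produce the `RESULT` column using a groupby on the `Set`, `Subset`, and `Subsubset` columns. Within each group, I'm trying to return the `idxmax` on `perc`, i.e. the `Class` of the row with the highest `perc`.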
| Set | Subset | Subsubset | Class | perc | RESULT |
|-----|--------|-----------|-------|------|--------|
| 1 | A | 1 | good | 100 | good |
| 1 | A | | ok | 0 | good |
| 1 | A | | poor | 0 | good |
| 1 | A | | bad | 0 | good |
| 1 | A | 2 | good | 20 | bad |
| 1 | A | | ok | 10 | bad |
| 1 | A | | poor | 20 | bad |
| 1 | A | | bad | 50 | bad |
| 1 | A | 3 | good | 0 | poor |
| 1 | A | | ok | 10 | poor |
| 1 | A | | poor | 80 | poor |
| 1 | A | | bad | 10 | poor |
| 1 | B | 1 | good | 50 | good |
| 1 | B | | ok | 0 | good |
| 1 | B | | poor | 1 | good |
| 1 | B | | bad | 49 | good |
| 1 | B | 2 | good | 60 | good |
| 1 | B | | ok | 10 | good |
| 1 | B | | poor | 20 | good |
| 1 | B | | bad | 10 | good |
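To clarify, the result will always be a single value (e.g. there will never be a 50/50 split).
The Set numbers run into the hundreds, and Subset goes up to ZZ (it is a very long table).
This is different from the similar question "Python : Getting the Row which has the max value in groups using groupby", because I'm interested in grouping on MULTIPLE columns.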
Since you mentioned `idxmax`, let's use `idxmax`:
idx = df.groupby(['Set', 'Subset', 'Subsubset'])['perc'].transform('idxmax')
df['RESULT'] = df.loc[idx, 'Class'].values  # alternative: df.Class.reindex(idx).values
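
For a quick check, here is a minimal self-contained sketch of the same approach on the first two groups of the table in the question. It assumes the blank `Subsubset` cells are missing values that belong to the group started by the row above, so they are forward-filled first; if your real data already has `Subsubset` filled in on every row, that step can be skipped. The `keys` variable is only an illustrative name.

```python
import pandas as pd

# Sample rows copied from the first two groups of the question's table.
# Blank Subsubset cells are represented as None (an assumption about the real data).
df = pd.DataFrame({
    "Set": [1] * 8,
    "Subset": ["A"] * 8,
    "Subsubset": [1, None, None, None, 2, None, None, None],
    "Class": ["good", "ok", "poor", "bad"] * 2,
    "perc": [100, 0, 0, 0, 20, 10, 20, 50],
})

# Assumption: a blank Subsubset belongs to the group started by the row above it.
df["Subsubset"] = df["Subsubset"].ffill()

keys = ["Set", "Subset", "Subsubset"]

# For every row, the index label of the max-perc row within its group.
idx = df.groupby(keys)["perc"].transform("idxmax")

# Look up the Class at those labels and assign it positionally.
df["RESULT"] = df.loc[idx, "Class"].to_numpy()

print(df)
```

Without the forward fill, rows whose group key is missing would not be assigned to any group, so the lookup with `df.loc` would likely fail or give unexpected results.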