Ranking unique values based on other columns (repost)


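There are many files, but I picked these 3 for the ranking. I want to loop over them and show the ranking for each file. The goal is to number the unique values in the Subject column within each student name and year.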

I have 3 files:

Jack_2019.csv

StudentName    Subject      TypeofMoudle        Grade   Year
   Jack        Design       2D Modelling           4     2019
   Jack        Design       3D Modelling           4     2019
   Jack        Design       AD                     4     2019
   Jack       Networking    CloudComputing         4     2019
   Jack       Networking    NOS                    4     2019
   Jack       Coding        Mobile App             4     2019

Jack_2018.csv

StudentName    Subject      TypeofMoudle        Grade    Year
   Jack       Networking    CloudComputing         4     2018
   Jack       Networking    CloudComputing2        4     2018
   Jack       Design        Video Editing          3     2018
   Jack       Design        Photo Editing          4     2018
   Jack       Coding        Web App                4     2018

Mary_2019.csv

StudentName    Subject      TypeofMoudle        Grade    Year
   Mary      Networking    CloudComputing          4     2019
   Mary      Networking    NOS1                    4     2019
   Mary      Coding        Web App 1               4     2019

After merging all the files and cleaning the data:

StudentName    Subject      TypeofMoudle        Grade   Year
   Jack        Design       2D Modelling           4     2019
   Jack        Design       3D Modelling           4     2019
   Jack        Design       AD                     4     2019
   Jack       Networking    CloudComputing         4     2019
   Jack       Networking    NOS                    4     2019
   Jack       Coding        Mobile App             4     2019
   Jack       Networking    CloudComputing         4     2018
   Jack       Networking    CloudComputing2        4     2018
   Jack       Design        Video Editing          3     2018
   Jack       Design        Photo Editing          4     2018
   Jack       Coding        Web App                4     2018
   Mary      Networking    CloudComputing          4     2019
   Mary      Networking    NOS1                    4     2019
   Mary      Coding        Web App 1               4     2019
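
The merge step above can be sketched roughly as follows. This is only a minimal sketch: it assumes the files are comma-separated with the header row shown, and `merge_csv_sources` is a hypothetical helper name, not something from the question. In-memory strings stand in for two of the files so the snippet runs on its own.

```python
import io

import pandas as pd


def merge_csv_sources(sources):
    """Read each CSV source (a path or a file-like object) and stack the rows."""
    frames = [pd.read_csv(src, skipinitialspace=True) for src in sources]
    # ignore_index=True renumbers the rows 0..n-1 in the merged frame.
    return pd.concat(frames, ignore_index=True)


# Stand-ins for two of the files (assumed comma-separated layout).
jack_2018 = io.StringIO(
    "StudentName,Subject,TypeofMoudle,Grade,Year\n"
    "Jack,Networking,CloudComputing,4,2018\n"
    "Jack,Design,Video Editing,3,2018\n"
)
mary_2019 = io.StringIO(
    "StudentName,Subject,TypeofMoudle,Grade,Year\n"
    "Mary,Networking,CloudComputing,4,2019\n"
    "Mary,Coding,Web App 1,4,2019\n"
)

merged = merge_csv_sources([jack_2018, mary_2019])
print(merged)
```

With the real files, the same helper could be called with the paths instead, for example `merge_csv_sources(["Jack_2019.csv", "Jack_2018.csv", "Mary_2019.csv"])`.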

These are the reference columns I need:

StudentName    Subject         Year
       Jack        Design      2019
       Jack        Design      2019
       Jack        Design      2019
       Jack       Networking   2019
       Jack       Networking   2019
       Jack       Coding       2019
       Jack       Networking   2018
       Jack       Networking   2018
       Jack       Design       2018
       Jack       Design       2018
       Jack       Coding       2018
       Mary      Networking    2019
       Mary      Networking    2019
       Mary      Coding        2019

This is the result I want:

StudentName    Subject      Rank  Year 
   Jack        Design       1     2019
   Jack        Design       1     2019
   Jack        Design       1     2019
   Jack       Networking    2     2019
   Jack       Networking    2     2019
   Jack       Coding        3     2019
   Jack       Networking    1     2018
   Jack       Networking    1     2018
   Jack       Design        2     2018
   Jack       Design        2     2018
   Jack       Coding        3     2018
   Mary      Networking     1     2019
   Mary      Networking     1     2019
   Mary      Coding         2     2019

What I have tried:

df['Rank']=df.groupby(['StudentName','Year'])['Subject'].transform('count')
python pandas dataframe ranking
1 Answer

You can use groupby + cumcount:

df['rank']=df.groupby(['StudentName','Year','Subject'],sort=False)['Subject'].transform('cumcount')+1
print(df)
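
Note that `cumcount` numbers the rows inside each (StudentName, Year, Subject) group, so Jack's three 2019 Design rows would get 1, 2, 3 instead of 1, 1, 1. If the goal is to number each subject by its order of first appearance within a student and year, as in the desired output, one possible alternative is `pd.factorize` inside a grouped `transform`. The following is a minimal sketch under that assumption; the sample frame is rebuilt from the merged table in the question, and the `Rank` column name follows the desired output.

```python
import pandas as pd

# Rows copied from the merged table in the question (row order matters here).
df = pd.DataFrame({
    "StudentName": ["Jack"] * 11 + ["Mary"] * 3,
    "Subject": [
        "Design", "Design", "Design", "Networking", "Networking", "Coding",
        "Networking", "Networking", "Design", "Design", "Coding",
        "Networking", "Networking", "Coding",
    ],
    "Year": [2019] * 6 + [2018] * 5 + [2019] * 3,
})

# pd.factorize labels distinct values 0, 1, 2, ... in order of first appearance,
# so adding 1 numbers the subjects from 1 within each (StudentName, Year) group.
df["Rank"] = (
    df.groupby(["StudentName", "Year"], sort=False)["Subject"]
      .transform(lambda s: pd.factorize(s)[0] + 1)
      .astype(int)
)
print(df[["StudentName", "Subject", "Rank", "Year"]])
```

Because the numbering depends on first appearance, it follows the row order of the merged frame; sorting the rows before this step would change which subject gets 1.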