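I have posted similar questions several times, but each one was closed or redirected to another thread that did not answer my question. I hope this one stays open.
I have a DataFrame of US census data. I grouped the states with their corresponding counties, and there is another column with the population sorted from highest to lowest. All I want to do is slice it so that I get only the three most populous counties for each state. The final result should be the three most populous states, ranked by the combined population of their three most populous counties.
Here is my code so far: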
def answer_six():
    cdf = census_df[census_df['SUMLEV'] == 50]
    columns_to_keep = ['STNAME', 'CTYNAME', 'CENSUS2010POP']
    cdf = cdf[columns_to_keep]
    cdf = cdf.sort_values('CENSUS2010POP', ascending=False)
    cdf = cdf.groupby('STNAME')
    cdf = cdf.apply(pd.DataFrame.sort_values, 'CENSUS2010POP', ascending=False).head(100)
    # cdf = [i for i in cdf['STNAME'][:3] if all(cdf['STNAME']) == all(cdf['STNAME'])]
    return cdf

answer_six()
Here is a sample of my data:
    STNAME   CTYNAME                  CENSUS2010POP
37  Alabama  Jefferson County                658466
49  Alabama  Mobile County                   412992
45  Alabama  Madison County                  334811
51  Alabama  Montgomery County               229363
59  Alabama  Shelby County                   195085
63  Alabama  Tuscaloosa County               194656
 2  Alabama  Baldwin County                  182265
41  Alabama  Lee County                      140247
52  Alabama  Morgan County                   119490
 8  Alabama  Calhoun County                  118572
28  Alabama  Etowah County                   104430
35  Alabama  Houston County                  101547
48  Alabama  Marshall County                  93019
39  Alabama  Lauderdale County                92709
58  Alabama  St. Clair County                 83593
42  Alabama  Limestone County                 82782
61  Alabama  Talladega County                 82291
22  Alabama  Cullman County                   80406
26  Alabama  Elmore County                    79303
25  Alabama  DeKalb County                    71109
64  Alabama  Walker County                    67023
 5  Alabama  Blount County                    57322
 1  Alabama  Autauga County                   54571
17  Alabama  Colbert County                   54428
36  Alabama  Jackson County                   53227
57  Alabama  Russell County                   52947
23  Alabama  Dale County                      50251
16  Alabama  Coffee County                    49948
24  Alabama  Dallas County                    43820
11  Alabama  Chilton County                   43643
..  ...      ...                                ...
80  Alaska   Kenai Peninsula Borough          55400
79  Alaska   Juneau City and Borough          31275
72  Alaska   Bethel Census Area               17013
I guess you are looking for cdf.groupby('STNAME').head(3), applied after you have sorted cdf?
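To make that concrete, here is a minimal sketch of how the whole task might look, assuming the goal is to rank states by the combined population of their three largest counties. The small census_df below is a made-up stand-in (hypothetical state and county names, invented figures), and the variable names top3, state_totals and top_states are mine, not from the question.

```python
import pandas as pd

# Hypothetical stand-in for census_df; all names and figures are invented.
census_df = pd.DataFrame({
    "SUMLEV": [40, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50],
    "STNAME": ["State A"] * 5 + ["State B"] * 3 + ["State C"] * 2 + ["State D"] * 4,
    "CTYNAME": ["State A", "A1", "A2", "A3", "A4", "B1", "B2", "B3",
                "C1", "C2", "D1", "D2", "D3", "D4"],
    "CENSUS2010POP": [1000, 100, 400, 300, 200, 500, 50, 60,
                      700, 10, 20, 30, 40, 50],
})

# Keep county-level rows only (SUMLEV == 50), as in the question.
counties = census_df[census_df["SUMLEV"] == 50]

# Sort by population first, then take the first three rows of each state.
# groupby(...).head(3) preserves the sorted order within each group.
top3 = (counties.sort_values("CENSUS2010POP", ascending=False)
                .groupby("STNAME")
                .head(3))

# Sum those (up to) three counties per state and pick the three largest totals.
state_totals = top3.groupby("STNAME")["CENSUS2010POP"].sum()
top_states = state_totals.nlargest(3).index.tolist()

print(top_states)
```

Note that a state with fewer than three counties simply contributes the counties it has. The extra groupby/apply/sort_values step from the question should not be needed once the frame is sorted before grouping.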
P.S. Your questions may keep getting closed because they duplicate existing ones, for example "Pandas get topmost n records within each group".