How to group by a set of columns plus all combinations of another column's items, and aggregate on a value column?

Question

Here is the input DataFrame:

import pandas as pd

df = pd.DataFrame({'county': ['Laramie']*10 + ['Albany']*12,
                   'co': ['LU']*22,
                   'tech': ['cable']*6 + ['copper']*4 + ['cable']*6 + ['copper']*4 + ['Fiber']*2,
                   'loc': [*'abcdefdefgmnopqrnostow']})

I want to group by county, co, and every combination of the items in the tech column, then aggregate the loc column to get both unique and nunique.

This is the result I am looking for: Output df

I tried this:

df = df.groupby(['county', 'co'], as_index=True).agg({'tech':'unique', 'loc':'unique', 'loc':'nunique'}).reset_index()

But this does not give all possible combinations of the tech column.
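
A side note on the attempt above: a Python dict literal keeps only the last value for a repeated key, so the second 'loc' entry overwrites the first and only nunique is computed. Below is a minimal sketch using pandas named aggregation to get both at once. It still returns one row per (county, co) and does not produce the tech combinations; the output column names loc_unique and loc_nunique are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'county': ['Laramie']*10 + ['Albany']*12,
                   'co': ['LU']*22,
                   'tech': ['cable']*6 + ['copper']*4 + ['cable']*6 + ['copper']*4 + ['Fiber']*2,
                   'loc': [*'abcdefdefgmnopqrnostow']})

# Named aggregation: each output column gets its own (input column, function)
# pair, so 'loc' can be aggregated twice without a dict key collision.
# The names loc_unique / loc_nunique are illustrative, not from the question.
summary = (df.groupby(['county', 'co'], as_index=False)
             .agg(tech=('tech', 'unique'),
                  loc_unique=('loc', 'unique'),
                  loc_nunique=('loc', 'nunique')))
print(summary)
```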

Answer 1

Try collecting the unique loc values for each tech first, then merge that result with itself to pair up the techs:

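# One set of loc values per (county, co, tech)
m = df.groupby(['county','co','tech'], as_index=False).agg({'loc':set})

# Self-merge within each (county, co); tech_x < tech_y keeps each unordered pair once
out = m.merge(m, on=['county','co']).query('tech_x < tech_y')
out['tech'] = out['tech_x'] + ',' + out['tech_y']
# Union of the two techs' loc sets, and its size
out['loc'] = [x.union(y) for x,y in zip(out['loc_x'],out['loc_y'])]
out['loc-nunique'] = out['loc'].str.len()
out = out[['county','co','tech','loc','loc-nunique']]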

Output:

     county  co          tech                       loc  loc-nunique
1    Albany  LU   Fiber,cable     {w, q, r, m, n, p, o}            7
2    Albany  LU  Fiber,copper           {n, s, w, t, o}            5
5    Albany  LU  cable,copper  {q, r, m, n, p, s, t, o}            8
10  Laramie  LU  cable,copper     {a, b, c, g, f, d, e}            7
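
The self-merge above yields pairs of techs only. If single techs and larger combinations are also wanted (an assumption, since the expected output image is not available here), one possible sketch builds every combination with itertools.combinations. The column names mirror the answer's output; everything else is illustrative.

```python
from itertools import combinations
import pandas as pd

df = pd.DataFrame({'county': ['Laramie']*10 + ['Albany']*12,
                   'co': ['LU']*22,
                   'tech': ['cable']*6 + ['copper']*4 + ['cable']*6 + ['copper']*4 + ['Fiber']*2,
                   'loc': [*'abcdefdefgmnopqrnostow']})

rows = []
for (county, co), grp in df.groupby(['county', 'co']):
    # Set of loc values for each tech inside this (county, co) group.
    tech_sets = {tech: set(locs) for tech, locs in grp.groupby('tech')['loc']}
    techs = sorted(tech_sets)
    # Assumption: include combinations of every size, from 1 up to all techs.
    for r in range(1, len(techs) + 1):
        for combo in combinations(techs, r):
            locs = set().union(*(tech_sets[t] for t in combo))
            rows.append({'county': county, 'co': co,
                         'tech': ','.join(combo),
                         'loc': locs,
                         'loc-nunique': len(locs)})

out = pd.DataFrame(rows)
print(out)
```

To restrict the result to pairs only, as in the answer, replace the range with range(2, 3).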
