Python - How to get the number of teams under a manager using the manager hierarchy columns


I have a dataframe with employee email, manager email, and manager hierarchy columns. I am trying to get the number of teams each manager has.

My current dataframe:

emp_email        mgr_email       mgr_hier_01    mgr_hier_02     mgr_hier_03            
[email protected]     [email protected]    [email protected]    [email protected]   [email protected]     
[email protected]     [email protected]    [email protected]    [email protected]   [email protected]     
[email protected]    [email protected]   [email protected]    [email protected]
[email protected]    [email protected]   [email protected]    [email protected]                                       
[email protected]      [email protected]   [email protected]    [email protected]    [email protected]    
[email protected]    [email protected]    [email protected]    [email protected]                     
[email protected]     [email protected]    [email protected]    [email protected]                     
[email protected]   [email protected]  [email protected]    [email protected]    [email protected]   
[email protected]  [email protected]    [email protected]    [email protected]                     
[email protected]     [email protected]   [email protected]    [email protected]    [email protected]     
[email protected]   [email protected]   [email protected]    [email protected]   [email protected]    
[email protected]     [email protected]    [email protected]    [email protected]                                    
[email protected]    [email protected]     [email protected]                                     
[email protected]    [email protected]     [email protected]                                     
[email protected]     [email protected]   [email protected]    [email protected]                    
[email protected]     [email protected]     [email protected]                                     
[email protected]     [email protected]     [email protected]                                     
[email protected]     [email protected]     [email protected]                                     
[email protected]    [email protected]    [email protected]    [email protected]                      
[email protected]     [email protected]     [email protected]                                     
[email protected]   [email protected]    [email protected]    [email protected]
[email protected]    [email protected]   [email protected]    [email protected]                    
[email protected]      NAN             NAN                                             

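What I want is a column that, when the employee is a manager, gives the number of teams under that manager. For example, [email protected] has 2 managers reporting to her ([email protected][email protected]), so the number of teams under her should be 2. [email protected] has no managers reporting to him, but he is a manager himself and manages 2 individual contributors ([email protected][email protected]), so the number of teams under [email protected] should be 1.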

emp_email        mgr_email       mgr_hier_01    mgr_hier_02     mgr_hier_03      num_teams_if_mgr     
[email protected]     [email protected]    [email protected]    [email protected]   [email protected]     0
[email protected]     [email protected]    [email protected]    [email protected]   [email protected]     0
[email protected]    [email protected]   [email protected]    [email protected]                    0
[email protected]    [email protected]   [email protected]    [email protected]                    0
[email protected]      [email protected]   [email protected]    [email protected]    [email protected]    0
[email protected]    [email protected]    [email protected]    [email protected]                     0
[email protected]     [email protected]    [email protected]    [email protected]                     0
[email protected]   [email protected]  [email protected]    [email protected]    [email protected]   0
[email protected]  [email protected]    [email protected]    [email protected]                     0
[email protected]     [email protected]   [email protected]    [email protected]    [email protected]     0
[email protected]   [email protected]   [email protected]    [email protected]   [email protected]    0
[email protected]     [email protected]    [email protected]    [email protected]                     1                
[email protected]    [email protected]     [email protected]                                     2
[email protected]    [email protected]     [email protected]                                     1
[email protected]     [email protected]   [email protected]    [email protected]                    1
[email protected]     [email protected]     [email protected]                                     2
[email protected]     [email protected]     [email protected]                                     1
[email protected]     [email protected]     [email protected]                                     1 
[email protected]    [email protected]    [email protected]    [email protected]                     1  
[email protected]     [email protected]     [email protected]                                     1
[email protected]   [email protected]    [email protected]    [email protected]                     1  
[email protected]    [email protected]   [email protected]    [email protected]                    1
[email protected]      NAN             NAN                                             6

So far I have only been able to create the hierarchy columns for the dataframe, using the code below. Any kind of help is appreciated.

import networkx as nx
import pandas as pd

# create graph
G = nx.from_pandas_edgelist(df_hc, source='mgr_email', target='emp_email', create_using=nx.DiGraph)

# find roots (= top managers)
roots = [n for n,d in G.in_degree() if d==0]

# for each employee, find the hierarchy 
df_hierarchy = pd.DataFrame(
    [next((p for root in roots for p in nx.all_simple_paths(G, root, node)), [])[:-1]
     for node in df_hc['emp_email']],
    index=df_hc.index,
).rename(columns=lambda x: f'mgr_hier_{x+1:02d}')

# join to original DataFrame
df_hc2 = df_hc.join(df_hierarchy)
python pandas dataframe group-by hierarchy
1 Answer

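I don't fully understand your notion of a team, but assuming a team is more than one person, count each node's direct descendants that are not leaves: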

leafs = {n for n,d in G.out_degree() if d==0}
d = {n: len(nx.descendants_at_distance(G, n, 1)-leafs)
     for n in G.nodes}
df_hc['num_teams_if_mgr'] = df_hc['emp_email'].map(d)

Output:

          emp_email       mgr_email  mgr_hier_01    mgr_hier_02     mgr_hier_03  num_teams_if_mgr
0      [email protected]    [email protected]  [email protected]  [email protected]    [email protected]                 0
1      [email protected]    [email protected]  [email protected]  [email protected]    [email protected]                 0
2     [email protected]   [email protected]  [email protected]  [email protected]            None                 0
3     [email protected]   [email protected]  [email protected]  [email protected]            None                 0
4       [email protected]   [email protected]  [email protected]   [email protected]   [email protected]                 0
5     [email protected]    [email protected]  [email protected]   [email protected]            None                 0
6      [email protected]    [email protected]  [email protected]   [email protected]            None                 0
7    [email protected]  [email protected]  [email protected]   [email protected]  [email protected]                 0
8   [email protected]    [email protected]  [email protected]   [email protected]            None                 0
9      [email protected]   [email protected]  [email protected]   [email protected]    [email protected]                 0
10   [email protected]   [email protected]  [email protected]  [email protected]   [email protected]                 0
11     [email protected]    [email protected]  [email protected]   [email protected]            None                 0
12    [email protected]     [email protected]  [email protected]           None            None                 1
13    [email protected]     [email protected]  [email protected]           None            None                 0
14     [email protected]   [email protected]  [email protected]  [email protected]            None                 0
15     [email protected]     [email protected]  [email protected]           None            None                 1
16     [email protected]     [email protected]  [email protected]           None            None                 1
17     [email protected]     [email protected]  [email protected]           None            None                 0
18    [email protected]    [email protected]  [email protected]   [email protected]            None                 0
19     [email protected]     [email protected]  [email protected]           None            None                 0
20   [email protected]    [email protected]  [email protected]   [email protected]            None                 0
21    [email protected]   [email protected]  [email protected]  [email protected]            None                 0
22      [email protected]             NAN          NAN           None            None                 6
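
To try the counting rule on something small, here is a minimal sketch on a hypothetical seven-person hierarchy. The names `df_toy`, `leaves`, `num_teams` and all the toy values are made up for illustration and are not from the question's data; the networkx calls are the same ones used in the snippet above.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy hierarchy (values invented for illustration):
# ceo -> m1, m2 ; m1 -> a, b ; m2 -> lead ; lead -> c
df_toy = pd.DataFrame({
    "emp_email": ["ceo", "m1", "m2", "a", "b", "lead", "c"],
    "mgr_email": ["NAN", "ceo", "ceo", "m1", "m1", "m2", "lead"],
})

# Directed graph with edges pointing from manager to employee.
G = nx.from_pandas_edgelist(df_toy, source="mgr_email", target="emp_email",
                            create_using=nx.DiGraph)

# Leaves are nodes with no direct reports (out-degree 0).
leaves = {n for n, deg in G.out_degree() if deg == 0}

# For each node, count the direct reports (distance 1) that are not leaves.
num_teams = {n: len(nx.descendants_at_distance(G, n, 1) - leaves)
             for n in G.nodes}

df_toy["num_teams_if_mgr"] = df_toy["emp_email"].map(num_teams)
print(df_toy)
```

On this toy data `ceo` gets 2 (both `m1` and `m2` have reports), `m2` gets 1, and everyone else gets 0. Note that `m1` gets 0 even though `m1` manages two people, because both of them are leaves.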

If the employee should also count as +1 when they are not a leaf themselves:

leafs = {n for n,d in G.out_degree() if d==0}
d = {n: len(nx.descendants_at_distance(G, n, 1)-leafs) + (n not in leafs)
     for n in G.nodes}
df_hc['num_teams_if_mgr'] = df_hc['emp_email'].map(d)

Output:

          emp_email       mgr_email  mgr_hier_01    mgr_hier_02     mgr_hier_03  num_teams_if_mgr
0      [email protected]    [email protected]  [email protected]  [email protected]    [email protected]                 0
1      [email protected]    [email protected]  [email protected]  [email protected]    [email protected]                 0
2     [email protected]   [email protected]  [email protected]  [email protected]            None                 0
3     [email protected]   [email protected]  [email protected]  [email protected]            None                 0
4       [email protected]   [email protected]  [email protected]   [email protected]   [email protected]                 0
5     [email protected]    [email protected]  [email protected]   [email protected]            None                 0
6      [email protected]    [email protected]  [email protected]   [email protected]            None                 0
7    [email protected]  [email protected]  [email protected]   [email protected]  [email protected]                 0
8   [email protected]    [email protected]  [email protected]   [email protected]            None                 0
9      [email protected]   [email protected]  [email protected]   [email protected]    [email protected]                 0
10   [email protected]   [email protected]  [email protected]  [email protected]   [email protected]                 0
11     [email protected]    [email protected]  [email protected]   [email protected]            None                 0
12    [email protected]     [email protected]  [email protected]           None            None                 2
13    [email protected]     [email protected]  [email protected]           None            None                 1
14     [email protected]   [email protected]  [email protected]  [email protected]            None                 1
15     [email protected]     [email protected]  [email protected]           None            None                 2
16     [email protected]     [email protected]  [email protected]           None            None                 2
17     [email protected]     [email protected]  [email protected]           None            None                 1
18    [email protected]    [email protected]  [email protected]   [email protected]            None                 1
19     [email protected]     [email protected]  [email protected]           None            None                 1
20   [email protected]    [email protected]  [email protected]   [email protected]            None                 1
21    [email protected]   [email protected]  [email protected]  [email protected]            None                 0
22      [email protected]             NAN          NAN           None            None                 7
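
Neither variant reproduces the question's wording exactly, where a manager with only individual contributors counts as 1 team and a manager with manager reports counts one team per such report. Below is a minimal pandas-only sketch of that reading. It rests on an assumption the question does not settle: a manager who has both manager reports and individual contributors is counted by manager reports only. The toy data and the names `df_toy`, `is_mgr`, `mgr_reports` and `counts` are hypothetical.

```python
import pandas as pd

# Hypothetical toy hierarchy (values invented for illustration):
# ceo -> m1, m2 ; m1 -> a, b ; m2 -> lead ; lead -> c
df_toy = pd.DataFrame({
    "emp_email": ["ceo", "m1", "m2", "a", "b", "lead", "c"],
    "mgr_email": ["NAN", "ceo", "ceo", "m1", "m1", "m2", "lead"],
})

# Anyone who appears in mgr_email manages at least one person.
is_mgr = df_toy["emp_email"].isin(df_toy["mgr_email"])

# Per manager: how many direct reports are themselves managers.
mgr_reports = is_mgr.groupby(df_toy["mgr_email"]).sum()

# Map the counts back onto employees; non-managers are missing -> 0.
counts = df_toy["emp_email"].map(mgr_reports).fillna(0).astype(int)

# Assumed rule: a manager with no manager reports still has one team.
df_toy["num_teams_if_mgr"] = counts.where(counts > 0, is_mgr.astype(int))
print(df_toy)
```

Here `ceo` gets 2, while `m1`, `m2` and `lead` each get 1 and the individual contributors get 0. Whether that matches the intended definition depends on how mixed teams should be counted.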
