Recursive manipulation of Employee and Supervisor data to produce an organizational tree hierarchy in R

Question  Votes: 3  Answers: 2

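I regularly analyze data in "org tree" format to understand how often activities occur under a given leader within an organization. I need to produce a wide hierarchy from two columns of data: the employee's name and the supervisor's name.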

----------
df <- data.frame("Employee"=c("Bill","James","Amy","Jen","Henry"),
                      "Supervisor"=c("Jen","Jen","Steve","Amy","Amy"))
df
#   Employee Supervisor
# 1     Bill        Jen
# 2    James        Jen
# 3      Amy      Steve
# 4      Jen        Amy
# 5    Henry        Amy

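The result should be a wide data frame that lays out the org chart for each employee, starting with the CEO (or the topmost employee):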

#  Employee       H1     H2    H3
# 1    Bill    Steve    Amy   Jen
# 2   James    Steve    Amy   Jen
# 3     Amy    Steve     NA    NA
# 4     Jen    Steve    Amy    NA
# 5   Henry    Steve    Amy    NA

After a lot of research, the data.tree package seems to offer the most help. How can I perform this operation?

r
2 Answers
2
votes

Try this:

library(data.table)
setDT(df)

setnames(df, 'Supervisor', 'Supervisor.1')

j=1
while (df[, any(get(paste0('Supervisor.',j)) %in% Employee)]) {
  df[df, on=paste0('Supervisor.',j,'==Employee'),
     paste0('Supervisor.',j+1):= i.Supervisor.1]
  j = j + 1
}

df
#    Employee Supervisor.1 Supervisor.2 Supervisor.3
# 1:     Bill          Jen          Amy        Steve
# 2:    James          Jen          Amy        Steve
# 3:      Amy        Steve           NA           NA
# 4:      Jen          Amy        Steve           NA
# 5:    Henry          Amy        Steve           NA

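To reorder within each row, so the topmost supervisor comes first and the NAs go last:

# Reverse the non-NA supervisors in each row and push the NAs to the right
df = cbind(df[, 1], t(apply(df[, -1], 1, function(r) c(rev(r[!is.na(r)]), r[is.na(r)]))))
df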
#    Employee    V1  V2  V3
# 1:     Bill Steve Amy Jen
# 2:    James Steve Amy Jen
# 3:      Amy Steve  NA  NA
# 4:      Jen Steve Amy  NA
# 5:    Henry Steve Amy  NA
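
For comparison, here is a minimal base R sketch of the same idea without data.table: build a name-to-supervisor lookup, walk up the chain for each employee, and pad the chains to a common width. The helper name `chain_of` and the `H1`, `H2`, ... column names are only illustrative (the column names follow the desired output in the question), and the cycle check is an added safety assumption, not something the original data needs.

```r
df <- data.frame(Employee   = c("Bill", "James", "Amy", "Jen", "Henry"),
                 Supervisor = c("Jen", "Jen", "Steve", "Amy", "Amy"),
                 stringsAsFactors = FALSE)

# Named lookup vector: employee name -> supervisor name
boss <- setNames(df$Supervisor, df$Employee)

# Hypothetical helper: supervisors of one employee, topmost first
chain_of <- function(emp) {
  chain <- character(0)
  cur <- unname(boss[emp])
  while (!is.na(cur)) {
    # Guard against circular reporting lines (assumed not to occur in real data)
    if (cur == emp || cur %in% chain) stop("cycle detected at ", cur)
    chain <- c(cur, chain)        # prepend, so the top of the org ends up first
    cur <- unname(boss[cur])      # NA once we reach someone with no listed supervisor
  }
  chain
}

chains <- lapply(df$Employee, chain_of)
depth  <- max(lengths(chains))

# Pad every chain with NA on the right so all rows have the same width
wide <- do.call(rbind, lapply(chains, function(ch) { length(ch) <- depth; ch }))
colnames(wide) <- paste0("H", seq_len(depth))

result <- data.frame(Employee = df$Employee, wide, stringsAsFactors = FALSE)
print(result)
```

This walks each chain separately, so it does more lookups than the join-based loop above on large tables, but it produces the rows already ordered from the top of the organization down, with no separate reordering step.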

1
vote

If you don't insist on that exact output but want to work with the hierarchy, then data.tree is a good option. Here are some examples:

library(data.tree)
df <- data.frame("Employee"=c("Bill","James","Amy","Jen","Henry"),
                 "Supervisor"=c("Jen","Jen","Steve","Amy","Amy"))

dt <- FromDataFrameNetwork(df)

#here's your org chart:

print(dt)

Let's find Jen's subordinates, along with their level in the hierarchy:

Get(FindNode(dt, 'Jen')$leaves, 'level')

This returns the following:

 Bill James 
    4     4 

Just for fun, let's add a personnel budget:

dt$Set(salary = c(100000, 80000, 60000, 40000, 35000, 70000))

Print each salary together with the aggregated salary of the subordinates:

print(dt, 'salary', sal_subordinates = function(node) Aggregate(node, 'salary', sum))

This prints something like:

          levelName salary sal_subordinates
1 Steve             100000            80000
2  °--Amy            80000           130000
3      ¦--Jen        60000            75000
4      ¦   ¦--Bill   40000            40000
5      ¦   °--James  35000            35000
6      °--Henry      70000            70000

The data.tree vignettes have many examples of working with hierarchical data and aggregation.
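
If the wide table from the question is still wanted, one possible route from the tree is sketched below. It assumes that `Get("pathString")` returns each node's slash-separated path from the root (for example "Steve/Amy/Jen/Bill"), named by node, and that no name contains a "/". The `H1`, `H2`, ... column names and the variable `org` are illustrative only. Treat it as a sketch rather than the package's recommended way.

```r
library(data.tree)

df <- data.frame(Employee   = c("Bill", "James", "Amy", "Jen", "Henry"),
                 Supervisor = c("Jen", "Jen", "Steve", "Amy", "Amy"),
                 stringsAsFactors = FALSE)
dt <- FromDataFrameNetwork(df)

# One path per node, assumed to look like "Steve/Amy/Jen/Bill"
paths <- dt$Get("pathString")
parts <- strsplit(paths, "/", fixed = TRUE)

# Drop the last element (the node itself) to keep only its supervisors,
# then drop the root, which has no supervisors at all
chains <- lapply(parts, function(p) p[-length(p)])
chains <- chains[lengths(chains) > 0]

# Pad to a common width with NA and assemble the wide data frame
depth <- max(lengths(chains))
wide  <- do.call(rbind, lapply(chains, function(ch) { length(ch) <- depth; ch }))
colnames(wide) <- paste0("H", seq_len(depth))
org <- data.frame(Employee = names(chains), wide,
                  stringsAsFactors = FALSE, row.names = NULL)

# Restore the row order of the original data frame
org <- org[match(df$Employee, org$Employee), ]
rownames(org) <- NULL
print(org)
```

The rows come out of the tree in traversal order, which is why the last step matches them back to the original `df$Employee` order.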
