Flatten column variables into Yes/No in R with dplyr


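I have a data set of résumés for various people in a data frame. Each row is a new person entry, and there are multiple columns (school, positions held, city of birth, and so on). I want to build an adjacency matrix for these people, so I am looking for a way to "flatten" the column variables into Yes/No indicators.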

For example, a snippet of the data frame looks like this:

Name      City_of_birth   Job Title
Person1   'New York'      'Librarian'
Person2   'Shanghai'      'Secretary'
Person3   'Tokyo'         'Engineer'
Person4   'Lagos'         'CEO'
Person5   'Atlanta'       'Mayor'

I would like to transform the data frame so that the new column headers are "New York", "Shanghai", "Tokyo", ... with a Yes/No value for each row (person).

Name      New York?   Shanghai?   ...    Librarian?
Person1   Yes         No                 Yes
Person2   No          Yes                No
Person3   No          No                 No
Person4   ...
Person5   

I am very new to R, so I am open to using any tool to do this. Thanks in advance!

r dataframe dplyr adjacency-matrix
1 Answer

Here is an option using dplyr and tidyr:

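library(dplyr)
library(tidyr)
df %>% 
  # convert factors to character so both columns can be stacked together
  mutate(across(-Name, as.character)) %>% 
  # long format: one row per (Name, value) pair
  pivot_longer(-Name, names_to = "variable", values_to = "value") %>% 
  mutate(flag = "Yes") %>% 
  # one column per distinct value; combinations that never occur become "No"
  pivot_wider(id_cols = Name, names_from = value, values_from = flag,
              values_fill = list(flag = "No"))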

Data:

df <- structure(list(Name = structure(1:5, .Label = c("Person1", "Person2", 
"Person3", "Person4", "Person5"), class = "factor"), City_of_birth = structure(c(3L, 
4L, 5L, 2L, 1L), .Label = c("Atlanta", "Lagos", "New York", "Shanghai", 
"Tokyo"), class = "factor"), JobTitle = structure(c(3L, 5L, 2L, 
1L, 4L), .Label = c("CEO", "Engineer", "Librarian", "Mayor", 
"Secretary"), class = "factor")), class = "data.frame", row.names = c(NA, 
-5L))
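
Since the stated goal is an adjacency matrix, it may be easier to keep the indicators as 0/1 counts rather than "Yes"/"No" strings. The following is a minimal base R sketch, not part of the original answer. It assumes that two people count as "adjacent" when they share at least one attribute value, and it adds a hypothetical Person6 (not in the question's data) so that some links exist. The object names `incidence`, `adjacency` and `yes_no` are illustrative only.

```r
# Example data mirroring the question's snippet.
# Person6 is a hypothetical extra row, added so that some attributes are shared.
df <- data.frame(
  Name          = c("Person1", "Person2", "Person3", "Person4", "Person5", "Person6"),
  City_of_birth = c("New York", "Shanghai", "Tokyo", "Lagos", "Atlanta", "New York"),
  JobTitle      = c("Librarian", "Secretary", "Engineer", "CEO", "Mayor", "Engineer"),
  stringsAsFactors = FALSE
)

# Stack every (person, attribute value) pair into two parallel vectors:
# all cities first, then all job titles, with the names repeated to match.
person <- rep(df$Name, times = ncol(df) - 1)
value  <- unlist(df[-1], use.names = FALSE)

# Incidence matrix: rows are people, columns are attribute values, cells are counts
incidence <- unclass(table(person, value))

# The same information as "Yes"/"No" labels, if that display is preferred
yes_no <- ifelse(incidence > 0, "Yes", "No")

# Person-by-person adjacency: number of attribute values two people share
adjacency <- tcrossprod(incidence)
diag(adjacency) <- 0  # ignore self-links
```

With this layout, `adjacency["Person1", "Person6"]` is non-zero because both rows list New York, while people with nothing in common get 0. Whether shared counts or a plain 0/1 link is more useful depends on the analysis; `adjacency > 0` would give the latter.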