I have a data frame:
structure(list(Race = structure(c(3L, 2L, 3L, 9L, 9L, 11L,
5L, 11L, 3L, 3L, 3L, 3L, 7L, 3L, 11L, 5L, 9L, 10L, 9L, 10L, 2L,
3L, 2L, 6L, 9L, 10L, 3L, 10L, 8L, 3L, 5L, 1L, 2L, 9L, 4L, 3L), .Label = c("Black or African American",
"Black or African American,White or Caucasian", "East Asian",
"East Asian,Pacific Islander", "Hispanic or Latino/a", "Other",
"Pacific Islander", "South Asian", "White or Caucasian", "White or Caucasian,Hispanic or Latino/a",
"White or Caucasian,Middle Eastern"), class = "factor")), class = "data.frame", row.names = c(NA,
-36L))
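I am comparing multiple races in census data. I want to create a new variable that says whether a person is a minority, based on whether the row contains anything other than "White or Caucasian". So if someone lists themselves as "Pacific Islander", they would be "Minority" in the new variable. If they are listed as "White or Caucasian", they would be "Majority". Note that some of the cells contain combinations of races, including "White or Caucasian" together with some other race. Anyone with more than one race should still be considered "Minority".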
Why not simply:
library(dplyr)  # provides %>% and mutate()
df %>% mutate(new_var = ifelse(Race == "White or Caucasian", "Majority", "Minority"))
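To see why an exact comparison is enough here, below is a minimal base R sketch of the same idea on a small made-up sample. The four rows are hypothetical and only reuse a few of the levels from the question. Because `==` matches the whole label, a combined entry such as "White or Caucasian,Middle Eastern" is not equal to "White or Caucasian" and so ends up as "Minority".

```r
# Hypothetical four-row sample reusing some Race levels from the question
df <- data.frame(
  Race = factor(c(
    "White or Caucasian",
    "East Asian",
    "White or Caucasian,Middle Eastern",
    "Pacific Islander"
  ))
)

# Exact match on the full label: only rows that are exactly
# "White or Caucasian" become "Majority"; everything else,
# including multi-race combinations, becomes "Minority".
df$new_var <- ifelse(df$Race == "White or Caucasian", "Majority", "Minority")

print(df)
```

Only the first row is labelled "Majority". One thing to keep in mind: if `Race` contained missing values, `ifelse()` would return `NA` for those rows rather than either label.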