"Text to columns" using different variables as fields in R

Question (votes: 1, answers: 1)

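I'm trying to split a list in R into columns using several different criteria, and I have a feeling I'm overcomplicating what I need to do.

I'd like the "positives" in the list (that is, the item that starts the list, or anything with a + sign in front of it) to go into a Positive column.

Anything with a - sign in front of it goes into the Negative column.

Anything that is in c("EmilyP", "EmilyS") goes into an Emily column,

and anything that is in c("Red", "Blue") goes into a Color column.

I've tried dplyr and tidyr but couldn't make that work, and then I started writing a loop, which seems overly complicated.

Can anyone suggest a better approach?

(Input and desired output below.)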

input <- structure(list(Team.Name = c("Team 1", "Team 2", "Team 3", "Team 4", 
"Team 5", "Team 6"), Members = c("Frank + Terry - Joan - Bob + EmilyS + Red", 
"Frank + Bob - Neil - Janet - Tim + EmilyP + Blue", "Frank + Blue - Joan - Bob + EmilyP + Red", 
"Tom + Jerry - Bill - Jenny", "Tess + Jean + Jill + EmilyS", 
"Bill + Bob + Red")), class = "data.frame", row.names = c(NA, 
-6L))

And this is what I'm trying to get:

output <- structure(list(Team.Name = c("Team 1", "Team 2", "Team 3", "Team 4", 
"Team 5", "Team 6"), Positive = c("Frank + Terry", "Frank + Bob", 
"Frank", "Tom + Jerry", "Tess + Jean + Jill", "Bill + Bob"), 
    Negative = c("Joan - Bob", "Neil - Janet - Tim", "Joan - Bob", 
    "Bill - Jenny", "", ""), Emily = c("EmilyS", "EmilyP", "EmilyP", 
    "", "EmilyS", ""), Color = c("Red", "Blue", "Red + Blue", 
    "", "", "Red")), class = "data.frame", row.names = c(NA, 
-6L))
r dplyr tidyr
1 answer (votes: 0)

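Here is what I have so far. First, I split Members and build a long data frame with map_dfr(). Then I do some string manipulation. The first member of each group has no +, so I add one. For entries matching Emily followed by an uppercase letter, I replace the leading sign with Emily; likewise, I replace the sign in front of a color name with color. I then split the value column into type and value with separate(). For each id and type, I combine the names with toString(). Finally, I reshape the data to wide format with pivot_wider().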

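library(tidyverse)
library(stringi)

# Split each Members string at the whitespace that precedes a + or - sign,
# then stack the pieces into one long data frame keyed by row id.
map_dfr(.x = stri_split_regex(str = input$Members, pattern = "\\s(?=[+-])"),
        .f = enframe,
        .id = "id") %>% 
# Give the first member of each row an explicit "+ " prefix.
mutate(value = if_else(!substr(x = value, start = 1, stop = 1) %in% c("+", "-"),
                       paste("+ ", value, sep = ""), value),
       # Emily entries: swap the leading sign for the label "Emily".
       value = if_else(grepl(x = value, pattern = "Emily[A-Z]"),
                             sub(x = value, pattern = "[+-]", replacement = "Emily"),
                             value),
       # Color entries: swap the leading sign for the label "color".
       value = if_else(sub(x = value, pattern = "[+-]\\s", replacement = "") %in% stri_trans_totitle(colors(distinct = TRUE)),
                       sub(x = value, pattern = "[+-]", replacement = "color"),
                       value)) %>%
# The first token (+, -, Emily or color) becomes the type; the rest is the name.
separate(col = "value", into = c("type", "value"), sep = "\\s") %>% 
group_by(id, type) %>%  
summarise(value = toString(value), .groups = "drop") %>% 
# One column per type, one row per original input row.
pivot_wider(id_cols = "id", names_from = "type", values_from = "value")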

#  id    `-`              `+`              color     Emily 
#  <chr> <chr>            <chr>            <chr>     <chr> 
#1 1     Joan, Bob        Frank, Terry     Red       EmilyS
#2 2     Neil, Janet, Tim Frank, Bob       Blue      EmilyP
#3 3     Joan, Bob        Frank            Blue, Red EmilyP
#4 4     Bill, Jenny      Tom, Jerry       NA        NA    
#5 5     NA               Tess, Jean, Jill NA        EmilyS
#6 6     NA               Bill, Bob        Red       NA    
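
The result above uses comma-separated names and columns called `-` and `+`, whereas the question asks for Positive/Negative/Emily/Color columns joined with " + " and " - ". As a rough sketch of one way to get closer to that layout, here is a base R version of the same idea. It is not part of the original answer; the helper name split_members() and the vectors emilys and color_names are made up for illustration, and it assumes every name is separated from its sign by whitespace, as in the sample data.

```r
# Example data copied from the question.
input <- data.frame(
  Team.Name = paste("Team", 1:6),
  Members = c("Frank + Terry - Joan - Bob + EmilyS + Red",
              "Frank + Bob - Neil - Janet - Tim + EmilyP + Blue",
              "Frank + Blue - Joan - Bob + EmilyP + Red",
              "Tom + Jerry - Bill - Jenny",
              "Tess + Jean + Jill + EmilyS",
              "Bill + Bob + Red"),
  stringsAsFactors = FALSE
)

# Lookup vectors taken from the question (names are illustrative).
emilys <- c("EmilyP", "EmilyS")
color_names <- c("Red", "Blue")

# Hypothetical helper: sort the names of one Members string into four buckets.
split_members <- function(x) {
  # Prefix "+ " so the first name carries a sign like every other name,
  # then split at the whitespace that precedes each sign.
  tokens <- strsplit(paste("+", x), "\\s(?=[+-])", perl = TRUE)[[1]]
  sign <- substr(tokens, 1, 1)
  name <- sub("^[+-]\\s*", "", tokens)

  # The Emily and Color lookups take priority over the sign.
  bucket <- ifelse(name %in% emilys, "Emily",
            ifelse(name %in% color_names, "Color",
            ifelse(sign == "+", "Positive", "Negative")))

  # Collapse the names in one bucket; an empty bucket yields "".
  join <- function(b, sep) paste(name[bucket == b], collapse = sep)

  data.frame(Positive = join("Positive", " + "),
             Negative = join("Negative", " - "),
             Emily = join("Emily", " + "),
             Color = join("Color", " + "),
             stringsAsFactors = FALSE)
}

result <- cbind(input["Team.Name"],
                do.call(rbind, lapply(input$Members, split_members)))
print(result)
```

Names keep the order in which they appear in Members, so Team 3 comes out as "Blue + Red" rather than the "Red + Blue" shown in the desired output; if that order matters, the Color bucket would need an extra sorting step.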