Negative lookahead for whitespace when using the cSplit_e function in the splitstackshape package


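I want to split a column that holds multiple comma-separated responses into several columns, and I am using the cSplit_e function from the splitstackshape package to do it. Unfortunately, some of the items themselves contain commas, so I am trying to tell it to split only at commas that are not followed by a space.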

Here is the syntax I have so far:

cSplit_e(data=df,split.col="question",sep=",",type="character")

This takes:

Behavior; green, pink, blue,Sleep; indigo, violet, puce

and creates separate columns for:

question_Behavior; green
question_pink
question_blue
question_Sleep; indigo
question_violet
question_puce

But I would like it to split like this:

question_Behavior; green, pink, blue
question_Sleep; indigo, violet, puce

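I am not sure how to indicate in the cSplit_e syntax that I only want it to split at commas that are immediately followed by a non-space character, and I would appreciate any help!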

Example data frame:

id_num <- c("1","2","3","4","5")
question <- c("Behavior; green, pink, blue,Sleep; indigo, violet, puce","Behavior; green, pink, blue","","Sleep; indigo, violet, puce","Behavior; green, pink, blue,Sleep; indigo, violet, puce")

df <- data.frame(id_num,question)
r regex delimiter splitstackshape
1 Answer

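If you don't mind using the tidyr package (together with dplyr for the pipe, filter and rename_all), here is one possible solution. It may not be as elegant or simple as a splitstackshape approach, but I am not familiar with that package.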

My code:

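library(dplyr)
library(tidyr)

df %>%
  # split only at commas with a non-space character on both sides
  separate_rows(question, sep = "(?<=\\S),(?=\\S)", convert = FALSE) %>%
  # text before the first ";" is the question, the rest is the response
  separate(question, into = c("question", "response"), sep = ";", extra = "merge") %>%
  # drop rows with no response (the empty string for id_num 3)
  filter(!is.na(response)) %>%
  pivot_wider(names_from = question, values_from = response) %>%
  rename_all(~gsub("\\.", "_", .))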

Output:

# A tibble: 4 × 3
  id_num Behavior             Sleep                  
  <chr>  <chr>                <chr>                  
1 1      " green, pink, blue" " indigo, violet, puce"
2 2      " green, pink, blue"  NA                    
3 4       NA                  " indigo, violet, puce"
4 5      " green, pink, blue" " indigo, violet, puce"
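
For a layout closer to the one asked for in the question, the same idea can be sketched in base R. This is only a sketch under assumptions: it sidesteps cSplit_e entirely (whether cSplit_e accepts a Perl-style lookahead in sep is not verified here), and the names split_items, items, wide and result are hypothetical. The pattern ",(?=\\S)" consumes a comma only when the next character is not whitespace, and strsplit needs perl = TRUE for the lookahead to be understood.

```r
# Example data from the question, rebuilt so the sketch is self-contained.
question <- c(
  "Behavior; green, pink, blue,Sleep; indigo, violet, puce",
  "Behavior; green, pink, blue",
  "",
  "Sleep; indigo, violet, puce",
  "Behavior; green, pink, blue,Sleep; indigo, violet, puce"
)
df <- data.frame(id_num = as.character(1:5), question)

# Split only at commas that are immediately followed by a non-space character.
# Lookaheads require the PCRE engine, hence perl = TRUE.
split_items <- strsplit(as.character(df$question), ",(?=\\S)", perl = TRUE)

# One column per distinct item: the item text where present, NA otherwise.
items <- unique(unlist(split_items))
wide <- sapply(items, function(item) {
  present <- vapply(split_items, function(x) item %in% x, logical(1))
  ifelse(present, item, NA_character_)
})
colnames(wide) <- paste0("question_", items)

# cbind on a data frame keeps the column names as they are (no name mangling).
result <- cbind(df, wide)
print(result)
```

Each id_num keeps its own row here, including the one with an empty question, which ends up with NA in both new columns.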