I have an (example) data frame with an identifier (here "ID") and two variables of interest (here "V1" and "V2"):
df <- data.frame(ID = c("Sample 1", "Sample 2", "Sample 3"),
                 V1 = c("A, B, C", "E", "A, F"),
                 V2 = c("H, G", "C, A", "J"))
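For the two variables of interest (i.e., the columns), I would like to generate all possible combinations of the comma-separated values within each row, so that every combination stays identifiable by its identifier.
The resulting data frame could look like this (some flexibility in order and structure is fine, as long as the combinations are complete):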
rd <- data.frame(ID = c("Sample 1", "Sample 1", "Sample 1", "Sample 1", "Sample 1", "Sample 1",
                        "Sample 2", "Sample 2", "Sample 3", "Sample 3"),
                 V1 = c("A", "A", "B", "B", "C", "C", "E", "E", "A", "F"),
                 V2 = c("H", "G", "H", "G", "H", "G", "C", "A", "J", "J"))
Thank you very much for your support.
library(tidyr)

# Split each column in turn; every V1 value is paired with every V2 value of the same row
df |>
  separate_rows(V1) |>
  separate_rows(V2)
Result
         ID V1 V2
1  Sample 1  A  H
2  Sample 1  A  G
3  Sample 1  B  H
4  Sample 1  B  G
5  Sample 1  C  H
6  Sample 1  C  G
7  Sample 2  E  C
8  Sample 2  E  A
9  Sample 3  A  J
10 Sample 3  F  J
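If adding a package dependency is not an option, the same row-wise expansion can probably be done in base R as well. The following is only a minimal sketch, not part of the answer above: `expand_row` is a hypothetical helper name introduced here for illustration, and the code assumes the values are always separated by a comma with optional whitespace.

```r
df <- data.frame(ID = c("Sample 1", "Sample 2", "Sample 3"),
                 V1 = c("A, B, C", "E", "A, F"),
                 V2 = c("H, G", "C, A", "J"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: expand a single row into all V1 x V2 combinations
expand_row <- function(id, v1, v2) {
  # Assumption: values are separated by a comma plus optional whitespace
  parts1 <- strsplit(v1, ",\\s*")[[1]]
  parts2 <- strsplit(v2, ",\\s*")[[1]]
  # expand.grid() varies its first argument fastest, so V2 is passed first
  # to get the same row order as in the desired output
  combos <- expand.grid(V2 = parts2, V1 = parts1, stringsAsFactors = FALSE)
  data.frame(ID = id, V1 = combos$V1, V2 = combos$V2,
             stringsAsFactors = FALSE)
}

# Apply the helper to every row and stack the pieces
res <- do.call(rbind, Map(expand_row, df$ID, df$V1, df$V2))
rownames(res) <- NULL
res
```

Because each row is expanded on its own, combinations are never mixed across identifiers, which is the same guarantee the two chained `separate_rows()` calls give.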