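I have an input data.frame, combined, that contains many-to-many relationships, and I would like to normalize it into 2 tables. The input table holds the mixture compositions for a number of different samples and locations.
I would like to derive 2 tables from it: one (goal_1) containing location, sampleID and MixtureID, and a second (goal_2) containing the actual mixture compositions.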
library(tidyverse)
goal_1 = tribble(
~sampleID, ~location, ~MixtureID,
1, "A", 1,
2, "B", 2,
3, "C", 3,
4, "A", 4,
5, "A", 1,
6, "B", 2,
7, "B", 2
)
goal_2 = tribble(
~MixtureID, ~element, ~conc_pct,
1, "He", 0,
1, "H", 10,
1, "C", 0,
1, "O", 0,
1, "N", 0,
1, "Ca", 90,
1, "Cs", 0,
2, "Si", 0,
2, "S", 100,
2, "V", 0,
3, "Nb", 100,
3, "Fe", 0,
4, "C", 20,
4, "H", 10,
4, "S", 70
);
combined = left_join(goal_1, goal_2, by = "MixtureID", relationship = "many-to-many") |>
select(-MixtureID)
Basically, I want to reverse the combined = left_join(...) operation.
I can partially generate the goal_1 table (without the MixtureID column):
goal_1a = distinct(combined, sampleID, location)
But I am stuck on how to derive the goal_2 table from the combined table.
Given your combined dataset, built with
combined <- full_join( goal_1, goal_2,
by = "MixtureID", relationship = "many-to-many" )
I suggest you try something like this:
( table_1 <- combined |> distinct( sampleID, location ))
( table_2 <- combined |> distinct( sampleID, element, conc_pct ))
You can then join the two and get the same 28 rows back:
table_1 |>
inner_join( table_2, by = 'sampleID')
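The two tables above are keyed by sampleID, so an identical mixture is stored once per sample and no MixtureID is recovered. If the goal_1/goal_2 layout from the question is needed, one possible approach is to build a text "signature" of each sample's composition and give samples with the same signature the same ID. The following is only a sketch under some assumptions: split_mixtures() is a hypothetical helper name, combined is assumed to have the columns sampleID, location, element and conc_pct, each sample is assumed to have a single location, and the IDs are numbered by first appearance, so they need not match the original numbering in general. Comparing concentrations through paste() relies on their text representation, which may not be appropriate for values that differ only by floating-point noise.

```r
library(dplyr)

# Hypothetical helper: splits a denormalized table with columns
# sampleID, location, element, conc_pct into two normalized tables.
split_mixtures <- function(combined) {
  # One text signature per sample; sorting by element first makes the
  # signature independent of the row order within a sample.
  signatures <- combined |>
    arrange(sampleID, element) |>
    summarise(
      signature = paste(element, conc_pct, sep = "=", collapse = ";"),
      .by = c(sampleID, location)
    )

  # Samples with identical signatures share a MixtureID
  # (IDs are numbered in order of first appearance).
  samples <- signatures |>
    mutate(MixtureID = match(signature, unique(signature))) |>
    select(sampleID, location, MixtureID)

  # Attach the new ID to every row, then keep one copy of each composition.
  mixtures <- combined |>
    inner_join(select(samples, sampleID, MixtureID), by = "sampleID") |>
    distinct(MixtureID, element, conc_pct)

  list(samples = samples, mixtures = mixtures)
}
```

Calling res <- split_mixtures(combined) would then give res$samples in the shape of goal_1 and res$mixtures in the shape of goal_2, and joining the two by MixtureID (with relationship = "many-to-many") should reproduce the rows of combined.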