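I have the following two tables, ProteinList and PeptideList. Each protein has several peptides. I want to build a combined table in which each list of peptides is nested under its protein entry. I'm new to R and have tried reading some guides on rbind, but I don't even know how to approach this problem.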
ProteinList <- data.frame(Proteins = c("A1INK5", "K1SNC9"), Abundance = c(67, 25))
ProteinList
Proteins Abundance
A1INK5 67
K1SNC9 25
PeptideList <- data.frame(MasterProtein = c("A1INK5", "A1INK5", "A1INK5", "K1SNC9", "K1SNC9"), Peptides = c("VSAITLEDGVYR", "TLEDGVY", "LEDGVYRTD", "KNSLRGFDDDE", "MKLSFYHPNS"), Positions = c("[1-65]", "[53-78]", "[70-88]", "[30-42]", "[41-55]"))
PeptideList
MasterProtein Peptides Positions
A1INK5 VSAITLEDGVYR [1-65]
A1INK5 TLEDGVY [53-78]
A1INK5 LEDGVYRTD [70-88]
K1SNC9 KNSLRGFDDDE [30-42]
K1SNC9 MKLSFYHPNS [41-55]
How can I combine them like this:
ProteinPeptideList:
Proteins Abundance
A1INK5 67
Peptides Positions
VSAITLEDGVYR [1-65]
TLEDGVY [53-78]
LEDGVYRTD [70-88]
K1SNC9 25
Peptides Positions
KNSLRGFDDDE [30-42]
MKLSFYHPNS [41-55]
Sorry, I'm very new to R and don't even know where to start. I tried a few basic commands, but nothing came close to the result I want.
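If you are working with large data frames, the data.table package is a good choice. Its merge() joins the two tables on the protein ID, so the result is one flat table in which every peptide row carries its protein's abundance: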
library(data.table)
ProteinList <-
  data.table(Proteins = c("A1INK5", "K1SNC9"),
             Abundance = c(67, 25))
PeptideList <-
  data.table(
    MasterProtein = c("A1INK5", "A1INK5", "A1INK5", "K1SNC9", "K1SNC9"),
    Peptides = c(
      "VSAITLEDGVYR",
      "TLEDGVY",
      "LEDGVYRTD",
      "KNSLRGFDDDE",
      "MKLSFYHPNS"
    ),
    Positions = c("[1-65]", "[53-78]", "[70-88]", "[30-42]", "[41-55]")
  )
ProteinPeptideList <- merge(ProteinList, PeptideList, by.x = "Proteins", by.y = "MasterProtein")
Output:
> ProteinPeptideList
Key: <Proteins>
Proteins Abundance Peptides Positions
<char> <num> <char> <char>
1: A1INK5 67 VSAITLEDGVYR [1-65]
2: A1INK5 67 TLEDGVY [53-78]
3: A1INK5 67 LEDGVYRTD [70-88]
4: K1SNC9 25 KNSLRGFDDDE [30-42]
5: K1SNC9 25 MKLSFYHPNS [41-55]
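
The merged table above is flat: the protein columns repeat on every peptide row. If you really need the peptides nested under each protein, as in the layout sketched in the question, one possible approach is a named list built with base R's split(). This is only a sketch under assumptions: the names peptides_by_protein and ProteinPeptideNested are made up for illustration, and it assumes every MasterProtein value also appears in ProteinList$Proteins.

```r
# Base R only; the data are the example tables from the question.
ProteinList <- data.frame(
  Proteins  = c("A1INK5", "K1SNC9"),
  Abundance = c(67, 25),
  stringsAsFactors = FALSE
)
PeptideList <- data.frame(
  MasterProtein = c("A1INK5", "A1INK5", "A1INK5", "K1SNC9", "K1SNC9"),
  Peptides  = c("VSAITLEDGVYR", "TLEDGVY", "LEDGVYRTD",
                "KNSLRGFDDDE", "MKLSFYHPNS"),
  Positions = c("[1-65]", "[53-78]", "[70-88]", "[30-42]", "[41-55]"),
  stringsAsFactors = FALSE
)

# One data frame of peptides per protein, named by protein ID.
peptides_by_protein <- split(
  PeptideList[, c("Peptides", "Positions")],
  PeptideList$MasterProtein
)

# Hypothetical nested structure: one list element per protein, holding
# its ID, its abundance, and the data frame of its peptides.
ProteinPeptideNested <- lapply(seq_len(nrow(ProteinList)), function(i) {
  id <- ProteinList$Proteins[i]
  list(
    Protein   = id,
    Abundance = ProteinList$Abundance[i],
    Peptides  = peptides_by_protein[[id]]
  )
})
names(ProteinPeptideNested) <- ProteinList$Proteins

# Print each protein followed by its peptides.
for (p in ProteinPeptideNested) {
  cat(p$Protein, p$Abundance, "\n")
  print(p$Peptides, row.names = FALSE)
}
```

You can then pull out a single protein's peptides with ProteinPeptideNested[["A1INK5"]]$Peptides. For filtering, sorting, or plotting, the flat merged table is usually easier to work with; the nested list is mainly useful for display or export.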