cbind/join data frames of different lengths with no identifier


I am trying to cbind/join two data frames that have no unique identifier. Because of how they were web-scraped, the format is awkward.

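df1 contains the assay results: 1 row for every assay performed on a given day, plus a header row (ID, Name, Concentration) that separates the assays of one date from those of the next. df2 contains 1 row for each assay date. I need to bind the assay dates from df2 to df1.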

df1 <- data.frame(X1 = c("ID", "1", "2", "3", "4", "5", "ID", "1", "2", "3", "ID", "1", "2"),
                  X2 = c("Name", "Jose", "Mary", "Doug", "Luisa", "Pam", "Name", "Jose", "Doug", "Lou", "Name", "Luisa", "Pam"),
                  X3 = c("Concentration", "4.2", "2.3", "7.3", "1.4", "0.5", "Concentration", "0.1", "2.3", "2.1", "Concentration", "9.0", "1.4"))
df2 <- data.frame(X4 = c("Monday", "Tuesday", "Friday"),
                  X5 = c("January", "February", "March"),
                  X6 = c("12", "4", "21"))

I would like the resulting data frame to look like this.

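So far, I have tried to create an identifier for the assays that took place on the same day, but without success, because the number of assays per day varies widely. In reality, I have more than 200,000 assays from dozens of dates.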

r dataframe
1 Answer

To get what you want, you can use which and do:

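# Positions of the repeated header rows (ID, Name, Concentration) in df1
w <- which(df1$X1 == "ID")
# Number of rows in each block, header row included
n <- diff(c(w, nrow(df1) + 1L))
# Repeat each row of df2 once per row of its block, then bind column-wise
df3 <- data.frame(df1, df2[rep.int(seq_along(n), n), ])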
df3
    X1    X2            X3      X4       X5 X6
1   ID  Name Concentration  Monday  January 12
1.1  1  Jose           4.2  Monday  January 12
1.2  2  Mary           2.3  Monday  January 12
1.3  3  Doug           7.3  Monday  January 12
1.4  4 Luisa           1.4  Monday  January 12
1.5  5   Pam           0.5  Monday  January 12
2   ID  Name Concentration Tuesday February  4
2.1  1  Jose           0.1 Tuesday February  4
2.2  2  Doug           2.3 Tuesday February  4
2.3  3   Lou           2.1 Tuesday February  4
3   ID  Name Concentration  Friday    March 21
3.1  1 Luisa           9.0  Friday    March 21
3.2  2   Pam           1.4  Friday    March 21
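
The same alignment can also be written with a running block index, which may be easier to follow: cumsum(df1$X1 == "ID") labels every row with the number of header rows seen so far, and that label picks the matching row of df2. This is only a sketch, assuming df1 starts with a header row and df2 has exactly one row per header; the stopifnot() check and the names block and df3_alt are additions for illustration, not part of the answer above.

```r
df1 <- data.frame(X1 = c("ID", "1", "2", "3", "4", "5", "ID", "1", "2", "3", "ID", "1", "2"),
                  X2 = c("Name", "Jose", "Mary", "Doug", "Luisa", "Pam", "Name", "Jose", "Doug", "Lou", "Name", "Luisa", "Pam"),
                  X3 = c("Concentration", "4.2", "2.3", "7.3", "1.4", "0.5", "Concentration", "0.1", "2.3", "2.1", "Concentration", "9.0", "1.4"))
df2 <- data.frame(X4 = c("Monday", "Tuesday", "Friday"),
                  X5 = c("January", "February", "March"),
                  X6 = c("12", "4", "21"))

# Running count of header rows: 1 for the first date, 2 for the second, ...
block <- cumsum(df1$X1 == "ID")

# Assumption: the first row is a header and df2 has one row per header.
# Failing early is safer than silently misaligning dates.
stopifnot(block[1L] == 1L, max(block) == nrow(df2))

# Index df2 by block so each assay row receives its date columns
df3_alt <- data.frame(df1, df2[block, ], row.names = NULL)
```

With 200,000 rows the check is cheap and guards against a header that was missed or duplicated during scraping.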

However, for analysis in R, a data frame with non-redundant rows and proper data types would be more useful:

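w <- which(df1$X1 == "ID")
# Block sizes without the header rows
n <- diff(c(w, nrow(df1) + 1L)) - 1L
# Drop the header rows and attach one Date per assay; df2 has no year, so 2023 is hard-coded
df3 <- data.frame(df1[-w, ],
                  Date = rep.int(as.Date(paste(2023L, match(df2$X5, month.name), as.integer(df2$X6), sep = "-")), n),
                  row.names = NULL)
# Use the first header row as the column names
names(df3)[seq_along(df1)] <- as.character(df1[1L, ])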
df3
   ID  Name Concentration       Date
1   1  Jose           4.2 2023-01-12
2   2  Mary           2.3 2023-01-12
3   3  Doug           7.3 2023-01-12
4   4 Luisa           1.4 2023-01-12
5   5   Pam           0.5 2023-01-12
6   1  Jose           0.1 2023-02-04
7   2  Doug           2.3 2023-02-04
8   3   Lou           2.1 2023-02-04
9   1 Luisa           9.0 2023-03-21
10  2   Pam           1.4 2023-03-21
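
Note that only Date has been given a proper type at this point; ID and Concentration were scraped as text and are not numeric yet. One possible follow-up, shown here on a small hand-built stand-in for df3 so the snippet runs on its own, is to convert them with type.convert(). The choice of columns to convert is an assumption about what the analysis needs.

```r
# Small stand-in for the df3 built above (first three rows only)
df3 <- data.frame(ID = c("1", "2", "1"),
                  Name = c("Jose", "Mary", "Jose"),
                  Concentration = c("4.2", "2.3", "0.1"),
                  Date = as.Date(c("2023-01-12", "2023-01-12", "2023-02-04")))

# Convert the text columns that hold numbers; as.is = TRUE avoids creating factors
num_cols <- c("ID", "Concentration")
df3[num_cols] <- lapply(df3[num_cols], type.convert, as.is = TRUE)

str(df3)
```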