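I have two different data frames, each containing different texts by month. What I want to do is combine the texts that share the same date into a single data frame.
Let me give an example to clarify. This is dataframe_A, where the third column (Article) contains some text for each date: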
             Date            Title           Article
1  1 January 2000 PRESS CONFERENCE Article_topic_A_1
2 1 February 2000 PRESS CONFERENCE Article_topic_A_2
3    1 March 2000 PRESS CONFERENCE Article_topic_A_3
This is dataframe_B, which contains different texts for the same dates:
             Date            Title           Article
1  1 January 2000 PRESS CONFERENCE Article_topic_B_1
2 1 February 2000 PRESS CONFERENCE Article_topic_B_2
3    1 March 2000 PRESS CONFERENCE Article_topic_B_3
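Now I want to merge the text of Article_topic_A_1 with the text of Article_topic_B_1, the text of Article_topic_A_2 with that of Article_topic_B_2, and so on. That is, for the same date (e.g. 1 January 2000), I want to combine the different articles (e.g. Article_topic_A_1 and Article_topic_B_1). The final data frame should look like this: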
             Date            Title  Article
1  1 January 2000 PRESS CONFERENCE Article1
2 1 February 2000 PRESS CONFERENCE Article2
3    1 March 2000 PRESS CONFERENCE Article3
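The third column would contain the merged texts, grouped by Date.
I tried using merge and subset, but could not get it to work.
Can you help me?
Thank you very much!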
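Here is a solution using merge, with the two texts separated by ", ". Note that merge sorts the result by the key columns by default, so the rows come back in alphabetical order of Date rather than chronological order.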
# Example data, matching the question
df_a <- data.frame(
  Date = c("1 January 2000", "1 February 2000", "1 March 2000"),
  Title = rep("PRESS CONFERENCE", 3),
  Article = c("Article_topic_A_1", "Article_topic_A_2", "Article_topic_A_3")
)
df_b <- data.frame(
  Date = c("1 January 2000", "1 February 2000", "1 March 2000"),
  Title = rep("PRESS CONFERENCE", 3),
  Article = c("Article_topic_B_1", "Article_topic_B_2", "Article_topic_B_3")
)
# Join on Date and Title; the two Article columns become Article.x and Article.y
df <- merge(df_a, df_b, by = c("Date", "Title"))
# Concatenate the two texts, then drop the intermediate columns
df$Article <- paste(df$Article.x, df$Article.y, sep = ", ")
df <- df[, !(names(df) %in% c("Article.x", "Article.y"))]
df
#>              Date            Title                              Article
#> 1 1 February 2000 PRESS CONFERENCE Article_topic_A_2, Article_topic_B_2
#> 2  1 January 2000 PRESS CONFERENCE Article_topic_A_1, Article_topic_B_1
#> 3    1 March 2000 PRESS CONFERENCE Article_topic_A_3, Article_topic_B_3
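If you want the rows in the same order as the original data frames (January, February, March), one possible follow-up is to reorder the merged result by the position of each Date in df_a. This is only a sketch: it assumes each Date appears once in df_a, and it rebuilds the example data so it can be run on its own.

```r
# Rebuild the example and the merged result so this block is self-contained
df_a <- data.frame(
  Date = c("1 January 2000", "1 February 2000", "1 March 2000"),
  Title = rep("PRESS CONFERENCE", 3),
  Article = c("Article_topic_A_1", "Article_topic_A_2", "Article_topic_A_3")
)
df_b <- data.frame(
  Date = c("1 January 2000", "1 February 2000", "1 March 2000"),
  Title = rep("PRESS CONFERENCE", 3),
  Article = c("Article_topic_B_1", "Article_topic_B_2", "Article_topic_B_3")
)
df <- merge(df_a, df_b, by = c("Date", "Title"))
df$Article <- paste(df$Article.x, df$Article.y, sep = ", ")
df <- df[, c("Date", "Title", "Article")]

# Assumption: Date values are unique in df_a, so match() gives one position per row.
# Sort the merged rows by where their Date occurs in df_a.
df <- df[order(match(df$Date, df_a$Date)), ]
rownames(df) <- NULL  # renumber the rows 1..n
df
```

Matching against df_a avoids parsing the month names with as.Date, which depends on the locale.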