Collapse every 4 consecutive text rows by the same author in R

Question

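I want to merge every four posts by the same author in a large data frame. If fewer than four posts remain for an author, those remaining posts should be merged together (for example, an author with 11 posts ends up with 2 merged posts of 4 posts each and 1 merged post of 3 posts).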
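To make the 11-post example concrete, the chunk sizes can be checked with a little arithmetic. This is only an illustrative sketch; `chunk` and `sizes` are hypothetical names, not part of the data.

```r
# Chunk index for each of 11 posts by one author, in posting order
chunk <- ceiling(seq_len(11) / 4)
chunk
# 1 1 1 1 2 2 2 2 3 3 3

# Number of posts that end up in each merged post
sizes <- as.vector(table(chunk))
sizes
# 4 4 3
```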

Here is a sample of my data frame:

name  text
bee   _ so we know that right           
bee   said so           
alma  hello,            
alma  Good to hear back from you.           
bee   I've currently written an application         
alma  I'm happy about it            
bee   It was not the last.          
alma  Will this ever stop.          
alma  Yet another line.         
alma  so            

I want to change it to this:

name  text
bee   _ so we know that right said so I've currently written an application It was not the last.
alma  hello, Good to hear back from you. I'm happy about it Will this ever stop.
alma  Yet another line. so

Here is the initial data frame:

df = structure(list(name = c("bee", "bee", "alma", "alma", "bee", "alma", "bee", "alma", "alma", "alma"), text = c( "_ so we know that right", "said so", "hello,", "Good to hear back from you.", "I've currently written an application", "I'm happy about it", "It was not the last.", "Will this ever stop.", "Yet another line.", "so")), .Names = c("name", "text"), row.names = c(NA, -10L), class = "data.frame")
r transform collapse
1 Answer

One option using dplyr could be:

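library(dplyr)

df %>%
 group_by(name) %>%
 mutate(ID = ceiling(row_number()/4)) %>%  # chunk index per author: rows 1-4 -> 1, 5-8 -> 2, ...
 group_by(name, ID) %>%
 summarise_all(paste, collapse = " ")      # join each chunk's texts with a single space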

  name     ID text                                                                         
  <chr> <dbl> <chr>                                                                        
1 alma      1 hello, Good to hear back from you. I'm happy about it Will this ever stop.   
2 alma      2 Yet another line. so                                                         
3 bee       1 _ so we know that right said so I've currently written an application It was…
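
Note that the grouped result is sorted by name, so alma comes before bee, whereas the desired output in the question lists bee first. If that order matters, a base R variant of the same chunking idea might look like the sketch below. It is not the answer's method, only an alternative under the assumption that authors should appear in order of their first post; `pos` and `out` are hypothetical names.

```r
# Sample data from the question
df <- data.frame(
  name = c("bee", "bee", "alma", "alma", "bee", "alma", "bee", "alma", "alma", "alma"),
  text = c("_ so we know that right", "said so", "hello,",
           "Good to hear back from you.", "I've currently written an application",
           "I'm happy about it", "It was not the last.", "Will this ever stop.",
           "Yet another line.", "so"),
  stringsAsFactors = FALSE
)

# Running post count within each author: 1, 2, 3, ...
pos <- ave(seq_len(nrow(df)), df$name, FUN = seq_along)

# Posts 1-4 of an author fall in chunk 1, posts 5-8 in chunk 2, and so on
df$ID <- ceiling(pos / 4)

# Paste the texts of each (name, ID) chunk together, keeping row order
out <- aggregate(text ~ name + ID, data = df, FUN = paste, collapse = " ")

# Assumption: order authors by first appearance, then by chunk
out <- out[order(match(out$name, unique(df$name)), out$ID), ]
rownames(out) <- NULL
out
```

Here `ave()` plays the role of `row_number()` within groups, and `aggregate()` replaces the `group_by()`/`summarise_all()` pair.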