Duplicating rows and string manipulation in R

I have a data frame in R that contains rows like the following:

c("LouDobbs", "gen_jackkeane") || RT @LouDobbs: #AmericaFirst- @gen_jackkeane: The Taliban for 9 months have told their fighters to kill as many people as you can, to includ…

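The line above shows one row of the two columns (I am using || as the separator here): column 1 holds multiple usernames and column 2 holds the tweet text. I want this row to be duplicated into 2 rows (one per user), with each user placed on its own in column 1. The same should happen for every row in the data frame that lists more than one user.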

df <- structure(list(user = list("Dandhy_Laksono", c("LouDobbs", "gen_jackkeane"
), "DeepStateExpose", "AndruewJamess", "jrossman12", "BiLLRaY2019", 
    "DeepStateExpose", "Dandhy_Laksono", "DeepStateExpose", "DeepStateExpose"), 
    full_text = c("RT @Dandhy_Laksono: Sebagian pendukung Jokowi ini mengalami bagaimana fitnah \"komunis dan PKI\" digunakan selama pemilu.\n\nSekarang mereka me…", 
    "RT @LouDobbs: #AmericaFirst- @gen_jackkeane: The Taliban for 9 months have told their fighters to kill as many people as you can, to includ…", 
    "RT @DeepStateExpose: The Only Reason The Deep State Cabal Has Stayed in Afghanistan For 18 Years Is To Protect Their Largest Poppy/Opium/Na…", 
    "RT @AndruewJamess: @BillOReilly @KamalaHarris is wrong. @realDonaldTrump has accomplished a lot. He set a record for  incoherent toilet twe…", 
    "RT @jrossman12: @SaraCarterDC Pakistan won't allow that as you already know. Your husband and the other U.S. troops have been forced to fig…", 
    "RT @BiLLRaY2019: JOKOWI TIDAK MEMBUNUH KPK..!\nMarkibong…\"Selamat tinggal Taliban di dalam KPK. Kalian kalah lagi, kalah lagi..!\"\n\n#JumatBer…", 
    "RT @DeepStateExpose: The Only Reason The Deep State Cabal Has Stayed in Afghanistan For 18 Years Is To Protect Their Largest Poppy/Opium/Na…", 
    "RT @Dandhy_Laksono: Sebagian pendukung Jokowi ini mengalami bagaimana fitnah \"komunis dan PKI\" digunakan selama pemilu.\n\nSekarang mereka me…", 
    "RT @DeepStateExpose: The Only Reason The Deep State Cabal Has Stayed in Afghanistan For 18 Years Is To Protect Their Largest Poppy/Opium/Na…", 
    "RT @DeepStateExpose: The Only Reason The Deep State Cabal Has Stayed in Afghanistan For 18 Years Is To Protect Their Largest Poppy/Opium/Na…"
    )), row.names = c(NA, 10L), class = "data.frame")
r string twitter
1 Answer

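We can use lengths to get the number of elements in each entry of the list column, then use rep to repeat the other values of each row that many times while unlist flattens the users. This should be fast enough, because lengths is fast.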

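# number of users in each row of the list column
l1 <- lengths(df$user)
# one output row per user; n records how many users the original row had
out <- data.frame(user = unlist(df$user), n = rep(l1, l1),
          text = rep(df$full_text, l1))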
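
To see the idea in isolation, here is a minimal sketch on a made-up data frame. The name toy, its usernames, and its tweet texts are invented for illustration and are not part of the question's data. It only assumes that the user column is a list column, as in the dput output above.

```r
# Toy data frame with a list column (all values are hypothetical)
toy <- data.frame(
  full_text = c("RT @alice: one", "RT @bob: two @carol", "RT @dave: three"),
  stringsAsFactors = FALSE
)
toy$user <- list("alice", c("bob", "carol"), "dave")

# Number of users stored in each row
n_users <- lengths(toy$user)

# Flatten the users and repeat each tweet once per user in its row
expanded <- data.frame(
  user = unlist(toy$user),
  full_text = rep(toy$full_text, n_users),
  stringsAsFactors = FALSE
)

print(expanded)
```

The row with two users ends up as two rows that share the same tweet text. Setting stringsAsFactors = FALSE keeps both columns as character vectors on R versions before 4.0, where data.frame converted strings to factors by default.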