I have a data frame like this:
df <- structure(list(doc_id = c("1", "2"), ner_words = c("John, Google",
"Amazon, Python, Canada")), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
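How can I produce something like table(df$ner_words), but counting each individual word within every row rather than the whole string? Example of the expected result: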
data.frame(text = c("John", "Google", "Amazon", "Python", "Canada"), frq = c(1,1,1,1,1))
Here is one option:
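library(dplyr)
library(tidyr)  # separate_rows() comes from tidyr, not dplyr

df %>%
  separate_rows(ner_words, sep = ", ") %>%  # one row per word
  group_by(ner_words) %>%
  mutate(freq = n())                        # per-word count; doc_id is kept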
# A tibble: 5 x 3
# Groups:   ner_words [5]
  doc_id ner_words  freq
  <chr>  <chr>     <int>
1 1      John          1
2 1      Google        1
3 2      Amazon        1
4 2      Python        1
5 2      Canada        1
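
The output above keeps doc_id and names its columns ner_words and freq, whereas the question asks for a two-column result named text and frq. If that exact shape is wanted, a base R alternative (not the approach used in the answer above) might look like the sketch below. The names words, counts and result are only illustrative, and the data frame is rebuilt as a plain data.frame so the snippet runs without any packages.

```r
# Same data as in the question, rebuilt as a plain data.frame
df <- data.frame(
  doc_id = c("1", "2"),
  ner_words = c("John, Google", "Amazon, Python, Canada"),
  stringsAsFactors = FALSE
)

# Split every row on ", " and flatten into a single character vector
words <- unlist(strsplit(df$ner_words, ", ", fixed = TRUE))

# Tabulate the words; giving factor() explicit levels keeps the order of
# first appearance (table() on a bare character vector sorts alphabetically)
counts <- table(factor(words, levels = unique(words)))

# Assemble the two-column result described in the question
result <- data.frame(
  text = names(counts),
  frq = as.vector(counts),
  stringsAsFactors = FALSE
)
result
```

Note that frq is an integer column here, while the example in the question builds it as numeric; wrap it in as.numeric() if the type matters.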