I am trying to find the most frequent word, the second most frequent word, and so on in the text cat below.
library(stringr)
cat <- c("AA","AA","AA","Ee","Dd","Ee","Bb","Cc","Cc","Cc")
The output I need:
most1 AA Cc
most2 Ee
most3 Bb Dd
Could you help me with this? Thanks!
You can use table like this:
sort(table(cat), decreasing = TRUE)
#cat
#AA Cc Ee Bb Dd
# 3  3  2  1  1
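As a small sketch of how the sorted table can be used, the words tied for the highest count can be pulled out by comparing each count with the maximum. The names words, counts and top_words are only illustrative and do not come from the question; words is used instead of cat so the base function cat() is not shadowed.

```r
# Illustrative data: same values as the question's `cat` vector
words <- c("AA", "AA", "AA", "Ee", "Dd", "Ee", "Bb", "Cc", "Cc", "Cc")

# Count each distinct word and sort the counts from high to low
counts <- sort(table(words), decreasing = TRUE)

# Keep the names of all words whose count equals the maximum (ties included)
top_words <- names(counts)[counts == max(counts)]
top_words
```

This only covers the first group; grouping every tied frequency needs something like the split() approach.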
And as a character matrix:
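x <- table(cat)
# group the words by their count, paste each group into one string,
# then reverse so the highest count comes first
x <- rev(do.call(rbind, lapply(split(names(x), x), paste, collapse = " ")))
# seq_along() also works when there is only one group
cbind(paste0("most", seq_along(x)), x)
#             x
#[1,] "most1" "AA Cc"
#[2,] "most2" "Ee"
#[3,] "most3" "Bb Dd"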
Variant:
x <- table(cat)
# keep each group of tied words as a list element instead of pasting it
x <- do.call(rbind, rev(lapply(split(names(x), x), list)))
as.data.frame(cbind(paste0("most", seq_along(x)), x))
#     V1     V2
#3 most1 AA, Cc
#2 most2     Ee
#1 most3 Bb, Dd
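If a reusable helper is preferred, the same idea can be wrapped in a function. This is only a sketch under assumptions: rank_by_frequency and its column names rank, words and count are hypothetical names introduced here, and it assigns ranks with match() against the sorted unique counts rather than with rev() as above.

```r
# Hypothetical helper: groups words by frequency, ties share one rank
rank_by_frequency <- function(words) {
  counts <- table(words)
  freq <- as.integer(counts)
  # distinct counts from highest to lowest, e.g. 3, 2, 1
  levels_desc <- sort(unique(freq), decreasing = TRUE)
  # rank 1 = highest count; words with equal counts get the same rank
  rnk <- match(freq, levels_desc)
  groups <- split(names(counts), rnk)
  data.frame(
    rank = paste0("most", seq_along(groups)),
    words = unname(vapply(groups, paste, character(1), collapse = " ")),
    count = levels_desc,
    stringsAsFactors = FALSE
  )
}

res <- rank_by_frequency(
  c("AA", "AA", "AA", "Ee", "Dd", "Ee", "Bb", "Cc", "Cc", "Cc")
)
res
```

Returning a plain data frame with character columns keeps the result easy to filter or merge later, at the cost of storing tied words as one pasted string.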