Count occurrences in a different data frame and get the row numbers as a string


Report = data.frame(keyid = c("ab~2000~to81~~91", "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45"))
key_id = c("cb~1000~tr50~xz~23~45", "ab~2000~to81~~91", "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45")
desc = c("low", "medium", "low", "high")
error = data.frame(key_id, desc)

The Report data frame contains only unique values, whereas the error data frame also contains duplicates.

I want to add a column Report$errorcount that gives the number of times each Report$keyid occurs in error$key_id.

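I also want another column, Report$errorline, that tells me at which rows of the error data frame each keyid occurs. The actual data frames contain thousands of rows.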

Desired result:

Report$errorcount = c(1,2,1)
Report$errorline = c("2","1,3","4")
r dataframe merge group-by sumifs
1 Answer

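This works, but you may want to check whether the errorline column produced by the code below fully meets your needs: it returns one row per match rather than a single comma-separated string per keyid.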

library(dplyr)

key_id = c("cb~1000~tr50~xz~23~45", "ab~2000~to81~~91", "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45")
desc = c("low", "medium", "low", "high")
error = data.frame(key_id, desc)
# Record each row's position; row_number(error$key_id) would rank the keys instead
error$errorline <- seq_len(nrow(error))

Report <- data.frame(
  keyid = c("ab~2000~to81~~91", "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45")
)

# Count how often each key occurs in error
tb <- as.data.frame(table(error$key_id), stringsAsFactors = FALSE)
colnames(tb) <- c("keyid", "errorcount")

Report <- left_join(Report, tb, by = "keyid")
# One output row per matching error row, so a repeated key appears more than once
Report <- left_join(Report, error, by = c("keyid" = "key_id"))
Report <- Report[-3] # drop desc column
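
If you need exactly one row per keyid, with the matching row numbers collapsed into a comma-separated string as in the desired result, a base R sketch along these lines may be closer to what you want. It assumes the example data from the question; `rows_by_key` and `hits` are names made up for illustration. A keyid that never occurs in error should end up with a count of 0 and an empty string.

```r
error <- data.frame(
  key_id = c("cb~1000~tr50~xz~23~45", "ab~2000~to81~~91",
             "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45"),
  desc = c("low", "medium", "low", "high"),
  stringsAsFactors = FALSE
)
Report <- data.frame(
  keyid = c("ab~2000~to81~~91", "cb~1000~tr50~xz~23~45", "yo~1999~~es~21~45"),
  stringsAsFactors = FALSE
)

# Group the row indices of error by key: a named list with one integer vector per key
rows_by_key <- split(seq_len(nrow(error)), error$key_id)

# Look up each Report key by name; a key absent from error yields a NULL element
hits <- rows_by_key[Report$keyid]

# Number of matching rows per key (0 for a NULL element)
Report$errorcount <- unname(lengths(hits))

# Matching row numbers joined with commas ("" for a NULL element)
Report$errorline <- unname(vapply(hits, paste, character(1), collapse = ","))

print(Report)
# errorcount is 1, 2, 1 and errorline is "2", "1,3", "4"
```

Because split() walks the error data frame only once, this should remain reasonably fast with thousands of rows, though I have only tried it on the small example.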