How to use dplyr::distinct based on the values of another variable

Question
library(tidyverse)

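Using the sample data below, I want to apply dplyr::distinct() conditionally. I want to eliminate duplicates in the ID column, but for each ID keep the row with the lowest "Rate" value and drop the others. For example, for "A1A1" the row with a Rate of 2 should be dropped, and for "CC33" the rows with Rate equal to 2 and 3 should be dropped. I also want to end up with all columns, the way dplyr::distinct() does with ".keep_all = TRUE".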

I tried the code below, but it drops the Subject column.

DF2 %>% group_by(ID) %>% summarise(Min_rate = min(Rate))

I also played around with group_by, mutate and if_else, but could not get it to work:

DF2%>%group_by(ID)%>%mutate(if_else(Rate=min(Rate),Rate,distinct(ID)

Any help would be appreciated.

Sample data:

ID<-c("A1A1","A22B","CC33","D33D","A1A1","4DD8","4DD8","CC33","CC33","56DK","F4G5","8Y0R")
Subject<-c("Subject1","Subject2","Subject3","Subject4","Subject5","Subject6","Subject7","Subject8","Subject9","Subject10","Subject11","Subject12")
Rate<-c(1,2,3,2,2,3,2,1,2,2,2,3)
DF2<-tibble(ID,Subject,Rate)  # tibble() replaces the deprecated data_frame()
r tidyverse
1 Answer

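I found a way to get what I wanted. First, I used dplyr's "group_by" and "mutate" together with "if_else" to flag the rows: within each ID group, the row holding the minimum Rate is coded 1 and every other row is coded 0.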

DF2 <- DF2 %>% group_by(ID) %>% mutate(Rate_Min = if_else(Rate == min(Rate), 1, 0))

Then I used dplyr's "filter" to drop the rows flagged 0.

DF2 <- DF2 %>% filter(Rate_Min == 1)
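
For what it's worth, the two steps above can probably be collapsed, and the distinct() route from the question also seems to work if the rows are sorted first. The block below is only a sketch, not the one right way to do it. It assumes a reasonably recent dplyr, rebuilds the sample data so it runs on its own, and the names res_filter and res_distinct are made up for illustration. Note that the filter version keeps every tied row when two rows in an ID share the minimum Rate, while the arrange + distinct version keeps exactly one row per ID.

```r
library(dplyr)

# Sample data from the question, rebuilt so the example is self-contained
DF2 <- tibble(
  ID      = c("A1A1", "A22B", "CC33", "D33D", "A1A1", "4DD8",
              "4DD8", "CC33", "CC33", "56DK", "F4G5", "8Y0R"),
  Subject = paste0("Subject", 1:12),
  Rate    = c(1, 2, 3, 2, 2, 3, 2, 1, 2, 2, 2, 3)
)

# Option 1: same idea as the answer, without the helper column.
# Keeps all rows whose Rate equals the group minimum (ties are kept).
res_filter <- DF2 %>%
  group_by(ID) %>%
  filter(Rate == min(Rate)) %>%
  ungroup()

# Option 2: sort so the lowest Rate comes first within each ID, then let
# distinct() keep the first row per ID along with all other columns.
res_distinct <- DF2 %>%
  arrange(ID, Rate) %>%
  distinct(ID, .keep_all = TRUE)
```

On this sample data there are no ties, so both results should contain the same eight rows, one per ID, with the Subject column intact.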