Remove rows when some values match and others do not [duplicate]

Question (Votes: 0, Answers: 1)

ID Amount Previous 
1  10     15
1  10     13
2  20     18
2  20     24
3  5      7
3  5      6

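I want to remove duplicate rows from the data frame above, where ID and Amount match but the values in the Previous column do not. When deciding which row to keep, I want the row with the higher value in the Previous column.

The result should look like this: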

ID Amount Previous 
1  10     15
2  20     24
3  5      7

r duplicates rows

1 Answer

0 votes

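One option is `distinct` on the columns 'ID' and 'Amount', applied after `arrange`-ing the dataset so that the highest 'Previous' comes first within each group. Specifying `.keep_all = TRUE` returns all the other columns for the first row of each distinct ID/Amount combination.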

library(dplyr)
df1 %>% 
    arrange(ID, Amount, desc(Previous)) %>%
    distinct(ID, Amount, .keep_all = TRUE)
#   ID Amount Previous
#1  1     10       15
#2  2     20       24
#3  3      5        7
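
The sort-then-keep-first idea can also be written as an explicit per-group selection. The following is a minimal sketch, not part of the original answer, assuming dplyr 1.0.0 or later (where `slice_max` is available); the name `res_slice` is made up for illustration, and the data frame is rebuilt inline so the block runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt here so the example is self-contained
df1 <- data.frame(
  ID       = c(1L, 1L, 2L, 2L, 3L, 3L),
  Amount   = c(10L, 10L, 20L, 20L, 5L, 5L),
  Previous = c(15L, 13L, 18L, 24L, 7L, 6L)
)

# For each ID/Amount group, keep the single row with the largest Previous.
# with_ties = FALSE guarantees one row per group even if the maximum repeats.
res_slice <- df1 %>%
  group_by(ID, Amount) %>%
  slice_max(Previous, n = 1, with_ties = FALSE) %>%
  ungroup()

print(res_slice)
```

Compared with `arrange` plus `distinct`, this states the intent ("largest Previous per group") directly, at the cost of returning a tibble rather than a plain data frame.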
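Or, in base R, apply `duplicated` to 'ID' and 'Amount' to create a logical vector and use it to subset the rows of the dataset (after ordering so that the highest 'Previous' comes first):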

df2 <- df1[with(df1, order(ID, Amount, -Previous)),]
df2[!duplicated(df2[c('ID', 'Amount')]),]
#  ID Amount Previous
#1  1     10       15
#4  2     20       24
#5  3      5        7
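
A base R variant that avoids sorting is to compare each 'Previous' value with its group maximum. This is only a sketch under the assumption that the maximum is unique within each ID/Amount group; if the maximum is tied, every tied row is kept. The name `res_ave` is made up for illustration, and the data is rebuilt inline so the block is self-contained.

```r
# Same data as in the question
df1 <- data.frame(
  ID       = c(1L, 1L, 2L, 2L, 3L, 3L),
  Amount   = c(10L, 10L, 20L, 20L, 5L, 5L),
  Previous = c(15L, 13L, 18L, 24L, 7L, 6L)
)

# ave() returns, for every row, the maximum of Previous within its ID/Amount group
group_max <- ave(df1$Previous, df1$ID, df1$Amount, FUN = max)

# Keep only the rows whose Previous equals the group maximum;
# the original row order and row names are preserved
res_ave <- df1[df1$Previous == group_max, ]

print(res_ave)
```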
Data

df1 <- structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L), Amount = c(10L, 
10L, 20L, 20L, 5L, 5L), Previous = c(15L, 13L, 18L, 24L, 7L, 
6L)), class = "data.frame", row.names = c(NA, -6L))
