Delete rows if column values are equal


I want to delete the rows whose values in the columns YEAR, POL, CTY, ID, and AMOUNT are identical to those of another row. See the output table below.

Table:

YEAR  POL    CTY   ID   AMOUNT   RAN     LEGAL
2017  30408  11    36   3500     RANGE1  L0015N20W23
2017  30408  11    36   3500     RANGE1  L00210N20W24
2017  30408  11    36   3500     RANGE1  L00310N20W25
2017  30409  11    36   3500     RANGE1  L0015N20W23
2017  30409  11    35   3500     RANGE2  NANANA
2017  30409  11    35   3500     RANGE3  NANANA
2017  30409  11    35   3500     RANGE3  NANANA

Output:

YEAR  POL    CTY   ID   AMOUNT   RAN     LEGAL
2017  30408  11    35   3500     RANGE1  L0015N20W23
Tags: r, rows
2 Answers
0 votes

You can try the following:

no_duplicate_cols <- c("YEAR", "POL", "CTY", "ID", "AMOUNT")

new_df <- df[!duplicated(df[, no_duplicate_cols]), ]

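The data frame new_df keeps the first row of each distinct (YEAR, POL, CTY, ID, AMOUNT) combination in df and drops the later repeats.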
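Note that `duplicated()` on its own flags only the second and later occurrences, so one row per combination survives. If the goal is to drop every row whose key combination occurs more than once, which is what the question's output suggests, a minimal base R sketch is to combine a forward and a backward `duplicated()` scan. The data frame below is retyped from the question's table, and the names `keys`, `in_dup_group`, and `unique_only` are illustrative assumptions, not part of the original answer.

```r
# Sample data retyped from the question's table
df <- data.frame(
  YEAR   = rep(2017L, 7),
  POL    = c(30408L, 30408L, 30408L, 30409L, 30409L, 30409L, 30409L),
  CTY    = rep(11L, 7),
  ID     = c(36L, 36L, 36L, 36L, 35L, 35L, 35L),
  AMOUNT = rep(3500L, 7),
  RAN    = c("RANGE1", "RANGE1", "RANGE1", "RANGE1",
             "RANGE2", "RANGE3", "RANGE3"),
  LEGAL  = c("L0015N20W23", "L00210N20W24", "L00310N20W25", "L0015N20W23",
             "NANANA", "NANANA", "NANANA")
)

no_duplicate_cols <- c("YEAR", "POL", "CTY", "ID", "AMOUNT")
keys <- df[, no_duplicate_cols]

# TRUE for every row whose key combination appears more than once:
# the forward scan flags the later repeats, the backward scan flags the first one
in_dup_group <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

# Keep only rows whose key combination is unique in df
unique_only <- df[!in_dup_group, ]
print(unique_only)
```

On this data only the row with POL 30409 and ID 36 remains, since every other key combination appears at least twice.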


0 votes

If I have understood the question correctly, I think you can try this:

library(dplyr)
df %>%
  group_by(YEAR, POL, CTY, ID, AMOUNT) %>%
  filter(n() == 1)

Output (note that the output given in the original question appears to contain a typo):

# A tibble: 1 x 7
# Groups:   YEAR, POL, CTY, ID, AMOUNT [1]
   YEAR   POL   CTY    ID AMOUNT    RAN       LEGAL
1  2017 30409    11    36   3500 RANGE1 L0015N20W23
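As the `# Groups:` header shows, the result is still a grouped tibble. A small variation, sketched here under the assumption that dplyr is installed and that an ungrouped result is wanted, ends the pipeline with `ungroup()` so that later verbs are not applied per group. The data frame is retyped from the question's table, and the name `result` is illustrative.

```r
library(dplyr)

# Sample data retyped from the question's table
df <- data.frame(
  YEAR   = rep(2017L, 7),
  POL    = c(30408L, 30408L, 30408L, 30409L, 30409L, 30409L, 30409L),
  CTY    = rep(11L, 7),
  ID     = c(36L, 36L, 36L, 36L, 35L, 35L, 35L),
  AMOUNT = rep(3500L, 7),
  RAN    = c("RANGE1", "RANGE1", "RANGE1", "RANGE1",
             "RANGE2", "RANGE3", "RANGE3"),
  LEGAL  = c("L0015N20W23", "L00210N20W24", "L00310N20W25", "L0015N20W23",
             "NANANA", "NANANA", "NANANA")
)

result <- df %>%
  group_by(YEAR, POL, CTY, ID, AMOUNT) %>%
  filter(n() == 1) %>%   # keep groups that contain exactly one row
  ungroup()              # drop the grouping so the result is a plain tibble

print(result)
```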

# Sample data (the output of dput(df), assigned to df so it can be pasted directly)
df <- structure(list(YEAR = c(2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 
2017L), POL = c(30408L, 30408L, 30408L, 30409L, 30409L, 30409L, 
30409L), CTY = c(11L, 11L, 11L, 11L, 11L, 11L, 11L), ID = c(36L, 
36L, 36L, 36L, 35L, 35L, 35L), AMOUNT = c(3500L, 3500L, 3500L, 
3500L, 3500L, 3500L, 3500L), RAN = structure(c(1L, 1L, 1L, 1L, 
2L, 3L, 3L), .Label = c("RANGE1", "RANGE2", "RANGE3"), class = "factor"), 
    LEGAL = structure(c(1L, 2L, 3L, 1L, 4L, 4L, 4L), .Label = c("L0015N20W23", 
    "L00210N20W24", "L00310N20W25", "NANANA"), class = "factor")), .Names = c("YEAR", 
"POL", "CTY", "ID", "AMOUNT", "RAN", "LEGAL"), class = "data.frame", row.names = c(NA, 
-7L))