In R, sum one column's values when the corresponding values in other columns are duplicated

Question (votes: 0, answers: 2)

My question is similar to this one: (Sum the duplicate rows of particular columns in dataframe), but that solution does not work for me, or I don't know how to adapt it.

I need to add the Number values together whenever both Reference and NODCCODE match, even if the matching rows are not adjacent within a Reference.

I have this:

structure(list(Reference = c("BBM101", "BBM102", 
                             "BBM102", "BBM102", "BBM103", "BBM103", 
                             "BBM104", "BBM105", "BBM105", "BBM105"), 
               NODCCODE = c("101","301", "201", "201", "201", "401", "401", "201", "102", "201"), 
               Number = c(2, 1, 3, 1, 3, 14, 3, 24, 2, 1)), 
          row.names = c(NA, 10L), class = "data.frame")
   Reference NODCCODE Number
1     BBM101      101      2
2     BBM102      301      1
3     BBM102      201      3
4     BBM102      201      1
5     BBM103      201      3
6     BBM103      401     14
7     BBM104      401      3
8     BBM105      201     24
9     BBM105      102      2
10    BBM105      201      1

I want this:

structure(list(Reference = c("BBM101", "BBM102", "BBM102", "BBM103", "BBM103", "BBM104", "BBM105", "BBM105"), 
               NODCCODE = c("101","301", "201", "201", "401", "401", "201", "102"), 
               Number = c(2, 1, 4, 3, 14, 3, 25, 2)), 
          row.names = c(NA, 8L), class = "data.frame")
  Reference NODCCODE Number
1    BBM101      101      2
2    BBM102      301      1
3    BBM102      201      4
4    BBM103      201      3
5    BBM103      401     14
6    BBM104      401      3
7    BBM105      201     25
8    BBM105      102      2

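Note that rows 3 and 4, which share the same Reference and NODCCODE, have been merged and their Number values added together. Rows 8 and 10 also share the same Reference and NODCCODE, so they are summed as well, even though a 102 row sits between the two 201 rows. I don't care whether the remaining rows end up at the start or the end of their Reference group.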

r merge duplicates add
2 Answers
0
votes

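I believe this is simple with the tidyverse. A Reference with only one row for a given NODCCODE keeps that single value as its sum, and entries with the same Reference and NODCCODE are summed: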

library(tidyverse)

struct <- structure(list(Reference = c("BBM101", "BBM102", 
                             "BBM102", "BBM102", "BBM103", "BBM103", 
                             "BBM104", "BBM105", "BBM105", "BBM105"), 
               NODCCODE = c("101","301", "201", "201", "201", "401", "401", "201", "102", "201"), 
               Number = c(2, 1, 3, 1, 3, 14, 3, 24, 2, 1)), 
          row.names = c(NA, 10L), class = "data.frame")


result <- struct %>% 
  group_by(Reference,NODCCODE) %>% 
  summarise(Number = sum(Number)) %>% 
  arrange(Reference) %>% 
  ungroup()

result
#> # A tibble: 8 x 3
#>   Reference NODCCODE Number
#>   <chr>     <chr>     <dbl>
#> 1 BBM101    101           2
#> 2 BBM102    201           4
#> 3 BBM102    301           1
#> 4 BBM103    201           3
#> 5 BBM103    401          14
#> 6 BBM104    401           3
#> 7 BBM105    102           2
#> 8 BBM105    201          25

Created on 2020-04-24 by the reprex package (v0.3.0)
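
The grouped sum above returns rows sorted by NODCCODE within each Reference, which the question says is acceptable. If the first-appearance order of the desired output matters, one possible base R variant is sketched below. It is an illustration only, not part of the original answer: the names `df`, `key`, `totals`, and `first` are made up here, and the `"|"` separator assumes that character never occurs in either column.

```r
# Same data as in the question, rebuilt with data.frame() for readability
df <- data.frame(
  Reference = c("BBM101", "BBM102", "BBM102", "BBM102", "BBM103", "BBM103",
                "BBM104", "BBM105", "BBM105", "BBM105"),
  NODCCODE  = c("101", "301", "201", "201", "201", "401", "401", "201", "102", "201"),
  Number    = c(2, 1, 3, 1, 3, 14, 3, 24, 2, 1),
  stringsAsFactors = FALSE
)

# One key per (Reference, NODCCODE) pair; "|" is assumed not to occur in the data
key <- paste(df$Reference, df$NODCCODE, sep = "|")

# Sum Number per key; reorder = FALSE keeps groups in the order first encountered
totals <- rowsum(df$Number, key, reorder = FALSE)

# Keep the first row of each pair, then replace Number with the group total
first <- !duplicated(key)
result <- df[first, ]
result$Number <- unname(totals[, 1])
rownames(result) <- NULL

result
```

Both `df[first, ]` and `rowsum(..., reorder = FALSE)` follow the order in which each pair first appears, so the totals line up with the retained rows without any extra sorting.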


0
投票

If you load the data.table package and convert the data.frame to a data.table (using setDT), you can do this.
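
The code for this answer is not included here. A minimal sketch of what a `setDT`-based grouped sum could look like is given below; it is an assumption about the intended approach, not the answerer's original code, and the variable names `dt` and `result` are chosen only for illustration.

```r
library(data.table)

# Same data as in the question
dt <- data.frame(
  Reference = c("BBM101", "BBM102", "BBM102", "BBM102", "BBM103", "BBM103",
                "BBM104", "BBM105", "BBM105", "BBM105"),
  NODCCODE  = c("101", "301", "201", "201", "201", "401", "401", "201", "102", "201"),
  Number    = c(2, 1, 3, 1, 3, 14, 3, 24, 2, 1),
  stringsAsFactors = FALSE
)

# Convert the data.frame to a data.table in place (by reference, no copy)
setDT(dt)

# Sum Number within each (Reference, NODCCODE) pair;
# `by` keeps groups in order of first appearance (`keyby` would sort them)
result <- dt[, .(Number = sum(Number)), by = .(Reference, NODCCODE)]

result
```

Because `by` does not sort, the rows come out in the same order as the desired output in the question.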
