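I'm writing an if statement that checks whether there are duplicates. If there are, I want execution to continue, but I also want to emit a message saying which values are duplicated. I tried message(), but I'm not sure how to include the values of the duplicated locations.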
if(anyDuplicated(regionGroups$location) > 0){
duplicateRegions <- regionGroups[, 'count' := .N, by = location][count > 1, .SD[1], by = location][[1]]
message("Location is not unique in the table regionGroups. There are length(duplicateRegions) duplicated locations, namely: duplicateRegions[1],duplicateRegions[2] ")
regionGroups <- regionGroups[!duplicated(regionGroups$location)]
}
(anyDuplicated(regionGroups$location) > 0)
[1] TRUE
duplicateRegions
[1] 55100 26080
The desired output is:
Location is not unique in the table regionGroups. There are 2 duplicated locations, namely: 55100, 26080
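The complication is that there may be more duplicated regions, and the numbers will change.
Question: how do I write the message() call so that the output lists each value of duplicateRegions?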
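Hi, does this work for you? I'm not sure why an if statement is needed if the code is to run regardless, since there doesn't seem to be any need for an else branch; perhaps it was left out for simplicity.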
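Another point to note is that duplicated() does not flag the first occurrence in a set of duplicates, so regionGroups[!duplicated(regionGroups$location),] will always keep the first row of each set and drop all the later ones. That may be fine for you, but consider it a warning.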
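To make the point about the first occurrence concrete, here is a small sketch on a made-up vector (the values in location are hypothetical and not taken from the question's table). It also shows one common way to flag every member of a duplicate set, by combining a forward and a backward scan:

```r
# Hypothetical example vector: 55100 appears three times, 26080 twice.
location <- c(55100, 26080, 55100, 12345, 26080, 55100)

# duplicated() marks only the repeats, never the first occurrence.
later_copies <- duplicated(location)
print(later_copies)
# FALSE FALSE  TRUE FALSE  TRUE  TRUE

# Scanning from both ends marks every element that belongs to a duplicate set.
all_copies <- duplicated(location) | duplicated(location, fromLast = TRUE)
print(all_copies)
# TRUE  TRUE  TRUE FALSE  TRUE  TRUE
```

Subsetting with !later_copies keeps one row per location, while subsetting with !all_copies would drop every location that occurs more than once.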
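Likewise, if you write namely: duplicateRegions[1],duplicateRegions[2] in the message call, you are assuming you know how many duplicates there will be, which isn't the case. You can use paste(as.character(regionGroups$location[dups]), collapse = ", ") to collapse the values into a single string, so you don't have to worry about that.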
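if (any(duplicated(regionGroups$location))) {
  # Row indices of every repeat (the first occurrence of each value is not flagged)
  dups <- which(duplicated(regionGroups$location))
  # Distinct duplicated values, so a location that occurs three times is listed once
  dup_regions <- unique(regionGroups$location[dups])
  message("Location is not unique in the table regionGroups. There are ",
          length(dup_regions), " duplicated locations, namely: ",
          paste(as.character(dup_regions), collapse = ", "))
  # Keep only the first row for each location
  regionGroups <- regionGroups[!duplicated(regionGroups$location), ]
}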
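If the same check is needed in more than one place, the snippet above could be wrapped in a small function. The following is only a sketch: the name drop_duplicates_with_message, its column argument, and the sample table are all hypothetical, and it assumes a plain data.frame rather than a data.table. It counts distinct duplicated values with unique(), so a location that occurs three times is reported once.

```r
# Hypothetical helper: reports the duplicated values of one column via message()
# and returns the table with only the first row of each value kept.
drop_duplicates_with_message <- function(tbl, column = "location") {
  values <- tbl[[column]]
  is_dup <- duplicated(values)          # TRUE for every repeat after the first
  if (any(is_dup)) {
    dup_values <- unique(values[is_dup])  # each duplicated value once
    message("Location is not unique in the table. There are ",
            length(dup_values), " duplicated locations, namely: ",
            paste(dup_values, collapse = ", "))
    tbl <- tbl[!is_dup, , drop = FALSE]
  }
  tbl
}

# Made-up sample data for illustration only.
regionGroups <- data.frame(
  location = c(55100, 26080, 55100, 12345, 26080, 55100),
  group    = c("a", "b", "c", "d", "e", "f")
)
cleaned <- drop_duplicates_with_message(regionGroups)
```

With this sample data the message names 55100 and 26080, and cleaned keeps the three rows for 55100, 26080 and 12345.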
Use the paste() function to combine fixed text with R objects.
message(paste("Location is not unique in the table regionGroups. There are",
              length(duplicateRegions), "duplicated locations, namely:",
              duplicateRegions[1], duplicateRegions[2]))
If you need to print an unknown number of duplicated regions, you can print them like this:
dupRegions <- c(101004,1038187,218477)
message("duplicated regions: ",
paste(as.character(dupRegions),collapse = " "))
...and the output:
> message("duplicated regions: ",
+ paste(as.character(dupRegions),collapse = " "))
duplicated regions: 101004 1038187 218477
>
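As an alternative to paste(), the same message can probably be built with sprintf() and toString(), both from base R. This is a sketch using the two example values from the question. toString() joins the elements with ", ", which matches the separator in the desired output:

```r
# Example values from the question; in practice this vector can have any length.
duplicateRegions <- c(55100, 26080)

# %d takes the integer count, %s takes the comma-separated list of values.
msg <- sprintf(
  "Location is not unique in the table regionGroups. There are %d duplicated locations, namely: %s",
  length(duplicateRegions),
  toString(duplicateRegions)
)
message(msg)
```

Building the string first and passing it to message() keeps the formatting separate from the reporting, which makes it easy to switch to warning() later if that is preferred.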