Binding the rows of multiple data frames that contain a column of class Interval from the lubridate package

Question  Votes: 1  Answers: 1

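I have a list in which every element is a data frame with the same column names, and one of those columns is of class Interval (from the lubridate package). I want to bind all the individual data frames in the list into a single data frame. Unfortunately, both rbind and bind_rows coerce the interval column to numeric, and I get the following warning.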

Warning message: 1: In bind_rows_(x, .id) : Vectorizing 'Interval' elements may not preserve their attributes

library(dplyr)
library(lubridate)
# Create a sample list of length 2 (the real list has about 18,000 elements)
test <- list(BGC119AP01 = structure(list(participant_code = "BGC119AP01", 
    interval_1 = new("Interval", .Data = 34128000, start = structure(1479427200, class = c("POSIXct", 
    "POSIXt"), tzone = "UTC"), tzone = "UTC")), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -1L), groups = structure(list(
    participant_code = "BGC119AP01", .rows = list(1L)), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE)), 
    BGC119AP02 = structure(list(participant_code = "BGC119AP02", 
        interval_1 = new("Interval", .Data = 34128000, start = structure(1479427200, class = c("POSIXct", 
        "POSIXt"), tzone = "UTC"), tzone = "UTC")), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"), row.names = c(NA, -1L), groups = structure(list(
        participant_code = "BGC119AP02", .rows = list(1L)), row.names = c(NA, 
    -1L), class = c("tbl_df", "tbl", "data.frame"), .drop = FALSE)))

# Attempt to bind the rows; both calls end with the warning above.
do.call(rbind, test)
do.call(bind_rows, test) 

Output. Note that interval_1 has been coerced to double and has lost its attributes:

# A tibble: 2 x 2
# Groups:   participant_code [2]
  participant_code interval_1
  <chr>                 <dbl>
1 BGC119AP01         34128000
2 BGC119AP02         34128000
Warning messages:
1: In bind_rows_(x, .id) :
  Vectorizing 'Interval' elements may not preserve their attributes
2: In bind_rows_(x, .id) :
  Vectorizing 'Interval' elements may not preserve their attributes

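Presumably this is because a column of class Interval is not an atomic vector. I know I could work around this by keeping the original start and end dates and creating the interval column after binding the rows, but I would like a solution that binds all the individual data frames in the list while keeping the Interval column intact, and that scales to 18,000 rows. Many thanks in advance.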
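For reference, the workaround mentioned above (bind first, build the interval afterwards) could look roughly like the sketch below. The list `pieces` and its `start` and `end` columns are hypothetical stand-ins for the real data, which is assumed to still have the raw timestamps available; only lubridate is loaded so that `rbind` is the base method.

```r
library(lubridate)

# Hypothetical input: one plain data frame per participant that keeps the raw
# start and end timestamps instead of an Interval column.
pieces <- list(
  BGC119AP01 = data.frame(
    participant_code = "BGC119AP01",
    start = as.POSIXct("2016-11-18", tz = "UTC"),
    end   = as.POSIXct("2017-12-18", tz = "UTC"),
    stringsAsFactors = FALSE
  ),
  BGC119AP02 = data.frame(
    participant_code = "BGC119AP02",
    start = as.POSIXct("2016-11-18", tz = "UTC"),
    end   = as.POSIXct("2017-12-18", tz = "UTC"),
    stringsAsFactors = FALSE
  )
)

# Character and POSIXct columns are bound without losing their class.
combined <- do.call(rbind, pieces)

# Build the Interval column once, on the combined data frame.
combined$interval_1 <- interval(combined$start, combined$end)

# Check the result: class of the new column and interval lengths in seconds.
class(combined$interval_1)
int_length(combined$interval_1)
```

Because `interval()` is vectorised, the last step is a single call regardless of how many rows were bound.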

r intervals lubridate rbind
1 Answer
0
votes

There is a hint in the warning you get when you run do.call(rbind, test) with dplyr loaded:

Warning messages:
1: In bind_rows_(x, .id) :
  Vectorizing 'Interval' elements may not preserve their attributes

dplyr::bind_rows() is actually being called instead of base::rbind(), and the Interval attributes are dropped. This seems to happen when the objects are tibbles (class tbl / tbl_df).

You can avoid this by calling rbind.data.frame() directly:

do.call(rbind.data.frame, test)
# A tibble: 2 x 2
# Groups:   participant_code [1]
  participant_code interval_1                    
* <chr>            <Interval>                    
1 BGC119AP01       2016-11-18 UTC--2017-12-18 UTC
2 BGC119AP02       2016-11-18 UTC--2017-12-18 UTC
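A quick way to confirm that the column survived is to check its class after binding rather than relying on the printed header. The sketch below is only an illustration under stated assumptions: it uses plain data frames instead of the grouped tibbles from the question, and `make_piece()` is a hypothetical helper, not part of any package. Whether current dplyr releases still drop the class is not something this thread establishes.

```r
library(lubridate)

# Hypothetical helper: a one-row plain data frame with an Interval column.
make_piece <- function(code) {
  piece <- data.frame(participant_code = code, stringsAsFactors = FALSE)
  piece$interval_1 <- interval(
    as.POSIXct("2016-11-18", tz = "UTC"),
    as.POSIXct("2017-12-18", tz = "UTC")
  )
  piece
}

pieces <- list(
  BGC119AP01 = make_piece("BGC119AP01"),
  BGC119AP02 = make_piece("BGC119AP02")
)

# Call the data.frame method explicitly, as in the answer above.
bound <- do.call(rbind.data.frame, pieces)

# Verify that the column is still an Interval and that its parts are intact.
is.interval(bound$interval_1)
int_start(bound$interval_1)
int_end(bound$interval_1)
```

If `is.interval()` returns FALSE on the real data, the column was coerced somewhere along the way and the intervals need to be rebuilt from their start and end dates.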