Filtering a database with dplyr and Unix timestamps

Question (votes: 2, answers: 2)

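I am working with a large database (through dplyrimpaladb) and dplyr. I need to filter by date, but all dates are stored as Unix timestamps in milliseconds. Locally I can convert them with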

time_t = as.Date(as.POSIXct(time_t/1000, origin = '1970-01-01', tz = 'UTC'))
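As a quick check of the local conversion, here is a minimal base R sketch. The three sample timestamps are made up for illustration and assume, as in the question, that the values are milliseconds since 1970-01-01 UTC.

```r
# Hypothetical sample values: milliseconds since the Unix epoch (UTC).
ts_ms <- c(1456444800000,  # 2016-02-26 00:00:00 UTC
           1456531199000,  # 2016-02-26 23:59:59 UTC
           1456531200000)  # 2016-02-27 00:00:00 UTC

# Divide by 1000 to get seconds, build a POSIXct in UTC, then drop the time part.
dates <- as.Date(as.POSIXct(ts_ms / 1000, origin = "1970-01-01", tz = "UTC"))

print(dates)
```

This only works on data already in R, which is why the conversion has to be expressed in SQL once the table lives in the database.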

That does not work when talking to the database, though. I need to translate the following into dplyr:

dau <- bb %>%
  tbl(sql("SELECT
             device_token_s,
             to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS dte
           FROM bb.sys_app_open
           WHERE 
             build_type_n = 1
             AND to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) >=  '2016-02-26'
           GROUP BY 
             device_token_s,
             to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
  collect()

The closest I have managed is:

dau.df <- bb %>% 
  tbl('sys_app_open') %>%
  select(device_token_s, 
         sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS dte')) %>%
  filter(build_type_n == 1, 
         sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) >=  '2016-02-26' ")) %>%
  #mutate(collector_date_t = sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))')) %>%
  group_by(device_token_s, sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))')) %>%
  collect()

but I get:

Error: All select() inputs must resolve to integer column positions.
The following do not:
*  sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) as dte")
r dplyr unix-timestamp
2 Answers

0 votes

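The error comes from the way you are using select. You are trying to pass a "literal" SQL expression through select, which only accepts column names; literal SQL should go through mutate instead.

This should work for you: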

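dau.df <- bb %>% 
  tbl('sys_app_open') %>%
  # keep only the columns needed downstream
  select(device_token_s, build_type_n, collector_date_t) %>%
  # literal SQL goes through mutate(), not select()
  mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
  # >= matches the cutoff used in the original SQL query
  filter(build_type_n == 1, dte >= '2016-02-26') %>%
  group_by(device_token_s, dte) %>%
  collect()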

I suggest using dbplyr::sql_render() to see the query that dplyr generates. For example, running

bb %>% 
  tbl('sys_app_open') %>%
  select(device_token_s, build_type_n, collector_date_t) %>%
  mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
  filter(build_type_n == 1, dte >= '2016-02-26') %>%
  dbplyr::sql_render()

produces the following query:

<SQL> SELECT *
FROM (SELECT "device_token_s", "build_type_n", "collector_date_t", to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS "dte"
FROM (SELECT "device_token_s", "build_type_n", "collector_date_t"
FROM "sys_app_open") "fgyyfaqrwp") "nmmczsfuid"
WHERE (("build_type_n" = 1) AND ("dte" >= '2016-02-26'))
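A possible variation, sketched under several assumptions: instead of converting every row to a date before filtering, the cutoff date can be converted once in R to a millisecond timestamp and compared against the raw column. The sketch below uses dbplyr's lazy_frame() with simulate_impala() so the SQL can be inspected without a live connection; the sample row is invented, the table only mimics the column names from the question, and it assumes the database interprets the timestamps as UTC. distinct() is used here on the assumption that the GROUP BY in the original SQL was only meant to deduplicate device/date pairs.

```r
library(dplyr)
library(dbplyr)

# Cutoff date converted once to milliseconds since the epoch (assumes UTC).
cutoff_ms <- as.numeric(as.POSIXct("2016-02-26", tz = "UTC")) * 1000

# Stand-in for bb %>% tbl('sys_app_open'); the row below is hypothetical.
sys_app_open <- lazy_frame(
  device_token_s   = "a",
  build_type_n     = 1L,
  collector_date_t = 1456444800000,
  con = simulate_impala()
)

query <- sys_app_open %>%
  # filter on the raw timestamp column; !! inlines the local value
  filter(build_type_n == 1, collector_date_t >= !!cutoff_ms) %>%
  # literal SQL for the date conversion, as in the answer above
  mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
  # one row per device and day (assumed intent of the original GROUP BY)
  distinct(device_token_s, dte)

# Inspect the generated SQL before running it against the real table.
sql_render(query)
```

Against the real database, the same pipeline would start from bb %>% tbl('sys_app_open') and end with collect(). Whether filtering on the raw column is actually faster depends on how the table is stored, so it is worth comparing both forms.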