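I am working with a large database (through dplyrimpaladb) and dplyr. I need to filter by date, but all dates are stored as Unix timestamps. I can convert them locally with
time_t = as.Date(as.POSIXct(time_t/1000, origin = '1970-01-01', tz = 'UTC'))
but this does not work when talking to the database. I need to translate the following into dplyr: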
dau <- bb %>%
tbl(sql("SELECT
device_token_s,
to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS dte
FROM bb.sys_app_open
WHERE
build_type_n = 1
AND to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) >= '2016-02-26'
GROUP BY
device_token_s,
to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
collect()
The closest I can get is:
dau.df <- bb %>%
tbl('sys_app_open') %>%
select(device_token_s,
sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS dte')) %>%
filter(build_type_n == 1,
sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) >= '2016-02-26' ")) %>%
#mutate(collector_date_t = sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))')) %>%
group_by(device_token_s, sql('to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))')) %>%
collect()
but I get
Error: All select() inputs must resolve to integer column positions. The following do not: * sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) as dte")
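The error comes from the way you are using the select function. You are trying to pass "literal" SQL through select, which only accepts column names or positions. Literal SQL should go through the mutate function instead.
This should work for you: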
dau.df <- bb %>%
tbl('sys_app_open') %>%
select(device_token_s, build_type_n, collector_date_t) %>%
mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
# >= matches the date condition in the SQL from the question
filter(build_type_n == 1, dte >= '2016-02-26') %>%
group_by(device_token_s, dte) %>%
collect()
I suggest using the function dbplyr::sql_render() to inspect the query that dplyr is generating. For example, running
bb %>%
tbl('sys_app_open') %>%
select(device_token_s, build_type_n, collector_date_t) %>%
mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
filter(build_type_n == 1, dte >= '2016-02-26') %>%
dbplyr::sql_render()
shows the generated query:
<SQL> SELECT *
FROM (SELECT "device_token_s", "build_type_n", "collector_date_t", to_date(from_unixtime(cast(collector_date_t/1000 as bigint))) AS "dte"
FROM (SELECT "device_token_s", "build_type_n", "collector_date_t"
FROM "sys_app_open") "fgyyfaqrwp") "nmmczsfuid"
WHERE (("build_type_n" = 1) AND ("dte" >= '2016-02-26'))
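If you want to experiment with the translation without hitting the real database, here is a minimal sketch using dbplyr's simulated Impala backend. The table fake_tbl and its single sample row are hypothetical stand-ins for sys_app_open, and the exact SQL text you get back may differ between dbplyr versions, so treat it as a way to preview the query rather than a guarantee of what your connection will send.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for sys_app_open: a lazy table backed by a
# simulated Impala connection, so no real database is needed.
fake_tbl <- lazy_frame(
  device_token_s = "a",
  build_type_n = 1L,
  collector_date_t = 1456444800000,
  con = simulate_impala()
)

# Same pattern as above: literal SQL goes through mutate(), not select().
query <- fake_tbl %>%
  mutate(dte = sql("to_date(from_unixtime(cast(collector_date_t/1000 as bigint)))")) %>%
  filter(build_type_n == 1, dte >= "2016-02-26")

# Print the SQL that dplyr would send; nothing is executed.
sql_render(query)
```

Because the table is lazy, nothing runs until collect() is called, so you can adjust the mutate and filter steps and re-render as often as you like.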