I have a set of paired values stored in the data frame parameters:
parameters <- data.frame(
variant_id = c(1, 2, 3, 4, 5),
start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
"2019-04-19"))
> parameters
variant_id start_date
1 1 2019-07-01
2 2 2019-09-05
3 3 2019-05-21
4 4 2019-09-06
5 5 2019-04-19
I want to use each combination of variant_id and start_date as dynamic parameters in this SQL query, which is executed through RPostgres.
library(RPostgres)
library(tidyverse)
query <- "select sum(o.quantity)
from orders o
where o.date >= << start_date >>
and o.variant_id = << variant_id >> "
df <- dbGetQuery(db, query)
That would give me queries like these:
query_1 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-07-01'
and o.variant_id = 1 "
result_1 <- dbGetQuery(db, query_1)
> result_1
sum
1 100
query_2 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-09-05'
and o.variant_id = 2 "
result_2 <- dbGetQuery(db, query_2)
> result_2
sum
1 120
query_3 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-05-21'
and o.variant_id = 3 "
result_3 <- dbGetQuery(db, query_3)
> result_3
sum
1 140
...and so on.
I then want to collect each result in a new data frame, results:
results <- data.frame(
variant_id = c(1, 2, 3, 4, 5),
quantity = c(100, 120, 140, 150, 160)
)
> results
variant_id quantity
1 1 100
2 2 120
3 3 140
4 4 150
5 5 160
How can I solve this with RPostgres and dplyr, without using a loop?
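Assume parameters is defined as in the Note at the end. It is the same as in the question, except that we added stringsAsFactors = FALSE.
For testing we use c below, which simply returns the interpolated query strings, but you can replace c with a call to the database.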
library(gsubfn)
query <- "select sum(o.quantity)
from orders o
where o.date >= '`start_date`'
and o.variant_id = `variant_id` "
nr <- nrow(parameters)
unname(unlist(sapply(split(parameters, 1:nr), with, fn$c(query))))
giving:
[1] "select sum(o.quantity)\n from orders o\n where o.date >= '2019-07-01'\n and o.variant_id = 1 "
[2] "select sum(o.quantity)\n from orders o\n where o.date >= '2019-09-05'\n and o.variant_id = 2 "
[3] "select sum(o.quantity)\n from orders o\n where o.date >= '2019-05-21'\n and o.variant_id = 3 "
[4] "select sum(o.quantity)\n from orders o\n where o.date >= '2019-09-06'\n and o.variant_id = 4 "
[5] "select sum(o.quantity)\n from orders o\n where o.date >= '2019-04-19'\n and o.variant_id = 5 "
Or, to test with sqldf and some sample values for orders:
library(sqldf)
orders <- data.frame(date = "2019-07-02", variant_id = 1:3, quantity = 1:3)
unname(unlist(lapply(split(parameters, 1:nr), with, fn$sqldf(query))))
## [1] 1 NA 3 NA NA
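The question also asks for the sums to end up in a results data frame next to variant_id. As a rough sketch of one way to do that, the following uses base R's sprintf instead of gsubfn (a swapped-in technique, not the approach above). The names template, collect_results and run are hypothetical and introduced only for illustration, and the sketch assumes the parameter values are trusted, since they are pasted directly into the SQL text.

```r
# Parameters as in the Note (character dates, not factors).
parameters <- data.frame(
  variant_id = c(1, 2, 3, 4, 5),
  start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                 "2019-04-19"), stringsAsFactors = FALSE)

# sprintf template: %s receives start_date, %d receives variant_id.
template <- "select sum(o.quantity)
from orders o
where o.date >= '%s'
and o.variant_id = %d"

# sprintf is vectorised, so this yields one query string per row.
queries <- sprintf(template, parameters$start_date,
                   as.integer(parameters$variant_id))

# Hypothetical helper: `run` maps one query string to a single number.
# Against a real database it could be, for example,
#   function(q) as.numeric(dbGetQuery(db, q)[[1]])
# with `db` an open connection as in the question (an assumption here).
collect_results <- function(params, queries, run) {
  data.frame(variant_id = params$variant_id,
             quantity = vapply(queries, run, numeric(1), USE.NAMES = FALSE))
}

# Stand-in runner for a dry run: returns the query length, not a real sum.
results <- collect_results(parameters, queries,
                           function(q) as.numeric(nchar(q)))
```

With a real connection, only the run argument would need to change; the dry-run runner above exists solely so the sketch can be executed without a database.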
Note
parameters <- data.frame(
variant_id = c(1, 2, 3, 4, 5),
start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
"2019-04-19"), stringsAsFactors = FALSE)