Parameterized queries in RPostgres, appending the results to a new data frame

Question (votes: 1, answers: 1)

I have a set of paired values stored in the data frame parameters:

parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"))

> parameters
  variant_id start_date
1          1 2019-07-01
2          2 2019-09-05
3          3 2019-05-21
4          4 2019-09-06
5          5 2019-04-19

I want to use each combination of variant_id and start_date as dynamic parameters in this SQL query, which is executed through RPostgres.

library(RPostgres)
library(tidyverse)

query <- "select sum(o.quantity)
from orders o
where o.date >= << start_date >>
and o.variant_id = << variant_id >> "

df <- dbGetQuery(db, query)

That would give me queries like these:

query_1 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-07-01'
and o.variant_id = 1 "

result_1 <- dbGetQuery(db, query_1)
 > result_1
     sum
   1 100

query_2 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-09-05'
and o.variant_id = 2 "

result_2 <- dbGetQuery(db, query_2)
 > result_2
     sum
   1 120


query_3 <- "select sum(o.quantity)
from orders o
where o.date >= '2019-05-21'
and o.variant_id = 3 "

result_3 <- dbGetQuery(db, query_3)
 > result_3
     sum
   1 140

...and so on.

I then want to append each result to a new data frame, results:

results <- data.frame(
              variant_id = c(1, 2, 3, 4, 5),
                quantity = c(100, 120, 140, 150, 160)
           )

> results
  variant_id quantity
1          1      100
2          2      120
3          3      140
4          4      150
5          5      160

How can I solve this with RPostgres and dplyr, without writing a loop?

r tidyverse rpostgresql
1 answer
0 votes

Assume parameters is defined as in the Note at the end. It is the same as in the question, except that we have added stringsAsFactors = FALSE.

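To test this out we use c below (as fn$c, so that the backquoted names in the query string are replaced by their values), but you can replace c with your call to the database.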

library(gsubfn)

query <- "select sum(o.quantity)
  from orders o
  where o.date >= '`start_date`'
  and o.variant_id = `variant_id` "

nr <- nrow(parameters)
unname(unlist(sapply(split(parameters, 1:nr), with, fn$c(query))))

giving:

[1] "select sum(o.quantity)\n      from orders o\n      where o.date >= '2019-07-01'\n      and o.variant_id = 1 "
[2] "select sum(o.quantity)\n      from orders o\n      where o.date >= '2019-09-05'\n      and o.variant_id = 2 "
[3] "select sum(o.quantity)\n      from orders o\n      where o.date >= '2019-05-21'\n      and o.variant_id = 3 "
[4] "select sum(o.quantity)\n      from orders o\n      where o.date >= '2019-09-06'\n      and o.variant_id = 4 "
[5] "select sum(o.quantity)\n      from orders o\n      where o.date >= '2019-04-19'\n      and o.variant_id = 5 "
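
As a cross-check, the same query strings can be produced with base R alone. This is a minimal sketch, not part of the original answer: it relies on sprintf being vectorized over its arguments, and the name template is introduced here for illustration only. Pasting values into SQL text is reasonable for trusted values like these; for untrusted input, bound parameters are the safer route.

```r
parameters <- data.frame(
  variant_id = c(1, 2, 3, 4, 5),
  start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                 "2019-04-19"),
  stringsAsFactors = FALSE)

# Illustrative template: %s receives the date string, %d the variant id
template <- "select sum(o.quantity)
  from orders o
  where o.date >= '%s'
  and o.variant_id = %d "

# sprintf recycles over both vectors, so this yields one query per row
queries <- sprintf(template, parameters$start_date,
                   as.integer(parameters$variant_id))
```

queries is then a character vector with one SQL statement per row of parameters.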

or, to test it out with sqldf and a small orders data frame:

library(sqldf)

orders <- data.frame(date = "2019-07-02", variant_id = 1:3, quantity = 1:3)

unname(unlist(lapply(split(parameters, 1:nr), with, fn$sqldf(query))))
## [1]  1 NA  3 NA NA
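
Against a real RPostgres connection, one alternative to substituting values into the SQL text is to bind them as query parameters. The following is only a sketch under stated assumptions: db is assumed to be an open connection created with dbConnect(RPostgres::Postgres(), ...), the date column is assumed to be of type date and variant_id an integer, and query_one and collect_quantities are hypothetical helper names that do not come from the original answer. It also assembles the results data frame requested in the question.

```r
# $1 and $2 are PostgreSQL positional placeholders; the casts state the
# assumed column types. The sum is cast to float8 so that it arrives in R
# as an ordinary double whatever integer type the column has.
sql <- "select sum(o.quantity)::float8 as quantity
  from orders o
  where o.date >= $1::date
  and o.variant_id = $2::int"

# Hypothetical helper: runs the query once for a single parameter pair.
# `db` is assumed to be a DBI connection opened elsewhere.
query_one <- function(db, variant_id, start_date) {
  DBI::dbGetQuery(db, sql, params = list(start_date, variant_id))$quantity
}

# Hypothetical helper: applies `run_query` to each row of `parameters`
# and returns a data frame shaped like `results` in the question.
collect_quantities <- function(parameters, run_query) {
  quantity <- mapply(run_query, parameters$variant_id, parameters$start_date)
  data.frame(variant_id = parameters$variant_id, quantity = unname(quantity))
}

# Usage against a live connection (not run here):
# results <- collect_quantities(parameters,
#                               function(id, date) query_one(db, id, date))
```

Passing the query function as an argument keeps the row-wise logic separate from the database call, so the same helper can be tried with sqldf or a stub before pointing it at the server.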

Note:

parameters <- data.frame(
   variant_id = c(1, 2, 3, 4, 5),
   start_date = c("2019-07-01", "2019-09-05", "2019-05-21", "2019-09-06",
                  "2019-04-19"), stringsAsFactors = FALSE)