Postgres common table expression queries with Ruby on Rails


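I'm trying to find the best way to run a Postgres query with common table expressions in a Rails app, given that, as far as I understand, ActiveRecord does not support CTEs.

I have a table called user_activity_transitions that contains a series of records of user activities being started and stopped (each row represents a change of state, e.g. started or stopped).

One user_activity_id can have many started-stopped pairs, each pair spanning two separate rows. It may also have a lone "started" row if the activity is currently in progress and has not been stopped yet. The sort_key starts at 0 for the first state and increases by 10 with each state change.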

id      to_state     sort_key     user_activity_id    created_at
1       started      0            18                  2014-11-15 16:56:00
2       stopped      10           18                  2014-11-15 16:57:00
3       started      20           18                  2014-11-15 16:58:00
4       stopped      30           18                  2014-11-15 16:59:00
5       started      40           18                  2014-11-15 17:00:00

What I want is the following output, with the started-stopped pairs grouped together so that I can calculate durations and so on.

user_activity_id     started_created_at      stopped_created_at
18                   2014-11-15 16:56:00     2014-11-15 16:57:00
18                   2014-11-15 16:58:00     2014-11-15 16:59:00
18                   2014-11-15 17:00:00     null
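To make the target shape concrete, here is a minimal plain-Ruby sketch of the same pairing done in memory. It is only an illustration, not the approach asked about: the rows array is hypothetical sample data copied from the table above, and the code assumes that started and stopped rows strictly alternate within each activity.

```ruby
# Hypothetical in-memory copy of the sample rows from the question.
rows = [
  { id: 1, to_state: "started", sort_key: 0,  user_activity_id: 18, created_at: "2014-11-15 16:56:00" },
  { id: 2, to_state: "stopped", sort_key: 10, user_activity_id: 18, created_at: "2014-11-15 16:57:00" },
  { id: 3, to_state: "started", sort_key: 20, user_activity_id: 18, created_at: "2014-11-15 16:58:00" },
  { id: 4, to_state: "stopped", sort_key: 30, user_activity_id: 18, created_at: "2014-11-15 16:59:00" },
  { id: 5, to_state: "started", sort_key: 40, user_activity_id: 18, created_at: "2014-11-15 17:00:00" }
]

# Group by activity, order by sort_key, then take the rows two at a time.
# Each slice is a [started, stopped] pair, or [started] alone when the
# activity is still running (stopped is then nil).
pairs = rows
  .group_by { |row| row[:user_activity_id] }
  .flat_map do |activity_id, transitions|
    transitions.sort_by { |row| row[:sort_key] }.each_slice(2).map do |started, stopped|
      {
        user_activity_id: activity_id,
        started_created_at: started[:created_at],
        stopped_created_at: stopped && stopped[:created_at]
      }
    end
  end

pairs.each { |pair| puts pair.inspect }
```

This loads every row into Ruby, so it is probably only reasonable for small result sets; doing the pairing in SQL keeps the work in the database.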

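The way the table is implemented makes this query harder to run, but it is much more flexible for future changes (e.g. new intermediate states), so it is not going to be revised.

My Postgres query (and the associated code in Rails):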

query = <<-SQL
    with started as (
    select 
        id,
        sort_key,
        user_activity_id,
        created_at as started_created_at
    from
        user_activity_transitions
    where  
        sort_key % 4 = 0
    ), stopped as (
    select 
        id,
        sort_key-10 as sort_key2,
        user_activity_id,
        created_at as stopped_created_at
    from
    user_activity_transitions
    where
        sort_key % 4 = 2
    )
    select
        started.user_activity_id AS user_activity_id,
        started.started_created_at AS started_created_at,
        stopped.stopped_created_at AS stopped_created_at
    FROM
        started
    left join stopped on stopped.sort_key2 = started.sort_key
    and stopped.user_activity_id = started.user_activity_id
SQL

results = ActiveRecord::Base.connection.execute(query)

What it does is "trick" SQL into joining two consecutive rows, based on a modulus check on the sort key.
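The arithmetic behind that modulus check can be verified in plain Ruby. This is a small sketch using only the sort keys from the sample data; the variable names are illustrative and do not appear in the original query.

```ruby
sort_keys = [0, 10, 20, 30, 40]

# Multiples of 10 alternate between remainder 0 and remainder 2 modulo 4,
# which is what separates the "started" rows from the "stopped" rows.
remainders = sort_keys.map { |key| key % 4 }

started_keys = sort_keys.select { |key| key % 4 == 0 }
stopped_keys = sort_keys.select { |key| key % 4 == 2 }

# Shifting each stopped key down by 10 (sort_key2 in the query) yields the
# key of the started row it closes, which is the join condition.
join_keys = stopped_keys.map { |key| key - 10 }

# Started keys with no matching stopped row: activities still in progress.
unmatched = started_keys - join_keys

puts remainders.inspect
puts unmatched.inspect
```

Note that the check depends on sort_key growing by exactly 10 and on the states strictly alternating, so it would likely need rethinking if an intermediate state were ever added.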

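The query works fine. But using this raw AR call bugs me, especially because what connection.execute returns is pretty messy. I basically have to loop through the results and put them into the right hash.

Two questions:

  1. Is there a way to get rid of the CTE and run the same query using Rails magic?
  2. If not, is there a better way to get the results I want in a nice hash?

Bear in mind that I'm quite new to Rails and not a query expert, so there may be an obvious improvement...

Thanks a lot!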

ruby-on-rails postgresql ruby-on-rails-4 common-table-expression
3 Answers
5
votes

Although Rails does not support CTEs directly, you can emulate a single CTE and still take advantage of ActiveRecord. Pass a subquery to .from in place of the CTE.

Thing
  .from(
    # Using a subquery in place of a single CTE
    Thing
      .select(
        '*',
        %{row_number() over(
            partition by
              this, that
            order by
              created_at desc
          ) as rank
        }
      ),
    :things
  )
  .where(rank: 1)

This is not identical, but it is equivalent to...

with ranked_things as (
  select
    *,
    row_number() over(
      partition by
        this, that
      order by
        created_at desc
    ) as rank
  from things
)
select *
from ranked_things
where rank = 1

3
votes

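"I'm trying to find the best way to run a Postgres query with common table expressions in a Rails app, given that, as far as I understand, ActiveRecord does not support CTEs."

As far as I know, ActiveRecord doesn't support CTEs. Arel, which AR uses under the hood, does support them, but they aren't exposed in AR's interface.

"Is there a way to get rid of the CTE and run the same query using Rails magic?"

Not really. You could write it with AR's API, but you would just be writing the same SQL split across a few method calls.

"If not, is there a better way to get the results I want in a nice hash?"

I tried running the query and got the following, which seems nice enough to me. Are you getting a different result?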

[
  {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 16:56:00", "stopped_created_at"=>"2014-11-15 16:57:00"},
  {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 16:58:00", "stopped_created_at"=>"2014-11-15 16:59:00"},
  {"user_activity_id"=>"18", "started_created_at"=>"2014-11-15 17:00:00", "stopped_created_at"=>nil}
]
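Rows in that shape carry every value as a string, so computing durations needs a small conversion step. The following is a hedged sketch in plain Ruby: the rows array is copied from the output above, the key names in the resulting hashes are illustrative, and treating the timestamps as UTC is an assumption rather than something the question states.

```ruby
require "time"

# Rows as returned above: string keys and string values.
rows = [
  { "user_activity_id" => "18", "started_created_at" => "2014-11-15 16:56:00", "stopped_created_at" => "2014-11-15 16:57:00" },
  { "user_activity_id" => "18", "started_created_at" => "2014-11-15 16:58:00", "stopped_created_at" => "2014-11-15 16:59:00" },
  { "user_activity_id" => "18", "started_created_at" => "2014-11-15 17:00:00", "stopped_created_at" => nil }
]

# Assumption: timestamps are stored in UTC, so a zone suffix is appended
# before parsing. Returns nil for a missing (nil) timestamp.
parse_utc = ->(value) { value && Time.parse("#{value} UTC") }

typed = rows.map do |row|
  started = parse_utc.call(row["started_created_at"])
  stopped = parse_utc.call(row["stopped_created_at"])
  {
    user_activity_id: Integer(row["user_activity_id"]),
    started_at: started,
    stopped_at: stopped,
    # nil while the activity is still running
    duration_seconds: stopped && (stopped - started)
  }
end

typed.each { |entry| puts entry.inspect }
```

Subtracting two Time objects gives a Float number of seconds, so each completed pair in the sample data has a duration of 60.0.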

I assume you have a model called UserActivityTransition that you use for manipulating the data. You can also use that model to get the results.

results = UserActivityTransition.find_by_sql(query)
results.size # => 3
results.first.started_created_at # => 2014-11-15 16:56:00 UTC

Note that these "virtual" attributes will not be visible when inspecting the results, but they are there.


0
votes

This is now possible with the .with method, which ActiveRecord has provided since Rails 7.1. The AR query looks like this:

results =
  UserActivityTransition
    .with(started: UserActivityTransition.select('id, sort_key, user_activity_id, created_at AS started_created_at').where('sort_key % 4 = 0'))
    .with(stopped: UserActivityTransition.select('id, sort_key-10 as sort_key2, user_activity_id, created_at AS stopped_created_at').where('sort_key % 4 = 2'))
    .select('started.user_activity_id AS user_activity_id, started.started_created_at AS started_created_at, stopped.stopped_created_at AS stopped_created_at')
    .from('started')
    .joins('LEFT JOIN stopped ON stopped.sort_key2 = started.sort_key AND stopped.user_activity_id = started.user_activity_id')

Let's verify that the generated SQL is correct:

puts results.to_sql

Output:

WITH "started" AS (SELECT id, sort_key, user_activity_id, created_at AS started_created_at FROM "user_activity_transitions" WHERE (sort_key % 4 = 0)), "stopped" AS (SELECT id, sort_key-10 as sort_key2, user_activity_id, created_at AS stopped_created_at FROM "user_activity_transitions" WHERE (sort_key % 4 = 2)) SELECT started.user_activity_id AS user_activity_id, started.started_created_at AS started_created_at, stopped.stopped_created_at AS stopped_created_at FROM started LEFT JOIN stopped ON stopped.sort_key2 = started.sort_key AND stopped.user_activity_id = started.user_activity_id

Now the data can be processed in whatever way you need:

results.each do |r|
  puts "#{r.user_activity_id} | #{r.started_created_at} | #{r.stopped_created_at}"
end

Output:

18 | 2014-11-15 16:56:00 | 2014-11-15 16:57:00
18 | 2014-11-15 16:58:00 | 2014-11-15 16:59:00
18 | 2014-11-15 17:00:00 | 

The query in the question is fairly long, so here is a more readable usage example:

Book
  .with(books_with_reviews: Book.where("reviews_count > ?", 0))
  .with(books_with_ratings: Book.where("ratings_count > ?", 0))
  .joins("JOIN books_with_reviews ON books_with_reviews.id = books.id")

# WITH books_with_reviews AS (
#   SELECT * FROM books WHERE (reviews_count > 0)
# ), books_with_ratings AS (
#   SELECT * FROM books WHERE (ratings_count > 0)
# )
# SELECT * FROM books JOIN books_with_reviews ON books_with_reviews.id = books.id