Does Snowflake support correlated subqueries in the select clause?


I cannot determine whether Snowflake supports correlated subqueries in the select clause, because I have run into conflicting evidence.

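The Snowflake querying-subqueries documentation says that correlated subqueries are supported in the where clause, which seems to imply that they are not supported in the select clause. Community discussions such as here and here seem to confirm this.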

However...

Using the Chinook database as a model, this SQL statement with a correlated subquery in the select clause works.

-- Three levels: artist, album, track
-- Correlated sub-query: level 2 -> level 3
-- Works: ✓
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track
                     where track.albumid = album.albumid) y
              ) tracks
              from album
             where album.artistid = 1) x
      ) albums
      from
        artist
     where artistid = 1) x;

So does this one...

-- Three levels: artist, album, track
-- Correlated sub-query: level 1 -> level 2
-- Works: ✓
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track
                     where track.albumid = 1) y
              ) tracks
              from album
             where album.artistid = artist.artistid) x
      ) albums
      from
        artist
     where artistid = 1) x;

However, this one does not...

-- Three levels: artist, album, track
-- Correlated sub-query: level 1 -> level 2 -> level 3
-- Doesn't work: ✗
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track
                     where track.albumid = album.albumid) y
              ) tracks
              from album
             where album.artistid = artist.artistid) x
      ) albums
      from
        artist
     where artistid = 1) x;
Error: SQL compilation error:
Unsupported subquery type cannot be evaluated

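It is as if correlated subqueries are half-supported in the select clause: they work as long as the references span no more than two levels. Yet even this limited support contradicts the documentation and the conventional wisdom on the community forums and StackOverflow.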

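I tried correlated subqueries in the select clause with two levels of nesting and with three levels of nesting. I expected none of them to work, though I would also have accepted all of them working. What I did not expect was that some of them (two levels) would work while others (three levels) would not.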

Addendum

You have multiple subqueries with the same alias. I would try using distinct names and see whether the correlation resolves any better.

This raises the same error as above:

-- Three levels: artist, album, track
-- Correlated sub-query: level 1 -> level 2 -> level 3
-- Doesn't work: ✗
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track t3
                     where t3.albumid = t2.albumid) y
              ) tracks
              from album t2
             where t2.artistid = t1.artistid) x
      ) albums
      from
        artist t1
     where artistid = 1);

Can you post EXPLAIN USING TABULAR for both cases?

-- Three levels: artist, album, track
-- Correlated sub-query: level 1 -> level 2
-- Works: ✓
explain using tabular
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track t3
                     where t3.albumid = 1) y
              ) tracks
              from album t2
             where t2.artistid = t1.artistid) x
      ) albums
      from
        artist t1
     where artistid = 1) x;
step | id | parent | operation     | objects               | alias | expressions                    | partitionstotal | partitionsassigned | bytesassigned
-----+----+--------+---------------+-----------------------+-------+--------------------------------+-----------------+--------------------+--------------
     |    |        | GlobalStats   |                       |       |                                | 3               | 3                  | 116224       
1    | 0  |        | Result        |                       |       | ARRAY_AGG(OBJECT_CONSTRUCT(... |                 |                    |              
1    | 1  | 0      | Aggregate     |                       |       | aggExprs: [ARRAY_AGG(OBJECT... |                 |                    |              
1    | 2  | 1      | Filter        |                       |       | T3.ALBUMID = 1                 |                 |                    |              
1    | 3  | 2      | TableScan     | CHINOOK.PUBLIC.TRACK  | T3    | TRACKID, NAME, ALBUMID, MED... | 1               | 1                  | 101888       
2    | 0  |        | Result        |                       |       | ARRAY_AGG(OBJECT_CONSTRUCT(... |                 |                    |              
2    | 1  | 0      | Aggregate     |                       |       | aggExprs: [ARRAY_AGG(OBJECT... |                 |                    |              
2    | 2  | 1      | LeftOuterJoin |                       |       | joinKey: (T2.ARTISTID = T1.... |                 |                    |              
2    | 3  | 2      | Filter        |                       |       | ARRAY_AGG(OBJECT_CONSTRUCT(... |                 |                    |              
2    | 4  | 3      | Aggregate     |                       |       | aggExprs: [ARRAY_AGG(OBJECT... |                 |                    |              
2    | 5  | 4      | Filter        |                       |       | T2.ARTISTID = 1                |                 |                    |              
2    | 6  | 5      | TableScan     | CHINOOK.PUBLIC.ALBUM  | T2    | ALBUMID, TITLE, ARTISTID       | 1               | 1                  | 8192         
2    | 7  | 2      | Filter        |                       |       | T1.ARTISTID = 1                |                 |                    |              
2    | 8  | 7      | TableScan     | CHINOOK.PUBLIC.ARTIST | T1    | ARTISTID, NAME                 | 1               | 1                  | 6144         

-- Three levels: artist, album, track
-- Correlated sub-query: level 1 -> level 2 -> level 3
-- Doesn't work: ✗
explain using tabular
select
  array_agg(object_construct(*))
  from (
    select
      *,
      (
        select
          array_agg(object_construct(*))
          from (
            select
              *,
              (
                select
                  array_agg(object_construct(*))
                  from (
                    select
                      *
                      from track t3
                     where t3.albumid = t2.albumid) y
              ) tracks
              from album t2
             where t2.artistid = t1.artistid) x
      ) albums
      from
        artist t1
     where artistid = 1);

Same error.

1 Answer

The simplest answer is "it doesn't work", because from a performance standpoint correlated subqueries are gross and should never be used.

In some simple cases it does sometimes work. But it quickly explodes into shapes that the current optimizer cannot resolve, hence the error.

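On one level you can point at "some other database" and say that it does this, and yes, it does. The next question is then why you are not using that database... and one reason I would suggest is performance.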

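The best way to get performant results is to understand your data and write SQL that deals with every wart of your dataset, and no more. Generic solutions that have to allow for "all edge cases" are therefore, by definition, poorly performing.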
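
For what it's worth, here is a rough sketch of how the failing three-level query might be written without any correlated subqueries, by aggregating bottom-up and joining. It has not been run against the asker's data. The CTE names tracks_by_album and albums_by_artist are made up for illustration, and the column names are taken from the EXPLAIN output in the question. Note that object_construct drops keys whose value is NULL, so an album without tracks simply has no TRACKS key, which may differ from what the correlated version returns.

```sql
-- Sketch only: builds the artist -> albums -> tracks structure with joins
-- instead of correlated subqueries. CTE names are hypothetical.
with tracks_by_album as (
  -- Level 3: one array of track objects per album
  select
    albumid,
    array_agg(object_construct(*)) as tracks
    from track
   group by albumid
),
albums_by_artist as (
  -- Level 2: one array of album objects per artist, each carrying its tracks
  select
    album.artistid,
    array_agg(
      object_construct(
        'ALBUMID', album.albumid,
        'TITLE', album.title,
        'ARTISTID', album.artistid,
        'TRACKS', tracks_by_album.tracks
      )
    ) as albums
    from album
    left join tracks_by_album
      on tracks_by_album.albumid = album.albumid
   group by album.artistid
)
-- Level 1: the artist object, carrying its albums
select
  array_agg(
    object_construct(
      'ARTISTID', artist.artistid,
      'NAME', artist.name,
      'ALBUMS', albums_by_artist.albums
    )
  )
  from artist
  left join albums_by_artist
    on albums_by_artist.artistid = artist.artistid
 where artist.artistid = 1;
```

The trade-off is that the columns of album and artist have to be listed explicitly, since object_construct(*) cannot be combined with an extra nested key in the same call.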
