Athena catalog queries are very slow

Question (votes: 0, answers: 1)

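I have a dbt data pipeline that creates many Athena tables backed by a large number of files, and I noticed that every run takes a long time even for simple queries. Looking in dbt.log, I found this query: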

WITH views AS (
      select
        table_catalog as database,
        table_name as name,
        table_schema as schema
      from "awsdatacatalog".INFORMATION_SCHEMA.views
      where table_schema = LOWER('graphs_db')
    ), tables AS (
      select
        table_catalog as database,
        table_name as name,
        table_schema as schema

      from "awsdatacatalog".INFORMATION_SCHEMA.tables
      where table_schema = LOWER('graphs_db')

      -- Views appear in both `tables` and `views`, so excluding them from tables
      EXCEPT 

      select * from views
    )
    select views.*, 'view' AS table_type FROM views
    UNION ALL
    select tables.*, 'table' AS table_type FROM tables

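It probably runs up front so that dbt can check which tables already exist. In any case, this query takes 5 minutes to run. My pipeline has several dbt steps, so this slows it down considerably. Is this normal? Is there any way to optimize it?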

amazon-web-services amazon-athena dbt
1 Answer
0
votes

You can take a look at this: Tips to improve the Athena Query Performance

In particular, point no. 4.
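Separately from those tips, if the goal is only to list the tables and views in one database, one thing that might be worth trying is reading the Glue Data Catalog directly instead of querying INFORMATION_SCHEMA through Athena. The following is a minimal sketch, not something taken from dbt itself: the function name `list_relations` is hypothetical, and it assumes the catalog behind `awsdatacatalog` is AWS Glue and that views are stored with `TableType` set to `VIRTUAL_VIEW`. Whether dbt can be made to use this path depends on the adapter and its version, which the question does not state.

```python
from typing import Any, Dict, List


def list_relations(glue_client: Any, database: str) -> List[Dict[str, str]]:
    """Return schema, name and type for every table and view in one Glue database.

    `glue_client` is expected to behave like boto3.client("glue").
    """
    relations: List[Dict[str, str]] = []
    # get_tables is paginated, so walk every page rather than a single call.
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page.get("TableList", []):
            # Assumption: Glue marks views as "VIRTUAL_VIEW"; anything else
            # is reported as a table here.
            kind = "view" if table.get("TableType") == "VIRTUAL_VIEW" else "table"
            relations.append(
                {"schema": database, "name": table["Name"], "table_type": kind}
            )
    return relations
```

With boto3 installed and AWS credentials configured, it could be called as `list_relations(boto3.client("glue"), "graphs_db")`. The result has the same name, schema and table_type columns as the query from dbt.log, so the two can be compared for timing.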
