How to speed up aggregation in Django


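I'm running a fairly complex query to build a trial balance report. It collects the current-period and prior-period debit and credit amounts for each account and sums them.

In Django I build the query with the ORM, and it takes about 5-6 seconds to load. I assumed the ORM was to blame, so I ran the raw SQL in pgAdmin, but the query time was the same. I'm trying to figure out how to make it faster.

Note: I have indexes on date_entered, property, and is_voided, so those shouldn't be the bottleneck. The table being queried has over a million rows.

Here is the SQL: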

SELECT "accounting_glaccount"."deleted",
       "accounting_glaccount"."date_entered",
       "accounting_glaccount"."last_update",
       "accounting_glaccount"."uuid",
       "accounting_glaccount"."deleted_by_cascade",
       "accounting_glaccount"."id",
       "accounting_glaccount"."account_name",
       "accounting_glaccount"."account_identifier",
       "accounting_glaccount"."sortable_account_identifier",
       "accounting_glaccount"."account_type",
       "accounting_glaccount"."account_designation",
       "accounting_glaccount"."balance_type",
       "accounting_glaccount"."report_type",
       "accounting_glaccount"."report_designation",
       "accounting_glaccount"."description",
       "accounting_glaccount"."last_edited_by_id",
       "accounting_glaccount"."submitted_by_id",
       (
        SELECT SUM(U0."credit_amount") AS "aggregation"
          FROM "accounting_generalledger" U0
         WHERE ((U0."date_entered")::date >= '2024-02-06'::date AND (U0."date_entered")::date <= '2024-02-06'::date AND U0."property_id" = 20 AND NOT U0."is_voided" AND U0."account_id" = ("accounting_glaccount"."id") AND U0."deleted" IS NULL)
         GROUP BY U0."account_id"
       ) AS "current_credit",
       (
        SELECT SUM(U0."debit_amount") AS "aggregation"
          FROM "accounting_generalledger" U0
         WHERE ((U0."date_entered")::date >= '2024-02-06'::date AND (U0."date_entered")::date <= '2024-02-06'::date AND U0."property_id" = 20 AND NOT U0."is_voided" AND U0."account_id" = ("accounting_glaccount"."id") AND U0."deleted" IS NULL)
         GROUP BY U0."account_id"
       ) AS "current_debit",
       (
        SELECT SUM(U0."credit_amount") AS "aggregation"
          FROM "accounting_generalledger" U0
         WHERE ((U0."date_entered")::date < '2024-02-06'::date AND U0."property_id" = 20 AND NOT U0."is_voided" AND U0."account_id" = ("accounting_glaccount"."id") AND U0."deleted" IS NULL)
         GROUP BY U0."account_id"
       ) AS "prior_credit",
       (
        SELECT SUM(U0."debit_amount") AS "aggregation"
          FROM "accounting_generalledger" U0
         WHERE ((U0."date_entered")::date < '2024-02-06'::date AND U0."property_id" = 20 AND NOT U0."is_voided" AND U0."account_id" = ("accounting_glaccount"."id") AND U0."deleted" IS NULL)
         GROUP BY U0."account_id"
       ) AS "prior_debit"
  FROM "accounting_glaccount"
 WHERE (UPPER("accounting_glaccount"."account_type"::text) = UPPER('Regular') AND "accounting_glaccount"."deleted" IS NULL)
 ORDER BY "accounting_glaccount"."sortable_account_identifier" ASC

Here is the original Django ORM query:

gl_accounts = GLAccount.objects.order_by('sortable_account_identifier').filter(account_type__iexact='Regular').annotate(
                current_credit=SubqueryAggregate(
                    'general_ledger__credit_amount',
                    filter=(Q(date_entered__date__gte=start_date,
                              date_entered__date__lte=end_date) & Q(property=property) & Q(is_voided=False)),
                    aggregate=Sum),
                current_debit=SubqueryAggregate(
                    'general_ledger__debit_amount',
                    filter=(Q(date_entered__date__gte=start_date,
                              date_entered__date__lte=end_date) & Q(property=property) & Q(is_voided=False)),
                    aggregate=Sum),
                prior_credit=SubqueryAggregate(
                    'general_ledger__credit_amount',
                    filter=(Q(date_entered__date__lt=start_date) & Q(property=property) & Q(is_voided=False)),
                    aggregate=Sum),
                prior_debit=SubqueryAggregate(
                    'general_ledger__debit_amount',
                    filter=(Q(date_entered__date__lt=start_date) & Q(property=property) & Q(is_voided=False)),
                    aggregate=Sum)
            ).iterator()

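How can I speed this up?

Note: I found an online query optimizer that suggested adding the indexes below, but I would rather define them in Django than in raw SQL. How can I do that?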

CREATE INDEX accounting_general_idx_prope_delet_accou_credi_date_is_vo ON "accounting_generalledger" ("property_id","deleted","account_id","credit_amount","date_entered","is_voided");
CREATE INDEX accounting_general_idx_prope_delet_accou_debit_date_is_vo ON "accounting_generalledger" ("property_id","deleted","account_id","debit_amount","date_entered","is_voided");
CREATE INDEX accounting_glaccou_idx_upperacc_deleted_sortable ON "accounting_glaccount" ((UPPER("account_type"::text)),"deleted","sortable_account_identifier");
1 Answer

First, if you're just wondering whether the SQL itself can be optimized, I found this site very helpful:

https://www.eversql.com/sql-query-optimizer/

When I pasted in the original query, it pointed out that some indexes could be improved and provided the SQL to create them. In my case:

CREATE INDEX accounting_general_idx_prope_delet_accou_credi_date_is_vo ON "accounting_generalledger" ("property_id","deleted","account_id","credit_amount","date_entered","is_voided");
CREATE INDEX accounting_general_idx_prope_delet_accou_debit_date_is_vo ON "accounting_generalledger" ("property_id","deleted","account_id","debit_amount","date_entered","is_voided");
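
To illustrate why a composite index of this shape helps, here is a minimal sketch using Python's built-in sqlite3 module. It is not PostgreSQL, so the planner and the exact plan text differ; the table name ledger, the index name ledger_credit_idx, and the sample data are all made up for the demonstration. It runs the same kind of filtered SUM before and after creating the index and captures the query plan each time.

```python
import sqlite3

# In-memory stand-in for the ledger table (hypothetical, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ledger (
        id INTEGER PRIMARY KEY,
        property_id INTEGER,
        deleted TEXT,
        account_id INTEGER,
        credit_amount INTEGER,
        date_entered TEXT,
        is_voided INTEGER
    )
    """
)

# Synthetic rows: 5 properties, 50 accounts, dates spread over January 2024.
rows = [
    (i % 5, None, i % 50, i, f"2024-01-{i % 28 + 1:02d}", 0)
    for i in range(5000)
]
conn.executemany(
    "INSERT INTO ledger (property_id, deleted, account_id, credit_amount,"
    " date_entered, is_voided) VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Same shape as one of the correlated subqueries, for a single account.
QUERY = """
SELECT SUM(credit_amount) FROM ledger
WHERE property_id = 3 AND deleted IS NULL AND account_id = 13
  AND date_entered < '2024-02-06' AND is_voided = 0
"""


def plan(connection):
    """Return the planner's description of QUERY as one string."""
    cursor = connection.execute("EXPLAIN QUERY PLAN " + QUERY)
    return " ".join(row[-1] for row in cursor)


plan_before = plan(conn)
total_before = conn.execute(QUERY).fetchone()[0]

# Composite index with the same column order as the suggested one.
conn.execute(
    "CREATE INDEX ledger_credit_idx ON ledger (property_id, deleted,"
    " account_id, credit_amount, date_entered, is_voided)"
)

plan_after = plan(conn)
total_after = conn.execute(QUERY).fetchone()[0]

print("before:", plan_before)
print("after: ", plan_after)
```

Because every column the subquery touches is part of the index, the database can answer it from the index alone instead of reading table rows. On PostgreSQL, running EXPLAIN ANALYZE on the real query is the way to confirm the same effect.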

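Obviously I want portability, and hand-creating indexes with SQL for every new instance of my application is a bit of a pain, so I worked out the Django way to get the same index groupings.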

I have a GeneralLedger model that holds all of this data. More or less all I needed to do was add an index_together attribute to the model's Meta class, like so:

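class GeneralLedger(SafeDeleteModel, UUID):
    class Meta:
        # Note: index_together is deprecated since Django 4.2 and removed in 5.1;
        # on newer versions, declare these via Meta.indexes with models.Index.
        index_together = [
            ("property", "deleted", "account", "credit_amount", "date_entered", "is_voided"),
            ("property", "deleted", "account", "debit_amount", "date_entered", "is_voided"),
        ]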

That took the load time from 5 seconds down to 475 ms.
