My table is:
create table transactions (
transaction_id integer not null,
transaction_timestamp integer not null,
input_index smallint,
output_index smallint not null,
from_id integer,
to_id integer not null,
input_value real,
output_value real not null,
constraint unique_transactions
unique (transaction_id, from_id, to_id)
);
with the following indexes:
create index idx_transactions_from_id_block_timestamp
on transactions (from_id asc, transaction_timestamp desc);
create index idx_transactions_to_id_block_timestamp
on transactions (to_id asc, transaction_timestamp desc);
create index idx_transactions_transaction_id
on transactions (transaction_id);
create index idx_transactions_block_timestamp
on transactions (transaction_timestamp desc);
The query I would ideally like to run is:
select distinct on (transaction_id,output_index) *
from transactions
where to_id = 1000
and transaction_timestamp between 1691193600 AND 1711929600
order by transaction_timestamp desc
limit 10
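to get the 10 most recent unique (transaction_id, output_index) pairs (I don't care which from_id and input_index are kept).
This direct approach doesn't work, because Postgres requires the ORDER BY to start with the DISTINCT ON columns. ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Doing that reorders my rows and picks the 10 rows with the highest transaction_id, which is not what I want.
Is there an efficient way to do this that takes advantage of the small limit, so that it hopefully doesn't have to go through millions of rows in the table?
I tried the following queries, but both ended up taking too long, because they have to process the whole table instead of making use of the small LIMIT 10.
Query 1: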
WITH RankedTransactions AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY transaction_id, output_index ORDER BY transaction_timestamp DESC) AS rn
FROM transactions
WHERE to_id = 1000
and transaction_timestamp between 1691193600 AND 1711929600
)
SELECT transaction_id,
input_index,
output_index,
transaction_timestamp,
from_id,
to_id,
input_value,
output_value
FROM RankedTransactions
WHERE rn = 1
ORDER BY transaction_timestamp DESC
LIMIT 10;
Query 2:
SELECT *
FROM (
SELECT DISTINCT ON (transaction_id, output_index) *
FROM transactions
WHERE to_id = 1000
and transaction_timestamp between 1691193600 AND 1711929600
ORDER BY transaction_id, output_index DESC
) AS latest_transactions
ORDER BY transaction_timestamp DESC
LIMIT 10;
This works:
SELECT *
FROM (
SELECT DISTINCT ON (transaction_id, output_index) *
FROM transactions
WHERE to_id = 1000
AND transaction_timestamp BETWEEN 1691193600 AND 1711929600
ORDER BY transaction_id, output_index DESC, transaction_timestamp DESC -- !!!
) AS latest_transactions
ORDER BY transaction_timestamp DESC
LIMIT 10;
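To see why the extra transaction_timestamp DESC in the inner ORDER BY matters, here is a small Python sketch that mimics the query on an in-memory list of rows. It only illustrates the semantics (DISTINCT ON keeps the first row of each group in sort order) and is not how Postgres executes the query. The function name latest_per_pair and the sample rows are made up for the illustration.

```python
from itertools import groupby


def latest_per_pair(rows, limit=10):
    """Mimic the nested DISTINCT ON query on a list of dict rows."""
    # Inner ORDER BY transaction_id, output_index DESC, transaction_timestamp DESC
    ordered = sorted(
        rows,
        key=lambda r: (r["transaction_id"], -r["output_index"], -r["transaction_timestamp"]),
    )
    # DISTINCT ON (transaction_id, output_index): keep the first row per group,
    # which is the latest one thanks to the timestamp tie-breaker above.
    firsts = [
        next(group)
        for _, group in groupby(ordered, key=lambda r: (r["transaction_id"], r["output_index"]))
    ]
    # Outer ORDER BY transaction_timestamp DESC LIMIT n
    return sorted(firsts, key=lambda r: -r["transaction_timestamp"])[:limit]


# Hypothetical sample data: pair (1, 0) appears twice with different timestamps.
sample_rows = [
    {"transaction_id": 1, "output_index": 0, "transaction_timestamp": 100, "from_id": 7},
    {"transaction_id": 1, "output_index": 0, "transaction_timestamp": 300, "from_id": 8},
    {"transaction_id": 2, "output_index": 1, "transaction_timestamp": 200, "from_id": 9},
]
result = latest_per_pair(sample_rows)
print([r["transaction_timestamp"] for r in result])  # prints [300, 200]
```

Without the tie-breaker, the row kept for pair (1, 0) would be arbitrary, so the outer sort could rank that pair by timestamp 100 instead of 300.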
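The best query (and index) depends on how many rows, and how many duplicates among them, to expect per base selection, that is, after filtering with
WHERE to_id = 1000 AND transaction_timestamp BETWEEN 1691193600 AND 1711929600
If not too many rows qualify, then backed by an index on (to_id, transaction_timestamp DESC), which you seem to have, this query may be all you need.
With many qualifying rows and/or many duplicates on (transaction_id, output_index), things get more complicated. Especially because you (1) already have a range condition in the base filter and (2) want a mixed sort order of (transaction_id ASC, output_index DESC), which makes it hard to emulate an index skip scan ...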
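For the case of many qualifying rows, one alternative (not part of the query above, and only a sketch) is to deduplicate while reading: fetch the filtered rows in transaction_timestamp DESC order, which the index on (to_id, transaction_timestamp DESC) can deliver, and stop as soon as 10 distinct (transaction_id, output_index) pairs have been seen. The Python generator below shows the idea on the client side. It assumes the input is already sorted by timestamp descending, for example a database cursor over the ordered base query. The name first_distinct is hypothetical.

```python
def first_distinct(rows, limit=10):
    """Yield the first `limit` rows with distinct (transaction_id, output_index).

    Assumes `rows` is an iterable of dict-like rows already sorted by
    transaction_timestamp DESC. Stops reading once `limit` pairs are found.
    """
    seen = set()
    for row in rows:
        key = (row["transaction_id"], row["output_index"])
        if key in seen:
            continue  # an earlier (more recent) row already covered this pair
        seen.add(key)
        yield row
        if len(seen) >= limit:
            return  # early exit: later rows are never fetched


# Hypothetical rows, newest first, as the ordered base query would return them.
ordered_rows = [
    {"transaction_id": 1, "output_index": 0, "transaction_timestamp": 300},
    {"transaction_id": 1, "output_index": 0, "transaction_timestamp": 250},
    {"transaction_id": 2, "output_index": 1, "transaction_timestamp": 200},
    {"transaction_id": 3, "output_index": 0, "transaction_timestamp": 100},
]
top_two = list(first_distinct(ordered_rows, limit=2))
print([r["transaction_timestamp"] for r in top_two])  # prints [300, 200]
```

How many rows this reads depends on how many duplicates sit between distinct pairs. With few duplicates per pair it touches only slightly more than 10 rows; with very many it can still read a lot. Rows with equal timestamps arrive in arbitrary order, which is fine here because it does not matter which from_id and input_index are kept.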
Related: