Postgres: one query with multiple JOINs vs. multiple queries

Question

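I'm working with Postgres 9.6 and PostGIS 2.3, hosted on AWS RDS. I'm trying to optimize some geo-radius queries for data that comes from different tables.

I'm considering two approaches: a single query with multiple joins, or two separate but simpler queries.

At a high level, and with the structure simplified, my schema is: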

CREATE EXTENSION "uuid-ossp";
CREATE EXTENSION IF NOT EXISTS postgis;


CREATE TABLE addresses (
    id bigint NOT NULL,
    latitude double precision,
    longitude double precision,
    line1 character varying NOT NULL,
    "position" geography(Point,4326),
    CONSTRAINT enforce_srid CHECK ((st_srid("position") = 4326))
);

CREATE INDEX index_addresses_on_position ON addresses USING gist ("position");

CREATE TABLE locations (
    id bigint NOT NULL,
    uuid uuid DEFAULT uuid_generate_v4() NOT NULL,
    address_id bigint NOT NULL
);

CREATE TABLE shops (
    id bigint NOT NULL,
    name character varying NOT NULL,
    location_id bigint NOT NULL
);

CREATE TABLE inventories (
    id bigint NOT NULL,
    shop_id bigint NOT NULL,
    status character varying NOT NULL
);

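The addresses table holds the geo data. The position column is computed from the latitude and longitude columns whenever a row is inserted or updated.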
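
The schema above does not show how position is populated. One possible way to do it is a row-level trigger, sketched below. The function and trigger names are hypothetical and not part of the original schema, and the sketch assumes longitude is the X coordinate and latitude the Y coordinate.

```sql
-- Hypothetical trigger function: derives "position" from latitude/longitude.
CREATE OR REPLACE FUNCTION addresses_set_position() RETURNS trigger AS $$
BEGIN
    IF NEW.latitude IS NOT NULL AND NEW.longitude IS NOT NULL THEN
        -- ST_MakePoint takes (x, y), i.e. (longitude, latitude).
        NEW."position" := ST_SetSRID(
            ST_MakePoint(NEW.longitude, NEW.latitude), 4326
        )::geography;
    ELSE
        NEW."position" := NULL;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fires on every insert, and on updates that touch either coordinate.
CREATE TRIGGER addresses_set_position
BEFORE INSERT OR UPDATE OF latitude, longitude ON addresses
FOR EACH ROW EXECUTE PROCEDURE addresses_set_position();
```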

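Each address is associated with one location.

Each address may have many shops, and each shop will have one inventory.

I've omitted them for brevity, but all tables have the proper foreign key constraints and btree indexes on the referencing columns.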
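
For reference, the omitted constraints and indexes could look roughly like this. It is only a sketch: the primary keys, constraint names, and index names are assumptions, since the simplified schema above does not include them.

```sql
-- Assumed primary keys (a foreign key needs a unique target column).
ALTER TABLE addresses   ADD PRIMARY KEY (id);
ALTER TABLE locations   ADD PRIMARY KEY (id);
ALTER TABLE shops       ADD PRIMARY KEY (id);
ALTER TABLE inventories ADD PRIMARY KEY (id);

-- Foreign keys on the referencing columns (constraint names are hypothetical).
ALTER TABLE locations ADD CONSTRAINT fk_locations_address
    FOREIGN KEY (address_id) REFERENCES addresses (id);
ALTER TABLE shops ADD CONSTRAINT fk_shops_location
    FOREIGN KEY (location_id) REFERENCES locations (id);
ALTER TABLE inventories ADD CONSTRAINT fk_inventories_shop
    FOREIGN KEY (shop_id) REFERENCES shops (id);

-- btree indexes supporting the joins (index names are hypothetical).
CREATE INDEX index_locations_on_address_id ON locations USING btree (address_id);
CREATE INDEX index_shops_on_location_id ON shops USING btree (location_id);
CREATE INDEX index_inventories_on_shop_id ON inventories USING btree (shop_id);
```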

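The tables have a few hundred thousand rows each.

With this in place, my main use case can be satisfied by this single query, which searches for addresses within 1000 meters of a central geographic point (10.0, 10.0) and returns data from all tables: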

SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.uuid AS location_uuid,
    a.line1 AS addr_line,
    a.latitude AS lat,
    a.longitude AS lng
FROM addresses a
JOIN locations l ON l.address_id = a.id
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE ST_DWithin(
    a.position,                             -- the position of each address
    ST_SetSRID(ST_Point(10.0, 10.0), 4326), -- the center of the circle
    1000,                                   -- radius distance in meters
    true
);

This query works, and EXPLAIN ANALYZE shows that it correctly uses the GiST index.
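
For anyone who wants to reproduce that check, the plan can be inspected roughly as follows. This is a sketch against the simplified schema above. Note that EXPLAIN with ANALYZE actually executes the query, and the BUFFERS option adds buffer hit/read counts to each plan node, which gives a rough view of how much data each step touches.

```sql
-- Runs the query and prints the actual plan with timing and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.uuid AS location_uuid,
    a.line1 AS addr_line,
    a.latitude AS lat,
    a.longitude AS lng
FROM addresses a
JOIN locations l ON l.address_id = a.id
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE ST_DWithin(
    a.position,
    ST_SetSRID(ST_Point(10.0, 10.0), 4326),
    1000,
    true
);
```

In the output, an index scan node that names index_addresses_on_position indicates that the GiST index is being used for the radius filter.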

However, I could also split this query in two and manage the intermediate results in the application layer. For example, this works too:

--- only search for the addresses
SELECT
    a.id as addr_id,
    a.line1 AS addr_line,
    a.latitude AS lat,
    a.longitude AS lng
FROM addresses a
WHERE ST_DWithin(
    a.position,                             -- the position of each address
    ST_SetSRID(ST_Point(10.0, 10.0), 4326), -- the center of the circle
    1000,                                   -- radius distance in meters
    true
);

--- get the rest of the data
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.id AS location_id,
    l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE
    l.address_id IN (1, 2, 3, 4, 5)  -- potentially thousands of values
;

where the values in l.address_id IN (1, 2, 3, 4, 5) come from the first query.
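
If the two-query route is taken and the list grows to thousands of ids, one alternative to building a very long IN list is to pass the ids as a single array parameter. The following is a minimal sketch; the prepared statement name shops_for_addresses is hypothetical, and an application driver would normally bind the array instead of using PREPARE/EXECUTE directly.

```sql
-- Hypothetical prepared statement taking the address ids as one bigint[] parameter.
PREPARE shops_for_addresses(bigint[]) AS
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.id AS location_id,
    l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE l.address_id = ANY ($1);

-- The ids returned by the first query are passed as an array.
EXECUTE shops_for_addresses(ARRAY[1, 2, 3, 4, 5]::bigint[]);
```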

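The query plans for the two split queries look simpler than the plan for the first query, but I wonder whether that in itself means the second solution is better.

I know that inner joins are already well optimized, and that a single round trip to the DB is preferable.

What about memory usage? Or resource contention on the tables (e.g. locks)?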
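
One way to investigate the locking part is to look at pg_locks from inside a transaction. A plain SELECT takes only an ACCESS SHARE lock on each relation it reads, and that mode conflicts only with ACCESS EXCLUSIVE. The sketch below uses a simplified two-table join for illustration; the same check applies to either query shape.

```sql
BEGIN;

-- Any read query; the locks it takes are held until the transaction ends.
SELECT count(*)
FROM addresses a
JOIN locations l ON l.address_id = a.id;

-- List the relation-level locks held by the current session.
SELECT c.relname, lk.mode, lk.granted
FROM pg_locks lk
JOIN pg_class c ON c.oid = lk.relation
WHERE lk.pid = pg_backend_pid()
  AND lk.locktype = 'relation'
ORDER BY c.relname;

COMMIT;
```

The listing also includes the catalog relations read by the inspection query itself, so only the rows for the application tables and their indexes are of interest.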

Answer 1

I (re)combined your second snippet into a single query, using IN (...):

--- get the rest of the data
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.id AS location_id,
    l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE l.address_id IN ( --- only search for the addresses
        SELECT a.id
        FROM addresses a
        WHERE ST_DWithin(a.position, ST_SetSRID(ST_Point(10.0, 10.0), 4326), 1000, true)
        );

Or, similarly, using EXISTS (...):

--- get the rest of the data
SELECT
    s.id AS shop_id,
    s.name AS shop_name,
    i.status AS inventory_status,
    l.id AS location_id,
    l.uuid AS location_uuid
FROM locations l
JOIN shops s ON s.location_id = l.id
JOIN inventories i ON i.shop_id = s.id
WHERE EXISTS ( SELECT * --- only search for the addresses
        FROM addresses a
        WHERE a.id = l.address_id 
        AND ST_DWithin( a.position, ST_SetSRID(ST_Point(10.0, 10.0), 4326), 1000, true)
        );
