SELECT DISTINCT fails in Redshift when used with a CTE

Question

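I built a CTE and use SELECT DISTINCT when selecting from it. This query fails with an error. However, when I drop DISTINCT and use a plain SELECT, the query runs successfully. I am trying to figure out why it fails with SELECT DISTINCT. Why can't I use SELECT DISTINCT in this case? The query is generated automatically by an R package that I am debugging.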

SQL query

with tab as (
  SELECT coh.cohort_definition_id, pr.person_id,
         'ADT' as codeset_tag, pr.procedure_date as drug_exposure_start_date,
         coh.cohort_start_date,
         coh.cohort_end_date
  FROM truven_ccmr_claims_actual_omop.PROCEDURE_OCCURRENCE pr
  JOIN sandbox_truven.PIONEER2023_US_MarketScan_stg coh
      ON pr.person_id = coh.subject_id
  WHERE procedure_concept_id in (
     4012324, 4304921, 4073141, 4071936, 4073142,
     4073143, 2103796, 2109975, 2109976,
     4512827, 4314682, 4286887, 4341536, 4145907) 
  LIMIT 10
)
SELECT distinct * 
FROM tab 
WHERE cohort_end_date >= drug_exposure_start_date
  AND cohort_start_date <= drug_exposure_start_date limit 10;

Error

An error occurred when executing the SQL command:
with tab as (SELECT coh.cohort_definition_id, pr.person_id,
         'ADT' as codeset_tag, pr.procedure_date as drug_exposure_start_date,
         coh...

[Amazon](500310) Invalid operation: failed to find conversion function from
"unknown" to text; [SQL State=XX000, DB Errorcode=500310]
1 statement failed.

Answer 1

As an immediate fix, here is a version that should work:

SELECT DISTINCT
    cohort_definition_id,
    person_id,
    codeset_tag,
    drug_exposure_start_date,
    cohort_start_date,
    cohort_end_date
FROM tab
WHERE cohort_end_date >= drug_exposure_start_date AND
      cohort_start_date <= drug_exposure_start_date
-- ORDER BY <one or more columns>
LIMIT 10;

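Redshift does not seem to like SELECT * mixed with DISTINCT, although as a best practice you should list every column whose combination is supposed to be distinct anyway. Also note that using LIMIT without ORDER BY is fairly meaningless, because which rows come back is then arbitrary.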
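
One further possibility, offered as an assumption rather than something confirmed in this thread: the error message mentions a conversion from "unknown" to text, which is what Redshift typically reports when an untyped string literal has to be compared, as DISTINCT requires. The literal 'ADT' in the CTE is such a value. Under that assumption, giving the literal an explicit type inside the CTE may let SELECT DISTINCT * run unchanged. The sketch below is the query from the question with only that one change; the VARCHAR(3) length is chosen for illustration.

```sql
-- Sketch: the question's query with the string literal explicitly typed.
-- Assumption: the untyped literal 'ADT' is what triggers the
-- "unknown" to text conversion error under DISTINCT.
WITH tab AS (
  SELECT coh.cohort_definition_id, pr.person_id,
         CAST('ADT' AS VARCHAR(3)) AS codeset_tag,  -- explicit type instead of an untyped literal
         pr.procedure_date AS drug_exposure_start_date,
         coh.cohort_start_date,
         coh.cohort_end_date
  FROM truven_ccmr_claims_actual_omop.PROCEDURE_OCCURRENCE pr
  JOIN sandbox_truven.PIONEER2023_US_MarketScan_stg coh
      ON pr.person_id = coh.subject_id
  WHERE procedure_concept_id IN (
     4012324, 4304921, 4073141, 4071936, 4073142,
     4073143, 2103796, 2109975, 2109976,
     4512827, 4314682, 4286887, 4341536, 4145907)
  LIMIT 10
)
SELECT DISTINCT *
FROM tab
WHERE cohort_end_date >= drug_exposure_start_date
  AND cohort_start_date <= drug_exposure_start_date
LIMIT 10;
```

If the query text is generated by the R package, the cast would have to be added in the template that emits the 'ADT' literal, not in the final SELECT.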
