选择ColA=ColB的行,并添加一个新列,告诉ColA中的原始数据是否重复

问题描述 投票:0回答:4

我试图弄清楚如何查询一个表(该表实际上是一个结果集,因此它将是一个子查询),按

ColA=ColB
对其进行分组(见下文),并一步创建一个计算字段。

所以,如果我的测试数据看起来像

ColA ColB ColC 
1 1 AAA
1 2 bbbb
1 3 cccc
2 2 dddd
3 3 eeee
3 4 ffff
3 5 咕咕
3 6 呵呵
4 4 三世
5 5 杰杰杰杰
6 6 kkkk
6 7 llll
6 8 毫米

我想只检索

ColA=ColB
所在的行,并添加一个新列来告诉我
ColA
中的原始数据是否重复。见下文。

ColA ColB ColC 倍数
1 1 aaaa 是的
2 2 dddd 无
3 3 eeee 是的
4 4 iii 否
5 5 jjjj 没有
6 6 kkkk 是的

有人可以帮我解决语法问题吗?我一直在使用 Group By 和 SubSelects 但无济于事。我是否需要对倍数字段使用 case 语句?

sql oracle group-by case having
4个回答
7
投票

发布 create table 和 insert 语句而不是 Desc table 和 select * from table_name 更有帮助; http://tkyte.blogspot.com/2005/06/how-to-ask-questions.html

create table test_repeat(
   cola number,
   colb number,
   colc varchar2(20)
   );

insert into test_repeat values (1,1,'aaa'); 
insert into test_repeat values (1,2,'bbbb');    
insert into test_repeat values (1,3,'cccc');    
insert into test_repeat values (2,2,'dddd');    
insert into test_repeat values (3,3,'eeee'); 
insert into test_repeat values (3,4,'ffff');    
insert into test_repeat values (3,5,'gggg');    
insert into test_repeat values (3,6,'hhhh');    
insert into test_repeat values (4,4,'iiii');    
insert into test_repeat values (5,5,'jjjj');    
insert into test_repeat values (6,6,'kkkk');    
insert into test_repeat values (6,7,'llll');    
insert into test_repeat values (6,8,'mmmm');
commit;

1.您可以使用 Oracle 分析函数 Lead 来查看结果集,看看 colA 是否与下一行相同(在对其进行排序后..),例如 ..

select * from
(select colA, colb,
       (case when colA = (lead(cola) over 
                          (partition by colA order by cola, colb))
            then 'Yes'
            else 'No'
       end) multiples,
       colc
  from test_repeat)
  where colA = colb
/

      COLA       COLB MUL COLC
---------- ---------- --- --------------------
         1          1 Yes aaa
         2          2 No  dddd
         3          3 Yes eeee
         4          4 No  iiii
         5          5 No  jjjj
         6          6 Yes kkkk

2.或者您可以获取 COLA 每个值的计数并进行比较以查看是否有重复项...

select a.colA, a.colb, a.colc,
       (case when (select count(*) from test_repeat t where t.cola = a.colA) > 1
            then 'Yes'
            else 'No'
       end) Repeat
  from test_repeat a
  where colA = colB
/

      COLA       COLB COLC                 REP
---------- ---------- -------------------- ---
         1          1 aaa                  Yes
         2          2 dddd                 No
         3          3 eeee                 Yes
         4          4 iiii                 No
         5          5 jjjj                 No
         6          6 kkkk                 Yes

它们都同样简单,但是我建议使用分析函数方法,因为我发现它对于所有查询来说通常都更快我过去使用过。


1
投票
SQL> select *
  2    from test_repeat
  3   order by cola
  4  /

      COLA       COLB COLC
---------- ---------- --------------------
         1          2 bbbb
         1          1 aaa
         1          3 cccc
         2          2 dddd
         3          4 ffff
         3          3 eeee
         3          5 gggg
         3          6 hhhh
         4          4 iiii
         5          5 jjjj
         6          6 kkkk
         6          7 llll
         6          8 mmmm
         7          9 nnnn

14 rows selected.

SQL> select cola "ColA"
  2       , max(decode(colb,cola,colb)) "ColB"
  3       , max(decode(colb,cola,colc)) "ColC"
  4       , case count(*) when 1 then 'no' else 'yes' end "multiples"
  5    from test_repeat
  6   group by cola
  7  having cola = max(decode(colb,cola,colb))
  8   order by cola
  9  /

      ColA       ColB ColC                 mul
---------- ---------- -------------------- ---
         1          1 aaa                  yes
         2          2 dddd                 no
         3          3 eeee                 yes
         4          4 iiii                 no
         5          5 jjjj                 no
         6          6 kkkk                 yes

6 rows selected.

从性能角度来看,这个 SQL 大约等于 Rajesh 的第一个查询。所以你可以选择你更舒服的那一个。

SQL> set autotrace on
SQL> select * from
  2  (select colA, colb,
  3         (case when colA = (lead(cola) over
  4                            (partition by colA order by cola, colb))
  5              then 'Yes'
  6              else 'No'
  7         end) multiples,
  8         colc
  9    from test_repeat)
 10    where colA = colb
 11  /

      COLA       COLB MUL COLC
---------- ---------- --- --------------------
         1          1 Yes aaa
         2          2 No  dddd
         3          3 Yes eeee
         4          4 No  iiii
         5          5 No  jjjj
         6          6 Yes kkkk

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1491815685

-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |    14 |   574 |     4  (25)| 00:00:01 |
|*  1 |  VIEW               |             |    14 |   574 |     4  (25)| 00:00:01 |
|   2 |   WINDOW SORT       |             |    14 |   532 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_REPEAT |    14 |   532 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COLA"="COLB")

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
         15  consistent gets
          0  physical reads
          0  redo size
        421  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed

SQL> /

      COLA       COLB MUL COLC
---------- ---------- --- --------------------
         1          1 Yes aaa
         2          2 No  dddd
         3          3 Yes eeee
         4          4 No  iiii
         5          5 No  jjjj
         6          6 Yes kkkk

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1491815685

-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |    14 |   574 |     4  (25)| 00:00:01 |
|*  1 |  VIEW               |             |    14 |   574 |     4  (25)| 00:00:01 |
|   2 |   WINDOW SORT       |             |    14 |   532 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_REPEAT |    14 |   532 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COLA"="COLB")

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        421  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed

SQL> select cola "ColA"
  2       , max(decode(colb,cola,colb)) "ColB"
  3       , max(decode(colb,cola,colc)) "ColC"
  4       , case count(*) when 1 then 'no' else 'yes' end "multiples"
  5    from test_repeat
  6   group by cola
  7  having cola = max(decode(colb,cola,colb))
  8   order by cola
  9  /

      ColA       ColB ColC                 mul
---------- ---------- -------------------- ---
         1          1 aaa                  yes
         2          2 dddd                 no
         3          3 eeee                 yes
         4          4 iiii                 no
         5          5 jjjj                 no
         6          6 kkkk                 yes

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3021378319

-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |    14 |   532 |     4  (25)| 00:00:01 |
|*  1 |  FILTER             |             |       |       |            |          |
|   2 |   SORT GROUP BY     |             |    14 |   532 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_REPEAT |    14 |   532 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COLA"=MAX(DECODE("COLB","COLA","COLB")))

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        421  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed

SQL> /

      ColA       ColB ColC                 mul
---------- ---------- -------------------- ---
         1          1 aaa                  yes
         2          2 dddd                 no
         3          3 eeee                 yes
         4          4 iiii                 no
         5          5 jjjj                 no
         6          6 kkkk                 yes

6 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3021378319

-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |    14 |   532 |     4  (25)| 00:00:01 |
|*  1 |  FILTER             |             |       |       |            |          |
|   2 |   SORT GROUP BY     |             |    14 |   532 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| TEST_REPEAT |    14 |   532 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("COLA"=MAX(DECODE("COLB","COLA","COLB")))

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        420  bytes sent via SQL*Net to client
        238  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          6  rows processed

SQL> set autotrace off

0
投票

我认为您需要对

multiples
列进行子选择。不需要分组。写一些类似的内容:

select ColA, ColB, ColC, 
       (CASE
        WHEN (select b.ColA from thistable b where b.ColA = a.ColA) > 1 THEN 'yes'
        ELSE 'no'
        END)
from thistable a
where ColA = ColB

0
投票

我不知道引导功能,非常好。 Rob 和 Rajesh,感谢到目前为止的回答,我想我能够使用两个子查询和一个外连接部分地回答我自己的问题。

我猜我的查询不如其他人发布的那么有效,但我想知道人们是否可以对此查询发表评论。它总是有效吗?可以对其进行调整以提高效率吗? 对于倍数,它仍然需要输入“是”或“否”,现在它会输入倍数计数或对于 0 为空。

SELECT BB.ColA, ColB, ColC, AA.CNT 
FROM
(SELECT ColA, count(*) as CNT FROM TMPY where ColA<>ColB Group by ColA)  AA, 
(SELECT ColA, ColB, ColC FROM TMPY WHERE COLA=COLB ) BB 
WHERE
AA.COLa(+)=BB.COLB order by bb.ColA;
© www.soinside.com 2019 - 2024. All rights reserved.