SQL query/procedure to find the best matches between two tables based on the most matching columns

Question (votes: 0, answers: 1)

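I need a SQL query/procedure that finds the best-matching hosts for a given visitor, based on the visitor's gender, the most matching interests, and the academic field. I have the following tables: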

HOSTS: 
HOST_ID  GENDER   INTEREST_ONE_ID       INTEREST_TWO_ID   ACADEMIC_FIELD_ID     NUM_CAN_HOST
   A       M            1                    2                   10                    2
   B       M            5                    4                    3                    1
   C       F            2                    1                    3                    2
   D       F            1                    2                   10                    3 
   E       M            5                    1                    3                    1
   F       M            5                    1                    6                    1

VISITORS:
VISITOR_ID  GENDER INTEREST_ONE_ID       INTEREST_TWO_ID   ACADEMIC_FIELD_ID 
   1         M          2                       1                10
   2         M          5                       4                 3
   3         M          1                       2                 2
   4         F          4                       1                 6

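Note that all interest IDs come from the same list, and the academic_field_id values also come from a single list (naturally a different one from the interest list). So I want a query/procedure that returns the top X best host matches for a given visitor, first by gender and then by which host matches the most interests and the academic field. The position of an interest match does not matter (interest_one can match interest_two and vice versa). Example output for Visitor 1: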

BEST_MATCHES (for Visitor 1: only males, most matches first)
VISITOR_ID HOST_ID      INTEREST_ONE_MATCH   INTEREST_TWO_MATCH   Academic_int_MATCH
   1         A               x [one to two]      x [two to one]        x
   1         B                 -                    -                  -     Next best..which is not too good!

And for Visitor 2:

BEST_MATCHES 
VISITOR_ID HOST_ID      INTEREST_ONE_MATCH   INTEREST_TWO_MATCH   Academic_int_MATCH
   2         B               x                   x                     x
   2         E               x                   -                     x     
   2         F               x                   -                     -   

sql group-by rank
1 Answer
0 votes

This is an expensive query, but:

select hv.*
from (select h.host_id, v.visitor_id,
             (case when h.INTEREST_ONE_ID = v.INTEREST_ONE_ID then 'X' end) as INTEREST_ONE_ID_match,
             (case when h.INTEREST_TWO_ID = v.INTEREST_TWO_ID then 'X' end) as INTEREST_TWO_ID_match,
             . . . ,
             -- rank the hosts within each visitor, most matches first
             dense_rank() over (partition by v.visitor_id
                                order by ((case when h.INTEREST_ONE_ID = v.INTEREST_ONE_ID then 1 else 0 end) +
                                          (case when h.INTEREST_TWO_ID = v.INTEREST_TWO_ID then 1 else 0 end) +
                                          . . .
                                         ) desc
                               ) as seqnum
      from hosts h cross join
           visitors v
     ) hv
where seqnum = 1;
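
The query above compares interests position by position and does not filter on gender, so here is a minimal sketch of how the requirements in the question could be covered more fully. It is not a definitive solution. It assumes SQLite (driven from Python's sqlite3 module only so that it runs end to end), lower-case table and column names, and a scoring rule in which each visitor interest counts as matched if it appears in either host interest column. The names `build_demo_db`, `best_hosts`, and `BEST_HOSTS_SQL` are made up for illustration, and NUM_CAN_HOST is loaded but not used.

```python
import sqlite3

# Sample rows copied from the HOSTS and VISITORS tables in the question.
HOSTS = [
    ("A", "M", 1, 2, 10, 2),
    ("B", "M", 5, 4, 3, 1),
    ("C", "F", 2, 1, 3, 2),
    ("D", "F", 1, 2, 10, 3),
    ("E", "M", 5, 1, 3, 1),
    ("F", "M", 5, 1, 6, 1),
]
VISITORS = [
    (1, "M", 2, 1, 10),
    (2, "M", 5, 4, 3),
    (3, "M", 1, 2, 2),
    (4, "F", 4, 1, 6),
]

SCHEMA = """
CREATE TABLE hosts (
    host_id TEXT PRIMARY KEY, gender TEXT,
    interest_one_id INTEGER, interest_two_id INTEGER,
    academic_field_id INTEGER, num_can_host INTEGER);
CREATE TABLE visitors (
    visitor_id INTEGER PRIMARY KEY, gender TEXT,
    interest_one_id INTEGER, interest_two_id INTEGER,
    academic_field_id INTEGER);
"""

# Assumed scoring: one point per visitor interest found in either host
# interest column, plus one point for an equal academic field.
# Only hosts of the visitor's gender are considered.
BEST_HOSTS_SQL = """
SELECT visitor_id, host_id,
       interest_one_match, interest_two_match, academic_match,
       interest_one_match + interest_two_match + academic_match AS score
FROM (SELECT v.visitor_id, h.host_id,
             CASE WHEN v.interest_one_id IN (h.interest_one_id, h.interest_two_id)
                  THEN 1 ELSE 0 END AS interest_one_match,
             CASE WHEN v.interest_two_id IN (h.interest_one_id, h.interest_two_id)
                  THEN 1 ELSE 0 END AS interest_two_match,
             CASE WHEN v.academic_field_id = h.academic_field_id
                  THEN 1 ELSE 0 END AS academic_match
      FROM visitors v
      JOIN hosts h ON h.gender = v.gender
      WHERE v.visitor_id = ?) m
ORDER BY score DESC, host_id   -- host_id only breaks ties deterministically
LIMIT ?
"""


def build_demo_db():
    """Create an in-memory database holding the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO hosts VALUES (?, ?, ?, ?, ?, ?)", HOSTS)
    conn.executemany("INSERT INTO visitors VALUES (?, ?, ?, ?, ?)", VISITORS)
    return conn


def best_hosts(conn, visitor_id, top_n):
    """Return up to top_n same-gender hosts for one visitor, best score first."""
    return conn.execute(BEST_HOSTS_SQL, (visitor_id, top_n)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in best_hosts(conn, 2, 3):
        print(row)
```

On the sample data, the top three hosts for Visitor 2 come out as B, E, F, the same order as in the question. For Visitor 1, hosts E and F each share interest 1 with the visitor, so under this scoring they rank ahead of B. To rank every visitor at once, the same score expression could go into the order by of the window function in the query above, partitioned by visitor_id.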