Neo4j Cypher: optimizing the performance of a query with a WHERE condition

Problem description

I have the following Cypher query:

PROFILE
MATCH (childDStat:JobableStatistic {jobableId: childD.id}) 
WITH collect({`childDStat`:childDStat, `childD`:childD}) as childDStats 
CALL apoc.cypher.mapParallel2(
  " WITH _.childD as childD, _.childDStat as childDStat 
    WITH childD, childDStat 
    UNWIND childD.detailedCriterionIds as dCId 
    WITH childD, childDStat, dCId + coalesce(childDStat['replaceableCriterionIds.' + dCId],[]) as cGroup 
    WITH childD, childDStat, cGroup 
    WHERE NOT ALL(x IN cGroup WHERE x IN $zeroCriterionIds ) 
    WITH childD, childDStat, collect(cGroup) as cGroups 
    WHERE size(cGroups) >= size(childD.detailedCriterionIds) 
    UNWIND cGroups as cGroup 
    WITH childD, childDStat, cGroup 
    WHERE ANY(x IN cGroup WHERE x IN $detailedCriterionIds) 
    WITH childD, childDStat, collect(cGroup) as cGroups 
    WHERE size(cGroups) > 1 
    RETURN childD, childDStat, cGroups ",
  {`detailedCriterionIds`: [3, 5, 7, 8, 12, 13, 14, 15, 16, 18, 20, 21, 23, 26, 28, 29, 30, 31, 33, 35, 36, 40, 42, 44, 45, 46, 47, 51, 54],
   `zeroCriterionIds`: []},
  childDStats, 6, 10) 
YIELD value
WITH value.childD as childD, value.childDStat as childDStat, value.cGroups as cGroups WITH collect({`childDStat`:childDStat, `childD`:childD, `cGroups`:cGroups}) as childDStats 

CALL apoc.cypher.mapParallel2(
  " WITH _.childD as childD, _.childDStat as childDStat, _.cGroups as cGroups WITH childD, childDStat, size(cGroups) as cGroupsSize, cGroups 
    UNWIND cGroups as cGroup 
    WITH childD, childDStat, cGroupsSize, cGroup 
    UNWIND cGroup as cId WITH childD, childDStat, cGroupsSize, cGroup, cId, cGroup[0] as cG0

    WITH childD, childDStat, cGroupsSize, cGroup, cId, cG0, childDStat['criterionAvgVoteWeights.' + cG0] as childDStatCriterionAvgVoteWeight,
      childDStat['criterionExperienceMonths.' + cG0] as childDStatCriterionExperienceMonth, 
      $criterionAvgVoteWeights[toString(cId)] as criterionAvgVoteWeight, $criterionExperienceMonths[toString(cId)] as criterionExperienceMonth 
    WHERE 
      (childDStatCriterionAvgVoteWeight = 0 OR childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight OR criterionAvgVoteWeight IS NULL) AND
      (childDStatCriterionExperienceMonth = 0 OR childDStatCriterionExperienceMonth <= criterionExperienceMonth OR criterionExperienceMonth IS NULL)

    WITH childD, childDStat, cGroupsSize, cG0, collect(cId) as cIds WITH childD, childDStat, cGroupsSize, collect(DISTINCT cG0 + cIds) as cGroups
    WHERE size(cGroups) >= cGroupsSize
    RETURN childD, childDStat, cGroups ",
  {`detailedCriterionIds`: [3, 5, 7, 8, 12, 13, 14, 15, 16, 18, 20, 21, 23, 26, 28, 29, 30, 31, 33, 35, 36, 40, 42, 44, 45, 46, 47, 51, 54],
   `zeroCriterionIds`: [],
   `criterionAvgVoteWeights`: {`51`:5.0, `8`:0.0, `33`:5.0, `21`:0.0, `31`:0.0, `26`:4.0, `14`:5.0, `36`:3.0, `46`:3.0, `12`:3.0, `18`:5.0, `28`:5.0, `16`:2.0, `7`:5.0, `40`:1.0, `5`:5.0, `44`:4.0, `3`:1.0, `54`:4.0, `20`:4.0, `42`:4.0, `30`:3.0, `15`:4.0, `47`:1.0, `35`:1.0, `13`:3.0, `45`:3.0, `23`:4.0, `29`:1.0},
   `criterionExperienceMonths`: {`8`:109, `33`:8, `21`:184, `31`:14, `26`:100, `14`:157, `36`:140, `46`:123, `12`:85, `18`:96, `28`:116, `16`:15, `7`:63, `40`:56, `5`:166, `44`:101, `3`:129, `42`:84, `20`:102, `30`:173, `15`:97, `47`:54, `13`:91, `35`:137, `45`:119, `23`:162, `29`:97}
  },
  childDStats, 6, 10) 
YIELD value
RETURN value.childD.id

The following condition accounts for most of the query execution time:

WHERE 
  (childDStatCriterionAvgVoteWeight = 0 OR childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight OR criterionAvgVoteWeight IS NULL) AND
  (childDStatCriterionExperienceMonth = 0 OR childDStatCriterionExperienceMonth <= criterionExperienceMonth OR criterionExperienceMonth IS NULL)

With this condition, the query takes ~1000 ms; without it, 5 ms.

Is there anything I can do to improve query performance in this case?

P.S.

This is just a test query. The real query uses a parameter for every single value.

neo4j cypher query-optimization
1 answer

(childDStatCriterionAvgVoteWeight = 0 OR childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight OR criterionAvgVoteWeight IS NULL)

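  • Test criterionAvgVoteWeight IS NULL first. That way, when the value is NULL, you do not waste time evaluating childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight against a NULL operand (a comparison with NULL yields NULL, which WHERE treats as false), and you do not need to test childDStatCriterionAvgVoteWeight = 0 either.
  • If criterionAvgVoteWeight is always >= 0 (whenever it is not NULL), the childDStatCriterionAvgVoteWeight = 0 test can also be dropped, because a value of 0 already satisfies childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight.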

In short, consider using the following expression:

(criterionAvgVoteWeight IS NULL OR childDStatCriterionAvgVoteWeight <= criterionAvgVoteWeight)

Similar considerations apply to:

(childDStatCriterionExperienceMonth = 0 OR childDStatCriterionExperienceMonth <= criterionExperienceMonth OR criterionExperienceMonth IS NULL)

These changes may save some time, but filtering is never free.
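The claim that the shorter predicate keeps the same rows can be checked outside the database. The following is a minimal Python sketch, not Neo4j code: it models Cypher's three-valued logic with None standing in for NULL, and compares the original and simplified predicates over a handful of hypothetical sample values. The helper names (or3, eq3, le3, passes_where) and the sample lists are assumptions made for illustration only.

```python
from itertools import product

def or3(*operands):
    """Cypher-style OR: true if any operand is true, else null if any is null, else false."""
    if any(v is True for v in operands):
        return True
    if any(v is None for v in operands):
        return None
    return False

def eq3(a, b):
    """Equality that yields null (None) when either side is null."""
    return None if a is None or b is None else a == b

def le3(a, b):
    """<= that yields null (None) when either side is null."""
    return None if a is None or b is None else a <= b

def original(stat, param):
    # stat = 0 OR stat <= param OR param IS NULL
    return or3(eq3(stat, 0), le3(stat, param), param is None)

def simplified(stat, param):
    # param IS NULL OR stat <= param
    return or3(param is None, le3(stat, param))

def passes_where(value):
    """WHERE keeps a row only when the predicate is true; null and false both drop it."""
    return value is True

# Hypothetical sample values; param is either null or non-negative,
# which is the assumption the simplification relies on.
stats = [None, 0, 1, 3, 5]
params = [None, 0, 1, 3, 5]

mismatches = [
    (s, p) for s, p in product(stats, params)
    if passes_where(original(s, p)) != passes_where(simplified(s, p))
]
print(mismatches)  # [] for these non-negative samples
```

The two predicates stop agreeing once the parameter can be negative: with a stat value of 0 and a parameter of -1, the original keeps the row (because of the = 0 test) while the simplified form drops it. The rewrite is therefore only safe if the non-negativity assumption really holds for the data.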
