Adding a check based on a Compliance analyzer


This is the sample DataFrame (df) I am working with:

+---+----+--------+
| id|orig|scrubbed|
+---+----+--------+
|  1|   a|       a|
|  2|   B|       b|
|  3|   c|       c|
|  4|   D|       d|
|  5|   *|      XX|
|  6|   $|      XX|
|  7|  ZZ|      ZZ|
|  8|  XX|      XX|
|  9|   y|       y|
| 10|   Z|       z|
+---+----+--------+

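I want to run a check that tells me whether the proportion of items that are "populated" after scrubbing (that is, items that do not contain "XX" or "ZZ") is at least 80%. (This check should fail.) I can add a Compliance analyzer to the VerificationRunBuilder to compute the metric, like this: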

val myVerificationResult: VerificationResult = new VerificationRunBuilder(df).
    addRequiredAnalyzer(
        Compliance(
            "populatedAfterScrubbing",
            "`scrubbed` NOT IN ('ZZ', 'XX') AND `scrubbed` IS NOT NULL",
            Some("`orig` NOT IN ('ZZ', 'XX') AND `orig` IS NOT NULL")
        )
    ).
    addCheck(
        Check(CheckLevel.Error, "Review Check").
            hasSize(_ >= 1)
    ).
    run()

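This code runs and successfully checks the data using the hasSize constraint, but I cannot figure out how to add a constraint based on my custom Compliance analyzer. Is this possible?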
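To see why the 80% check is expected to fail on this sample, the fraction can be worked out by hand. The sketch below uses plain Scala collections rather than Spark or deequ, and only mirrors the two SQL predicates from the analyzer above. The names `rows`, `placeholders`, and `populated` are illustrative and do not come from any library.

```scala
// Sample rows copied from the table in the question: (id, orig, scrubbed)
val rows: Seq[(Int, String, String)] = Seq(
  (1, "a", "a"), (2, "B", "b"), (3, "c", "c"), (4, "D", "d"),
  (5, "*", "XX"), (6, "$", "XX"), (7, "ZZ", "ZZ"), (8, "XX", "XX"),
  (9, "y", "y"), (10, "Z", "z")
)

// Hypothetical helper mirroring "NOT IN ('ZZ', 'XX') AND IS NOT NULL"
val placeholders = Set("ZZ", "XX")
def populated(value: String): Boolean =
  value != null && !placeholders.contains(value)

// The "where" filter: keep only rows whose original value was populated
val eligible = rows.filter { case (_, orig, _) => populated(orig) }

// The compliance predicate: rows whose scrubbed value is still populated
val compliant = eligible.count { case (_, _, scrubbed) => populated(scrubbed) }

val fraction = compliant.toDouble / eligible.size
println(s"$compliant of ${eligible.size} eligible rows are populated: $fraction")
```

Rows 7 and 8 are excluded by the filter on `orig`, and rows 5 and 6 fail the predicate on `scrubbed`, so 6 of the 8 remaining rows comply, which is below the 0.8 threshold.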

scala apache-spark amazon-deequ
1 Answer

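I found a solution that seems to work, in case anyone is interested. The answer is to create a custom constraint rather than a custom analyzer. Here is the working code: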

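import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.VerificationResult.checkResultsAsDataFrame
import com.amazon.deequ.checks.{Check, CheckLevel}
import com.amazon.deequ.constraints.Constraint

// Compliance constraint: among rows where `orig` is populated,
// at least 80% must also have a populated `scrubbed` value.
val myConstraint = Constraint.complianceConstraint(
    "my constraint",
    "`scrubbed` NOT IN ('ZZ', 'XX') AND `scrubbed` IS NOT NULL",
    (fraction: Double) => fraction >= 0.8,
    Some("`orig` NOT IN ('ZZ', 'XX') AND `orig` IS NOT NULL"),  // where filter
    Some("no peeking")  // hint
)

val myVerificationResult: VerificationResult = VerificationSuite()
    .onData(df)
    .addCheck(
        Check(CheckLevel.Error, "Review Check")
            .addConstraint(myConstraint)
    )
    .run()

// Flatten the check results into a DataFrame for inspection.
val result = checkResultsAsDataFrame(spark, myVerificationResult)
result.show(truncate = true)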

The result is exactly as expected:

+------------+-----------+------------+--------------------+-----------------+--------------------+
|       check|check_level|check_status|          constraint|constraint_status|  constraint_message|
+------------+-----------+------------+--------------------+-----------------+--------------------+
|Review Check|      Error|       Error|ComplianceConstra...|          Failure|Value: 0.75 does ...|
+------------+-----------+------------+--------------------+-----------------+--------------------+
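As a possible alternative, the same constraint can probably be expressed through the Check builder itself, without constructing a Constraint by hand. The sketch below assumes the installed deequ version provides `Check.satisfies` and the `.where` filter on its result, so it is worth verifying against your version before relying on it. It is written for spark-shell or a notebook. The constraint name "populatedAfterScrubbing" is borrowed from the question, and the local SparkSession setup is only there to make the example self-contained.

```scala
import org.apache.spark.sql.SparkSession
import com.amazon.deequ.VerificationSuite
import com.amazon.deequ.checks.{Check, CheckLevel, CheckStatus}

// Local session for illustration; in spark-shell this reuses the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("compliance-check-sketch")
  .getOrCreate()
import spark.implicits._

// Sample data copied from the question.
val df = Seq(
  (1, "a", "a"), (2, "B", "b"), (3, "c", "c"), (4, "D", "d"),
  (5, "*", "XX"), (6, "$", "XX"), (7, "ZZ", "ZZ"), (8, "XX", "XX"),
  (9, "y", "y"), (10, "Z", "z")
).toDF("id", "orig", "scrubbed")

// satisfies(...) adds a compliance constraint on the SQL predicate;
// where(...) restricts it to rows whose original value was populated.
val check = Check(CheckLevel.Error, "Review Check")
  .satisfies(
    "`scrubbed` NOT IN ('ZZ', 'XX') AND `scrubbed` IS NOT NULL",
    "populatedAfterScrubbing",
    (fraction: Double) => fraction >= 0.8
  )
  .where("`orig` NOT IN ('ZZ', 'XX') AND `orig` IS NOT NULL")

val result = VerificationSuite()
  .onData(df)
  .addCheck(check)
  .run()

// Overall status of the verification run (a CheckStatus value).
println(result.status)
```

If this behaves as assumed, it should report the same failing check as the hand-built constraint, since the predicate, filter, and threshold are identical.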