I am trying to retrieve a table of Cramer's V correlation coefficients for a set of fields using the following SQL query:
WITH var_pairs AS (
WITH vars AS (SELECT n FROM unnest(ARRAY['Performance Score', 'state', 'sex', 'maritaldesc', 'citizendesc', 'Hispanic/Latino', 'racedesc', 'Reason For Term', 'Employment Status', 'department', 'position', 'Manager Name', 'Employee Source']) AS n)
SELECT vars1.n AS var1, vars2.n AS var2
FROM vars AS vars1 CROSS JOIN vars AS vars2
)
SELECT
(WITH
-- Contingency table
observed AS (
SELECT
var1 AS x,
var2 AS y,
COUNT(*) AS observed
FROM hr_dataset
GROUP BY var1, var2
),
-- Sum of the rows of the contingency table
row_total AS (
SELECT
x,
SUM(observed) AS row_total
FROM observed
GROUP BY x
),
-- Sum of the columns of the contingency table
col_total AS (
SELECT
y,
SUM(observed) AS col_total
FROM observed
GROUP BY y
),
-- Total number of observations
grand_total AS (
SELECT SUM(observed) AS grand_total
FROM observed
),
-- Expected frequencies
expected AS (
SELECT
observed.x,
observed.y,
(row_total.row_total * col_total.col_total) / grand_total.grand_total AS expected
FROM
observed
INNER JOIN row_total USING(x)
INNER JOIN col_total USING(y)
CROSS JOIN grand_total
),
-- Chi-square statistics
chi_sq AS (
SELECT
SUM(POWER(observed.observed - expected.expected, 2) / expected.expected) AS chi_sq
FROM
observed
INNER JOIN expected USING(x, y)
),
count_x AS (
SELECT
count(DISTINCT x) AS count_x
FROM
observed
),
count_y AS (
SELECT
count(DISTINCT y) AS count_y
FROM
observed
)
-- Cramer's correlation coefficient
SELECT
SQRT(chi_sq / (grand_total * (least(count_x, count_y) - 1))) AS "Cramer's V"
FROM
chi_sq
CROSS JOIN grand_total
CROSS JOIN count_x
CROSS JOIN count_y
)
FROM var_pairs
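The server responds: ERROR: division by zero. As I understand it, I am passing the parameters var1 and var2 to the subquery incorrectly. What is the correct way to do this? I am using Postgres.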
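No. This is a division-by-zero error, which means that at some point you are evaluating an expression of the form
a / b
where b is 0. Division by zero is a special case in mathematics.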
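Suppose you divide a by 0 and call the result r:
a / 0 = r
Now multiply both sides of the equation by 0, which gives:
a = 0 * r
So for which r does a = 0 * r hold? If a is 0, it holds for every r. If a is not 0, it holds for no r.
Therefore a / 0 is meaningless if a <> 0, and could be any value if a = 0.
So division by zero is a special case in mathematics, and one that programming languages have to deal with.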
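Since your denominator does not depend on the values of var1 or var2, the problem is entirely independent of how you pass them; the problem is that you are dividing by zero. To fix it, make sure you never divide by zero: check whether the denominator is zero and, if it is, return a default value; otherwise perform the division. For example: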
case
when denominator = 0 then 100
else numerator / denominator
end
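Applied to the final expression of the query in the question, the same guard might look like the sketch below. One way that denominator can become zero is when either variable has only a single distinct value, because least(count_x, count_y) - 1 is then 0. The sketch swaps the CASE expression for NULLIF, which is a design choice of this example and not part of the original answer: it assumes NULL is an acceptable result when the coefficient is undefined. The stats CTE and its values are hypothetical stand-ins for the chi_sq, grand_total, count_x and count_y CTEs.

```sql
-- Hypothetical stand-in values; in the real query these come from the
-- chi_sq, grand_total, count_x and count_y CTEs.
WITH stats (chi_sq, grand_total, count_x, count_y) AS (
    VALUES
        (12.5::numeric, 100::numeric, 3, 4),  -- ordinary case
        (0::numeric,    100::numeric, 1, 1)   -- single category: denominator is 0
)
SELECT
    chi_sq,
    -- NULLIF(d, 0) returns NULL when d is 0, so the division yields NULL
    -- instead of raising "division by zero".
    SQRT(chi_sq / NULLIF(grand_total * (LEAST(count_x, count_y) - 1), 0)) AS cramers_v
FROM stats;
```

For the first row the denominator is 100 * 2 = 200, so the result is the square root of 0.0625, which is 0.25. For the second row the denominator is 0, so the result is NULL. If a numeric default is preferred over NULL, the CASE form above, or COALESCE around the whole expression, would serve the same purpose.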