Algorithmic systems now set prices across auto insurance, credit, and lending markets, and regulators increasingly require firms to demonstrate that these systems do not discriminate against protected groups. The standard audit regresses the pricing output on a protected attribute and legitimate rating factors, then tests the resulting coefficient using ordinary least squares (OLS) standard errors. We show that this approach is structurally invalid. Pricing algorithms are usually deterministic, so residuals reflect approximation error rather than sampling variability, which makes classical standard errors unreliable in both direction and magnitude. We derive correct asymptotic variance estimators for OLS and generalized linear model (GLM) audit regressions and the correct cross-covariance formula for proxy discrimination testing. In an application to quoted premiums from 34 Illinois auto insurers, every insurer fails the conditional demographic parity test, with minority zip codes paying $34 to $158 more per year than comparable-risk white zip codes. The standard proxy discrimination formula flags zero insurers, whereas our corrected formula identifies all 34 as statistically significant, of which 16 exceed the substantive threshold. Our framework provides statistically valid audit tools for any deterministic algorithmic system subject to regression-based fairness testing.
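To make the standard audit concrete, the following is a minimal sketch of the regression the abstract criticizes, run on synthetic data. The pricing rule `price`, the `territory` proxy variable, and all numeric values are hypothetical and chosen only for illustration. The sketch reproduces the classical OLS standard errors and shows that the residuals are nonzero even though the pricing rule contains no random noise. It does not implement the corrected variance estimators, whose form is not given in the abstract.

```python
import numpy as np

def price(risk, territory):
    # Hypothetical deterministic pricing rule: identical inputs always
    # yield an identical premium (no sampling noise anywhere).
    return 500.0 * np.exp(0.4 * risk) + 60.0 * territory

def classical_ols_audit(premium, protected, risk):
    # Standard audit: regress premium on an intercept, the protected
    # attribute, and a legitimate rating factor.
    X = np.column_stack([np.ones_like(risk), protected, risk])
    beta, *_ = np.linalg.lstsq(X, premium, rcond=None)
    resid = premium - X @ beta
    n, k = X.shape
    # Classical (homoskedastic) variance estimate, which treats the
    # residuals as if they were random sampling error.
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, resid

rng = np.random.default_rng(0)
n = 2000
risk = rng.normal(size=n)
protected = rng.binomial(1, 0.3, size=n).astype(float)
# Assumed proxy: a territory flag that is more common in the protected group.
territory = (rng.uniform(size=n) < 0.2 + 0.5 * protected).astype(float)

premium = price(risk, territory)
beta, se, resid = classical_ols_audit(premium, protected, risk)
t_stat = beta[1] / se[1]  # classical t-statistic on the protected attribute

print(f"coefficient on protected attribute: {beta[1]:.2f}")
print(f"classical standard error:           {se[1]:.2f}")
print(f"classical t-statistic:              {t_stat:.2f}")
```

Calling `price` again on the same inputs returns exactly the same premiums, so the residuals here come entirely from fitting a linear model to a nonlinear rule. That is the situation in which, according to the abstract, the classical standard error above should not be trusted.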