Attenuation bias -- the systematic underestimation of regression coefficients due to measurement errors in input variables -- affects astronomical data-driven models. For linear regression, this problem was solved by treating the true input values as latent variables to be estimated alongside the model parameters. In this paper, we show that neural networks suffer from the same attenuation bias and that the latent variable solution generalizes directly to neural networks. We introduce LatentNN, a method that jointly optimizes network parameters and latent input values by maximizing the joint likelihood of observing both inputs and outputs. We demonstrate the correction on one-dimensional regression, multivariate inputs with correlated features, and stellar spectroscopy applications. LatentNN reduces attenuation bias across a range of signal-to-noise ratios where standard neural networks show large bias. This provides a framework for improved neural network inference in the low signal-to-noise regime characteristic of astronomical data. The bias correction is most effective when measurement errors are less than roughly half the intrinsic data range; it becomes less effective in the regime of very low signal-to-noise and few informative features. Code is available at https://github.com/tingyuansen/LatentNN.
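To make the latent variable idea concrete, the following is a minimal sketch for the linear case only, not the LatentNN implementation. It assumes a slope-only model with known Gaussian errors on both inputs and outputs, and it minimizes the joint negative log-likelihood by alternating closed-form updates rather than by gradient descent on a network. The function names (`fit_ols_slope`, `fit_latent_slope`) and all numerical settings are illustrative assumptions.

```python
import numpy as np


def fit_ols_slope(x_obs, y_obs):
    """Least-squares slope through the origin, ignoring input noise."""
    return np.sum(x_obs * y_obs) / np.sum(x_obs ** 2)


def fit_latent_slope(x_obs, y_obs, sigma_x, sigma_y, n_iter=500):
    """Jointly estimate the slope w and the latent inputs z by minimizing
        sum((x_obs - z)**2) / sigma_x**2 + sum((y_obs - w * z)**2) / sigma_y**2
    with alternating closed-form updates (coordinate descent)."""
    a = 1.0 / sigma_x ** 2  # weight of the input term
    b = 1.0 / sigma_y ** 2  # weight of the output term
    w = fit_ols_slope(x_obs, y_obs)  # start from the attenuated estimate
    for _ in range(n_iter):
        # Best latent inputs for the current slope.
        z = (a * x_obs + b * w * y_obs) / (a + b * w ** 2)
        # Best slope for the current latent inputs.
        w = np.sum(z * y_obs) / np.sum(z ** 2)
    return w, z


# Synthetic data; every value below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 5000
true_slope = 2.0
sigma_x = 0.5  # measurement error on the inputs
sigma_y = 0.5  # measurement error on the outputs

x_true = rng.normal(0.0, 1.0, n)  # unit intrinsic scatter
x_obs = x_true + rng.normal(0.0, sigma_x, n)
y_obs = true_slope * x_true + rng.normal(0.0, sigma_y, n)

ols_slope = fit_ols_slope(x_obs, y_obs)
latent_slope, z_hat = fit_latent_slope(x_obs, y_obs, sigma_x, sigma_y)

# With unit-variance true inputs, the ordinary fit is expected to be
# attenuated by roughly 1 / (1 + sigma_x**2).
print("true slope:       ", true_slope)
print("ordinary fit:     ", ols_slope)
print("latent-input fit: ", latent_slope)
```

In this sketch the ordinary fit is pulled toward zero, while the latent-input fit should land close to the true slope. Replacing `w * z` with a neural network and the closed-form updates with a gradient-based optimizer over both the weights and `z` gives the general shape of the approach described in the abstract, under the same assumption of known measurement errors.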