We introduce an efficient method for learning linear models from uncertain data, where uncertainty is represented as a set of possible variations in the data, leading to predictive multiplicity. Our approach leverages abstract interpretation and zonotopes, a type of convex polytope, to compactly represent these dataset variations, enabling the symbolic execution of gradient descent on all possible worlds simultaneously. We develop techniques to ensure that this process converges to a fixed point and derive closed-form solutions for this fixed point. Our method provides sound over-approximations of all possible optimal models and viable prediction ranges. We demonstrate the effectiveness of our approach through theoretical and empirical analysis, highlighting its potential to reason about model and prediction uncertainty due to data quality issues in training data.
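The core idea can be illustrated with a minimal sketch, which is not the paper's implementation but an assumed simplification: represent uncertain training labels as a zonotope (a center vector plus a generator matrix), and propagate that set exactly through gradient descent for least-squares linear regression. Because the gradient-descent update is affine in both the weights and the labels, each step maps a zonotope to a zonotope with no over-approximation, so all "possible worlds" of the data are tracked simultaneously. The function and variable names (`gd_step`, `G_y`, etc.) are illustrative, not from the paper.

```python
import numpy as np

# A zonotope over R^d is {c + G.T @ eps : eps in [-1, 1]^k},
# where c has shape (d,) and the k generators are the rows of G (shape (k, d)).

def gd_step(c_w, G_w, X, y_c, G_y, lr):
    """One gradient-descent step on the least-squares loss, lifted to zonotopes.

    Concretely, w' = w - (lr/n) * X.T @ (X @ w - y)
                   = A @ w + (lr/n) * X.T @ y,   with A = I - (lr/n) * X.T @ X.
    Since this update is affine in (w, y), applying it to the centers and
    generators separately is exact (no approximation is introduced).
    """
    n, d = X.shape
    A = np.eye(d) - (lr / n) * X.T @ X
    c_new = A @ c_w + (lr / n) * X.T @ y_c          # new center
    G_new = np.vstack([G_w @ A.T,                   # old generators, contracted by A
                       (lr / n) * G_y @ X])         # fresh generators from label uncertainty
    return c_new, G_new

# Toy problem: noiseless nominal labels, each label independently perturbable by ±0.1.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
w_true = np.array([1.0, -2.0])
y_c = X @ w_true
G_y = 0.1 * np.eye(20)

c_w, G_w = np.zeros(2), np.zeros((0, 2))
for _ in range(500):
    c_w, G_w = gd_step(c_w, G_w, X, y_c, G_y, lr=0.1)

# The center converges to the least-squares fit on the nominal labels,
# and the per-coordinate radius bounds how far any "possible world" model can stray.
radius = np.abs(G_w).sum(axis=0)
```

When the learning rate makes `A` a contraction, the old generators shrink geometrically while each step adds fresh ones, so the total radius converges to a fixed point; this is the affine-update special case of the convergence and closed-form fixed-point analysis the abstract refers to.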