We introduce a new framework for data denoising, partially inspired by martingale optimal transport. For a given noisy distribution (the data), our approach involves finding the closest distribution to it among all distributions which 1) have a particular prescribed structure (expressed by requiring they lie in a particular domain), and 2) are self-consistent with the data. We show that this amounts to maximizing the variance among measures in the domain which are dominated in convex order by the data. For particular choices of the domain, this problem and a relaxed version of it, in which the self-consistency condition is removed, are intimately related to various classical approaches to denoising. We prove that our general problem has certain desirable features: solutions exist under mild assumptions, have certain robustness properties, and, for very simple domains, coincide with solutions to the relaxed problem. We also introduce a novel relationship between distributions, termed Kantorovich dominance, which retains certain aspects of the convex order while being a weaker, more robust, and easier-to-verify condition. Building on this, we propose and analyze a new denoising problem by substituting the convex order in the previously described framework with Kantorovich dominance. We demonstrate that this revised problem shares some characteristics with the full convex order problem but offers enhanced stability, greater computational efficiency, and, in specific domains, more meaningful solutions. Finally, we present simple numerical examples illustrating solutions for both the full convex order problem and the Kantorovich dominance problem.
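The reduction to variance maximization can be sketched as follows. This assumes, since the abstract does not spell it out, that closeness is measured by quadratic cost and that self-consistency means the data $Y \sim \mu$ and the denoised variable $X \sim \nu$ are coupled as a martingale, $\mathbb{E}[Y \mid X] = X$. The symbols $\mu$, $\nu$, $X$, $Y$ are notation introduced here for illustration only.

```latex
% Quadratic cost of a martingale coupling (X, Y) with E[Y | X] = X.
\begin{align*}
\mathbb{E}|Y - X|^2
  &= \mathbb{E}|Y|^2 - 2\,\mathbb{E}[X \cdot Y] + \mathbb{E}|X|^2 \\
  &= \mathbb{E}|Y|^2 - 2\,\mathbb{E}\bigl[X \cdot \mathbb{E}[Y \mid X]\bigr] + \mathbb{E}|X|^2 \\
  &= \mathbb{E}|Y|^2 - \mathbb{E}|X|^2 .
\end{align*}
```

Under these assumptions, $\mathbb{E}|Y|^2$ is fixed by the data and the martingale condition forces $\mathbb{E}X = \mathbb{E}Y$, so minimizing the cost over admissible $\nu$ is the same as maximizing the variance of $\nu$. By Strassen's theorem, a martingale coupling of $\nu$ and $\mu$ exists exactly when $\nu$ is dominated by $\mu$ in convex order, which is how the convex order constraint enters.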