We introduce a formulation of the optimal transport problem for distributions on function spaces, in which the stochastic map between functional domains is partially represented by an (infinite-dimensional) Hilbert-Schmidt operator mapping one Hilbert space of functions to another. In many machine learning tasks, data can be naturally viewed as samples drawn from spaces of functions, such as curves and surfaces, in high dimensions. Optimal transport for functional data analysis provides a useful framework for treating such domains. Since probability measures in infinite-dimensional spaces generally lack absolute continuity (that is, with respect to non-degenerate Gaussian measures), the Monge map of standard optimal transport theory for finite-dimensional spaces may not exist. We approach the optimal transport problem in infinite dimensions through a suitable regularization technique: we restrict the class of transport maps to a Hilbert-Schmidt space of operators. To this end, we develop an efficient algorithm for finding the stochastic transport map between functional domains and provide theoretical guarantees on the existence, uniqueness, and consistency of our estimate of the Hilbert-Schmidt operator. We validate our method on synthetic datasets and examine the functional properties of the transport map. Experiments on real-world datasets of robot arm trajectories further demonstrate the effectiveness of our method in domain adaptation applications.
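To make the idea of restricting the transport map to a regularized linear operator more concrete, the following is a minimal finite-dimensional sketch, not the algorithm of this paper. It assumes each function has already been reduced to a vector of basis coefficients, so that the operator becomes a matrix `T` and the Hilbert-Schmidt norm becomes the Frobenius norm. It alternates between a coupling update and a ridge-type operator update. The entropic (Sinkhorn) coupling, the objective weights `eta` and `eps`, and all function names are assumptions introduced purely for illustration.

```python
import numpy as np

def pairwise_cost(T, X, Y):
    """Squared distance ||T x_i - y_j||^2 for every source/target pair."""
    TX = X @ T.T                                   # (n, d2) pushed-forward sources
    return ((TX[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)

def sinkhorn(cost, eps=0.1, n_iter=200):
    """Entropic coupling between two uniform empirical measures (assumed choice)."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    scale = max(cost.max(), 1e-12)                 # rescale cost to avoid underflow
    K = np.exp(-cost / (eps * scale))
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ridge_operator(P, X, Y, eta):
    """Minimize sum_ij P_ij ||T x_i - y_j||^2 + eta ||T||_F^2 in closed form."""
    r = P.sum(axis=1)                              # row marginals of the coupling
    A = X.T @ (r[:, None] * X)                     # (d1, d1) weighted second moment
    B = Y.T @ P.T @ X                              # (d2, d1) weighted cross moment
    # Stationarity gives T (A + eta I) = B; A is symmetric, so solve the transpose.
    return np.linalg.solve(A + eta * np.eye(A.shape[0]), B.T).T

def fit_transport(X, Y, eta=1e-2, eps=0.1, n_outer=20):
    """Alternate coupling and operator updates; returns (T, P)."""
    T = np.eye(Y.shape[1], X.shape[1])             # hypothetical initialization
    for _ in range(n_outer):
        P = sinkhorn(pairwise_cost(T, X, Y), eps)
        T = ridge_operator(P, X, Y, eta)
    return T, P
```

Under these assumptions, `fit_transport(X, Y)` takes an `(n, d1)` array of source coefficients and an `(m, d2)` array of target coefficients and returns a `(d2, d1)` matrix together with an `(n, m)` coupling. The penalty `eta` plays the role of the regularization that keeps the operator norm finite. The alternating scheme is non-convex, so the result may depend on the initialization.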