Supervised learning with probabilistic morphisms and kernel mean embeddings

In this paper I propose the concept of a correct loss function in a generative model of supervised learning for an input space $\mathcal{X}$ and a label space $\mathcal{Y}$, both of which are measurable spaces. A correct loss function in such a model must correctly measure the discrepancy between elements of a hypothesis space $\mathcal{H}$ of possible predictors and the supervisor operator, which may not belong to $\mathcal{H}$. To define correct loss functions, I propose a characterization of a regular conditional probability measure $\mu_{\mathcal{Y}|\mathcal{X}}$ for a probability measure $\mu$ on $\mathcal{X} \times \mathcal{Y}$ relative to the projection $\Pi_{\mathcal{X}}: \mathcal{X}\times\mathcal{Y}\to \mathcal{X}$ as a solution of a linear operator equation. If $\mathcal{Y}$ is a separable metrizable topological space with the Borel $\sigma$-algebra $\mathcal{B}(\mathcal{Y})$, I propose another characterization, based on kernel mean embeddings, of a regular conditional probability measure $\mu_{\mathcal{Y}|\mathcal{X}}$ as a minimizer of a mean square error on the space of Markov kernels from $\mathcal{X}$ to $\mathcal{Y}$, which are called probabilistic morphisms. Using these results, and using inner measure to quantify the generalizability of a learning algorithm, I extend a result of Cucker and Smale on the learnability of a regression model to the setting of a conditional probability estimation problem. I also give a variant of Vapnik's regularization method for solving stochastic ill-posed problems, using inner measure, and present its applications.
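
As a minimal illustration of the mean-square-error characterization, and not the construction used in the paper, consider finite spaces $\mathcal{X}$ and $\mathcal{Y}$. A Markov kernel is then a row-stochastic matrix, and the kernel mean embedding of a distribution $p$ on $\mathcal{Y}$ is determined by the Gram matrix $K$ of a positive definite kernel on $\mathcal{Y}$. The sketch below assumes a Gaussian kernel and a toy joint distribution; every name in it (`gaussian_gram`, `mse_loss`, `MU`, and so on) is hypothetical. It evaluates the loss $\mathbb{E}_\mu \| k(y,\cdot) - \mathfrak{M}(h(x)) \|^2$, where $\mathfrak{M}$ denotes the kernel mean embedding (notation assumed here), and compares the conditional distribution against randomly drawn Markov kernels.

```python
import math
import random


def gaussian_gram(points, bandwidth=1.0):
    """Gram matrix K[i][j] = exp(-(y_i - y_j)^2 / (2 * bandwidth^2))."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in points]
            for a in points]


def conditional(mu):
    """Row-normalize a joint pmf mu[x][y] into the Markov kernel mu(y | x)."""
    return [[p / sum(row) for p in row] for row in mu]


def mse_loss(h, mu, K):
    """Expected squared RKHS distance between k(y, .) and the embedding of h(x)."""
    n = len(K)
    total = 0.0
    for x, row in enumerate(mu):
        p = h[x]
        # Squared norm of the embedding of p: p^T K p.
        norm_sq = sum(p[i] * K[i][j] * p[j] for i in range(n) for j in range(n))
        for y, weight in enumerate(row):
            # Inner product <k(y, .), embedding of p> = sum_i p_i K[i][y].
            cross = sum(p[i] * K[i][y] for i in range(n))
            total += weight * (K[y][y] - 2.0 * cross + norm_sq)
    return total


def random_markov_kernel(n_x, n_y, rng):
    """Draw a row-stochastic matrix by normalizing uniform random weights."""
    rows = []
    for _ in range(n_x):
        w = [rng.random() for _ in range(n_y)]
        s = sum(w)
        rows.append([v / s for v in w])
    return rows


# Toy data (assumed): three label values and a joint pmf on a 2 x 3 grid.
Y_POINTS = [0.0, 1.0, 2.0]
MU = [[0.10, 0.25, 0.05],
      [0.30, 0.10, 0.20]]

K = gaussian_gram(Y_POINTS)
H_STAR = conditional(MU)            # candidate minimizer: mu(y | x)
LOSS_STAR = mse_loss(H_STAR, MU, K)

if __name__ == "__main__":
    rng = random.Random(0)
    worst_gap = min(mse_loss(random_markov_kernel(2, 3, rng), MU, K) - LOSS_STAR
                    for _ in range(1000))
    print("smallest observed loss gap over random kernels:", worst_gap)
```

In this finite setting the gap between the loss of any Markov kernel $h$ and the loss of the conditional distribution equals $\sum_x \mu(x)\,\| \mathfrak{M}(h(x)) - \mathfrak{M}(\mu(\cdot|x)) \|^2$, so it is nonnegative. Since the Gaussian Gram matrix on distinct points is strictly positive definite, the gap vanishes only when $h(x) = \mu(\cdot|x)$ for every $x$ with $\mu(x) > 0$.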
