This paper studies the statistical model of the non-centered mixture of scaled Gaussian distributions (NC-MSG). Using the Fisher-Rao information geometry associated with this distribution, we derive a Riemannian gradient descent algorithm and apply it to two minimization problems. The first is the minimization of a regularized negative log-likelihood (NLL), in which the regularization trades off between a white Gaussian distribution and the NC-MSG. We give conditions on the regularization that guarantee the existence of a minimum for this problem without any assumption on the samples. We then derive the Kullback-Leibler (KL) divergence between two NC-MSGs, which lets us define a second minimization problem: computing the center of mass of several NC-MSGs. The proposed Riemannian gradient descent algorithm is used to solve this second problem as well. Numerical experiments show the good performance and speed of the Riemannian gradient descent on both problems. Finally, a nearest centroid classifier is implemented using the KL divergence and its associated center of mass. Applied to the large-scale Breizhcrops dataset, this classifier achieves good accuracy and is robust to rigid transformations of the test set.
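To make the nearest centroid idea concrete, here is a minimal sketch of a classifier that assigns a distribution to the class whose center of mass is closest in KL divergence. It is not the method of this paper: as a stand-in for the NC-MSG it assumes plain multivariate Gaussians, whose KL divergence and KL center of mass (the minimizer of the summed divergences from the class members to the center) are available in closed form by moment matching, so no Riemannian gradient descent is needed. The function names `gaussian_kl`, `kl_center_of_mass`, `fit_centroids`, and `predict` are hypothetical and chosen for illustration only.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL(N(mu0, cov0) || N(mu1, cov1)) for full-rank covariances.

    Assumption: plain Gaussians are used here as a stand-in for the NC-MSG.
    """
    p = mu0.shape[0]
    diff = mu1 - mu0
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha_term = diff @ np.linalg.solve(cov1, diff)
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - p + logdet1 - logdet0)


def kl_center_of_mass(mus, covs):
    """Gaussian q minimizing sum_i KL(p_i || q), obtained by moment matching."""
    mus = np.asarray(mus, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mu = mus.mean(axis=0)
    centered = mus - mu
    # Average covariance plus the spread of the member means around the center.
    cov = covs.mean(axis=0) + centered.T @ centered / len(mus)
    return mu, cov


def fit_centroids(params, labels):
    """Compute one KL center of mass per class from (mean, covariance) pairs."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [prm for prm, lab in zip(params, labels) if lab == label]
        centroids[label] = kl_center_of_mass(
            [m for m, _ in members], [c for _, c in members]
        )
    return centroids


def predict(param, centroids):
    """Return the label of the centroid closest to `param` in KL divergence."""
    mu, cov = param
    return min(centroids, key=lambda lab: gaussian_kl(mu, cov, *centroids[lab]))
```

In use, each sample (for example, one time series) would first be summarized by estimated distribution parameters, then `fit_centroids` is called on the training parameters and `predict` on each test parameter. With the NC-MSG, both the divergence and the center of mass would have to be replaced by the quantities derived in the paper.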