We present several algorithms that learn a pattern of correspondence between two data sets when the goal is to match elements whose relationship belongs to a known parametric model. The motivating case study seeks to better understand micro-RNA regulation in the striatum of Huntington's disease model mice. The algorithms unfold in two stages. First, an optimal transport plan P and an optimal affine transformation are learned using the Sinkhorn-Knopp algorithm and mini-batch gradient descent. Second, P is exploited to derive either several co-clusters or several sets of matched elements. A simulation study illustrates how the algorithms work and perform, and an application to real data further illustrates their applicability and interest.
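To make the first stage concrete, the following is a minimal sketch of the Sinkhorn-Knopp iteration for an entropy-regularized transport plan P between two small point clouds. It is an illustration under stated assumptions, not the procedure of this paper: the squared Euclidean cost, the uniform marginals `a` and `b`, the regularization strength `reg`, the iteration count, and the toy data are all hypothetical choices, and the joint learning of the affine transformation by mini-batch gradient descent is not shown.

```python
import numpy as np

def sinkhorn_knopp(cost, a, b, reg=0.5, n_iter=200):
    """Return an entropy-regularized transport plan with marginals a and b.

    cost : (n, m) array of pairwise costs (assumed nonnegative)
    a, b : marginal weights, each summing to 1
    reg  : entropic regularization strength (hypothetical default)
    """
    K = np.exp(-cost / reg)            # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)              # rescale to match column marginals
        u = a / (K @ v)                # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Toy data (assumption): two unrelated Gaussian point clouds.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 2))
y = rng.normal(size=(4, 2))

# Squared Euclidean cost, rescaled to [0, 1] for numerical stability.
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(6, 1 / 6)                  # uniform weights on the first data set
b = np.full(4, 1 / 4)                  # uniform weights on the second data set
P = sinkhorn_knopp(cost, a, b)

# One simple way to read matches off P: the heaviest column in each row.
matches = P.argmax(axis=1)
```

In this sketch, `matches` pairs each element of the first data set with the element of the second that receives most of its mass. The second stage described above would instead derive co-clusters or sets of matched elements from P.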