Multi-task learning (MTL) is a methodology that aims to improve estimation and prediction performance by sharing common information among related tasks. MTL methods differ in what they assume about the relationships among tasks and in how they incorporate those assumptions. One assumption that arises naturally in practice is that tasks fall into clusters according to their characteristics. Under this assumption, the group fused regularization approach clusters the tasks by shrinking the differences among them, which allows common information to be transferred within the same cluster. However, this approach also transfers information between different clusters, which degrades estimation and prediction. To overcome this problem, we propose an MTL method with a centroid parameter that represents the cluster center of each task. Because this model separates the parameters for regression from the parameters for clustering, it can improve the estimation and prediction accuracy of the regression coefficient vectors. We show the effectiveness of the proposed method through Monte Carlo simulations and applications to real data.
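The contrast between the two approaches can be made concrete with a schematic pair of objectives. The abstract does not state either formula, so what follows is only an illustrative sketch under assumed notation: $T$ tasks, a per-task loss $L_m$, regression coefficient vectors $w_m$, centroid parameters $u_m$, and tuning parameters $\lambda, \lambda_1, \lambda_2 \ge 0$ are all assumptions introduced here, and the exact penalties used in the paper may differ.

```latex
% Assumed form of group fused regularization: the fusion penalty acts
% directly on the regression coefficients w_m, so the same penalty that
% clusters the tasks also shrinks the estimates toward one another.
\min_{w_1,\dots,w_T} \;
  \sum_{m=1}^{T} L_m(w_m)
  + \lambda \sum_{m < m'} \lVert w_m - w_{m'} \rVert_2

% Assumed form with centroid parameters: w_m is tied only to its own
% centroid u_m, and the fusion penalty that performs the clustering
% acts on the centroids rather than on the w_m themselves.
\min_{\substack{w_1,\dots,w_T \\ u_1,\dots,u_T}} \;
  \sum_{m=1}^{T} L_m(w_m)
  + \frac{\lambda_1}{2} \sum_{m=1}^{T} \lVert w_m - u_m \rVert_2^2
  + \lambda_2 \sum_{m < m'} \lVert u_m - u_{m'} \rVert_2
```

In this sketch, the first objective pulls every pair of coefficient vectors together, including pairs that belong to different clusters. The second objective fuses only the centroids, which is one way to read the abstract's statement that the model separates the parameters for regression from the parameters for clustering.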