Ordinal regression (OR) is the classification of ordinal data, in which the categorical target variable has a natural ordinal relation with respect to the explanatory variable. For $K$-class OR tasks, threshold methods learn a one-dimensional transformation (1DT) of the explanatory variable such that the 1DT values of observations of the explanatory variable largely preserve the order of the label values $1,\ldots,K$ of the corresponding observations of the target variable. They then assign a label prediction to each learned 1DT value through threshold labeling, that is, according to the rank of the interval containing the 1DT value among the $K$ intervals into which $(K-1)$ threshold parameters divide the real line. In this study, we propose a parallelizable algorithm for finding the optimal threshold labeling introduced in previous research, and derive sufficient conditions under which the algorithm successfully outputs the optimal threshold labeling. In our numerical experiment, using the proposed algorithm with parallel processing reduced the computation time of the whole learning process of a threshold method with the optimal threshold labeling to approximately 60\,\% of that required by an existing algorithm based on dynamic programming.
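As an illustration of the threshold labeling described above, the following minimal sketch maps a 1DT value to a label in $1,\ldots,K$ by counting how many of the $(K-1)$ sorted thresholds lie at or below it. The function name `threshold_labeling`, the example threshold values, and the convention that a value equal to a threshold falls into the upper interval are assumptions made for illustration; the sketch shows only the labeling step for given thresholds, not the proposed algorithm for finding the optimal thresholds.

```python
from bisect import bisect_right


def threshold_labeling(value, thresholds):
    """Return a label in 1..K for a 1DT value.

    `thresholds` must hold the K-1 threshold parameters in ascending order.
    The label is the rank of the interval that contains `value`.
    Assumption: a value equal to a threshold goes to the upper interval.
    """
    return bisect_right(thresholds, value) + 1


# Hypothetical example with K = 4 classes, hence 3 thresholds.
thresholds = [-1.0, 0.5, 2.0]
values = [-3.2, 0.0, 0.5, 1.9, 7.0]
labels = [threshold_labeling(v, thresholds) for v in values]
print(labels)
```

Because the thresholds are sorted, each lookup is a binary search, so labeling one 1DT value costs $O(\log K)$ time.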