Ordinal data are common in applied statistics. Although some model selection and regularization techniques for categorical predictors and ordinal response models have been developed in recent years, less work has addressed ordinal-on-ordinal regression. Motivated by Likert-type data from a consumer test and from a survey on the willingness to pay for luxury food products, we propose a strategy for smoothing and selecting ordinally scaled predictors in the cumulative logit model. First, the group lasso is modified by applying difference penalties to neighboring dummy coefficients, thus taking the predictors' ordinal structure into account. Second, a fused lasso-type penalty is presented that fuses predictor categories and selects factors. The performance of both approaches is evaluated in simulation studies and on real-world data.
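To make the two penalty ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each ordinal predictor is dummy-coded with its first category as the reference (coefficient fixed at 0), that the group-lasso-type penalty is the unweighted L2 norm of the first differences of each predictor's dummy coefficients, and that the fused-lasso-type penalty is the L1 norm of those same differences. The function names and the omission of group weights and tuning parameters are illustrative assumptions.

```python
import numpy as np


def first_differences(beta_j):
    """Differences between neighboring dummy coefficients of one predictor.

    Assumption: the reference category has coefficient 0, so it is
    prepended before differencing.
    """
    padded = np.concatenate(([0.0], np.asarray(beta_j, dtype=float)))
    return np.diff(padded)


def ordinal_group_penalty(betas):
    """Group-lasso-type penalty on differences (hypothetical, unweighted).

    One L2 norm per predictor: the norm is zero only if all of that
    predictor's coefficients are zero, which removes the whole factor.
    """
    return float(sum(np.linalg.norm(first_differences(b)) for b in betas))


def ordinal_fused_penalty(betas):
    """Fused-lasso-type penalty on differences (hypothetical, unweighted).

    An L1 norm on neighboring differences: a zero difference fuses two
    adjacent categories; all differences zero removes the factor.
    """
    return float(sum(np.abs(first_differences(b)).sum() for b in betas))


# Two predictors with four categories each (three dummy coefficients).
# The first has no effect; in the second, categories 2 and 3 share a level.
betas = [[0.0, 0.0, 0.0], [1.0, 1.0, 2.0]]
group_value = ordinal_group_penalty(betas)
fused_value = ordinal_fused_penalty(betas)
```

In this sketch the first predictor contributes nothing to either penalty, while the second has differences (1, 0, 1): the zero in the middle is what the fused penalty encourages when two adjacent categories have the same effect. In an actual fit, either penalty would be multiplied by a tuning parameter and added to the negative log-likelihood of the cumulative logit model.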