Some of the simplest, yet most frequently used predictors in statistics and machine learning use weighted linear combinations of features. Such linear predictors can model non-linear relationships between features by adding interaction terms corresponding to the products of all pairs of features. We consider the problem of accurately estimating coefficients for interaction terms in linear predictors. We hypothesize that the coefficients for different interaction terms have an approximate low-dimensional structure and represent each feature by a latent vector in a low-dimensional space. This low-dimensional representation can be viewed as a structured regularization approach that further mitigates overfitting in high-dimensional settings beyond standard regularizers such as the lasso and elastic net. We demonstrate that our approach, called LIT-LVM, achieves superior prediction accuracy compared to the elastic net, hierarchical lasso, and factorization machines on a wide variety of simulated and real data, particularly when the number of interaction terms is high compared to the number of samples. LIT-LVM also provides low-dimensional latent representations for features that are useful for visualizing and analyzing their relationships.
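The abstract does not give LIT-LVM's exact objective, but the core idea can be sketched: fit a linear predictor with a full matrix of pairwise interaction coefficients, and add a structured penalty that pulls that matrix toward a low-rank factorization built from per-feature latent vectors. The sketch below is a minimal, hypothetical illustration of that idea, not the paper's implementation; the penalty form `lam * ||Theta - V V^T||_F^2`, the ridge term, and all names (`fit_low_rank_interactions`, learning rates, etc.) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: p features whose true interaction coefficients
# have rank-k structure, matching the paper's low-dimensional hypothesis.
n, p, k = 200, 10, 2
X = rng.normal(size=(n, p))
V_true = rng.normal(size=(p, k))
Theta_true = V_true @ V_true.T        # low-rank interaction coefficient matrix
w_true = rng.normal(size=p)
y = X @ w_true + np.einsum('ni,ij,nj->n', X, Theta_true, X) \
    + 0.1 * rng.normal(size=n)

def fit_low_rank_interactions(X, y, k=2, lam=10.0, alpha=0.01,
                              n_iter=500, lr=1e-3):
    """Gradient descent on squared loss with (hypothetical) penalties:
    an L2/ridge term alpha on w and Theta, plus a structured term
    (lam/2) * ||Theta - V V^T||_F^2 that pulls the full interaction
    matrix Theta toward a rank-k factorization with latent vectors V."""
    n, p = X.shape
    w = np.zeros(p)
    Theta = np.zeros((p, p))
    V = 0.01 * rng.normal(size=(p, k))
    for _ in range(n_iter):
        pred = X @ w + np.einsum('ni,ij,nj->n', X, Theta, X)
        r = pred - y                                   # residuals
        grad_w = X.T @ r / n + alpha * w
        grad_Theta = (np.einsum('n,ni,nj->ij', r, X, X) / n
                      + alpha * Theta + lam * (Theta - V @ V.T))
        grad_V = -2.0 * lam * (Theta - V @ V.T) @ V    # Theta stays symmetric
        w -= lr * grad_w
        Theta -= lr * grad_Theta
        V -= lr * grad_V
    return w, Theta, V

w_hat, Theta_hat, V_hat = fit_low_rank_interactions(X, y, k=2)
pred = X @ w_hat + np.einsum('ni,ij,nj->n', X, Theta_hat, X)
mse = np.mean((pred - y) ** 2)
```

The rows of `V_hat` are the per-feature latent vectors the abstract mentions; with `k=2` they can be scatter-plotted directly to visualize feature relationships. Setting `lam=0` recovers an ordinary ridge-penalized interaction model, so `lam` controls how strongly the low-dimensional structure is imposed.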