Max-Linear Regression by Convex Programming

We consider the multivariate max-linear regression problem, in which the model parameters $\boldsymbol{\beta}_{1},\dotsc,\boldsymbol{\beta}_{k}\in\mathbb{R}^{p}$ are to be estimated from $n$ independent samples of the (noisy) observations $y = \max_{1\leq j \leq k} \boldsymbol{\beta}_{j}^{\mathsf{T}} \boldsymbol{x} + \mathrm{noise}$. The max-linear model vastly generalizes the conventional linear model and can approximate any convex function to arbitrary accuracy when the number of linear models $k$ is large enough. However, the inherent nonlinearity of the max-linear model makes estimating the regression parameters computationally challenging. In particular, no estimator based on convex programming is known in the literature. We formulate and analyze a scalable convex program, given by anchored regression (AR), as the estimator for the max-linear regression problem. Under the standard Gaussian observation setting, we present a non-asymptotic performance guarantee showing that the convex program recovers the parameters with high probability. When the $k$ linear components are equally likely to achieve the maximum, our result shows that the number of noise-free observations sufficient for exact recovery scales as $k^{4}p$ up to a logarithmic factor. This sample complexity coincides with that of alternating minimization (Ghosh et al., 2021). Moreover, the same sample complexity applies when the observations are corrupted with arbitrary deterministic noise. We provide empirical results showing that our method performs as our theoretical result predicts and is competitive with the alternating minimization algorithm, particularly in the presence of multiplicative Bernoulli noise. Furthermore, we show empirically that a recursive application of AR can significantly improve the estimation accuracy.
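To make the observation model concrete, the following is a minimal sketch of how samples $y = \max_{1\leq j \leq k} \boldsymbol{\beta}_{j}^{\mathsf{T}} \boldsymbol{x} + \mathrm{noise}$ could be simulated with standard Gaussian covariates. It illustrates only the data model, not the anchored regression estimator. The function names `max_linear` and `sample_observations`, the additive Gaussian noise, and the choice of standard basis vectors as parameters are assumptions made for this illustration and do not come from the paper.

```python
import random


def max_linear(betas, x):
    """Evaluate max_j <beta_j, x>; return the value and the index attaining it."""
    scores = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    j_star = max(range(len(scores)), key=scores.__getitem__)
    return scores[j_star], j_star


def sample_observations(betas, n, noise_std=0.0, seed=0):
    """Draw n samples (x, y) with x ~ N(0, I_p) and y = max_j <beta_j, x> + noise.

    Additive Gaussian noise is an illustrative assumption; noise_std=0.0
    gives the noise-free setting.
    """
    rng = random.Random(seed)
    p = len(betas[0])
    xs, ys, winners = [], [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(p)]
        value, j_star = max_linear(betas, x)
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        xs.append(x)
        ys.append(value + noise)
        winners.append(j_star)
    return xs, ys, winners


if __name__ == "__main__":
    # Hypothetical parameters: k = 3 standard basis vectors in R^3, so that
    # y is the largest coordinate of x and, by symmetry, each linear
    # component is equally likely to achieve the maximum.
    betas = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    xs, ys, winners = sample_observations(betas, n=3000, seed=1)
    print([winners.count(j) / len(winners) for j in range(len(betas))])
```

With these hypothetical parameters, the printed fractions should each be close to $1/3$, which corresponds to the balanced case in which the $k^{4}p$ sample complexity is stated.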
