We present and analyze an algorithm for vector-valued regression problems in which the input and output spaces may be infinite-dimensional. The algorithm is a randomized adaptation of reduced rank regression, a technique that optimally learns a low-rank vector-valued function (i.e., an operator) from sampled data via regularized empirical risk minimization with rank constraints. We propose Gaussian sketching techniques for both the primal and the dual optimization objectives, yielding Randomized Reduced Rank Regression (R4) estimators that are efficient and accurate. For each of our R4 algorithms, we prove that the resulting regularized empirical risk is, in expectation with respect to the randomness of the sketch, arbitrarily close to the optimal value when the hyperparameters are properly tuned. Numerical experiments illustrate the tightness of our bounds and show advantages in two distinct scenarios: (i) solving a vector-valued regression problem using synthetic and large-scale neuroscience datasets, and (ii) regressing the Koopman operator of a nonlinear stochastic dynamical system.
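To make the idea of a sketched, rank-constrained, regularized regression concrete, the following is a minimal finite-dimensional illustration in NumPy. It is not the paper's R4 algorithm: it assumes plain matrices `X` (inputs) and `Y` (outputs), a ridge penalty `reg`, an unnormalized squared-error risk, and a generic Gaussian randomized range finder with power iterations in place of the exact eigendecomposition. The function names and the parameters `oversampling` and `power_iters` are hypothetical choices made for this sketch.

```python
import numpy as np


def regularized_risk(X, Y, W, reg):
    """Unnormalized regularized empirical risk ||Y - XW||_F^2 + reg * ||W||_F^2."""
    return np.sum((Y - X @ W) ** 2) + reg * np.sum(W ** 2)


def ridge_rrr_exact(X, Y, rank, reg):
    """Rank-constrained ridge regression via an exact eigendecomposition.

    The minimizer is W_ridge @ V @ V.T, where V holds the top `rank`
    eigenvectors of M = Y^T X (X^T X + reg I)^{-1} X^T Y.
    """
    d = X.shape[1]
    C = X.T @ X + reg * np.eye(d)
    W_ridge = np.linalg.solve(C, X.T @ Y)      # full-rank ridge solution, (d, p)
    M = Y.T @ (X @ W_ridge)                    # (p, p), symmetric PSD up to rounding
    M = (M + M.T) / 2
    _, evecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    V = evecs[:, -rank:]                       # top `rank` eigenvectors
    return W_ridge @ V @ V.T


def ridge_rrr_sketched(X, Y, rank, reg, oversampling=10, power_iters=2, seed=0):
    """Same estimator, with the leading eigenspace of M found by Gaussian sketching."""
    rng = np.random.default_rng(seed)
    d, p = X.shape[1], Y.shape[1]
    C = X.T @ X + reg * np.eye(d)
    W_ridge = np.linalg.solve(C, X.T @ Y)

    def apply_M(Z):
        # Applies M to a block of vectors without forming M explicitly.
        return Y.T @ (X @ (W_ridge @ Z))

    k = min(p, rank + oversampling)            # sketch size
    omega = rng.standard_normal((p, k))        # Gaussian sketching matrix
    Q, _ = np.linalg.qr(apply_M(omega))
    for _ in range(power_iters):               # power iterations sharpen the subspace
        Q, _ = np.linalg.qr(apply_M(Q))

    T = Q.T @ apply_M(Q)                       # small (k, k) projected problem
    T = (T + T.T) / 2
    _, U = np.linalg.eigh(T)
    V = Q @ U[:, -rank:]                       # approximate top eigenvectors of M
    return W_ridge @ V @ V.T


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, d, p, r, reg = 500, 30, 40, 3, 1e-2
    X = rng.standard_normal((n, d))
    W_true = rng.standard_normal((d, r)) @ rng.standard_normal((r, p))
    Y = X @ W_true + 0.1 * rng.standard_normal((n, p))
    for name, fit in [("exact", ridge_rrr_exact), ("sketched", ridge_rrr_sketched)]:
        W = fit(X, Y, r, reg)
        print(name, regularized_risk(X, Y, W, reg))
```

In this toy setting the sketched estimator can never achieve a lower regularized risk than the exact one, since the exact solution is optimal among rank-`rank` matrices; comparing the two printed risks gives a rough feel for the kind of gap that the paper's bounds control in expectation over the sketch.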