Conditional stochastic optimization covers a variety of applications ranging from invariant learning and causal inference to meta-learning. However, constructing unbiased gradient estimators for such problems is challenging because of their compositional structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study its bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives under both smooth and non-smooth conditions. Our lower bound analysis shows that the sample complexities of BSGD cannot be improved for general convex and nonconvex objectives, except for smooth nonconvex objectives with a Lipschitz continuous gradient estimator. For this special setting, we propose an accelerated algorithm called biased SpiderBoost (BSpiderBoost) that matches the lower bound complexity. We further conduct numerical experiments on invariant logistic regression and model-agnostic meta-learning to illustrate the performance of BSGD and BSpiderBoost.
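The bias-variance tradeoff mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual conditional stochastic optimization form, min_x E_xi[ f( E_{eta|xi}[ g_eta(x, xi) ] ) ], and a plug-in gradient estimator that replaces the inner conditional expectation with an average over m inner samples. The toy problem (f(y) = y^2, g_eta(x, xi) = (1 + eta) x - xi), the function name `bsgd`, the step-size schedule, and all parameter values are assumptions made for illustration and do not come from the paper.

```python
import random


def bsgd(m, steps=5000, mu=2.0, noise=1.0, seed=0):
    """Toy biased SGD on F(x) = E_xi[(E_eta[(1 + eta) * x] - xi)^2].

    Assumed setup (illustrative only): xi ~ N(mu, 1), eta ~ N(0, noise^2),
    so the inner expectation is x - xi and the true minimizer is x = mu.
    `m` is the number of inner samples drawn per outer sample.
    """
    rng = random.Random(seed)
    x = 0.0
    for t in range(steps):
        xi = rng.gauss(mu, 1.0)  # one outer sample
        # m inner samples: noisy slopes (1 + eta_j) of g in x
        slopes = [1.0 + rng.gauss(0.0, noise) for _ in range(m)]
        a_bar = sum(slopes) / m   # averaged inner gradient, d g / d x
        g_bar = a_bar * x - xi    # averaged inner value, plug-in for E[g | xi]
        grad = a_bar * 2.0 * g_bar  # chain rule with f'(y) = 2y; biased for finite m
        x -= 0.5 / (t + 20) * grad  # decaying step size (an arbitrary choice)
    return x


if __name__ == "__main__":
    for m in (1, 5, 20):
        print(m, round(bsgd(m), 3))
```

In this toy problem the expected plug-in gradient is 2((1 + noise^2/m) x - mu), so the iterates settle near mu / (1 + noise^2/m) rather than mu. The bias shrinks as m grows, at the price of m inner samples per iteration, which is the tradeoff a sample-complexity analysis has to balance.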