Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning

Stochastic Approximation (SA) is a classical algorithm that has had a huge impact on signal processing since its early days, and nowadays on machine learning, because both fields must deal with large amounts of data observed with uncertainty. A prominent special case of SA is the popular stochastic (sub)gradient algorithm, which is the workhorse behind many important applications. A lesser-known fact is that the SA scheme also extends to non-stochastic-gradient algorithms such as compressed stochastic gradient, stochastic expectation-maximization, and a number of reinforcement learning algorithms. The aim of this article is to introduce the non-stochastic-gradient perspectives of SA to the signal processing and machine learning audiences by presenting design guidelines for SA algorithms backed by theory. Our central theme is a general framework that unifies existing theories of SA, including its non-asymptotic and asymptotic convergence results, and we demonstrate its application to popular non-stochastic-gradient algorithms. We build our analysis framework on classes of Lyapunov functions that satisfy a variety of mild conditions. We draw connections between non-stochastic-gradient algorithms and scenarios in which the Lyapunov function is smooth, convex, or strongly convex. Using this framework, we illustrate the convergence properties of non-stochastic-gradient algorithms with concrete examples. Extensions to emerging variance reduction techniques for improved sample complexity are also discussed.
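
To make the non-stochastic-gradient idea concrete, here is a minimal sketch of an SA recursion whose drift is not the gradient of any function. It is an illustration under stated assumptions, not the article's own algorithm: the matrix `A`, the vector `b`, the noise level, and the step-size schedule are all hypothetical choices. The mean field h(theta) = A theta - b has a non-symmetric Jacobian `A`, so it cannot be a gradient field, yet the recursion theta_{k+1} = theta_k - gamma_{k+1} H(theta_k, X_{k+1}) still approaches the root of h. For this toy problem, V(theta) = 0.5 * ||theta - theta*||^2 can serve as a Lyapunov function, because the symmetric part of `A` is positive definite.

```python
import random


def sa_recursion(noisy_drift, theta0, n_iters, step):
    """Generic SA loop: theta <- theta - gamma_k * H(theta, X_k)."""
    theta = list(theta0)
    for k in range(1, n_iters + 1):
        drift = noisy_drift(theta)   # one noisy observation of the mean field
        gamma = step(k)              # diminishing step size
        theta = [t - gamma * d for t, d in zip(theta, drift)]
    return theta


# Hypothetical toy problem: find theta with A @ theta = b.
# A is non-symmetric, so h(theta) = A @ theta - b is not a gradient field.
A = [[2.0, 1.0],
     [-1.0, 2.0]]
b = [1.0, 3.0]


def make_noisy_drift(rng, noise_std=0.5):
    """Return H(theta, X) = A @ theta - b + Gaussian noise (assumed noise model)."""
    def noisy_drift(theta):
        return [
            A[i][0] * theta[0] + A[i][1] * theta[1] - b[i]
            + rng.gauss(0.0, noise_std)
            for i in range(2)
        ]
    return noisy_drift


rng = random.Random(0)
theta_hat = sa_recursion(
    make_noisy_drift(rng),
    theta0=[0.0, 0.0],
    n_iters=20000,
    step=lambda k: 1.0 / (k + 10),   # assumed schedule, gamma_k ~ 1/k
)
theta_star = [-0.2, 1.4]             # exact solution of A @ theta = b
print(theta_hat, theta_star)
```

Swapping `make_noisy_drift` for another noisy oracle (for example a compressed gradient or a temporal-difference update) leaves `sa_recursion` unchanged, which is the sense in which a single SA template can cover several algorithm families.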

