We develop an algorithm for parameter-free stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate for the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which contain unavoidable excess logarithmic terms compared to their known-parameter counterparts. Our algorithm is conceptually simple, has high-probability guarantees, and is also partially adaptive to unknown gradient norms, smoothness, and strong convexity. At the heart of our results is a novel parameter-free certificate for SGD step size choice, and a time-uniform concentration result that assumes no a priori bounds on SGD iterates.
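For context, the textbook benchmark the abstract measures against (a standard fact, not stated in the abstract itself): for convex $f$ with stochastic gradient norms bounded by $G$ and initial distance $D = \|x_1 - x^\star\|$, averaged SGD with the known-parameter step size $\eta = D/(G\sqrt{T})$ satisfies
\[
\mathbb{E}\big[f(\bar{x}_T) - f(x^\star)\big] \le \frac{DG}{\sqrt{T}}.
\]
Parameter-free methods must approach this rate without knowing $D$ or $G$. A naive remedy, sketched below, runs SGD on a geometric grid of candidate step sizes and keeps the best run; covering $[\eta_{\min}, \eta_{\max}]$ multiplies the sample cost by roughly $\log_2(\eta_{\max}/\eta_{\min})$, the kind of logarithmic overhead the paper's certificate-based step-size selection is designed to avoid. The sketch is a conceptual stand-in only, not the paper's algorithm; the objective estimator, grid endpoints, and constants are hypothetical choices for illustration.

```python
# Conceptual stand-in for parameter-free step-size selection
# (NOT the paper's certificate method): run SGD over a geometric
# grid of step sizes and keep the best averaged iterate.
import numpy as np

def sgd_average(stoch_grad, x0, eta, T, rng):
    """Run T steps of fixed-step SGD from x0; return the averaged iterate."""
    x = x0.astype(float).copy()
    avg = np.zeros_like(x)
    for _ in range(T):
        x = x - eta * stoch_grad(x, rng)
        avg += x / T
    return avg

def grid_search_sgd(stoch_grad, f_estimate, x0, T,
                    eta_min=1e-6, eta_max=1e2, rng=None):
    """Try eta_min, 2*eta_min, ..., up to eta_max and return the
    averaged iterate with the smallest estimated objective value.
    The grid has ~log2(eta_max/eta_min) points, i.e. a log-factor
    overhead in total samples; diverged runs are skipped."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_x, best_val = x0, np.inf
    eta = eta_min
    while eta <= eta_max:
        # Suppress overflow warnings from runs where eta is too large.
        with np.errstate(over="ignore", invalid="ignore"):
            x_bar = sgd_average(stoch_grad, x0, eta, T, rng)
            val = f_estimate(x_bar, rng)
        if np.isfinite(val) and val < best_val:
            best_x, best_val = x_bar, val
        eta *= 2.0
    return best_x

# Toy problem: f(x) = ||x||^2 / 2 with additive gradient noise.
stoch_grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
f_estimate = lambda x, rng: 0.5 * np.linalg.norm(x) ** 2  # noiseless estimate
print(np.round(grid_search_sgd(stoch_grad, f_estimate, np.ones(5), T=500), 3))
```

As the abstract indicates, the paper instead certifies a step-size choice from the SGD run itself, which is what brings the overhead down from logarithmic to double-logarithmic.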