Can gradient descent be accelerated by just choosing better stepsizes? Surprisingly, the answer is yes. This short expository article provides an accessible introduction to this phenomenon of stepsize hedging.
翻译:暂无翻译