Consider an online convex optimization problem in which the loss functions are self-concordant barriers, smooth relative to a convex function $h$, and possibly non-Lipschitz. We analyze the regret of online mirror descent with $h$ and, based on this result, prove the following in a unified manner. Let $T$ denote the time horizon and $d$ the parameter dimension. 1. For online portfolio selection, the regret of $\widetilde{\text{EG}}$, a variant of exponentiated gradient due to Helmbold et al., is $\tilde{O} ( T^{2/3} d^{1/3} )$ when $T > 4 d / \log d$. This improves on the original $\tilde{O} ( T^{3/4} d^{1/2} )$ regret bound for $\widetilde{\text{EG}}$. 2. For online portfolio selection, the regret of online mirror descent with the logarithmic barrier is $\tilde{O}(\sqrt{T d})$. This matches the regret bound of Soft-Bayes due to Orseau et al. up to logarithmic terms. 3. For online learning of quantum states with the logarithmic loss, the regret of online mirror descent with the log-determinant function is also $\tilde{O} ( \sqrt{T d} )$. Its per-iteration time is shorter than that of all existing algorithms we know of.
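To make the second result concrete, the following is a minimal sketch of online mirror descent with the logarithmic barrier $h(x) = -\sum_i \log x_i$ for online portfolio selection, where the loss at round $t$ is $f_t(x) = -\log \langle a_t, x \rangle$ for a price-relative vector $a_t$. It is an illustration under stated assumptions, not the authors' implementation: the function names, the bisection solver for the normalizing multiplier, and the step size `eta` are all hypothetical choices, and the step-size tuning needed for the $\tilde{O}(\sqrt{T d})$ bound is not reproduced here.

```python
import numpy as np


def log_barrier_omd_step(x, grad, eta, iters=100):
    """One mirror descent step on the simplex with h(x) = -sum(log x_i).

    The first-order condition of
        argmin_y  eta * <grad, y> + D_h(y, x)   s.t.  sum(y) = 1
    gives y_i = 1 / (1/x_i + eta * grad_i + lam), where the scalar lam is
    chosen so that y sums to one. We find lam by bisection (an assumed,
    simple solver; any 1-D root finder would do).
    """
    c = 1.0 / x + eta * grad
    c_min = c.min()
    # At lam = 1 - c_min the largest term equals 1, so the sum is >= 1.
    # At lam = d - c_min every term is <= 1/d, so the sum is <= 1.
    lo, hi = 1.0 - c_min, x.size - c_min
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sum(1.0 / (c + mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    y = 1.0 / (c + hi)
    # Renormalize to remove the residual bisection error.
    return y / y.sum()


def run_portfolio_omd(returns, eta):
    """Run the sketch on a (T, d) array of positive price relatives."""
    _, d = returns.shape
    x = np.full(d, 1.0 / d)  # start from the uniform portfolio
    total_loss = 0.0
    for a in returns:
        wealth = a @ x
        total_loss += -np.log(wealth)  # logarithmic loss of this round
        grad = -a / wealth             # gradient of -log<a, x> at x
        x = log_barrier_omd_step(x, grad, eta)
    return x, total_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = rng.uniform(0.5, 1.5, size=(50, 4))  # synthetic data
    x_final, loss = run_portfolio_omd(returns, eta=0.1)
    print(x_final, loss)
```

Because the barrier blows up at the boundary of the simplex, every iterate stays strictly positive, so the gradient $-a_t / \langle a_t, x_t \rangle$ is always well defined even though the loss is not Lipschitz.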