OptiRoulette Optimizer: A New Stochastic Meta-Optimizer for up to 5.3x Faster Convergence

This paper presents OptiRoulette, a stochastic meta-optimizer that selects update rules during training instead of fixing a single optimizer. The method combines warmup optimizer locking, random sampling from an active optimizer pool, compatibility-aware learning-rate scaling during optimizer transitions, and failure-aware pool replacement. OptiRoulette is implemented as a drop-in component compatible with torch.optim.Optimizer and is packaged for pip installation. We report completed 10-seed results on five image-classification suites: CIFAR-100, CIFAR-100-C, SVHN, Tiny ImageNet, and Caltech-256. Against a single-optimizer AdamW baseline, OptiRoulette improves mean test accuracy from 0.6734 to 0.7656 on CIFAR-100 (+9.22 percentage points), from 0.2904 to 0.3355 on CIFAR-100-C (+4.52), from 0.9667 to 0.9756 on SVHN (+0.89), from 0.5669 to 0.6642 on Tiny ImageNet (+9.73), and from 0.5946 to 0.6920 on Caltech-256 (+9.74). Its main advantage is convergence reliability at higher targets: it reaches a validation accuracy of 0.75 on CIFAR-100/CIFAR-100-C, 0.96 on SVHN, 0.65 on Tiny ImageNet, and 0.62 on Caltech-256 in 10/10 runs, while the AdamW baseline reaches none of these targets within the training budget. On targets reached by both methods, OptiRoulette also reduces time-to-target (e.g., Caltech-256 at 0.59: 25.7 vs. 77.0 epochs). Paired-seed deltas are positive on all datasets; CIFAR-100-C test ROC-AUC is the only metric for which the difference is not statistically significant in the current 10-seed study.
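
The four mechanisms listed above can be illustrated with a small, framework-free sketch of the selection logic. This is not the OptiRoulette package's API: the class name RouletteSelector, its parameters, the per-pair learning-rate multiplier table, and the rule "replace an optimizer after max_failures reported failures" are all hypothetical choices made for illustration. The abstract does not say whether selection happens per step or per epoch, how compatibility is measured, or what counts as a failure, so each call to next_optimizer below simply stands for one selection event.

```python
import random


class RouletteSelector:
    """Toy scheduler deciding which optimizer is active (all names hypothetical)."""

    def __init__(self, active, backups, warmup_optimizer, warmup_steps,
                 lr_scale, max_failures=2, seed=0):
        self.active = list(active)        # pool sampled from after warmup
        self.backups = list(backups)      # replacements for failing optimizers
        self.warmup_optimizer = warmup_optimizer
        self.warmup_steps = warmup_steps
        self.lr_scale = dict(lr_scale)    # (previous, new) -> LR multiplier
        self.max_failures = max_failures
        self.failures = {name: 0 for name in self.active}
        self.rng = random.Random(seed)    # seeded so runs are reproducible
        self.current = warmup_optimizer
        self.step_count = 0

    def next_optimizer(self):
        """Return (optimizer_name, lr_multiplier) for the next selection event."""
        previous = self.current
        if self.step_count < self.warmup_steps:
            # Warmup locking: always use the same optimizer early on.
            chosen = self.warmup_optimizer
        else:
            # Roulette phase: uniform random draw from the active pool.
            chosen = self.rng.choice(self.active)
        self.step_count += 1
        self.current = chosen
        if chosen == previous:
            multiplier = 1.0
        else:
            # Assumed form of transition scaling: a lookup per optimizer pair,
            # defaulting to no scaling when the pair is not listed.
            multiplier = self.lr_scale.get((previous, chosen), 1.0)
        return chosen, multiplier

    def report_failure(self, name):
        """Record a failure; swap in a backup once the limit is reached."""
        if name not in self.failures:
            return
        self.failures[name] += 1
        if self.failures[name] >= self.max_failures and self.backups:
            replacement = self.backups.pop(0)
            self.active[self.active.index(name)] = replacement
            del self.failures[name]
            self.failures[replacement] = 0


selector = RouletteSelector(
    active=["adamw", "sgd"],
    backups=["lion"],
    warmup_optimizer="adamw",
    warmup_steps=3,
    lr_scale={("adamw", "sgd"): 10.0, ("sgd", "adamw"): 0.1},
)
schedule = [selector.next_optimizer() for _ in range(8)]
print(schedule)
```

In a real training loop, the returned name would pick one of several already-constructed optimizers over the same parameters, and the multiplier would be applied to that optimizer's learning rate before its update. Keeping the selection logic separate from the update rules, as done here, is only one possible design.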
