Modern optimization problems in scientific and engineering domains often rely on expensive black-box evaluations, such as those arising in physical simulations or deep learning pipelines, where gradient information is unavailable or unreliable. In these settings, conventional optimization methods quickly become impractical due to prohibitive computational costs and poor scalability. We propose ALMAB-DC, a unified and modular framework for scalable black-box optimization that integrates active learning, multi-armed bandits, and distributed computing, with optional GPU acceleration. The framework leverages surrogate modeling and information-theoretic acquisition functions to guide informative sample selection, while bandit-based controllers dynamically allocate computational resources across candidate evaluations in a statistically principled manner. These decisions are executed asynchronously within a distributed multi-agent system, enabling high-throughput parallel evaluation. We establish theoretical regret bounds for both UCB-based and Thompson-sampling-based variants and develop a scalability analysis grounded in Amdahl's and Gustafson's laws. Empirical results across synthetic benchmarks, reinforcement learning tasks, and scientific simulation problems demonstrate that ALMAB-DC consistently outperforms state-of-the-art black-box optimizers. By design, ALMAB-DC is modular, uncertainty-aware, and extensible, making it particularly well suited for high-dimensional, resource-intensive optimization challenges.
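The bandit-based allocation mentioned above can be illustrated with a minimal UCB1-style selection rule. This is an illustrative sketch only, not the ALMAB-DC controller itself; the function name `ucb1_select` and the exploration constant `c` are assumptions for the example.

```python
import math

def ucb1_select(counts, means, c=2.0):
    """Pick the arm maximizing mean + sqrt(c * ln(t) / n), a UCB1-style rule.

    counts[i] -- number of times arm i has been evaluated so far
    means[i]  -- empirical mean reward of arm i
    """
    t = sum(counts)
    # Evaluate each arm at least once before applying the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal evaluation counts the exploration bonus is identical across arms, so the rule reduces to picking the best empirical mean; as counts diverge, under-explored arms receive a larger bonus and are revisited.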