Learning-based congestion controllers offer better adaptability compared to traditional heuristics. However, the unreliability of learning techniques can cause learning-based controllers to behave poorly, creating a need for formal guarantees. While methods for formally verifying learned congestion controllers exist, these methods offer binary feedback that cannot optimize the controller toward better behavior. We improve this state-of-the-art via Canopy, a new property-driven framework that integrates learning with formal reasoning in the learning loop. Canopy uses novel quantitative certification with an abstract interpreter to guide the training process, rewarding models, and evaluating robust and safe model performance on worst-case inputs. Our evaluation demonstrates that unlike state-of-the-art learned controllers, Canopy-trained controllers provide both adaptability and worst-case reliability across a range of network conditions.
翻译:相较于传统启发式方法,基于学习的拥塞控制器展现出更优的适应性。然而,学习技术的不可靠性可能导致基于学习的控制器表现不佳,这催生了对形式化保证的需求。尽管存在形式化验证学习型拥塞控制器的方法,但这些方法仅能提供二元反馈,无法引导控制器优化其行为。我们通过Canopy改进了这一前沿现状——这是一个在训练循环中融合学习与形式化推理的新型属性驱动框架。Canopy采用基于抽象解释器的量化验证新技术来指导训练过程,通过奖励机制评估模型在最坏情况输入下的鲁棒性与安全性表现。实验评估表明,与现有先进学习型控制器不同,经Canopy训练的控制器能在多种网络条件下同时提供卓越的适应性与最坏情况可靠性。