Data-driven algorithm design automates hyperparameter tuning, but its statistical foundations remain limited because model performance can depend on hyperparameters in implicit and highly non-smooth ways. Existing guarantees focus on the simple case of a one-dimensional (scalar) hyperparameter, leaving the practically important multi-dimensional setting unresolved. We address this open question by developing the first general framework for establishing generalization guarantees for tuning multi-dimensional hyperparameters in data-driven settings. Our approach strengthens the generalization guarantee framework for semi-algebraic function classes by exploiting tools from real algebraic geometry, yielding sharper, more broadly applicable guarantees. We then extend the analysis to hyperparameter tuning based on the validation loss under minimal assumptions, and derive improved bounds when additional structure is available. Finally, we demonstrate the scope of the framework with new learnability results, including data-driven weighted group lasso and weighted fused lasso.
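To make the tuning problem concrete, the following is a minimal, self-contained sketch of the setting the abstract describes: selecting a multi-dimensional hyperparameter (here, per-group regularization weights of a weighted group lasso) by grid search on the validation loss. The solver (proximal gradient / ISTA with group soft-thresholding), the synthetic data, and the hyperparameter grid are all illustrative assumptions, not the paper's method or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: two feature groups, only the first is informative.
n, d = 200, 10
X = rng.normal(size=(n, d))
beta_true = np.concatenate([rng.normal(size=5), np.zeros(5)])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Train/validation split; the validation loss drives hyperparameter selection.
Xtr, Xva = X[:150], X[150:]
ytr, yva = y[:150], y[150:]
groups = [np.arange(0, 5), np.arange(5, 10)]

def fit_weighted_group_lasso(X, y, lams, n_iter=500):
    """Proximal gradient (ISTA) for the weighted group lasso
    min_b (1/2n)||y - Xb||^2 + sum_g lam_g ||b_g||_2."""
    n, d = X.shape
    b = np.zeros(d)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y) / n)  # gradient step on the squared loss
        for g, lam in zip(groups, lams):        # group soft-thresholding (prox step)
            nrm = np.linalg.norm(z[g])
            if nrm > 0:
                z[g] *= max(0.0, 1.0 - step * lam / nrm)
        b = z
    return b

# Grid search over the two-dimensional hyperparameter (lam_1, lam_2).
grid = [0.001, 0.01, 0.1, 1.0]
best_loss, best_lams = np.inf, None
for l1 in grid:
    for l2 in grid:
        b = fit_weighted_group_lasso(Xtr, ytr, (l1, l2))
        val_loss = np.mean((yva - Xva @ b) ** 2)
        if val_loss < best_loss:
            best_loss, best_lams = val_loss, (l1, l2)

print("selected (lam_1, lam_2):", best_lams, "validation loss:", best_loss)
```

The validation loss as a function of (lam_1, lam_2) is exactly the kind of implicit, piecewise-structured objective whose generalization behavior the paper's framework analyzes; the grid search above is only a crude stand-in for that analysis.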