Output-sensitive ERM-based techniques for data-driven algorithm design

Data-driven algorithm design is a promising, learning-based approach for beyond worst-case analysis of algorithms with tunable parameters. An important open problem is the design of computationally efficient data-driven algorithms for combinatorial algorithm families with multiple parameters. As one fixes the problem instance and varies the parameters, the "dual" loss function typically has a piecewise-decomposable structure, i.e., it is well-behaved except at certain sharp transition boundaries. In this work we initiate the study of techniques for developing efficient ERM (empirical risk minimization) learning algorithms for data-driven algorithm design by enumerating the pieces of the sum of the dual loss functions over a collection of problem instances. The running time of our approach scales with the actual number of pieces that appear, as opposed to worst-case upper bounds on the number of pieces. Our approach involves two novel ingredients: an output-sensitive algorithm, built on tools from computational geometry, for enumerating the polytopes induced by a set of hyperplanes, and an execution graph that compactly represents all the states the algorithm could attain over all possible parameter values. We illustrate our techniques by giving algorithms for pricing problems, linkage-based clustering, and dynamic-programming-based sequence alignment.
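The piece-enumeration idea can be illustrated with a one-parameter toy, which is not the multi-parameter algorithm of the paper. The sketch below assumes a simple posted-price model (a buyer with value v pays the price p if p <= v and nothing otherwise); the function name `erm_posted_price` and the model itself are illustrative assumptions. Under this model the summed revenue is piecewise linear in p with breakpoints at the sampled values, so ERM only needs to inspect one candidate per piece, and the work scales with the number of distinct values actually observed.

```python
from collections import Counter

def erm_posted_price(values):
    """Toy ERM for a single posted price (hypothetical helper, not from the paper).

    Assumes a buyer with value v pays `price` if price <= v, else nothing.
    On each piece between consecutive distinct values the total revenue is
    increasing in the price, so the maximum over a piece is attained at its
    right endpoint, which is a sampled value.
    """
    counts = Counter(values)
    best_price, best_revenue = 0.0, 0.0
    buyers_at_or_above = 0
    # Sweep breakpoints from highest to lowest, maintaining how many
    # sampled buyers would still purchase at the current breakpoint.
    for b in sorted(counts, reverse=True):
        buyers_at_or_above += counts[b]
        rev = b * buyers_at_or_above
        if rev > best_revenue:
            best_price, best_revenue = b, rev
    return best_price, best_revenue

if __name__ == "__main__":
    print(erm_posted_price([1.0, 2.0, 2.0, 5.0]))
```

The sweep visits each distinct sampled value once, so its cost depends on the pieces that actually appear rather than on a worst-case bound. With several parameters the pieces become polytopes cut out by hyperplanes, which is the setting the abstract's computational-geometry ingredient addresses.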

