ConfHit: Conformal Generative Design with Oracle Free Guarantees


Siddhartha Laghuvarapu, Ying Jin, Jimeng Sun

From arXiv; accepted at ICLR 2026.

The success of deep generative models in scientific discovery requires not only the ability to generate novel candidates but also reliable guarantees that these candidates indeed satisfy desired properties. Recent conformal-prediction methods offer a path to such guarantees, but their application to generative modeling in drug discovery is limited by budget constraints, lack of oracle access, and distribution shift. To this end, we introduce ConfHit, a distribution-free framework that provides validity guarantees under these conditions. ConfHit formalizes two central questions: (i) Certification: whether a generated batch can be guaranteed to contain at least one hit at a user-specified confidence level, and (ii) Design: whether the generated batch can be refined to a compact set without weakening this guarantee. ConfHit leverages weighted exchangeability between historical and generated samples to eliminate the need for an experimental oracle, constructs a multiple-sample, density-ratio-weighted conformal p-value to quantify statistical confidence in hits, and proposes a nested testing procedure to certify and refine candidate sets of multiple generated samples while maintaining statistical guarantees. Across representative generative molecule design tasks and a broad range of methods, ConfHit consistently delivers valid coverage guarantees at multiple confidence levels while maintaining compact certified sets, establishing a principled and reliable framework for generative modeling.
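
To make the density-ratio-weighted conformal p-value mentioned above more concrete, the following is a minimal sketch of a generic single-sample weighted conformal p-value. It is not ConfHit's multiple-sample construction or nested testing procedure, which the abstract does not spell out. The function name, the example scores, and the weights are hypothetical, and the sketch assumes that a larger score means a sample looks less like the calibration data and that the weights are density-ratio estimates supplied by the user.

```python
from typing import Sequence


def weighted_conformal_pvalue(
    calib_scores: Sequence[float],
    calib_weights: Sequence[float],
    test_score: float,
    test_weight: float,
) -> float:
    """Generic weighted conformal p-value (illustrative, not ConfHit's exact method).

    Assumptions (hypothetical, for illustration only):
      * larger score = less similar to the calibration samples;
      * weights are non-negative density-ratio estimates, one per sample.
    """
    if len(calib_scores) != len(calib_weights):
        raise ValueError("scores and weights must have the same length")
    if test_weight < 0 or any(w < 0 for w in calib_weights):
        raise ValueError("weights must be non-negative")

    total = sum(calib_weights) + test_weight
    if total <= 0:
        raise ValueError("total weight must be positive")

    # Weighted mass of calibration scores at least as large as the test score.
    mass = sum(w for s, w in zip(calib_scores, calib_weights) if s >= test_score)

    # The test point always counts toward its own tail, which keeps the
    # p-value strictly positive and conservative.
    return (mass + test_weight) / total


if __name__ == "__main__":
    scores = [0.1, 0.4, 0.7, 0.9]
    # With unit weights this reduces to the usual (1 + count) / (n + 1) form.
    print(weighted_conformal_pvalue(scores, [1.0] * 4, 0.8, 1.0))
    # Non-uniform weights shift mass toward samples resembling the test point.
    print(weighted_conformal_pvalue(scores, [2.0, 1.0, 1.0, 0.5], 0.5, 0.5))
```

With all weights equal to one, the formula reduces to the standard unweighted conformal p-value; non-uniform weights are what allow the comparison to account for a shift between historical and generated samples.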

