Weights to Code: Extracting Interpretable Algorithms from the Discrete Transformer


Discrete · Extraction · Algorithm · Code · Transformer


Yifan Zhang,Wei Bi,Kechi Zhang,Dongming Jin,Jie Fu,Zhi Jin

Algorithm extraction aims to synthesize executable programs directly from models trained on specific algorithmic tasks, enabling de novo algorithm discovery without relying on human-written code. However, extending this paradigm to Transformers is hindered by superposition, where entangled features encoded in overlapping directions obstruct the extraction of symbolic expressions. In this work, we propose the Discrete Transformer, an architecture explicitly engineered to bridge the gap between continuous representations and discrete symbolic logic. By enforcing a strict functional disentanglement, which constrains Numerical Attention to information routing and Numerical MLP to element-wise arithmetic, and by employing temperature-annealed sampling, our method facilitates the extraction of human-readable programs. Empirically, the Discrete Transformer not only achieves performance comparable to RNN-based baselines but, crucially, extends interpretability to continuous variable domains. Moreover, our analysis of the annealing process shows that the efficient discrete search undergoes a clear phase transition from exploration to exploitation. We further demonstrate that our method enables fine-grained control over synthesized programs by imposing inductive biases. Collectively, these findings establish the Discrete Transformer as a robust framework for demonstration-free algorithm discovery, offering a rigorous pathway toward Transformer interpretability.
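The abstract names temperature-annealed sampling without giving details. As a rough illustration only, the sketch below assumes a Gumbel-softmax style relaxation over a small set of candidate operations with an exponential temperature schedule; neither choice is confirmed by the source. The operation names, logits, schedule, and every function name here are hypothetical. The sketch shows how a high temperature keeps the choice spread across candidates (exploration) while a low temperature collapses it toward a single discrete choice (exploitation).

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Numerically stable softmax over logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed categorical sample: perturb logits with Gumbel noise, then apply softmax."""
    noisy = []
    for x in logits:
        u = max(rng.random(), 1e-12)  # keep u > 0 so that log(u) is defined
        noisy.append(x - math.log(-math.log(u)))
    return softmax_with_temperature(noisy, temperature)


def annealed_temperature(step, total_steps, t_start=5.0, t_end=0.05):
    """Exponential decay from t_start to t_end (assumed schedule, not from the paper)."""
    frac = step / max(total_steps - 1, 1)
    return t_start * (t_end / t_start) ** frac


# Hypothetical candidate operations and learned preferences (illustrative values).
candidate_ops = ["add", "sub", "mul"]
logits = [1.0, 2.0, 0.5]

# Noise-free view of the distribution at the two ends of the schedule.
early = softmax_with_temperature(logits, annealed_temperature(0, 10))
late = softmax_with_temperature(logits, annealed_temperature(9, 10))

# One stochastic relaxed sample at the final (low) temperature.
rng = random.Random(0)
sample = gumbel_softmax_sample(logits, annealed_temperature(9, 10), rng)
chosen_op = candidate_ops[sample.index(max(sample))]
```

At the start of the schedule `early` stays close to uniform, while `late` is nearly one-hot on the highest-logit candidate. Reading off the arg-max of a low-temperature sample, as `chosen_op` does, is one way such a relaxation could yield a discrete, human-readable choice.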

