Hierarchical Sparse Circuit Extraction from Billion-Parameter Language Models through Scalable Attribution Graph Decomposition - 专知论文

会员服务 ·

0

稀疏 · 图 · 语言模型化 · MoDELS · GNN ·

Hierarchical Sparse Circuit Extraction from Billion-Parameter Language Models through Scalable Attribution Graph Decomposition

翻译：暂无翻译

Mohammed Mudassir Uddin,Shahnawaz Alam,Mohammed Kaif Pasha

Extracting sparse circuits from billion-parameter transformers is constrained by $O(2^n)$ search cost and pervasive feature reuse across co-active pathways. Hierarchical Attribution Graph Decomposition (HAGD) addresses this through four stages: cross-layer transcoder training, spectral coarsening of attribution graphs, graph-neural-network (GNN)-guided hierarchical traversal, and causal intervention verification, reducing worst-case complexity to $O(n^2 \log n)$. Per-layer transcoders trained on the RedPajama corpus yield monosemantic dictionaries; gradient-activation products form weighted attribution graphs; normalized-Laplacian spectral clustering builds multi-resolution hierarchies; an attention-based GNN assigns circuit-membership scores at successive coarsening stages. Evaluation spans GPT-2 (117M-774M), Pythia (1.4B-6.9B), and Llama (7B-70B) across modular arithmetic, parity computation, integer sorting, coreference resolution (WinoGrande), commonsense reasoning (HellaSwag), and factual recall. Behavioral preservation reaches 91\% ($\pm$2.3\%) on modular arithmetic with 49-347-node circuits, while ACDC exhausts memory beyond 1.4B parameters. Cross-architecture transfer coefficients span 0.38-0.82, with within-family pairs (Llama-7B $\to$ Llama-70B) attaining 0.82. Limitations include omitted attention-head circuits, 15-20\% unexplained reconstruction variance, ablation-based validation circularity, and uncertain interpretability of circuits exceeding several hundred nodes.

翻译：暂无翻译

0

相关内容

【ICML2025】SparseLoRA：利用上下文稀疏性加速大语言模型微调

【ICML2025】SparseLoRA：利用上下文稀疏性加速大语言模型微调

专知会员服务

11+阅读 · 2025年6月23日

【ICML2024】揭示Graph Transformers 中的过全局化问题

【ICML2024】揭示Graph Transformers 中的过全局化问题

专知会员服务

21+阅读 · 2024年5月27日

【CVPR 2022】基于Transformer的图象风格化，StyTr2: Image Style Transfer with Transformers

【CVPR 2022】基于Transformer的图象风格化，StyTr2: Image Style Transfer with Transformers

专知会员服务

11+阅读 · 2022年3月19日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知会员服务

108+阅读 · 2020年8月30日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【东京大学】图采样，Sampling on Graphs: From Theory to Applications

【东京大学】图采样，Sampling on Graphs: From Theory to Applications

专知会员服务

19+阅读 · 2020年3月10日

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

专知会员服务

14+阅读 · 2019年10月25日

Natural Language Interface to Knowledge Graph (our experience) ，加州大学圣塔芭芭拉分校严锡峰副教授，CIPS ATT 16（2019）

Natural Language Interface to Knowledge Graph (our experience) ，加州大学圣塔芭芭拉分校严锡峰副教授，CIPS ATT 16（2019）

专知会员服务

16+阅读 · 2019年10月25日

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

专知

36+阅读 · 2022年2月7日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知

14+阅读 · 2020年8月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

综述：Image Caption 任务之语句多样性

综述：Image Caption 任务之语句多样性

PaperWeekly

22+阅读 · 2018年11月30日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

自然语言处理 | 使用Spacy 进行自然语言处理（二）

自然语言处理 | 使用Spacy 进行自然语言处理（二）

机器学习和数学

10+阅读 · 2018年8月27日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

论文报告 | Graph-based Neural Multi-Document Summarization

论文报告 | Graph-based Neural Multi-Document Summarization

科技创新与创业

15+阅读 · 2017年12月15日

Generative Adversarial Text to Image Synthesis论文解读

Generative Adversarial Text to Image Synthesis论文解读

统计学习与视觉计算组

13+阅读 · 2017年6月9日

曲面上图像处理的非局部变分模型与算法

国家自然科学基金

0+阅读 · 2017年12月31日

大规模多视角高维图像特征提取

国家自然科学基金

5+阅读 · 2017年12月31日

随机映射框架下的图像语义分析与提取技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于复杂语义的个性化图像集摘要研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多特征与水平集融合的遥感图像分割算法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于上下文感知和异质特征集成的SAR图像分割与评价

国家自然科学基金

2+阅读 · 2015年12月31日

数据驱动的人体图像语义分割研究

国家自然科学基金

5+阅读 · 2014年12月31日

基于关系语义的空间场景信息理解

国家自然科学基金

5+阅读 · 2014年12月31日

基于超像素稀疏表示的图像超分辨率方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

复杂场景中基于分数阶微积分的局部形状匹配方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

Evaluating the Interpretability of Sparse Autoencoders with Concept Annotations

Arxiv

0+阅读 · 6月23日

Do Sparse Autoencoders Learn Meaningful Concept Hierarchies?

Arxiv

0+阅读 · 6月22日

Numerical stability analysis of large language models

Arxiv

0+阅读 · 6月22日

Artic-O: End-to-End Articulated Object Reconstruction via Latent Geometry Learning

Arxiv

0+阅读 · 6月20日

Language-Instructed Vision Embeddings for Controllable and Generalizable Perception

Arxiv

0+阅读 · 6月17日

Exact Algorithms for Edge Deletion to Cactus Graphs and Weighted Variants

Arxiv

0+阅读 · 6月16日

$θ$-free matching covered graphs: characterization and consequences

Arxiv

0+阅读 · 6月14日

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Arxiv

10+阅读 · 2022年3月15日

Attributed Graph Clustering via Adaptive Graph Convolution

Arxiv

11+阅读 · 2019年6月4日

Hierarchical Graph Representation Learning with Differentiable Pooling

Hierarchical Graph Representation Learning with Differentiable Pooling

Arxiv

15+阅读 · 2018年6月26日

VIP会员

文章信息

相关主题

语言模型化

最新内容

无人机自主控制与人工智能：系统性综述

无人机自主控制与人工智能：系统性综述

专知会员服务

5+阅读 · 今天7:25

巡飞弹与反无人机系统——现代战场的两大支柱

巡飞弹与反无人机系统——现代战场的两大支柱

专知会员服务

2+阅读 · 今天6:54

《打造“黄金舰队”》57页报告

《打造“黄金舰队”》57页报告

专知会员服务

1+阅读 · 今天6:52

《北约数字教官网络发展路径》128页报告

《北约数字教官网络发展路径》128页报告

专知会员服务

1+阅读 · 今天6:33

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

6+阅读 · 6月25日

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

5+阅读 · 6月25日

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

8+阅读 · 6月25日

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

7+阅读 · 6月25日

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

8+阅读 · 6月25日

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

8+阅读 · 6月25日

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

9+阅读 · 6月25日

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

9+阅读 · 6月25日

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

9+阅读 · 6月25日

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

9+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

10+阅读 · 6月24日

相关VIP内容

【ICML2025】SparseLoRA：利用上下文稀疏性加速大语言模型微调

【ICML2025】SparseLoRA：利用上下文稀疏性加速大语言模型微调

专知会员服务

11+阅读 · 2025年6月23日

【ICML2024】揭示Graph Transformers 中的过全局化问题

【ICML2024】揭示Graph Transformers 中的过全局化问题

专知会员服务

21+阅读 · 2024年5月27日

【CVPR 2022】基于Transformer的图象风格化，StyTr2: Image Style Transfer with Transformers

【CVPR 2022】基于Transformer的图象风格化，StyTr2: Image Style Transfer with Transformers

专知会员服务

11+阅读 · 2022年3月19日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知会员服务

108+阅读 · 2020年8月30日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【东京大学】图采样，Sampling on Graphs: From Theory to Applications

【东京大学】图采样，Sampling on Graphs: From Theory to Applications

专知会员服务

19+阅读 · 2020年3月10日

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

From Data to Model Programming: Injecting Structured Priors for Knowledge Extraction，南加州大学计算机科学系任翔助理教授，CIPS ATT 16（2019）

专知会员服务

14+阅读 · 2019年10月25日

Natural Language Interface to Knowledge Graph (our experience) ，加州大学圣塔芭芭拉分校严锡峰副教授，CIPS ATT 16（2019）

Natural Language Interface to Knowledge Graph (our experience) ，加州大学圣塔芭芭拉分校严锡峰副教授，CIPS ATT 16（2019）

专知会员服务

16+阅读 · 2019年10月25日

热门VIP内容

开通专知VIP会员享更多权益服务

巡飞弹与反无人机系统——现代战场的两大支柱

《北约数字教官网络发展路径》128页报告

无人机自主控制与人工智能：系统性综述

《打造“黄金舰队”》57页报告

相关资讯

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

专知

36+阅读 · 2022年2月7日

Transformer模型-深度学习自然语言处理，17页ppt

Transformer模型-深度学习自然语言处理，17页ppt

专知

14+阅读 · 2020年8月30日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

综述：Image Caption 任务之语句多样性

综述：Image Caption 任务之语句多样性

PaperWeekly

22+阅读 · 2018年11月30日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

自然语言处理 | 使用Spacy 进行自然语言处理（二）

自然语言处理 | 使用Spacy 进行自然语言处理（二）

机器学习和数学

10+阅读 · 2018年8月27日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

【论文推荐】最新5篇图像分割（Image Segmentation）相关论文—多重假设、超像素分割、自监督、图、生成对抗网络

专知

27+阅读 · 2018年2月7日

论文报告 | Graph-based Neural Multi-Document Summarization

论文报告 | Graph-based Neural Multi-Document Summarization

科技创新与创业

15+阅读 · 2017年12月15日

Generative Adversarial Text to Image Synthesis论文解读

Generative Adversarial Text to Image Synthesis论文解读

统计学习与视觉计算组

13+阅读 · 2017年6月9日

相关论文

Evaluating the Interpretability of Sparse Autoencoders with Concept Annotations

Arxiv

0+阅读 · 6月23日

Do Sparse Autoencoders Learn Meaningful Concept Hierarchies?

Arxiv

0+阅读 · 6月22日

Numerical stability analysis of large language models

Arxiv

0+阅读 · 6月22日

Artic-O: End-to-End Articulated Object Reconstruction via Latent Geometry Learning

Arxiv

0+阅读 · 6月20日

Language-Instructed Vision Embeddings for Controllable and Generalizable Perception

Arxiv

0+阅读 · 6月17日

Exact Algorithms for Edge Deletion to Cactus Graphs and Weighted Variants

Arxiv

0+阅读 · 6月16日

$θ$-free matching covered graphs: characterization and consequences

Arxiv

0+阅读 · 6月14日

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Arxiv

10+阅读 · 2022年3月15日

Attributed Graph Clustering via Adaptive Graph Convolution

Arxiv

11+阅读 · 2019年6月4日

Hierarchical Graph Representation Learning with Differentiable Pooling

Hierarchical Graph Representation Learning with Differentiable Pooling

Arxiv

15+阅读 · 2018年6月26日

相关基金

曲面上图像处理的非局部变分模型与算法

国家自然科学基金

0+阅读 · 2017年12月31日

大规模多视角高维图像特征提取

国家自然科学基金

5+阅读 · 2017年12月31日

随机映射框架下的图像语义分析与提取技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于复杂语义的个性化图像集摘要研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于多特征与水平集融合的遥感图像分割算法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于上下文感知和异质特征集成的SAR图像分割与评价

国家自然科学基金

2+阅读 · 2015年12月31日

数据驱动的人体图像语义分割研究

国家自然科学基金

5+阅读 · 2014年12月31日

基于关系语义的空间场景信息理解

国家自然科学基金

5+阅读 · 2014年12月31日

基于超像素稀疏表示的图像超分辨率方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

复杂场景中基于分数阶微积分的局部形状匹配方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员