September 14, 2021

The Low-Dimensional Linear Geometry of Contextualized Word Representations

Evan Hernandez, Jacob Andreas

From arXiv. To be published in the 25th Conference on Computational Natural Language Learning (CoNLL).

Black-box probing models can reliably extract linguistic features like tense, number, and syntactic role from pretrained word representations. However, the manner in which these features are encoded in representations remains poorly understood. We present a systematic study of the linear geometry of contextualized word representations in ELMo and BERT. We show that a variety of linguistic features (including structured dependency relationships) are encoded in low-dimensional subspaces. We then refine this geometric picture, showing that there are hierarchical relations between the subspaces encoding general linguistic categories and more specific ones, and that low-dimensional feature encodings are distributed rather than aligned to individual neurons. Finally, we demonstrate that these linear subspaces are causally related to model behavior, and can be used to perform fine-grained manipulation of BERT's output distribution.
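As a rough illustration of what "encoded in a low-dimensional subspace" can mean, the sketch below builds synthetic vectors (not ELMo or BERT activations) in which a three-way label is carried by a hidden 2-D subspace of a 32-D space. It estimates that subspace from the centered class means with an SVD and compares a nearest-centroid probe on the 2-D projection against the same probe after the subspace is projected out. The class-mean SVD and the centroid probe are simple stand-ins chosen for this example, not the paper's probing procedure, and every name here (`make_synthetic_reps`, `feature_subspace`, `centroid_accuracy`) is hypothetical.

```python
import numpy as np


def make_synthetic_reps(n=4000, dim=32, n_classes=3, radius=6.0, seed=0):
    """Toy 'representations': unit Gaussian noise plus a class signal in a hidden 2-D subspace."""
    rng = np.random.default_rng(seed)
    # Orthonormal basis (dim, 2) of the hidden subspace that carries the label.
    basis, _ = np.linalg.qr(rng.normal(size=(dim, 2)))
    # Class centers evenly spaced on a circle inside that subspace.
    angles = 2 * np.pi * np.arange(n_classes) / n_classes
    centers = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    labels = rng.integers(0, n_classes, size=n)
    reps = rng.normal(size=(n, dim)) + centers[labels] @ basis.T
    return reps, labels


def class_means(reps, labels, n_classes):
    """Mean vector of each class, shape (n_classes, dim)."""
    return np.stack([reps[labels == c].mean(axis=0) for c in range(n_classes)])


def feature_subspace(reps, labels, n_classes, rank):
    """Orthonormal basis (dim, rank) for the top directions of the centered class means."""
    means = class_means(reps, labels, n_classes)
    centered = means - means.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T


def centroid_accuracy(train_x, train_y, test_x, test_y, n_classes):
    """Accuracy of a nearest-class-centroid probe fit on the training split."""
    means = class_means(train_x, train_y, n_classes)
    dists = ((test_x[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    return float((dists.argmin(axis=1) == test_y).mean())


reps, labels = make_synthetic_reps()
train_x, test_x = reps[:3000], reps[3000:]
train_y, test_y = labels[:3000], labels[3000:]

# Estimate a rank-2 feature subspace from the training split only.
U = feature_subspace(train_x, train_y, n_classes=3, rank=2)

# Probe that sees only the 2-D coordinates inside the estimated subspace.
low_dim_acc = centroid_accuracy(train_x @ U, train_y, test_x @ U, test_y, 3)


def ablate(x):
    """Remove the component of each vector that lies in the estimated subspace."""
    return x - (x @ U) @ U.T


# After ablation the training centroids coincide, so the probe falls to about chance.
ablated_acc = centroid_accuracy(ablate(train_x), train_y, ablate(test_x), test_y, 3)

print(f"2-D projection accuracy: {low_dim_acc:.3f}")
print(f"subspace-ablated accuracy: {ablated_acc:.3f}")
```

On this toy data the 2-D projection should keep the label almost perfectly recoverable, while removing those two directions leaves the probe near chance. That is the qualitative pattern the abstract describes for linguistic features, here reproduced only under the synthetic assumptions stated above.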
