USNID: A Framework for Unsupervised and Semi-supervised New Intent Discovery


April 16, 2023


Hanlei Zhang, Hua Xu, Xin Wang, Fei Long, Kai Gao

From arXiv, 14 pages, 5 figures

New intent discovery is of great value to natural language processing, as it enables a better understanding of user needs and friendlier services. However, most existing methods struggle to capture the complicated semantics of discrete text representations when little or no labeled data is available as prior knowledge. To tackle this problem, we propose USNID, a novel framework for unsupervised and semi-supervised new intent discovery, which has three key components. First, it makes full use of unsupervised or semi-supervised data to mine shallow semantic similarity relations and provide well-initialized representations for clustering. Second, it uses a centroid-guided clustering mechanism to address the issue of cluster allocation inconsistency and provide high-quality self-supervised targets for representation learning. Third, it captures high-level semantics in unsupervised or semi-supervised data to discover fine-grained intent-wise clusters by optimizing both cluster-level and instance-level objectives. We also propose an effective method for estimating the number of clusters in open-world scenarios, where the number of new intents is not known beforehand. USNID performs exceptionally well on several intent benchmark datasets, achieving new state-of-the-art results in unsupervised and semi-supervised new intent discovery and demonstrating robust performance with different cluster numbers.
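The abstract does not spell out how the centroid-guided mechanism works, so the following is only an illustrative sketch of the underlying problem and of one plausible remedy. It assumes that "cluster allocation inconsistency" refers to cluster indices being permuted between successive clustering rounds, and that starting k-means from the previous round's centroids is one way to keep indices stable. The function name `kmeans`, the toy data, and the simulated "drift" between rounds are all hypothetical and are not taken from the paper.

```python
import numpy as np


def kmeans(x, init_centroids, n_iter=20):
    """Plain Lloyd's k-means started from the given centroids (hypothetical helper)."""
    centroids = init_centroids.copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(len(centroids)):
            members = x[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centroids


rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
# Toy "representations": three well-separated blobs of 50 points each.
x_round1 = np.concatenate([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
# Simulate a small representation update between two training rounds.
x_round2 = x_round1 + 0.05 * rng.standard_normal(x_round1.shape)

# Round 1: initialize from one point per blob.
labels1, centroids1 = kmeans(x_round1, x_round1[[0, 50, 100]])

# Round 2, warm start: reuse the round-1 centroids as the initialization.
labels_warm, _ = kmeans(x_round2, centroids1)

# Round 2, cold start: same clusters, but the initial centroids are in a
# different order, so the cluster indices come out permuted.
labels_cold, _ = kmeans(x_round2, x_round2[[100, 0, 50]])

warm_agreement = float((labels_warm == labels1).mean())
cold_agreement = float((labels_cold == labels1).mean())
print("warm-start index agreement:", warm_agreement)
print("cold-start index agreement:", cold_agreement)
```

In this toy setup both round-2 runs recover the same three groups, but only the warm start keeps the same index for each group. If cluster indices are used as pseudo-labels for representation learning, a permutation between rounds would make the targets inconsistent even though the partition itself is unchanged, which is the kind of issue a centroid-guided scheme would be expected to avoid.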

