UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training - 专知论文

会员服务 ·

0

知识 (knowledge) · 数据集 · 多峰值 · 图 · 知识图谱 ·

2023 年 3 月 21 日

UKnow: A Unified Knowledge Protocol for Common-Sense Reasoning and Vision-Language Pre-training

翻译：UKnow：面向常识推理与视觉-语言预训练的统一知识协议

Biao Gong,Xiaoying Xie,Yutong Feng,Yiliang Lv,Yujun Shen,Deli Zhao

This work presents a unified knowledge protocol, called UKnow, which facilitates knowledge-based studies from the perspective of data. Particularly focusing on visual and linguistic modalities, we categorize data knowledge into five unit types, namely, in-image, in-text, cross-image, cross-text, and image-text, and set up an efficient pipeline to help construct the multimodal knowledge graph from any data collection. Thanks to the logical information naturally contained in knowledge graph, organizing datasets under UKnow format opens up more possibilities of data usage compared to the commonly used image-text pairs. Following UKnow protocol, we collect, from public international news, a large-scale multimodal knowledge graph dataset that consists of 1,388,568 nodes (with 571,791 vision-related ones) and 3,673,817 triplets. The dataset is also annotated with rich event tags, including 11 coarse labels and 9,185 fine labels. Experiments on four benchmarks demonstrate the potential of UKnow in supporting common-sense reasoning and boosting vision-language pre-training with a single dataset, benefiting from its unified form of knowledge organization. Code, dataset, and models will be made publicly available.

翻译：本文提出一种名为UKnow的统一知识协议，从数据视角推动基于知识的研究。本工作特别聚焦于视觉与语言模态，将数据知识划分为五种单元类型：图像内知识、文本内知识、跨图像知识、跨文本知识与图像-文本知识，并构建高效流水线以从任意数据集合中生成多模态知识图谱。得益于知识图谱天然蕴含的逻辑信息，与常用的图像-文本对相比，采用UKnow格式组织数据集可拓展更多数据应用可能性。遵循UKnow协议，我们从国际公共新闻数据中构建了大规模多模态知识图谱数据集，包含1,388,568个节点（其中571,791个为视觉相关节点）及3,673,817个三元组。该数据集同时标注了丰富的事件标签体系，涵盖11个粗粒度标签与9,185个细粒度标签。在四个基准测试上的实验表明，UKnow凭借其统一的知识组织形式，能够以单一数据集同时支持常识推理任务并提升视觉-语言预训练效果。相关代码、数据集与模型将公开发布。

0

相关内容

知识 (knowledge)

知识 (knowledge)

通过学习、实践或探索所获得的认识、判断或技能。

【ACL2022-华盛顿大学】生成知识促进常识推理，Generated Knowledge Prompting for Commonsense Reasoning

【ACL2022-华盛顿大学】生成知识促进常识推理，Generated Knowledge Prompting for Commonsense Reasoning

专知会员服务

26+阅读 · 2022年3月1日

UIUC韩家炜：从海量非结构化文本中挖掘结构化知识

UIUC韩家炜：从海量非结构化文本中挖掘结构化知识

专知会员服务

98+阅读 · 2021年12月30日

【USC2021】常识推理，47页ppt，Commonsense Reasoning in the Wild

专知会员服务

33+阅读 · 2021年10月9日

【知识图谱@EMNLP2020】Knowledge Graphs in NLP @ EMNLP 2020

【知识图谱@EMNLP2020】Knowledge Graphs in NLP @ EMNLP 2020

专知会员服务

43+阅读 · 2020年11月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文推荐】层次知识图谱，Hierarchical Knowledge Graphs: A Novel Information Representation for Exploratory Search Tasks

【论文推荐】层次知识图谱，Hierarchical Knowledge Graphs: A Novel Information Representation for Exploratory Search Tasks

专知会员服务

49+阅读 · 2020年5月26日

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

专知会员服务

103+阅读 · 2020年4月25日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

ACL 2022：评估单词多义性不再困扰？一种新的基准“DIBIMT”

ACL 2022：评估单词多义性不再困扰？一种新的基准“DIBIMT”

大数据文摘

0+阅读 · 2022年5月24日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

自然语言处理顶会EMNLP2018接受论文列表！

自然语言处理顶会EMNLP2018接受论文列表！

专知

87+阅读 · 2018年8月26日

【论文推荐】最新六篇知识图谱相关论文—事件演化图、神经词义消歧、增强神经网络、Mem2Seq、用户偏好传播、概率嵌入

【论文推荐】最新六篇知识图谱相关论文—事件演化图、神经词义消歧、增强神经网络、Mem2Seq、用户偏好传播、概率嵌入

专知

19+阅读 · 2018年6月14日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

面向地理模型集成与运行的数据适配方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

二维层状纳米材料的谷极化与谷电子/自旋弛豫动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

层状硫化物热电材料电热输运性质的多重自由度协同调控

国家自然科学基金

0+阅读 · 2013年12月31日

Skutterudite/AgSbTe2系纳米复合热电材料研究

国家自然科学基金

0+阅读 · 2012年12月31日

情感信息抽取的资源建设及关键技术研究

国家自然科学基金

2+阅读 · 2012年12月31日

石墨烯构建Z型载流子转移通道的复合型光催化制氢材料

国家自然科学基金

0+阅读 · 2012年12月31日

多孔γl2O3基复合纳米强碱材料捕获CO2的基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

面向产品几何规范的知识表示与测量认证研究

国家自然科学基金

0+阅读 · 2011年12月31日

用于宽光谱光伏电池的能带梯度材料的研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于超大环酞菁的近红外有机电致发光二极管

国家自然科学基金

0+阅读 · 2008年12月31日

FolkScope: Intention Knowledge Graph Construction for E-commerce Commonsense Discovery

Arxiv

0+阅读 · 2023年5月11日

Combo of Thinking and Observing for Outside-Knowledge VQA

Arxiv

1+阅读 · 2023年5月10日

Large Language Models Need Holistically Thought in Medical Conversational QA

Arxiv

0+阅读 · 2023年5月10日

Completeness, Recall, and Negation in Open-World Knowledge Bases: A Survey

Arxiv

0+阅读 · 2023年5月9日

Code Execution with Pre-trained Language Models

Arxiv

0+阅读 · 2023年5月8日

Benchmarks for Automated Commonsense Reasoning: A Survey

Arxiv

44+阅读 · 2023年2月22日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Arxiv

13+阅读 · 2021年4月7日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

VIP会员

文章信息

相关主题

知识 (knowledge)

最新内容

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

2+阅读 · 今天11:43

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

2+阅读 · 今天11:41

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

5+阅读 · 今天6:30

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

5+阅读 · 今天6:18

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

6+阅读 · 今天6:08

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

6+阅读 · 今天5:54

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

7+阅读 · 今天5:22

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

7+阅读 · 今天5:15

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

7+阅读 · 今天3:42

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

6+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

8+阅读 · 6月24日

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

10+阅读 · 6月24日

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

9+阅读 · 6月24日

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

7+阅读 · 6月24日

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

9+阅读 · 6月24日

相关VIP内容

【ACL2022-华盛顿大学】生成知识促进常识推理，Generated Knowledge Prompting for Commonsense Reasoning

【ACL2022-华盛顿大学】生成知识促进常识推理，Generated Knowledge Prompting for Commonsense Reasoning

专知会员服务

26+阅读 · 2022年3月1日

UIUC韩家炜：从海量非结构化文本中挖掘结构化知识

UIUC韩家炜：从海量非结构化文本中挖掘结构化知识

专知会员服务

98+阅读 · 2021年12月30日

【USC2021】常识推理，47页ppt，Commonsense Reasoning in the Wild

专知会员服务

33+阅读 · 2021年10月9日

【知识图谱@EMNLP2020】Knowledge Graphs in NLP @ EMNLP 2020

【知识图谱@EMNLP2020】Knowledge Graphs in NLP @ EMNLP 2020

专知会员服务

43+阅读 · 2020年11月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【论文推荐】层次知识图谱，Hierarchical Knowledge Graphs: A Novel Information Representation for Exploratory Search Tasks

【论文推荐】层次知识图谱，Hierarchical Knowledge Graphs: A Novel Information Representation for Exploratory Search Tasks

专知会员服务

49+阅读 · 2020年5月26日

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

【知识图谱嵌入补全综述论文】embedding models for knowledge base completion

专知会员服务

103+阅读 · 2020年4月25日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

网状网络及其在军事领域的运用

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

相关资讯

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

ACL 2022：评估单词多义性不再困扰？一种新的基准“DIBIMT”

ACL 2022：评估单词多义性不再困扰？一种新的基准“DIBIMT”

大数据文摘

0+阅读 · 2022年5月24日

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

征稿 | International Joint Conference on Knowledge Graphs (IJCKG)

开放知识图谱

2+阅读 · 2022年5月20日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

RoBERTa for Chinese：大规模中文预训练RoBERTa模型

AINLP

30+阅读 · 2019年9月8日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

自然语言处理顶会EMNLP2018接受论文列表！

自然语言处理顶会EMNLP2018接受论文列表！

专知

87+阅读 · 2018年8月26日

【论文推荐】最新六篇知识图谱相关论文—事件演化图、神经词义消歧、增强神经网络、Mem2Seq、用户偏好传播、概率嵌入

【论文推荐】最新六篇知识图谱相关论文—事件演化图、神经词义消歧、增强神经网络、Mem2Seq、用户偏好传播、概率嵌入

专知

19+阅读 · 2018年6月14日

自然语言处理 (NLP)资源大全

自然语言处理 (NLP)资源大全

机械鸡

35+阅读 · 2017年9月17日

相关论文

FolkScope: Intention Knowledge Graph Construction for E-commerce Commonsense Discovery

Arxiv

0+阅读 · 2023年5月11日

Combo of Thinking and Observing for Outside-Knowledge VQA

Arxiv

1+阅读 · 2023年5月10日

Large Language Models Need Holistically Thought in Medical Conversational QA

Arxiv

0+阅读 · 2023年5月10日

Completeness, Recall, and Negation in Open-World Knowledge Bases: A Survey

Arxiv

0+阅读 · 2023年5月9日

Code Execution with Pre-trained Language Models

Arxiv

0+阅读 · 2023年5月8日

Benchmarks for Automated Commonsense Reasoning: A Survey

Arxiv

44+阅读 · 2023年2月22日

Towards Reasoning in Large Language Models: A Survey

Arxiv

34+阅读 · 2022年12月20日

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Arxiv

13+阅读 · 2021年4月7日

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Arxiv

12+阅读 · 2020年2月19日

Differentiable Reasoning on Large Knowledge Bases and Natural Language

Arxiv

12+阅读 · 2019年12月17日

相关基金

面向地理模型集成与运行的数据适配方法研究

国家自然科学基金

1+阅读 · 2014年12月31日

二维层状纳米材料的谷极化与谷电子/自旋弛豫动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

层状硫化物热电材料电热输运性质的多重自由度协同调控

国家自然科学基金

0+阅读 · 2013年12月31日

Skutterudite/AgSbTe2系纳米复合热电材料研究

国家自然科学基金

0+阅读 · 2012年12月31日

情感信息抽取的资源建设及关键技术研究

国家自然科学基金

2+阅读 · 2012年12月31日

石墨烯构建Z型载流子转移通道的复合型光催化制氢材料

国家自然科学基金

0+阅读 · 2012年12月31日

多孔γl2O3基复合纳米强碱材料捕获CO2的基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

面向产品几何规范的知识表示与测量认证研究

国家自然科学基金

0+阅读 · 2011年12月31日

用于宽光谱光伏电池的能带梯度材料的研究

国家自然科学基金

0+阅读 · 2009年12月31日

基于超大环酞菁的近红外有机电致发光二极管

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员