The Visual Language of Fabrics

We introduce text2fabric, a novel dataset that links free-text descriptions to various fabric materials. The dataset comprises 15,000 natural language descriptions associated with 3,000 corresponding images of fabric materials. Traditionally, material descriptions come in the form of tags or keywords, which limits their expressivity, requires prior knowledge of the appropriate vocabulary, and ultimately leads to a fragmented description system. We therefore study free text as a more appropriate way to describe material appearance, taking fabrics as a use case because they are a common item that non-experts often deal with. From an analysis of the dataset, we identify a compact lexicon, a set of attributes, and a key structure that emerge from the descriptions. This allows us to accurately understand how people describe fabrics and to draw directions for generalization to other types of materials. We also show that our dataset enables specializing large vision-language models such as CLIP, creating a meaningful latent space for fabric appearance and significantly improving applications such as fine-grained material retrieval and automatic captioning.
