In this technical report, we introduce SEED-Data-Edit: a unique hybrid dataset for instruction-guided image editing, which aims to facilitate image manipulation through open-form language. SEED-Data-Edit is composed of three distinct types of data: (1) High-quality editing data produced by an automated pipeline, ensuring a substantial volume of diverse image editing pairs. (2) Real-world scenario data collected from the internet, which captures the intricacies of user intentions for promoting the practical application of image editing in the real world. (3) High-precision multi-turn editing data annotated by humans, which involves multiple rounds of edits for simulating iterative editing processes. The combination of these diverse data sources makes SEED-Data-Edit a comprehensive and versatile dataset for training language-guided image editing models. We fine-tune a pretrained Multimodal Large Language Model (MLLM) that unifies comprehension and generation with SEED-Data-Edit. The instruction-tuned model demonstrates promising results, indicating the potential and effectiveness of SEED-Data-Edit in advancing the field of instructional image editing. The datasets are released at https://huggingface.co/datasets/AILab-CVC/SEED-Data-Edit.