This paper introduces UnZipLoRA, a method for decomposing an image into its constituent subject and style, each represented as a distinct LoRA (Low-Rank Adaptation). Unlike existing personalization techniques, which either focus on subject or style in isolation or require separate training sets for each, UnZipLoRA disentangles both elements from a single image by training the two LoRAs simultaneously. The resulting LoRAs are compatible, i.e., they can be combined by direct addition. This enables independent manipulation and recontextualization of subject and style: generating variations of each, applying the extracted style to new subjects, and recombining the two to reconstruct the original image or create novel variations. To address the entanglement of subject and style, UnZipLoRA employs a novel prompt separation technique together with column and block separation strategies, which preserve the characteristics of subject and style and ensure compatibility between the learned LoRAs. Human studies and quantitative metrics demonstrate UnZipLoRA's effectiveness compared to other state-of-the-art methods, including DreamBooth-LoRA, Inspiration Tree, and B-LoRA.
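To make "combined using direct addition" concrete, the sketch below shows what adding two LoRAs to a single weight matrix looks like under the standard LoRA formulation, where each adapter contributes a low-rank update B·A to a frozen base weight. It is only an illustration of the merge step: the matrix sizes, the rank, the random values, and all function names (`lora_delta`, `merge_by_addition`, and the helpers) are assumptions made for this example, and the prompt, column, and block separation strategies that UnZipLoRA uses during training are not modeled here.

```python
# Minimal sketch of merging two LoRAs by direct addition.
# All names, shapes, and values are hypothetical; this is not the paper's code.
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def random_matrix(rows, cols, rng):
    """Matrix with small Gaussian entries (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def lora_delta(B, A):
    """Low-rank update B @ A, with B of shape (d_out, r) and A of shape (r, d_in)."""
    return matmul(B, A)


def merge_by_addition(W0, deltas):
    """Return W0 plus the sum of the given LoRA updates, leaving W0 unchanged."""
    merged = W0
    for delta in deltas:
        merged = add(merged, delta)
    return merged


rng = random.Random(0)
d_out, d_in, rank = 4, 6, 2  # assumed sizes, chosen only for illustration

W0 = random_matrix(d_out, d_in, rng)  # frozen base weight
subject_delta = lora_delta(random_matrix(d_out, rank, rng),
                           random_matrix(rank, d_in, rng))
style_delta = lora_delta(random_matrix(d_out, rank, rng),
                         random_matrix(rank, d_in, rng))

W_subject = merge_by_addition(W0, [subject_delta])            # subject only
W_style = merge_by_addition(W0, [style_delta])                # style only
W_both = merge_by_addition(W0, [subject_delta, style_delta])  # recombined
```

Because the merge is a plain sum, either adapter can be applied on its own or together with the other without retraining. Whether the summed weights actually produce a faithful subject and style is the compatibility property the abstract attributes to joint training, which this sketch does not demonstrate.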