Recent advancements in text-to-image generative models, particularly latent diffusion models (LDMs), have demonstrated remarkable capabilities in synthesizing high-quality images from textual prompts. However, achieving identity personalization (ensuring that a model consistently generates subject-specific outputs from limited reference images) remains a fundamental challenge. To address this, we introduce Meta-Low-Rank Adaptation (Meta-LoRA), a novel framework that leverages meta-learning to encode domain-specific priors into LoRA-based identity personalization. Our method introduces a structured three-layer LoRA architecture that separates identity-agnostic knowledge from identity-specific adaptation. In the first stage, the LoRA Meta-Down layers are meta-trained across multiple subjects, learning a shared manifold that captures general identity-related features. In the second stage, only the LoRA-Mid and LoRA-Up layers are optimized to specialize on a given subject, significantly reducing adaptation time while improving identity fidelity. To evaluate our approach, we introduce Meta-PHD, a new benchmark dataset for identity personalization, and compare Meta-LoRA against state-of-the-art methods. Our results demonstrate that Meta-LoRA achieves superior identity retention, computational efficiency, and adaptability across diverse identity conditions. The code, model weights, and dataset will be released publicly upon acceptance.
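The three-layer factorization and two-stage training described above can be sketched as follows. This is a minimal illustration, not the released implementation: the rank values, initialization scales, and class name are hypothetical, and NumPy stands in for the actual deep-learning framework.

```python
import numpy as np

class MetaLoRALayer:
    """Sketch of a three-factor LoRA update: delta_W = up @ mid @ meta_down.

    Stage 1: meta_down is meta-trained across many subjects, then frozen.
    Stage 2: only mid and up are optimized for a new subject.
    Ranks r1, r2 are hypothetical choices for illustration.
    """

    def __init__(self, d_out, d_in, r1=8, r2=4, seed=0):
        rng = np.random.default_rng(seed)
        # Shared, identity-agnostic factor (frozen after meta-training).
        self.meta_down = rng.normal(0.0, 0.02, (r1, d_in))
        # Identity-specific factors, tuned per subject in stage 2.
        self.mid = rng.normal(0.0, 0.02, (r2, r1))
        self.up = np.zeros((d_out, r2))  # zero-init so delta_W starts at 0

    def delta_w(self):
        # Low-rank weight update: (d_out x r2)(r2 x r1)(r1 x d_in).
        return self.up @ self.mid @ self.meta_down

    def stage2_param_count(self):
        # Stage 2 only touches mid and up, not meta_down.
        return self.mid.size + self.up.size

layer = MetaLoRALayer(d_out=64, d_in=64)
print(layer.delta_w().shape)        # (64, 64)
print(layer.stage2_param_count())   # 288, vs. 4096 for a full 64x64 update
```

Because `up` is zero-initialized, the update contributes nothing until stage-2 fine-tuning begins, and the per-subject parameter count stays far below that of a full weight matrix.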