Generative models, such as large language models and text-to-image diffusion models, produce relevant information when presented with a query. Different models may produce different information when presented with the same query. As the landscape of generative models evolves, it is important to develop techniques to study and analyze differences in model behaviour. In this paper, we present novel theoretical results for embedding-based representations of generative models in the context of a set of queries. We establish sufficient conditions for the consistent estimation of the model embeddings in situations where the query set and the number of models grow.