Partitioning a set of elements into an unknown number of mutually exclusive subsets is essential in many machine learning problems. However, assigning elements, such as samples in a dataset or neurons in a network layer, to an unknown and discrete number of subsets is inherently non-differentiable, which prohibits end-to-end gradient-based optimization of parameters. We overcome this limitation by proposing a novel two-step method for inferring partitions that can be used in variational inference tasks. This approach enables reparameterized gradients with respect to the parameters of the new random partition model. Our method works by first inferring the number of elements per subset and then filling these subsets in a learned order. We highlight the versatility of our general-purpose approach in three challenging experiments: variational clustering, inference of shared and independent generative factors under weak supervision, and multitask learning.
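To make the two-step idea concrete, the following is a minimal sketch of a discrete forward sample only, under stated assumptions. The abstract does not name the distributions involved, so the multinomial over subset sizes and the Gumbel-perturbed scores used to draw an ordering are stand-ins chosen for illustration. The function name `sample_partition` and its parameters `size_logits` and `scores` are hypothetical. The sketch is not differentiable as written; the reparameterized gradients described above would require continuous relaxations that are not shown here.

```python
import numpy as np

def sample_partition(size_logits, scores, rng):
    """Illustrative two-step partition sample (not the authors' implementation)."""
    n = scores.shape[0]

    # Step 1: draw the number of elements per subset.
    # Assumption: a multinomial over softmax(size_logits) stands in for
    # whatever size distribution the method actually uses.
    probs = np.exp(size_logits - size_logits.max())
    probs /= probs.sum()
    sizes = rng.multinomial(n, probs)

    # Step 2: draw an ordering of the elements from per-element scores.
    # Assumption: perturbing scores with Gumbel noise and sorting
    # (a Plackett-Luce sample) stands in for the "learned order".
    order = np.argsort(-(scores + rng.gumbel(size=n)))

    # Fill the subsets consecutively along the sampled order.
    boundaries = np.cumsum(sizes)[:-1]
    subsets = [part.tolist() for part in np.split(order, boundaries)]
    return sizes, subsets

rng = np.random.default_rng(0)
sizes, subsets = sample_partition(np.zeros(3), np.arange(6.0), rng)
```

Every element lands in exactly one subset, and a subset may be empty when its sampled size is zero, which is how a variable number of non-empty subsets can arise from a fixed maximum.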