Partitioning a set of elements into an unknown number of mutually exclusive subsets is essential in many machine learning problems. However, assigning elements, such as samples in a dataset or neurons in a network layer, to an unknown and discrete number of subsets is inherently non-differentiable, prohibiting end-to-end gradient-based optimization of parameters. We overcome this limitation by proposing a novel two-step method for inferring partitions, which makes it usable in variational inference tasks. This new approach enables reparameterized gradients with respect to the parameters of the new random partition model. Our method works, first, by inferring the number of elements per subset and, second, by filling these subsets in a learned order. We highlight the versatility of our general-purpose approach in three challenging experiments: variational clustering, inference of shared and independent generative factors under weak supervision, and multitask learning.
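To make the two-step structure concrete, the following is a minimal sketch of sampling a partition by first drawing subset sizes and then filling the subsets in a sampled order. It is not the method proposed here: it uses hard, non-differentiable sampling, and the choice of a multinomial over subset sizes and a Gumbel-perturbed ordering of scores are assumptions made for illustration. The names `sample_partition`, `size_logits`, and `scores` are hypothetical.

```python
import numpy as np


def sample_partition(size_logits, scores, rng):
    """Sample a partition of n = len(scores) elements into K = len(size_logits) subsets.

    Illustrative only: hard sampling, no gradients flow through this function.
    """
    n = len(scores)

    # Step 1: draw the number of elements per subset.
    # Assumption: a multinomial over subset sizes with softmax probabilities.
    probs = np.exp(size_logits - np.max(size_logits))
    probs /= probs.sum()
    sizes = rng.multinomial(n, probs)

    # Step 2: draw an order over elements.
    # Assumption: sort Gumbel-perturbed scores in descending order,
    # which yields a sample from a Plackett-Luce distribution.
    order = np.argsort(-(scores + rng.gumbel(size=n)))

    # Fill the subsets consecutively along the sampled order.
    # Subsets of size zero come out as empty lists.
    bounds = np.cumsum(sizes)[:-1]
    subsets = [part.tolist() for part in np.split(order, bounds)]
    return sizes, subsets


rng = np.random.default_rng(0)
sizes, subsets = sample_partition(np.array([0.5, 0.0, -0.5]), np.zeros(6), rng)
```

Because empty subsets are allowed, the number of non-empty subsets is itself random, which is how a fixed upper bound K can still represent an unknown number of subsets in this sketch. A differentiable variant would need to replace both sampling steps with reparameterizable relaxations.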