Are score function estimators an underestimated approach to learning with $k$-subset sampling? Sampling $k$-subsets is a fundamental operation in many machine learning tasks that is not amenable to differentiable parametrization, impeding gradient-based optimization. Prior work has focused on relaxed sampling or pathwise gradient estimators. Inspired by the success of score function estimators in variational inference and reinforcement learning, we revisit them within the context of $k$-subset sampling. Specifically, we demonstrate how to efficiently compute the $k$-subset distribution's score function using a discrete Fourier transform, and reduce the estimator's variance with control variates. The resulting estimator provides both exact samples and unbiased gradient estimates while also applying to non-differentiable downstream models, unlike existing methods. Experiments in feature selection show results competitive with current methods, despite weaker assumptions.
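The abstract's core ingredients (exact $k$-subset sampling, the distribution's score function, and a control-variate-reduced gradient) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses the standard $O(nk)$ recursion for elementary symmetric polynomials where the paper uses a discrete Fourier transform, and the weight parametrization and helper names (`esp`, `sample_k_subset`, `score`) are this sketch's own assumptions.

```python
import numpy as np

def esp(w, k):
    """Elementary symmetric polynomials e_0..e_k of the weights w, via the
    standard recursion (the paper evaluates these with a DFT; this simpler
    stand-in is a stated assumption of the sketch)."""
    E = np.zeros(k + 1)
    E[0] = 1.0
    for wi in w:
        # update high orders first so each weight enters at most once
        for j in range(k, 0, -1):
            E[j] += wi * E[j - 1]
    return E

def sample_k_subset(w, k, rng):
    """Exact sample from p(S) ∝ ∏_{i∈S} w_i over subsets with |S| = k,
    drawn by sequential (conditional Poisson) sampling."""
    n = len(w)
    S = np.zeros(n, dtype=bool)
    r = k  # slots left to fill
    for i in range(n):
        if r == 0:
            break
        # P(i ∈ S | choices so far) = w_i e_{r-1}(w_{i+1:}) / e_r(w_{i:})
        num = w[i] * esp(w[i + 1:], r - 1)[r - 1]
        den = num + esp(w[i + 1:], r)[r]
        if rng.random() < num / den:
            S[i] = True
            r -= 1
    return S

def score(w, k, S):
    """Score function ∇_{log w} log p(S) = 1[i ∈ S] − π_i, where π_i is
    the inclusion probability w_i e_{k-1}(w_{−i}) / e_k(w)."""
    ek = esp(w, k)[k]
    pi = np.array([w[i] * esp(np.delete(w, i), k - 1)[k - 1] / ek
                   for i in range(len(w))])
    return S.astype(float) - pi
```

Plugging the score into a REINFORCE-style update, g = (f(S) − b) · score(w, k, S) with b a moving-average baseline acting as control variate, yields an unbiased gradient estimate of E_S[f(S)] even when the downstream objective f is non-differentiable, which is the property the abstract highlights.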