Neural networks often assume independence among input data samples, disregarding correlations that arise from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs), which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects', have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary classification targets with a single clustering feature. We present MC-GMENN, a novel approach that employs Monte Carlo methods to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, while simultaneously offering a principled methodology for interpreting the effects of clustering patterns.