This study introduces a nonparametric definition of interaction and provides an approach to both interaction discovery and efficient estimation of this parameter. Using stochastic shift interventions and ensemble machine learning, our approach identifies and quantifies interaction effects through a model-independent target parameter, estimated via targeted maximum likelihood and cross-validation. This parameter contrasts the expected outcome under a joint intervention with the expected outcomes under the corresponding individual interventions. Simulations and an application to the National Institute of Environmental Health Sciences Mixtures Workshop data demonstrate that the method detects true interaction directions and consistently identifies significant impacts of furan exposure on leukocyte telomere length. Our method, called SuperNOVA, advances the ability to analyze multiexposure interactions within high-dimensional data, offering significant methodological improvements for understanding complex exposure dynamics in health research. We provide peer-reviewed open-source software that implements our proposed methodology in the \texttt{SuperNOVA} R package.
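To make the contrast concrete, the following is an illustrative sketch rather than the paper's formal definition. The notation is assumed here for illustration: two exposures $A_1, A_2$, shift magnitudes $\delta_1, \delta_2$, and $Y_{d_1,d_2}$ for the outcome when $A_1$ is shifted by $d_1$ and $A_2$ by $d_2$. One way to write an interaction contrast of this kind is
\[
\psi_{\mathrm{int}} \;=\; \mathbb{E}\!\left[Y_{\delta_1,\delta_2}\right] \;-\; \mathbb{E}\!\left[Y_{\delta_1,0}\right] \;-\; \mathbb{E}\!\left[Y_{0,\delta_2}\right] \;+\; \mathbb{E}\!\left[Y\right].
\]
Under this form, $\psi_{\mathrm{int}} = 0$ whenever the effect of the joint shift equals the sum of the two individual shift effects. As a check, if the outcome regression happened to be $\mathbb{E}[Y \mid A_1, A_2] = \beta_0 + \beta_1 A_1 + \beta_2 A_2 + \beta_3 A_1 A_2$ and the shifts were deterministic additive shifts, the contrast would reduce to $\psi_{\mathrm{int}} = \beta_3 \delta_1 \delta_2$, so its sign would match the direction of the interaction.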