This study introduces a nonparametric definition of interaction and provides an approach to both discovering interactions and efficiently estimating the resulting parameter. Using stochastic shift interventions and ensemble machine learning, our approach identifies and quantifies interaction effects through a model-independent target parameter, estimated via targeted maximum likelihood and cross-validation. The parameter contrasts the expected outcome under a joint intervention on the exposures with the expected outcomes under interventions on each exposure individually. Simulations show that the method detects the true direction of interactions, and an application to the National Institute of Environmental Health Sciences Mixtures Workshop data shows that it consistently identifies a significant impact of furan exposure on leukocyte telomere length. Our method, called InterXshift, advances the ability to analyze multi-exposure interactions in high-dimensional data, offering methodological improvements for understanding complex exposure dynamics in health research. We provide peer-reviewed open-source software that implements our proposed methodology in the InterXshift R package.
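To make the contrast between joint and individual interventions concrete, the following is a minimal sketch under stated assumptions, not the InterXshift implementation. It assumes two continuous exposures shifted additively by `d1` and `d2`, and reads the interaction as (joint effect) minus (sum of the two individual effects), each measured against the observed-data mean. It also swaps in a plain plug-in (substitution) estimate based on ordinary least squares in place of targeted maximum likelihood with ensemble learning and cross-validation. All function names and the simulated data-generating process are hypothetical.

```python
import numpy as np


def simulate(n, seed=0):
    """Hypothetical data: one confounder w, two exposures, outcome with an a1*a2 term."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)
    a1 = 0.5 * w + rng.normal(size=n)
    a2 = -0.3 * w + rng.normal(size=n)
    y = 1.0 + 0.8 * a1 + 0.4 * a2 + 0.6 * a1 * a2 + 0.5 * w + rng.normal(size=n)
    return w, a1, a2, y


def design(w, a1, a2):
    """Design matrix for the assumed outcome regression (intercept, a1, a2, a1*a2, w)."""
    return np.column_stack([np.ones_like(w), a1, a2, a1 * a2, w])


def fit_outcome(w, a1, a2, y):
    """Fit E[Y | A1, A2, W] by least squares; a stand-in for an ensemble learner."""
    beta, *_ = np.linalg.lstsq(design(w, a1, a2), y, rcond=None)
    return lambda w_, a1_, a2_: design(w_, a1_, a2_) @ beta


def interaction_plugin(q, w, a1, a2, d1, d2):
    """Plug-in contrast: joint shift effect minus the two individual shift effects."""
    joint = q(w, a1 + d1, a2 + d2).mean()  # both exposures shifted
    only1 = q(w, a1 + d1, a2).mean()       # only exposure 1 shifted
    only2 = q(w, a1, a2 + d2).mean()       # only exposure 2 shifted
    base = q(w, a1, a2).mean()             # no shift
    return (joint - base) - (only1 - base) - (only2 - base)


if __name__ == "__main__":
    w, a1, a2, y = simulate(5000)
    q_hat = fit_outcome(w, a1, a2, y)
    print(interaction_plugin(q_hat, w, a1, a2, d1=1.0, d2=1.0))
```

In this simulated setting the outcome contains the product term `0.6 * a1 * a2`, so the contrast equals `0.6 * d1 * d2` and a positive estimate indicates a synergistic (more than additive) joint effect. A plug-in estimate of this kind does not by itself provide valid inference; that is the role the abstract assigns to targeted maximum likelihood and cross-validation.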