Traditional regulation of chemical exposure tends to focus on single exposures, overlooking the potentially amplified toxicity of multiple concurrent exposures. We are interested in the average outcome that would result if exposures were limited to fall under a multivariate threshold. Because threshold levels are often unknown \textit{a priori}, we provide an algorithm that finds the exposure threshold levels at which the expected outcome is maximized or minimized. Because identifying thresholds and estimating policy effects on the same data would lead to overfitting bias, we also provide a data-adaptive estimation framework that allows for both threshold discovery and policy estimation. Simulation studies show asymptotic convergence to the optimal exposure region and to the true effect of an intervention. We demonstrate that our method identifies true interactions in a public synthetic mixture data set. Finally, we apply our method to NHANES data to discover the metal exposures that have the most harmful effects on telomere length. We provide an implementation in the \texttt{CVtreeMLE} R package.
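The sample-splitting idea can be illustrated with a minimal sketch. This is not the \texttt{CVtreeMLE} interface and not the estimator used in the paper: it assumes two simulated exposures, replaces the threshold-finding algorithm with a plain grid search, and uses an unadjusted sample mean with no confounder adjustment. All names (`discover_threshold`, `cross_fitted_region_means`, `min_frac`) are hypothetical. Thresholds are chosen on the training folds only, and the mean outcome inside the chosen region is then estimated on the held-out fold.

```python
import numpy as np


def discover_threshold(X, y, grid, min_frac=0.1):
    """Grid-search the region {x1 <= t1, x2 <= t2} with the lowest mean outcome.

    Regions holding less than `min_frac` of the sample are skipped so that
    the search does not chase a handful of noisy observations.
    """
    best, best_mean = None, np.inf
    for t1 in grid:
        for t2 in grid:
            inside = (X[:, 0] <= t1) & (X[:, 1] <= t2)
            if inside.mean() < min_frac:
                continue
            region_mean = y[inside].mean()
            if region_mean < best_mean:
                best, best_mean = (t1, t2), region_mean
    return best


def cross_fitted_region_means(X, y, k=5, seed=0):
    """For each fold: find thresholds on the other folds, estimate on this one."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    grid = np.linspace(0.1, 1.0, 10)
    results = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Discovery step: uses training data only.
        t1, t2 = discover_threshold(X[train], y[train], grid)
        # Estimation step: uses held-out data only.
        inside = (X[test, 0] <= t1) & (X[test, 1] <= t2)
        results.append(((t1, t2), y[test][inside].mean()))
    return results


# Simulated data (an assumption for illustration): the outcome is raised
# only when both exposures exceed 0.5, mimicking an interaction.
rng = np.random.default_rng(1)
n = 4000
X = rng.uniform(0.0, 1.0, size=(n, 2))
y = 3.0 * ((X[:, 0] > 0.5) & (X[:, 1] > 0.5)) + rng.normal(0.0, 1.0, n)

results = cross_fitted_region_means(X, y)
for (t1, t2), estimate in results:
    print(f"thresholds=({t1:.1f}, {t2:.1f})  held-out mean={estimate:.3f}")
```

Because each held-out estimate is computed on data that played no role in choosing the thresholds, it is not pulled downward by the search itself, which is the overfitting bias that motivates separating discovery from estimation.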