This article aims to improve inference on the covariation among environmental exposures, motivated by data from a study of Toddlers Exposure to SVOCs in Indoor Environments (TESIE). The challenge is that the sample size is limited, so the empirical covariance matrix provides a poor estimate. In related applications, Bayesian factor models have been popular; these approaches express the covariance as low rank plus diagonal and can infer the number of factors adaptively. However, they have the disadvantage of shrinking towards a diagonal covariance, often underestimating important covariation patterns in the data. Alternatively, the dimensionality problem is addressed by collapsing the detailed exposure data within chemical classes, potentially obscuring important information. We apply a feature-aware covariance regression extension of Bayesian factor analysis, which improves performance by including information from features summarizing properties of the different exposures. This approach enables shrinkage towards more flexible covariance structures, reducing the over-shrinkage problem, as we illustrate in the TESIE data using various chemical features.