A Semiparametric Bayesian Method for Instrumental Variable Analysis with Partly Interval-Censored Time-to-Event Outcome

This paper develops a semiparametric Bayesian instrumental variable method for estimating the causal effect of an endogenous variable in the presence of unobserved confounders and measurement error when the time-to-event outcome is partly interval-censored, that is, when event times are observed exactly for some subjects but are left-censored, right-censored, or interval-censored for others. Our method is based on a two-stage Dirichlet process mixture instrumental variable (DPMIV) model, which jointly models the first-stage random error term for the exposure variable and the second-stage random error term for the time-to-event outcome using a Dirichlet process mixture (DPM) of bivariate Gaussian distributions. The DPM model can be broadly understood as a mixture model with an unspecified number of Gaussian components; it relaxes the normal error assumption and allows the number of mixture components to be determined by the data. We develop an MCMC algorithm for the DPMIV model tailored to partly interval-censored data and conduct extensive simulations to assess the performance of our DPMIV method in comparison with competing methods. The simulations show that the proposed method is robust under different error distributions and can outperform its parametric counterpart under various scenarios. We further demonstrate the effectiveness of our approach on UK Biobank data by investigating the causal effect of systolic blood pressure on the time to development of cardiovascular disease from the onset of diabetes mellitus.
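To make the data structure concrete, the following is a minimal simulation sketch, not the paper's algorithm or code. The abstract gives no model equations, so everything here is an assumption made for illustration: a linear first stage `x = gamma * z + u1`, a log-linear second stage `log t = beta * x + u2`, a truncated stick-breaking approximation to the Dirichlet process for the bivariate errors `(u1, u2)`, a covariance matrix shared by all components, and the censoring proportions. All function and parameter names are hypothetical.

```python
import numpy as np

def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights approximating a Dirichlet process."""
    betas = rng.beta(1.0, alpha, size=n_components)
    betas[-1] = 1.0  # close the stick so the weights sum to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

def simulate_data(n=500, gamma=1.0, beta=0.5, alpha=1.0, n_components=20, seed=0):
    """Simulate two-stage IV data with partly interval-censored event times."""
    rng = np.random.default_rng(seed)

    # Bivariate Gaussian mixture errors (u1, u2); the shared covariance
    # is an illustrative assumption, and its off-diagonal term induces
    # the endogeneity that the instrument is meant to address.
    weights = stick_breaking_weights(alpha, n_components, rng)
    means = rng.normal(0.0, 1.0, size=(n_components, 2))
    cov = np.array([[0.5, 0.3], [0.3, 0.5]])
    labels = rng.choice(n_components, size=n, p=weights)
    errors = means[labels] + rng.multivariate_normal(np.zeros(2), cov, size=n)

    # Assumed first stage (exposure) and second stage (log event time).
    z = rng.normal(size=n)
    x = gamma * z + errors[:, 0]
    t = np.exp(beta * x + errors[:, 1])

    # Partly interval-censored observation: each subject contributes an
    # interval [left, right] that contains the latent event time t.
    kind = rng.choice(["exact", "left", "right", "interval"],
                      size=n, p=[0.4, 0.2, 0.2, 0.2])
    lo = t * rng.uniform(0.5, 1.0, size=n)
    hi = t * rng.uniform(1.0, 2.0, size=n)
    left = np.where(kind == "exact", t, np.where(kind == "left", 0.0, lo))
    right = np.where(kind == "exact", t, np.where(kind == "right", np.inf, hi))

    return {"z": z, "x": x, "t": t, "left": left, "right": right,
            "kind": kind, "weights": weights}

data = simulate_data()
```

In this representation an exactly observed time has `left == right`, a left-censored time has `left == 0`, and a right-censored time has `right == inf`, so a sampler for such data would only need to impute the latent time within `[left, right]` for the non-exact subjects.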

