Estimating causal effects from longitudinal trajectories is central to understanding the progression of complex conditions, such as comorbidities and Long COVID recovery, and to optimizing clinical decision-making. We introduce \emph{C-kNN-LSH}, a nearest-neighbor framework for sequential causal inference designed to handle such high-dimensional, confounded settings. Using locality-sensitive hashing, we efficiently identify ``clinical twins'' with similar covariate histories, enabling local estimation of conditional treatment effects across evolving disease states. To mitigate bias from irregular sampling and shifting patient recovery profiles, we integrate the neighborhood estimator with a doubly-robust correction. Our theoretical analysis shows that the estimator is consistent and second-order robust to nuisance error. Evaluated on a real-world Long COVID cohort of 13,511 participants, \emph{C-kNN-LSH} outperforms existing baselines in capturing recovery heterogeneity and estimating policy values.
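To make the pipeline described above concrete, the following is a minimal sketch of its general idea, not the authors' implementation. It assumes a single decision point rather than a full sequence, uses sign-random-projection hashing as one possible locality-sensitive hash, and applies the standard AIPW (augmented inverse-probability-weighted) score as the doubly-robust correction. The function names (`lsh_signatures`, `knn_in_bucket`, `local_aipw`), the synthetic data, and the hand-supplied nuisance estimates are all hypothetical and chosen only for illustration.

```python
import numpy as np

def lsh_signatures(X, planes):
    """Sign-random-projection hash: one bit per hyperplane, packed into an integer key."""
    bits = (X @ planes.T) > 0                                # shape (n, n_bits)
    weights = 1 << np.arange(planes.shape[0], dtype=np.int64)
    return bits.astype(np.int64) @ weights                   # shape (n,)

def knn_in_bucket(X, keys, query, query_key, k):
    """Indices of up to k nearest points among those sharing the query's hash bucket."""
    candidates = np.flatnonzero(keys == query_key)
    if candidates.size == 0:                                 # empty bucket: fall back to a full scan
        candidates = np.arange(X.shape[0])
    dists = np.linalg.norm(X[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]

def local_aipw(y, a, e_hat, mu1_hat, mu0_hat, idx):
    """Average the AIPW (doubly-robust) score over the neighborhood given by idx."""
    y, a, e = y[idx], a[idx], e_hat[idx]
    m1, m0 = mu1_hat[idx], mu0_hat[idx]
    psi = m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)
    return float(psi.mean())

# Synthetic illustration (assumed data, not the Long COVID cohort).
rng = np.random.default_rng(0)
n, d, n_bits, k = 2000, 5, 4, 50
X = rng.normal(size=(n, d))                                  # covariate summaries
a = rng.integers(0, 2, size=n)                               # binary treatment
y = 2.0 * a + X[:, 0] + rng.normal(scale=0.1, size=n)        # outcome

planes = rng.normal(size=(n_bits, d))                        # random hyperplanes
keys = lsh_signatures(X, planes)
idx = knn_in_bucket(X, keys, X[0], keys[0], k)               # "clinical twins" of patient 0

e_hat = np.full(n, 0.5)                                      # assumed propensity estimate
tau = local_aipw(y, a, e_hat, X[:, 0] + 2.0, X[:, 0], idx)   # assumed outcome models
print(f"local effect estimate near patient 0: {tau:.3f}")
```

In this sketch the hash only narrows the candidate set; exact distances are still computed within the bucket. The AIPW score remains unbiased if either the propensity estimate or the outcome models are correct, which is the sense of "doubly robust" used here.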