Missingness and measurement frequency are two sides of the same coin. How frequently should we measure clinical variables and conduct laboratory tests? The answer depends on many factors, such as the stability of the patient's condition, the diagnostic process, the treatment plan, and measurement costs. The utility of measurements varies from disease to disease and from patient to patient. In this study we propose a novel view of clinical variable measurement frequency from a predictive modeling perspective: measurements of clinical variables reduce uncertainty in model predictions. To quantify this effect, we propose variance SHAP with variational time series models, an application of the SHapley Additive exPlanations (SHAP) algorithm to attribute epistemic prediction uncertainty. The prediction variance is estimated by sampling the conditional hidden space of a variational model and can be approximated deterministically by the delta method. This approach works with variational time series models such as variational recurrent neural networks and variational transformers. Because SHAP values are additive, the variance SHAP of binary data imputation masks can be directly interpreted as the contribution of measurements to prediction variance. We test our ideas on a public ICU dataset with a deterioration prediction task and study the relation between variance SHAP and measurement time intervals.
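As an illustration only, the following minimal sketch shows the three ingredients named above on a toy problem: a sampled estimate of prediction variance, its delta-method approximation, and an additive Shapley attribution of that variance to binary measurement masks. Everything in it is an assumption made for the example and not the model used in this study: the decoder `sigmoid(w . z + b)`, the constants `W`, `B`, `MU`, the rule in `latent_std` that a measured variable shrinks its latent standard deviation, and the helper names. Shapley values are computed by exact enumeration of coalitions, which is feasible only for a handful of features; it stands in for the approximations a SHAP implementation would use.

```python
import itertools
import math

import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical toy decoder: p = sigmoid(W . z + B), latent z ~ N(MU, diag(std^2)).
W = np.array([0.8, -0.5, 0.3])
B = 0.1
MU = np.array([0.2, -0.1, 0.4])


def latent_std(mask):
    """Assumed toy rule: a measured variable (mask = 1) has a smaller latent std."""
    mask = np.asarray(mask, dtype=float)
    return np.where(mask == 1.0, 0.1, 0.6)


def variance_mc(mask, n_samples=200_000, seed=0):
    """Prediction variance estimated by sampling the latent space."""
    rng = np.random.default_rng(seed)
    z = MU + latent_std(mask) * rng.standard_normal((n_samples, MU.size))
    return float(sigmoid(z @ W + B).var())


def variance_delta(mask):
    """First-order delta-method approximation: grad^T Sigma grad at z = MU."""
    p = sigmoid(MU @ W + B)
    grad = p * (1.0 - p) * W  # gradient of sigmoid(W . z + B) w.r.t. z
    return float(np.sum((grad * latent_std(mask)) ** 2))


def shapley_values(value_fn, n_features):
    """Exact Shapley values of value_fn over binary masks (all-zeros baseline)."""
    phi = np.zeros(n_features)
    n_fact = math.factorial(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for r in range(n_features):
            weight = math.factorial(r) * math.factorial(n_features - r - 1) / n_fact
            for subset in itertools.combinations(others, r):
                without_i = np.zeros(n_features)
                for j in subset:
                    without_i[j] = 1.0
                with_i = without_i.copy()
                with_i[i] = 1.0
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


if __name__ == "__main__":
    all_measured = [1, 1, 1]
    print("sampled variance:", variance_mc(all_measured))
    print("delta variance:  ", variance_delta(all_measured))
    print("variance SHAP per mask:", shapley_values(variance_delta, 3))
```

In this toy setup each attribution is negative, reflecting the built-in assumption that a measurement reduces latent spread, and the attributions sum to the difference in prediction variance between the all-measured and none-measured masks, which is the additivity property the interpretation above relies on.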