A new model for generating survival trajectories and data is proposed, based on an autoencoder with a specific structure. It solves three tasks. First, it provides predictions, in the form of the expected event time and the survival function, for a newly generated feature vector on the basis of the Beran estimator. Second, it generates additional data from a given training set to supplement the original dataset. Third, and most importantly, it generates a prototype time-dependent trajectory for an object, which characterizes how the features of the object could be changed to achieve a different time to an event. The trajectory can be viewed as a type of counterfactual explanation. The proposed model is robust during training and inference due to a specific weighting scheme incorporated into the variational autoencoder. The model also determines the censoring indicators of newly generated data by solving a classification task. Numerical experiments on synthetic and real datasets demonstrate the efficiency and properties of the proposed model. The code implementing the proposed model is publicly available.
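To make the prediction step concrete, the following is a minimal sketch of the standard Beran estimator, a kernel-weighted generalization of the Kaplan-Meier estimator. It illustrates the general technique, not the paper's implementation: the Gaussian kernel, the `bandwidth` parameter, and the function names `beran_survival` and `expected_time` are assumptions introduced here. With training times sorted in ascending order, normalized kernel weights W_i(x), and censoring indicators d_i (1 for an observed event), the estimate is S(t | x) = prod over t_i <= t of [1 - W_i(x) / (1 - sum_{j<i} W_j(x))]^{d_i}, and the expected event time is the area under this step function.

```python
import numpy as np

def beran_survival(x, X_train, times, events, bandwidth=1.0):
    """Beran estimate of S(t_i | x) at each sorted training time t_i.

    Assumptions for illustration: Gaussian kernel, events[i] == 1 means
    the event was observed, events[i] == 0 means the sample is censored.
    """
    order = np.argsort(times)
    X_s, t_s, d_s = X_train[order], times[order], events[order]

    # Normalized kernel weights W_i(x); the Gaussian kernel is an assumed choice.
    sq_dist = np.sum((X_s - x) ** 2, axis=1)
    w = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    w = w / w.sum()

    # Weight mass of all samples strictly earlier in the time ordering.
    cum_prev = np.concatenate(([0.0], np.cumsum(w)[:-1]))
    denom = np.clip(1.0 - cum_prev, 1e-12, None)  # guard against division by zero

    # Censored samples contribute a factor of 1 (exponent d_i = 0).
    factors = np.where(d_s == 1, 1.0 - w / denom, 1.0)
    factors = np.clip(factors, 0.0, 1.0)
    return t_s, np.cumprod(factors)

def expected_time(t_sorted, surv):
    """Area under the step survival function, with S = 1 on [0, t_1)."""
    s_left = np.concatenate(([1.0], surv[:-1]))
    widths = np.diff(np.concatenate(([0.0], t_sorted)))
    return float(np.sum(s_left * widths))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 3))            # synthetic features
    t = rng.exponential(scale=2.0, size=50)  # synthetic event times
    d = rng.integers(0, 2, size=50)          # synthetic censoring indicators
    ts, S = beran_survival(X[0], X, t, d, bandwidth=1.0)
    print("expected event time:", expected_time(ts, S))
```

When all kernel weights are equal and no sample is censored, this sketch reduces to the empirical survival function, which is a quick sanity check on the weighting.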