We propose a general framework for visualizing any intermediate embedding representation used by any neural survival analysis model. Our framework is based on so-called anchor directions in an embedding space. We show how to estimate these anchor directions either by clustering or from user-supplied "concepts" defined by collections of raw inputs (e.g., the feature vectors of all female patients could encode the concept "female"). For tabular data, we present visualization strategies that reveal how anchor directions relate to raw clinical features and to survival time distributions. We then show how these visualization ideas extend to raw inputs that are images. Because our framework relies on angles between vectors in an embedding space, it ignores magnitude information, which can cause "information loss". We show how this loss produces a "clumping" artifact in our visualizations, and how to reduce it in practice.
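To make the idea of a concept-based anchor direction concrete, the following is a minimal sketch, not the method itself. It assumes, purely for illustration, that an anchor direction is estimated as the normalized difference between the mean embedding of the concept examples and the mean embedding of all examples, and that each embedding is compared to the anchor by cosine similarity after centering. The function names `concept_anchor_direction` and `cosine_to_anchor`, the centering step, and the synthetic data are all hypothetical choices that the text above does not specify.

```python
import numpy as np


def unit(v, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    v = np.asarray(v, dtype=float)
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.maximum(norms, eps)


def concept_anchor_direction(concept_embeddings, all_embeddings):
    """Hypothetical estimator (an assumption, not taken from the text):
    mean concept embedding minus overall mean embedding, normalized."""
    diff = np.mean(concept_embeddings, axis=0) - np.mean(all_embeddings, axis=0)
    return unit(diff)


def cosine_to_anchor(embeddings, anchor, center):
    """Cosine of the angle between each centered embedding and the anchor."""
    return unit(np.asarray(embeddings, dtype=float) - center) @ unit(anchor)


# Synthetic stand-in for embeddings produced by some encoder (assumed data).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 8))
is_concept = rng.random(200) < 0.5
embeddings[is_concept, 0] += 3.0  # concept examples are shifted along one axis

center = embeddings.mean(axis=0)
anchor = concept_anchor_direction(embeddings[is_concept], embeddings)
cosines = cosine_to_anchor(embeddings, anchor, center)

# Two points in the same direction from the center but at different distances
# receive the same cosine: magnitude information is discarded.
offset = rng.normal(size=8)
near = center + offset
far = center + 10.0 * offset
same_angle = np.allclose(
    cosine_to_anchor(near[None, :], anchor, center),
    cosine_to_anchor(far[None, :], anchor, center),
)
```

The last few lines illustrate the information loss mentioned above: points that differ only in their distance from the center collapse onto the same angular coordinate, which is one way a "clumping" pattern could arise in an angle-based plot.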