Functional Isolation Forest (FIF) is a recent state-of-the-art Anomaly Detection (AD) algorithm designed for functional data. It relies on a tree-partitioning procedure in which an abnormality score is computed by projecting each observed curve onto a drawn dictionary through a linear inner product. Both the linear inner product and the dictionary are a priori choices that strongly influence the algorithm's performance and may lead to unreliable results, particularly on complex datasets. This work addresses these challenges by introducing \textit{Signature Isolation Forest}, a novel class of AD algorithms leveraging the signature transform from rough path theory. Our objective is to remove the constraints imposed by FIF by proposing two algorithms that specifically target the linearity of the FIF inner product and the choice of the dictionary. We provide several numerical experiments, including a benchmark on real-world applications, that show the relevance of our methods.
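To make the signature transform mentioned above more concrete, the following is a minimal sketch of how the first two signature levels of a sampled curve could be computed. It is an illustration under stated assumptions, not the implementation used in this work: the function name `signature_level2` is hypothetical, the curve is assumed to be piecewise linear between sample points, and the truncation at depth 2 is chosen only for brevity.

```python
import numpy as np

def signature_level2(path):
    """Depth-2 signature of a piecewise-linear path.

    `path` is an (n_points, d) array of samples; the name and the
    depth-2 truncation are illustrative assumptions, not taken from the paper.
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)   # one increment per linear segment
    d = path.shape[1]
    level1 = np.zeros(d)                 # first level: total increment
    level2 = np.zeros((d, d))            # second level: iterated integrals
    for delta in increments:
        # Chen's identity for appending one linear segment to the path so far
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2

# A univariate curve x(t) is turned into a 2-d path (t, x(t)) before the transform.
t = np.linspace(0.0, 1.0, 50)
curve = np.column_stack([t, np.sin(2 * np.pi * t)])
s1, s2 = signature_level2(curve)
features = np.concatenate([s1, s2.ravel()])  # flat vector usable by a tree-based detector
```

The second level contains products of coordinates integrated along the path, so the resulting features are nonlinear in the curve, in contrast with a single linear inner product against a dictionary element.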