Building on topological data analysis and expert knowledge, this study introduces a Mapper-based approach for clustering agents according to their tendency to be influenced by the spread of information. Our setting is financial markets, where the aim is to identify agents who trade opportunistically on insider information while minimizing false positives, a critical challenge in financial market surveillance. We verify and demonstrate our methods using both synthetic and empirical data on insider networks and investor-level transactions in a stock market. Given the sensitive nature of insider trading cases, we design a conservative approach so that innocent agents are not wrongfully implicated. We find that the Mapper-based method systematically outperforms other methods on synthetic data with ground truth. We also apply the method to empirical data and verify the results using a statistical validation method based on persistent homology. Our findings indicate that the proposed Mapper-based technique effectively identifies a subset of agents who tend to take advantage of inside information they have received. The method is highly adaptable to other applications involving the spread of information or diseases, in which agents reveal their carrier status only indirectly, through their behavior (symptoms).