Motivated by cybersecurity applications, such as finding meaningful sequences of malware-related events buried in large volumes of computer log data, we introduce the "planted path" problem and propose an algorithm for finding fuzzy matchings between two trees. This algorithm can serve as a "building block" for more complicated workflows. We demonstrate the usefulness of several such workflows in mining synthetically generated data as well as real-world ACME cybersecurity datasets.