Unsupervised Cross-Protocol Anomaly Analysis in Mobile Core Networks via Multi-Embedding Models Consensus

Mobile core networks run several signalling protocols in parallel, such as SS7, Diameter, and GTP, so many security-relevant problems become visible only when the interactions between these protocols are analyzed jointly. At the same time, labeled examples of real attacks and cross-protocol misconfigurations are scarce, which complicates supervised detection. We therefore study unsupervised cross-protocol anomaly analysis on fused representations that combine SS7, Diameter, and GTP signalling. For each subscriber, we aggregate messages into per-minute fused records, serialize each record as text, embed it with six embedding models, and apply unsupervised anomaly detection to each embedding. We then assign each record a consensus score k, equal to the number of embedding models that flag it as anomalous. For evaluation, we generate cross-protocol-plausible synthetic anomalies by swapping one field group at a time between pairs of records, which preserves per-message validity while making the fused view contradictory. Of 219,294 fused records, 44.15% are flagged by at least one model, but only 0.97% are flagged by all six. Higher consensus is strongly associated with synthetic records: for k = 1 to 4, the odds that a flagged record is synthetic are hundreds of times greater than for original records, and for k >= 5 all flagged records are synthetic, with extremely small p-values. Cosine distances between synthetic and original records also increase with consensus, suggesting clearer separation in embedding space. These results support the use of multi-embedding consensus to prioritize a much smaller set of candidate cross-protocol inconsistencies for further inspection.
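The two mechanical steps described above, the field-group swap and the consensus count, can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the field names (`ss7_location`, `gtp_cell`), the model names, and the helper functions are hypothetical, and the per-model anomaly flags are assumed to be already available as booleans.

```python
def swap_field_group(record_a, record_b, group):
    """Return copies of two fused records with one field group exchanged.

    Each field keeps a value that was valid in its own record, but the
    fused view of each copy may now be internally contradictory.
    """
    new_a, new_b = dict(record_a), dict(record_b)
    for field in group:
        new_a[field], new_b[field] = record_b[field], record_a[field]
    return new_a, new_b


def consensus_scores(flags_by_model):
    """Count, per record, how many models flagged it as anomalous.

    flags_by_model maps a model name to a list of booleans, one per record.
    """
    lengths = {len(flags) for flags in flags_by_model.values()}
    if len(lengths) != 1:
        raise ValueError("every model must score the same set of records")
    return [sum(column) for column in zip(*flags_by_model.values())]


def share_at_least(scores, k):
    """Fraction of records whose consensus score is at least k."""
    return sum(score >= k for score in scores) / len(scores)


# Hypothetical fused records (field names are illustrative only).
rec_a = {"subscriber": "A", "ss7_location": "net-1", "gtp_cell": "cell-1"}
rec_b = {"subscriber": "B", "ss7_location": "net-2", "gtp_cell": "cell-2"}
syn_a, syn_b = swap_field_group(rec_a, rec_b, ["gtp_cell"])

# Hypothetical flags from three models over four records.
flags = {
    "model_a": [True, False, True, False],
    "model_b": [True, False, False, False],
    "model_c": [True, True, False, False],
}
scores = consensus_scores(flags)
flagged_by_any = share_at_least(scores, 1)
flagged_by_all = share_at_least(scores, len(flags))
```

With six models, `share_at_least(scores, 1)` and `share_at_least(scores, 6)` would correspond to the "at least one model" and "all six" percentages reported above.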