File-encrypting ransomware increasingly employs intermittent encryption, encrypting only parts of each file to evade classical detection methods. This strategy, exemplified by ransomware families such as BlackCat, complicates file-structure-based detection because different file formats exhibit different traits under partial encryption. This paper provides a systematic empirical characterization of byte-level statistics under intermittent encryption across common file types, establishing a comprehensive baseline of how partial encryption impacts data structure. We specialize a classical KL divergence upper bound to a tailored mixture model of intermittent encryption, yielding filetype-specific detectability ceilings for histogram-based detectors. Leveraging insights from this analysis, we empirically evaluate convolutional neural network (CNN) based detection methods using realistic intermittent encryption configurations derived from leading ransomware variants. Our findings demonstrate that localized analysis via chunk-level CNNs consistently outperforms global analysis methods, highlighting the practical effectiveness of chunk-level CNNs and establishing a robust baseline for future detection systems.
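To make the mixture-model idea concrete, the following is a minimal sketch and not the paper's actual derivation. It assumes that ciphertext bytes are uniformly distributed, that a fraction `alpha` of a file's bytes is encrypted, and that the resulting global byte histogram is the mixture `(1 - alpha) * p + alpha * u`, where `p` is the plaintext histogram and `u` is uniform. Under these assumptions, convexity of KL divergence gives the bound D(mix || u) <= (1 - alpha) * D(p || u); whether this is the specific classical bound the paper specializes is an assumption. The function names (`byte_histogram`, `mixture_histogram`, `kl_to_uniform`) are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter
from typing import List


def byte_histogram(data: bytes) -> List[float]:
    """Normalized 256-bin histogram of byte values in `data`."""
    counts = Counter(data)
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def mixture_histogram(plain: List[float], alpha: float) -> List[float]:
    """Histogram of a file whose fraction `alpha` is encrypted.

    Assumption: encrypted bytes are uniform over 0..255, so the global
    histogram is the convex combination (1 - alpha) * plain + alpha * uniform.
    """
    uniform = 1.0 / 256
    return [(1.0 - alpha) * p + alpha * uniform for p in plain]


def kl_to_uniform(hist: List[float]) -> float:
    """KL divergence D(hist || uniform) in bits; equals 8 minus the entropy."""
    uniform = 1.0 / 256
    return sum(p * math.log2(p / uniform) for p in hist if p > 0)


if __name__ == "__main__":
    # Illustrative plaintext only; real files would be read from disk.
    plain = byte_histogram(b"The quick brown fox jumps over the lazy dog. " * 200)
    for alpha in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
        observed = kl_to_uniform(mixture_histogram(plain, alpha))
        ceiling = (1.0 - alpha) * kl_to_uniform(plain)
        print(f"alpha={alpha:.2f}  D(mix||u)={observed:.4f}  bound={ceiling:.4f}")
```

The sketch illustrates why a global histogram carries a limited signal: as the plaintext histogram of a file type approaches uniform (as it does for compressed formats), the ceiling on the divergence shrinks for every encrypted fraction, which is consistent with the filetype-specific detectability ceilings the abstract describes.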