Building a Robust Risk-Based Access Control System to Combat Ransomware's Capability to Encrypt: A Machine Learning Approach

Ransomware's core capability, unauthorized encryption, demands controls that identify and block malicious cryptographic activity without disrupting legitimate use. We present a probabilistic, risk-based access control architecture that couples machine learning inference with mandatory access control to regulate encryption on Linux in real time. The system builds a specialized dataset with the function_graph tracer of the native ftrace framework, yielding high-resolution kernel-function execution traces augmented with resource and I/O counters. These traces support both a supervised classifier and interpretable rules that drive an SELinux policy through lightweight booleans, enabling context-sensitive permit/deny decisions at the moment encryption begins. Function-level tracing provides finer behavioral granularity than coarse system-call telemetry while avoiding the virtualization overhead of sandboxing and hypervisor introspection (VMI). Our current user-space prototype has a non-trivial footprint under burst I/O; we quantify this footprint and recognize that a production kernel-space implementation should aim to reduce it. We detail dataset construction, model training and rule extraction, and the run-time integration that gates file writes for suspect encryption while preserving benign cryptographic workflows. In our evaluation, the two-layer composition retains model-level detection quality while delivering rule-like responsiveness; we also outline engineering steps to reduce CPU and memory overhead for enterprise deployment. The result is a practical path from behavioral tracing and learning to enforceable, explainable, and risk-proportionate encryption control on production Linux systems.
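To make the pipeline from traces to enforcement concrete, the following is a minimal sketch and not the system's actual implementation. It counts kernel functions in function_graph-style text, derives a toy risk score, and builds the `setsebool` command that would flip a policy boolean. The weights, the 0.5 threshold, and the boolean name `allow_bulk_encrypt` are hypothetical placeholders; the weighted-average score merely stands in for a trained classifier or an extracted rule.

```python
import re
from collections import Counter

# Matches function_graph entries such as
#   " 0)   0.210 us    |    vfs_write();"   (leaf call)
#   " 0)               |  ksys_write() {"   (call with nested children)
_CALL_RE = re.compile(r"\|\s*([A-Za-z_][A-Za-z0-9_.]*)\(\)\s*(?:;|\{)")


def count_calls(trace_text):
    """Count how often each kernel function appears in the trace text."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = _CALL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def risk_score(counts, weights):
    """Toy score in [0, 1]: weighted share of calls (assumed weights)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    raw = sum(weights.get(fn, 0.0) * n for fn, n in counts.items()) / total
    return max(0.0, min(1.0, raw))


def selinux_command(score, boolean_name="allow_bulk_encrypt", threshold=0.5):
    """Build (but do not run) the setsebool call for a hypothetical boolean."""
    state = "off" if score >= threshold else "on"
    return ["setsebool", boolean_name, state]


sample = """# tracer: function_graph
 0)               |  ksys_write() {
 0)   0.210 us    |    crypto_skcipher_encrypt();
 0)   0.180 us    |    crypto_skcipher_encrypt();
 0)   1.950 us    |  }
"""

counts = count_calls(sample)
score = risk_score(counts, {"crypto_skcipher_encrypt": 0.9})
command = selinux_command(score)
```

In a real deployment the command list would be passed to something like `subprocess.run` with root privileges, and the policy would need a boolean of that name defined; both steps are omitted here because they depend on the target system.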
