Analysing malware is important for understanding how malicious software works and for developing appropriate detection and prevention methods. Dynamic analysis can overcome the evasion techniques commonly used to bypass static analysis and provides insight into malware's runtime activities. Much research on dynamic analysis has focused on machine-level information (e.g., CPU, memory, and network usage) to identify whether a machine is running malicious activities. However, a machine flagged as malicious is not necessarily running only malicious processes. If the malicious process can be isolated instead of the whole machine, that process can be killed while the machine keeps doing its job. Another challenge in dynamic malware detection research is that samples are typically executed on a machine with no background applications running. This is unrealistic, as a computer typically runs many benign background applications when a malware incident happens. Our experiment with machine-level data shows that the presence of background applications decreases the accuracy of previous state-of-the-art approaches by about 20.12% on average. We also propose a process-level Recurrent Neural Network (RNN)-based detection model. The proposed model performs better than the machine-level detection model, with a 0.049 increase in detection rate and a false-positive rate below 0.1.
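As an illustration of the two metrics reported above, the following minimal sketch computes a detection rate and a false-positive rate from per-process binary labels. It assumes that "detection rate" means the true-positive rate on the malicious class, which is the conventional reading but is not defined in this text. The function name and the sample labels are hypothetical and are not taken from the experiments.

```python
from typing import Sequence, Tuple


def detection_and_false_positive_rate(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (detection rate, false-positive rate) for binary labels.

    Labels use 1 for a malicious process and 0 for a benign one.
    Detection rate is assumed to be TP / (TP + FN);
    false-positive rate is FP / (FP + TN).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Hypothetical per-process ground truth and predictions (illustration only).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

dr, fpr = detection_and_false_positive_rate(labels, predictions)
print(f"detection rate={dr:.2f}, false-positive rate={fpr:.2f}")
```

Evaluating per process rather than per machine means each label refers to a single running process, so benign background applications contribute to the false-positive rate directly.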