Embedded devices are specialised devices designed for one or only a few purposes. They are often part of a larger system, to which they are connected through a wired or wireless link. Embedded devices that are connected to other computers or embedded systems through the Internet are called Internet of Things (IoT) devices. Because they are widely deployed and insufficiently protected, these devices are increasingly becoming the target of malware attacks. Manufacturers often cut corners to save costs, or misconfigure the devices during production. The resulting weaknesses include missing software updates, ports left open, and security defects in the design itself. Although these devices may not be as powerful as a regular computer, their large number makes them suitable candidates for botnets. Some IoT devices can even put people's health at risk: pacemakers, for example, can be connected to the Internet. This means that, without sufficient defence, even targeted attacks against individual people are possible. The goal of this thesis project is to improve the security of these devices with the help of machine learning algorithms and reverse engineering tools. Specifically, I study the applicability of control-flow-related data of executables to malware detection. I present a malware detection method with two phases. The first phase extracts control-flow-related data from binary executables using static binary analysis. The second phase uses a neural network model to classify each executable as either malicious or benign. I train the model on a dataset of malicious and benign ARM applications.
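To make the two-phase idea concrete, the following is a minimal sketch and not the method developed in the thesis. It assumes that the "control-flow related data" can be approximated by counting branch instructions in raw little-endian ARM (A32) machine code, and it uses a single logistic unit with placeholder weights as a stand-in for the neural network. The function names (`branch_features`, `feature_vector`, `score`), the choice of features, and the weights are all hypothetical. A real pipeline would use a proper disassembler, handle Thumb code, separate code from data, and learn the weights from a labelled dataset.

```python
import math
import struct


def branch_features(code: bytes) -> dict:
    """Phase 1 (toy): count branch instructions in little-endian A32 code."""
    counts = {"total": 0, "b": 0, "bl": 0, "bx": 0, "conditional": 0}
    usable = len(code) - len(code) % 4  # ignore a trailing partial word
    for (word,) in struct.iter_unpack("<I", code[:usable]):
        counts["total"] += 1
        cond = word >> 28
        if cond == 0xF:
            continue  # unconditional-space encodings are skipped in this sketch
        if (word & 0x0E000000) == 0x0A000000:
            # B / BL (immediate): bits 27..25 are 101, bit 24 is the link bit
            kind = "bl" if word & 0x01000000 else "b"
        elif (word & 0x0FFFFFF0) == 0x012FFF10:
            kind = "bx"  # BX Rm (register branch, e.g. function return)
        else:
            continue
        counts[kind] += 1
        if cond != 0xE:  # 0xE is the "always" condition
            counts["conditional"] += 1
    return counts


def feature_vector(counts: dict) -> list:
    """Normalise the counts by the number of decoded instruction words."""
    total = max(counts["total"], 1)
    return [counts[k] / total for k in ("b", "bl", "bx", "conditional")]


def score(features: list, weights: list, bias: float) -> float:
    """Phase 2 (stand-in): one logistic unit; higher means more suspicious."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # mov r0, #0 ; bl <target> ; bne <target> ; bx lr ; b .
    sample = struct.pack(
        "<5I", 0xE3A00000, 0xEB000001, 0x1A000000, 0xE12FFF1E, 0xEAFFFFFE
    )
    counts = branch_features(sample)
    # Placeholder weights: untrained, for illustration only.
    print(counts, score(feature_vector(counts), [0.0, 0.0, 0.0, 0.0], 0.0))
```

Branch counts are only one of many possible control-flow descriptors. They are used here because they can be computed without building a full control-flow graph, which keeps the sketch self-contained.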