Static analysis is sound in theory, but an implementation may unsoundly fail to analyze all of a program's code. Any such omission is a serious threat to the validity of the tool's output. Our work is the first to measure the prevalence of these omissions. Previously, researchers and analysts did not know what static analysis misses, what sort of code is missed, or the reasons behind these omissions. To address this gap, we ran 13 static analysis tools and a dynamic analysis on 1000 Android apps. Any method observed by the dynamic analysis but absent from a static analysis is evidence of unsoundness. Our findings include the following. (1) Apps built around external frameworks challenge static analyzers. On average, the 13 static analysis tools failed to capture 61% of the dynamically executed methods. (2) A high level of precision in call graph construction goes hand in hand with a high level of unsoundness. (3) No existing approach significantly improves static analysis soundness. This includes approaches tailored to a specific mechanism, such as DroidRA, which addresses reflection. It also includes systematic approaches, such as EdgeMiner, which captures all callbacks in the Android framework. (4) Modeling entry point methods challenges call graph construction, which jeopardizes soundness.
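The comparison described above (methods seen at run time but missing from a tool's result) can be illustrated with a small set-difference sketch. This is only a minimal illustration under stated assumptions, not the study's actual pipeline: the method names, the tool names `tool_a` and `tool_b`, and the helper functions are all hypothetical, and the abstract does not say how the 61% average was aggregated, so the plain mean over tools used here is an assumption.

```python
from statistics import mean


def missed_methods(dynamic, static):
    """Return methods executed at run time that the static result lacks."""
    return set(dynamic) - set(static)


def unsoundness_rate(dynamic, static):
    """Fraction of dynamically executed methods missing from the static result."""
    dynamic = set(dynamic)
    if not dynamic:
        return 0.0  # nothing was executed, so nothing can be missed
    return len(missed_methods(dynamic, static)) / len(dynamic)


# Hypothetical data: methods observed while running one app.
executed = {
    "com.app.Main.onCreate",
    "com.app.Net.fetch",
    "com.lib.Json.parse",
    "com.app.Ui.onClick",
}

# Hypothetical data: methods each static tool reports as reachable.
tool_results = {
    "tool_a": {"com.app.Main.onCreate", "com.app.Net.fetch"},
    "tool_b": {
        "com.app.Main.onCreate",
        "com.app.Net.fetch",
        "com.app.Ui.onClick",
        "com.app.Dead.unused",  # extra method: affects precision, not soundness
    },
}

rates = {tool: unsoundness_rate(executed, reached)
         for tool, reached in tool_results.items()}
average_rate = mean(rates.values())  # assumed aggregation: plain mean over tools

for tool, rate in sorted(rates.items()):
    print(f"{tool}: missed {rate:.0%} of executed methods")
print(f"average: {average_rate:.1%}")
```

Methods that a tool reports but that never execute (such as the hypothetical `com.app.Dead.unused`) do not count against soundness in this sketch. They relate to precision, which is the other side of the trade-off noted in finding (2).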