A statistical approach for finding property-access errors

We study the problem of finding incorrect property accesses in JavaScript, where objects do not have a fixed layout and properties (including methods) can be added, overwritten, and deleted freely throughout the lifetime of an object. Since referencing a non-existent property is not an error in JavaScript, accidental accesses to non-existent properties (caused, perhaps, by a typo or by a misunderstanding of API documentation) can go undetected without thorough testing, and may manifest far from the source of the problem. We propose a two-phase approach for detecting property-access errors, based on the observation that, in practice, most property accesses are correct. First, a large number of property-access patterns are collected from an extensive corpus of real-world JavaScript code, and a statistical analysis is performed to identify anomalous usage patterns. Specific instances of these patterns may not be bugs (due, e.g., to dynamic type checks), so a local data-flow analysis then filters out instances of anomalous property accesses that are safe, leaving only those likely to be actual bugs. We experimentally validate our approach, showing that on a set of 100 concrete instances of anomalous property accesses, it achieves a precision of 82% with a recall of 90%, making it suitable for practical use. We also conducted an experiment to determine how effective the popular VSCode code completion feature is at suggesting object properties, and found that, while it never suggested an incorrect property (precision of 100%), it failed to suggest the correct property in 62 out of 80 cases (recall of 22.5%). This shows that developers cannot rely on VSCode's code completion alone to ensure that all property accesses are valid.
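To make the problem concrete, the following minimal sketch shows how a misspelled property name goes unnoticed at the point of access, and how a dynamic type check can make an otherwise suspicious access safe. The object and the names `response`, `stauts`, `isSuccess`, and `lengthOf` are hypothetical and chosen only for illustration; they are not taken from the corpus or the analysis described above.

```javascript
// Hypothetical object, used only to illustrate the problem.
const response = { status: 200, body: "ok" };

// Typo: "stauts" instead of "status". JavaScript raises no error here;
// reading a non-existent property simply evaluates to undefined.
const code = response.stauts;

// The mistake surfaces only later, away from the typo: comparing
// undefined with a number is always false, so the check quietly fails.
function isSuccess(statusCode) {
  return statusCode >= 200 && statusCode < 300;
}
const ok = isSuccess(code);

// An access that may look anomalous in isolation (reading `length` on a
// value that is sometimes a number) but is safe, because the typeof
// check guarantees it is only reached for strings.
function lengthOf(value) {
  if (typeof value === "string") {
    return value.length;
  }
  return 0;
}

console.log(code, ok, lengthOf("abc"), lengthOf(42));
```

The first access is the kind of instance the approach aims to report, while the guarded access in `lengthOf` is the kind of instance a filtering step would be expected to discard.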
