We introduce a new challenge to the software development community: 1) leveraging AI to accurately detect and flag secrets in code and on popular document-sharing platforms that are frequently used by developers, such as Confluence, and 2) automatically remediating the detections (e.g., by suggesting password vault functionality). This is a challenging and largely unaddressed task. Existing methods rely on heuristics and regular expressions, which can be very noisy and therefore increase toil for developers. The next step, modifying the code itself to automatically remediate a detection, is a complex task. We introduce two baseline AI models that achieve good detection performance and propose an automatic mechanism for remediating secrets found in code, opening up the study of this task to the wider community.