WELL: Applying Bug Detectors to Bug Localization via Weakly Supervised Learning

May 27, 2023

Zhuo Li, Huangzhao Zhang, Zhi Jin, Ge Li

From arXiv (preprint). Software Engineering; Deep Learning; Bug Detection & Localization

Bug localization is a key software development task in which a developer, working from a bug report, locates the portion of the source code that must be modified. It is labor-intensive and time-consuming because of the growing size and complexity of modern software. Automating this task effectively can greatly reduce costs by cutting down developers' effort. Researchers have already tried to harness the power of deep learning (DL) to automate bug localization. However, training DL models demands a large quantity of annotated training data, and datasets annotated with buggy locations are difficult to collect in reasonable quality and quantity. This is an obstacle to the effective use of DL for bug localization. We notice that data pairs for bug detection, which provide weak buggy-or-not binary classification supervision, are much easier to obtain. Inspired by weakly supervised learning, this paper proposes WEakly supervised bug LocaLization (WELL), an approach that transforms bug detectors into bug locators. Using a CodeBERT model fine-tuned on bug detection, WELL is capable of locating bugs in a weakly supervised manner based on attention. Evaluations of WELL on three datasets show performance competitive with existing strongly supervised DL solutions. WELL even outperforms current state-of-the-art (SOTA) models on the variable misuse and binary operator misuse tasks.
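
The idea of reading bug locations off a detector's attention can be illustrated with a minimal sketch. The abstract only states that localization is based on attention; which layer is used, averaging over heads, and picking the top-k tokens are all assumptions made here for illustration, and the function name `localize_from_attention` and the toy attention values are hypothetical. This is not the authors' implementation.

```python
from typing import List, Sequence

# Special tokens that should never be reported as a bug location (assumed set).
SPECIAL_TOKENS = {"<s>", "</s>", "<pad>"}


def localize_from_attention(
    tokens: Sequence[str],
    cls_attention: Sequence[Sequence[float]],
    top_k: int = 1,
) -> List[int]:
    """Return the indices of the top_k most-attended non-special tokens.

    cls_attention holds one row per attention head; each row is the attention
    that the first (classification) token pays to every input token.
    Averaging over heads is an assumption of this sketch.
    """
    n_heads = len(cls_attention)
    # Average the per-head attention to get one score per token.
    scores = [
        sum(head[i] for head in cls_attention) / n_heads
        for i in range(len(tokens))
    ]
    # Rank candidate positions by score, highest first.
    candidates = [i for i, tok in enumerate(tokens) if tok not in SPECIAL_TOKENS]
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


# Toy example: a variable-misuse bug where the second `a` should be `b`.
tokens = ["<s>", "if", "a", ">", "a", ":", "</s>"]
# Made-up attention rows for two heads (not real model output).
heads = [
    [0.30, 0.05, 0.10, 0.10, 0.40, 0.03, 0.02],
    [0.20, 0.05, 0.15, 0.05, 0.50, 0.03, 0.02],
]
predicted = localize_from_attention(tokens, heads, top_k=2)
```

In a real pipeline, the attention rows would come from a classifier fine-tuned only on buggy-or-not labels, so no location annotations are needed at training time.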
