NL2CMD: An Updated Workflow for Natural Language to Bash Commands Translation

February 15, 2023

Quchen Fu, Zhongwei Teng, Marco Georgaklis, Jules White, Douglas C. Schmidt

Translating natural language into Bash commands is an emerging research field that has gained attention in recent years. Most efforts have focused on producing more accurate translation models. To the best of our knowledge, only two datasets are available, one of which is derived from the other. Both were built by scraping known data sources (such as Stack Overflow and crowdsourcing platforms) and hiring experts to validate and correct either the English text or the Bash commands. This paper makes two contributions to research on synthesizing Bash commands from scratch. First, we describe a state-of-the-art translation model that generates Bash commands from the corresponding English text. Second, we introduce a new NL2CMD dataset that is generated automatically, involves minimal human intervention, and is over six times larger than prior datasets. Because the generation pipeline does not rely on existing Bash commands, the distribution and types of commands can be customized. Our empirical results show how the scale and diversity of our dataset can offer unique opportunities for semantic parsing researchers.
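The abstract does not detail the generation pipeline, so the following is only a minimal, hypothetical sketch of why a generator that does not start from existing commands makes the command distribution adjustable. The mini-grammar `GRAMMAR`, the functions `sample_command` and `generate`, and the weight values are all assumptions made for illustration; they are not the authors' method or data.

```python
import random

# Hypothetical mini-grammar (an assumption, not taken from the paper):
# utility name -> (command head, optional flags in a valid order, trailing args)
GRAMMAR = {
    "ls": ("ls", ["-l", "-a", "-h"], ""),
    "grep": ("grep", ["-i", "-n"], "TODO notes.txt"),
    "find": ("find .", ["-maxdepth 1", "-type f"], ""),
}


def sample_command(rng, weights):
    """Draw one utility according to `weights`, then keep each flag with probability 0.5."""
    utilities = list(GRAMMAR)
    utility = rng.choices(utilities, weights=[weights[u] for u in utilities], k=1)[0]
    head, flags, tail = GRAMMAR[utility]
    chosen = [flag for flag in flags if rng.random() < 0.5]
    return " ".join(part for part in [head, *chosen, tail] if part)


def generate(n, weights, seed=0):
    """Generate n synthetic commands; changing `weights` reshapes the distribution."""
    rng = random.Random(seed)
    return [sample_command(rng, weights) for _ in range(n)]


# Example: skew the synthetic dataset heavily toward `find`.
commands = generate(1000, {"ls": 1, "grep": 1, "find": 8})
```

Because the weights are plain parameters, rebalancing the dataset toward rare utilities only requires changing the dictionary passed to `generate`, which is not possible when commands are scraped from a fixed source.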

Bash

The Bourne Again Shell (Bash) is a UNIX shell written by Brian Fox in 1989 for the GNU Project as a replacement for the Bourne shell (sh). http://en.wikipedia.org/wiki/Bash_(Unix_shell)
