基于大语言模型的长代码翻译：标识符替换方法 (LLM Based Long Code Translation using Identifier Replacement) - 专知论文

会员服务 ·

0

代码 · 可辨认的 · 可约的 · 大语言模型 · Automator ·

2025 年 10 月 31 日

LLM Based Long Code Translation using Identifier Replacement

翻译：基于大语言模型的长代码翻译：标识符替换方法

Manojit Chakraborty,Madhusudan Ghosh,Rishabh Gupta

In the domain of software development, LLMs have been utilized to automate tasks such as code translation, where source code from one programming language is translated to another while preserving its functionality. However, LLMs often struggle with long source codes that don't fit into the context window, which produces inaccurate translations. To address this, we propose a novel zero-shot code translation method that incorporates identifier replacement. By substituting user-given long identifiers with generalized placeholders during translation, our method allows the LLM to focus on the logical structure of the code, by reducing token count and memory usage, which improves the efficiency and cost-effectiveness of long code translation. Our empirical results demonstrate that our approach preserves syntactical and hierarchical information and produces translation results with reduced tokens.

翻译：在软件开发领域，大语言模型已被用于自动化代码翻译等任务，即将源代码从一种编程语言转换为另一种语言，同时保持其功能不变。然而，大语言模型在处理超出上下文窗口长度的长源代码时，往往难以生成准确的翻译结果。为解决这一问题，本文提出了一种新颖的零样本代码翻译方法，该方法引入了标识符替换机制。通过在翻译过程中将用户给定的长标识符替换为通用占位符，本方法使大语言模型能够专注于代码的逻辑结构，同时减少令牌数量和内存占用，从而提升长代码翻译的效率和成本效益。实验结果表明，该方法能够有效保留语法和层次结构信息，并生成令牌数更少的翻译结果。

0

相关内容

代码（Code）是专知网的一个重要知识资料文档板块，旨在整理收录论文源代码、复现代码，经典工程代码等，便于用户查阅下载使用。

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Arxiv

12+阅读 · 2020年4月15日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Deep Anomaly Detection with Outlier Exposure

Deep Anomaly Detection with Outlier Exposure

Arxiv

17+阅读 · 2018年12月21日

Approaches for Enriching and Improving Textual Knowledge Bases

Arxiv

15+阅读 · 2018年4月20日

Image Captioning using Deep Neural Architectures

Arxiv

20+阅读 · 2018年1月17日

VIP会员

文章信息

相关主题

大语言模型

相关VIP内容

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

热门VIP内容

开通专知VIP会员享更多权益服务

机器人领域的视频生成模型：应用、研究挑战与未来展望

《面向小规模遥感应用引入思维链推理与多模态小语言模型》

DeepSeek开源大模型「记忆」模块，D梁文锋署名新论文，下一代稀疏模型提前剧透

网络中心战：未来冲突

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

IJCAI | Cascade Dynamics Modeling with Attention-based RNN

KingsGarden

13+阅读 · 2017年7月16日

相关论文

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Arxiv

12+阅读 · 2020年4月15日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Deep Anomaly Detection with Outlier Exposure

Deep Anomaly Detection with Outlier Exposure

Arxiv

17+阅读 · 2018年12月21日

Approaches for Enriching and Improving Textual Knowledge Bases

Arxiv

15+阅读 · 2018年4月20日

Image Captioning using Deep Neural Architectures

Arxiv

20+阅读 · 2018年1月17日

相关基金

城市“建成环境——空间行为”的多尺度影响关系与机理研究

国家自然科学基金

13+阅读 · 2017年12月31日

Musielak-Orlicz-Sobolev 空间中的迹嵌入及其应用

国家自然科学基金

2+阅读 · 2015年12月31日

Schr？dinger-Poisson方程守恒DDG方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

动态Gr？bner 基与GVW算法

国家自然科学基金

0+阅读 · 2014年12月31日

海量Web用户生成内容物化关键技术

国家自然科学基金

2+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员