Correctness and readability are key measures of code quality: correctness ensures functional fidelity, and readability ensures ease of comprehension. While most existing research focuses on improving the correctness of code generated by large language models~(LLMs), readability remains under-addressed. Enhancing readability through targeted control is challenging because readability is subjective. In this article, we employ representation engineering~(RepE) as the targeted control method because of its low data dependency and low computational cost. Prior work on RepE has primarily focused on targeted control for a single task, whereas improving code readability requires control across multiple tasks. Accordingly, we propose a multitask RepE framework and theoretically discuss the impact of the multitask steering method on the tradeoff between code readability and correctness. We further provide comprehensive supporting experiments. All relevant implementations are open-source and available upon request.