Improving Code Example Recommendations on Informal Documentation Using BERT and Query-Aware LSH: A Comparative Study

We investigate the recommendation of code examples to software developers, a practice that saves significant time by providing ready-to-use code snippets. Our study focuses on Stack Overflow, a widely used resource for coding discussions and solutions, and specifically on posts about the Java programming language. We apply BERT, a Large Language Model (LLM), to transform code examples into numerical vectors that capture their semantic information. Given these vector representations, we identify Approximate Nearest Neighbors (ANN) using Locality-Sensitive Hashing (LSH). We employ two variants of LSH: Random Hyperplane-based LSH and Query-Aware LSH. We compare the two approaches on four metrics: HitRate, Mean Reciprocal Rank (MRR), Average Execution Time, and Relevance. The Query-Aware (QA) approach outperforms the Random Hyperplane-based (RH) method. Specifically, it improves HitRate for query pairs by 20% to 35% over the RH approach. The QA approach is also significantly more time-efficient: it creates hash tables and assigns data samples to buckets at least four times faster. It returns code examples within milliseconds, whereas the RH approach typically requires several seconds to recommend code examples. Given the superior performance of the QA approach, we tested it against PostFinder and FaCoY, the state-of-the-art baselines. Our QA method showed comparable efficiency, demonstrating its potential for effective code recommendation.
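To make the pipeline described above concrete, the following is a minimal sketch of Random Hyperplane-based LSH together with HitRate and MRR, written in plain Python. It is an illustration under stated assumptions, not the implementation used in the study: the small hand-written vectors stand in for BERT embeddings, all function names (make_hyperplanes, signature, build_table, query, hit_rate_and_mrr) are hypothetical, a single hash table is used, and HitRate and MRR follow their common definitions (fraction of queries with a relevant item in the returned list, and mean of 1/rank of the first relevant item), which may differ in detail from those used in the study. The Query-Aware variant is not shown.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals from a standard Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def build_table(vectors, hyperplanes):
    """Assign every vector index to the bucket named by its bit signature."""
    table = defaultdict(list)
    for idx, vec in enumerate(vectors):
        table[signature(vec, hyperplanes)].append(idx)
    return table


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def query(q, vectors, table, hyperplanes, top_k=3):
    """Look up the query's bucket, then rank its members by cosine similarity."""
    candidates = table.get(signature(q, hyperplanes), [])
    ranked = sorted(candidates, key=lambda i: cosine(q, vectors[i]), reverse=True)
    return ranked[:top_k]


def hit_rate_and_mrr(ranked_lists, relevant_ids):
    """HitRate and MRR over queries, each with exactly one relevant item id."""
    hits, reciprocal_sum = 0, 0.0
    for ranked, relevant in zip(ranked_lists, relevant_ids):
        if relevant in ranked:
            hits += 1
            reciprocal_sum += 1.0 / (ranked.index(relevant) + 1)
    n = len(relevant_ids)
    return hits / n, reciprocal_sum / n


if __name__ == "__main__":
    # Toy vectors standing in for BERT embeddings of code examples (assumption).
    vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
    planes = make_hyperplanes(num_bits=4, dim=3, seed=0)
    table = build_table(vectors, planes)
    print(query([1.0, 0.0, 0.0], vectors, table, planes))
```

Because a signature depends only on the sign of each dot product, vectors pointing in similar directions tend to share a bucket, while a vector and its negation never do. Using more bits yields smaller, purer buckets at the cost of missing more true neighbors, which is why practical setups typically combine several tables.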
