Stack Overflow, the world's largest software Q&A (SQA) website, is facing a significant traffic drop due to the emergence of generative AI techniques. Stack Overflow banned ChatGPT only 6 days after its release. The main reason Stack Overflow officially gave is that ChatGPT-generated answers are of low quality. To verify this claim, we conduct a comparative evaluation of human-written and ChatGPT-generated answers. Our methodology combines automatic comparison with a manual study. Our results suggest that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers consistently outperform ChatGPT-generated ones across multiple aspects, specifically by 10% on the overall score. We release the data, analysis scripts, and detailed results at https://anonymous.4open.science/r/GAI4SQA-FD5C.