Generative AI has made significant strides, yet concerns about the accuracy and reliability of its outputs continue to grow. Inaccurate outputs can have serious consequences, including poor decision-making, the spread of false information, privacy violations, and legal liabilities. Although efforts to address these risks are underway, including explainable AI and responsible AI practices such as transparency, privacy protection, bias mitigation, and social and environmental responsibility, misinformation caused by generative AI will remain a significant challenge. We argue that verifying the outputs of generative AI from a data management perspective is an emerging issue. This involves analyzing the underlying data from multi-modal data lakes, including text files, tables, and knowledge graphs, and assessing its quality and consistency. Doing so establishes a stronger foundation for evaluating the outputs of generative AI models. Such an approach can help ensure the correctness of generative AI outputs, promote transparency, and enable decision-making with greater confidence. Our vision is to promote the development of verifiable generative AI and contribute to a more trustworthy and responsible use of AI.