Research on generative models that produce human-aligned (human-preferred) outputs has seen significant recent contributions. Of text and image generative models, we focus on text-based models, specifically on generating image captions that align with human preferences. We explore a method for improving the ability of a deep neural network to generate captions that humans prefer. The method integrates supervised learning with Reinforcement Learning from Human Feedback (RLHF), using the Flickr8k dataset. We also introduce a novel loss function that optimizes the model based on human feedback. This paper gives a concise sketch of our approach and results, in the hope of contributing to ongoing advances in human-aligned generative AI models.