Human feedback plays a critical role in learning and refining reward models for text-to-image generation, but the optimal form this feedback should take for learning an accurate reward function has not been conclusively established. This paper investigates the effectiveness of fine-grained feedback, which captures nuanced distinctions in image quality and prompt alignment, compared to traditional coarse-grained feedback (for example, thumbs up/down or ranking among a set of options). While fine-grained feedback holds promise, particularly for systems catering to diverse societal preferences, we show that its superiority over coarse-grained feedback is not automatic. Through experiments on real and synthetic preference data, we surface the complexities of building effective models, which arise from the interplay of model choice, feedback type, and the alignment between human judgment and computational interpretation. We identify key challenges in eliciting and utilizing fine-grained feedback, prompting a reassessment of its assumed benefits and practicality. Our findings, for example that fine-grained feedback can lead to worse models for a fixed budget in some settings, whereas in controlled settings with known attributes fine-grained rewards can indeed be more helpful, call for careful consideration of feedback attributes and potentially motivate novel modeling approaches to appropriately unlock the value of fine-grained feedback in the wild.
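To make the contrast between the two feedback regimes concrete, the following is a minimal sketch (not the paper's implementation) of how each is commonly supervised: a coarse-grained reward model fits one scalar score from overall pairwise preferences, while a fine-grained model fits a separate score per attribute, each head trained on per-attribute comparisons. The model names, attribute list, and feature dimension below are assumptions for illustration; both variants use a standard Bradley-Terry pairwise objective, and image-prompt pairs are assumed to be pre-encoded as fixed-size feature vectors.

```python
# A minimal sketch contrasting coarse- and fine-grained reward supervision.
# All names here (models, attributes, dimensions) are illustrative, not the
# paper's own code; inputs are assumed to be pre-computed feature vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM = 512                                      # hypothetical embedding size
ATTRIBUTES = ["image_quality", "prompt_alignment"]  # hypothetical fine-grained axes

class CoarseRewardModel(nn.Module):
    """Single scalar reward, supervised by overall pairwise preferences."""
    def __init__(self, dim: int = FEAT_DIM):
        super().__init__()
        self.head = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(x).squeeze(-1)             # (batch,) overall scores

class FineGrainedRewardModel(nn.Module):
    """One reward head per attribute, supervised by per-attribute preferences."""
    def __init__(self, dim: int = FEAT_DIM, n_attrs: int = len(ATTRIBUTES)):
        super().__init__()
        self.heads = nn.Linear(dim, n_attrs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.heads(x)                        # (batch, n_attrs) scores

def bradley_terry_loss(r_pref: torch.Tensor, r_other: torch.Tensor) -> torch.Tensor:
    """Preferred item should outscore the other; broadcasts over attribute axes."""
    return -F.logsigmoid(r_pref - r_other).mean()

if __name__ == "__main__":
    x_pref, x_other = torch.randn(8, FEAT_DIM), torch.randn(8, FEAT_DIM)
    coarse, fine = CoarseRewardModel(), FineGrainedRewardModel()
    # Coarse: one comparison per pair; fine-grained: one comparison per attribute.
    print(bradley_terry_loss(coarse(x_pref), coarse(x_other)).item())
    print(bradley_terry_loss(fine(x_pref), fine(x_other)).item())
```

One plausible reading of the fixed-budget finding is visible in this setup: the fine-grained variant consumes one label per attribute per comparison, so the same annotation budget buys fewer total comparisons than in the coarse-grained case.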