Smart glasses are accelerating progress toward more seamless and personalized LLM-based assistance by integrating multimodal inputs. Yet current assistants still rely on obtrusive, explicit prompts. The advent of gaze tracking on smart devices offers a unique opportunity to capture implicit user intent for personalization. This paper investigates whether LLMs can interpret user gaze in text-based tasks. We evaluate different gaze representations for personalization and validate their effectiveness in realistic reading tasks. Results show that LLMs can leverage gaze to generate high-quality personalized summaries and support users in downstream tasks, highlighting the feasibility and value of gaze-driven personalization for future mobile and wearable LLM applications.