Smartphones have become essential to people's digital lives, providing a continuous stream of information and connectivity. However, this constant flow can lead to moments where users are simply passing time rather than engaging meaningfully. Identifying these "time-killing" moments would allow important notifications to be delivered in a way that minimizes interruptions and enhances user engagement. Recent work has used screenshots taken every 5 seconds to detect time-killing activities on smartphones. However, this method often fails to capture phone usage that occurs between screenshots. We demonstrate that up to 50% of time-killing instances go undetected when relying on screenshots, leaving substantial gaps in the understanding of user behavior. To address this limitation, we propose ScreenTK, a method that detects time-killing moments by combining continuous screen text monitoring with on-device large language models (LLMs). Screen text contains more comprehensive information than screenshots and allows LLMs to summarize phone usage in detail. To evaluate our framework, we conducted experiments with six participants, capturing 1,034 records of different time-killing moments. Initial results show that our framework outperforms state-of-the-art solutions by 38% in our case study.
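The sampling gap described in the abstract can be illustrated with a minimal sketch. It is not the ScreenTK implementation: the function `sampled_hits`, the event list, and the 25-second horizon are hypothetical, chosen only to show how a capture taken every 5 seconds can miss short usage episodes that a much finer sampling period would catch.

```python
from typing import List, Tuple

# A usage episode as a half-open interval (start_s, end_s), in seconds.
Event = Tuple[float, float]


def sampled_hits(events: List[Event], period_s: float, horizon_s: float) -> List[Event]:
    """Return the events that contain at least one sampling instant.

    Sampling instants are t = 0, period_s, 2 * period_s, ... up to horizon_s.
    An event is "seen" only if some instant falls inside [start, end).
    """
    n_ticks = int(horizon_s // period_s) + 1
    ticks = [k * period_s for k in range(n_ticks)]
    return [e for e in events if any(e[0] <= t < e[1] for t in ticks)]


# Hypothetical episodes (illustrative values, not data from the study).
events: List[Event] = [
    (1.0, 4.0),    # falls entirely between the 0 s and 5 s captures
    (6.0, 9.5),    # falls entirely between the 5 s and 10 s captures
    (10.0, 12.0),  # starts exactly on a capture instant
    (13.0, 14.5),  # falls entirely between the 10 s and 15 s captures
    (15.0, 21.0),  # long enough to span two capture instants
]

detected = sampled_hits(events, period_s=5.0, horizon_s=25.0)
missed = [e for e in events if e not in detected]

print("detected:", detected)
print("missed:  ", missed)
```

With these made-up episodes, the 5-second schedule sees only two of the five, while calling `sampled_hits(events, period_s=0.5, horizon_s=25.0)` sees all of them. The proportions here are an artifact of the chosen values and say nothing about the 50% figure reported above.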