Complex human activity recognition (CHAR) remains a pivotal challenge in ubiquitous computing, especially in the context of smart environments. Existing studies typically require meticulous labeling of both atomic and complex activities, a labor-intensive task that is prone to errors given the scarcity and inaccuracies of available datasets. Most prior research relies on datasets that precisely label atomic activities or, at minimum, their sequence, an approach that is often impractical in real-world settings. In response, we introduce VCHAR (Variance-Driven Complex Human Activity Recognition), a novel framework that treats the outputs of atomic activities as a distribution over specified intervals. Leveraging generative methodologies, VCHAR elucidates the reasoning behind complex activity classifications through video-based explanations accessible to users without prior machine learning expertise. Our evaluation across three publicly available datasets demonstrates that VCHAR enhances the accuracy of complex activity recognition without requiring precise temporal or sequential labeling of atomic activities. Furthermore, user studies confirm that VCHAR's explanations are more intelligible than those of existing methods, facilitating a broader understanding of complex activity recognition among non-experts.