This study advances aspect-based sentiment analysis (ABSA) for Persian-language user reviews in the tourism domain, addressing the challenges of working with a low-resource language. We propose a hybrid BERT-based model that uses Top-K routing with auxiliary losses to mitigate routing collapse and improve efficiency. The pipeline comprises three stages: (1) overall sentiment classification using BERT on 9,558 labeled reviews, (2) multi-label aspect extraction for six tourism-related aspects (host, price, location, amenities, cleanliness, connectivity), and (3) integrated ABSA with dynamic routing. The dataset consists of 58,473 preprocessed reviews from the Iranian accommodation platform Jabama, manually annotated for aspects and sentiments. The proposed model achieves a weighted F1-score of 90.6% on ABSA, outperforming both baseline BERT (89.25%) and a standard hybrid approach (85.7%). It also reduces GPU power consumption by 39% compared to dense BERT, supporting sustainable AI deployment in line with UN SDGs 9 and 12. Analysis of the annotations shows high mention rates for cleanliness and amenities, marking them as critical aspects for users. This is the first ABSA study focused on Persian tourism reviews, and we release the annotated dataset to facilitate future multilingual NLP research in tourism.
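The abstract does not specify the form of the router or of the auxiliary losses, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes a softmax router over a small set of experts and a load-balancing auxiliary loss in the style commonly used for sparse mixture-of-experts models; the function names `top_k_route` and `load_balance_loss` are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(logits, k):
    """Pick the k experts with the highest router probability for one token.

    Returns (expert_indices, weights), where the weights are the selected
    probabilities renormalized to sum to 1.
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return top, [probs[i] / mass for i in top]


def load_balance_loss(batch_logits, k):
    """Assumed auxiliary loss: n_experts * sum_e (f_e * p_e).

    f_e is the fraction of routing slots assigned to expert e and p_e is the
    mean router probability for expert e. The value is smallest when tokens
    are spread evenly, so adding it to the task loss discourages the router
    from collapsing onto a single expert.
    """
    n_tokens = len(batch_logits)
    n_experts = len(batch_logits[0])
    counts = [0] * n_experts
    mean_probs = [0.0] * n_experts
    for logits in batch_logits:
        top, _ = top_k_route(logits, k)
        for e in top:
            counts[e] += 1
        for e, p in enumerate(softmax(logits)):
            mean_probs[e] += p / n_tokens
    frac = [c / (n_tokens * k) for c in counts]
    return n_experts * sum(f * p for f, p in zip(frac, mean_probs))


# Each token prefers a different expert: evenly balanced routing.
balanced = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# Every token prefers expert 0: a collapsed router.
collapsed = [[5.0, 0.0, 0.0, 0.0] for _ in range(4)]

balanced_loss = load_balance_loss(balanced, k=1)
collapsed_loss = load_balance_loss(collapsed, k=1)
```

In this sketch the collapsed batch yields a larger auxiliary loss than the balanced one, which is the signal a training loop would use to push the router toward using all experts. Because only `k` experts run per token, compute per token stays below that of a dense model, which is the general mechanism behind efficiency gains of this kind.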