We present an analysis of 12 million privacy-relevant reviews publicly visible on the Google Play Store, spanning a 10-year period. Leveraging state-of-the-art NLP techniques, we examine what users have been writing about privacy along multiple dimensions: time, countries, app types, diverse privacy topics, and a spectrum of emotions. We find consistent growth in privacy-relevant reviews, and we explore topics that are trending (such as Data Deletion and Data Theft) as well as those on the decline (such as privacy-relevant reviews on sensitive permissions). Although privacy reviews come from more than 200 countries, 33 countries account for 90% of them. We compare countries by examining the distribution of privacy topics that each country's users write about, and find that geographic proximity is not a reliable indicator of similar privacy perspectives. We uncover some countries with unique patterns and explore those herein. Surprisingly, a substantial share of reviews that discuss privacy are positive (32%); many users express pleasure about privacy features within apps or about privacy-focused apps. We also uncover some unexpected behaviors, such as the use of reviews to deliver privacy disclaimers to developers. Finally, we demonstrate the value of analyzing app reviews with our approach as a complement to existing methods for understanding users' perspectives about privacy.
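The cross-country comparison described in the abstract rests on comparing per-country distributions over privacy topics. The abstract does not say which similarity measure is used, so the following is only a minimal sketch that assumes Jensen-Shannon divergence as the measure. The country labels and counts are invented toy values, the topic names are borrowed from the abstract purely as labels, and the helper names `normalize` and `js_divergence` are hypothetical.

```python
import math


def normalize(counts):
    """Turn raw per-topic review counts into a probability distribution."""
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two topic distributions.

    Assumed measure, not confirmed by the source. Ranges from 0 (identical
    distributions) to 1 (no shared topics).
    """
    div = 0.0
    for topic in set(p) | set(q):
        pt = p.get(topic, 0.0)
        qt = q.get(topic, 0.0)
        mt = 0.5 * (pt + qt)  # mixture of the two distributions
        if pt > 0:
            div += 0.5 * pt * math.log2(pt / mt)
        if qt > 0:
            div += 0.5 * qt * math.log2(qt / mt)
    return div


# Invented toy counts for two hypothetical countries.
country_a = normalize({"Data Deletion": 40, "Data Theft": 35, "Permissions": 25})
country_b = normalize({"Data Deletion": 10, "Data Theft": 15, "Permissions": 75})

print(js_divergence(country_a, country_b))
```

Computing this value for every pair of countries gives a distance matrix that can be checked against geographic proximity; whether the paper proceeds this way is not stated in the abstract.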