Analyzing Male Domestic Violence through Exploratory Data Analysis and Explainable Machine Learning Insights

Domestic violence is often perceived as a gendered issue affecting female victims, and it has gained increasing attention in recent years. Despite this attention, male victims of domestic abuse remain largely overlooked, particularly in Bangladesh. Our study is a pioneering exploration of male domestic violence (MDV) in the Bangladeshi context, shedding light on its prevalence, patterns, and underlying factors. Existing literature predominantly emphasizes female victimization in domestic violence scenarios, leading to a lack of research on male victims. We collected data from the major cities of Bangladesh and conducted exploratory data analysis to understand the underlying dynamics. We implemented 11 traditional machine learning models with default and optimized hyperparameters, 2 deep learning models, and 4 ensemble models. Among these approaches, CatBoost emerged as the top performer, achieving 76% accuracy, owing to its native support for categorical features, efficient handling of missing values, and robust regularization techniques. The other models achieved accuracies in the range of 58-75%. The eXplainable AI (XAI) techniques SHAP and LIME were employed to gain insight into the decision-making of black-box machine learning models. By identifying factors associated with domestic abuse, the study helps identify groups of people vulnerable to MDV, raises awareness, and informs policies and interventions aimed at reducing MDV. Our findings challenge the prevailing notion that domestic abuse primarily affects women, emphasizing the need for tailored interventions and support systems for male victims. Machine learning (ML) techniques enhance the analysis and understanding of the data, providing valuable insights for developing effective strategies to combat this pressing social issue.
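
The abstract names SHAP as one of the explanation techniques but does not show how an attribution is formed. SHAP is built on Shapley values, and the sketch below computes them exactly for a single prediction. It is an illustration only, not the authors' code: the scoring function `toy_model`, its three binary features, and the all-zero baseline are hypothetical, and the `shap` library itself relies on faster approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance relative to a baseline.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(instance)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [
                    instance[j] if j in coalition else baseline[j]
                    for j in range(n)
                ]
                with_i = list(without_i)
                with_i[i] = instance[i]
                # Marginal contribution of feature i to this coalition
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_model(x):
    """Hypothetical risk score over three made-up binary features."""
    return 0.2 + 0.3 * x[0] + 0.1 * x[1] + 0.2 * x[0] * x[2]


instance = [1, 1, 1]   # the case being explained
baseline = [0, 0, 0]   # reference case with every feature switched off
phi = shapley_values(toy_model, instance, baseline)
```

The attributions in `phi` sum to the difference between the prediction for `instance` and the prediction for `baseline`, which is the additivity property SHAP explanations rely on. The interaction term between the first and third features is split evenly between them.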
