This study presents a general analytical framework that combines DBSCAN clustering with penalized regression to address multifactor problems characterized by structural complexity and multicollinearity, such as carbon emissions. The framework uses DBSCAN, an unsupervised learning method, to cluster features objectively, while penalized regression controls model complexity and performs high-dimensional feature selection to identify the dominant influencing factors. Applying the framework to energy consumption data for 46 industries in China from 2000 to 2019 identified 16 categories, and we quantitatively assessed the emission characteristics and drivers of each. The results show that the framework can identify the primary emission sources in each category, providing quantitative references for decision-making. Overall, the framework can be used to evaluate complex regional issues such as carbon emissions in support of policymaking, and this research provides preliminary validation of its value for identifying emission reduction opportunities worldwide.
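The two stages described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation: the abstract does not name the penalty, so the lasso (L1) penalty is an assumption, and the function names (`dbscan`, `lasso`), parameter values, and toy data are all hypothetical. In practice a library such as scikit-learn (`sklearn.cluster.DBSCAN`, `sklearn.linear_model.Lasso`) would likely be used instead.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (the point itself is included).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + alpha*||b||_1.

    Assumes y and the columns of X are already centered (no intercept).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][k] ** 2 for i in range(n)) / n for k in range(p)]
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            # Covariance of column k with the residual that excludes feature k.
            rho = sum(X[i][k] * (resid[i] + X[i][k] * beta[k]) for i in range(n)) / n
            new = soft_threshold(rho, alpha) / col_sq[k]
            delta = new - beta[k]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][k] * delta
                beta[k] = new
    return beta


# Hypothetical toy data: two dense groups of points plus one outlier.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (30, 30)]
toy_labels = dbscan(toy_points, eps=1.5, min_samples=3)

# Hypothetical centered design: y depends only on the first column.
toy_X = [[-2, 1], [-1, -2], [0, 2], [1, -2], [2, 1]]
toy_y = [-4, -2, 0, 2, 4]
toy_beta = lasso(toy_X, toy_y, alpha=0.5)
```

In a full analysis, the regression would be fitted separately on the rows belonging to each cluster label, and the nonzero coefficients would indicate the dominant factors for that category. The toy example keeps the two steps separate for brevity.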