This research evaluates the performance of several Recurrent Neural Network (RNN) architectures, namely Simple RNN, Gated Recurrent Units (GRU), and Long Short-Term Memory (LSTM), against classic algorithms such as Random Forest and XGBoost in building classification models for early crash detection in ASEAN-5 stock markets. The models are evaluated on imbalanced data, a common condition because market crashes are rare. The study analyzes daily data from 2010 to 2023 across the major stock markets of the ASEAN-5 countries: Indonesia, Malaysia, Singapore, Thailand, and the Philippines. The target variable flags a market crash when a major stock price index falls below the Value at Risk (VaR) threshold at the 5%, 2.5%, or 1% level. The predictors comprise technical indicators of major local and global markets as well as commodity markets. The study includes 213 predictors with their respective lags (5, 10, 15, 22, 50, 200) and uses a time step of 7, which expands the total number of predictors to 1491 (213 × 7). Data imbalance is addressed with SMOTE-ENN. The results show that all RNN-based architectures outperform Random Forest and XGBoost. Among the RNN architectures, Simple RNN performs best, which the study attributes mainly to data that are not overly complex and that carry mostly short-term information. The study extends previous work by covering different geographical regions and time periods and by adjusting the methodology.
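As a rough illustration of two steps described in the abstract, the Python sketch below labels crash days against an empirical VaR threshold and stacks 7 consecutive time steps of predictors into one sample. The function names (`var_threshold`, `crash_labels`, `make_windows`), the use of a historical-simulation quantile as the VaR estimate, and the use of returns rather than index levels are assumptions made for illustration; the abstract does not state how VaR was estimated or how the windows were built.

```python
def var_threshold(returns, alpha):
    """Empirical alpha-quantile of the returns (historical-simulation VaR).

    Assumption: the abstract does not say how VaR is estimated; a simple
    order-statistic quantile is used here as a stand-in.
    """
    ordered = sorted(returns)
    # Clamp the index so alpha = 1.0 still points at a valid element.
    k = min(int(alpha * len(ordered)), len(ordered) - 1)
    return ordered[k]


def crash_labels(returns, alpha):
    """Return 1 for days whose return falls below the VaR threshold, else 0."""
    threshold = var_threshold(returns, alpha)
    return [1 if r < threshold else 0 for r in returns]


def make_windows(rows, time_steps=7):
    """Flatten each run of `time_steps` consecutive rows into one sample.

    `rows` is a list of per-day predictor lists, oldest day first.
    """
    samples = []
    for end in range(time_steps, len(rows) + 1):
        window = rows[end - time_steps:end]
        # Concatenate the days in the window, oldest first.
        samples.append([value for row in window for value in row])
    return samples
```

With 213 predictors per day and `time_steps=7`, each flattened sample has 213 × 7 = 1491 values, matching the predictor count reported in the abstract; a recurrent model would typically keep the same window as a 7 × 213 sequence instead of flattening it.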