This study extends the examination of the Efficient-Market Hypothesis (EMH) in the Bitcoin market over a five-year period of fluctuation, from September 1, 2017 to September 1, 2022, by analyzing 28,739,514 qualified tweets containing the target topic "Bitcoin". Unlike previous studies, which focus on sentiment analysis, information volume, or price data, we extract fundamental keywords as an informative proxy for testing the EMH in the Bitcoin market. We test market efficiency at hourly, 4-hourly, and daily intervals to understand the speed and accuracy of market reactions to information within different time thresholds. We use a sequence of machine learning methods and textual analyses, including distance measurements between semantic vector spaces of information, a keyword extraction and encoding model, and Light Gradient Boosting Machine (LGBM) classifiers. Our results suggest that 78.06% (83.08%), 84.63% (87.77%), and 94.03% (94.60%) of hourly, 4-hourly, and daily bullish (bearish) market movements can be attributed to public information within organic tweets.
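As a rough illustration of the windowing and keyword-encoding steps described above, the following minimal sketch groups timestamped tweets into hourly, 4-hourly, or daily windows, encodes each window as a keyword-count vector, and measures the cosine distance between two windows. It is not the study's implementation: the `KEYWORDS` list, the whitespace tokenizer, the choice of cosine distance, and all function names are assumptions made for illustration, and the LGBM classification step is omitted.

```python
import math
from collections import Counter
from datetime import datetime

# Hypothetical keyword vocabulary; the study's actual keyword list is not given here.
KEYWORDS = ["halving", "regulation", "etf", "hack", "adoption"]


def window_start(ts: datetime, hours: int) -> datetime:
    """Floor a timestamp to the start of its window (hours = 1, 4, or 24)."""
    floored_hour = (ts.hour // hours) * hours
    return ts.replace(hour=floored_hour, minute=0, second=0, microsecond=0)


def encode_windows(tweets, hours):
    """Map each window start to a keyword-count vector ordered like KEYWORDS."""
    counts = {}
    for ts, text in tweets:
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        bucket = counts.setdefault(window_start(ts, hours), Counter())
        bucket.update(t for t in tokens if t in KEYWORDS)
    return {w: [c[k] for k in KEYWORDS] for w, c in sorted(counts.items())}


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


# Toy input: (timestamp, tweet text) pairs, invented for this example.
tweets = [
    (datetime(2021, 5, 1, 9, 15), "ETF approval rumors lift adoption hopes"),
    (datetime(2021, 5, 1, 9, 40), "more ETF talk"),
    (datetime(2021, 5, 1, 10, 5), "exchange hack reported"),
]
hourly = encode_windows(tweets, hours=1)
first, second = list(hourly.values())
print(cosine_distance(first, second))
```

In a pipeline of the kind the abstract describes, vectors like these would be paired with a bullish or bearish label for each window and passed to a classifier; that pairing and the labeling rule are not specified here and are left out of the sketch.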