With the development of new sensors and monitoring devices, more sources of data become available as inputs for machine learning models. On the one hand, these can help improve the accuracy of a model. On the other hand, combining these new inputs with historical data remains a challenge that has not yet been studied in enough detail. In this work, we propose a transfer-learning algorithm that combines the new and the historical data and that is especially beneficial when the new data is scarce. We focus the approach on the linear regression case, which allows us to conduct a rigorous theoretical study of its benefits. We show that our approach is robust against negative transfer, and we confirm this result empirically with real and simulated data.