The rapid advancement of models based on artificial intelligence demands innovative monitoring techniques that can operate in real time at low computational cost. In machine learning, and for artificial neural networks (ANNs) in particular, models are often trained in a supervised manner. Consequently, the learned relationship between input and output must remain valid during the model's deployment. If this stationarity assumption holds, the ANN's predictions can be expected to remain accurate; otherwise, the model must be retrained or rebuilt. We propose using the latent feature representation of the data generated by the ANN (the "embedding") to determine the time at which the data stream starts to become nonstationary. In particular, we monitor embeddings by applying multivariate control charts based on data depth calculations and normalized ranks. The performance of the proposed method is compared with benchmark approaches for various ANN architectures and different underlying data formats.
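To make the monitoring idea concrete, the following is a minimal sketch of a rank-based control chart applied to embeddings. It is an illustration under stated assumptions and is not taken from the paper: the Mahalanobis depth is assumed as the depth function, a single-observation chart with lower control limit `alpha` is assumed as the signaling rule, and the names `mahalanobis_depth`, `normalized_ranks`, and `first_signal` are hypothetical. The embeddings are simulated here; in practice they would come from a hidden layer of the trained ANN.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` with respect to `reference`.

    Assumption: the Mahalanobis depth is used only for illustration;
    other depth functions could be substituted here.
    """
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = points - mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distances
    return 1.0 / (1.0 + d2)


def normalized_ranks(stream, reference):
    """Fraction of reference points whose depth is <= the depth of each new point."""
    ref_depth = np.sort(mahalanobis_depth(reference, reference))
    new_depth = mahalanobis_depth(stream, reference)
    return np.searchsorted(ref_depth, new_depth, side="right") / len(ref_depth)


def first_signal(ranks, alpha=0.01):
    """Index of the first rank below the lower control limit `alpha`, or None."""
    below = np.flatnonzero(ranks < alpha)
    return int(below[0]) if below.size else None


# Simulated embeddings (hypothetical data): an in-control reference sample,
# then a stream whose mean shifts after 50 observations.
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 4))
stream = np.vstack([
    rng.normal(size=(50, 4)),            # in-control part
    rng.normal(loc=10.0, size=(50, 4)),  # shifted (nonstationary) part
])
ranks = normalized_ranks(stream, reference)
print("first signal at stream index:", first_signal(ranks))
```

A low normalized rank means the new embedding is more outlying than most of the reference sample, so a run of ranks below `alpha` suggests that the stream has left the training distribution. With a single-observation rule, false alarms occur at a rate of roughly `alpha` per observation while the stream is in control, so the choice of `alpha` trades detection delay against false alarm frequency.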