Financial firms commonly process and store billions of time-series data points, generated continuously and at high frequency. To support efficient data storage and retrieval, specialized time-series databases and systems have emerged. These databases support indexing and querying of time series through a constrained, Structured Query Language (SQL)-like format, enabling queries such as "Stocks with monthly price returns greater than 5%" that must be expressed in rigid formats. However, such queries do not capture the intrinsic complexity of high-dimensional time-series data, which can often be better described by images or language (e.g., "A stock in low volatility regime"). Moreover, the storage, computation time, and retrieval complexity required to search directly in the time-series space are often non-trivial. In this paper, we propose and demonstrate a framework that uses deep encoders to store multi-modal data for financial time series in a lower-dimensional latent space, such that the latent-space projections capture not only time-series trends but also other desirable properties of the financial time-series data (such as price volatility). Moreover, our approach supports user-friendly query modalities, namely natural-language text and sketches of time series, for which we have developed intuitive interfaces. We demonstrate the advantages of our method in terms of computational efficiency and accuracy on real historical data as well as synthetic data, and highlight the utility of latent-space projections in the storage and retrieval of financial time-series data with intuitive query modalities.
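To make the idea of latent-space storage and sketch-based retrieval concrete, the following is a minimal sketch rather than the framework described above. It assumes a linear PCA projection as a stand-in for a trained deep encoder, and a plain Euclidean nearest-neighbor search in the latent space. The class name `LinearLatentIndex`, the helper `znorm`, the latent dimension of 8, and the synthetic random-walk data are all hypothetical choices made for illustration. Natural-language queries are not covered here.

```python
import numpy as np

def znorm(x):
    """Z-normalize each row so retrieval compares shape, not price level."""
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / np.where(sd == 0, 1.0, sd)

class LinearLatentIndex:
    """Toy index. ASSUMPTION: PCA stands in for a trained deep encoder."""

    def __init__(self, series, latent_dim=8):
        z = znorm(series)
        self.mean_ = z.mean(axis=0)
        # Principal directions from the SVD of the centered data.
        _, _, vt = np.linalg.svd(z - self.mean_, full_matrices=False)
        self.components_ = vt[:latent_dim]
        # Only the low-dimensional codes need to be stored for search.
        self.codes_ = self.encode(series)

    def encode(self, series):
        """Project one or more series of equal length into the latent space."""
        return (znorm(series) - self.mean_) @ self.components_.T

    def query(self, sketch, k=3):
        """Return indices of the k stored series nearest to the query sketch."""
        q = self.encode(np.atleast_2d(sketch))[0]
        dist = np.linalg.norm(self.codes_ - q, axis=1)
        return np.argsort(dist)[:k]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)

# Synthetic "database": 200 random walks plus one clean upward trend (index 200).
walks = rng.normal(size=(200, 64)).cumsum(axis=1)
database = np.vstack([walks, 100 + 20 * t])
index = LinearLatentIndex(database, latent_dim=8)

# A rough user sketch of an upward trend, drawn on a different price scale.
sketch = 5 * t + 0.01 * rng.normal(size=64)
top = index.query(sketch, k=3)
print(top)
```

Here each 64-point series is stored as an 8-dimensional code, so the search compares 8 numbers per series instead of 64. A learned encoder would replace `encode`, and could be trained so that the codes also reflect properties such as volatility, which this linear stand-in does not attempt.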