Efficient Observation Time Window Segmentation for Administrative Data Machine Learning

Utilizing administrative data to predict outcomes is an important application area of machine learning, particularly in healthcare. Most administrative data records are timestamped and the pattern of records over time is a key input for machine learning models. This paper explores how best to divide the observation window of a machine learning model into time segments or "bins". A computationally efficient process is presented that identifies which data features benefit most from smaller, higher resolution time segments. Results generated on healthcare and housing/homelessness administrative data demonstrate that optimizing the time bin size of these high priority features while using a single time bin for the other features achieves machine learning models that are simpler and quicker to train. This approach also achieves similar and sometimes better performance than more complex models that default to representing all data features with the same time resolution.

翻译：利用管理数据预测结果是机器学习的重要应用领域，尤其在医疗保健领域。大多数管理数据记录都带有时间戳，记录随时间变化的模式是机器学习模型的关键输入。本文探讨如何最佳地将机器学习模型的观测窗口划分为时间段或“时间箱”。文中提出了一种计算高效的方法，用于识别哪些数据特征能从更小、更高分辨率的时间段中获益最多。在医疗保健与住房/无家可归管理数据上生成的结果表明，优化这些高优先级特征的时间箱大小，同时为其他特征使用单一时间箱，能够获得更简单且训练更快的机器学习模型。这种方法在性能上通常与默认对所有数据特征采用相同时间分辨率的复杂模型相当，有时甚至更优。

相关内容

Machine Learning

关注 2251

机器学习（Machine Learning）是一个研究计算学习方法的国际论坛。该杂志发表文章，报告广泛的学习方法应用于各种学习问题的实质性结果。该杂志的特色论文描述研究的问题和方法，应用研究和研究方法的问题。有关学习问题或方法的论文通过实证研究、理论分析或与心理现象的比较提供了坚实的支持。应用论文展示了如何应用学习方法来解决重要的应用问题。研究方法论文改进了机器学习的研究方法。所有的论文都以其他研究人员可以验证或复制的方式描述了支持证据。论文还详细说明了学习的组成部分，并讨论了关于知识表示和性能任务的假设。官网地址：http://dblp.uni-trier.de/db/journals/ml/

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日