The adaptive immune system's T and B cells can be viewed as large populations of simple, diverse classifiers. Artificial immune systems (AISs), algorithmic models of T or B cell repertoires, are used in both computational biology and natural computing to investigate how the immune system adapts to its changing environments. However, researchers have struggled to build such systems at scale. For string-based AISs, finite state machines (FSMs) can store cell repertoires in compressed representations that are orders of magnitude smaller than explicitly stored receptor sets. This strategy allows AISs with billions of receptors to be generated in a matter of seconds. However, to date, these FSM-based AISs have been unable to deal with multiplicity in input data. Here, we show how weighted FSMs can be used to represent cell repertoires and model immunological processes like negative and positive selection, while also taking into account the multiplicity of input data. We use our method to build simple immune-inspired classifier systems that solve various toy problems in anomaly detection, showing how weights can be crucial for both performance and robustness to parameters. Our approach can potentially be extended to increase the scale of other population-based machine learning algorithms such as learning classifier systems.
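To give a rough sense of what weights add to an automaton-based repertoire, the following is a minimal sketch and not the construction used in this work. It assumes an uncompressed trie (a weighted acyclic automaton with one state per prefix) in place of a minimized FSM, and it assumes Hamming-distance matching between strings. The class `WeightedTrie` and its methods are hypothetical names introduced only for illustration. Final weights record how often each string occurred, so a query returns a count that reflects multiplicity rather than a yes/no answer.

```python
class WeightedTrie:
    """Toy weighted acyclic automaton (assumption: plain trie, no minimization)."""

    def __init__(self):
        self.children = {}  # symbol -> next state
        self.weight = 0     # final weight: multiplicity of the string ending here

    def add(self, s, count=1):
        """Insert string s, adding `count` to its multiplicity."""
        node = self
        for ch in s:
            node = node.children.setdefault(ch, WeightedTrie())
        node.weight += count

    def weight_of(self, s):
        """Multiplicity of exactly s (0 if s was never inserted)."""
        node = self
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.weight

    def weight_within(self, s, max_mismatches):
        """Total multiplicity of stored strings of length len(s) whose
        Hamming distance to s is at most max_mismatches."""
        def walk(node, i, budget):
            if i == len(s):
                return node.weight
            total = 0
            for ch, child in node.children.items():
                cost = 0 if ch == s[i] else 1
                if cost <= budget:
                    total += walk(child, i + 1, budget - cost)
            return total
        return walk(self, 0, max_mismatches)


# Hypothetical input sample in which "abba" occurs twice.
repertoire = WeightedTrie()
for sample in ["abba", "abba", "abbb", "baba"]:
    repertoire.add(sample)

exact = repertoire.weight_of("abba")          # multiplicity of the exact string
near = repertoire.weight_within("abba", 1)    # weight within one mismatch
```

An unweighted automaton over the same input could only report whether a matching string exists. Here the repeated sample contributes twice to both `exact` and `near`, which is the kind of multiplicity information the abstract refers to.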