We present Kathleen, a text classification architecture that operates directly on raw UTF-8 bytes using frequency-domain processing, with no tokenizer, no attention mechanism, and fewer than 470K parameters. Kathleen introduces five novel components: (1) RecurrentOscillatorBanks, damped sinusoid convolutions with temporal memory for O(L) sequence processing; (2) an FFT-Rotate Wavetable Encoder that maps all 256 byte values using a single learnable vector (256 floats); (3) PhaseHarmonics, a sinusoidal non-linearity with just 6 learnable phase parameters (+2.6% accuracy, <0.001% of model parameters); (4) Content-Dependent Reverb with Positional Decay Modulation, a temporal memory mechanism whose decay rate is jointly conditioned on input content and a learned position-indexed bias vector; and (5) a Token-Level Module Sequencer with consonance and dissonance interference channels. Through iterative architecture evolution from an initial 733K-parameter pretrained baseline (Kathleen-Clean) to the current Kathleen-V9 (469K parameters), we demonstrate that pretraining can be eliminated entirely while improving accuracy. Kathleen-V9 achieves 88.5% +/- 0.2% on IMDB, 92.4% +/- 0.2% on AG News, and 85.8% +/- 0.5% on SST-2 (3-seed averages), matching or exceeding the pretrained baseline on all three benchmarks with 36% fewer parameters. On SST-2, this is an absolute improvement of 2.5 percentage points over the pretrained predecessor. Kathleen processes sequences in O(L) time and memory.
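The O(L) claim for damped sinusoid convolutions can be illustrated with a small, self-contained sketch. It is not the paper's RecurrentOscillatorBanks: the abstract does not specify that layer, so the function names, the fixed `decay` and `omega` values (which would presumably be learned), and the byte scaling in `bytes_to_signal` are all assumptions made for illustration. The sketch relies only on a standard identity: a causal convolution with the kernel h[k] = decay^k cos(omega k) equals the real part of a one-pole complex recurrence, which needs one multiply-add per input byte.

```python
import cmath
import math


def damped_sinusoid_recurrent(x, decay, omega):
    """O(L) form: z_t = decay * e^{i*omega} * z_{t-1} + x_t, output Re(z_t)."""
    pole = decay * cmath.exp(1j * omega)  # complex pole inside the unit circle
    z = 0j
    out = []
    for sample in x:
        z = pole * z + sample  # constant work per step, constant state
        out.append(z.real)
    return out


def damped_sinusoid_direct(x, decay, omega):
    """O(L^2) reference: causal convolution with h[k] = decay**k * cos(omega*k)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(t + 1):
            acc += (decay ** k) * math.cos(omega * k) * x[t - k]
        out.append(acc)
    return out


def bytes_to_signal(text):
    """Scale raw UTF-8 bytes to [-1, 1]. Hypothetical stand-in for a learned encoder."""
    return [b / 127.5 - 1.0 for b in text.encode("utf-8")]


signal = bytes_to_signal("no tokenizer needed")
fast = damped_sinusoid_recurrent(signal, decay=0.9, omega=0.5)
slow = damped_sinusoid_direct(signal, decay=0.9, omega=0.5)

# Largest disagreement between the two forms; only floating-point rounding remains.
max_err = max(abs(a - b) for a, b in zip(fast, slow))
```

The two functions agree up to rounding error, which is why a bank of such oscillators can run over a byte sequence in linear time and with constant state per oscillator. Any content-dependent or position-dependent modulation of the decay, as described in component (4), would replace the constant `pole` with a per-step value; that extension is not shown here.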