The discovery of deep, steerable taxonomies in large text corpora is currently restricted by a trade-off between the surface-level efficiency of topic models and the prohibitive, non-scalable assignment costs of LLM-integrated frameworks. We introduce \textbf{LogiPart}, a scalable, hypothesis-first framework for building interpretable hierarchical partitions that decouples hierarchy growth from expensive full-corpus LLM conditioning. LogiPart uses locally hosted LLMs on compact, embedding-aware samples to generate concise natural-language taxonomic predicates. These predicates are then evaluated efficiently across the entire corpus using zero-shot Natural Language Inference (NLI) combined with fast graph-based label propagation, achieving $O(1)$ generative-token cost per node with respect to corpus size. We evaluate LogiPart across four diverse text corpora totaling $\approx$140,000 documents. Using structured manifolds for \textbf{calibration}, we identify an empirical reasoning threshold at the 14B-parameter scale required for stable semantic grounding. On complex, high-entropy corpora (Wikipedia, US Bills), where traditional thematic metrics reveal an ``alignment gap,'' inverse logic validation confirms the stability of the induced logic, with individual taxonomic bisections maintaining average per-node routing accuracies of up to 96\%. A qualitative audit by an independent LLM-as-a-judge confirms the discovery of meaningful functional axes, such as policy intent, that thematic ground-truth labels fail to capture. LogiPart enables frontier-level exploratory analysis on consumer-grade hardware, making hypothesis-driven taxonomic discovery feasible under realistic computational and governance constraints.
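The corpus-wide evaluation step sketched above pairs NLI-scored seed documents with graph-based label propagation. The following is a minimal, hedged illustration of the propagation half only; the graph construction, seed selection, iteration count, and clamping scheme are all assumptions for illustration, not the paper's implementation. Seed nodes carry NLI entailment scores in $[0,1]$ and are held fixed, while unlabeled nodes repeatedly average their neighbors' scores:

```python
def propagate(adjacency, seeds, iters=50):
    """Simple label propagation over a document similarity graph.

    adjacency: dict mapping node -> list of neighbor nodes
    seeds:     dict mapping node -> fixed NLI entailment score in [0, 1]
    Returns a dict mapping every node to a propagated score; thresholding
    it (e.g. at 0.5) would route each document in a taxonomic bisection.
    """
    # Unlabeled nodes start at the uninformative midpoint.
    scores = {n: seeds.get(n, 0.5) for n in adjacency}
    for _ in range(iters):
        new = {}
        for n, nbrs in adjacency.items():
            if n in seeds:
                new[n] = seeds[n]  # clamp seed labels every iteration
            elif nbrs:
                new[n] = sum(scores[m] for m in nbrs) / len(nbrs)
            else:
                new[n] = scores[n]  # isolated node: keep current score
        scores = new
    return scores
```

On a small path graph with one positive and one negative seed, interior nodes converge toward a linear interpolation between the seeds, which is the behavior the routing step relies on.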