We present Threadle, an open-source, high-performance, and memory-efficient network storage and query engine written in C#. Threadle is designed for full-population networks derived from administrative register data: very large, multilayer, mixed-mode networks with millions of nodes and billions of edges. It addresses a fundamental limitation of existing network libraries, namely their inability to handle two-mode (bipartite) data efficiently at scale. Threadle's core innovation is a pseudo-projection approach that allows two-mode layers to be queried as if they had been projected into one-mode form, without ever materializing the memory-prohibitive projection. We demonstrate that a network with 20 million nodes, containing layers equivalent to 8 trillion projected edges, can be stored in approximately 20 GB of RAM, a compression ratio exceeding 2000:1 relative to the materialized projection. Threadle also provides native support for multilayer mixed-mode networks, an integrated node attribute manager, and a CLI frontend with more than 50 commands for constructing, processing, and managing very large heterogeneous networks, including file handling. Threadle is freely available at https://www.threadle.dev, either as precompiled binaries for Windows, macOS, and Linux or as source code to compile directly. Threadle is supplemented by threadleR, an R frontend that enables advanced sampling- and traversal-based analyses of very large, heterogeneous, multilayer, mixed-mode, population-scale networks.
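
To make the pseudo-projection idea concrete, the following is a minimal Python sketch of how a two-mode layer might be queried as if it were projected, without storing the projected edges. It is an illustration under stated assumptions, not Threadle's C# implementation: the class `TwoModeLayer`, its method names, and the workplace example data are all hypothetical.

```python
from collections import defaultdict


class TwoModeLayer:
    """Hypothetical two-mode layer: nodes are members of affiliations.

    Only memberships are stored. Projected one-mode edges are never
    materialized; they are derived on demand at query time.
    """

    def __init__(self):
        self.node_to_groups = defaultdict(set)  # node id -> affiliation ids
        self.group_to_nodes = defaultdict(set)  # affiliation id -> member node ids

    def add_membership(self, node, group):
        self.node_to_groups[node].add(group)
        self.group_to_nodes[group].add(node)

    def has_edge(self, u, v):
        # In the projection, u and v are adjacent exactly when they
        # share at least one affiliation.
        if u == v:
            return False
        groups_u = self.node_to_groups.get(u, frozenset())
        groups_v = self.node_to_groups.get(v, frozenset())
        return not groups_u.isdisjoint(groups_v)

    def neighbors(self, u):
        # Union of the members of every affiliation u belongs to, minus u.
        result = set()
        for group in self.node_to_groups.get(u, frozenset()):
            result |= self.group_to_nodes[group]
        result.discard(u)
        return result

    def membership_count(self):
        # Number of stored (node, affiliation) pairs.
        return sum(len(members) for members in self.group_to_nodes.values())


# Assumed example data: 1000 people at one workplace, one person at another.
layer = TwoModeLayer()
for person in range(1000):
    layer.add_membership(person, "workplace_A")
layer.add_membership(1000, "workplace_B")

print(layer.has_edge(0, 999))     # same workplace
print(layer.has_edge(0, 1000))    # different workplaces
print(len(layer.neighbors(0)))
print(layer.membership_count())
```

In this toy example the layer stores 1,001 memberships, whereas materializing the projection of `workplace_A` alone would require 1000 × 999 / 2 = 499,500 edges. Storage grows linearly with affiliation size while the projection grows quadratically, and that gap is what a pseudo-projection approach exploits.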