TusKANNy is a research team building state-of-the-art algorithms and open-source tools for ANN search across dense, sparse, and multi-vector neural representations.
Fast and scalable late-interaction multi-vector retrieval
Rust
TACHIOM is a fast and scalable data structure for late-interaction multi-vector retrieval, written in Rust with Python bindings. It can cluster hundreds of millions of vectors in a few minutes and retrieve from large multi-vector collections in under 10 ms.
Fast approximate retrieval over learned sparse embeddings
Rust
A superluminal search engine for learned sparse representations, written in Rust with Python bindings. Seismic indexes sparse vector collections and retrieves results in microseconds while maintaining near-exact accuracy.
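The core scoring operation in learned sparse retrieval is the inner product between sparse query and document vectors, commonly stored as (dimension, weight) pairs sorted by dimension. The sketch below shows a minimal merge-based version of that operation; Seismic's index prunes most of these computations rather than running them exhaustively, and the function name here is illustrative, not the library's API.

```rust
// Inner product of two sparse vectors stored as (dimension, weight)
// pairs sorted by dimension, computed with a linear merge.
fn sparse_dot(a: &[(u32, f32)], b: &[(u32, f32)]) -> f32 {
    let (mut i, mut j, mut acc) = (0, 0, 0.0);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                // Dimensions match: accumulate the product of weights.
                acc += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    acc
}

fn main() {
    let query = [(0u32, 1.0f32), (3, 2.0)];
    let doc = [(3u32, 4.0f32), (5, 1.0)];
    // Only dimension 3 overlaps: 2.0 * 4.0 = 8.0
    println!("{}", sparse_dot(&query, &doc));
}
```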
Fast ANN library for dense, sparse, and multi-vector embeddings
Rust
A flexible Rust library combining state-of-the-art indexing techniques for dense, sparse, and multi-vector embeddings. Designed to make prototyping new ANN algorithms fast and ergonomic.
Unified embedding storage and compression backbone
Rust
A Rust library for storing, accessing, and compressing dense, sparse, and multi-vector embedding datasets. Provides a unified dataset/encoder interface shared across TusKANNy's indexing and search crates. Includes an exhaustive search API and a CLI tool for ground-truth computation.
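Ground-truth computation means scoring a query against every vector in the collection and keeping the top-k results exactly. The sketch below shows that brute-force search for dense vectors under inner-product scoring; the function names and signature are illustrative assumptions, not the crate's actual API.

```rust
// Exhaustive top-k search by inner product: score every dataset vector
// against the query, sort by score descending, keep the best k.
fn top_k(query: &[f32], dataset: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = dataset
        .iter()
        .enumerate()
        .map(|(i, v)| (i, query.iter().zip(v).map(|(q, x)| q * x).sum()))
        .collect();
    // Sort by score, highest first (scores are finite here).
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(k);
    scored
}

fn main() {
    let dataset = vec![vec![0.0, 1.0], vec![2.0, 0.0], vec![1.0, 1.0]];
    // Scores against [1, 0]: 0.0, 2.0, 1.0 -> top-2 is ids 1 and 2.
    println!("{:?}", top_k(&[1.0, 0.0], &dataset, 2));
}
```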
Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing
Silvio Martinico, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Proposes TACHIOM, a multi-vector retrieval system that uses Token-Aware Clustering (TAC) for accurate and scalable token clustering. By combining hierarchical indexing with a MaxSim-optimized Product Quantization layout, TACHIOM achieves up to 247x faster clustering than standard k-means and delivers up to 9.8x faster retrieval compared to state-of-the-art systems.
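MaxSim is the standard late-interaction similarity: for each query token, take the maximum dot product over all document tokens, then sum these maxima. The sketch below is a minimal dense-float reference of that score; TACHIOM evaluates it over a Product-Quantization-compressed layout instead, and the function names here are illustrative, not the library's API.

```rust
// Plain dot product between two token embeddings.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// MaxSim late-interaction score: for each query token, the best
// (maximum) dot product against all document tokens, summed.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

fn main() {
    let query = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let doc = vec![vec![0.5, 0.5], vec![0.0, 2.0]];
    // Query token 1: max(0.5, 0.0) = 0.5; token 2: max(0.5, 2.0) = 2.0.
    println!("{}", maxsim(&query, &doc)); // 2.5
}
```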
Sparton: Fast and Memory-Efficient Triton Kernel for Learned Sparse Retrieval
Thong Nguyen, Cosimo Rulli, Franco Maria Nardini, Rossano Venturini, Andrew Yates
Sparton is a Triton kernel for the Language Model head in Learned Sparse Retrieval models that fuses tiled matrix multiplication, ReLU, log1p, and max-reduction into a single GPU kernel, achieving up to 4.8x speedup and an order-of-magnitude reduction in peak memory usage compared to PyTorch baselines.
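The four fused operations compute, for each vocabulary term, the maximum over token positions of log1p(relu(logit)), where the logit is the dot product of a token's hidden state with that term's output-embedding column. Below is a scalar Rust reference of that computation, useful only to pin down the math: the real kernel runs tiled on GPU via Triton, and these names are illustrative assumptions.

```rust
// Reference for the LSR head that Sparton fuses on GPU:
// score[j] = max over token positions t of log1p(relu(h[t] . w[j])).
fn lsr_head(hidden: &[Vec<f32>], weights: &[Vec<f32>]) -> Vec<f32> {
    weights
        .iter()
        .map(|w_j| {
            hidden
                .iter()
                .map(|h_t| {
                    // Tiled matmul in the kernel; a plain dot product here.
                    let logit: f32 = h_t.iter().zip(w_j).map(|(h, w)| h * w).sum();
                    logit.max(0.0).ln_1p() // ReLU, then log1p
                })
                .fold(0.0_f32, f32::max) // max-reduction over tokens
        })
        .collect()
}

fn main() {
    // Two token positions, one vocabulary term: logits 1.0 and 2.0,
    // so the score is max(ln(2), ln(3)) = ln(3).
    println!("{:?}", lsr_head(&[vec![1.0], vec![2.0]], &[vec![1.0]]));
}
```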
Efficient Sketching and Nearest Neighbor Search Algorithms for Sparse Vector Sets
Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Under review at the Journal of the ACM
Introduces a theoretically grounded sketching algorithm that reduces the effective dimensionality of sparse vectors while preserving the ranking induced by inner products, and shows its connection to the Seismic data structure.
Forward Index Compression for Learned Sparse Retrieval
Sebastian Bruch, Martino Fontana, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Introduces DotVByte, a compression technique optimized for inner product computation that achieves significant space savings while maintaining sparse retrieval efficiency.
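Forward-index compressors of this family typically start from delta-coded term IDs packed into variable-length bytes. The sketch below shows that classic delta + VByte scheme as background; it is a generic illustration, not the DotVByte format, and the function names are assumptions.

```rust
// Delta + variable-byte (VByte) coding of sorted term IDs: store the
// gap between consecutive IDs in 7-bit groups, with the high bit
// marking the final byte of each gap.
fn vbyte_encode(ids: &[u32]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u32;
    for &id in ids {
        let mut gap = id - prev; // delta-code the sorted IDs
        prev = id;
        loop {
            let byte = (gap & 0x7f) as u8;
            gap >>= 7;
            if gap == 0 {
                out.push(byte | 0x80); // high bit marks the last byte
                break;
            }
            out.push(byte);
        }
    }
    out
}

fn vbyte_decode(bytes: &[u8]) -> Vec<u32> {
    let (mut ids, mut cur, mut shift, mut prev) = (Vec::new(), 0u32, 0, 0u32);
    for &b in bytes {
        cur |= ((b & 0x7f) as u32) << shift;
        if b & 0x80 != 0 {
            prev += cur; // undo the delta coding
            ids.push(prev);
            cur = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    ids
}

fn main() {
    let ids = [3u32, 10, 200, 100_000];
    let encoded = vbyte_encode(&ids);
    println!("{} bytes for {} IDs", encoded.len(), ids.len());
    assert_eq!(vbyte_decode(&encoded), ids.to_vec()); // lossless round trip
}
```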
We are based in Pisa, one of Italy's most storied university cities. Our team works across the University of Pisa and the National Research Council of Italy.
Whether you are a researcher exploring ANN, a practitioner seeking collaboration, or a company looking for tailored solutions, we would love to hear from you. Our doors in Pisa are always open.