TusKANNy is a research team building state-of-the-art algorithms and open-source tools for ANN search across dense, sparse, and multi-vector neural representations.
Fast and scalable late-interaction multi-vector retrieval
Rust
TACHIOM is a fast and scalable data structure for late-interaction multi-vector retrieval, written in Rust with Python bindings. It can cluster hundreds of millions of vectors in a few minutes and retrieve from large multi-vector collections in under 10 ms.
Fast approximate retrieval over learned sparse embeddings
Rust
A superluminal search engine for learned sparse representations, written in Rust with Python bindings. Seismic indexes sparse vector collections and retrieves results in microseconds while maintaining near-exact accuracy.
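The core scoring operation in learned sparse retrieval is the inner product between sparse query and document vectors, commonly stored as (dimension, weight) pairs sorted by dimension. The sketch below shows a minimal merge-based version of that operation; Seismic's index prunes most of these computations rather than running them exhaustively, and the function name here is illustrative, not the library's API.

```rust
// Inner product of two sparse vectors stored as (dimension, weight)
// pairs sorted by dimension, computed with a linear merge.
fn sparse_dot(a: &[(u32, f32)], b: &[(u32, f32)]) -> f32 {
    let (mut i, mut j, mut acc) = (0, 0, 0.0);
    while i < a.len() && j < b.len() {
        match a[i].0.cmp(&b[j].0) {
            std::cmp::Ordering::Less => i += 1,
            std::cmp::Ordering::Greater => j += 1,
            std::cmp::Ordering::Equal => {
                // Dimensions match: accumulate the product of weights.
                acc += a[i].1 * b[j].1;
                i += 1;
                j += 1;
            }
        }
    }
    acc
}

fn main() {
    let query = [(0u32, 1.0f32), (3, 2.0)];
    let doc = [(3u32, 4.0f32), (5, 1.0)];
    // Only dimension 3 overlaps: 2.0 * 4.0 = 8.0
    println!("{}", sparse_dot(&query, &doc));
}
```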
Fast ANN library for dense, sparse, and multi-vector embeddings
Rust
A flexible Rust library combining state-of-the-art indexing techniques for dense, sparse, and multi-vector embeddings. Designed to make prototyping new ANN algorithms fast and ergonomic.
Unified embedding storage and compression backbone
Rust
A Rust library for storing, accessing, and compressing dense, sparse, and multi-vector embedding datasets. Provides a unified dataset/encoder interface shared across TusKANNy's indexing and search crates. Includes an exhaustive search API and a CLI tool for ground-truth computation.
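Ground-truth computation means scoring a query against every vector in the collection and keeping the top-k results exactly. The sketch below shows that brute-force search for dense vectors under inner-product scoring; the function names and signature are illustrative assumptions, not the crate's actual API.

```rust
// Exhaustive top-k search by inner product: score every dataset vector
// against the query, sort by score descending, keep the best k.
fn top_k(query: &[f32], dataset: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = dataset
        .iter()
        .enumerate()
        .map(|(i, v)| (i, query.iter().zip(v).map(|(q, x)| q * x).sum()))
        .collect();
    // Sort by score, highest first (scores are finite here).
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    scored.truncate(k);
    scored
}

fn main() {
    let dataset = vec![vec![0.0, 1.0], vec![2.0, 0.0], vec![1.0, 1.0]];
    // Scores against [1, 0]: 0.0, 2.0, 1.0 -> top-2 is ids 1 and 2.
    println!("{:?}", top_k(&[1.0, 0.0], &dataset, 2));
}
```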
Efficient Multivector Retrieval with Token-Aware Clustering and Hierarchical Indexing
Silvio Martinico, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Proposes TACHIOM, a multi-vector retrieval system that uses Token-Aware Clustering (TAC) for accurate and scalable token clustering. By combining hierarchical indexing with a MaxSim-optimized Product Quantization layout, TACHIOM achieves up to 247x faster clustering than standard k-means and delivers up to 9.8x faster retrieval compared to state-of-the-art systems.
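MaxSim is the standard late-interaction similarity: for each query token, take the maximum dot product over all document tokens, then sum these maxima. The sketch below is a minimal dense-float reference of that score; TACHIOM evaluates it over a Product-Quantization-compressed layout instead, and the function names here are illustrative, not the library's API.

```rust
// Plain dot product between two token embeddings.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// MaxSim late-interaction score: for each query token, the best
// (maximum) dot product against all document tokens, summed.
fn maxsim(query: &[Vec<f32>], doc: &[Vec<f32>]) -> f32 {
    query
        .iter()
        .map(|q| {
            doc.iter()
                .map(|d| dot(q, d))
                .fold(f32::NEG_INFINITY, f32::max)
        })
        .sum()
}

fn main() {
    let query = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let doc = vec![vec![0.5, 0.5], vec![0.0, 2.0]];
    // Query token 1: max(0.5, 0.0) = 0.5; token 2: max(0.5, 2.0) = 2.0.
    println!("{}", maxsim(&query, &doc)); // 2.5
}
```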
Sparton: Fast and Memory-Efficient Triton Kernel for Learned Sparse Retrieval
Thong Nguyen, Cosimo Rulli, Franco Maria Nardini, Rossano Venturini, Andrew Yates
Sparton is a Triton kernel for the Language Model head in Learned Sparse Retrieval models that fuses tiled matrix multiplication, ReLU, log1p, and max-reduction into a single GPU kernel, achieving up to 4.8x speedup and an order-of-magnitude reduction in peak memory usage compared to PyTorch baselines.
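The four fused operations compute, for each vocabulary term, the maximum over token positions of log1p(relu(logit)), where the logit is the dot product of a token's hidden state with that term's output-embedding column. Below is a scalar Rust reference of that computation, useful only to pin down the math: the real kernel runs tiled on GPU via Triton, and these names are illustrative assumptions.

```rust
// Reference for the LSR head that Sparton fuses on GPU:
// score[j] = max over token positions t of log1p(relu(h[t] . w[j])).
fn lsr_head(hidden: &[Vec<f32>], weights: &[Vec<f32>]) -> Vec<f32> {
    weights
        .iter()
        .map(|w_j| {
            hidden
                .iter()
                .map(|h_t| {
                    // Tiled matmul in the kernel; a plain dot product here.
                    let logit: f32 = h_t.iter().zip(w_j).map(|(h, w)| h * w).sum();
                    logit.max(0.0).ln_1p() // ReLU, then log1p
                })
                .fold(0.0_f32, f32::max) // max-reduction over tokens
        })
        .collect()
}

fn main() {
    // Two token positions, one vocabulary term: logits 1.0 and 2.0,
    // so the score is max(ln(2), ln(3)) = ln(3).
    println!("{:?}", lsr_head(&[vec![1.0], vec![2.0]], &[vec![1.0]]));
}
```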
Efficient Sketching and Nearest Neighbor Search Algorithms for Sparse Vector Sets
Sebastian Bruch, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Under review at the Journal of the ACM
Introduces a theoretically grounded sketching algorithm that reduces the effective dimensionality of sparse vectors while preserving the ranking induced by inner products, and shows its connection to the Seismic data structure.
Forward Index Compression for Learned Sparse Retrieval
Sebastian Bruch, Martino Fontana, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini
Introduces DotVByte, a compression technique optimized for inner product computation that achieves significant space savings while maintaining sparse retrieval efficiency.
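Forward-index compressors of this family typically start from delta-coded term IDs packed into variable-length bytes. The sketch below shows that classic delta + VByte scheme as background; it is a generic illustration, not the DotVByte format, and the function names are assumptions.

```rust
// Delta + variable-byte (VByte) coding of sorted term IDs: store the
// gap between consecutive IDs in 7-bit groups, with the high bit
// marking the final byte of each gap.
fn vbyte_encode(ids: &[u32]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0u32;
    for &id in ids {
        let mut gap = id - prev; // delta-code the sorted IDs
        prev = id;
        loop {
            let byte = (gap & 0x7f) as u8;
            gap >>= 7;
            if gap == 0 {
                out.push(byte | 0x80); // high bit marks the last byte
                break;
            }
            out.push(byte);
        }
    }
    out
}

fn vbyte_decode(bytes: &[u8]) -> Vec<u32> {
    let (mut ids, mut cur, mut shift, mut prev) = (Vec::new(), 0u32, 0, 0u32);
    for &b in bytes {
        cur |= ((b & 0x7f) as u32) << shift;
        if b & 0x80 != 0 {
            prev += cur; // undo the delta coding
            ids.push(prev);
            cur = 0;
            shift = 0;
        } else {
            shift += 7;
        }
    }
    ids
}

fn main() {
    let ids = [3u32, 10, 200, 100_000];
    let encoded = vbyte_encode(&ids);
    println!("{} bytes for {} IDs", encoded.len(), ids.len());
    assert_eq!(vbyte_decode(&encoded), ids.to_vec()); // lossless round trip
}
```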
We are based in Pisa, one of Italy's most storied university cities. Our team works across the University of Pisa and the National Research Council of Italy.
Whether you are a researcher exploring ANN, a practitioner seeking collaboration, or a company looking for tailored solutions, we would love to hear from you. Our doors in Pisa are always open.