Annoy

Erik Bernhardsson · April 12, 2013, 4 a.m.
Annoy is a simple package to find approximate nearest neighbors (ANN) that I just put on GitHub. I’m not trying to compete with existing packages, but Annoy has a couple of features that make it pretty useful. Most importantly, it uses very little memory and can put everything in a contiguous blob that you can mmap from disk. This way multiple processes can share the same index. We use it at Spotify to put a couple of million tracks in 40-dimensional space and then query for the most similar tr...
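
To make the "contiguous blob that you can mmap" idea concrete, here is a minimal sketch using only the Python standard library. It is not Annoy's API or its algorithm: it stores raw float32 vectors back to back in one file, memory-maps that file read-only, and does an exact brute-force cosine scan rather than an approximate tree search. The file layout and the helper names (`write_vectors`, `open_vectors`, `get_vector`, `most_similar`) are assumptions made up for illustration.

```python
import math
import mmap
import os
import random
import struct
import tempfile

DIM = 40              # dimensionality mentioned in the post
ROW_BYTES = DIM * 4   # one float32 vector, stored with no header or padding


def write_vectors(path, vectors):
    """Write all vectors as one contiguous blob of little-endian float32s."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack(f"<{DIM}f", *v))


def open_vectors(path):
    """Memory-map the blob read-only.

    Every process that maps the same file reads the same pages from the
    OS page cache, so the data is not copied once per process.
    """
    f = open(path, "rb")
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return f, mm


def get_vector(mm, i):
    """Decode vector i straight out of the mapped bytes."""
    return struct.unpack_from(f"<{DIM}f", mm, i * ROW_BYTES)


def cosine(a, b):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(mm, query_id, k):
    """Exact top-k by cosine similarity (brute force, NOT approximate)."""
    n = len(mm) // ROW_BYTES
    q = get_vector(mm, query_id)
    scored = [(cosine(q, get_vector(mm, i)), i) for i in range(n) if i != query_id]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]


if __name__ == "__main__":
    # Hypothetical demo data: 1000 random "tracks" in 40 dimensions.
    vectors = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tracks.bin")
        write_vectors(path, vectors)
        f, mm = open_vectors(path)
        print(most_similar(mm, query_id=0, k=5))
        mm.close()
        f.close()
```

The brute-force scan touches every vector on each query, which is exactly the cost an approximate index is meant to avoid. The part that carries over is the storage choice: because the file is a flat, read-only blob, any number of worker processes can call `open_vectors` on the same path and share one copy of the data in memory.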