Facebook Releases New Open Source Library “FAISS” For Clustering Big Data
Technology keeps advancing, and more advanced usually means more involved. As machine learning methods have become more capable, the datasets they work with have grown as well. If the hardware cannot keep up with that volume of data, processing becomes slow and tedious.
Processing such large datasets demands high memory bandwidth and processing power. On top of that, indexing the data points, clustering them, and searching them all become highly demanding.
Researchers at Facebook AI Research (FAIR) have now published a report describing an efficient design for clustering and similarity search. According to the report, the new algorithms run faster than the state-of-the-art algorithms previously in use. They also use GPUs for higher memory bandwidth and computational throughput.
Based on this research, the team has created a library named FAISS and open sourced it. Although algorithms for clustering and similarity search are widely known, the versions in the library are optimized to perform well on GPUs.
The algorithms implemented in the library include –
- Fast K-Nearest Neighbour
- K-Means clustering
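To make the two algorithms concrete, here is a minimal pure-Python sketch of exact k-nearest-neighbour search and Lloyd's k-means. It is not FAISS code and does not use the FAISS API; the function names (`squared_l2`, `knn_search`, `kmeans`) and the toy data are made up for illustration, and the brute-force approach shown here is exactly the kind of work that an optimized GPU library is meant to speed up.

```python
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_search(database, query, k):
    """Exact k-nearest-neighbour search by scanning every database vector.

    Returns a list of (squared_distance, index) pairs, nearest first.
    """
    scored = [(squared_l2(vec, query), i) for i, vec in enumerate(database)]
    scored.sort()
    return scored[:k]


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's k-means: alternate an assignment step and an update step."""
    rng = random.Random(seed)
    # Start from k distinct points picked at random (a simple, common choice).
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid,
        # which is a 1-nearest-neighbour search over the centroids.
        assignments = [knn_search(centroids, p, 1)[0][1] for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Toy data: two well-separated 2-D blobs (hypothetical values).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
neighbours = knn_search(points, (0.0, 0.1), k=2)
centroids, assignments = kmeans(points, k=2)
```

Note that the assignment step of k-means is itself a nearest-neighbour search, which is why a fast similarity search routine also benefits clustering.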
To test how the library performs, it was given the first and the last image, and it produced the intermediate transitional images from a collection of 95 million images.
Some Excellent Features of FAISS Library –
- It is written in C++ with complete Python wrappers
- It supports single and multiple GPUs
- It is highly scalable and typically supports vectors of up to 100 dimensions
- It is built on the BLAS and CUDA libraries
- It is reported to be 8.5x faster than the current state-of-the-art libraries
The FAISS library is available on GitHub at https://github.com/facebookresearch/faiss. So, what do you think of this library? Let us know in the comments.