Resolve conflicts with upstream #1
Open
guangzegu wants to merge 632 commits into guangzegu:dnnl_amx from
guangzegu pushed a commit that referenced this pull request on Oct 14, 2024
…ookresearch#3527)
Summary:
Pull Request resolved: facebookresearch#3527
**Context**
Design Doc: [Faiss Benchmarking](https://docs.google.com/document/d/1c7zziITa4RD6jZsbG9_yOgyRjWdyueldSPH6QdZzL98/edit)
**In this diff**
1. Codecs and indexes can be referenced from the blobstore (bucket & path) outside the experiment.
2. To support #1, naming is moved to descriptors.
3. The built index can be written as well.
4. You can run a benchmark with training, refer to the trained codec when building the index, and then refer to the built index in knn search. Index serialization is optional, although it is not yet exposed through the index descriptor.
5. The benchmark supports indexes with different dataset sizes.
6. Working with varying datasets now supports multiple ground truths. There may be small fixes needed before this can be used.
7. Added targets for bench_fw_range, ivf, codecs and optimize.
**Analysis of ivf result**: D58823037
Reviewed By: algoriddle
Differential Revision: D57236543
fbshipit-source-id: ad03b28bae937a35f8c20f12e0a5b0a27c34ff3b
Summary: Pull Request resolved: #4463 Reviewed By: dtolnay Differential Revision: D79042640 fbshipit-source-id: 4865f3fc37054147cac4da4bd25ccd8b4eb46e2c
Summary: Pull Request resolved: #4462 Reviewed By: dtolnay Differential Revision: D79041872 fbshipit-source-id: 5150bacf3d109e16cf2c6af59ff536e388e2742d
Summary: Pull Request resolved: #4471 endif without an if also revert all of the codemods... Reviewed By: dtolnay Differential Revision: D79101184 fbshipit-source-id: cc353a2119d39b214d83dc7bd901fd2b8b2408bf
Summary: added missing dependency libgflags-dev Pull Request resolved: #4460 Reviewed By: mnorris11 Differential Revision: D79107431 Pulled By: gtwang01 fbshipit-source-id: a11bc19c18a6a91ba42ece7dd045c99d067983ec
Summary: Pull Request resolved: #4476 Reviewed By: dtolnay Differential Revision: D79087418 fbshipit-source-id: 56fd1e880e3f8a7cd39d39ad52b92b5a5035d7f5
#4482) Summary: Pull Request resolved: #4482 cuVS build is broken because deps don't work with python 3.9. It is an old version anyway. I don't think we need to keep it around. Passed here when adding to build-pull-request.yml: https://github.com/facebookresearch/faiss/actions/runs/16600843961/job/46959770569?pr=4482 Reviewed By: pankajsingh88, trang-nm-nguyen, subhadeepkaran, ramilbakhshyiev Differential Revision: D79178046 fbshipit-source-id: c1fe7c4746124b181a8a854ecfa02c99b5cbe8a0
Summary: Pull Request resolved: #4484 Reviewed By: limqiying Differential Revision: D79124980 fbshipit-source-id: b29076d51540567ac71596c6848ce99475f48309
Summary: Pull Request resolved: #4483 Reviewed By: limqiying Differential Revision: D79125371 fbshipit-source-id: 8e5d38579809a94f9d8b84bf5ed7de366df776cc
Summary: Add nn descent as a graph building option for binary CAGRA Pull Request resolved: #4445 Reviewed By: mnorris11 Differential Revision: D79107280 Pulled By: gtwang01 fbshipit-source-id: cfa77147e9e4e7fb7c7a367b1c99652dd2d6a5db
Summary: This PR adds a new struct -- IndexBinaryHNSWCagra, that is used for the interoperability with GpuIndexBinaryCagra. It also adds the serialization, deserialization functions for this new struct and resolves a seg fault in the interop. Furthermore, this new struct allows for only the base layer to be built and searched in the hnsw graph (similar to IndexHNSWCagra). Pull Request resolved: #4478 Reviewed By: gtwang01 Differential Revision: D79365863 Pulled By: mnorris11 fbshipit-source-id: a93559b84ccbe297f7193466e05fcfe0ebf82783
Summary:
Pull Request resolved: #4490
Pull Request resolved: #4489
## Instructions about RACER Diffs:
**This diff was generated by Racer AI agent on behalf of [Junjie Qi](https://www.internalfb.com/profile/view/100004163713284) for T233050949. If the diff quality is poor, consider contacting the user to provide clearer instructions on the task.**
- If you are happy with the changes, commandeer it if minor edits are needed. (**we encourage commandeer to get the diff credit**)
- If you are not happy with the changes, please comment on the diff with clear actions and send it back to the author. Racer will pick it up and re-generate.
- If you really feel the Racer is not helping with this change (alas, some complex changes are hard for AI) feel free to abandon this diff.
## Summary:
Removed unused `#include <faiss/impl/platform_macros.h>` from MetricType.h and replaced it with the specific standard library includes that are actually needed: `<cstdint>` and `<cstdio>`.
The platform_macros.h header contains many Windows-specific macros and definitions that are not used in MetricType.h. The file only needs `int64_t` for the `idx_t` typedef and potentially `size_t` for compatibility. This change reduces unnecessary dependencies and makes the header more focused on its actual requirements.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79398973
fbshipit-source-id: 3b8eca6708b297043eed616ead7530719a75041e
Summary:
Pull Request resolved: #4493
## Summary:
Removed three unused standard library includes from IndexBinary.h:
- `#include <sstream>` - Not used in the header file
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file
These includes were likely left over from previous implementations or refactoring. The header file only needs `<cstdint>` and `<cstdio>` for the basic types it uses. This change reduces compilation dependencies and improves build performance.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79399586
fbshipit-source-id: c5f1cbe8368ed61c65f1ea47320cc05bbd77311f
Summary:
Pull Request resolved: #4494
## Summary:
Removed two unused standard library includes from Index.h:
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file
Note: `#include <sstream>` was kept because downstream code depends on it transitively through Index.h.
Index.h is a core header file that's included by many other files in the Faiss library, so this change reduces compilation dependencies while maintaining compatibility.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79399876
fbshipit-source-id: 17ebf1e208e8c1d6e24881584a16629eb6387a1f
Summary: Pull Request resolved: #4492 The C api has a bug where we don't set the size in this method. The CPP api sets it correctly https://www.internalfb.com/code/fbsource/[630a7e128132fa049cea4b73d9d9eb079e0608d6]/fbcode/assistant/knowledge/search/external/faiss/faiss_c_api.cpp?lines=164 Created from CodeHub with https://fburl.com/edit-in-codehub Differential Revision: D79453787 fbshipit-source-id: 287916f83b35e501c3e293005d9cbea054a7475c
Summary: Pull Request resolved: #4496 nightly broke because of CUDA 11 build. But we don't need it anymore. CUDA 11 is older now. Reviewed By: limqiying Differential Revision: D79460010 fbshipit-source-id: 3ad4238096af27e389c54510e1730061813d636b
Summary:
Pull Request resolved: #4495
## Summary:
Removed three unused standard library includes from index_io.h:
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file
- `#include <vector>` - Not used in the header file
The index_io.h header file contains I/O function declarations for reading/writing Faiss indexes. All function signatures use `const char*`, `FILE*`, and custom Faiss types, with no usage of std::string, std::vector, or typeid. The header only needs `<cstdio>` for FILE type support.
This change reduces compilation dependencies and improves build times by removing unnecessary standard library includes.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=c5bf659e-6f30-11f0-80ca-923e70174196&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=c5bf659e-6f30-11f0-80ca-923e70174196&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79400116
fbshipit-source-id: 7af9152b2ec7f1fc600b04600d00fec877ffedec
… index (#4477)
Summary:
## Description
It appears that this [line](https://github.com/facebookresearch/faiss/blob/main/faiss/gpu/GpuIndexCagra.cu#L540) allocates memory for byte flat vectors and subsequently stores the encoded byte vectors. Unless I’m mistaken, `train_dataset` seems to be a temporary structure that is discarded once index building completes. If that’s the case, I think it would be possible to write the encoded data directly into `train_dataset`. This might eliminate the need for an additional memory allocation.
```
// Sample code that updates train_dataset[i] in place
// Directly update train_dataset with encoded values
for (int64_t i = 0; i < n_train * index->d; ++i) {
    train_dataset[i] = static_cast<int8_t>(
            static_cast<uint8_t>(train_dataset[i]) + 128);
}
// Pass encoded byte vectors to storage
index->storage->add_sa_codes(
        n_train, (uint8_t*) train_dataset, nullptr);
```
This optimization could have a significant impact. For a dataset with 10 million vectors and 768 dimensions, the memory required for flat vectors is approximately 7.15 GB. With the current implementation, this effectively doubles to 14.3 GB due to the additional allocation. (Including the memory for storage, this goes up further to 21.45 GB.)
Pull Request resolved: #4477
Reviewed By: limqiying
Differential Revision: D79493876
Pulled By: mnorris11
fbshipit-source-id: dbee9e55b8a63a9269ff05877f37f53cef36491d
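The memory figures quoted in this description can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming one byte per dimension (byte vectors) and that "GB" here means GiB (2^30 bytes). `flat_bytes` and `to_gib` are hypothetical helper names, not Faiss functions.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for n flat byte vectors of dimension d,
// assuming one byte per dimension.
constexpr std::uint64_t flat_bytes(std::uint64_t n, std::uint64_t d) {
    return n * d;
}

// Hypothetical helper: convert a byte count to GiB (2^30 bytes).
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

With n = 10,000,000 and d = 768, `flat_bytes` gives 7,680,000,000 bytes, which `to_gib` converts to roughly 7.15. Two or three copies of the data scale that linearly, in line with the figures in the description.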
Summary: Pull Request resolved: #4437 Reviewed By: trang-nm-nguyen, bshethmeta Differential Revision: D78683814 Pulled By: pankajsingh88 fbshipit-source-id: c51beee5d5dab854f3b9c326d7bb9327c34cb277
Summary:
Pull Request resolved: #4504
## Summary:
Removed unused header `#include <faiss/IndexBinaryFlat.h>` from AutoTune.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79580489
fbshipit-source-id: 4529112e70869928b658172d8411159eea430de4
Summary:
Pull Request resolved: #4505
## Summary:
Removed unused header `#include <faiss/utils/WorkerThread.h>` from IndexIDMap.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79581336
fbshipit-source-id: 685828e160394716e4201e86cca0480f5ef8a15d
Summary:
Pull Request resolved: #4506
## Summary:
Removed unused header `#include <memory>` from IndexBinaryHash.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79581526
fbshipit-source-id: 4c2327432c6716bff5d3cd114fd3080c6bfb7cd3
Summary:
Pull Request resolved: #4507
## Summary:
Removed unused header `#include <algorithm>` from IndexBinaryHNSW.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)
Reviewed By: limqiying
Differential Revision: D79582071
fbshipit-source-id: 01d02846d84de9a3ca16955ba6603bf04f43e375
…as been renamed to numpy._core._multiarray_umath (#4501) Summary: Pull Request resolved: #4501 Fix the warning that numpy.core._multiarray_umath is deprecated and has been renamed to numpy._core._multiarray_umath Issue: #4491 Reviewed By: mnorris11 Differential Revision: D79535368 fbshipit-source-id: 0eccf13072d373bec1b3052c689bc85ffd8d4acc
Summary:
Pull Request resolved: #4513
## Summary:
This diff removes unused headers from two faiss files:
**AutoTune.cpp:**
- Removed unused `#include <cmath>` header
- Analysis confirmed no mathematical functions (sqrt, pow, ceil, etc.) are used in the file
**clone_index.cpp:**
- Removed unused `#include <cstdio>` header (no printf/fprintf calls)
- Removed unused `#include <cstdlib>` header (no abort/exit/malloc calls)
- Kept `#include <faiss/impl/FaissAssert.h>` which is used extensively
These changes reduce compilation overhead and dependencies without affecting functionality.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Trace)
Reviewed By: trang-nm-nguyen
Differential Revision: D79785715
fbshipit-source-id: 514750d6637cbe1aebb1b5a9985a2494b6143e9a
Summary:
Pull Request resolved: #4514
## Summary:
This diff removes two unused headers from `fbcode/faiss/Index2Layer.cpp`:
- **Removed `#include <cassert>`**: The file uses `FAISS_THROW_*` macros instead of `assert()` calls
- **Removed `#include <cmath>`**: No mathematical functions (sqrt, pow, ceil, floor, etc.) are used in the implementation
**Headers kept** (confirmed usage):
- `cinttypes`: Used for `PRId64` macro in printf statements
- `cstdint`: Used for `int64_t`, `uint8_t` types
- `cstdio`: Used for `printf` calls
- `immintrin.h`: Used for SSE intrinsics (`__m128`, `_mm_*` functions)
- `algorithm`: Used for `std::min`
This change reduces compilation overhead and dependencies without affecting functionality.
---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/) [Session](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Trace)
Reviewed By: trang-nm-nguyen
Differential Revision: D79785857
fbshipit-source-id: e91928f140ab0e9953980ee36e5b4dc46ca5742f
Summary: Pull Request resolved: #4833 Reviewed By: junjieqi Differential Revision: D94054959 fbshipit-source-id: 6b12db53ecc2cf5dcf5409ae89837b7372ec4ed4
Summary: Pull Request resolved: #4832 Reviewed By: junjieqi Differential Revision: D94053800 fbshipit-source-id: 4e2bcfda62cf5297d81d1467577c7dc970e4fd57
Summary: Pull Request resolved: #4823 Casting an int to an undefined enum value is undefined behavior in C++. When deserializing indexes, explicitly validate MetricType and throw an exception if it is not a defined value. Reviewed By: limqiying Differential Revision: D93898294 fbshipit-source-id: e6ddc0128eaa07815ee898253bf0f437ae8d3f8f
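The validate-then-cast pattern described in this commit can be sketched as follows. This is an illustration only: `DemoMetric`, its enumerators, `is_defined_metric`, and `metric_from_int` are hypothetical names, and the enum values do not claim to match Faiss's `MetricType`.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a metric enum (no fixed underlying type,
// so casting an out-of-range integer to it would be undefined behavior).
enum DemoMetric { kInnerProduct = 0, kL2 = 1, kL1 = 2 };

// True only for integers that correspond to a declared enumerator.
inline bool is_defined_metric(int raw) {
    switch (raw) {
        case kInnerProduct:
        case kL2:
        case kL1:
            return true;
        default:
            return false;
    }
}

// Validate first, cast second, so an invalid value never becomes an enum.
inline DemoMetric metric_from_int(int raw) {
    if (!is_defined_metric(raw)) {
        throw std::invalid_argument(
                "unknown metric value: " + std::to_string(raw));
    }
    return static_cast<DemoMetric>(raw);
}
```

A reader would call `metric_from_int` on the integer taken from the serialized stream, so a corrupted file surfaces as an exception rather than as an invalid enum value.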
Summary:
Pull Request resolved: #4808
Convert PQ code distance from compile-time #ifdef __AVX2__ header dispatch to DD-compatible runtime dispatch using PQCodeDistance<PQDecoderT, SIMDLevel> struct templates with per-file SIMD compilation.
Key changes:
- Replace free functions (distance_single_code, distance_four_codes) with PQCodeDistance<PQDecoderT, SIMDLevel> struct template
- Rename impl/code_distance/ to impl/pq_code_distance/ with pq_ prefix on all filenames
- Move public header to faiss/utils/pq_code_distance.h; keep private -inl.h in faiss/impl/pq_code_distance/
- Add enclosing namespace pq_code_distance inside namespace faiss, with using re-exports in the public header for backward compatibility
- Rename .h implementation files to .cpp for per-file SIMD flag compilation
- Add forward declarations of specializations in pq_code_distance-inl.h to prevent callers from silently using the generic primary template
- Use COMPILE_SIMD_* guards only (no redundant || __AVX2__ since build system always defines COMPILE_SIMD_* alongside compiler flags)
- Add pq_code_distance-generic.cpp with DISPATCH_SIMDLevel wrappers for runtime dispatch
- Update IndexIVFPQ.cpp and IndexPQ.cpp to use DISPATCH_SIMDLevel at scanner/distance-computer construction boundary
- Update xplat.bzl SIMD_FILES and CMakeLists.txt SIMD registries
- Update unicorn callers (PQFSTable.h, FeatureEncoding.cpp, FeatureEncodingBenchmark.cpp) to new include path
Based on D72937709, reworked to match current master and DD design doc.
Reviewed By: mdouze
Differential Revision: D93217005
fbshipit-source-id: 8a1b0cce2757e9e96eb2f2e2fb28b71dcdf71557
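The general shape of runtime dispatch onto per-level struct template specializations might look like the sketch below. It is not the Faiss implementation: the `SIMDLevel` enum here is a local stand-in, `SumKernel` and `sum_dispatch` are hypothetical names, and the AVX2 specialization uses scalar code as a placeholder so the example compiles on any machine.

```cpp
#include <cstddef>

// Local stand-in for a SIMD level enum (assumption, not the Faiss type).
enum class SIMDLevel { NONE, AVX2 };

// Primary template is declared but never defined, so asking for a level
// without a specialization is a compile error, not a silent fallback.
template <SIMDLevel SL>
struct SumKernel;

template <>
struct SumKernel<SIMDLevel::NONE> {
    static constexpr int lanes = 1;
    static float run(const float* x, std::size_t n) {
        float s = 0.0f;
        for (std::size_t i = 0; i < n; ++i) s += x[i];
        return s;
    }
};

template <>
struct SumKernel<SIMDLevel::AVX2> {
    static constexpr int lanes = 8;
    // Placeholder body: a real build would put this in a file compiled
    // with AVX2 flags and use intrinsics.
    static float run(const float* x, std::size_t n) {
        float s = 0.0f;
        for (std::size_t i = 0; i < n; ++i) s += x[i];
        return s;
    }
};

// A runtime switch selects the compile-time specialization once,
// at the call boundary.
inline float sum_dispatch(SIMDLevel level, const float* x, std::size_t n) {
    switch (level) {
        case SIMDLevel::AVX2:
            return SumKernel<SIMDLevel::AVX2>::run(x, n);
        case SIMDLevel::NONE:
            break;
    }
    return SumKernel<SIMDLevel::NONE>::run(x, n);
}
```

Leaving the primary template undefined plays a role similar to the forward declarations mentioned in the key changes: callers cannot accidentally instantiate a generic version.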
)
Summary:
Pull Request resolved: #4816
RaBitQ FastScan indexes previously stored per-vector auxiliary data (SignBitFactors, extra-bit codes) in a separate `flat_storage` vector alongside the SIMD-packed codes. This dual-storage design had several drawbacks: it required global vector IDs for auxiliary data lookup during search (problematic for IVF where vectors are referenced by list-local offsets), it complicated serialization with a separate data stream, and it broke data locality between packed codes and their associated factors.
This change embeds auxiliary data directly into the SIMD block layout by appending an auxiliary region after each block's packed PQ4 codes.
Key architectural changes:
- Introduce `get_block_stride()` virtual method on IndexFastScan/IndexIVFFastScan to let subclasses declare enlarged block sizes, threaded through `pq4_accumulate_loop` and `pq4_accumulate_loop_qbs` so the SIMD accumulation loops correctly skip auxiliary regions.
- Introduce `CodePackerRaBitQ` that handles pack_1/unpack_1 of both PQ4 codes and auxiliary data in a single operation, enabling correct behavior for `remove_ids`, `merge_from`, and `reconstruct`.
- Add `postprocess_packed_codes()` hook on IndexIVFFastScan called after `pq4_pack_codes_range` during `add_with_ids` to write auxiliary metadata into blocks.
- RaBitQ handlers read auxiliary factors directly from block regions via `get_block_aux_ptr()`, improving cache locality and enabling correct IVF list-local access.
- Make `get_CodePacker()` virtual on IndexFastScan so subclasses return the appropriate packer.
- Fix BlockInvertedLists::remove_ids race condition where return value was always 0: capture `orig_size` before `resize()` and use OpenMP `reduction(+:nremove)` for thread-safe accumulation.
- Remove duplicate `#include` statements in IndexFastScan.cpp.
Reviewed By: mdouze
Differential Revision: D93538118
fbshipit-source-id: 85bd017fdee5f34fde66656a7fc17ae6da758de5
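The idea of a block stride that covers both packed codes and an appended auxiliary region can be illustrated with plain pointer arithmetic. This is a sketch under stated assumptions: `BlockLayout` and its members are hypothetical names, and the byte sizes used with it are made up, not the real RaBitQ block sizes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout descriptor: each block holds packed codes followed
// by an auxiliary region.
struct BlockLayout {
    std::size_t packed_bytes; // bytes of packed codes per block
    std::size_t aux_bytes;    // bytes of auxiliary data appended per block

    // Distance in bytes between the starts of consecutive blocks.
    std::size_t stride() const {
        return packed_bytes + aux_bytes;
    }

    // Start of block b; loops over codes must step by stride(), not by
    // packed_bytes, or they would read auxiliary data as codes.
    const std::uint8_t* block_ptr(const std::uint8_t* base, std::size_t b) const {
        return base + b * stride();
    }

    // Start of the auxiliary region of block b, right after its codes.
    const std::uint8_t* aux_ptr(const std::uint8_t* base, std::size_t b) const {
        return block_ptr(base, b) + packed_bytes;
    }
};
```

Because the auxiliary region is addressed relative to the block, a lookup needs only the list-local block index, which is the property the summary highlights for IVF.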
Summary: Pull Request resolved: #4840 Faiss's SWIG-generated Python bindings lack static type information, which means type checkers (mypy, pyright, Pyre) cannot analyze faiss code and IDEs cannot provide autocompletion. This has been a community request. This adds a comprehensive `__init__.pyi` stub file covering the full public API — all Index classes, quantizers, vector transforms, I/O functions, GPU classes, clustering, and utility functions — with `overload` signatures for both `numpy.NDArray` and `torch.Tensor` inputs. A `py.typed` PEP 561 marker is included so type checkers automatically discover the stubs. `setup.py` is updated to copy the stub files into the staging directory during build and include `*.pyi` and `py.typed` in `package_data` so they ship in the wheel. Reviewed By: limqiying Differential Revision: D94139717 fbshipit-source-id: 4449f2275d716bf2c8e52e3e97c28ff6ad26a885
…4841) Summary: Pull Request resolved: #4841 D93538118 changed the RaBitQ FastScan serialization format by embedding auxiliary data directly into SIMD blocks (fourcc "Irfs"->"Irfn" for non-IVF, "Iwrf"->"Iwrn" for IVF). The conda faiss-cpu=1.13.2 reader used in the GitHub Actions backward compatibility test does not understand the new format, causing the "CMake Write -> Conda Read" test to fail with malformed fourcc errors. Temporarily comment out all 8 RaBitQ FastScan index types from the test until a new conda release includes the new format, at which point they should be re-enabled and the conda version bumped (following the pattern from D89523639). Reviewed By: junjieqi Differential Revision: D94432870 fbshipit-source-id: d7d8129b4a9a0f0ff163c4a56e8f272bba2f970b
Summary: Pull Request resolved: #4836 Replace the three `get_InvertedListScanner1/2/3` free function templates with nested templatized lambdas inside `IndexIVFPQ::get_InvertedListScanner`, using `with_simd_level` for SIMD dispatch. This eliminates the repetition of `(index, store_pairs, sel)` parameters through three levels of function calls, and removes the unreachable `return nullptr` in the old metric_type dispatch. Follow-up to D93217005 per reviewer feedback. Reviewed By: mdouze Differential Revision: D94351163 fbshipit-source-id: 54223a156ee361aa9bfb829b8ed355bf4a09f686
Summary:
Pull Request resolved: #4838
Pure type-system refactor, no behavioral change. Convert the ScalarQuantizer template parameter from `int SIMDWIDTH` (1/8/16) to `SIMDLevel SL` (NONE/AVX2/AVX512/ARM_NEON) across quantizers.h, similarities.h, distance_computers.h, and ScalarQuantizer.cpp.
Key changes:
- Primary templates: `int SIMD` → `SIMDLevel SL`
- Specializations: `<1>` → `<SIMDLevel::NONE>`, `<8>` → AVX2/ARM_NEON, `<16>` → AVX512
- Add `static constexpr SIMDLevel simd_level` to Similarity structs
- Split combined `USE_F16C || USE_NEON` guards into separate blocks for distinct SIMDLevel values (AVX2 vs ARM_NEON)
- Add `simd_width<SL>()` constexpr helper for future use
Prepares for per-SIMD .cpp file splitting in the next diff.
Reviewed By: mdouze
Differential Revision: D94375444
fbshipit-source-id: 4f0d3629d796d80e26e953e6b24054186ad6e753
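A constexpr helper mapping a SIMD level back to a lane count could be sketched as below. The enum is a local stand-in and the function body is an assumption derived from the width-to-level mapping listed in this commit (1 for NONE, 8 for AVX2 and ARM_NEON, 16 for AVX512); the actual Faiss helper may differ.

```cpp
// Local stand-in for the SIMD level enum named in the summary.
enum class SIMDLevel { NONE, AVX2, AVX512, ARM_NEON };

// Assumed mapping, following the specialization list in the commit:
// NONE -> 1, AVX2 / ARM_NEON -> 8, AVX512 -> 16.
template <SIMDLevel SL>
constexpr int simd_width() {
    return SL == SIMDLevel::AVX512
            ? 16
            : (SL == SIMDLevel::AVX2 || SL == SIMDLevel::ARM_NEON) ? 8 : 1;
}
```

Because the result is a compile-time constant, it can size arrays or select loop unrolling inside code templated on the level.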
Summary: Pull Request resolved: #4837 Continue application of std::unique_ptr<> to ensure memory allocated during index deserialization is freed during exception handling. Reviewed By: mdouze Differential Revision: D94242748 fbshipit-source-id: 5238aaa001fe89c91111a87cef5a05724197a313
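Why `std::unique_ptr` helps during deserialization can be shown with a small sketch. `DemoIndex` and `read_demo_index` are hypothetical; the instance counter exists only to make a leak observable in this example.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical index type that counts live instances.
struct DemoIndex {
    static int live;
    DemoIndex() { ++live; }
    ~DemoIndex() { --live; }
};
int DemoIndex::live = 0;

// Hypothetical reader that returns a raw owning pointer on success.
inline DemoIndex* read_demo_index(bool corrupt) {
    std::unique_ptr<DemoIndex> idx(new DemoIndex());
    if (corrupt) {
        // The unique_ptr destructor frees the object during stack unwinding,
        // so the partially read index is not leaked.
        throw std::runtime_error("malformed index data");
    }
    return idx.release(); // success: hand ownership to the caller
}
```

With a bare `new` in place of the `unique_ptr`, the throwing path would leave the object allocated with no owner.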
Summary: Pull Request resolved: #4827 In validate_HNSW(), bounds check access to cum_nneighbor_per_level for each level of the deserialized HNSW. Reviewed By: junjieqi Differential Revision: D93903637 fbshipit-source-id: ff2a161e5f05890527f7bacc91a8a8e47e80776f
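A bounds check of this kind might look like the following sketch. It assumes each deserialized per-node level value is later used as an index into the cumulative neighbor table; `levels_in_bounds` is a hypothetical helper, not the body of `validate_HNSW()`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical check: every level value read from the file must be a valid
// index into the cumulative neighbor table before it is used as one.
inline bool levels_in_bounds(
        const std::vector<int>& levels,
        const std::vector<int>& cum_nneighbor_per_level) {
    for (int level : levels) {
        if (level < 0 ||
            static_cast<std::size_t>(level) >= cum_nneighbor_per_level.size()) {
            return false;
        }
    }
    return true;
}
```

A loader would reject the file (for example by throwing) when this returns false, instead of indexing past the end of the table.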
Summary:
Pull Request resolved: #4844
Fix 1: IndexLattice r2 and dsq validation
- Tightens the r2 check from r2 >= 0 to r2 > 0. r2 must be greater than zero to avoid a divide by zero during normalization.
- Adds a new check that 'dsq = d/nsq' is a power of 2 and >= 2. This aligns with asserts in ZnSphereCodecRec and prevents an invalid 'cache_level' value of -1.
Fix 2: Binary hash invlists buffer validation
- Changes from pre-allocating the buffer with a computed size (then overwriting via READVECTOR) to reading the buffer first and then checking it is large enough. Since READVECTOR determines its size from the serialized data stream, the previous logic didn't actually verify that the buffer was fully filled with valid data.
Reviewed By: mdouze
Differential Revision: D94558548
fbshipit-source-id: b061faf5a44615346a12b1d3a6d52e596db66ba9
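The two IndexLattice checks from Fix 1 can be expressed compactly. This is a sketch: `is_power_of_two` and `valid_lattice_params` are hypothetical helpers, the integer type of `r2` is an assumption, and the requirement that `d` divide evenly by `nsq` is an added assumption that the commit text does not state.

```cpp
// True for 1, 2, 4, 8, ...: a power of two has exactly one bit set.
inline bool is_power_of_two(int x) {
    return x > 0 && (x & (x - 1)) == 0;
}

// Hypothetical validator mirroring the checks described in Fix 1.
inline bool valid_lattice_params(int d, int nsq, int r2) {
    // Assumption: nsq must be positive and split d evenly.
    if (nsq <= 0 || d <= 0 || d % nsq != 0) {
        return false;
    }
    const int dsq = d / nsq;
    // r2 > 0 avoids a divide by zero; dsq must be a power of 2 and >= 2.
    return r2 > 0 && dsq >= 2 && is_power_of_two(dsq);
}
```

For example, `d = 64, nsq = 4` gives `dsq = 16` and passes, while `d = 12, nsq = 4` gives `dsq = 3` and is rejected.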
Summary: Pull Request resolved: #4852 Prior one: D79863512 Wiki: https://www.internalfb.com/wiki/Vector_Search/Onboarding/Faiss_Contributor/Release/Release_Process/ Reviewed By: subhadeepkaran Differential Revision: D94947889 fbshipit-source-id: 94950b1704bef471f86420f7ebefbafdf5536cd7
…ay1(-1) (#4846)

Summary:

## Problem

Building faiss with SWIG 4.4.0+ fails to compile the generated wrapper:

```
In function 'int SWIG_mod_exec(PyObject*)':
error: cannot convert 'std::nullptr_t' to 'int' in return
    import_array();
```

SWIG 4.4.0 introduced PEP 489 multi-phase module initialization, where `SWIG_mod_exec` returns `int` instead of `PyObject*`. NumPy's `import_array()` macro expands to `return NULL`, which is a type mismatch.

Fixes #4845.

## Why not just use `import_array1(-1)`?

`import_array1(-1)` expands to `return -1`. This fixes SWIG 4.4+ but breaks older SWIG versions, which generate `PyInit_()` returning `PyObject*`, and the upstream CI (SWIG 4.0.2) confirmed exactly this failure:

```
In function 'PyObject* PyInit__swigfaiss()':
error: invalid conversion from 'int' to 'PyObject*'
    import_array1(-1);
```

## Fix

Isolate the NumPy import into a helper function that always returns `int`, then return the correct error value from `%init` using a `SWIG_VERSION` preprocessor check:

```diff
+%{
+static int _faiss_init_numpy() {
+    import_array1(-1);
+    return 0;
+}
+%}
+
 %init %{
-    import_array();
+    if (_faiss_init_numpy() < 0) {
+#if SWIG_VERSION >= 0x040400
+        return -1;
+#else
+        return NULL;
+#endif
+    }
     PythonInterruptCallback::reset();
 %}
```

- `_faiss_init_numpy()` always returns `int`, so `import_array1(-1)` inside it compiles cleanly under any SWIG version
- The `%init` block then returns the correct type for its enclosing function: `-1` (int) for SWIG 4.4+ multi-phase init, `NULL` (PyObject*) for older single-phase init

Pull Request resolved: #4846

Reviewed By: junjieqi

Differential Revision: D94921302

Pulled By: mnorris11

fbshipit-source-id: 8d4fab61ffce76a10ee7b3f00bab87471b48e02a
Summary: Pull Request resolved: #4856 Using a Hadamard transformation is very similar to using a random rotation matrix, but it can be computed in `d log d` time instead of `d^2`. I initially explored a fast Fourier transform, which does outperform a random rotation, but a Hadamard transformation has the same computational complexity as an FFT and is much easier to implement in practice. Because it is also an orthonormal transformation, the recall is identical to that of a random rotation matrix. I added a benchmark to compare both. Reviewed By: mdouze Differential Revision: D94658424 fbshipit-source-id: bb2f9dc0ab5a41ce58874a1e72a3e85c22e58ea2
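For illustration, a textbook in-place fast Walsh-Hadamard transform shows where the `d log d` cost comes from: `log2(d)` butterfly passes of `d` additions each. This is a generic sketch with a hypothetical function name, assuming `d` is a power of two; it is not the code added in this diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place fast Walsh-Hadamard transform, scaled by 1/sqrt(d) so that the
// transform is orthonormal (it preserves L2 norms and inner products).
// Assumption: x.size() is a power of two.
void hadamard_transform(std::vector<float>& x) {
    const std::size_t d = x.size();
    assert(d > 0 && (d & (d - 1)) == 0);
    // log2(d) passes; each pass combines pairs that are h apart.
    for (std::size_t h = 1; h < d; h *= 2) {
        for (std::size_t i = 0; i < d; i += 2 * h) {
            for (std::size_t j = i; j < i + h; ++j) {
                const float a = x[j];
                const float b = x[j + h];
                x[j] = a + b;
                x[j + h] = a - b;
            }
        }
    }
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    for (float& v : x) {
        v *= scale;
    }
}
```

With this scaling the transform is its own inverse, so applying it twice returns the input up to rounding.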
…spatch (#4839)

Summary:
Pull Request resolved: #4839

Split the SIMD-gated template specializations out of ScalarQuantizer.cpp and the shared headers into per-SIMD compilation units and wire up the Dynamic Dispatch (DD) infrastructure (COMPILE_SIMD_*, with_simd_level).

**What moved where**
- SIMD specializations removed from `codecs.h`, `quantizers.h`, `similarities.h`, `distance_computers.h`; these now contain only primary templates and scalar (`SIMDLevel::NONE`) specializations. (Most use empty primary templates; `quantizers.h` uses an inheriting fallback pattern for `QuantizerFP16`, `QuantizerBF16`, etc.)
- SIMD specializations moved into `sq-avx2.cpp` / `sq-avx512.cpp` / `sq-neon.cpp`, each guarded by `COMPILE_SIMD_*`.
- `sq-generic.cpp` deleted; the `NONE` level is now instantiated directly in `ScalarQuantizer.cpp` via `sq-dispatch.h`.
- `sq-inl.h` renamed to `scanners.h`.

**Dispatch mechanism**
- `sq-dispatch.h` is an X-macro-style header: each per-SIMD `.cpp` file `#define`s `THE_LEVEL_TO_DISPATCH` and `#include`s it to stamp out explicit template specializations of the selection functions (`sq_select_quantizer`, `sq_select_distance_computer`, `sq_select_InvertedListScanner`).
- `ScalarQuantizer.cpp` uses `with_simd_level` for runtime dispatch and instantiates the `NONE` level via the same `sq-dispatch.h`.
- Each per-SIMD selection function returns `nullptr` when the dimension doesn't align, and the caller falls back to `NONE`.
- `sq-neon.cpp` handles both `ARM_NEON` and `ARM_SVE` (SVE forwards to NEON; there is no dedicated SVE SQ implementation yet).

**Build**
- `xplat.bzl`, `CMakeLists.txt`: register new SIMD source files and headers.
- Within the SQ module, `COMPILE_SIMD_*` macros gate all SIMD code paths. (Compiler-defined macros like `__AVX2__` are still used in lower-level shared headers like `simdlib.h` and `fp16.h`.)

Reviewed By: mdouze

Differential Revision: D94375408

fbshipit-source-id: a07c31540242defcc605dd74e07bd25b8c163f43
Summary: Primarily for the sake of completeness, we could expose an IndexBinaryFlat to the C API. Pull Request resolved: #4834 Reviewed By: junjieqi Differential Revision: D94736003 Pulled By: gtwang01 fbshipit-source-id: 563a0b2b6684bead170ace6df08cc8ddd29c77d5
Summary:
Pull Request resolved: #4850

The multi-bit RaBitQ distance computation (`compute_full_multibit_distance`) previously extracted each code value bit-by-bit using `extract_code_inline`, which iterated `ex_bits` times per dimension, for O(d × ex_bits) total with a data-dependent branch per bit.

This diff replaces it with two complementary optimizations:

**1. Improved scalar extraction (all platforms):** Replaces the per-bit extraction loop with a 64-bit window read (`memcpy` + shift + mask) that extracts each code value in O(1) regardless of `ex_bits`. This alone gives 25–142% QPS improvement (higher gains at more bits).

**2. SIMD bit-plane decomposition (AVX2 + BMI2):** Instead of extracting per-element multi-bit codes, decomposes the inner product into `(1 + ex_bits)` bit-plane dot products. Each plane is a float × bit-vector dot product computed via bit→mask→float conversion. For `ex_bits == 1`, both sign and ex are 1-bit packed, enabling zero-extraction kernels (AVX-512 and AVX2). For `ex_bits` 2–7, BMI2 PEXT extracts each bit plane in one instruction per 8 dimensions.

Also adds `-mbmi2` to the AVX2 compiler flags in `xplat.bzl`.

Recall@10 is identical across all nb_bits before and after.

Reviewed By: junjieqi, mdouze

Differential Revision: D94587233

fbshipit-source-id: fdcbbeb1b956e98fc05dbe6dc6cba365fb553750
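A rough sketch of the scalar idea in optimization 1, contrasting a per-bit loop with a single window read. Both function names are hypothetical, and the bit layout (values packed back to back, least significant bit first, on a little-endian host) is an assumption for illustration rather than the library's documented code format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Per-bit reference: assembles value i with ex_bits loop iterations.
inline uint32_t extract_code_bitwise(
        const uint8_t* codes, std::size_t i, int ex_bits) {
    uint32_t v = 0;
    for (int b = 0; b < ex_bits; ++b) {
        const std::size_t pos = i * ex_bits + b;
        const uint32_t bit = (codes[pos / 8] >> (pos % 8)) & 1u;
        v |= bit << b;
    }
    return v;
}

// Window read: copy up to 8 bytes starting at the byte that holds the
// first bit of value i, then shift and mask. Constant work per value.
// Assumption: little-endian host, 1 <= ex_bits <= 8.
inline uint32_t extract_code_window(
        const uint8_t* codes, std::size_t code_size, std::size_t i, int ex_bits) {
    assert(ex_bits >= 1 && ex_bits <= 8);
    const std::size_t pos = i * ex_bits;
    const std::size_t byte = pos / 8;
    assert(byte < code_size);
    uint64_t w = 0;
    // Clamp the copy so the read never runs past the end of the buffer.
    std::memcpy(&w, codes + byte, std::min<std::size_t>(8, code_size - byte));
    const uint64_t mask = (uint64_t(1) << ex_bits) - 1;
    return static_cast<uint32_t>((w >> (pos % 8)) & mask);
}
```

The shift is at most 7 and the value at most 8 bits wide, so the result always lies inside the 64-bit window.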
Summary: Pull Request resolved: #4859 Needed because the svs min version is 3.14, which is blocking D95003637, which in turn is blocking Alibek from uncommenting the backward compat Reviewed By: subhadeepkaran Differential Revision: D95087382 fbshipit-source-id: f06c87dd896d2dcbec77e935cf8fb6d321a6a0ab
Summary: The last build pushed to conda-forge was 1.9 in December 2024. Removing the conda-forge instructions from INSTALL.md for now so that they are not misleading. Pull Request resolved: #4843 Reviewed By: mnorris11 Differential Revision: D94739268 Pulled By: gtwang01 fbshipit-source-id: 7740161481f7f0fb3d099ec4bd54f964b009aadf
Summary: This PR will add LeanVec OOD (out-of-distribution) support. Pull Request resolved: #4773 Reviewed By: alibeklfc Differential Revision: D94607022 Pulled By: mnorris11 fbshipit-source-id: df1b023919b6a2753fdddcd7d5490475f8630d36
Summary:
A new version is needed to include the Python 3.14 change so we can fix the backwards compatibility tests.
- Increment version to 1.14.1
- Update CHANGELOG.md with 7 commits since v1.14.0
- Bump version in CMakeLists.txt, setup.py, Index.h, INSTALL.md
- Fix 6 missing CHANGELOG comparison links (1.12.0–1.14.0)

Pull Request resolved: #4861

Test Plan:
- [ ] CI passes
- [ ] Backward compatibility test uses correct version
- [ ] CHANGELOG entries are correctly categorized

Reviewed By: junjieqi

Differential Revision: D95254574

Pulled By: mnorris11

fbshipit-source-id: c6650fa1783b888c4000e4700da9c8bf01150176
…tQ (#4877)

Summary:
Pull Request resolved: #4877

Refactor the multi-bit RaBitQ inner product computation from the squared-distance-then-convert approach to a direct dot-product formulation.

Before:
  IP = -0.5 * (||q-c||² + (||r||² - ||x||²) + (-2·||r||/ipnorm)·ex_ip - ||q||²)

After:
  IP = <q,c> + <c,r> + (||r||/ipnorm)·ex_ip

Both are mathematically equivalent to <q, x>. The new form is simpler and makes the code structurally immune to the D95166460 bug class where the degenerate case (x ≈ centroid) required complex metric-specific branching that was easy to get wrong.

Changes:
- Per-document factors (compute_ex_factors): IP branch now computes f_add_ex = <c, r> and f_rescale_ex = ||r||/ipnorm (positive, no -2 factor)
- Degenerate case simplified to metric-agnostic f_add_ex=0, f_rescale_ex=0
- Core distance function (compute_full_multibit_distance): accepts a single qr_base parameter instead of qr_to_c_L2sqr + qr_norm_L2sqr, eliminates the -0.5*(dist - qr_norm_L2sqr) IP post-processing
- Added q_dot_c field to QueryFactorsData, computed at all query-setup sites
- L2 path, 1-bit path, and SIMD kernels are completely unchanged
- All four index types covered: IndexRaBitQ, IndexIVFRaBitQ, IndexRaBitQFastScan, IndexIVFRaBitQFastScan

Breaking change: serialized multi-bit IP indexes must be re-encoded. L2 indexes are unaffected.

Reviewed By: ddrcoder, latham-meta

Differential Revision: D95419974

fbshipit-source-id: 3a5a33b5a3065d172a2f57578f3a78bdf562b87c
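The equivalence of the Before and After forms can be checked by direct expansion. The derivation below assumes, which the summary does not state explicitly, that r is the residual x − c (so x = c + r), and abbreviates e = ex_ip / ipnorm.

```latex
\begin{align*}
\|q-c\|^2 &= \|q\|^2 - 2\langle q,c\rangle + \|c\|^2, \\
\|r\|^2 - \|x\|^2 &= \|r\|^2 - \|c+r\|^2 = -\|c\|^2 - 2\langle c,r\rangle, \\
\mathrm{IP}_{\text{before}}
  &= -\tfrac12\Bigl(\|q-c\|^2 + \bigl(\|r\|^2 - \|x\|^2\bigr) - 2\,\|r\|\,e - \|q\|^2\Bigr) \\
  &= -\tfrac12\Bigl(-2\langle q,c\rangle - 2\langle c,r\rangle - 2\,\|r\|\,e\Bigr) \\
  &= \langle q,c\rangle + \langle c,r\rangle + \|r\|\,e
   = \mathrm{IP}_{\text{after}}.
\end{align*}
```

Under the same assumption, <q, x> = <q,c> + <c,r> + <q−c, r>, so the After form equals <q, x> to the extent that ||r||·e estimates <q−c, r>.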
Summary: Installs [SVS v0.2.0](https://github.com/intel/ScalableVectorSearch/releases/tag/v0.2.0) which includes updates for LeanVec OOD and IVF to be integrated in subsequent PRs. Also removes intel conda channel logic as libsvs-runtime is now available on conda-forge. Pull Request resolved: #4860 Reviewed By: alibeklfc Differential Revision: D95164622 Pulled By: mnorris11 fbshipit-source-id: 3b2bbaa8c5283e6aa5ac5ee250e8eac4ccba0e5a
Summary: Pull Request resolved: #4876 Reviewed By: alibeklfc Differential Revision: D95420200 fbshipit-source-id: b30d0c9c0342460187c3b8dcffbbc3df88f4a11d
Summary: Pull Request resolved: #4855 Reviewed By: zoeyeye Differential Revision: D95003637 fbshipit-source-id: 8def3429aa3890960c5b5b82431ad2878bac5c85
Summary: Pull Request resolved: #4864 FAISS index data structures use C-compatible types and thus cannot embed RAII types like std::unique_ptr to trivially ensure exception safety. For this reason, allocations within constructors must be performed within the body of constructors, where RAII types and other exception handling tools are available. This diff makes a pass through the index types to clean up the "own_fields" allocation pattern. "own_fields" and the pointer it guards are now both initialized to safe (false/nullptr) values at the definition site and only updated within the constructor body when further exceptions in the constructor are no longer possible. This also fixes a few cases where these fields had indeterminate or incorrect state when an exception was thrown. There are a few cases that perform the strange-looking operation 'std::make_unique().release()'. This is functionally equivalent to just calling new, but hopefully encourages the continued use of std::unique_ptr should the constructor logic change to introduce new paths by which exceptions can be thrown. Reviewed By: mnorris11 Differential Revision: D95294213 fbshipit-source-id: 773d382ed7eedbb81717a15d604f3e2ca33b376b
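A minimal sketch of the pattern described above, using hypothetical `ExampleQuantizer` and `ExampleIndex` types rather than real FAISS classes: the raw pointer and its `own_fields` flag start in a safe state, the allocation is held by a `std::unique_ptr` while the constructor can still throw, and ownership is released to the raw pointer only at the end.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical owned sub-object.
struct ExampleQuantizer {
    int d;
    explicit ExampleQuantizer(int d_in) : d(d_in) {}
};

// Hypothetical index with a C-compatible layout: a raw pointer plus a flag.
struct ExampleIndex {
    ExampleQuantizer* quantizer = nullptr; // safe default at definition site
    bool own_fields = false;               // safe default at definition site

    explicit ExampleIndex(int d) {
        // Held by RAII while later steps may still throw.
        auto q = std::make_unique<ExampleQuantizer>(d);
        if (d <= 0) {
            // q is freed automatically; the members keep their safe defaults.
            throw std::invalid_argument("d must be positive");
        }
        // No further exceptions are possible: hand over ownership.
        quantizer = q.release();
        own_fields = true;
    }

    ExampleIndex(const ExampleIndex&) = delete;
    ExampleIndex& operator=(const ExampleIndex&) = delete;

    ~ExampleIndex() {
        if (own_fields) {
            delete quantizer;
        }
    }
};
```

Because the destructor does not run when a constructor throws, the `unique_ptr` is what prevents the leak on the error path.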
Summary: Pull Request resolved: #4883 Explicitly verify the results of dynamic_cast and throw descriptive exceptions when the casts fail. Of the 6 sites addressed here, 5 are defensive (e.g. against a future code change that breaks a current invariant). The sixth, which casts to IndexPQ*, actually protects against corrupted or malicious input that embeds an unexpected or malformed index type. Reviewed By: mnorris11 Differential Revision: D95479168 fbshipit-source-id: e74269ce3d18a6a0583e17e32d352f771d437934
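The check can be sketched with a small helper. `checked_cast` and the `Example*` types are hypothetical and stand in for the real index hierarchy; the diff itself presumably uses the library's own exception facilities rather than `std::runtime_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical polymorphic hierarchy standing in for the index classes.
struct ExampleIndexBase {
    virtual ~ExampleIndexBase() = default;
};
struct ExamplePQ : ExampleIndexBase {
    int M = 8;
};
struct ExampleFlat : ExampleIndexBase {};

// Cast and verify: throws with a descriptive message instead of handing a
// null pointer to the caller.
template <class T>
T* checked_cast(ExampleIndexBase* idx, const char* expected) {
    T* p = dynamic_cast<T*>(idx);
    if (p == nullptr) {
        throw std::runtime_error(
                std::string("unexpected index type, expected ") + expected);
    }
    return p;
}
```

When the object being cast comes from a deserialized file, a failed cast becomes a reported error rather than a null dereference.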
No description provided.