Resolve conflicts with upstream #1

Open
guangzegu wants to merge 632 commits into guangzegu:dnnl_amx from facebookresearch:main

Conversation

@guangzegu
Owner

No description provided.

guangzegu pushed a commit that referenced this pull request Oct 14, 2024
…ookresearch#3527)

Summary:
Pull Request resolved: facebookresearch#3527

**Context**
Design Doc: [Faiss Benchmarking](https://docs.google.com/document/d/1c7zziITa4RD6jZsbG9_yOgyRjWdyueldSPH6QdZzL98/edit)

**In this diff**
1. Codecs and indexes can be referenced from the blobstore (bucket and path) outside the experiment.
2. To support item 1, naming is moved to descriptors.
3. The built index can be written as well.
4. You can run a benchmark with training, refer to the trained result when building the index, and then refer to the built index in kNN search. Index serialization is optional, although this is not yet exposed through the index descriptor.
5. The benchmark can support indexes with different dataset sizes.
6. Working with varying datasets now supports multiple ground truths. Small fixes may be needed before this can be used.
7. Added targets for bench_fw_range, ivf, codecs and optimize.

**Analysis of ivf result**: D58823037

Reviewed By: algoriddle

Differential Revision: D57236543

fbshipit-source-id: ad03b28bae937a35f8c20f12e0a5b0a27c34ff3b
generatedunixname89002005287564 and others added 29 commits July 26, 2025 14:54
Summary: Pull Request resolved: #4463

Reviewed By: dtolnay

Differential Revision: D79042640

fbshipit-source-id: 4865f3fc37054147cac4da4bd25ccd8b4eb46e2c
Summary: Pull Request resolved: #4462

Reviewed By: dtolnay

Differential Revision: D79041872

fbshipit-source-id: 5150bacf3d109e16cf2c6af59ff536e388e2742d
…nvlists (#4467)

Summary: Pull Request resolved: #4467

Reviewed By: dtolnay

Differential Revision: D79069040

fbshipit-source-id: 9028f0092aaef5e6cb0e133bffe59be6d58dbd7e
Summary:
Pull Request resolved: #4471

endif without an if

also revert all of the codemods...

Reviewed By: dtolnay

Differential Revision: D79101184

fbshipit-source-id: cc353a2119d39b214d83dc7bd901fd2b8b2408bf
Summary:
added missing dependency libgflags-dev

Pull Request resolved: #4460

Reviewed By: mnorris11

Differential Revision: D79107431

Pulled By: gtwang01

fbshipit-source-id: a11bc19c18a6a91ba42ece7dd045c99d067983ec
…tils [A] (#4475)

Summary: Pull Request resolved: #4475

Reviewed By: dtolnay

Differential Revision: D79098681

fbshipit-source-id: 508afe024f4a2d3fed43aee85979121a58b023ab
Summary: Pull Request resolved: #4476

Reviewed By: dtolnay

Differential Revision: D79087418

fbshipit-source-id: 56fd1e880e3f8a7cd39d39ad52b92b5a5035d7f5
…tils [B] [A] (#4480)

Summary: Pull Request resolved: #4480

Reviewed By: dtolnay

Differential Revision: D79136745

fbshipit-source-id: c1a0a326065033a2427cd687e295630a73e4cd05
…tils [B] [B] (#4479)

Summary: Pull Request resolved: #4479

Reviewed By: dtolnay

Differential Revision: D79137001

fbshipit-source-id: 2ac1b3d4fd2c0acbd001d292a0a9354a1706b160
#4482)

Summary:
Pull Request resolved: #4482

cuVS build is broken because deps don't work with python 3.9. It is an old version anyway. I don't think we need to keep it around.

Passed here when adding to build-pull-request.yml: https://github.com/facebookresearch/faiss/actions/runs/16600843961/job/46959770569?pr=4482

Reviewed By: pankajsingh88, trang-nm-nguyen, subhadeepkaran, ramilbakhshyiev

Differential Revision: D79178046

fbshipit-source-id: c1fe7c4746124b181a8a854ecfa02c99b5cbe8a0
Summary: Pull Request resolved: #4484

Reviewed By: limqiying

Differential Revision: D79124980

fbshipit-source-id: b29076d51540567ac71596c6848ce99475f48309
Summary: Pull Request resolved: #4483

Reviewed By: limqiying

Differential Revision: D79125371

fbshipit-source-id: 8e5d38579809a94f9d8b84bf5ed7de366df776cc
Summary:
Add nn descent as a graph building option for binary CAGRA

Pull Request resolved: #4445

Reviewed By: mnorris11

Differential Revision: D79107280

Pulled By: gtwang01

fbshipit-source-id: cfa77147e9e4e7fb7c7a367b1c99652dd2d6a5db
Summary:
This PR adds a new struct, IndexBinaryHNSWCagra, used for interoperability with GpuIndexBinaryCagra. It also adds serialization and deserialization functions for the new struct and resolves a segfault in the interop. Furthermore, the new struct allows only the base layer of the HNSW graph to be built and searched (similar to IndexHNSWCagra).

Pull Request resolved: #4478

Reviewed By: gtwang01

Differential Revision: D79365863

Pulled By: mnorris11

fbshipit-source-id: a93559b84ccbe297f7193466e05fcfe0ebf82783
Summary:
Pull Request resolved: #4490

Pull Request resolved: #4489

## Instructions about RACER Diffs:
**This diff was generated by Racer AI agent on behalf of [Junjie Qi](https://www.internalfb.com/profile/view/100004163713284) for T233050949. If the diff quality is poor, consider contacting the user to provide clearer instructions on the task.**

- If you are happy with the changes, commandeer it if minor edits are needed. (**we encourage commandeer to get the diff credit**)
- If you are not happy with the changes, please comment on the diff with clear actions and send it back to the author. Racer will pick it up and re-generate.
- If you really feel the Racer is not helping with this change (alas, some complex changes are hard for AI) feel free to abandon this diff.

## Summary:
Removed unused `#include <faiss/impl/platform_macros.h>` from MetricType.h and replaced it with the specific standard library includes that are actually needed: `<cstdint>` and `<cstdio>`.

The platform_macros.h header contains many Windows-specific macros and definitions that are not used in MetricType.h. The file only needs `int64_t` for the `idx_t` typedef and potentially `size_t` for compatibility. This change reduces unnecessary dependencies and makes the header more focused on its actual requirements.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79398973

fbshipit-source-id: 3b8eca6708b297043eed616ead7530719a75041e
Summary:
Pull Request resolved: #4493

## Summary:
Removed three unused standard library includes from IndexBinary.h:
- `#include <sstream>` - Not used in the header file
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file

These includes were likely left over from previous implementations or refactoring. The header file only needs `<cstdint>` and `<cstdio>` for the basic types it uses. This change reduces compilation dependencies and improves build performance.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79399586

fbshipit-source-id: c5f1cbe8368ed61c65f1ea47320cc05bbd77311f
Summary:
Pull Request resolved: #4494

## Summary:
Removed two unused standard library includes from Index.h:
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file

Note: `#include <sstream>` was kept because downstream code depends on it transitively through Index.h. Index.h is a core header file that's included by many other files in the Faiss library, so this change reduces compilation dependencies while maintaining compatibility.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=d9aabefb-6e83-11f0-b070-4edc7931fd32&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79399876

fbshipit-source-id: 17ebf1e208e8c1d6e24881584a16629eb6387a1f
Summary:
Pull Request resolved: #4492

The C API has a bug where the size is not set in this method. The C++ API sets it correctly: https://www.internalfb.com/code/fbsource/[630a7e128132fa049cea4b73d9d9eb079e0608d6]/fbcode/assistant/knowledge/search/external/faiss/faiss_c_api.cpp?lines=164

Created from CodeHub with https://fburl.com/edit-in-codehub

Differential Revision: D79453787

fbshipit-source-id: 287916f83b35e501c3e293005d9cbea054a7475c
Summary:
Pull Request resolved: #4496

The nightly build broke because of the CUDA 11 build. We no longer need that build, since CUDA 11 is an old version.

Reviewed By: limqiying

Differential Revision: D79460010

fbshipit-source-id: 3ad4238096af27e389c54510e1730061813d636b
Summary:
Pull Request resolved: #4495

## Summary:
Removed three unused standard library includes from index_io.h:
- `#include <string>` - Not used in the header file
- `#include <typeinfo>` - Not used in the header file
- `#include <vector>` - Not used in the header file

The index_io.h header file contains I/O function declarations for reading/writing Faiss indexes. All function signatures use `const char*`, `FILE*`, and custom Faiss types, with no usage of std::string, std::vector, or typeid. The header only needs `<cstdio>` for FILE type support.

This change reduces compilation dependencies and improves build times by removing unnecessary standard library includes.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=c5bf659e-6f30-11f0-80ca-923e70174196&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=c5bf659e-6f30-11f0-80ca-923e70174196&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79400116

fbshipit-source-id: 7af9152b2ec7f1fc600b04600d00fec877ffedec
… index (#4477)

Summary:
## Description
It appears that this [line](https://github.com/facebookresearch/faiss/blob/main/faiss/gpu/GpuIndexCagra.cu#L540) allocates memory for byte flat vectors and subsequently stores the encoded byte vectors. Unless I’m mistaken, `train_dataset` seems to be a temporary structure that is discarded once index building completes.
If that’s the case, I think it would be possible to just write the encoded data directly into train_dataset. This might eliminate the need for an additional memory allocation.
```
// Sample code: encode directly in train_dataset instead of a second buffer.

// Shift each value by 128 in place so the stored bytes are the encoded codes.
for (int64_t i = 0; i < n_train * index->d; ++i) {
    train_dataset[i] = static_cast<int8_t>(
            static_cast<uint8_t>(train_dataset[i]) + 128);
}

// Pass the encoded byte vectors to storage.
index->storage->add_sa_codes(
        n_train, reinterpret_cast<uint8_t*>(train_dataset), nullptr);
```
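
The effect of the in-place shift can be checked in isolation. The following is a minimal sketch, assuming a plain `std::vector<int8_t>` as a stand-in for `train_dataset`; the helper names `shift_int8_to_uint8_in_place` and `as_code` are hypothetical and not part of Faiss.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Shift each signed byte by +128 in place, so that the raw bytes, read as
// uint8_t, hold values in [0, 255]. The vector is an assumed stand-in for
// train_dataset; only the arithmetic mirrors the snippet above.
inline void shift_int8_to_uint8_in_place(std::vector<int8_t>& data) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        // The sum is computed as int and wraps modulo 256 when narrowed.
        const uint8_t shifted = static_cast<uint8_t>(
                static_cast<uint8_t>(data[i]) + 128);
        data[i] = static_cast<int8_t>(shifted);
    }
}

// Read element i back as the unsigned code that would be handed to storage.
inline uint8_t as_code(const std::vector<int8_t>& data, std::size_t i) {
    return static_cast<uint8_t>(data[i]);
}
```

With this mapping the smallest signed value becomes code 0 and the largest becomes code 255, and no second buffer is needed.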

This optimization could have a significant impact. For a dataset with 10 million vectors and 768 dimensions, the flat vectors take 10,000,000 × 768 bytes, or approximately 7.15 GB. With the current implementation, this effectively doubles to 14.3 GB due to the additional allocation. (Including the memory for storage, this goes up further to 21.45 GB.)
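
The figures above can be reproduced with a short calculation. This is a sketch assuming one byte per component and binary gigabytes (2^30 bytes); the helper names `flat_bytes` and `to_gib` are hypothetical.

```cpp
#include <cstdint>

// Bytes needed for n flat vectors of d one-byte components.
constexpr std::uint64_t flat_bytes(std::uint64_t n, std::uint64_t d) {
    return n * d;
}

// Convert a byte count to binary gigabytes (2^30 bytes each).
constexpr double to_gib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

For example, `to_gib(flat_bytes(10000000, 768))` gives the size of one copy; multiplying by two or three gives the totals with the extra allocation and with storage included.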

Pull Request resolved: #4477

Reviewed By: limqiying

Differential Revision: D79493876

Pulled By: mnorris11

fbshipit-source-id: dbee9e55b8a63a9269ff05877f37f53cef36491d
Summary: Pull Request resolved: #4437

Reviewed By: trang-nm-nguyen, bshethmeta

Differential Revision: D78683814

Pulled By: pankajsingh88

fbshipit-source-id: c51beee5d5dab854f3b9c326d7bb9327c34cb277
Summary:
Pull Request resolved: #4504

## Summary:
Removed unused header `#include <faiss/IndexBinaryFlat.h>` from AutoTune.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79580489

fbshipit-source-id: 4529112e70869928b658172d8411159eea430de4
Summary:
Pull Request resolved: #4505

## Summary:
Removed unused header `#include <faiss/utils/WorkerThread.h>` from IndexIDMap.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79581336

fbshipit-source-id: 685828e160394716e4201e86cca0480f5ef8a15d
Summary:
Pull Request resolved: #4506

## Summary:
Removed unused header `#include <memory>` from IndexBinaryHash.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79581526

fbshipit-source-id: 4c2327432c6716bff5d3cd114fd3080c6bfb7cd3
Summary:
Pull Request resolved: #4507

## Summary:
Removed unused header `#include <algorithm>` from IndexBinaryHNSW.cpp. This header was included but not actually used in the code, making it an unnecessary dependency.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=e79a132f-7163-11f0-bb3e-ae1c19f8ff68&tab=Trace)

Reviewed By: limqiying

Differential Revision: D79582071

fbshipit-source-id: 01d02846d84de9a3ca16955ba6603bf04f43e375
…as been renamed to numpy._core._multiarray_umath (#4501)

Summary:
Pull Request resolved: #4501

Fix the warning that numpy.core._multiarray_umath is deprecated and has been renamed to numpy._core._multiarray_umath

Issue: #4491

Reviewed By: mnorris11

Differential Revision: D79535368

fbshipit-source-id: 0eccf13072d373bec1b3052c689bc85ffd8d4acc
Summary:
Pull Request resolved: #4513

## Summary:

This diff removes unused headers from two faiss files:

**AutoTune.cpp:**
- Removed unused `#include <cmath>` header
- Analysis confirmed no mathematical functions (sqrt, pow, ceil, etc.) are used in the file

**clone_index.cpp:**
- Removed unused `#include <cstdio>` header (no printf/fprintf calls)
- Removed unused `#include <cstdlib>` header (no abort/exit/malloc calls)
- Kept `#include <faiss/impl/FaissAssert.h>` which is used extensively

These changes reduce compilation overhead and dependencies without affecting functionality.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Trace)

Reviewed By: trang-nm-nguyen

Differential Revision: D79785715

fbshipit-source-id: 514750d6637cbe1aebb1b5a9985a2494b6143e9a
Summary:
Pull Request resolved: #4514

## Summary:

This diff removes two unused headers from `fbcode/faiss/Index2Layer.cpp`:

- **Removed `#include <cassert>`**: The file uses `FAISS_THROW_*` macros instead of `assert()` calls
- **Removed `#include <cmath>`**: No mathematical functions (sqrt, pow, ceil, floor, etc.) are used in the implementation

**Headers kept** (confirmed usage):
- `cinttypes`: Used for `PRId64` macro in printf statements
- `cstdint`: Used for `int64_t`, `uint8_t` types
- `cstdio`: Used for `printf` calls
- `immintrin.h`: Used for SSE intrinsics (`__m128`, `_mm_*` functions)
- `algorithm`: Used for `std::min`

This change reduces compilation overhead and dependencies without affecting functionality.
 ---
> Generated by [RACER](https://www.internalfb.com/wiki/RACER_(Risk-Aware_Code_Editing_and_Refactoring)/), powered by [Confucius](https://www.internalfb.com/wiki/Confucius/Analect/Shared_Analects/Confucius_Code_Assist_(CCA)/)
[Session](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Chat), [Trace](https://www.internalfb.com/confucius?session_id=42df7f2e-7356-11f0-9ffa-ae7e0719de25&tab=Trace)

Reviewed By: trang-nm-nguyen

Differential Revision: D79785857

fbshipit-source-id: e91928f140ab0e9953980ee36e5b4dc46ca5742f
generatedunixname1789769605024584 and others added 30 commits February 23, 2026 17:18
Summary: Pull Request resolved: #4833

Reviewed By: junjieqi

Differential Revision: D94054959

fbshipit-source-id: 6b12db53ecc2cf5dcf5409ae89837b7372ec4ed4
Summary: Pull Request resolved: #4832

Reviewed By: junjieqi

Differential Revision: D94053800

fbshipit-source-id: 4e2bcfda62cf5297d81d1467577c7dc970e4fd57
Summary:
Pull Request resolved: #4823

Casting an int to an undefined enum value is undefined behavior
in C++. When deserializing indexes, explicitly validate MetricType
and throw an exception if it is not a defined value.
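
A minimal sketch of the validate-then-cast pattern described here, assuming a small illustrative enum. `DemoMetric` and `demo_metric_from_int` are hypothetical names and do not reproduce the Faiss `MetricType` definition or the actual check.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical metric enum for illustration only; not the Faiss definition.
enum DemoMetric { DEMO_METRIC_IP = 0, DEMO_METRIC_L2 = 1, DEMO_METRIC_L1 = 2 };

// Validate the raw integer first and convert it to the enum only afterwards,
// so no out-of-range value is ever cast.
inline DemoMetric demo_metric_from_int(int raw) {
    switch (raw) {
        case DEMO_METRIC_IP:
        case DEMO_METRIC_L2:
        case DEMO_METRIC_L1:
            return static_cast<DemoMetric>(raw);
        default:
            throw std::invalid_argument(
                    "unknown metric type: " + std::to_string(raw));
    }
}
```

The switch operates on the `int`, so a corrupted file produces an exception rather than an enum object holding an undefined value.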

Reviewed By: limqiying

Differential Revision: D93898294

fbshipit-source-id: e6ddc0128eaa07815ee898253bf0f437ae8d3f8f
Summary:
Pull Request resolved: #4808

Convert PQ code distance from compile-time #ifdef __AVX2__ header dispatch to DD-compatible runtime dispatch using PQCodeDistance<PQDecoderT, SIMDLevel> struct templates with per-file SIMD compilation.

Key changes:
- Replace free functions (distance_single_code, distance_four_codes) with PQCodeDistance<PQDecoderT, SIMDLevel> struct template
- Rename impl/code_distance/ to impl/pq_code_distance/ with pq_ prefix on all filenames
- Move public header to faiss/utils/pq_code_distance.h; keep private -inl.h in faiss/impl/pq_code_distance/
- Add enclosing namespace pq_code_distance inside namespace faiss, with using re-exports in the public header for backward compatibility
- Rename .h implementation files to .cpp for per-file SIMD flag compilation
- Add forward declarations of specializations in pq_code_distance-inl.h to prevent callers from silently using the generic primary template
- Use COMPILE_SIMD_* guards only (no redundant || __AVX2__ since build system always defines COMPILE_SIMD_* alongside compiler flags)
- Add pq_code_distance-generic.cpp with DISPATCH_SIMDLevel wrappers for runtime dispatch
- Update IndexIVFPQ.cpp and IndexPQ.cpp to use DISPATCH_SIMDLevel at scanner/distance-computer construction boundary
- Update xplat.bzl SIMD_FILES and CMakeLists.txt SIMD registries
- Update unicorn callers (PQFSTable.h, FeatureEncoding.cpp, FeatureEncodingBenchmark.cpp) to new include path
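
The general shape of a struct template keyed on a SIMD level, with one runtime switch at the boundary, can be sketched as follows. All names here (`DemoSIMDLevel`, `DemoDecoder8`, `DemoPQCodeDistance`, `demo_distance`) are hypothetical stand-ins, the "AVX2" variant simply reuses the scalar loop, and the primary template is left undefined as a simplification of the forward-declaration approach described above. It is not the Faiss implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the names echo the commit message, the bodies do not.
enum class DemoSIMDLevel { NONE, AVX2 };

// Trivial decoder: one byte per sub-quantizer.
struct DemoDecoder8 {
    static std::uint8_t decode(const std::uint8_t* code, std::size_t m) {
        return code[m];
    }
};

// Primary template is declared only, so an unsupported level fails to
// compile instead of silently using a generic version.
template <class Decoder, DemoSIMDLevel SL>
struct DemoPQCodeDistance;

// Scalar specialization: sum one table entry per sub-quantizer.
template <class Decoder>
struct DemoPQCodeDistance<Decoder, DemoSIMDLevel::NONE> {
    static float distance_single_code(
            std::size_t M,
            std::size_t ksub,
            const float* sim_table,
            const std::uint8_t* code) {
        float acc = 0;
        for (std::size_t m = 0; m < M; ++m) {
            acc += sim_table[m * ksub + Decoder::decode(code, m)];
        }
        return acc;
    }
};

// Placeholder "AVX2" specialization: reuses the scalar loop in this sketch.
// A real variant would use intrinsics in a file built with AVX2 flags.
template <class Decoder>
struct DemoPQCodeDistance<Decoder, DemoSIMDLevel::AVX2>
        : DemoPQCodeDistance<Decoder, DemoSIMDLevel::NONE> {};

// Runtime dispatch: a single switch at the boundary picks the specialization.
inline float demo_distance(
        DemoSIMDLevel level,
        std::size_t M,
        std::size_t ksub,
        const float* sim_table,
        const std::uint8_t* code) {
    switch (level) {
        case DemoSIMDLevel::AVX2:
            return DemoPQCodeDistance<DemoDecoder8, DemoSIMDLevel::AVX2>::
                    distance_single_code(M, ksub, sim_table, code);
        case DemoSIMDLevel::NONE:
        default:
            return DemoPQCodeDistance<DemoDecoder8, DemoSIMDLevel::NONE>::
                    distance_single_code(M, ksub, sim_table, code);
    }
}
```

Keeping the switch at the construction boundary means the inner loops are still compiled for a fixed level and pay no per-call dispatch cost.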

Based on D72937709, reworked to match current master and DD design doc.

Reviewed By: mdouze

Differential Revision: D93217005

fbshipit-source-id: 8a1b0cce2757e9e96eb2f2e2fb28b71dcdf71557
)

Summary:
Pull Request resolved: #4816

RaBitQ FastScan indexes previously stored per-vector auxiliary data (SignBitFactors, extra-bit codes) in a separate `flat_storage` vector alongside the SIMD-packed codes. This dual-storage design had several drawbacks: it required global vector IDs for auxiliary data lookup during search (problematic for IVF where vectors are referenced by list-local offsets), it complicated serialization with a separate data stream, and it broke data locality between packed codes and their associated factors.

This change embeds auxiliary data directly into the SIMD block layout by appending an auxiliary region after each block's packed PQ4 codes. Key architectural changes:

- Introduce `get_block_stride()` virtual method on IndexFastScan/IndexIVFFastScan to let subclasses declare enlarged block sizes, threaded through `pq4_accumulate_loop` and `pq4_accumulate_loop_qbs` so the SIMD accumulation loops correctly skip auxiliary regions.
- Introduce `CodePackerRaBitQ` that handles pack_1/unpack_1 of both PQ4 codes and auxiliary data in a single operation, enabling correct behavior for `remove_ids`, `merge_from`, and `reconstruct`.
- Add `postprocess_packed_codes()` hook on IndexIVFFastScan called after `pq4_pack_codes_range` during `add_with_ids` to write auxiliary metadata into blocks.
- RaBitQ handlers read auxiliary factors directly from block regions via `get_block_aux_ptr()`, improving cache locality and enabling correct IVF list-local access.
- Make `get_CodePacker()` virtual on IndexFastScan so subclasses return the appropriate packer.
- Fix BlockInvertedLists::remove_ids race condition where return value was always 0: capture `orig_size` before `resize()` and use OpenMP `reduction(+:nremove)` for thread-safe accumulation.
- Remove duplicate `#include` statements in IndexFastScan.cpp.
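
The idea of an enlarged block stride with a trailing auxiliary region can be illustrated with plain pointer arithmetic. This is a sketch under assumed sizes; `DemoBlockLayout` and its members are hypothetical and only loosely mirror the `get_block_stride()` and `get_block_aux_ptr()` names mentioned above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: each block holds `packed_bytes` of packed codes
// followed immediately by `aux_bytes` of auxiliary data.
struct DemoBlockLayout {
    std::size_t packed_bytes;
    std::size_t aux_bytes;

    // Distance in bytes from the start of one block to the next.
    std::size_t block_stride() const {
        return packed_bytes + aux_bytes;
    }

    // Start of block b; a scan loop advances by block_stride(), not by
    // packed_bytes, so it skips over the auxiliary region.
    const std::uint8_t* block_ptr(
            const std::uint8_t* base,
            std::size_t b) const {
        return base + b * block_stride();
    }

    // Start of the auxiliary region of block b.
    const std::uint8_t* block_aux_ptr(
            const std::uint8_t* base,
            std::size_t b) const {
        return block_ptr(base, b) + packed_bytes;
    }
};
```

Because the auxiliary bytes live inside the block, they can be located from a list-local block index alone, without a global vector ID.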

Reviewed By: mdouze

Differential Revision: D93538118

fbshipit-source-id: 85bd017fdee5f34fde66656a7fc17ae6da758de5
Summary:
Pull Request resolved: #4840

Faiss's SWIG-generated Python bindings lack static type information, which means type checkers (mypy, pyright, Pyre) cannot analyze faiss code and IDEs cannot provide autocompletion. This has been a community request.

This adds a comprehensive `__init__.pyi` stub file covering the full public API — all Index classes, quantizers, vector transforms, I/O functions, GPU classes, clustering, and utility functions — with `overload` signatures for both `numpy.NDArray` and `torch.Tensor` inputs. A `py.typed` PEP 561 marker is included so type checkers automatically discover the stubs.

`setup.py` is updated to copy the stub files into the staging directory during build and include `*.pyi` and `py.typed` in `package_data` so they ship in the wheel.

Reviewed By: limqiying

Differential Revision: D94139717

fbshipit-source-id: 4449f2275d716bf2c8e52e3e97c28ff6ad26a885
…4841)

Summary:
Pull Request resolved: #4841

D93538118 changed the RaBitQ FastScan serialization format by embedding auxiliary data directly into SIMD blocks (fourcc "Irfs"->"Irfn" for non-IVF, "Iwrf"->"Iwrn" for IVF). The conda faiss-cpu=1.13.2 reader used in the GitHub Actions backward compatibility test does not understand the new format, causing the "CMake Write -> Conda Read" test to fail with malformed fourcc errors.

Temporarily comment out all 8 RaBitQ FastScan index types from the test until a new conda release includes the new format, at which point they should be re-enabled and the conda version bumped (following the pattern from D89523639).

Reviewed By: junjieqi

Differential Revision: D94432870

fbshipit-source-id: d7d8129b4a9a0f0ff163c4a56e8f272bba2f970b
Summary:
Pull Request resolved: #4836

Replace the three `get_InvertedListScanner1/2/3` free function templates
with nested templatized lambdas inside `IndexIVFPQ::get_InvertedListScanner`,
using `with_simd_level` for SIMD dispatch.

This eliminates the repetition of `(index, store_pairs, sel)` parameters
through three levels of function calls, and removes the unreachable
`return nullptr` in the old metric_type dispatch.

Follow-up to D93217005 per reviewer feedback.

Reviewed By: mdouze

Differential Revision: D94351163

fbshipit-source-id: 54223a156ee361aa9bfb829b8ed355bf4a09f686
Summary:
Pull Request resolved: #4838

Pure type-system refactor — no behavioral change.

Convert the ScalarQuantizer template parameter from `int SIMDWIDTH`
(1/8/16) to `SIMDLevel SL` (NONE/AVX2/AVX512/ARM_NEON) across
quantizers.h, similarities.h, distance_computers.h, and
ScalarQuantizer.cpp.

Key changes:
- Primary templates: `int SIMD` → `SIMDLevel SL`
- Specializations: `<1>` → `<SIMDLevel::NONE>`, `<8>` → AVX2/ARM_NEON,
  `<16>` → AVX512
- Add `static constexpr SIMDLevel simd_level` to Similarity structs
- Split combined `USE_F16C || USE_NEON` guards into separate blocks
  for distinct SIMDLevel values (AVX2 vs ARM_NEON)
- Add `simd_width<SL>()` constexpr helper for future use
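
A possible shape for such a helper, following the width mapping listed above (1 for NONE, 8 for AVX2 and ARM_NEON, 16 for AVX512). The enum `SketchLevel` and the function `sketch_simd_width` are hypothetical names, not the Faiss declarations.

```cpp
// Hypothetical level enum mirroring the values named in the commit message.
enum class SketchLevel { NONE, AVX2, AVX512, ARM_NEON };

// Map a compile-time level to the number of floats processed per step.
template <SketchLevel SL>
constexpr int sketch_simd_width() {
    return SL == SketchLevel::AVX512
            ? 16
            : (SL == SketchLevel::AVX2 || SL == SketchLevel::ARM_NEON) ? 8
                                                                       : 1;
}

// The mapping is usable in constant expressions.
static_assert(sketch_simd_width<SketchLevel::NONE>() == 1, "scalar width");
static_assert(sketch_simd_width<SketchLevel::AVX512>() == 16, "AVX512 width");
```

Keying templates on the level rather than the width lets AVX2 and ARM_NEON share a width of 8 while still selecting different specializations.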

Prepares for per-SIMD .cpp file splitting in the next diff.

Reviewed By: mdouze

Differential Revision: D94375444

fbshipit-source-id: 4f0d3629d796d80e26e953e6b24054186ad6e753
Summary:
Pull Request resolved: #4837

Continue application of std::unique_ptr<> to ensure memory allocated during
index deserialization is freed during exception handling.

Reviewed By: mdouze

Differential Revision: D94242748

fbshipit-source-id: 5238aaa001fe89c91111a87cef5a05724197a313
Summary:
Pull Request resolved: #4827

In validate_HNSW(), bounds check access to cum_nneighbor_per_level for each
level of the deserialized HNSW.

Reviewed By: junjieqi

Differential Revision: D93903637

fbshipit-source-id: ff2a161e5f05890527f7bacc91a8a8e47e80776f
Summary:
Pull Request resolved: #4844

Fix 1: IndexLattice r2 and dsq validation
  - Tightens the r2 check from r2 >= 0 to r2 > 0. r2 must be greater
    than zero to avoid a divide by zero during normalization.
  - Adds a new check that 'dsq = d/nsq' is a power of 2 and >= 2. This
    aligns with asserts in ZnSphereCodecRec and prevents an invalid
    'cache_level' value of -1.

Fix 2: Binary hash invlists buffer validation
  - Changes from pre-allocating the buffer with a computed size (then
    overwriting via READVECTOR) to reading the buffer first and then
    checking it is large enough.  Since READVECTOR determines its size
    from the serialized data stream, the previous logic didn't actually
    verify that the buffer was fully filled with valid data.
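
The Fix 1 checks could be sketched roughly as follows. The function names are hypothetical, the divisibility check on `d` and `nsq` is an added assumption, and the actual diff presumably reports failures through the library's own error macros rather than a boolean.

```cpp
// Hypothetical helper: true when v is a power of two and at least 2.
constexpr bool is_pow2_at_least_2(int v) {
    return v >= 2 && (v & (v - 1)) == 0;
}

// Hypothetical validation of deserialized lattice parameters.
// d: vector dimension, nsq: number of sub-vectors, r2: squared radius.
// Assumption (not stated in the summary): d must divide evenly by nsq.
constexpr bool lattice_params_ok(int d, int nsq, int r2) {
    return r2 > 0                     // r2 == 0 would divide by zero later
            && nsq > 0 && d % nsq == 0
            && is_pow2_at_least_2(d / nsq); // dsq = d / nsq
}
```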

Reviewed By: mdouze

Differential Revision: D94558548

fbshipit-source-id: b061faf5a44615346a12b1d3a6d52e596db66ba9
…on (#4851)

Summary:
Pull Request resolved: #4851

Roundtrip tests only test serialize -> deserialize. They should also test the deserialize -> serialize sequence.

Reviewed By: mdouze

Differential Revision: D93678044

fbshipit-source-id: 1503b5b954d49df509d9f785d35911ade989290e
Summary:
Pull Request resolved: #4852

Prior one: D79863512
Wiki: https://www.internalfb.com/wiki/Vector_Search/Onboarding/Faiss_Contributor/Release/Release_Process/

Reviewed By: subhadeepkaran

Differential Revision: D94947889

fbshipit-source-id: 94950b1704bef471f86420f7ebefbafdf5536cd7
…ay1(-1) (#4846)

Summary:
## Problem

Building faiss with SWIG 4.4.0+ fails to compile the generated wrapper:

```
In function 'int SWIG_mod_exec(PyObject*)':
error: cannot convert 'std::nullptr_t' to 'int' in return
  import_array();
```

SWIG 4.4.0 introduced PEP 489 multi-phase module initialization, where `SWIG_mod_exec` returns `int` instead of `PyObject*`. NumPy's `import_array()` macro expands to `return NULL`, which is a type mismatch.

Fixes #4845.

## Why not just use `import_array1(-1)`?

`import_array1(-1)` expands to `return -1`. This fixes SWIG 4.4+ but breaks older SWIG versions, which generate `PyInit_()` returning `PyObject*` — and the upstream CI (SWIG 4.0.2) confirmed exactly this failure:

```
In function 'PyObject* PyInit__swigfaiss()':
error: invalid conversion from 'int' to 'PyObject*'
  import_array1(-1);
```

## Fix

Isolate the NumPy import into a helper function that always returns `int`, then return the correct error value from `%init` using a `SWIG_VERSION` preprocessor check:

```diff
+%{
+static int _faiss_init_numpy() {
+    import_array1(-1);
+    return 0;
+}
+%}
+
 %init %{
-    import_array();
+    if (_faiss_init_numpy() < 0) {
+#if SWIG_VERSION >= 0x040400
+        return -1;
+#else
+        return NULL;
+#endif
+    }
     PythonInterruptCallback::reset();
 %}
```

- `_faiss_init_numpy()` always returns `int`, so `import_array1(-1)` inside it compiles cleanly under any SWIG version
- The `%init` block then returns the correct type for its enclosing function: `-1` (int) for SWIG 4.4+ multi-phase init, `NULL` (PyObject*) for older single-phase init

Pull Request resolved: #4846

Reviewed By: junjieqi

Differential Revision: D94921302

Pulled By: mnorris11

fbshipit-source-id: 8d4fab61ffce76a10ee7b3f00bab87471b48e02a
Summary:
Pull Request resolved: #4856

Using a Hadamard transformation is very similar to using a random rotation matrix, but we can calculate it in `d log d` time instead of `d ^ 2`. I initially started exploring this using a fast Fourier transform, which does outperform a random rotation, but a Hadamard transformation has the same computational complexity as an FFT and in practice is much easier to implement. Also, because it is an orthonormal transformation, the recall is identical to that of a random rotation matrix.
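
For reference, a minimal sketch of an in-place fast Walsh-Hadamard transform is shown below. The function name is hypothetical and this is not the code from the diff; it assumes the dimension is a power of two and covers only the transform itself, not any padding or sign randomization the index may apply around it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place fast Walsh-Hadamard transform, scaled by 1/sqrt(d) so that the
// transform is orthonormal. Runs in O(d log d) butterfly steps.
// Assumption: x.size() is a power of two.
inline void fwht_orthonormal(std::vector<float>& x) {
    const std::size_t d = x.size();
    assert(d > 0 && (d & (d - 1)) == 0);
    for (std::size_t h = 1; h < d; h *= 2) {
        for (std::size_t i = 0; i < d; i += 2 * h) {
            for (std::size_t j = i; j < i + h; j++) {
                const float a = x[j];
                const float b = x[j + h];
                x[j] = a + b;     // butterfly: sum
                x[j + h] = a - b; // butterfly: difference
            }
        }
    }
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));
    for (float& v : x) {
        v *= scale;
    }
}
```

Because the scaled transform is orthonormal and symmetric, applying it twice returns the input, which makes a convenient sanity check.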

I added a benchmark to compare both.

Reviewed By: mdouze

Differential Revision: D94658424

fbshipit-source-id: bb2f9dc0ab5a41ce58874a1e72a3e85c22e58ea2
…spatch (#4839)

Summary:
Pull Request resolved: #4839

Split the SIMD-gated template specializations out of ScalarQuantizer.cpp
and the shared headers into per-SIMD compilation units and wire up the
Dynamic Dispatch (DD) infrastructure (COMPILE_SIMD_*, with_simd_level).

**What moved where**

- SIMD specializations removed from `codecs.h`, `quantizers.h`,
  `similarities.h`, `distance_computers.h` — these now contain only
  primary templates and scalar (`SIMDLevel::NONE`) specializations.
  (Most use empty primary templates; `quantizers.h` uses an inheriting
  fallback pattern for `QuantizerFP16`, `QuantizerBF16`, etc.)
- SIMD specializations moved into `sq-avx2.cpp` / `sq-avx512.cpp` /
  `sq-neon.cpp`, each guarded by `COMPILE_SIMD_*`.
- `sq-generic.cpp` deleted — the `NONE` level is now instantiated
  directly in `ScalarQuantizer.cpp` via `sq-dispatch.h`.
- `sq-inl.h` renamed to `scanners.h`.

**Dispatch mechanism**

- `sq-dispatch.h` is an X-macro-style header: each per-SIMD `.cpp` file
  `#define`s `THE_LEVEL_TO_DISPATCH` and `#include`s it to stamp out
  explicit template specializations of the selection functions
  (`sq_select_quantizer`, `sq_select_distance_computer`,
  `sq_select_InvertedListScanner`).
- `ScalarQuantizer.cpp` uses `with_simd_level` for runtime dispatch
  and instantiates the `NONE` level via the same `sq-dispatch.h`.
- Each per-SIMD selection function returns `nullptr` when the dimension
  doesn't align, and the caller falls back to `NONE`.
- `sq-neon.cpp` handles both `ARM_NEON` and `ARM_SVE` (SVE forwards
  to NEON — no dedicated SVE SQ implementation yet).

**Build**

- `xplat.bzl`, `CMakeLists.txt` — register new SIMD source files and
  headers.
- Within the SQ module, `COMPILE_SIMD_*` macros gate all SIMD code
  paths. (Compiler-defined macros like `__AVX2__` are still used in
  lower-level shared headers like `simdlib.h` and `fp16.h`.)

Reviewed By: mdouze

Differential Revision: D94375408

fbshipit-source-id: a07c31540242defcc605dd74e07bd25b8c163f43
Summary:
Primarily for the sake of completeness, we could expose an IndexBinaryFlat to the C API.

Pull Request resolved: #4834

Reviewed By: junjieqi

Differential Revision: D94736003

Pulled By: gtwang01

fbshipit-source-id: 563a0b2b6684bead170ace6df08cc8ddd29c77d5
Summary:
Pull Request resolved: #4850

The multi-bit RaBitQ distance computation (`compute_full_multibit_distance`) previously extracted each code value bit-by-bit using `extract_code_inline`, which iterated `ex_bits` times per dimension — O(d × ex_bits) total with a data-dependent branch per bit.

This diff replaces it with two complementary optimizations:

**1. Improved scalar extraction (all platforms):**
Replaces the per-bit extraction loop with a 64-bit window read (`memcpy` + shift + mask) that extracts each code value in O(1) regardless of `ex_bits`. This alone gives 25–142% QPS improvement (higher gains at more bits).

**2. SIMD bit-plane decomposition (AVX2 + BMI2):**
Instead of extracting per-element multi-bit codes, decomposes the inner product into `(1 + ex_bits)` bit-plane dot products. Each plane is a float × bit-vector dot product computed via bit→mask→float conversion. For `ex_bits == 1`, both sign and ex are 1-bit packed, enabling zero-extraction kernels (AVX-512 and AVX2). For `ex_bits` 2–7, BMI2 PEXT extracts each bit plane in one instruction per 8 dimensions.

Also adds `-mbmi2` to the AVX2 compiler flags in `xplat.bzl`.

Recall@10 is identical across all nb_bits before and after.
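
The scalar window read in optimization 1 can be sketched as below, next to a per-bit reference for comparison. Function names, the `nbits` parameter, and the LSB-first packing are assumptions made for illustration, and the sketch assumes a little-endian host; the real `extract_code_inline` replacement may be laid out differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reference: assemble the i-th nbits-wide value one bit at a time.
inline std::uint32_t extract_code_bitwise(
        const std::uint8_t* codes, std::size_t i, int nbits) {
    std::uint32_t v = 0;
    const std::size_t bit0 = i * nbits;
    for (int b = 0; b < nbits; b++) {
        const std::size_t pos = bit0 + b;
        if ((codes[pos >> 3] >> (pos & 7)) & 1) {
            v |= std::uint32_t(1) << b;
        }
    }
    return v;
}

// Window read: copy up to 8 bytes starting at the value's first byte,
// then shift and mask. Cost does not depend on nbits.
// Assumptions: little-endian host, 1 <= nbits <= 8, value lies in the buffer.
inline std::uint32_t extract_code_window(
        const std::uint8_t* codes, std::size_t code_size,
        std::size_t i, int nbits) {
    assert(nbits >= 1 && nbits <= 8);
    const std::size_t bit0 = i * nbits;
    const std::size_t byte0 = bit0 >> 3;
    assert(byte0 < code_size);
    std::uint64_t w = 0;
    // Bounded copy so the read never runs past the end of the buffer.
    std::memcpy(&w, codes + byte0,
                std::min<std::size_t>(sizeof(w), code_size - byte0));
    const std::uint64_t mask = (std::uint64_t(1) << nbits) - 1;
    return static_cast<std::uint32_t>((w >> (bit0 & 7)) & mask);
}
```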

Reviewed By: junjieqi, mdouze

Differential Revision: D94587233

fbshipit-source-id: fdcbbeb1b956e98fc05dbe6dc6cba365fb553750
Summary:
Pull Request resolved: #4859

Needed because the svs min version is 3.14, which is blocking D95003637, which in turn is blocking Alibek from uncommenting the backward compat test.

Reviewed By: subhadeepkaran

Differential Revision: D95087382

fbshipit-source-id: f06c87dd896d2dcbec77e935cf8fb6d321a6a0ab
Summary:
The last build pushed to conda-forge was 1.9 in December 2024. Remove the conda-forge instructions from INSTALL.md for now so that they are not misleading.

Pull Request resolved: #4843

Reviewed By: mnorris11

Differential Revision: D94739268

Pulled By: gtwang01

fbshipit-source-id: 7740161481f7f0fb3d099ec4bd54f964b009aadf
Summary:
This PR will add LeanVec OOD (out-of-distribution) support.

Pull Request resolved: #4773

Reviewed By: alibeklfc

Differential Revision: D94607022

Pulled By: mnorris11

fbshipit-source-id: df1b023919b6a2753fdddcd7d5490475f8630d36
Summary:
A new version is needed to include the Python 3.14 change so that we can fix the backward compatibility tests.

- Increment version to 1.14.1
- Update CHANGELOG.md with 7 commits since v1.14.0
- Bump version in CMakeLists.txt, setup.py, Index.h, INSTALL.md
- Fix 6 missing CHANGELOG comparison links (1.12.0–1.14.0)

Pull Request resolved: #4861

Test Plan:
- [ ] CI passes
- [ ] Backward compatibility test uses correct version
- [ ] CHANGELOG entries are correctly categorized

Reviewed By: junjieqi

Differential Revision: D95254574

Pulled By: mnorris11

fbshipit-source-id: c6650fa1783b888c4000e4700da9c8bf01150176
…tQ (#4877)

Summary:
Pull Request resolved: #4877

Refactor the multi-bit RaBitQ inner product computation from the
squared-distance-then-convert approach to a direct dot-product formulation.

Before: IP = -0.5 * (||q-c||² + (||r||² - ||x||²) + (-2·||r||/ipnorm)·ex_ip - ||q||²)
After:  IP = <q,c> + <c,r> + (||r||/ipnorm)·ex_ip

Both are mathematically equivalent to <q, x>. The new form is simpler and
makes the code structurally immune to the D95166460 bug class where
the degenerate case (x ≈ centroid) required complex metric-specific
branching that was easy to get wrong.
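
The equivalence can be checked numerically with a small sketch. It assumes, for illustration only, that the quantized term is exact, i.e. that `(||r||/ipnorm)·ex_ip` equals `<q - c, r>` with `r = x - c`; real codes only approximate this. All names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

inline double dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); i++) {
        s += a[i] * b[i];
    }
    return s;
}

inline Vec sub(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    Vec out(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        out[i] = a[i] - b[i];
    }
    return out;
}

struct IpForms {
    double before; // squared-distance-then-convert form
    double after;  // direct dot-product form
    double exact;  // <q, x>
};

// q: query, c: centroid, x: database vector.
inline IpForms compare_ip_forms(const Vec& q, const Vec& c, const Vec& x) {
    const Vec r = sub(x, c);
    const Vec qc = sub(q, c);
    // Stands in for (||r|| / ipnorm) * ex_ip under the exactness assumption.
    const double corr = dot(qc, r);
    IpForms f;
    f.before = -0.5 *
            (dot(qc, qc) + (dot(r, r) - dot(x, x)) - 2.0 * corr - dot(q, q));
    f.after = dot(q, c) + dot(c, r) + corr;
    f.exact = dot(q, x);
    return f;
}
```

With `q = (1, 2)`, `c = (0.5, -1)` and `x = (3, 4)`, both forms evaluate to `<q, x> = 11`.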

Changes:
- Per-document factors (compute_ex_factors): IP branch now computes
  f_add_ex = <c, r> and f_rescale_ex = ||r||/ipnorm (positive, no -2 factor)
- Degenerate case simplified to metric-agnostic f_add_ex=0, f_rescale_ex=0
- Core distance function (compute_full_multibit_distance): accepts a single
  qr_base parameter instead of qr_to_c_L2sqr + qr_norm_L2sqr, eliminates
  the -0.5*(dist - qr_norm_L2sqr) IP post-processing
- Added q_dot_c field to QueryFactorsData, computed at all query-setup sites
- L2 path, 1-bit path, and SIMD kernels are completely unchanged
- All four index types covered: IndexRaBitQ, IndexIVFRaBitQ,
  IndexRaBitQFastScan, IndexIVFRaBitQFastScan

Breaking change: serialized multi-bit IP indexes must be re-encoded.
L2 indexes are unaffected.

Reviewed By: ddrcoder, latham-meta

Differential Revision: D95419974

fbshipit-source-id: 3a5a33b5a3065d172a2f57578f3a78bdf562b87c
Summary:
Installs [SVS v0.2.0](https://github.com/intel/ScalableVectorSearch/releases/tag/v0.2.0) which includes updates for LeanVec OOD and IVF to be integrated in subsequent PRs. Also removes intel conda channel logic as libsvs-runtime is now available on conda-forge.

Pull Request resolved: #4860

Reviewed By: alibeklfc

Differential Revision: D95164622

Pulled By: mnorris11

fbshipit-source-id: 3b2bbaa8c5283e6aa5ac5ee250e8eac4ccba0e5a
Summary:
Pull Request resolved: #4878

Deals with #4865

Reviewed By: alibeklfc

Differential Revision: D95475682

fbshipit-source-id: 386e043d0db3130eb58f9b398516271f921cdac2
Summary: Pull Request resolved: #4876

Reviewed By: alibeklfc

Differential Revision: D95420200

fbshipit-source-id: b30d0c9c0342460187c3b8dcffbbc3df88f4a11d
Summary: Pull Request resolved: #4855

Reviewed By: zoeyeye

Differential Revision: D95003637

fbshipit-source-id: 8def3429aa3890960c5b5b82431ad2878bac5c85
Summary:
Pull Request resolved: #4864

FAISS index data structures use C compatible types and thus cannot
embed RAII types like std::unique_ptr to trivially ensure exception
safety. For this reason, allocations within constructors must be performed
within the body of constructors where RAII types and other exception
handling tools are available.

This diff makes a pass through the index types to clean up the
"own_fields" allocation pattern. "own_fields" and the pointer it guards
are now both initialized to safe (false/nullptr) values at the definition
site and only updated within the constructor body when further exceptions
in the constructor are no longer possible. This also fixes a few cases
where these fields had indeterminate state or actually wrong state when
an exception is thrown.

There are a few cases that perform the strange looking operation
'std::make_unique().release()'. This is functionally equivalent to just
calling new, but hopefully encourages the continued use of std::unique_ptr
should the constructor logic change to introduce vectors for exceptions to
be thrown.
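
A minimal sketch of the pattern described above, using hypothetical `QuantizerSketch` and `IndexSketch` types rather than real FAISS classes. The validation step that throws is invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical sub-object owned by the index.
struct QuantizerSketch {
    int d;
    explicit QuantizerSketch(int dim) : d(dim) {}
};

struct IndexSketch {
    // Safe defaults at the definition site.
    QuantizerSketch* quantizer = nullptr;
    bool own_fields = false;

    explicit IndexSketch(int d) {
        // Hold the allocation in a unique_ptr while exceptions are possible.
        auto q = std::make_unique<QuantizerSketch>(d);
        if (d <= 0) {
            // The destructor will not run for a partially constructed
            // object, so the unique_ptr is what frees q here.
            throw std::invalid_argument("d must be positive");
        }
        // No further exceptions possible: hand over ownership.
        assert(quantizer == nullptr && !own_fields);
        quantizer = q.release();
        own_fields = true;
    }

    ~IndexSketch() {
        if (own_fields) {
            delete quantizer;
        }
    }

    IndexSketch(const IndexSketch&) = delete;
    IndexSketch& operator=(const IndexSketch&) = delete;
};
```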

Reviewed By: mnorris11

Differential Revision: D95294213

fbshipit-source-id: 773d382ed7eedbb81717a15d604f3e2ca33b376b
Summary:
Pull Request resolved: #4883

Explicitly verify the results of dynamic_cast and throw descriptive
exceptions when the casts fail.

Of the 6 sites addressed here, 5 of them are defensive (e.g. against
a future code change that breaks a current invariant). The sixth, which
casts to IndexPQ*, actually protects against corrupted/malicious input
that embeds an unexpected or malformed index type.
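
The checked-cast idea might look roughly like this. The types and the `checked_cast` helper are hypothetical, and the diff presumably throws through the library's own exception machinery instead of `std::runtime_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical class hierarchy standing in for the real index types.
struct IndexBaseSketch {
    virtual ~IndexBaseSketch() = default;
};
struct IndexPQSketch : IndexBaseSketch {};
struct IndexFlatSketch : IndexBaseSketch {};

// dynamic_cast with an explicit check, so a mismatched type raises a
// descriptive error instead of a null-pointer dereference later on.
template <class T>
T* checked_cast(IndexBaseSketch* p, const char* what) {
    assert(what != nullptr);
    T* out = dynamic_cast<T*>(p);
    if (out == nullptr) {
        throw std::runtime_error(
                std::string("unexpected index type, expected ") + what);
    }
    return out;
}
```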

Reviewed By: mnorris11

Differential Revision: D95479168

fbshipit-source-id: e74269ce3d18a6a0583e17e32d352f771d437934