
feat: Add Bunch–Kaufman LDL Decomposition #1591

Open
awinick wants to merge 7 commits into dimforge:main from awinick:ldl-pivot

Conversation


@awinick awinick commented Mar 16, 2026

This PR adds an LDL decomposition for Hermitian matrices using the Bunch–Kaufman symmetric pivoting algorithm.

The factorization computed is

Pᵀ A P = L D Lᴴ

where

  • P is a permutation matrix from symmetric pivoting
  • L is unit lower triangular
  • D is Hermitian block diagonal containing 1×1 and 2×2 blocks

This corresponds to the class of algorithms implemented in LAPACK’s *HETRF family.

Implementation notes

The factorization is stored in-place in a single matrix following the standard LAPACK layout:

  • the strictly lower triangle stores L
  • the diagonal and first subdiagonal store D
  • pivot information is stored separately
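To make the packed layout concrete, here is a minimal sketch (not the PR's actual API; the helper name, the `Vec<Vec<f64>>` representation, and the `(start_index, block_size)` pivot encoding are our assumptions, shown for the real symmetric case) of how `L` and `D` can be read back out of a single factor matrix stored this way:

```rust
// Hypothetical sketch: unpack L and D from a packed factor matrix `f`
// (row-major, n x n), given pivot blocks as (start_index, block_size).
// Layout assumed: strict lower triangle holds L; the diagonal and, for
// 2x2 blocks, the first subdiagonal hold D.
fn unpack(f: &[Vec<f64>], blocks: &[(usize, usize)]) -> (Vec<Vec<f64>>, Vec<Vec<f64>>) {
    let n = f.len();
    let mut l = vec![vec![0.0; n]; n];
    let mut d = vec![vec![0.0; n]; n];
    for i in 0..n {
        l[i][i] = 1.0; // L is unit lower triangular
    }
    for &(k, bs) in blocks {
        d[k][k] = f[k][k];
        if bs == 2 {
            // 2x2 block: its off-diagonal entry lives on the first subdiagonal,
            // so f[k+1][k] belongs to D, not to L
            d[k + 1][k] = f[k + 1][k];
            d[k][k + 1] = f[k + 1][k]; // symmetric
            d[k + 1][k + 1] = f[k + 1][k + 1];
        }
        // entries of L strictly below the pivot block come from the lower triangle
        for j in k..k + bs {
            for i in (k + bs)..n {
                l[i][j] = f[i][j];
            }
        }
    }
    (l, d)
}
```

The key subtlety this illustrates is that within a 2×2 pivot block the subdiagonal entry is part of `D`, while `L` is implicitly the identity on those two columns' diagonal block.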

Tests

The tests are intentionally compact but exercise a wide variety of pivot behaviors.

During development I inspected pivot structures and found that matrices constructed as

A = U D Uᴴ

where U is Haar-random unitary and D has alternating signs consistently generate rich mixtures of 1×1 and 2×2 pivot blocks.

Two spectra are used.

Alternating ±1 spectrum

D = diag(1, -1, 1, -1, ...)

This produces a diverse set of pivot patterns across matrix sizes and verifies

  • reconstruction accuracy
  • determinant correctness
  • linear solve correctness

Alternating geometric spectrum

D = diag(1, -10, 100, -1000, ...)

This stresses the algorithm under extreme scaling, which is particularly important for

  • pivot selection behavior
  • numerical stability
  • determinant accuracy

Empirically this spectrum produces a wide variety of pivot configurations while remaining numerically well-behaved.
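For reference, the two spectra can be generated as follows (a sketch; the helper names are ours, not part of the PR):

```rust
// Alternating ±1 spectrum: 1, -1, 1, -1, ...
fn alternating_signs(n: usize) -> Vec<f64> {
    (0..n).map(|k| if k % 2 == 0 { 1.0 } else { -1.0 }).collect()
}

// Alternating geometric spectrum: 1, -10, 100, -1000, ...
fn alternating_geometric(n: usize) -> Vec<f64> {
    (0..n)
        .map(|k| {
            let sign = if k % 2 == 0 { 1.0 } else { -1.0 };
            sign * 10f64.powi(k as i32)
        })
        .collect()
}
```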

Determinant accuracy

During testing it was observed that computing determinants through this LDL factorization produces significantly more accurate real-valued results for Hermitian matrices than the current generic approach.

Because the decomposition explicitly respects Hermitian structure, the determinant is obtained as the product of real block determinants, which reduces complex roundoff artifacts.
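The determinant computation described above can be sketched like this (hypothetical helper, real symmetric case; the packed-matrix and `(start_index, block_size)` conventions are our assumptions). Since det(P) = ±1 appears squared and L is unit triangular, det(A) reduces to the product of the block determinants of D:

```rust
// Hypothetical sketch: determinant of A from the block-diagonal D stored in
// the packed factor matrix `f`. Each 1x1 block contributes its diagonal
// entry; each 2x2 block contributes a real 2x2 determinant. (In the
// Hermitian complex case the 2x2 term would be d11*d22 - |d21|^2,
// which is likewise real.)
fn det_from_blocks(f: &[Vec<f64>], blocks: &[(usize, usize)]) -> f64 {
    blocks
        .iter()
        .map(|&(k, bs)| match bs {
            1 => f[k][k],
            _ => f[k][k] * f[k + 1][k + 1] - f[k + 1][k] * f[k + 1][k],
        })
        .product()
}
```

Because every factor in the product is real, no complex roundoff can leak into the result, which is the source of the accuracy improvement described above.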

Benchmarking

Benchmarks were performed comparing this implementation against LAPACK (zhetrf) and the existing QR decomposition in nalgebra.

All results below correspond to 1000 factorizations.

| Size | LAPACK zhetrf (Accelerate) | LAPACK zhetrf (OpenBLAS) | This implementation | nalgebra QR |
|---|---|---|---|---|
| 100×100 | ~136 ms | ~117 ms | ~179 ms | ~529 ms |
| 200×200 | ~590 ms | ~877 ms | ~1274 ms | ~4249 ms |

The pure Rust implementation is typically within roughly a factor of two of hardware-accelerated LAPACK. Given that LAPACK implementations rely on highly tuned BLAS kernels and architecture-specific optimizations, this level of performance is reasonable for a portable pure-Rust implementation.

For comparison, QR factorization, currently the best method available in nalgebra for this problem class, is substantially slower. QR does not exploit Hermitian structure and therefore performs significantly more work than an LDL factorization.

197g and others added 3 commits January 4, 2026 21:33
* Add known-failing unit test case for SymmetricEigen

* Fix eigenvector calculation for subdim=2 case in SymmetricEigen

awinick commented Mar 16, 2026

This work was inspired in part by the implementation efforts in #1515 by @ajefweiss.

For the problem class I was working on, I needed clear and reliable handling of indefinite Hermitian matrices, which motivated implementing the classical Bunch–Kaufman LDL factorization.

While there is overlap in goals, this PR focuses specifically on the symmetric-indefinite Hermitian case with explicit pivot handling and block-diagonal D structure.


geo-ant commented Mar 16, 2026

I'll try to take a look soon ish. I've kicked off the checks and they show a few problems in the derive macros. Could you fix them if you have time?

@geo-ant geo-ant self-requested a review March 16, 2026 18:21

@geo-ant geo-ant left a comment


Hey, I'm sorry that I didn't have time to take a super in-depth look yet. I need to brush up on the Bunch-Kaufman factorization.

Did you use LAPACK's zhetrf function as your reference? If not, what other references should I use to review the implementation? I'm aware of:

Let me know what would be a good reference for me to check your implementation. I see you've added tests which is already fantastic, I'd just like to be able to review your implementation as well.

I've made some comments in the code for some API related things. If you want to take care of them, could you also run cargo fmt on your code so it'll pass the lint?

```rust
///
/// This uses the LAPACK `ipiv` convention:
/// a positive entry denotes a 1x1 pivot block, while a repeated negative entry
/// denotes a 2x2 pivot block. The stored pivot indices are 1-based.
```
Collaborator

I like LAPACK as much as the next guy, but to my mind this isn't going to feel intuitive for Rust users.

  • Is there a reason to expose this API at all?
  • If so, would you mind abstracting it into a Pivot structure? This could expose an API for applying P A P^T, either as a function or using lazily overloaded operators depending how overengineered you'd like it to be... 😅
  • I'd much prefer 0-based indices, which is what nalgebra uses everywhere else.

Author

That makes sense. I agree the LAPACK-style encoding is efficient but not very intuitive.

I’ve switched to a (pivot_index, block_size) representation with 0-based indices, so the semantics are explicit and consistent with nalgebra, without the sign/1-based encoding.

Right now I’m exposing the pivots mainly for inspection/testing rather than as a primary API. I didn’t introduce a dedicated Pivot type to keep things lightweight, but I’m happy to revisit that if you think it adds value.

I don’t expect most users to interact with pivots or zero_pivot directly, but exposing them gives some additional visibility for debugging (e.g. understanding how the factorization pivoted, or diagnosing singular/near-singular cases).
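For reviewers comparing against LAPACK output, the relationship between the two encodings can be sketched as follows (our helper, not the PR's API; it assumes the lower-triangular `ipiv` convention quoted above, where a positive 1-based entry marks a 1×1 block and a repeated negative entry marks a 2×2 block):

```rust
// Hypothetical sketch: decode a LAPACK-style signed, 1-based `ipiv` array
// into 0-based (pivot_index, block_size) pairs.
fn decode_ipiv(ipiv: &[i64]) -> Vec<(usize, usize)> {
    let mut out = Vec::new();
    let mut k = 0;
    while k < ipiv.len() {
        if ipiv[k] > 0 {
            // positive entry: 1x1 pivot block, interchange row ipiv[k] (1-based)
            out.push(((ipiv[k] - 1) as usize, 1));
            k += 1;
        } else {
            // repeated negative entry: 2x2 pivot block spanning rows k, k+1
            out.push(((-ipiv[k] - 1) as usize, 2));
            k += 2;
        }
    }
    out
}
```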

Collaborator

I don't have a super strong opinion here (just regular strength 😆), but if the API is exposed at all I'd like it to be simpler (meaning using some kind of wrapper type). But from what you said the API won't be touched by most users anyway so I'd say my preferred solution right now is to make it private and use it only in the tests. If people ask for it, there's always a chance to expose something that's easier to use.

If you say that the pivots give us a way to understand whether a matrix is (near) singular, maybe that's the only thing that should be exposed for now.

Author

On zero_pivot, I’d be a little careful about treating it as a general statement about singularity or conditioning. An exactly zero pivot means solve/solve_mut cannot proceed reliably, so it is useful diagnostic information, but the converse is not true: a badly conditioned or even singular matrix can still sometimes produce a result, and that result has to be interpreted with care.

So I see zero_pivot as information about how the factorization behaved, not as a full statement about conditioning or singularity. More generally, I think assessing conditioning is outside the responsibility of the decomposition itself. If nalgebra wants to expose that kind of information, I think it would make more sense as a separate method rather than through zero_pivot. I’d definitely find that useful myself when evaluating problems.

On the API, I’m open to making the pivot representation non-public. The main reason I exposed it was for inspection/testing, since I’ve been keeping tests in integration tests rather than unit tests, so private or pub(crate) internals are not accessible there.

Would hiding it from the public docs be sufficient here, or would you prefer a different approach?

Author

Also, if a dedicated conditioning-related method would be useful, I’d be happy to open that as follow-up work.

```rust
///
/// This follows the LAPACK convention and is therefore 1-based.
/// A value of `None` means no zero pivot was detected.
#[inline]
```
Collaborator

same comment as above, this could better be abstracted into a Pivot structure, if it's needed at all...

```rust
///
/// P A P^T = L * D * L^H
///
/// where L is a product of permutation and unit lower triangular matrices, U^H is the
```
Collaborator

> where L is a product of permutation and unit lower triangular matrices, U^H is the

I think this is copied from LAPACK's ?{he,sy}trf family of functions, but LAPACK uses the form A = L D L^H, whereas you give P A P^T = L D L^H. In your case L should indeed be unit lower triangular, right?

Author

Yes LAPACK’s ?sytrf/?hetrf routines were the primary reference for the implementation.

I wrote the factorization in the form
P A P^T = L D L^H intentionally. In this formulation, L is unit lower triangular. LAPACK’s documentation writes A = L D L^H, but in practice the permutations are absorbed into the pivot representation, so the stored L is not strictly lower triangular in the original coordinate system.

Keeping the permutation explicit via P makes the structure a bit clearer: L is unit lower triangular in the permuted basis, and the pivoting is handled separately rather than implicitly encoded.
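The explicit-permutation view can be sketched like this (our illustration, with an assumed convention: row i of the permuted matrix is row perm[i] of A, shown for the real symmetric case):

```rust
// Hypothetical sketch: form B = P^T A P for a permutation given as a
// 0-based index vector `perm`, so B[i][j] = A[perm[i]][perm[j]].
// In this permuted basis the factor L is unit lower triangular.
fn sym_permute(a: &[Vec<f64>], perm: &[usize]) -> Vec<Vec<f64>> {
    let n = a.len();
    (0..n)
        .map(|i| (0..n).map(|j| a[perm[i]][perm[j]]).collect())
        .collect()
}
```

Applying the same permutation to rows and columns preserves symmetry, which is why the factorization can keep P separate without losing the Hermitian structure.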

Collaborator

@geo-ant geo-ant Mar 28, 2026


Okay then I understood your implementation correctly and I actually like the clarity.

But then the comment needs to change, if I'm not missing something? Because the comment says the permutation is absorbed into L in your implementation, which it wouldn't be.

Edit: I still want to check your implementation against the lapack impl.

Author

> Okay then I understood your implementation correctly and I actually like the clarity.
>
> But then the comment needs to change, if I'm not missing something? Because the comment says the permutation is absorbed into L in your implementation, which it wouldn't be.
>
> Edit: I still want to check your implementation against the lapack impl.

Good catch 😅 I double-checked the docs, and they should be consistent and easier to understand now.


awinick commented Mar 28, 2026

> Hey, I'm sorry that I didn't have time to take a super in-depth look yet. I need to brush up on the Bunch-Kaufman factorization.
>
> Did you use LAPACK's zhetrf function as your reference? If not, what other references should I use to review the implementation? I'm aware of:
>
> Let me know what would be a good reference for me to check your implementation. I see you've added tests which is already fantastic, I'd just like to be able to review your implementation as well.
>
> I've made some comments in the code for some API related things. If you want to take care of them, could you also run cargo fmt on your code so it'll pass the lint?

The implementation is based on the partial pivoting diagonal pivoting method from Bunch & Kaufman (1977), specifically the algorithm described in Section 2 (referred to as “Algorithm A”). This is also the basis for LAPACK’s ?sytrf/?hetrf routines, so the overall structure and pivoting logic are very similar to those.

Happy to clarify any specific part of the algorithm if helpful.
