Introduce `palette_math` and refactor gamma LUT code by Ogeon · Pull Request #437 · Ogeon/palette

Ogeon · 2025-04-21T14:14:14Z

This is quite a significant refactor to remove the generated u16 LUT constants and still offer them as a runtime feature. It introduces palette_math as a more low-level crate that lets us share code between code generation and the main crate. The size of the u16 tables were simply too large for them to be sustainable. Instead, it's possible to generate them on demand with the FromLinearLut and IntoLinearLut types.

codspeed-hq · 2025-04-21T14:16:02Z

CodSpeed Performance Report

Merging #437 will improve performances by 10.14%

_{Comparing palette_math (b8fcb95) with master (e735074)¹}

Summary

⚡ 1 improvements
✅ 46 untouched benchmarks

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`multiply_3x3`	316.7 ns	287.5 ns	+10.14%

No successful run was found on master (88b711e) during the generation of this report, so e735074 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩

codecov · 2025-04-21T14:16:08Z

Codecov Report

❌ Patch coverage is 82.00313% with 115 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.67%. Comparing base (e735074) to head (b8fcb95).
⚠️ Report is 8 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #437      +/-   ##
==========================================
- Coverage   81.70%   81.67%   -0.03%     
==========================================
  Files         131      137       +6     
  Lines       17909    18510     +601     
  Branches    17909    18510     +601     
==========================================
+ Hits        14632    15118     +486     
- Misses       3135     3249     +114     
- Partials      142      143       +1

Components	Coverage Δ
palette	`81.86% <88.03%> (+0.11%)`	⬆️
palette_derive	`81.20% <ø> (ø)`
palette_math	`76.33% <77.89%> (∅)`

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Kirk-Fox · 2025-05-19T21:28:02Z

Very nice job cleaning up my messy LUT implementation. I think there are a few minor issues, so I'll leave some quick comments pointing out anything that seems like it could be corrected/improved, but this looks good!

Kirk-Fox · 2025-05-19T21:28:06Z

palette_math/src/gamma/lut.rs

+    /// Returns the bit representation of the smallest float value that converts
+    /// to `1u8`.
+    ///
+    /// The lookup algorithm the value is meant for is based on [this C++
+    /// code](<https://gist.github.com/rygorous/2203834>) by Fabian "ryg"
+    /// Giesen. This algorithm is implemented in the [`GammaLut`] type, and this
+    /// function is mainly meant for use in code generation.
+    pub fn linear_to_u8_min_float_bits(&self) -> u32 {
+        (((self.into_linear)(self, 0.5 / 255.0) as f32).to_bits() - 1) & 0xff800000
+    }


I think the documentation for this function is inaccurate, since it first finds the bits of the smallest float value that converts to 1u8 with ((self.into_linear)(self, 0.5 / 255.0) as f32).to_bits(). Then, it decrements that bit pattern so it represents the largest float value that converts to 0u8, and finally masks just the exponent bits.

Perhaps a better description would be something like "Returns the bit representation of the largest power of 2 float value that converts to 0u8."

Kirk-Fox · 2025-05-19T21:56:47Z

palette_math/src/gamma/lut.rs

+        // The number of mantissa bits used to index into the lookup table
+        const MAN_INDEX_WIDTH: u32 = U8_MAN_INDEX_WIDTH;
+        // The number of bits in the remainder of the mantissa
+        const BUCKET_INDEX_WIDTH: u32 = 20;


Perhaps
const BUCKET_INDEX_WIDTH: u32 = 23 - MAN_INDEX_WIDTH;
or
const BUCKET_INDEX_WIDTH: u32 = 23 - U8_MAN_INDEX_WIDTH;
would be preferable, since that is how this term is derived (23 mantissa bits in single-precision IEEE floating point minus those used for the index).

Kirk-Fox · 2025-05-19T22:03:43Z

palette_math/src/gamma/lut.rs

+        // The number of mantissa bits used to index into the lookup table
+        const MAN_INDEX_WIDTH: u32 = U16_MAN_INDEX_WIDTH;
+        // The number of bits in the remainder of the mantissa
+        const BUCKET_INDEX_WIDTH: i32 = 16;


I am not sure why BUCKET_INDEX_WIDTH is i32. Should it not also be u32 like with the u8 LUT implementation?

Additionally, same here as above,
const BUCKET_INDEX_WIDTH: u32 = 23 - MAN_INDEX_WIDTH;
or
const BUCKET_INDEX_WIDTH: u32 = 23 - U16_MAN_INDEX_WIDTH;
may be preferable.

This may have been a typo. I don't think there's any specific reason for it to be one or the other, so might as well be u32 now when I'm making it a computed value.

Kirk-Fox · 2025-05-19T22:14:32Z

palette_math/src/gamma/lut.rs

+    /// Returns the bit representation of the smallest float value that converts
+    /// to `1u16`.
+    ///
+    /// The lookup algorithm the value is meant for is based on [this C++
+    /// code](<https://gist.github.com/rygorous/2203834>) by Fabian "ryg"
+    /// Giesen. This algorithm is implemented in the [`GammaLut`] type, and this
+    /// function is mainly meant for use in code generation.
+    pub fn linear_to_u16_min_float_bits(&self) -> u32 {
+        let Self {
+            into_linear, beta, ..
+        } = self;
+
+        (*beta as f32)
+            .to_bits()
+            .max((into_linear(self, 0.5 / 65535.0) as f32).to_bits() - 1)
+            & 0xff800000
+    }


This also suffers from slightly inaccurate documentation. Notably, this is slightly more complex than the u8 implementation since the u16 table does not include entries for the linear portion of the function. A more accurate description might be something like "Returns the bit representation of the largest power of 2 float value that either converts to 0u16 or is in the linear part of its transfer function (is less than or equal to beta)." This is mildly wordy, though, so perhaps you might have something more concise in mind.

Ogeon · 2025-05-20T09:30:18Z

Thank you for the corrections 🙏, I will implement them when I get back to this (I have been taking a break). I had clearly misunderstood those values. If there's anything else, like any invariants I missed for the unsafe functions, feel free to add more.

Kirk-Fox · 2025-05-20T20:03:37Z

palette_math/src/gamma/lut.rs

+// SAFETY: `input` needs to be clamped between `min_float` and `max_float`.
+#[inline]
+unsafe fn linear_float_to_encoded_uint<E: GammaLutOutput>(
+    input: f32,
+    min_float_bits: u32,
+    table: &[E::TableValue],
+) -> E


Thank you for pointing out checking safety invariants, since I realize now that the safety of this function requires that table have the correct number of entries. Since one could pass in an empty table without violating any of the listed constraints, causing UB, the restrictions are much more strict. More realistically, one might simply use the wrong table which happens to be too small, leading to UB. This would also make functions that call it such as linear_to_encoded_u8 also unsafe, since it does not ensure this constraint.

This makes the function much more difficult to document, though. Perhaps having it rely on safety guarantees from GammaLutBuilder would be sufficient, but I'm still not sure of the ideal way to handle this.

This one isn't public, so the documentation is for us maintainers. The public functions should of course mention it too, where relevant, but I will probably phrase it in a "do not try this at home" kind of way. Essentially, the only supposed table data is the output of the builder for that number type.

Hmm, yeah, the functions calling this should be marked unsafe too, up until GammaLut::lookup. 😬 Not an amazing API, but I'll hide some of the functions from the documentation for now. I wouldn't recommend anyone to use them directly anyway.

Ogeon · 2025-08-01T16:07:33Z

Not perfect, but this will have to do for now. Some parts, like array construction, can be improved with a later rust version. Other parts can probably be made better with a cleaner split, but that requires breaking changes.

Ogeon force-pushed the palette_math branch 6 times, most recently from a151e2c to c08d9a7 Compare April 21, 2025 16:16

Kirk-Fox reviewed May 19, 2025

View reviewed changes

Kirk-Fox reviewed May 20, 2025

View reviewed changes

Ogeon force-pushed the palette_math branch from c08d9a7 to acc1d67 Compare July 31, 2025 17:33

Make u16 gamma LUTs runtime only and introduce palette_math

b8fcb95

Ogeon force-pushed the palette_math branch from acc1d67 to b8fcb95 Compare July 31, 2025 17:46

Ogeon merged commit 55e77f4 into master Aug 1, 2025
24 of 25 checks passed

Ogeon deleted the palette_math branch August 1, 2025 16:08

Conversation

Ogeon commented Apr 21, 2025

Uh oh!

codspeed-hq bot commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging #437 will improve performances by 10.14%

Summary

Benchmarks breakdown

Footnotes

Uh oh!

codecov bot commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Kirk-Fox commented May 19, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ogeon commented May 20, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Ogeon commented Aug 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codspeed-hq bot commented Apr 21, 2025 •

edited

Loading

codecov bot commented Apr 21, 2025 •

edited

Loading