Introduce palette_math and refactor gamma LUT code#437
Conversation
CodSpeed Performance ReportMerging #437 will improve performances by 10.14%Comparing Summary
Benchmarks breakdown
Footnotes |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #437 +/- ##
==========================================
- Coverage 81.70% 81.67% -0.03%
==========================================
Files 131 137 +6
Lines 17909 18510 +601
Branches 17909 18510 +601
==========================================
+ Hits 14632 15118 +486
- Misses 3135 3249 +114
- Partials 142 143 +1
🚀 New features to boost your workflow:
|
a151e2c to
c08d9a7
Compare
|
Very nice job cleaning up my messy LUT implementation. I think there are a few minor issues, so I'll leave some quick comments pointing out anything that seems like it could be corrected/improved, but this looks good! |
palette_math/src/gamma/lut.rs
Outdated
| /// Returns the bit representation of the smallest float value that converts | ||
| /// to `1u8`. | ||
| /// | ||
| /// The lookup algorithm the value is meant for is based on [this C++ | ||
| /// code](<https://gist.github.com/rygorous/2203834>) by Fabian "ryg" | ||
| /// Giesen. This algorithm is implemented in the [`GammaLut`] type, and this | ||
| /// function is mainly meant for use in code generation. | ||
| pub fn linear_to_u8_min_float_bits(&self) -> u32 { | ||
| (((self.into_linear)(self, 0.5 / 255.0) as f32).to_bits() - 1) & 0xff800000 | ||
| } |
There was a problem hiding this comment.
I think the documentation for this function is inaccurate, since it first finds the bits of the smallest float value that converts to 1u8 with ((self.into_linear)(self, 0.5 / 255.0) as f32).to_bits(). Then, it decrements that bit pattern so it represents the largest float value that converts to 0u8, and finally masks just the exponent bits.
Perhaps a better description would be something like "Returns the bit representation of the largest power of 2 float value that converts to 0u8."
palette_math/src/gamma/lut.rs
Outdated
| // The number of mantissa bits used to index into the lookup table | ||
| const MAN_INDEX_WIDTH: u32 = U8_MAN_INDEX_WIDTH; | ||
| // The number of bits in the remainder of the mantissa | ||
| const BUCKET_INDEX_WIDTH: u32 = 20; |
There was a problem hiding this comment.
Perhaps
const BUCKET_INDEX_WIDTH: u32 = 23 - MAN_INDEX_WIDTH;
or
const BUCKET_INDEX_WIDTH: u32 = 23 - U8_MAN_INDEX_WIDTH;
would be preferable, since that is how this term is derived (23 mantissa bits in single-precision IEEE floating point minus those used for the index).
palette_math/src/gamma/lut.rs
Outdated
| // The number of mantissa bits used to index into the lookup table | ||
| const MAN_INDEX_WIDTH: u32 = U16_MAN_INDEX_WIDTH; | ||
| // The number of bits in the remainder of the mantissa | ||
| const BUCKET_INDEX_WIDTH: i32 = 16; |
There was a problem hiding this comment.
I am not sure why BUCKET_INDEX_WIDTH is i32. Should it not also be u32 like with the u8 LUT implementation?
Additionally, same here as above,
const BUCKET_INDEX_WIDTH: u32 = 23 - MAN_INDEX_WIDTH;
or
const BUCKET_INDEX_WIDTH: u32 = 23 - U16_MAN_INDEX_WIDTH;
may be preferable.
There was a problem hiding this comment.
This may have been a typo. I don't think there's any specific reason for it to be one or the other, so might as well be u32 now when I'm making it a computed value.
palette_math/src/gamma/lut.rs
Outdated
| /// Returns the bit representation of the smallest float value that converts | ||
| /// to `1u16`. | ||
| /// | ||
| /// The lookup algorithm the value is meant for is based on [this C++ | ||
| /// code](<https://gist.github.com/rygorous/2203834>) by Fabian "ryg" | ||
| /// Giesen. This algorithm is implemented in the [`GammaLut`] type, and this | ||
| /// function is mainly meant for use in code generation. | ||
| pub fn linear_to_u16_min_float_bits(&self) -> u32 { | ||
| let Self { | ||
| into_linear, beta, .. | ||
| } = self; | ||
|
|
||
| (*beta as f32) | ||
| .to_bits() | ||
| .max((into_linear(self, 0.5 / 65535.0) as f32).to_bits() - 1) | ||
| & 0xff800000 | ||
| } |
There was a problem hiding this comment.
This also suffers from slightly inaccurate documentation. Notably, this is slightly more complex than the u8 implementation since the u16 table does not include entries for the linear portion of the function. A more accurate description might be something like "Returns the bit representation of the largest power of 2 float value that either converts to 0u16 or is in the linear part of its transfer function (is less than or equal to beta)." This is mildly wordy, though, so perhaps you might have something more concise in mind.
|
Thank you for the corrections 🙏, I will implement them when I get back to this (I have been taking a break). I had clearly misunderstood those values. If there's anything else, like any invariants I missed for the unsafe functions, feel free to add more. |
| // SAFETY: `input` needs to be clamped between `min_float` and `max_float`. | ||
| #[inline] | ||
| unsafe fn linear_float_to_encoded_uint<E: GammaLutOutput>( | ||
| input: f32, | ||
| min_float_bits: u32, | ||
| table: &[E::TableValue], | ||
| ) -> E |
There was a problem hiding this comment.
Thank you for pointing out checking safety invariants, since I realize now that the safety of this function requires that table have the correct number of entries. Since one could pass in an empty table without violating any of the listed constraints, causing UB, the restrictions are much more strict. More realistically, one might simply use the wrong table which happens to be too small, leading to UB. This would also make functions that call it such as linear_to_encoded_u8 also unsafe, since it does not ensure this constraint.
This makes the function much more difficult to document, though. Perhaps having it rely on safety guarantees from GammaLutBuilder would be sufficient, but I'm still not sure of the ideal way to handle this.
There was a problem hiding this comment.
This one isn't public, so the documentation is for us maintainers. The public functions should of course mention it too, where relevant, but I will probably phrase it in a "do not try this at home" kind of way. Essentially, the only supposed table data is the output of the builder for that number type.
There was a problem hiding this comment.
Hmm, yeah, the functions calling this should be marked unsafe too, up until GammaLut::lookup. 😬 Not an amazing API, but I'll hide some of the functions from the documentation for now. I wouldn't recommend anyone to use them directly anyway.
|
Not perfect, but this will have to do for now. Some parts, like array construction, can be improved with a later rust version. Other parts can probably be made better with a cleaner split, but that requires breaking changes. |
This is quite a significant refactor to remove the generated
u16LUT constants and still offer them as a runtime feature. It introducespalette_mathas a more low-level crate that lets us share code between code generation and the main crate. The size of theu16tables were simply too large for them to be sustainable. Instead, it's possible to generate them on demand with theFromLinearLutandIntoLinearLuttypes.