Skip to content

Handle some float(widening_op(uint16, uint16)) better in x86 backend#8976

Open
abadams wants to merge 1 commit intomainfrom
abadams/fix_8913
Open

Handle some float(widening_op(uint16, uint16)) better in x86 backend#8976
abadams wants to merge 1 commit intomainfrom
abadams/fix_8913

Conversation

@abadams
Copy link
Member

@abadams abadams commented Mar 2, 2026

Fixes #8913

I have inspected manually, but no test because the place to put one would be simd_op_check_x86, but there was a deliberate decision not to assert on instruction selection outcomes for cast patterns on x86, because LLVM frequently changes it.

Fixes #8913

I have inspected manually, but no test because the place to put one
would be simd_op_check_x86, but there was a deliberate decision not to
assert on instruction selection outcomes for cast patterns on x86,
because LLVM frequently changes it.
Copy link
Contributor

@mcourteaux mcourteaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, although I wonder if this is something that should be addressed beyond x86. Wouldn't halide produce equally pathological code for arm for example?

@mcourteaux
Copy link
Contributor

I also wonder if we should reconsider the not testing the op check. With the new llvm upgrade strategy we might actually be able to keep track nicely and evaluate things. I would assume that this particular pattern from the repro I posted should be stable: that looks like sane code.

@alexreinking
Copy link
Member

alexreinking commented Mar 3, 2026

I also wonder if we should reconsider the not testing the op check.

There's a concerning amount of IR rewriting that happens inside the code generators. It complicates writing tests that would ideally inspect the "final" state of the Halide IR prior to an LLVM hand-off. OTOH, we could maybe directly inspect the LLVM IR that we generate. That would be much more stable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FindIntrinsics lifts cast from uint16 to float out of multiplication.

3 participants