Stores into fixed-size slices are easy to misuse, lead to subtle bugs #33

@Shnatsel

I've just run into a very subtle bug with SIMD stores in my own code that may also apply to this crate.

I was trying to make these two lines safe:

let out_data = core::mem::transmute::<*mut i16, *mut __m256i>(data.as_mut_ptr());
_mm256_storeu_si256(out_data, ymm3);

so I made a helper function to wrap it:

fn avx_store(input: __m256i, output: &mut [i16]) {
    unsafe { _mm256_storeu_si256(output.as_mut_ptr() as *mut __m256i, input) }
}

and the original code became

avx_store(ymm3, &mut data[0..16].try_into().unwrap());

And everything broke. I checked and double-checked and triple-checked and started wondering about a compiler bug because everything was so trivial and obviously correct.

Only with outside help did I realize that my conversion to a fixed-size slice, &mut data[0..16].try_into().unwrap(), was creating an intermediate array instead of giving me a reference into the original slice. The &mut applies to the result of the whole method chain, so try_into() copies the 16 elements into a temporary array and the borrow points at that temporary. The SIMD store then wrote into the temporary, and the data never made it into the output.
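The difference can be reproduced without any SIMD. The following is a minimal sketch, not the code from the linked PR: `store_ones`, `store_via_temporary` and `store_in_place` are hypothetical names, and the store is assumed to take `&mut [i16; 16]` so that the conversion has an unambiguous target type.

```rust
use std::convert::TryInto; // already in the prelude from the 2021 edition on

/// Hypothetical stand-in for a SIMD store: overwrites all 16 lanes with 1.
fn store_ones(output: &mut [i16; 16]) {
    output.fill(1);
}

/// Buggy call shape: `&mut` binds to the result of the method chain, so
/// `try_into()` runs on `&data[0..16]` and yields a *copied* `[i16; 16]`.
/// The store writes into that temporary, which is dropped right after.
fn store_via_temporary(data: &mut [i16]) {
    store_ones(&mut data[0..16].try_into().unwrap());
}

/// Fixed call shape: borrow the sub-slice mutably first, then convert the
/// borrow itself into `&mut [i16; 16]`. No copy is made, so the store
/// lands in `data`.
fn store_in_place(data: &mut [i16]) {
    store_ones((&mut data[0..16]).try_into().unwrap());
}
```

Calling `store_via_temporary` on a zeroed 32-element buffer leaves it all zeros, while `store_in_place` sets the first 16 elements to 1. The only difference between the two calls is the pair of parentheses around `&mut data[0..16]`.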

Here's the full code if you'd like more context: vstroebel/jpeg-encoder#18

It seems that the function signatures in safe_unaligned_simd sidestep this problem for x86 by accepting dynamically sized arrays for store intrinsics, but it could still be a problem on ARM, e.g. https://docs.rs/safe_unaligned_simd/latest/aarch64-apple-darwin/safe_unaligned_simd/aarch64/fn.vst1_f32.html
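To illustrate why a slice-taking signature avoids the trap, here is a rough sketch of a store helper that accepts `&mut [i16]` and checks the length itself, so the call site never needs `try_into()`. `avx_store_checked` and `store_demo` are hypothetical names for illustration only; this is not the API of safe_unaligned_simd, and the runtime AVX check is an assumption about how such a helper would be called.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m256i, _mm256_set1_epi16, _mm256_storeu_si256};

/// Hypothetical helper (not part of any crate's API): stores 16 lanes into
/// the first 16 elements of `output`, panicking if the slice is too short.
/// The caller must ensure AVX is available.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn avx_store_checked(input: __m256i, output: &mut [i16]) {
    assert!(output.len() >= 16, "output must hold at least 16 i16 values");
    // SAFETY: the length check guarantees 32 writable bytes, and
    // `_mm256_storeu_si256` has no alignment requirement.
    unsafe { _mm256_storeu_si256(output.as_mut_ptr() as *mut __m256i, input) }
}

/// Stores sixteen 7s into the first half of a 32-element buffer.
/// Returns `None` when AVX is not available on the running machine.
pub fn store_demo() -> Option<[i16; 32]> {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            let mut data = [0i16; 32];
            // The sub-slice is borrowed directly; there is no conversion
            // step that could silently produce a temporary copy.
            // SAFETY: AVX support was checked at runtime just above.
            unsafe { avx_store_checked(_mm256_set1_epi16(7), &mut data[0..16]) };
            return Some(data);
        }
    }
    None
}
```

The trade-off is that the length check moves from the type system to a runtime assertion. A signature taking `&mut [i16; 16]` encodes the size statically but invites the `try_into()` conversion at the call site.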
