simd: convert rshift64 macros to functions and fix simd_utils bugs #376

Open
byeonguk-jeong wants to merge 1 commit into VectorCamp:develop from AhnLab-OSSG:simd-utils-fix
@byeonguk-jeong
Convert rshift64_m128/m256/m512 macros to inline functions that support runtime (non-constant) shift amounts on x86, matching the existing lshift64 function implementations.
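
For illustration, a minimal sketch of what a runtime-capable 64-bit right shift can look like on x86 with SSE2. It is not the code from this PR: `rshift64_sketch` and `lane64` are hypothetical names, and only the 128-bit case is shown. The immediate form `_mm_srli_epi64` expects a compile-time constant count, whereas `_mm_srl_epi64` reads the count from the low 64 bits of a vector register, so the count can be an ordinary runtime value.

```c
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical stand-in for a function-style rshift64_m128.
 * The shift count is moved into a vector register so that it
 * does not have to be a compile-time constant. */
static inline __m128i rshift64_sketch(__m128i a, unsigned b) {
    __m128i count = _mm_cvtsi32_si128((int)b);
    /* Logical right shift of both 64-bit lanes; counts above 63 yield 0. */
    return _mm_srl_epi64(a, count);
}

/* Hypothetical helper: read 64-bit lane 0 (low) or 1 (high) of a vector. */
static inline uint64_t lane64(__m128i v, int idx) {
    uint64_t out[2];
    _mm_storeu_si128((__m128i *)out, v);
    return out[idx];
}
```

Calling `rshift64_sketch(v, n)` with `n` read from a variable compiles without the constant-operand requirement that the immediate intrinsic imposes. A real implementation might still prefer the immediate form when the compiler can prove the count is constant.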

Also fix:

  • lshift64_m256/rshift64_m256 parameter type from int to unsigned in the non-256-bit fallback path (common/simd_utils.h)
  • isnonzero512: remove redundant self-OR operations
  • load512: fix alignment assertion to check m512 instead of m256
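
To show why the load512 assertion matters, here is a small sketch under stated assumptions. The types `m256_sketch` and `m512_sketch` are hypothetical stand-ins with 32-byte and 64-byte alignment, and `IS_ALIGNED_TO` is an illustrative macro, not the project's own helper. In a build where the 512-bit type has the same alignment as the 256-bit type, the two checks would agree.

```c
#include <stdint.h>

/* Hypothetical stand-in vector types with explicit alignment (C11). */
typedef struct { _Alignas(32) unsigned char b[32]; } m256_sketch;
typedef struct { _Alignas(64) unsigned char b[64]; } m512_sketch;

/* Illustrative alignment test; n must be a power of two. */
#define IS_ALIGNED_TO(ptr, n) ((((uintptr_t)(ptr)) & ((uintptr_t)(n) - 1)) == 0)

/* The weaker check: passes for any 32-byte-aligned pointer. */
static inline int aligned_for_m256(const void *ptr) {
    return IS_ALIGNED_TO(ptr, _Alignof(m256_sketch));
}

/* The check an aligned 512-bit load needs under these assumptions. */
static inline int aligned_for_m512(const void *ptr) {
    return IS_ALIGNED_TO(ptr, _Alignof(m512_sketch));
}
```

With a 64-byte-aligned buffer `buf`, the pointer `buf + 32` satisfies `aligned_for_m256` but not `aligned_for_m512`, so an assertion written against the 256-bit type would let a misaligned 512-bit load through.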

Fixes: 3f0f9e6 ("move x86 implementations of simd_utils.h to util/arch/x86/")
Fixes: 6ff4752 ("add scalar versions of the vectorized functions for architectures that don't support 256-bit/512-bit SIMD vectors such as ARM")
Fixes: 75aadb7 ("split arch-agnostic simd_utils.h functions into the common file")

@AhnLab-OSS @AhnLab-OSSG

Signed-off-by: Byeonguk Jeong <jungbu2855@gmail.com>