* [Bug target/108614] New: _subborrow_u32 generates suboptimal code when second subtraction operand is constant on x86 targets
@ 2023-01-31 12:39 john_platts at hotmail dot com
From: john_platts at hotmail dot com @ 2023-01-31 12:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108614
Bug ID: 108614
Summary: _subborrow_u32 generates suboptimal code when second
subtraction operand is constant on x86 targets
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: john_platts at hotmail dot com
Target Milestone: ---
The following C++ code compiles to suboptimal code with gcc 12.2.0 and the
-O2 -march=skylake-avx512 -m32 options:
#include <stdint.h>
#include <utility>
#include <x86intrin.h>
#include <immintrin.h>

std::pair<uint32_t, uint32_t> ComputeHiMaskAndHiZeroAmt(uint32_t len) {
  uint32_t hiMask;
  uint32_t hiZeroAmt;
  _addcarry_u32(_subborrow_u32(0, len, 32, &hiZeroAmt),
                uint32_t{0xFFFFFFFFu}, 0, &hiMask);
  hiMask = _bzhi_u32(hiMask, hiZeroAmt);
  return std::make_pair(hiMask, hiZeroAmt);
}
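For readers who want to check the intended result without x86 intrinsics, here is a minimal portable sketch of what the function above computes. The names BzhiModel and ComputeHiMaskAndHiZeroAmtModel are hypothetical and not part of the report. The sketch assumes the documented behavior of the intrinsics: _subborrow_u32 yields len - 32 modulo 2^32 plus a borrow flag, _addcarry_u32 adds that flag to 0xFFFFFFFF, and BZHI reads its index from the low 8 bits of the second operand.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical portable model of BZHI for 32-bit operands: the index is the
// low 8 bits of `index`; bits at positions >= index are cleared, and the
// source is returned unchanged when the index is 32 or larger.
inline std::uint32_t BzhiModel(std::uint32_t src, std::uint32_t index) {
  const std::uint32_t n = index & 0xFFu;
  return n >= 32u ? src : (src & ((std::uint32_t{1} << n) - 1u));
}

// Hypothetical portable model of ComputeHiMaskAndHiZeroAmt from the report.
inline std::pair<std::uint32_t, std::uint32_t>
ComputeHiMaskAndHiZeroAmtModel(std::uint32_t len) {
  // Models _subborrow_u32(0, len, 32, &hiZeroAmt): wrapping subtraction,
  // with a borrow out when len < 32.
  const std::uint32_t hiZeroAmt = len - 32u;
  const std::uint32_t borrow = len < 32u ? 1u : 0u;
  // Models _addcarry_u32(borrow, 0xFFFFFFFF, 0, &hiMask): the sum wraps to 0
  // when the borrow is set and stays all-ones otherwise.
  const std::uint32_t hiMask = 0xFFFFFFFFu + borrow;
  return {BzhiModel(hiMask, hiZeroAmt), hiZeroAmt};
}
```

Under these assumptions, any len below 32 gives a zero mask, and a len of 40 gives a mask with the low 8 bits set.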
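gcc 12.2.0 generates the following assembly for the above code with the
-O2 -march=skylake-avx512 -m32 options. The constant 32 is first loaded into
%edx and then subtracted with a register-register subl, and %esp is adjusted
by 16 bytes even though the reserved stack space is never used: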
_Z25ComputeHiMaskAndHiZeroAmtj:
        subl    $16, %esp
        movl    24(%esp), %eax
        movl    $32, %edx
        subl    %edx, %eax
        movl    20(%esp), %ecx
        movl    $-1, %edx
        adcl    $0, %edx
        movl    %eax, 4(%ecx)
        bzhi    %eax, %edx, %edx
        movl    %ecx, %eax
        movl    %edx, (%ecx)
        addl    $16, %esp
        ret     $4
Here is a better version of the above code (for 32-bit x86), which subtracts
the immediate 32 directly and omits the stack adjustment:
_Z25ComputeHiMaskAndHiZeroAmtj:
        movl    8(%esp), %eax
        subl    $32, %eax
        movl    4(%esp), %ecx
        movl    $-1, %edx
        adcl    $0, %edx
        movl    %eax, 4(%ecx)
        bzhi    %eax, %edx, %edx
        movl    %ecx, %eax
        movl    %edx, (%ecx)
        ret     $4
gcc 12.2.0 generates the following assembly for the same source with the
-O2 -march=skylake-avx512 options (64-bit x86). Again the constant 32 is
loaded into a register (%eax) before the subtraction:
_Z25ComputeHiMaskAndHiZeroAmtj:
        movl    $32, %eax
        subl    %eax, %edi
        movl    $-1, %eax
        adcl    $0, %eax
        bzhi    %edi, %eax, %eax
        salq    $32, %rdi
        movl    %eax, %eax
        orq     %rdi, %rax
        ret
Here is a better version of the above code (for 64-bit x86), which subtracts
the immediate 32 directly:
_Z25ComputeHiMaskAndHiZeroAmtj:
        subl    $32, %edi
        movl    $-1, %eax
        adcl    $0, %eax
        bzhi    %edi, %eax, %eax
        salq    $32, %rdi
        movl    %eax, %eax
        orq     %rdi, %rax
        ret
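The salq/movl/orq tail shared by both 64-bit listings is not part of the reported problem. It packs the two 32-bit members of the returned std::pair into %rax, assuming the System V x86-64 calling convention, under which an 8-byte struct of two integers is returned in a single register. A minimal sketch of that packing, using a hypothetical helper PackPairModel that does not appear in the report:

```cpp
#include <cstdint>

// Hypothetical model of the salq/movl/orq sequence: `first` (hiMask) ends up
// in the low 32 bits of the result and `second` (hiZeroAmt) in the high 32.
inline std::uint64_t PackPairModel(std::uint32_t first, std::uint32_t second) {
  const std::uint64_t high = std::uint64_t{second} << 32;  // salq $32, %rdi
  const std::uint64_t low = std::uint64_t{first};          // movl %eax, %eax
  return low | high;                                       // orq  %rdi, %rax
}
```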