public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/94617] New: Simple if condition not optimized
@ 2020-04-16 11:28 soap at gentoo dot org
  2020-04-16 11:57 ` [Bug tree-optimization/94617] " rguenth at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: soap at gentoo dot org @ 2020-04-16 11:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94617

            Bug ID: 94617
           Summary: Simple if condition not optimized
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: soap at gentoo dot org
  Target Milestone: ---

Given the following C++ snippet

  const char* vanilla_bandpass(int a, int b, int x, const char* low, const char* high)
  {
      const bool within_interval { (a <= x) && (x < b) };
      return (within_interval ? high : low);
  }

GCC trunk yields with -O3 -march=znver2 the following assembly

  vanilla_bandpass(int, int, int, char const*, char const*):
          mov     rax, r8
          cmp     edi, edx
          jg      .L4
          cmp     edx, esi
          jge     .L4
          ret
  .L4:
          mov     rax, rcx
          ret

which is terrible, since it needs two conditional jumps. On the other hand, Clang emits

  vanilla_bandpass(int, int, int, char const*, char const*):
          cmp     edx, esi
          cmovge  r8, rcx
          cmp     edi, edx
          cmovg   r8, rcx
          mov     rax, r8
          ret

which is a lot better. There is also a branchless version, though I'm not
100% certain that it's free of UB:

  #include <cstdint>

  const char* funky_bandpass(int a, int b, int x, const char* low, const char* high)
  {
      const bool within_interval { (a <= x) && (x < b) };
      const auto low_ptr = reinterpret_cast<uintptr_t>(low) * (!within_interval);
      const auto high_ptr = reinterpret_cast<uintptr_t>(high) * within_interval;

      const auto ptr_sum = low_ptr + high_ptr;
      const auto* result = reinterpret_cast<const char*>(ptr_sum);
      return result;
  }

which yields

  funky_bandpass(int, int, int, char const*, char const*):
          cmp     edi, edx
          setle   al
          cmp     edx, esi
          setl    dl
          and     eax, edx
          mov     edx, eax
          xor     edx, 1
          movzx   edx, dl
          movzx   eax, al
          imul    rcx, rdx
          imul    rax, r8
          add     rax, rcx
          ret

which is jump-free and, as far as I can observe, runs as fast in practice as
Clang's assembly, but still looks needlessly complex. Clang compiles this code
to the same assembly as vanilla_bandpass.
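
To check that the two versions at least agree on observable results, a small
harness along the following lines could be used. This is only a sketch: the
helper count_mismatches and its tag strings are hypothetical names, not part of
the report, and agreement on one implementation does not settle whether
funky_bandpass is free of UB.

```cpp
#include <cassert>
#include <cstdint>

// Both functions are copied from the report above.
const char* vanilla_bandpass(int a, int b, int x, const char* low, const char* high)
{
    const bool within_interval { (a <= x) && (x < b) };
    return (within_interval ? high : low);
}

const char* funky_bandpass(int a, int b, int x, const char* low, const char* high)
{
    const bool within_interval { (a <= x) && (x < b) };
    const auto low_ptr = reinterpret_cast<std::uintptr_t>(low) * (!within_interval);
    const auto high_ptr = reinterpret_cast<std::uintptr_t>(high) * within_interval;
    return reinterpret_cast<const char*>(low_ptr + high_ptr);
}

// Hypothetical helper: counts the (a, b, x) triples in [lo, hi]^3 for which
// the two versions return different pointers.
int count_mismatches(int lo, int hi)
{
    static const char low_tag[] = "low";
    static const char high_tag[] = "high";
    int mismatches = 0;
    for (int a = lo; a <= hi; ++a)
        for (int b = lo; b <= hi; ++b)
            for (int x = lo; x <= hi; ++x)
                if (vanilla_bandpass(a, b, x, low_tag, high_tag)
                    != funky_bandpass(a, b, x, low_tag, high_tag))
                    ++mismatches;
    return mismatches;
}
```

Calling count_mismatches(-4, 4) from a test and asserting that it returns 0
covers empty intervals (b <= a) as well as x below, inside and above [a, b).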

Any chance of getting the optimizer ironed out for this?
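
For experimenting with this, the two conditional moves in Clang's output can
also be written out at the source level, roughly as below. The name
override_bandpass is hypothetical and not from the report, and I make no claim
about which instructions GCC selects for it; it is just a minimal sketch that
avoids the integer/pointer casts.

```cpp
// Hypothetical source-level mirror of Clang's assembly for vanilla_bandpass:
// start from `high` and override with `low` once per failed comparison.
const char* override_bandpass(int a, int b, int x, const char* low, const char* high)
{
    const char* result = high;          // assume x lies in [a, b)
    result = (x >= b) ? low : result;   // mirrors: cmp edx, esi / cmovge r8, rcx
    result = (a > x) ? low : result;    // mirrors: cmp edi, edx / cmovg  r8, rcx
    return result;
}
```

Each override corresponds to one half of the negated condition, so the result
is `high` only when both (a <= x) and (x < b) hold, same as vanilla_bandpass.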


Thread overview: 12+ messages
2020-04-16 11:28 [Bug tree-optimization/94617] New: Simple if condition not optimized soap at gentoo dot org
2020-04-16 11:57 ` [Bug tree-optimization/94617] " rguenth at gcc dot gnu.org
2020-04-16 12:02 ` soap at gentoo dot org
2020-04-16 12:10 ` rguenth at gcc dot gnu.org
2020-04-16 12:17 ` rguenth at gcc dot gnu.org
2020-04-16 12:18 ` soap at gentoo dot org
2020-04-16 12:47 ` jakub at gcc dot gnu.org
2020-04-16 13:11 ` rguenth at gcc dot gnu.org
2021-07-19  3:40 ` [Bug target/94617] " pinskia at gcc dot gnu.org
2022-11-26 20:43 ` pinskia at gcc dot gnu.org
2023-06-25 22:29 ` pinskia at gcc dot gnu.org
2023-06-25 22:45 ` pinskia at gcc dot gnu.org
