public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable()
@ 2023-05-27 17:54 richard.yao at alumni dot stonybrook.edu
  2023-05-27 17:59 ` [Bug c/110007] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-27 17:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

            Bug ID: 110007
           Summary: Implement support for Clang’s
                    __builtin_unpredictable()
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: richard.yao at alumni dot stonybrook.edu
  Target Milestone: ---

Having the ability to specify __builtin_unpredictable() as a hint to encourage
the compiler to use cmov would be useful for implementing algorithms like
binary search that have unpredictable branches.
__builtin_expect_with_probability() looks like a possible substitute, but Clang
does not support it and it does not always work as described in #110001.

Combining this with C’s ternary operator should make the proposed cmov builtin
from #98801 unnecessary.

Note that while clang implements __builtin_unpredictable(), it does not
currently pass that information to the backend:

https://github.com/llvm/llvm-project/issues/62790#issuecomment-1552671099

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
@ 2023-05-27 17:59 ` pinskia at gcc dot gnu.org
  2023-05-27 18:01 ` pinskia at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 17:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note x86 backend still will reject if there are two cmovs close to another
because of some cores are much worse with 2 close to another another.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
  2023-05-27 17:59 ` [Bug c/110007] " pinskia at gcc dot gnu.org
@ 2023-05-27 18:01 ` pinskia at gcc dot gnu.org
  2023-05-27 18:19 ` richard.yao at alumni dot stonybrook.edu
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 18:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Yao from comment #0)
> Having the ability to specify __builtin_unpredictable() as a hint to
> encourage the compiler to use cmov would be useful for implementing
> algorithms like binary search that have unpredictable branches.
> __builtin_expect_with_probability() looks like a possible substitute, but
> Clang does not support it and it does not always work as described in
> #110001.

PR 110001 has nothing to do with __builtin_expect_with_probability and
__builtin_expect_with_probability will do exactly the same as
__builtin_unpredictable will do really.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
  2023-05-27 17:59 ` [Bug c/110007] " pinskia at gcc dot gnu.org
  2023-05-27 18:01 ` pinskia at gcc dot gnu.org
@ 2023-05-27 18:19 ` richard.yao at alumni dot stonybrook.edu
  2023-05-27 18:25 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-27 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #3 from Richard Yao <richard.yao at alumni dot stonybrook.edu> ---
(In reply to Andrew Pinski from comment #2)
> (In reply to Richard Yao from comment #0)
> > Having the ability to specify __builtin_unpredictable() as a hint to
> > encourage the compiler to use cmov would be useful for implementing
> > algorithms like binary search that have unpredictable branches.
> > __builtin_expect_with_probability() looks like a possible substitute, but
> > Clang does not support it and it does not always work as described in
> > #110001.
> 
> PR 110001 has nothing to do with __builtin_expect_with_probability and

I mentioned it briefly in PR 110001, but I guess I should make it explicit. See
line 58 here:

https://gcc.godbolt.org/z/ef3Yfchzv

That is made into a jump, when all that is necessary is cmov, which is what
Clang generates. In the example I had posted in PR 110001, I had not used
__builtin_expect_with_probability() because it made no difference, but here I
am using it to show that cmov is not used.

Earlier in the output, GCC 12.3 uses cmov for every 3rd instruction, so I
suspect that this is not a case of two cmovs being too close to each other.

Should I file another issue explicitly for that so this could be left for the
request that we have __builtin_unpredictable()?

> __builtin_expect_with_probability will do exactly the same as
> __builtin_unpredictable will do really.

__builtin_unpredictable() is easier to read than
__builtin_expect_with_probability() and it would be nice for both Clang and GCC
to support the same builtins.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (2 preceding siblings ...)
  2023-05-27 18:19 ` richard.yao at alumni dot stonybrook.edu
@ 2023-05-27 18:25 ` pinskia at gcc dot gnu.org
  2023-05-27 20:26 ` richard.yao at alumni dot stonybrook.edu
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-27 18:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Yao from comment #3)
> (In reply to Andrew Pinski from comment #2)
> > (In reply to Richard Yao from comment #0)
> > > Having the ability to specify __builtin_unpredictable() as a hint to
> > > encourage the compiler to use cmov would be useful for implementing
> > > algorithms like binary search that have unpredictable branches.
> > > __builtin_expect_with_probability() looks like a possible substitute, but
> > > Clang does not support it and it does not always work as described in
> > > #110001.
> > 
> > PR 110001 has nothing to do with __builtin_expect_with_probability and
> 
> I mentioned it briefly in PR 110001, but I guess I should make it explicit.
> See line 58 here:
> 
> https://gcc.godbolt.org/z/ef3Yfchzv
> 
> That is made into a jump, when all that is necessary is cmov, which is what
> Clang generates. In the example I had posted in PR 110001, I had not used
> __builtin_expect_with_probability() because it made no difference, but here
> I am using it to show that cmov is not used.



If we look at the -fdump-tree-optimized-lineno dump we see why it was not
turned into a cmov:
  [/app/example.c:58:12 discrim 2] if (a_111 <= key_128(D))
    goto <bb 28>; [50.00%]
  else
    goto <bb 23>; [50.00%]

  <bb 23> [local count: 447392427]:
  [/app/example.c:75:40] _268 = (long unsigned int) i_82;
  [/app/example.c:75:40] _271 = _268 * 4;
  [/app/example.c:75:34] _274 = array_89(D) + _271;
  [/app/example.c:5:6] pretmp_277 = *_274;
  goto <bb 28>; [100.00%]


There is a load moved inside the conditional.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (3 preceding siblings ...)
  2023-05-27 18:25 ` pinskia at gcc dot gnu.org
@ 2023-05-27 20:26 ` richard.yao at alumni dot stonybrook.edu
  2023-05-28  5:42 ` amonakov at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-27 20:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #5 from Richard Yao <richard.yao at alumni dot stonybrook.edu> ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Richard Yao from comment #3)
> > (In reply to Andrew Pinski from comment #2)
> > > (In reply to Richard Yao from comment #0)
> > > > Having the ability to specify __builtin_unpredictable() as a hint to
> > > > encourage the compiler to use cmov would be useful for implementing
> > > > algorithms like binary search that have unpredictable branches.
> > > > __builtin_expect_with_probability() looks like a possible substitute, but
> > > > Clang does not support it and it does not always work as described in
> > > > #110001.
> > > 
> > > PR 110001 has nothing to do with __builtin_expect_with_probability and
> > 
> > I mentioned it briefly in PR 110001, but I guess I should make it explicit.
> > See line 58 here:
> > 
> > https://gcc.godbolt.org/z/ef3Yfchzv
> > 
> > That is made into a jump, when all that is necessary is cmov, which is what
> > Clang generates. In the example I had posted in PR 110001, I had not used
> > __builtin_expect_with_probability() because it made no difference, but here
> > I am using it to show that cmov is not used.
> 
> 
> 
> If we look at the -fdump-tree-optimized-lineno dump we see why it was not
> turned into a cmov:
>   [/app/example.c:58:12 discrim 2] if (a_111 <= key_128(D))
>     goto <bb 28>; [50.00%]
>   else
>     goto <bb 23>; [50.00%]
> 
>   <bb 23> [local count: 447392427]:
>   [/app/example.c:75:40] _268 = (long unsigned int) i_82;
>   [/app/example.c:75:40] _271 = _268 * 4;
>   [/app/example.c:75:34] _274 = array_89(D) + _271;
>   [/app/example.c:5:6] pretmp_277 = *_274;
>   goto <bb 28>; [100.00%]
> 
> 
> There is a load moved inside the conditional.

Is executing an unpredictable branch cheaper than executing a redundant load?

This a patch against the assembly output from GCC 12.3 modifies this to use
cmov:

diff --git a/out.s b/out.s
index d796087..f0f009c 100644
--- a/out.s
+++ b/out.s
@@ -317,15 +317,11 @@ custom_binary_search_fast:
        cmovle  %ecx, %edx
 .L43:
        leal    1(%rdx), %ecx
-       movq    %rcx, %rax
-       movl    (%rdi,%rcx,4), %ecx
-       cmpl    %esi, %ecx
-       jle     .L15
+       cmpl    %esi, (%rdi,%rcx,4)
+       cmovle  %ecx, %edx
 .L25:
        movl    %edx, %eax
        movl    (%rdi,%rax,4), %ecx
-       movl    %edx, %eax
-.L15:
        cmpl    %ecx, %esi
        setl    %cl
        setg    %dl

Micro-benchmarking the two suggests that the answer is yes on Zen 3, although I
do not understand why.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (4 preceding siblings ...)
  2023-05-27 20:26 ` richard.yao at alumni dot stonybrook.edu
@ 2023-05-28  5:42 ` amonakov at gcc dot gnu.org
  2023-05-28 16:21 ` hubicka at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-05-28  5:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #6 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Are you sure the branch is unpredictable in your micro-benchmark? If you have
repeated runs you'll train the predictors.

Implementing a __builtin_branchless_select would address such needs more
directly. There were similar requests in the past, two of them related to qsort
and bsearch, unsurprisingly: PR 93165, PR 97734, PR 106804.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (5 preceding siblings ...)
  2023-05-28  5:42 ` amonakov at gcc dot gnu.org
@ 2023-05-28 16:21 ` hubicka at gcc dot gnu.org
  2023-05-28 21:37 ` richard.yao at alumni dot stonybrook.edu
  2023-05-30  7:49 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: hubicka at gcc dot gnu.org @ 2023-05-28 16:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu.org

--- Comment #7 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Also note that branch predicted with 50% outcome is not necessarily
unpredictable for example in this:

for (int i = 0; i < 10000; i++)
  if (i&1)
     ....

I would expect branch predictor to work this out on modern systems.
So having explicit flag in branch_probability that given probability is hard
for CPU to predict would make sense and I was thinking we may try to get this
info from auto-fdo eventually too.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (6 preceding siblings ...)
  2023-05-28 16:21 ` hubicka at gcc dot gnu.org
@ 2023-05-28 21:37 ` richard.yao at alumni dot stonybrook.edu
  2023-05-30  7:49 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: richard.yao at alumni dot stonybrook.edu @ 2023-05-28 21:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

--- Comment #8 from Richard Yao <richard.yao at alumni dot stonybrook.edu> ---
(In reply to Alexander Monakov from comment #6)
> Are you sure the branch is unpredictable in your micro-benchmark? If you
> have repeated runs you'll train the predictors.

(In reply to Jan Hubicka from comment #7)
> Also note that branch predicted with 50% outcome is not necessarily
> unpredictable for example in this:
> 
> for (int i = 0; i < 10000; i++)
>   if (i&1)
>      ....
> 
> I would expect branch predictor to work this out on modern systems.
> So having explicit flag in branch_probability that given probability is hard
> for CPU to predict would make sense and I was thinking we may try to get
> this info from auto-fdo eventually too.

Good point. I had reused an existing micro-benchmark, but it is using libc's
srand(), which is known for not having great quality RNG. It is quite possible
that the last branch really is predictable because of that. Having only 1
unpredictable branch is not that terrible, so I probably will defer looking
into this further to a future date.

(In reply to Alexander Monakov from comment #6)
> Implementing a __builtin_branchless_select would address such needs more
> directly. There were similar requests in the past, two of them related to
> qsort and bsearch, unsurprisingly: PR 93165, PR 97734, PR 106804.

As a developer that works on a project that supports GCC and Clang equally, but
must support older versions of GCC longer, I would like to see both GCC and
Clang adopt each others' builtins. That way, I need to implement fewer
compatibility shims to support both compilers and what compatibility shims I do
need can be dropped sooner (maybe after 10 years).

I am not against the new builtin, but I would like to also have
__builtin_unpredictable(). It would be consistent with the existing
likely/unlikely macros that my project uses, which should mean other developers
will not have to learn the specifics of how predication works to read code
using it, since they can just treat it as a compiler hint and read things as if
it were not there unless they have some reason to reason more deeply about
things.

I could just do a macro around __builtin_expect_with_probability(), but I
greatly prefer __builtin_unpredictable() since people reading the code do not
need to decipher the argument list to understand what it does since the name is
self documenting.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug c/110007] Implement support for Clang’s __builtin_unpredictable()
  2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
                   ` (7 preceding siblings ...)
  2023-05-28 21:37 ` richard.yao at alumni dot stonybrook.edu
@ 2023-05-30  7:49 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110007

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |14.0
           Severity|normal                      |enhancement

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-05-30  7:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-27 17:54 [Bug c/110007] New: Implement support for Clang’s __builtin_unpredictable() richard.yao at alumni dot stonybrook.edu
2023-05-27 17:59 ` [Bug c/110007] " pinskia at gcc dot gnu.org
2023-05-27 18:01 ` pinskia at gcc dot gnu.org
2023-05-27 18:19 ` richard.yao at alumni dot stonybrook.edu
2023-05-27 18:25 ` pinskia at gcc dot gnu.org
2023-05-27 20:26 ` richard.yao at alumni dot stonybrook.edu
2023-05-28  5:42 ` amonakov at gcc dot gnu.org
2023-05-28 16:21 ` hubicka at gcc dot gnu.org
2023-05-28 21:37 ` richard.yao at alumni dot stonybrook.edu
2023-05-30  7:49 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).