public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2
@ 2023-06-14 11:01 david at westcontrol dot com
  2023-06-14 11:15 ` [Bug c/110249] " amonakov at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: david at westcontrol dot com @ 2023-06-14 11:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

            Bug ID: 110249
           Summary: __builtin_unreachable helps optimisation at -O1 but
                    not at -O2
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david at westcontrol dot com
  Target Milestone: ---

Sometimes it can be useful to use __builtin_unreachable() to give the compiler
hints that can improve optimisation.  For example, it can be used here to tell
the compiler that the parameter is always 8-byte aligned:

#include <stdint.h>
#include <string.h>

uint64_t read64(const uint64_t * p) {
    if ((uint64_t) p % 8 ) {
        __builtin_unreachable();
    }
     uint64_t value;
     memcpy( &value, p, sizeof(uint64_t) );
     return value;     
}

For some targets, such as 32-bit ARM and especially RISC-V, this can make a
difference to the generated code.  In testing, when given -O1 the compiler
takes advantage of the explicit undefined behaviour to see that the pointer is
aligned, and generates a single 64-bit load.  With -O2, however, it seems that
information is lost - perhaps due to earlier optimisation passes - and now slow
unaligned load code is generated.

Ideally, such optimisation information from undefined behaviour (explicit via a
builtin, or implicit via other code) should be kept - -O2 should have at least
as much information as -O1.  An alternative would be the addition of a more
directed "__builtin_assume" function that could be used.

(I know that in this particular case, __builtin_assume_aligned is available and
works exactly as intended at -O1 and -O2, but I think this is a more general
issue than just alignments.)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
@ 2023-06-14 11:15 ` amonakov at gcc dot gnu.org
  2023-06-14 11:38 ` david at westcontrol dot com
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: amonakov at gcc dot gnu.org @ 2023-06-14 11:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Please show 'gcc -v' for the case when you get one aligned load at -O1.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
  2023-06-14 11:15 ` [Bug c/110249] " amonakov at gcc dot gnu.org
@ 2023-06-14 11:38 ` david at westcontrol dot com
  2023-06-14 12:16 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: david at westcontrol dot com @ 2023-06-14 11:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

--- Comment #2 from David Brown <david at westcontrol dot com> ---
My apologies - it does not optimise the code to a single aligned load at -O1
(at least, not with the combinations I have now tried).  The code was
originally C++, with a reference, which /does/ exhibit this behaviour of having
better code at -O1 than -O2.  I had translated the reference to a pointer to
get more generic code that is valid as C and C++, but I don't see the same
effect with the pointer.

For a reference, the issue is as I first described:

#include <stdint.h>
#include <string.h>

uint64_t read64r(const uint64_t &x) {
    if ((uint64_t) &x % 8 ) {
        __builtin_unreachable();
    }
     uint64_t value;
     memcpy( &value, &x, sizeof(uint64_t) );
     return value;     
}

I tested with 64-bit RISC-V and 32-bit ARM using trunk (gcc 13.1, I think) on
godbolt.org.  The only relevant flag was the optimisation level.

<https://godbolt.org/z/TaqdqeGch>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
  2023-06-14 11:15 ` [Bug c/110249] " amonakov at gcc dot gnu.org
  2023-06-14 11:38 ` david at westcontrol dot com
@ 2023-06-14 12:16 ` rguenth at gcc dot gnu.org
  2023-06-14 12:43 ` david at westcontrol dot com
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-14 12:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
We could pattern-match this to

 p = __builtin_assume_aligned (p, 8);

which is btw the better way to write this.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug c/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
                   ` (2 preceding siblings ...)
  2023-06-14 12:16 ` rguenth at gcc dot gnu.org
@ 2023-06-14 12:43 ` david at westcontrol dot com
  2023-09-19 14:30 ` [Bug tree-optimization/110249] " cvs-commit at gcc dot gnu.org
  2023-09-19 15:04 ` amacleod at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: david at westcontrol dot com @ 2023-06-14 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

--- Comment #4 from David Brown <david at westcontrol dot com> ---
Yes, __builtin_assume_aligned is the best way to write things in this
particular case (and optimises fine for -O1 and -O2).  It is also clearer in
the source code (IMHO), as it shows the programmer's intention better.

I am just a little concerned that optimisation hints provided by the programmer
via __builtin_unreachable are getting lost at -O2.  When the compiler knows
that something cannot happen - whether by __builtin_unreachable or by other
code analysis - it can often use that information to generate more efficient
code.  If that information can be used for better code at -O1, but is lost at
-O2, then something is wrong in the way that information is collected or passed
on inside the compiler.  Any case of -O1 generating more efficient code than
-O2 is a sign of a problem or limitation.

So I am not particularly bothered about this one piece of code - I am reporting
the issue because it might be an indication of a wider problem.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
                   ` (3 preceding siblings ...)
  2023-06-14 12:43 ` david at westcontrol dot com
@ 2023-09-19 14:30 ` cvs-commit at gcc dot gnu.org
  2023-09-19 15:04 ` amacleod at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-09-19 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:bf6b107e2a342319b3787ec960fc8014ef3aff91

commit r14-4141-gbf6b107e2a342319b3787ec960fc8014ef3aff91
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Wed Sep 13 11:52:15 2023 -0400

    New early __builtin_unreachable processing.

    in VRP passes before __builtin_unreachable MUST be removed, only remove it
    if all exports affected by the unreachable can have global values updated,
and
    do not involve loads from memory.

            PR tree-optimization/110080
            PR tree-optimization/110249
            gcc/
            * tree-vrp.cc (remove_unreachable::final_p): New.
            (remove_unreachable::maybe_register): Rename from
            maybe_register_block and call early or final routine.
            (fully_replaceable): New.
            (remove_unreachable::handle_early): New.
            (remove_unreachable::remove_and_update_globals): Remove
            non-final processing.
            (rvrp_folder::rvrp_folder): Add final flag to constructor.
            (rvrp_folder::post_fold_bb): Remove unreachable registration.
            (rvrp_folder::pre_fold_stmt): Move unreachable processing to here.
            (execute_ranger_vrp): Adjust some call parameters.

            gcc/testsuite/
            * g++.dg/pr110249.C: New.
            * gcc.dg/pr110080.c: New.
            * gcc.dg/pr93917.c: Adjust.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/110249] __builtin_unreachable helps optimisation at -O1 but not at -O2
  2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
                   ` (4 preceding siblings ...)
  2023-09-19 14:30 ` [Bug tree-optimization/110249] " cvs-commit at gcc dot gnu.org
@ 2023-09-19 15:04 ` amacleod at redhat dot com
  5 siblings, 0 replies; 7+ messages in thread
From: amacleod at redhat dot com @ 2023-09-19 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |amacleod at redhat dot com
         Resolution|---                         |FIXED

--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
Fixed.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-09-19 15:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-14 11:01 [Bug c/110249] New: __builtin_unreachable helps optimisation at -O1 but not at -O2 david at westcontrol dot com
2023-06-14 11:15 ` [Bug c/110249] " amonakov at gcc dot gnu.org
2023-06-14 11:38 ` david at westcontrol dot com
2023-06-14 12:16 ` rguenth at gcc dot gnu.org
2023-06-14 12:43 ` david at westcontrol dot com
2023-09-19 14:30 ` [Bug tree-optimization/110249] " cvs-commit at gcc dot gnu.org
2023-09-19 15:04 ` amacleod at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).