public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10
@ 2020-08-11 17:45 remi.andruccioli at gmail dot com
  2020-08-11 18:22 ` [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3 jakub at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: remi.andruccioli at gmail dot com @ 2020-08-11 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

            Bug ID: 96573
           Summary: [Regression] Regression in optimization on x86-64 with
                    -O3 from GCC 9 to 10
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: remi.andruccioli at gmail dot com
  Target Milestone: ---

Created attachment 49046
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49046&action=edit
This file contains the source code of the function described in the report.

Hello,

I'd like to describe here what seems to be a regression/missed optimization
since GCC 10.

The host and target architecture is x86-64.
I'm using Fedora 32 with Linux 5.7, but I could reproduce it on many other
Linux platforms.
I'm using GCC 10.2, but I could reproduce it with any GCC 10 minor version.


Here is a function, written in pure ANSI C99:

#include <stdlib.h>
#include <stdint.h>

void *
ReverseBytesOfPointer(void * const pointer)
{
  const size_t maxIndex = sizeof(pointer) - 1;
  const uint8_t * const oldPointerPointer = (uint8_t*)&pointer;
  void *newPointer;
  uint8_t * const newPointerPointer = (uint8_t *)&newPointer;
  uint8_t i;

  for (i = 0; i <= maxIndex; ++i) {
    newPointerPointer[maxIndex - i] = oldPointerPointer[i];
  }

  return newPointer;
}


What this function does is simply to reverse all the bytes of a pointer. It is
written in pure C99 and is extremely portable, as it works from 16-bit to
64-bit machines. (I wrote it and use it for embedded development and I'm happy
with it).

What makes it magical is that when compiled with -O3 (and -std=c99 -Wall
-Werror -Wextra), GCC 9 (yes, 9) is clever enough to deduce the intent of this
function and compiles it all as:

ReverseBytesOfPointer:
        mov     rax, rdi
        bswap   rax
        ret


However, since GCC 10, the magic seems to have disappeared. This is the ASM
code that is generated now, with the exact same command line invocation:

ReverseBytesOfPointer:
        movq    %rdi, %rax
        movb    %dil, -1(%rsp)
        movzbl  %ah, %edx
        shrq    $56, %rax
        movb    %dl, -2(%rsp)
        movq    %rdi, %rdx
        shrq    $16, %rdx
        movb    %al, -8(%rsp)
        movb    %dl, -3(%rsp)
        movq    %rdi, %rdx
        shrq    $24, %rdx
        movb    %dl, -4(%rsp)
        movq    %rdi, %rdx
        shrq    $32, %rdx
        movb    %dl, -5(%rsp)
        movq    %rdi, %rdx
        shrq    $40, %rdx
        movb    %dl, -6(%rsp)
        movq    %rdi, %rdx
        shrq    $48, %rdx
        movb    %dl, -7(%rsp)
        movq    -8(%rsp), %rax
        ret


For your convenience, I include here a link to a snippet on compiler-explorer
to show the comparison between versions 9.3 and 10.2. This snippet also
includes a few unit tests:

https://godbolt.org/z/YG1KPf


Regards,

Remi Andruccioli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
@ 2020-08-11 18:22 ` jakub at gcc dot gnu.org
  2020-08-12 15:42 ` remi.andruccioli at gmail dot com
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-08-11 18:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-08-11
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |10.3
                 CC|                            |jakub at gcc dot gnu.org
          Component|c                           |tree-optimization
            Summary|[Regression] Regression in  |[10/11 Regression]
                   |optimization on x86-64 with |Regression in optimization
                   |-O3 from GCC 9 to 10        |on x86-64 with -O3

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Stopped being recognized as bswap with r274922.
That changed the IL before store-merging from
  _22 = MEM[(const uint8_t *)&pointer];
  MEM[(uint8_t *)&newPointer + 7B] = _22;
  _33 = MEM[(const uint8_t *)&pointer + 1B];
  MEM[(uint8_t *)&newPointer + 6B] = _33;
  _44 = MEM[(const uint8_t *)&pointer + 2B];
  MEM[(uint8_t *)&newPointer + 5B] = _44;
  _55 = MEM[(const uint8_t *)&pointer + 3B];
  MEM[(uint8_t *)&newPointer + 4B] = _55;
  _66 = MEM[(const uint8_t *)&pointer + 4B];
  MEM[(uint8_t *)&newPointer + 3B] = _66;
  _77 = MEM[(const uint8_t *)&pointer + 5B];
  MEM[(uint8_t *)&newPointer + 2B] = _77;
  _88 = MEM[(const uint8_t *)&pointer + 6B];
  MEM[(uint8_t *)&newPointer + 1B] = _88;
  _5 = MEM[(const uint8_t *)&pointer + 7B];
  MEM[(uint8_t *)&newPointer] = _5;
  _9 = newPointer;
  newPointer ={v} {CLOBBER};
  return _9;
to:
  _22 = BIT_FIELD_REF <pointer_36(D), 8, 0>;
  MEM[(uint8_t *)&newPointer + 7B] = _22;
  _33 = BIT_FIELD_REF <pointer_36(D), 8, 8>;
  MEM[(uint8_t *)&newPointer + 6B] = _33;
  _44 = BIT_FIELD_REF <pointer_36(D), 8, 16>;
  MEM[(uint8_t *)&newPointer + 5B] = _44;
  _55 = BIT_FIELD_REF <pointer_36(D), 8, 24>;
  MEM[(uint8_t *)&newPointer + 4B] = _55;
  _66 = BIT_FIELD_REF <pointer_36(D), 8, 32>;
  MEM[(uint8_t *)&newPointer + 3B] = _66;
  _77 = BIT_FIELD_REF <pointer_36(D), 8, 40>;
  MEM[(uint8_t *)&newPointer + 2B] = _77;
  _88 = BIT_FIELD_REF <pointer_36(D), 8, 48>;
  MEM[(uint8_t *)&newPointer + 1B] = _88;
  _5 = BIT_FIELD_REF <pointer_36(D), 8, 56>;
  MEM[(uint8_t *)&newPointer] = _5;
  _9 = newPointer;
  newPointer ={v} {CLOBBER};
  return _9;
store merging has code to handle this, but punts on non-integral types
(POINTER_TYPE_P in this case).
Just changing:
--- gcc/gimple-ssa-store-merging.c.jj   2020-07-28 15:39:09.909757589 +0200
+++ gcc/gimple-ssa-store-merging.c      2020-08-11 20:19:19.293299266 +0200
@@ -333,7 +333,8 @@ init_symbolic_number (struct symbolic_nu
 {
   int size;

-  if (! INTEGRAL_TYPE_P (TREE_TYPE (src)))
+  if (! INTEGRAL_TYPE_P (TREE_TYPE (src))
+      && ! POINTER_TYPE_P (TREE_TYPE (src)))
     return false;

   n->base_addr = n->offset = n->alias_set = n->vuse = NULL_TREE;
seems to work, though I must say I don't find much rationale for byte-swapping
pointers...

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
  2020-08-11 18:22 ` [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3 jakub at gcc dot gnu.org
@ 2020-08-12 15:42 ` remi.andruccioli at gmail dot com
  2020-08-25  8:44 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: remi.andruccioli at gmail dot com @ 2020-08-12 15:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

--- Comment #2 from Remi Andruccioli <remi.andruccioli at gmail dot com> ---
Hello,

Thanks for your quick reply and investigations.

I agree this is quite an uncommon need. I use it when I have to deal with
endianness and I/O. So I mainly use it when dumping content and variables from
the memory to files on the disk in a constant endianness across all the
different target machines, for later analysis. (same idea than what is done
with the endianness of network)


Regards,

Remi Andruccioli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
  2020-08-11 18:22 ` [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3 jakub at gcc dot gnu.org
  2020-08-12 15:42 ` remi.andruccioli at gmail dot com
@ 2020-08-25  8:44 ` rguenth at gcc dot gnu.org
  2021-03-31  9:13 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-08-25  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (2 preceding siblings ...)
  2020-08-25  8:44 ` rguenth at gcc dot gnu.org
@ 2021-03-31  9:13 ` rguenth at gcc dot gnu.org
  2021-04-01  8:52 ` cvs-commit at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-31  9:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
See also PR96135

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (3 preceding siblings ...)
  2021-03-31  9:13 ` rguenth at gcc dot gnu.org
@ 2021-04-01  8:52 ` cvs-commit at gcc dot gnu.org
  2021-04-01  9:22 ` [Bug tree-optimization/96573] [10 " jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-01  8:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5b9a65ecbeb22ef6dd3344baae97f85b645522e3

commit r11-7946-g5b9a65ecbeb22ef6dd3344baae97f85b645522e3
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Apr 1 10:51:03 2021 +0200

    bswap: Handle bswapping of pointers [PR96573]

    In GCC8/9 we used to optimize this into a bswap, but we no longer do.
    Handling byteswapping of pointers is easy, all we need is to allow them,
    for the __builtin_bswap* we already use TYPE_PRECISION to determine
    the precision and we cast the operand and result to the correct type
    if they aren't uselessly convertible to what the builtin expects.

    2021-04-01  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/96573
            * gimple-ssa-store-merging.c (init_symbolic_number): Handle
            also pointer types.

            * gcc.dg/pr96573.c: New test.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (4 preceding siblings ...)
  2021-04-01  8:52 ` cvs-commit at gcc dot gnu.org
@ 2021-04-01  9:22 ` jakub at gcc dot gnu.org
  2021-04-04 19:42 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-04-01  9:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11 Regression]          |[10 Regression] Regression
                   |Regression in optimization  |in optimization on x86-64
                   |on x86-64 with -O3          |with -O3

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (5 preceding siblings ...)
  2021-04-01  9:22 ` [Bug tree-optimization/96573] [10 " jakub at gcc dot gnu.org
@ 2021-04-04 19:42 ` pinskia at gcc dot gnu.org
  2021-04-06 10:46 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-04-04 19:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This test fails on aarch64-linux-gnu:
FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (6 preceding siblings ...)
  2021-04-04 19:42 ` pinskia at gcc dot gnu.org
@ 2021-04-06 10:46 ` cvs-commit at gcc dot gnu.org
  2021-04-08 12:02 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-06 10:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:bfeb36bd03c2168af263daa13370a20a96c42b5d

commit r11-8002-gbfeb36bd03c2168af263daa13370a20a96c42b5d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Apr 6 12:44:51 2021 +0200

    testsuite: Fix up pr96573.c on aarch64 [PR96573]

    On Thu, Apr 01, 2021 at 02:16:55PM +0100, Alex Coplan via Gcc-patches
wrote:
    > FYI, I'm seeing the new test failing on aarch64:
    >
    > PASS: gcc.dg/pr96573.c (test for excess errors)
    > FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap"

    The vectorizer in the aarch64 case manages to emit a VEC_PERM_EXPR instead
    (which is just as efficient).

    So, do we want to go for the following (and/or perhaps also restrict the
test to
    a couple of targets where it works?  In my last distro build it failed only
    on aarch64-linux, while armv7hl-linux-gnueabi and
    {i686,x86_64,powerpc64le,s390x}-linux were fine)?

    2021-04-06  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/96573
            * gcc.dg/pr96573.c: Instead of __builtin_bswap accept also
            VEC_PERM_EXPR with bswapping permutation.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (7 preceding siblings ...)
  2021-04-06 10:46 ` cvs-commit at gcc dot gnu.org
@ 2021-04-08 12:02 ` rguenth at gcc dot gnu.org
  2022-06-28 10:41 ` jakub at gcc dot gnu.org
  2023-07-07  9:03 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-04-08 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.3                        |10.4

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.3 is being released, retargeting bugs to GCC 10.4.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (8 preceding siblings ...)
  2021-04-08 12:02 ` rguenth at gcc dot gnu.org
@ 2022-06-28 10:41 ` jakub at gcc dot gnu.org
  2023-07-07  9:03 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/96573] [10 Regression] Regression in optimization on x86-64 with -O3
  2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
                   ` (9 preceding siblings ...)
  2022-06-28 10:41 ` jakub at gcc dot gnu.org
@ 2023-07-07  9:03 ` rguenth at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-07  9:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96573

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
      Known to work|                            |11.0
   Target Milestone|10.5                        |11.0
      Known to fail|                            |10.5.0

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed in GCC 11.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-07-07  9:03 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-11 17:45 [Bug c/96573] New: [Regression] Regression in optimization on x86-64 with -O3 from GCC 9 to 10 remi.andruccioli at gmail dot com
2020-08-11 18:22 ` [Bug tree-optimization/96573] [10/11 Regression] Regression in optimization on x86-64 with -O3 jakub at gcc dot gnu.org
2020-08-12 15:42 ` remi.andruccioli at gmail dot com
2020-08-25  8:44 ` rguenth at gcc dot gnu.org
2021-03-31  9:13 ` rguenth at gcc dot gnu.org
2021-04-01  8:52 ` cvs-commit at gcc dot gnu.org
2021-04-01  9:22 ` [Bug tree-optimization/96573] [10 " jakub at gcc dot gnu.org
2021-04-04 19:42 ` pinskia at gcc dot gnu.org
2021-04-06 10:46 ` cvs-commit at gcc dot gnu.org
2021-04-08 12:02 ` rguenth at gcc dot gnu.org
2022-06-28 10:41 ` jakub at gcc dot gnu.org
2023-07-07  9:03 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).