public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/95798] New: Initialization code --- suboptimal
@ 2020-06-21  7:16 zero at smallinteger dot com
  2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: zero at smallinteger dot com @ 2020-06-21  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

            Bug ID: 95798
           Summary: Initialization code --- suboptimal
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zero at smallinteger dot com
  Target Milestone: ---

Created attachment 48764
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48764&action=edit
sample code

This is similar to (but not the same as) bug 87223 for structs.  Further, this
bug expands on this issue for gcc 10.x.  Originally, this was noted in gcc
(Ubuntu 9.3.0-10ubuntu2) 9.3.0, compiling with -O3.

First, note the initialization code that trivially sets values to zero in an
array,

        mov     eax, edi
        sub     rsp, 8080
        xor     edx, edx
        and     eax, 127
        mov     QWORD PTR [rsp-120+rax*8], 0
        mov     QWORD PTR [rsp-112+rax*8], 0
        mov     QWORD PTR [rsp-104+rax*8], 0
        mov     QWORD PTR [rsp-96+rax*8], 0
        mov     QWORD PTR [rsp-88+rax*8], 0
        mov     QWORD PTR [rsp-80+rax*8], 0
        mov     QWORD PTR [rsp-72+rax*8], 0
        mov     QWORD PTR [rsp-64+rax*8], 0
        xor     eax, eax

would be better by first setting a register to zero, then writing the value of
the register.  Further, note that there is already a zero register available
(edx), but it is not used.  This is similar to 87223 for structs, and here the
issue manifests for arrays.

Second, using gcc 10 versions and -O3 at godbolt.org results in this code:

        mov     eax, edi
        mov     edx, edi
        sub     rsp, 8072
        and     eax, 127
        and     edx, 127
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+1]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+2]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+3]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+4]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+5]
        movsx   rdx, edx
        mov     QWORD PTR [rsp-120+rdx*8], 0
        lea     edx, [rax+6]
        add     eax, 7
        movsx   rdx, edx
        cdqe
        mov     QWORD PTR [rsp-120+rdx*8], 0
        xor     edx, edx
        mov     QWORD PTR [rsp-120+rax*8], 0
        xor     eax, eax

This is much, much more verbose than in gcc 9.3, for no apparent gain.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
@ 2020-06-21 16:57 ` zero at smallinteger dot com
  2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: zero at smallinteger dot com @ 2020-06-21 16:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

--- Comment #1 from zero at smallinteger dot com ---
(note that changing the array declaration to be initialized does not result in
the individual array writes being optimized away, as one might expect at first
glance)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
  2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
@ 2020-06-22 10:42 ` jakub at gcc dot gnu.org
  2020-06-22 10:44 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-06-22 10:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
            Summary|Initialization code ---     |[10/11 Regression]
                   |suboptimal                  |Initialization code ---
                   |                            |suboptimal
   Last reconfirmed|                            |2020-06-22
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rdapp at gcc dot gnu.org
   Target Milestone|---                         |10.2
     Ever confirmed|0                           |1

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The 9 -> 10 regression started with
r10-2806-gdf7d46d925c7baca7bf9961aee900876d8aef225
since which the IL is much larger and the resulting code less efficient.
The testcase as written is just weird, it is an expensive check whether the
program is called with multiple of 128 arguments >= 1024 arguments (otherwise
it invokes UB).

Adjusted testcase that is more meaningful:
void bar (unsigned long long *, int);

void
foo (int y, unsigned long long z)
{
  unsigned long long x[1024];
  unsigned long long i = y % 127;
  __builtin_memset (x, -1, sizeof (x));
  x[i] = 0;
  x[i + 1] = 0;
  x[i + 2] = 0;
  x[i + 3] = 0;
  x[i + 4] = 0;
  x[i + 5] = 0;
  x[i + 6] = 0;
  x[i + 7] = 0;
  bar (x, y);
}

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
  2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
  2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org
@ 2020-06-22 10:44 ` jakub at gcc dot gnu.org
  2020-06-22 13:05 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-06-22 10:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Perhaps the change should be guarded on single_use?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (2 preceding siblings ...)
  2020-06-22 10:44 ` jakub at gcc dot gnu.org
@ 2020-06-22 13:05 ` jakub at gcc dot gnu.org
  2020-07-23  6:51 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-06-22 13:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Partially related, using the following -O2 -fno-ipa-icf:
void
foo (int x, int *p)
{
  p[x + 1] = 1;
}

void
bar (int x, int *p)
{
  p[x + 1UL] = 1;
}

void
baz (int x, int *p)
{
  unsigned long l = x;
  l++;
  p[l] = 1;
}

void
qux (int x, int *p)
{
  unsigned long l = x + 1;
  p[l] = 1;
}
we get the same 3 insn functions for the first 3 cases and 4 insn for the last
one.  I'm surprised that we treat foo and qux differently, as x + 1 has
undefined overflow, so (unsigned long) (x + 1) can be implemented with x + 1UL
and when used in address arithmetics it should be beneficial like that (so
shall e.g. expansion optimize it, or ivopts, or isel)?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (3 preceding siblings ...)
  2020-06-22 13:05 ` jakub at gcc dot gnu.org
@ 2020-07-23  6:51 ` rguenth at gcc dot gnu.org
  2020-10-12 12:54 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-07-23  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.2                        |10.3

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.2 is released, adjusting target milestone.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (4 preceding siblings ...)
  2020-07-23  6:51 ` rguenth at gcc dot gnu.org
@ 2020-10-12 12:54 ` rguenth at gcc dot gnu.org
  2021-02-24 15:33 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-12 12:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (5 preceding siblings ...)
  2020-10-12 12:54 ` rguenth at gcc dot gnu.org
@ 2021-02-24 15:33 ` jakub at gcc dot gnu.org
  2021-02-25  9:26 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-24 15:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50249
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50249&action=edit
gcc11-pr95798.patch

Untested fix.
The #c4 qux case above is not fixed by it, but it isn't a regression, so I
think we should defer that one for GCC 12 (file a separate PR for that).
And, it would work (if done e.g. at expansion time) only when there is signed
integer overflow, so not for -fwrapv nor when e.g. the narrower type is
unsigned, so I think we need the single_use match.pd case because after those
changes, there is no way to undo that.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10/11 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (6 preceding siblings ...)
  2021-02-24 15:33 ` jakub at gcc dot gnu.org
@ 2021-02-25  9:26 ` cvs-commit at gcc dot gnu.org
  2021-02-25  9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-02-25  9:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:880682e7b2348d66f4089fa4af102b69eaaefbc2

commit r11-7384-g880682e7b2348d66f4089fa4af102b69eaaefbc2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Feb 25 10:22:53 2021 +0100

    match.pd: Use :s for (T)(A) + CST -> (T)(A + CST) [PR95798]

    The r10-2806 change regressed following testcases, instead of doing
    int -> unsigned long sign-extension once and then add 8, 16, ... 56 to it
    for each of the memory access, it adds 8, 16, ... 56 in int mode and then
    sign extends each.  So that means:
    +       movq    $0, (%rsp,%rax,8)
    +       leal    1(%rdx), %eax
    +       cltq
    +       movq    $1, (%rsp,%rax,8)
    +       leal    2(%rdx), %eax
    +       cltq
    +       movq    $2, (%rsp,%rax,8)
    +       leal    3(%rdx), %eax
    +       cltq
    +       movq    $3, (%rsp,%rax,8)
    +       leal    4(%rdx), %eax
    +       cltq
    +       movq    $4, (%rsp,%rax,8)
    +       leal    5(%rdx), %eax
    +       cltq
    +       movq    $5, (%rsp,%rax,8)
    +       leal    6(%rdx), %eax
    +       addl    $7, %edx
    +       cltq
    +       movslq  %edx, %rdx
    +       movq    $6, (%rsp,%rax,8)
    +       movq    $7, (%rsp,%rdx,8)
    -       movq    $0, (%rsp,%rdx,8)
    -       movq    $1, 8(%rsp,%rdx,8)
    -       movq    $2, 16(%rsp,%rdx,8)
    -       movq    $3, 24(%rsp,%rdx,8)
    -       movq    $4, 32(%rsp,%rdx,8)
    -       movq    $5, 40(%rsp,%rdx,8)
    -       movq    $6, 48(%rsp,%rdx,8)
    -       movq    $7, 56(%rsp,%rdx,8)
    GCC 9 -> 10 change or:
    -       movq    $0, (%rsp,%rdx,8)
    -       movq    $1, 8(%rsp,%rdx,8)
    -       movq    $2, 16(%rsp,%rdx,8)
    -       movq    $3, 24(%rsp,%rdx,8)
    -       movq    $4, 32(%rsp,%rdx,8)
    -       movq    $5, 40(%rsp,%rdx,8)
    -       movq    $6, 48(%rsp,%rdx,8)
    -       movq    $7, 56(%rsp,%rdx,8)
    +       movq    $0, (%rsp,%rax,8)
    +       leal    1(%rdx), %eax
    +       movq    $1, (%rsp,%rax,8)
    +       leal    2(%rdx), %eax
    +       movq    $2, (%rsp,%rax,8)
    +       leal    3(%rdx), %eax
    +       movq    $3, (%rsp,%rax,8)
    +       leal    4(%rdx), %eax
    +       movq    $4, (%rsp,%rax,8)
    +       leal    5(%rdx), %eax
    +       movq    $5, (%rsp,%rax,8)
    +       leal    6(%rdx), %eax
    +       movq    $6, (%rsp,%rax,8)
    +       leal    7(%rdx), %eax
    +       movq    $7, (%rsp,%rax,8)
    change on the other test.  While for the former case of
    int there is due to signed integer overflow (unless -fwrapv)
    the possibility to undo it e.g. during expansion, for the unsigned
    case information is unfortunately lost.

    The following patch adds :s to the convert which restores these
    testcases but keeps the testcases the patch meant to improve as is.

    2021-02-25  Jakub Jelinek  <jakub@redhat.com>

            PR target/95798
            * match.pd ((T)(A) + CST -> (T)(A + CST)): Add :s to convert.

            * gcc.target/i386/pr95798-1.c: New test.
            * gcc.target/i386/pr95798-2.c: New test.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (7 preceding siblings ...)
  2021-02-25  9:26 ` cvs-commit at gcc dot gnu.org
@ 2021-02-25  9:27 ` jakub at gcc dot gnu.org
  2021-04-08 12:02 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-02-25  9:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11 Regression]          |[10 Regression]
                   |Initialization code ---     |Initialization code ---
                   |suboptimal                  |suboptimal

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (8 preceding siblings ...)
  2021-02-25  9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org
@ 2021-04-08 12:02 ` rguenth at gcc dot gnu.org
  2022-06-28 10:41 ` jakub at gcc dot gnu.org
  2023-07-07  8:55 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-04-08 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.3                        |10.4

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10.3 is being released, retargeting bugs to GCC 10.4.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (9 preceding siblings ...)
  2021-04-08 12:02 ` rguenth at gcc dot gnu.org
@ 2022-06-28 10:41 ` jakub at gcc dot gnu.org
  2023-07-07  8:55 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/95798] [10 Regression] Initialization code --- suboptimal
  2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
                   ` (10 preceding siblings ...)
  2022-06-28 10:41 ` jakub at gcc dot gnu.org
@ 2023-07-07  8:55 ` rguenth at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-07  8:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95798

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
      Known to work|                            |11.1.0
   Target Milestone|10.5                        |11.0
             Status|ASSIGNED                    |RESOLVED
      Known to fail|                            |10.5.0

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed for GCC 11.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-07-07  8:55 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-21  7:16 [Bug target/95798] New: Initialization code --- suboptimal zero at smallinteger dot com
2020-06-21 16:57 ` [Bug target/95798] " zero at smallinteger dot com
2020-06-22 10:42 ` [Bug target/95798] [10/11 Regression] " jakub at gcc dot gnu.org
2020-06-22 10:44 ` jakub at gcc dot gnu.org
2020-06-22 13:05 ` jakub at gcc dot gnu.org
2020-07-23  6:51 ` rguenth at gcc dot gnu.org
2020-10-12 12:54 ` rguenth at gcc dot gnu.org
2021-02-24 15:33 ` jakub at gcc dot gnu.org
2021-02-25  9:26 ` cvs-commit at gcc dot gnu.org
2021-02-25  9:27 ` [Bug target/95798] [10 " jakub at gcc dot gnu.org
2021-04-08 12:02 ` rguenth at gcc dot gnu.org
2022-06-28 10:41 ` jakub at gcc dot gnu.org
2023-07-07  8:55 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).