public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu
@ 2024-04-01 19:34 zhendong.su at inf dot ethz.ch
  2024-04-01 19:38 ` [Bug tree-optimization/114552] " pinskia at gcc dot gnu.org
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: zhendong.su at inf dot ethz.ch @ 2024-04-01 19:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

            Bug ID: 114552
           Summary: wrong code at -O1 and above on x86_64-linux-gnu
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhendong.su at inf dot ethz.ch
  Target Milestone: ---

It is a regression from 12.3, and affects 13.* and the trunk. 

Compiler Explorer: https://godbolt.org/z/bznEs9bao

[656] % gcctk -v
Using built-in specs.
COLLECT_GCC=gcctk
COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/14.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-trunk/configure --disable-bootstrap
--enable-checking=yes --prefix=/local/suz-local/software/local/gcc-trunk
--enable-sanitizers --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240401 (experimental) (GCC) 
[657] % 
[657] % gcctk -O0 small.c; ./a.out
[658] % 
[658] % gcctk -O1 small.c
[659] % ./a.out
Segmentation fault
[660] % 
[660] % cat small.c
#pragma pack(1)
struct a {
  short b;
  int c;
};
struct d {
  struct a b;
  int e;
};
short f, g, h;
int i, j, o;
const struct d k = {{1,0},0};
short *l = &h, *m = &g, *n = &f;
void p() { *n = (o > 0) > o; }
void q(struct a n) {
  j = 0;
  for (; j < 5; j++)
    i = *m = *l != 0;
  p();
}
int main() {
  q(k.b);
  return 0;
}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/114552] wrong code at -O1 and above on x86_64-linux-gnu
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
@ 2024-04-01 19:38 ` pinskia at gcc dot gnu.org
  2024-04-01 20:36 ` [Bug middle-end/114552] [13/14 Regression] " pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-01 19:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Not this might not be a bug as pragma pack(1) changes the alignment of the
fields.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
  2024-04-01 19:38 ` [Bug tree-optimization/114552] " pinskia at gcc dot gnu.org
@ 2024-04-01 20:36 ` pinskia at gcc dot gnu.org
  2024-04-02 11:33 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-04-01 20:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |13.3
   Last reconfirmed|                            |2024-04-01
            Summary|wrong code at -O1 and above |[13/14 Regression] wrong
                   |on x86_64-linux-gnu         |code at -O1 and above on
                   |                            |x86_64-linux-gnu
          Component|tree-optimization           |middle-end
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-linux-gnu
           Keywords|                            |needs-bisection, wrong-code
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Better reduced testcase, removing the pragma and putting the packed on the
struct that is causing the issue:
```
struct a {
  short b;
  int c;
}__attribute__((packed));
struct d {
  struct a b;
  int e;
};
static const struct d k = {{1,0},0};
[[gnu::noinline]]
void p() { asm("":::"memory"); }
[[gnu::noinline]]
void q(struct a n) {
  p();
}
int main() {
  q(k.b);
  return 0;
}
```

The bug is in the middle-end expanding the call `q(k.b);` such that k.b is on
the stack but it messes up and uses the wrong size or something.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
  2024-04-01 19:38 ` [Bug tree-optimization/114552] " pinskia at gcc dot gnu.org
  2024-04-01 20:36 ` [Bug middle-end/114552] [13/14 Regression] " pinskia at gcc dot gnu.org
@ 2024-04-02 11:33 ` rguenth at gcc dot gnu.org
  2024-04-02 13:29 ` [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990 jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-04-02 11:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |14.0
           Priority|P3                          |P2

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (2 preceding siblings ...)
  2024-04-02 11:33 ` rguenth at gcc dot gnu.org
@ 2024-04-02 13:29 ` jakub at gcc dot gnu.org
  2024-04-02 13:46 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-02 13:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14 Regression] wrong    |[13/14 Regression] wrong
                   |code at -O1 and above on    |code at -O1 and above on
                   |x86_64-linux-gnu            |x86_64-linux-gnu since
                   |                            |r13-990
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |sayle at gcc dot gnu.org
           Keywords|needs-bisection             |

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Further cleaned up testcase:

struct __attribute__((packed)) S { short b; int c; };
struct T { struct S b; int e; };
static const struct T k = { { 1, 0 }, 0 };

__attribute__((noinline)) void
foo (void)
{
  asm ("" : : : "memory");
}

__attribute__((noinline)) void
bar (struct S n)
{
  foo ();
}

int
main ()
{
  bar (k.b);
  return 0;
}

Started with r13-990-ged6fd2aed58f2cca99f15331bf68999c0e6df370

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (3 preceding siblings ...)
  2024-04-02 13:29 ` [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990 jakub at gcc dot gnu.org
@ 2024-04-02 13:46 ` jakub at gcc dot gnu.org
  2024-04-02 14:18 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-02 13:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The r13-989 to r13-990 difference is
-       movl    k(%rip), %eax
-       movl    %eax, (%rsp)
-       movzwl  k+4(%rip), %eax
-       movw    %ax, 4(%rsp)
+       movw    $1, (%rsp)
+       movl    $0, 2(%rsp)
+       movl    $0, 8(%rsp)
which clearly shows that previously it was copying just 6 bytes from k to
%rsp+0,
while now it directly sets those 6 bytes at %rsp+0 + another 4 bytes at %rsp+8.
The last insn is incorrect, shouldn't be there.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (4 preceding siblings ...)
  2024-04-02 13:46 ` jakub at gcc dot gnu.org
@ 2024-04-02 14:18 ` jakub at gcc dot gnu.org
  2024-04-02 14:49 ` roger at nextmovesoftware dot com
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-02 14:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 57850
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57850&action=edit
gcc14-pr114552.patch

Untested fix.
Note, perhaps for GCC 15 we could support even non-matching sizes, as long as
we'd create a CONSTRUCTOR for just the needed bytes.  But, in that case we
wouldn't have to require that it is loaded from the beginning of a .rodata MEM.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (5 preceding siblings ...)
  2024-04-02 14:18 ` jakub at gcc dot gnu.org
@ 2024-04-02 14:49 ` roger at nextmovesoftware dot com
  2024-04-03  8:01 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: roger at nextmovesoftware dot com @ 2024-04-02 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com

--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
Many thanks Jakub, and my apologies for the breakage/inconvenience.  It looks
like sizeof(k) is 10 bytes, and sizeof(k.b) is 6 bytes, and somehow this code
is getting the constructor for "k" and not for just "k.b".  This is, of course,
fine for memcpy as it can move the just the pieces it wants.  I completely
agree that the safe fix is to check that the sizes match; I don't think I ever
considered that they might not be identical when I wrote this code, or assumed
that partial would be non-zero for this case].

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (6 preceding siblings ...)
  2024-04-02 14:49 ` roger at nextmovesoftware dot com
@ 2024-04-03  8:01 ` cvs-commit at gcc dot gnu.org
  2024-04-03 14:32 ` [Bug middle-end/114552] [13 " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-04-03  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:03039744f368a24a452e4ea8d946e9c2cedaf1aa

commit r14-9768-g03039744f368a24a452e4ea8d946e9c2cedaf1aa
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Apr 3 09:59:45 2024 +0200

    expr: Fix up emit_push_insn [PR114552]

    r13-990 added optimizations in multiple spots to optimize during
    expansion storing of constant initializers into targets.
    In the load_register_parameters and expand_expr_real_1 cases,
    it checks it has a tree as the source and so knows we are reading
    that whole decl's value, so the code is fine as is, but in the
    emit_push_insn case it checks for a MEM from which something
    is pushed and checks for SYMBOL_REF as the MEM's address, but
    still assumes the whole object is copied, which as the following
    testcase shows might not always be the case.  In the testcase,
    k is 6 bytes, then 2 bytes of padding, then another 4 bytes,
    while the emit_push_insn wants to store just the 6 bytes.

    The following patch simply verifies it is the whole initializer
    that is being stored, I think that is best thing to do so late
    in GCC 14 cycle as well for backporting.

    For GCC 15, perhaps the code could stop requiring it must be at offset
zero,
    nor that the size is equal, but could use
    get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually
    extract just part of the initializer if we e.g. push just some subset
    (of course, still verify that it is a subset).  For sizes which are power
    of two bytes and we have some integer modes, we could use as type for
    fold_ctor_reference corresponding integral types, otherwise dunno, punt
    or use some structure (e.g. try to find one in the initializer?), whatever.
    But even in the other spots it could perhaps handle loading of
    COMPONENT_REFs or MEM_REFs from the .rodata vars.

    2024-04-03  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/114552
            * expr.cc (emit_push_insn): Only use store_constructor for
            immediate_const_ctor_p if int_expr_size matches size.

            * gcc.c-torture/execute/pr114552.c: New test.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (7 preceding siblings ...)
  2024-04-03  8:01 ` cvs-commit at gcc dot gnu.org
@ 2024-04-03 14:32 ` jakub at gcc dot gnu.org
  2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
  2024-04-23  6:45 ` jakub at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-03 14:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[13/14 Regression] wrong    |[13 Regression] wrong code
                   |code at -O1 and above on    |at -O1 and above on
                   |x86_64-linux-gnu since      |x86_64-linux-gnu since
                   |r13-990                     |r13-990

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk so far.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (8 preceding siblings ...)
  2024-04-03 14:32 ` [Bug middle-end/114552] [13 " jakub at gcc dot gnu.org
@ 2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
  2024-04-23  6:45 ` jakub at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-04-21  4:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

--- Comment #9 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ba6fd407891fd83648ad803c85b607dc09e23be4

commit r13-8624-gba6fd407891fd83648ad803c85b607dc09e23be4
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Apr 3 09:59:45 2024 +0200

    expr: Fix up emit_push_insn [PR114552]

    r13-990 added optimizations in multiple spots to optimize during
    expansion storing of constant initializers into targets.
    In the load_register_parameters and expand_expr_real_1 cases,
    it checks it has a tree as the source and so knows we are reading
    that whole decl's value, so the code is fine as is, but in the
    emit_push_insn case it checks for a MEM from which something
    is pushed and checks for SYMBOL_REF as the MEM's address, but
    still assumes the whole object is copied, which as the following
    testcase shows might not always be the case.  In the testcase,
    k is 6 bytes, then 2 bytes of padding, then another 4 bytes,
    while the emit_push_insn wants to store just the 6 bytes.

    The following patch simply verifies it is the whole initializer
    that is being stored, I think that is best thing to do so late
    in GCC 14 cycle as well for backporting.

    For GCC 15, perhaps the code could stop requiring it must be at offset
zero,
    nor that the size is equal, but could use
    get_symbol_constant_value/fold_ctor_reference gimple-fold APIs to actually
    extract just part of the initializer if we e.g. push just some subset
    (of course, still verify that it is a subset).  For sizes which are power
    of two bytes and we have some integer modes, we could use as type for
    fold_ctor_reference corresponding integral types, otherwise dunno, punt
    or use some structure (e.g. try to find one in the initializer?), whatever.
    But even in the other spots it could perhaps handle loading of
    COMPONENT_REFs or MEM_REFs from the .rodata vars.

    2024-04-03  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/114552
            * expr.cc (emit_push_insn): Only use store_constructor for
            immediate_const_ctor_p if int_expr_size matches size.

            * gcc.c-torture/execute/pr114552.c: New test.

    (cherry picked from commit 03039744f368a24a452e4ea8d946e9c2cedaf1aa)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug middle-end/114552] [13 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990
  2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
                   ` (9 preceding siblings ...)
  2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
@ 2024-04-23  6:45 ` jakub at gcc dot gnu.org
  10 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-04-23  6:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114552

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed for 13.3 too.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-04-23  6:46 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-01 19:34 [Bug tree-optimization/114552] New: wrong code at -O1 and above on x86_64-linux-gnu zhendong.su at inf dot ethz.ch
2024-04-01 19:38 ` [Bug tree-optimization/114552] " pinskia at gcc dot gnu.org
2024-04-01 20:36 ` [Bug middle-end/114552] [13/14 Regression] " pinskia at gcc dot gnu.org
2024-04-02 11:33 ` rguenth at gcc dot gnu.org
2024-04-02 13:29 ` [Bug middle-end/114552] [13/14 Regression] wrong code at -O1 and above on x86_64-linux-gnu since r13-990 jakub at gcc dot gnu.org
2024-04-02 13:46 ` jakub at gcc dot gnu.org
2024-04-02 14:18 ` jakub at gcc dot gnu.org
2024-04-02 14:49 ` roger at nextmovesoftware dot com
2024-04-03  8:01 ` cvs-commit at gcc dot gnu.org
2024-04-03 14:32 ` [Bug middle-end/114552] [13 " jakub at gcc dot gnu.org
2024-04-21  4:08 ` cvs-commit at gcc dot gnu.org
2024-04-23  6:45 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).