public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/97053] New: an O3 codegen bug
@ 2020-09-15  2:50 cuzdav at gmail dot com
  2020-09-15  6:21 ` [Bug tree-optimization/97053] [10/11 Regression] an O2, " rguenth at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: cuzdav at gmail dot com @ 2020-09-15  2:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

            Bug ID: 97053
           Summary: an O3 codegen bug
           Product: gcc
           Version: 10.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cuzdav at gmail dot com
  Target Milestone: ---

Created attachment 49217
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49217&action=edit
preprocessed code in case something changes after submitting

"Wrong" output codegen bug.  Very sensitive, almost any code change prevents
the problem, including O2 or less, as well as running under ubsan or asan,
reordering variables, and in some cases changing the values involved (such as
setting p=0 instead of 1)

Copying a buffer of 8 bytes of data (null terminated) results in a 6-bytes (of
data) null terminated "string", losing 2 bytes from the end.

A variable "len" holds the strlen() of this string, reports 6, but some code
affected by optimizer seems to think it is a larger value.

Expected output:

  runtime len=8
  orig=ABCDXXXX
  copy=ABCDXXXX

actual output:

  runtime len=6
  orig=ABCDXXXX
  copy=ABCDXX


Even though above the actual output shows len=6 when printing "len", a separate
line of code is not present in the assembly (presumably optimized away under a
confused compile-time eval.)

    if (len < 8)
        printf("len < 8\n"); // but len == 6


Problem visible in 10.1 and 10.2 at -O2 and -O3, but not -O1 or less, nor in in
9.x or earlier.  

CODE: 

#include <cstdio>
#include <cstring>

struct Data
{
    short n;
    char name[8 + 1];
    int p;
    char s;
    int b;
};

int main(int argc, char** argv)
{
    int p = 1;  
    char orig[9]{"XXXXXXXX"};
    std::memcpy(orig, "ABCD", 4);

    Data data{};
    data.n = 5u;

    // this should copy ABCDXXXX but loses final 2 XX's
    std::memcpy(data.name, orig, 8); 

    data.s = 'X';
    data.b = p;

    const size_t len = strlen(data.name);
    if (len < 8) 
        printf("len < 8\n"); // does not print

    std::printf("runtime len=%lu\n", len); // this, however, prints 6
    std::printf("orig=%s\ncopy=%s\n", orig, data.name);
}


Live on godbolt:  https://godbolt.org/z/v6xqh8

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
@ 2020-09-15  6:21 ` rguenth at gcc dot gnu.org
  2020-09-15  7:08 ` ebotcazou at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-09-15  6:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.3
             Status|UNCONFIRMED                 |NEW
            Summary|an O2, O3 codegen bug       |[10/11 Regression] an O2,
                   |                            |O3 codegen bug
          Component|c++                         |tree-optimization
                 CC|                            |ebotcazou at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org
   Last reconfirmed|                            |2020-09-15
           Priority|P3                          |P2
     Ever confirmed|0                           |1
           Keywords|                            |wrong-code

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  It is store-mergings wrong-doing:

> diff -u t.ii.194t.widening_mul t.ii.195t.store-merging
--- t.ii.194t.widening_mul      2020-09-15 08:15:39.824282586 +0200
+++ t.ii.195t.store-merging     2020-09-15 08:15:39.824282586 +0200
@@ -1,6 +1,10 @@

 ;; Function main (main, funcdef_no=26, decl_uid=3207, cgraph_uid=27,
symbol_order=26) (executed once)

+Coalescing successful!
+Merged into 1 stores
+New sequence of 2 stores to replace old one of 3 stores
+Merging successful!
 main (int argc, char * * argv)
 {
   const size_t len;
@@ -11,12 +15,11 @@
   <bb 2> [local count: 1073741824]:
   orig = "XXXXXXXX";
   memcpy (&orig, "ABCD", 4);
-  MEM <char[12]> [(struct Data *)&data + 8B] = {};
   data.n = 5;
   _7 = MEM <long unsigned int> [(char * {ref-all})&orig];
   MEM <long unsigned int> [(char * {ref-all})&data + 2B] = _7;
-  data.s = 88;
-  data.b = 1;
+  MEM <unsigned long> [(struct Data *)&data + 8B] = 0;
+  MEM <unsigned long> [(void *)&data + 16B] = 4294967384;
   len_11 = strlen (&data.name);
   printf ("runtime len=%lu\n", len_11);
   printf ("orig=%s\ncopy=%s\n", &orig, &data.name);

we merge data + 8 = {} "across" data + 2 = _7 but fail to see they alias.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
  2020-09-15  6:21 ` [Bug tree-optimization/97053] [10/11 Regression] an O2, " rguenth at gcc dot gnu.org
@ 2020-09-15  7:08 ` ebotcazou at gcc dot gnu.org
  2020-09-15  8:21 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: ebotcazou at gcc dot gnu.org @ 2020-09-15  7:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #2 from Eric Botcazou <ebotcazou at gcc dot gnu.org> ---
The statement generated by DSE

MEM <char[12]> [(struct Data *)&data + 8B] = {};

looks nonsensical and I guess store-merging is not prepared for it.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
  2020-09-15  6:21 ` [Bug tree-optimization/97053] [10/11 Regression] an O2, " rguenth at gcc dot gnu.org
  2020-09-15  7:08 ` ebotcazou at gcc dot gnu.org
@ 2020-09-15  8:21 ` jakub at gcc dot gnu.org
  2020-09-15  9:08 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15  8:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with my r10-179-g3afd514bca6ea572e614b5289c4429ace693311b
I'll have a look.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (2 preceding siblings ...)
  2020-09-15  8:21 ` jakub at gcc dot gnu.org
@ 2020-09-15  9:08 ` jakub at gcc dot gnu.org
  2020-09-15  9:53 ` jakub at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15  9:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Reduced testcase:

struct S { short a; char b[9]; int c; char d; int e; };

__attribute__((noipa)) void
foo (char *x, char *y)
{
  if (__builtin_strcmp (x, "ABCDXXXX") != 0
      || __builtin_strcmp (y, "ABCDXXXX") != 0)
    __builtin_abort ();
}

int
main ()
{
  char a[9] = "XXXXXXXX";
  struct S b = {};
  __builtin_memcpy (a, "ABCD", 4);
  b.a = 5;
  __builtin_memcpy (b.b, a, 8); 
  b.d = 'X';
  b.e = 1;
  foo (a, b.b);
  return 0;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (3 preceding siblings ...)
  2020-09-15  9:08 ` jakub at gcc dot gnu.org
@ 2020-09-15  9:53 ` jakub at gcc dot gnu.org
  2020-09-15 10:55 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15  9:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Better testcase that won't cease to test the bug even if FRE or some other pass
gets smarter and optimizes the a = "XXXXXXXX"; __builtin_memcpy (&a, "ABCD",
4);
into a = "ABCDXXXX":

struct S { short a; char b[9]; int c; char d; int e; };

__attribute__((noipa)) void
foo (struct S *x)
{
  if (__builtin_strcmp (x->b, "ABCDXXXX") != 0)
    __builtin_abort ();
}

__attribute__((noipa)) void
bar (char *x)
{
  struct S b = {};
  b.a = 5;
  __builtin_memcpy (b.b, x, 8); 
  b.d = 'X';
  b.e = 1;
  foo (&b);
}

int
main ()
{
  bar ("ABCDXXXX");
  return 0;
}

(In reply to Eric Botcazou from comment #2)
> The statement generated by DSE
> 
> MEM <char[12]> [(struct Data *)&data + 8B] = {};
> 
> looks nonsensical and I guess store-merging is not prepared for it.

There is nothing non-sensical nor wrong on it, it is standard GIMPLE IL,
equivalent in behavior to __builtin_memset (&data + 8B, 0, 12);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (4 preceding siblings ...)
  2020-09-15  9:53 ` jakub at gcc dot gnu.org
@ 2020-09-15 10:55 ` jakub at gcc dot gnu.org
  2020-09-15 13:23 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15 10:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I think the problem is that check_no_overlap only looks at later m_store_info
stores that would overlap and cause problems.
But in this case, the problem is that we'd need to check earlier m_store_info
store too.
We have in the stmt order
bitpos bitsize kind order
64 96 INTEGER_CST 0
0 16 INTEGER_CST 1
16 64 MEM_REF 2
128 8 INTEGER_CST 3
160 32 INTEGER_CST 4
After sort_by_bitpos this is:
0 16 INTEGER_CST 1
16 64 MEM_REF 2
64 96 INTEGER_CST 0
128 8 INTEGER_CST 3
160 32 INTEGER_CST 4
The first store in this order doesn't overlap anything and is followed by
adjacent store of incompatible kind, so is kept as is.
Then the MEM_REF store is kept as is too, because again it overlaps with the
next one, but is of a different kind.
The bug is when deciding if the 3rd store could be mergeable with the 4th
store.
They are overlapping and of the same kind that can be handled for overlapping
stores, but they shouldn't be merged, because there was a non-mergeable
overlapping store in between them in the sort_by_order order (the MEM_REF one).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (5 preceding siblings ...)
  2020-09-15 10:55 ` jakub at gcc dot gnu.org
@ 2020-09-15 13:23 ` jakub at gcc dot gnu.org
  2020-09-15 14:58 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15 13:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Another testcase for -O2 -fno-tree-dse that shows more issues, that it affects
not just the overlapping case handling, but also can affect also the adjacent
cases:
struct __attribute__((packed, may_alias)) S { long long s; };
struct __attribute__((packed, may_alias)) T { short t; };

__attribute__((noipa)) void
test (char *p, char *q, int s)
{
  if ((s & 1) == 0)
    {
      if (*(short __attribute__((may_alias)) *) &p[sizeof (short)]
          != *(short __attribute__((may_alias)) *) &q[sizeof (short)]
          || (((struct S __attribute__((may_alias)) *) &p[1])->s
              != ((struct S __attribute__((may_alias)) *) &q[1])->s)
          || (*(short __attribute__((may_alias)) *) &p[2 * sizeof (short)]
              != *(short __attribute__((may_alias)) *) &q[2 * sizeof (short)]))
        __builtin_abort ();
    }
  else
    {
      if (*(short __attribute__((may_alias)) *) &p[sizeof (short)]
          != *(short __attribute__((may_alias)) *) &q[sizeof (short)]
          || (((struct S __attribute__((may_alias)) *) &p[1])->s
              != ((struct S __attribute__((may_alias)) *) &q[1])->s)
          || (((struct T __attribute__((may_alias)) *) &p[2 * sizeof (short) -
1])->t
              != ((struct T __attribute__((may_alias)) *) &q[2 * sizeof (short)
- 1])->t)
          || p[3 * sizeof (short) - 2] != q[3 * sizeof (short) - 2])
        __builtin_abort ();
    }
}

__attribute__((noipa)) void
foo (long long *p, char *q, char *r, char *s)
{
  char a[64] __attribute__((aligned (__alignof (short))));
  *(short __attribute__((may_alias)) *) &a[sizeof (short)] = 1;
  ((struct S __attribute__((may_alias)) *) &a[1])->s = p[0];
  *(short __attribute__((may_alias)) *) &a[2 * sizeof (short)] = 2;
  *(short __attribute__((may_alias)) *) &q[sizeof (short)] = 1;
  ((struct S __attribute__((may_alias)) *) &r[1])->s = p[0];
  *(short __attribute__((may_alias)) *) &s[2 * sizeof (short)] = 2;
  test (a, q, 0);
}

__attribute__((noipa)) void
bar (long long *p, char *q, char *r, char *s, char *t)
{
  char a[64] __attribute__((aligned (__alignof (short))));
  *(short __attribute__((may_alias)) *) &a[sizeof (short)] = 1;
  ((struct S __attribute__((may_alias)) *) &a[1])->s = p[0];
  ((struct T __attribute__((may_alias)) *) &a[2 * sizeof (short) - 1])->t = 2;
  a[3 * sizeof (short) - 2] = 3;
  *(short __attribute__((may_alias)) *) &q[sizeof (short)] = 1;
  ((struct S __attribute__((may_alias)) *) &r[1])->s = p[0];
  ((struct T __attribute__((may_alias)) *) &s[2 * sizeof (short) - 1])->t = 2;
  t[3 * sizeof (short) - 2] = 3;
  test (a, q, 1);
}

__attribute__((noipa)) void
baz (long long *p, char *q, char *r, char *s)
{
  char a[64] __attribute__((aligned (__alignof (short))));
  *(short __attribute__((may_alias)) *) &a[2 * sizeof (short)] = 2;
  ((struct S __attribute__((may_alias)) *) &a[1])->s = p[0];
  *(short __attribute__((may_alias)) *) &a[sizeof (short)] = 1;
  *(short __attribute__((may_alias)) *) &q[2 * sizeof (short)] = 2;
  ((struct S __attribute__((may_alias)) *) &r[1])->s = p[0];
  *(short __attribute__((may_alias)) *) &s[sizeof (short)] = 1;
  test (a, q, 2);
}

__attribute__((noipa)) void
qux (long long *p, char *q, char *r, char *s, char *t)
{
  char a[64] __attribute__((aligned (__alignof (short))));
  *(short __attribute__((may_alias)) *) &a[2 * sizeof (short) - 1] = 2;
  ((struct S __attribute__((may_alias)) *) &a[1])->s = p[0];
  a[3 * sizeof (short) - 2] = 3;
  *(short __attribute__((may_alias)) *) &a[sizeof (short)] = 1;
  ((struct T __attribute__((may_alias)) *) &q[2 * sizeof (short) - 1])->t = 2;
  ((struct S __attribute__((may_alias)) *) &r[1])->s = p[0];
  s[3 * sizeof (short) - 2] = 3;
  ((struct T __attribute__((may_alias)) *) &t[sizeof (short)])->t = 1;
  test (a, q, 3);
}

int
main ()
{
  char a[64];
  long long p = -1LL;
  foo (&p, &a[0], &a[0], &a[0]);
  bar (&p, &a[0], &a[0], &a[0], &a[0]);
  baz (&p, &a[0], &a[0], &a[0]);
  qux (&p, &a[0], &a[0], &a[0], &a[0]);
  return 0;
}

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (6 preceding siblings ...)
  2020-09-15 13:23 ` jakub at gcc dot gnu.org
@ 2020-09-15 14:58 ` jakub at gcc dot gnu.org
  2020-09-16  7:45 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-15 14:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49221
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49221&action=edit
gcc11-pr97053.patch

Untested fix.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (7 preceding siblings ...)
  2020-09-15 14:58 ` jakub at gcc dot gnu.org
@ 2020-09-16  7:45 ` cvs-commit at gcc dot gnu.org
  2020-09-16  7:50 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-09-16  7:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #9 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:bd909071ac04e94f4b6f0baab64d0687ec55681d

commit r11-3219-gbd909071ac04e94f4b6f0baab64d0687ec55681d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 16 09:42:33 2020 +0200

    store-merging: Consider also overlapping stores earlier in the by bitpos
sorting [PR97053]

    As the testcases show, if we have something like:
      MEM <char[12]> [&b + 8B] = {};
      MEM[(short *) &b] = 5;
      _5 = *x_4(D);
      MEM <long long unsigned int> [&b + 2B] = _5;
      MEM[(char *)&b + 16B] = 88;
      MEM[(int *)&b + 20B] = 1;
    then in sort_by_bitpos the stores are almost like in the given order,
    except the first store is after the = _5; store.
    We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
    while the former INTEGER_CST, and we can't coalesce the = _5 store with
    the = {} store because the former is MEM_REF, the latter INTEGER_CST.
    But we happily coalesce the remaining 3 stores, which is wrong, because the
    = _5; store overlaps those and is in between them in the program order.
    We already have code to deal with similar cases in check_no_overlap, but we
    deal only with the following stores in sort_by_bitpos order, not the
earlier
    ones.

    The following patch checks also the earlier ones.  In
coalesce_immediate_stores
    it computes the first one that needs to be checked (all the ones whose
    bitpos + bitsize is smaller or equal to merged_store->start don't need to
be
    checked and don't need to be checked even for any following attempts
because
    of the sort_by_bitpos sorting) and the end of that (that is the first store
    in the merged_store).

    2020-09-16  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/97053
            * gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
            START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
            any stores between FIRST_EARLIER inclusive and LAST_EARLIER
exclusive
            has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
            be merged store.
            (imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER
argument.
            Adjust check_no_overlap caller.
            (imm_store_chain_info::coalesce_immediate_stores): Add
first_earlier
            and last_earlier variables, adjust them during iterations.  Adjust
            check_no_overlap callers, call check_no_overlap even when extending
            overlapping stores by extra INTEGER_CST stores.

            * gcc.dg/store_merging_31.c: New test.
            * gcc.dg/store_merging_32.c: New test.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (8 preceding siblings ...)
  2020-09-16  7:45 ` cvs-commit at gcc dot gnu.org
@ 2020-09-16  7:50 ` cvs-commit at gcc dot gnu.org
  2020-09-16  7:55 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-09-16  7:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:7e97e7470e74b0d9a68000938a359a7049774d77

commit r10-8769-g7e97e7470e74b0d9a68000938a359a7049774d77
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 16 09:42:33 2020 +0200

    store-merging: Consider also overlapping stores earlier in the by bitpos
sorting [PR97053]

    As the testcases show, if we have something like:
      MEM <char[12]> [&b + 8B] = {};
      MEM[(short *) &b] = 5;
      _5 = *x_4(D);
      MEM <long long unsigned int> [&b + 2B] = _5;
      MEM[(char *)&b + 16B] = 88;
      MEM[(int *)&b + 20B] = 1;
    then in sort_by_bitpos the stores are almost like in the given order,
    except the first store is after the = _5; store.
    We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
    while the former INTEGER_CST, and we can't coalesce the = _5 store with
    the = {} store because the former is MEM_REF, the latter INTEGER_CST.
    But we happily coalesce the remaining 3 stores, which is wrong, because the
    = _5; store overlaps those and is in between them in the program order.
    We already have code to deal with similar cases in check_no_overlap, but we
    deal only with the following stores in sort_by_bitpos order, not the
earlier
    ones.

    The following patch checks also the earlier ones.  In
coalesce_immediate_stores
    it computes the first one that needs to be checked (all the ones whose
    bitpos + bitsize is smaller or equal to merged_store->start don't need to
be
    checked and don't need to be checked even for any following attempts
because
    of the sort_by_bitpos sorting) and the end of that (that is the first store
    in the merged_store).

    2020-09-16  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/97053
            * gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
            START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
            any stores between FIRST_EARLIER inclusive and LAST_EARLIER
exclusive
            has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
            be merged store.
            (imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER
argument.
            Adjust check_no_overlap caller.
            (imm_store_chain_info::coalesce_immediate_stores): Add
first_earlier
            and last_earlier variables, adjust them during iterations.  Adjust
            check_no_overlap callers, call check_no_overlap even when extending
            overlapping stores by extra INTEGER_CST stores.

            * gcc.dg/store_merging_31.c: New test.
            * gcc.dg/store_merging_32.c: New test.

    (cherry picked from commit bd909071ac04e94f4b6f0baab64d0687ec55681d)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (9 preceding siblings ...)
  2020-09-16  7:50 ` cvs-commit at gcc dot gnu.org
@ 2020-09-16  7:55 ` jakub at gcc dot gnu.org
  2020-09-18  9:52 ` cvs-commit at gcc dot gnu.org
  2020-09-18  9:58 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-09-16  7:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed now for 10.3+ and 11.1+.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (10 preceding siblings ...)
  2020-09-16  7:55 ` jakub at gcc dot gnu.org
@ 2020-09-18  9:52 ` cvs-commit at gcc dot gnu.org
  2020-09-18  9:58 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-09-18  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:a24744c1ed89e255f3db5b3981519f538d231886

commit r9-8918-ga24744c1ed89e255f3db5b3981519f538d231886
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 16 09:42:33 2020 +0200

    store-merging: Consider also overlapping stores earlier in the by bitpos
sorting [PR97053]

    As the testcases show, if we have something like:
      MEM <char[12]> [&b + 8B] = {};
      MEM[(short *) &b] = 5;
      _5 = *x_4(D);
      MEM <long long unsigned int> [&b + 2B] = _5;
      MEM[(char *)&b + 16B] = 88;
      MEM[(int *)&b + 20B] = 1;
    then in sort_by_bitpos the stores are almost like in the given order,
    except the first store is after the = _5; store.
    We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
    while the former INTEGER_CST, and we can't coalesce the = _5 store with
    the = {} store because the former is MEM_REF, the latter INTEGER_CST.
    But we happily coalesce the remaining 3 stores, which is wrong, because the
    = _5; store overlaps those and is in between them in the program order.
    We already have code to deal with similar cases in check_no_overlap, but we
    deal only with the following stores in sort_by_bitpos order, not the
earlier
    ones.

    The following patch checks also the earlier ones.  In
coalesce_immediate_stores
    it computes the first one that needs to be checked (all the ones whose
    bitpos + bitsize is smaller or equal to merged_store->start don't need to
be
    checked and don't need to be checked even for any following attempts
because
    of the sort_by_bitpos sorting) and the end of that (that is the first store
    in the merged_store).

    2020-09-16  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/97053
            * gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
            START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
            any stores between FIRST_EARLIER inclusive and LAST_EARLIER
exclusive
            has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
            be merged store.
            (imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER
argument.
            Adjust check_no_overlap caller.
            (imm_store_chain_info::coalesce_immediate_stores): Add
first_earlier
            and last_earlier variables, adjust them during iterations.  Adjust
            check_no_overlap callers, call check_no_overlap even when extending
            overlapping stores by extra INTEGER_CST stores.

            * gcc.dg/store_merging_31.c: New test.
            * gcc.dg/store_merging_32.c: New test.

    (cherry picked from commit bd909071ac04e94f4b6f0baab64d0687ec55681d)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug tree-optimization/97053] [10/11 Regression] an O2, O3 codegen bug
  2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
                   ` (11 preceding siblings ...)
  2020-09-18  9:52 ` cvs-commit at gcc dot gnu.org
@ 2020-09-18  9:58 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-09-18  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97053

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-8 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:87ae45cdbd7b70a4c92d5137552228ed9ad9e9e7

commit r8-10518-g87ae45cdbd7b70a4c92d5137552228ed9ad9e9e7
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Sep 16 09:42:33 2020 +0200

    store-merging: Consider also overlapping stores earlier in the by bitpos
sorting [PR97053]

    As the testcases show, if we have something like:
      MEM <char[12]> [&b + 8B] = {};
      MEM[(short *) &b] = 5;
      _5 = *x_4(D);
      MEM <long long unsigned int> [&b + 2B] = _5;
      MEM[(char *)&b + 16B] = 88;
      MEM[(int *)&b + 20B] = 1;
    then in sort_by_bitpos the stores are almost like in the given order,
    except the first store is after the = _5; store.
    We can't coalesce the = 5; store with = _5;, because the latter is MEM_REF,
    while the former INTEGER_CST, and we can't coalesce the = _5 store with
    the = {} store because the former is MEM_REF, the latter INTEGER_CST.
    But we happily coalesce the remaining 3 stores, which is wrong, because the
    = _5; store overlaps those and is in between them in the program order.
    We already have code to deal with similar cases in check_no_overlap, but we
    deal only with the following stores in sort_by_bitpos order, not the
earlier
    ones.

    The following patch checks also the earlier ones.  In
coalesce_immediate_stores
    it computes the first one that needs to be checked (all the ones whose
    bitpos + bitsize is smaller or equal to merged_store->start don't need to
be
    checked and don't need to be checked even for any following attempts
because
    of the sort_by_bitpos sorting) and the end of that (that is the first store
    in the merged_store).

    2020-09-16  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/97053
            * gimple-ssa-store-merging.c (check_no_overlap): Add FIRST_ORDER,
            START, FIRST_EARLIER and LAST_EARLIER arguments.  Return false if
            any stores between FIRST_EARLIER inclusive and LAST_EARLIER
exclusive
            has order in between FIRST_ORDER and LAST_ORDER and overlaps the to
            be merged store.
            (imm_store_chain_info::try_coalesce_bswap): Add FIRST_EARLIER
argument.
            Adjust check_no_overlap caller.
            (imm_store_chain_info::coalesce_immediate_stores): Add
first_earlier
            and last_earlier variables, adjust them during iterations.  Adjust
            check_no_overlap callers, call check_no_overlap even when extending
            overlapping stores by extra INTEGER_CST stores.

            * gcc.dg/store_merging_31.c: New test.
            * gcc.dg/store_merging_32.c: New test.

    (cherry picked from commit bd909071ac04e94f4b6f0baab64d0687ec55681d)

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-09-18  9:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-15  2:50 [Bug c++/97053] New: an O3 codegen bug cuzdav at gmail dot com
2020-09-15  6:21 ` [Bug tree-optimization/97053] [10/11 Regression] an O2, " rguenth at gcc dot gnu.org
2020-09-15  7:08 ` ebotcazou at gcc dot gnu.org
2020-09-15  8:21 ` jakub at gcc dot gnu.org
2020-09-15  9:08 ` jakub at gcc dot gnu.org
2020-09-15  9:53 ` jakub at gcc dot gnu.org
2020-09-15 10:55 ` jakub at gcc dot gnu.org
2020-09-15 13:23 ` jakub at gcc dot gnu.org
2020-09-15 14:58 ` jakub at gcc dot gnu.org
2020-09-16  7:45 ` cvs-commit at gcc dot gnu.org
2020-09-16  7:50 ` cvs-commit at gcc dot gnu.org
2020-09-16  7:55 ` jakub at gcc dot gnu.org
2020-09-18  9:52 ` cvs-commit at gcc dot gnu.org
2020-09-18  9:58 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).