public inbox for gcc-bugs@sourceware.org
* [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete
@ 2022-02-14 15:51 tnfchris at gcc dot gnu.org
  2022-02-14 18:08 ` [Bug target/104529] " pinskia at gcc dot gnu.org
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-02-14 15:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

            Bug ID: 104529
           Summary: [missed optimization] inefficient codegen around
                    new/delete
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64-*

Consider the following example

#include <cstdint>
#include <cstdlib>
#include <vector>

struct param {
  uint32_t k;
  std::vector<uint8_t> src;
  std::vector<uint8_t> ref0;
};

size_t foo() {
    param test[] = {
    {48, {255, 0, 0, 0, 0, 0}
    }};
    return sizeof(test);
}

where the entire thing should have been elided, but that is already reported in
#94294.

Instead this code also shows that we are generating quite inefficient code
(even at -Ofast)

on AArch64 we generate:

foo():
        stp     x29, x30, [sp, -32]!
        mov     w1, 255 <-- 1
        mov     x0, 6
        mov     x29, sp
        str     w1, [sp, 24] <-- 1
        strh    wzr, [sp, 28] <-- 2
        bl      operator new(unsigned long)
        ldrh    w3, [sp, 28] <-- 2
        mov     x1, 6
        ldr     w4, [sp, 24] <-- 1
        str     w4, [x0]
        strh    w3, [x0, 4]
        bl      operator delete(void*, unsigned long)
        mov     x0, 56
        ldp     x29, x30, [sp], 32
        ret

There's no reason to spill and reload a constant when the constant is
representable in a single move.

It's also unclear to me why it thinks the 255 and 0 need to be before the call
to new.  But even if it did need them, it's better to re-create the constants
after the call rather than spilling and reloading them.

However x86 gets this right, which is why I've opened this as a target bug:

foo():
        sub     rsp, 8
        mov     edi, 6
        call    operator new(unsigned long)
        mov     esi, 6
        mov     DWORD PTR [rax], 255
        mov     rdi, rax
        xor     eax, eax
        mov     WORD PTR [rdi+4], ax
        call    operator delete(void*, unsigned long)
        mov     eax, 56
        add     rsp, 8
        ret

* [Bug target/104529] [missed optimization] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
@ 2022-02-14 18:08 ` pinskia at gcc dot gnu.org
  2022-02-14 18:25 ` tnfchris at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-14 18:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Dup of bug 86892.

*** This bug has been marked as a duplicate of bug 86892 ***

* [Bug target/104529] [missed optimization] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
  2022-02-14 18:08 ` [Bug target/104529] " pinskia at gcc dot gnu.org
@ 2022-02-14 18:25 ` tnfchris at gcc dot gnu.org
  2022-02-14 19:07 ` [Bug target/104529] [12 Regression] " tnfchris at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-02-14 18:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #2 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
I don't quite see how this is a CSE problem.

There's only one of each constant and none of them are needed before the call,
unlike in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86892

You don't need the values of your array until you allocate memory for said
array.

x86 has the following sequence in GIMPLE

  _32 = operator new (6);
  MEM <unsigned int> [(char * {ref-all})_32] = 255;
  MEM <unsigned short> [(char * {ref-all})_32 + 4B] = 0;
  operator delete (_32, 6);

which is optimal: you create the object, store the values, and remove it.

AArch64 however has this

  MEM <unsigned int> [(unsigned char *)&D.24688] = 255;
  MEM <unsigned short> [(unsigned char *)&D.24688 + 4B] = 0;
  _34 = operator new (6);
  MEM <unsigned char[6]> [(char * {ref-all})_34] = MEM <unsigned char[6]> [(char * {ref-all})&D.24688];
  D.24688 ={v} {CLOBBER(eol)};
  operator delete (_34, 6);

which is where the issue comes from. So this has nothing to do with CSE as far
as I can tell.  The GIMPLE is just suboptimal.

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
  2022-02-14 18:08 ` [Bug target/104529] " pinskia at gcc dot gnu.org
  2022-02-14 18:25 ` tnfchris at gcc dot gnu.org
@ 2022-02-14 19:07 ` tnfchris at gcc dot gnu.org
  2022-02-14 19:51 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-02-14 19:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[missed optimization]       |[12 Regression] inefficient
                   |inefficient codegen around  |codegen around new/delete
                   |new/delete                  |
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |---

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
I'm re-opening because I don't think it has anything to do with #94294

This is a GCC 12 regression.

In GCC 11 we generated in the mid-end

  <bb 2> [local count: 536870913]:
  _32 = operator new (6);
  MEM <unsigned int> [(char * {ref-all})_32] = 255;
  MEM <unsigned short> [(char * {ref-all})_32 + 4B] = 0;
  operator delete (_32, 6);
  return 56;

and in GCC 12 we now generate

  <bb 2> [local count: 536870913]:
  MEM <vector(4) unsigned char> [(unsigned char *)&D.24688] = { 255, 0, 0, 0 };
  MEM <vector(2) unsigned char> [(unsigned char *)&D.24688 + 4B] = { 0, 0 };
  _34 = operator new (6);
  MEM <unsigned char[6]> [(char * {ref-all})_34] = MEM <unsigned char[6]> [(char * {ref-all})&D.24688];
  D.24688 ={v} {CLOBBER(eol)};
  operator delete (_34, 6);
  return 56;

See https://godbolt.org/z/KKfhxTxnd

Forcing it to keep the stores before the call to new.
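
The two GIMPLE shapes above can be written out by hand to make the difference concrete. This is only a sketch: the names `gcc11_shape`, `gcc12_shape` and `Bytes` are made up for illustration, `std::memcpy` stands in for the typed MEM stores, and the byte arrays assume the little-endian layout implied by the dumps. Both functions leave the same six bytes in the heap block; the second one just takes an extra round trip through a stack temporary.

```cpp
#include <array>
#include <cstring>
#include <new>

using Bytes = std::array<unsigned char, 6>;

// GCC 11 shape: allocate first, then store the constants straight into
// the heap block (a 4-byte store followed by a 2-byte store).
Bytes gcc11_shape() {
    void *heap = ::operator new(6);
    const unsigned char first4[4] = {255, 0, 0, 0};  // 32-bit 255, little-endian
    const unsigned char last2[2] = {0, 0};           // 16-bit 0
    std::memcpy(heap, first4, sizeof first4);
    std::memcpy(static_cast<unsigned char *>(heap) + 4, last2, sizeof last2);
    Bytes out;
    std::memcpy(out.data(), heap, out.size());
    ::operator delete(heap);  // the GIMPLE uses the sized form (ptr, 6)
    return out;
}

// GCC 12 shape: the constants are stored to a stack temporary (D.24688 in
// the dump) before the allocation, then a 6-byte aggregate copy moves them.
Bytes gcc12_shape() {
    unsigned char tmp[6];
    const unsigned char first4[4] = {255, 0, 0, 0};
    const unsigned char last2[2] = {0, 0};
    std::memcpy(tmp, first4, sizeof first4);
    std::memcpy(tmp + 4, last2, sizeof last2);
    void *heap = ::operator new(6);
    std::memcpy(heap, tmp, sizeof tmp);              // the aggregate copy
    Bytes out;
    std::memcpy(out.data(), heap, out.size());
    ::operator delete(heap);
    return out;
}
```

Since the temporary's stores precede the call to operator new, the values have to live across the call, which is where the extra stack traffic in the generated code comes from.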

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-02-14 19:07 ` [Bug target/104529] [12 Regression] " tnfchris at gcc dot gnu.org
@ 2022-02-14 19:51 ` pinskia at gcc dot gnu.org
  2022-02-15  8:02 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-02-14 19:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #3)
> I'm re-opening because I don't think it has anything to do with #94294
> 
> This is a GCC 12 regression.
> 
> In GCC 11 we generated in the mid-end
> 
...
> and in GCC 12 we now generate
> 
...
> 
> See https://godbolt.org/z/KKfhxTxnd
> 
> Forcing it to keep the stores before the call to new.

Hmm, I should have looked into the code before marking it as a dup.

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-02-14 19:51 ` pinskia at gcc dot gnu.org
@ 2022-02-15  8:02 ` rguenth at gcc dot gnu.org
  2022-03-03 12:50 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-15  8:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-15
                 CC|                            |jamborm at gcc dot gnu.org,
                   |                            |jason at redhat dot com,
                   |                            |rguenth at gcc dot gnu.org
   Target Milestone|---                         |12.0
     Ever confirmed|0                           |1

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think this is SRA no longer forwarding the constant init to the aggregate
copy.
In GCC 11:

--- t.C.121t.cplxlower1 2022-02-15 08:41:16.572229246 +0100
+++ t.C.122t.sra        2022-02-15 08:41
@@ -1,8 +1,31 @@

 ;; Function foo (_Z3foov, funcdef_no=883, decl_uid=19435, cgraph_uid=172, symbol_order=181)

+No longer having address taken: D.19477
+No longer having address taken: test
+Created a replacement for D.19477 offset: 0, size: 8: SR.62D.21182
+Created a replacement for D.19477 offset: 8, size: 8: SR.63D.21183
+Created a replacement for D.19477 offset: 16, size: 8: SR.64D.21184
+Created a replacement for D.19477 offset: 24, size: 8: SR.65D.21185
+Created a replacement for D.19477 offset: 32, size: 8: SR.66D.21186
+Created a replacement for D.19477 offset: 40, size: 8: SR.67D.21187
+Removing load: MEM <unsigned char[6]> [(char * {ref-all})_41] = MEM <unsigned char[6]> [(char * {ref-all})&D.19477];
+
+Symbols to be put in SSA form
+{ D.20917 D.21182 D.21183 D.21184 D.21185 D.21186 D.21187 }
+Incremental SSA update started at block: 0
+Number of blocks in CFG: 5
+Number of blocks to update: 4 ( 80%)
+
+
 size_t foo ()
 {
+  unsigned char SR.67;
+  unsigned char SR.66;
+  unsigned char SR.65;
+  unsigned char SR.64;
+  unsigned char SR.63;
+  unsigned char SR.62;
   unsigned char * D.21173;
   const size_type __n;
   const unsigned char * const __l$_M_array;
@@ -15,12 +38,12 @@

   <bb 2> [local count: 536870913]:
   MEM[(struct param *)&test].k = 48;
-  D.19477[0] = 255;
-  D.19477[1] = 0;
-  D.19477[2] = 0;
-  D.19477[3] = 0;
-  D.19477[4] = 0;
-  D.19477[5] = 0;
+  SR.62_42 = 255;
+  SR.63_11 = 0;
+  SR.64_12 = 0;
+  SR.65_61 = 0;
+  SR.66_55 = 0;
+  SR.67_57 = 0;
   MEM[(struct _Vector_impl_data *)&test + 8B] ={v} {CLOBBER};
   MEM[(struct _Vector_impl_data *)&test + 8B]._M_start = 0B;
   MEM[(struct _Vector_impl_data *)&test + 8B]._M_finish = 0B;
@@ -31,14 +54,18 @@
   MEM[(struct vector *)&test + 8B].D.19426._M_impl.D.18739._M_start = _41;
   _36 = _41 + 6;
   MEM[(struct vector *)&test + 8B].D.19426._M_impl.D.18739._M_end_of_storage = _36;
-  MEM <unsigned char[6]> [(char * {ref-all})_41] = MEM <unsigned char[6]> [(char * {ref-all})&D.19477];
+  MEM <unsigned char[6]> [(char * {ref-all})_41][0] = SR.62_42;
+  MEM <unsigned char[6]> [(char * {ref-all})_41][1] = SR.63_11;
+  MEM <unsigned char[6]> [(char * {ref-all})_41][2] = SR.64_12;
+  MEM <unsigned char[6]> [(char * {ref-all})_41][3] = SR.65_61;
+  MEM <unsigned char[6]> [(char * {ref-all})_41][4] = SR.66_55;
+  MEM <unsigned char[6]> [(char * {ref-all})_41][5] = SR.67_57;

while GCC 12 has:

--- t.C.126t.cplxlower1 2022-02-15 08:44:29.602672246 +0100
+++ t.C.127t.sra        2022-02-15 08:44:29.602672246 +0100
@@ -1,6 +1,8 @@

 ;; Function foo (_Z3foov, funcdef_no=1040, decl_uid=23047, cgraph_uid=176, symbol_order=189)

+No longer having address taken: D.23106
+No longer having address taken: test
 size_t foo ()
 {
   struct param test[1];

and the SRA IL is

  const unsigned char D.23106[6];
  struct vector * _1;
  unsigned char * _43;

  <bb 2> [local count: 536870913]:
  MEM[(struct param *)&test] = {};
  MEM[(struct param *)&test].k = 48;
  D.23106[0] = 255;
  D.23106[1] = 0;
  D.23106[2] = 0;
  D.23106[3] = 0;
  D.23106[4] = 0;
  D.23106[5] = 0;
  MEM[(struct _Vector_impl_data *)&test + 8B] ={v} {CLOBBER};
  MEM[(struct _Vector_impl_data *)&test + 8B]._M_start = 0B;
  MEM[(struct _Vector_impl_data *)&test + 8B]._M_finish = 0B;
  MEM[(struct _Vector_impl_data *)&test + 8B]._M_end_of_storage = 0B;
  _43 = operator new (6);

  <bb 3> [local count: 498553576]:
  MEM <unsigned char[6]> [(char * {ref-all})_43] = MEM <unsigned char[6]> [(char * {ref-all})&D.23106];

The dump says

Candidate (23106): D.23106
! Disqualifying D.23106 - Encountered a store to a read-only decl.

It looks like D.23106 is TREE_READONLY contrary to the IL.  The decl is
originally from

          TARGET_EXPR <D.23107, {._M_array=(const unsigned char *) &TARGET_EXPR <D.23106, {255, 0, 0, 0, 0, 0}>, ._M_len=6}>

Here the target is already readonly.  In other places in the gimplifier
we clear TREE_READONLY, but in this case we fail to.  We also fail to
promote the variable to a static const, possibly because of heuristics
or -fmerge-constants != 2 and the var being TREE_ADDRESSABLE.
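
For readers less familiar with where that temporary comes from: a braced list passed to a std::vector constructor is materialized as a const backing array (D.23106 here) that a std::initializer_list points into, and the vector then copies out of it. The sketch below shows this at the source level; the function name `vector_copies_backing_array` is hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>
#include <initializer_list>
#include <type_traits>
#include <vector>

// Hypothetical helper: checks that the vector holds its own copy of the
// elements rather than aliasing the initializer_list's backing array.
bool vector_copies_backing_array() {
    std::initializer_list<std::uint8_t> il = {255, 0, 0, 0, 0, 0};

    // The list only exposes pointers to const elements, which matches
    // the "const unsigned char D.23106[6]" declaration in the dump.
    static_assert(std::is_same<decltype(il.begin()),
                               const std::uint8_t *>::value,
                  "initializer_list elements are const");

    const std::uint8_t *backing = il.begin();
    std::vector<std::uint8_t> v(il);  // allocates 6 bytes and copies

    return v.data() != backing
        && v.size() == il.size()
        && v[0] == backing[0];
}
```

The copy from the backing array into the freshly allocated storage is the 6-byte aggregate copy that SRA used to replace with direct stores.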

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index f570daa015a..de58bdebbc7 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -5110,6 +5110,10 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
            break;
          }

+       if (VAR_P (object)
+           && TREE_READONLY (object))
+         TREE_READONLY (object) = 0;
+
        /* If there are "lots" of initialized elements, even discounting
           those that are not address constants (and thus *must* be
           computed at runtime), then partition the constructor into

"fixes" this, but I guess we need to maybe re-visit the decision to punt
on TREE_READONLY decls in SRA?

Jason, anything the C++ FE can improve here?  Thoughts about the gimplifier
retaining TREE_READONLY?

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-02-15  8:02 ` rguenth at gcc dot gnu.org
@ 2022-03-03 12:50 ` jakub at gcc dot gnu.org
  2022-03-03 12:53 ` tnfchris at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-03-03 12:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The change mentioned in #c3 happened in r12-1529-gd7deee423f993bee8ee44
(but both on aarch64 and x86_64).
I don't see the code mentioned in #c0 on x86_64, I see also loads and stores
like on aarch64.

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-03-03 12:50 ` jakub at gcc dot gnu.org
@ 2022-03-03 12:53 ` tnfchris at gcc dot gnu.org
  2022-03-03 13:18 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2022-03-03 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #7 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #6)
> I don't see the code mentioned in #c0 on x86_64, I see also loads and stores
> like on aarch64.

Yes, that was my mistake, I was accidentally comparing GCC 11 x86_64 with GCC
12 AArch64.  That's how I noticed later that it was a GCC 12 regression.  Should
have clarified.

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-03-03 12:53 ` tnfchris at gcc dot gnu.org
@ 2022-03-03 13:18 ` jakub at gcc dot gnu.org
  2022-03-03 13:49 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-03-03 13:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
From what I can see, this setting of TREE_READONLY has been added in
r9-869-g5603790dbf233c31c60 aka PR85873 fix.

* [Bug target/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-03-03 13:18 ` jakub at gcc dot gnu.org
@ 2022-03-03 13:49 ` jakub at gcc dot gnu.org
  2022-03-03 14:25 ` [Bug middle-end/104529] " jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-03-03 13:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
When I disable the TREE_READONLY (decl) = true; in build_target_expr, on the
array-temp1.C testcase gimple dump changes:
 int f ()
 {
   int D.2491;
-  static const int C.0[10] = {1, 42, 3, 4, 5, 6, 7, 8, 9, 0};
+  const int D.2435[10];
   typedef const int AR[<unknown>];

-  try
-    {
-      D.2491 = C.0[5];
-      return D.2491;
-    }
-  finally
-    {
-      C.0 = {CLOBBER(eol)};
-    }
+  D.2435[0] = 1;
+  D.2435[1] = 42;
+  D.2435[2] = 3;
+  D.2435[3] = 4;
+  D.2435[4] = 5;
+  D.2435[5] = 6;
+  D.2435[6] = 7;
+  D.2435[7] = 8;
+  D.2435[8] = 9;
+  D.2435[9] = 0;
+  D.2491 = D.2435[5];
+  return D.2491;

which seems a quite undesirable change.
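
A source fragment consistent with that dump is sketched below. It is an assumed reconstruction from the names visible in the dump (`f`, the `AR` typedef, the ten initializers and the index 5), not a copy of the testsuite file.

```cpp
// Assumed shape of the function behind the dump: a const array temporary
// built from a braced list and indexed immediately.
int f() {
    typedef const int AR[];
    return AR{1, 42, 3, 4, 5, 6, 7, 8, 9, 0}[5];
}
```

With the promotion in place the temporary becomes the `static const int C.0[10]` seen on the removed lines, so no element stores are emitted at run time; without it, all ten elements are stored one by one before the single read.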
The spot that cares about TREE_READONLY is exactly in
gimplify_init_constructor:
        /* If a const aggregate variable is being initialized, then it
           should never be a lose to promote the variable to be static.  */
        if (valid_const_initializer
            && num_nonzero_elements > 1
            && TREE_READONLY (object)
            && VAR_P (object)
            && !DECL_REGISTER (object)
            && (flag_merge_constants >= 2 || !TREE_ADDRESSABLE (object))
...

So, the #c5 patch looks wrong in this regard too.
Furthermore, we have that notify_temp_creation mode there and I think we really
don't want to clear TREE_READONLY in that case.
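
To see why clearing the flag too early defeats the promotion, the quoted conjuncts can be restated as a small stand-alone predicate. This is a toy model only: `DeclModel` and `quoted_conjuncts_hold` are hypothetical names, the struct fields merely stand in for the GCC tree macros, and the real condition continues past the "..." with terms that are not modelled here.

```cpp
// Toy stand-in for the handful of decl properties the quoted test reads.
struct DeclModel {
    bool is_var;       // VAR_P (object)
    bool readonly;     // TREE_READONLY (object)
    bool is_register;  // DECL_REGISTER (object)
    bool addressable;  // TREE_ADDRESSABLE (object)
};

// Restates only the conjuncts quoted above; anything after the "..."
// in the real condition is deliberately left out.
bool quoted_conjuncts_hold(const DeclModel &d,
                           bool valid_const_initializer,
                           int num_nonzero_elements,
                           int flag_merge_constants) {
    return valid_const_initializer
        && num_nonzero_elements > 1
        && d.readonly
        && d.is_var
        && !d.is_register
        && (flag_merge_constants >= 2 || !d.addressable);
}
```

In this model a decl whose `readonly` field has already been cleared can never pass, which is the array-temp1.C regression; an addressable decl passes only when `flag_merge_constants` is at least 2, which matches the case discussed in #c5.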

So, I think we need something like:
--- gcc/gimplify.cc.jj  2022-03-03 09:13:16.000000000 +0100
+++ gcc/gimplify.cc     2022-03-03 14:42:00.952959549 +0100
@@ -5120,6 +5120,12 @@ gimplify_init_constructor (tree *expr_p,
          {
            if (notify_temp_creation)
              return GS_OK;
+
+           /* The var will be initialized and so appear on lhs of
+              assignment, it can't be TREE_READONLY anymore.  */
+           if (VAR_P (object))
+             TREE_READONLY (object) = 0;
+
            is_empty_ctor = true;
            break;
          }
@@ -5171,6 +5177,11 @@ gimplify_init_constructor (tree *expr_p,
            break;
          }

+       /* The var will be initialized and so appear on lhs of
+          assignment, it can't be TREE_READONLY anymore.  */
+       if (VAR_P (object) && !notify_temp_creation)
+         TREE_READONLY (object) = 0;
+
        /* If there are "lots" of initialized elements, even discounting
           those that are not address constants (and thus *must* be
           computed at runtime), then partition the constructor into

or so.

* [Bug middle-end/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-03-03 13:49 ` jakub at gcc dot gnu.org
@ 2022-03-03 14:25 ` jakub at gcc dot gnu.org
  2022-03-04 14:15 ` cvs-commit at gcc dot gnu.org
  2022-03-04 14:35 ` jakub at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-03-03 14:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 52557
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52557&action=edit
gcc12-pr104529.patch

Untested fix.

* [Bug middle-end/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-03-03 14:25 ` [Bug middle-end/104529] " jakub at gcc dot gnu.org
@ 2022-03-04 14:15 ` cvs-commit at gcc dot gnu.org
  2022-03-04 14:35 ` jakub at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-03-04 14:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:c85aaf2cbe9da50e23655a8082a37166adf4c0f7

commit r12-7483-gc85aaf2cbe9da50e23655a8082a37166adf4c0f7
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Mar 4 15:14:59 2022 +0100

    gimplify: Clear TREE_READONLY on automatic vars being stored into [PR104529]

    The following testcase regressed when SRA started punting on stores to
    TREE_READONLY vars.  We document that:
    "In a VAR_DECL, PARM_DECL or FIELD_DECL, or any kind of ..._REF node,
    nonzero means it may not be the lhs of an assignment."
    so the SRA change looks desirable.  On the other side, at least in this
    testcase the TREE_READONLY is set there intentionally from the
    PR85873 fix, because gimplify_init_constructor itself uses TREE_READONLY
    on the object to determine if it can perform promotion to static const
    or not.

    So, similarly to other spots in the gimplifier where we also clear
    TREE_READONLY when we emit IL that stores into the object, this
    does the same in gimplify_init_constructor, but in the way so that
    the TREE_READONLY test for the promotion to static const keeps working
    and doesn't change anything for notify_temp_creation mode, which doesn't
    emit any IL, just tests if it would need a temporary or not.

    This keeps PR85873 testcase working as before and fixes this regression.

    2022-03-04  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/104529
            * gimplify.cc (gimplify_init_constructor): Clear TREE_READONLY
            on automatic objects which will be runtime initialized.

            * g++.dg/tree-ssa/pr104529.C: New test.
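
The ordering described in the commit message can be summarized with a toy model. Everything here is hypothetical (`DeclModel`, `Result`, `init_constructor_model`), and the model does not reproduce what gimplify_init_constructor actually returns in each mode; it only captures three points: the promotion test sees the original flag, notify_temp_creation mode leaves the flag alone, and the flag is cleared once stores into the object will be emitted.

```cpp
struct DeclModel {
    bool readonly;  // stands in for TREE_READONLY on the object
};

enum class Result { PromotedToStatic, QueryOnly, RuntimeInit };

// Toy model of the ordering; `promotable` lumps together every other
// condition of the static-const promotion test.
Result init_constructor_model(DeclModel &object,
                              bool promotable,
                              bool notify_temp_creation) {
    // 1. The promotion test runs first and still sees the original flag.
    if (promotable && object.readonly)
        return Result::PromotedToStatic;

    // 2. Query-only mode emits no IL, so the flag must stay untouched.
    if (notify_temp_creation)
        return Result::QueryOnly;

    // 3. Stores into the object follow, so it can no longer be read-only;
    //    a later pass such as SRA then sees a consistent decl.
    object.readonly = false;
    return Result::RuntimeInit;
}
```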

* [Bug middle-end/104529] [12 Regression] inefficient codegen around new/delete
  2022-02-14 15:51 [Bug target/104529] New: [missed optimization] inefficient codegen around new/delete tnfchris at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-03-04 14:15 ` cvs-commit at gcc dot gnu.org
@ 2022-03-04 14:35 ` jakub at gcc dot gnu.org
  11 siblings, 0 replies; 13+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-03-04 14:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104529

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
