public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107263] New: Memcpy not elided when initializing struct
@ 2022-10-14 13:13 jmuizelaar at mozilla dot com
  2022-10-14 15:45 ` [Bug tree-optimization/107263] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: jmuizelaar at mozilla dot com @ 2022-10-14 13:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

            Bug ID: 107263
           Summary: Memcpy not elided when initializing struct
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jmuizelaar at mozilla dot com
  Target Milestone: ---

With the following code

struct Foo {
    Foo* next;
    char arr[580];
};

void ctx_push(Foo* f) {
    Foo tmp = { f->next };
    *f = tmp;
}

Clang is able to generate code that just memsets `arr`. GCC instead initializes
the entire struct on the stack and then copies it into `f`.

https://gcc.godbolt.org/z/Yzcbs4G71
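
For illustration, the effect described above can be written out by hand. This is only a sketch of the desired result, not the code either compiler emits. The names `ctx_push_manual` and `same_result` are hypothetical and do not appear in the report.

```cpp
#include <cassert>
#include <cstring>

struct Foo {
    Foo* next;
    char arr[580];
};

// Form from the report: build a temporary, then copy it over *f.
void ctx_push(Foo* f) {
    Foo tmp = { f->next };
    *f = tmp;
}

// Hand-written equivalent (hypothetical name): f->next already holds the
// value it would be assigned, so only arr has to be zeroed.
void ctx_push_manual(Foo* f) {
    std::memset(f->arr, 0, sizeof f->arr);
}

// Hypothetical helper: runs both versions on identically dirtied objects
// and reports whether the named members end up equal.
bool same_result() {
    Foo a, b;
    std::memset(a.arr, 0x5a, sizeof a.arr);
    std::memset(b.arr, 0x5a, sizeof b.arr);
    a.next = &a;
    b.next = &a;
    ctx_push(&a);
    ctx_push_manual(&b);
    return a.next == b.next &&
           std::memcmp(a.arr, b.arr, sizeof a.arr) == 0;
}
```

Calling `assert(same_result());` from a small driver checks that the two forms agree on `next` and `arr`. Padding bytes are deliberately not compared.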


* [Bug tree-optimization/107263] Memcpy not elided when initializing struct
  2022-10-14 13:13 [Bug tree-optimization/107263] New: Memcpy not elided when initializing struct jmuizelaar at mozilla dot com
@ 2022-10-14 15:45 ` pinskia at gcc dot gnu.org
  2022-10-17  7:43 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-10-14 15:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization


* [Bug tree-optimization/107263] Memcpy not elided when initializing struct
  2022-10-14 13:13 [Bug tree-optimization/107263] New: Memcpy not elided when initializing struct jmuizelaar at mozilla dot com
  2022-10-14 15:45 ` [Bug tree-optimization/107263] " pinskia at gcc dot gnu.org
@ 2022-10-17  7:43 ` rguenth at gcc dot gnu.org
  2022-10-17 13:28 ` jmuizelaar at mozilla dot com
  2024-03-19 17:27 ` hiraditya at msn dot com
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-17  7:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-10-17
            Version|unknown                     |13.0
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |jamborm at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  The frontend leaves us with

  <<cleanup_point   struct Foo tmp = {};>>;
  <<cleanup_point <<< Unknown tree: expr_stmt
    tmp.next = f->next >>>>>;
  <<cleanup_point <<< Unknown tree: expr_stmt
    (void) (*NON_LVALUE_EXPR <f> = *(const struct Foo &) &tmp) >>>>>;

and ESRA sees

  <bb 2> :
  tmp = {};
  _1 = f_4(D)->next;
  tmp.next = _1;
  *f_4(D) = tmp;
  tmp ={v} {CLOBBER(eol)};
  return;

ESRA somewhat senselessly does

  <bb 2> :
  tmp = {};
  tmp$next_8 = 0B;
  _1 = f_4(D)->next;
  tmp$next_9 = _1;
  tmp.next = tmp$next_9;
  *f_4(D) = tmp;
  tmp ={v} {CLOBBER(eol)};
  return;

It doesn't scalarize the array because that's too large.  I would guess
that Clang doesn't split the initializer and thus its aggregate copy
propagation somehow manages to elide 'tmp'.  We don't have a good
place to perform the desired optimization, and the split
initialization of 'tmp' certainly complicates things.

In principle it would be SRA's job, since I think it does most of the
necessary analysis; it just lacks knowledge of how to re-materialize
*f_4(D) efficiently at the point of the aggregate assignment.
It has

Candidate (2384): tmp
Too big to totally scalarize: tmp (UID: 2384)
Created a replacement for tmp offset: 0, size: 64: tmp$nextD.2425

Access trees for tmp (UID: 2384):
access { base = (2384)'tmp', offset = 0, size = 4736, expr = tmp, type = struct
Foo, reverse = 0, grp_read = 1, grp_write = 1, grp_assignment_read = 1,
grp_assignment_write = 1, grp_scalar_read = 0, grp_scalar_write = 0,
grp_total_scalarization = 0, grp_hint = 0, grp_covered = 0,
grp_unscalarizable_region = 0, grp_unscalarized_data = 1, grp_same_access_path
= 1, grp_partial_lhs = 0, grp_to_be_replaced = 0, grp_to_be_debug_replaced = 0}
* access { base = (2384)'tmp', offset = 0, size = 64, expr = tmp.next, type =
struct Foo *, reverse = 0, grp_read = 1, grp_write = 1, grp_assignment_read =
1, grp_assignment_write = 1, grp_scalar_read = 0, grp_scalar_write = 1,
grp_total_scalarization = 0, grp_hint = 0, grp_covered = 1,
grp_unscalarizable_region = 0, grp_unscalarized_data = 0, grp_same_access_path
= 1, grp_partial_lhs = 0, grp_to_be_replaced = 1, grp_to_be_debug_replaced = 0}

but it fails to record that the size 4736 write is a clear, which is
cheap to re-materialize (no variables need to be created).  SRA
could probably track writes from only constants that way, avoiding the
creation of scalar replacements.
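
At the source level, the re-materialization described in this comment would roughly turn the split initialization into a clear of the destination followed by one scalar store. The sketch below only illustrates that before/after shape. It is not how SRA is implemented, and the names `ctx_push_split`, `ctx_push_rematerialized` and `forms_agree` are hypothetical.

```cpp
#include <cassert>
#include <cstring>

struct Foo {
    Foo* next;
    char arr[580];
};

// Source-level picture of what ESRA sees: a clear of tmp, a scalar store
// into tmp.next, then an aggregate copy into *f.
void ctx_push_split(Foo* f) {
    Foo tmp = {};
    tmp.next = f->next;
    *f = tmp;
}

// Same effect with the clear re-materialized directly in the destination,
// so no aggregate temporary is copied.
void ctx_push_rematerialized(Foo* f) {
    Foo* next = f->next;  // load first: the clear below overwrites f->next
    *f = Foo{};           // the constant write, replayed on *f
    f->next = next;       // the one non-constant piece
}

// Hypothetical helper: checks that both forms leave the same member values.
bool forms_agree() {
    Foo a, b;
    std::memset(a.arr, 0x5a, sizeof a.arr);
    std::memset(b.arr, 0x5a, sizeof b.arr);
    a.next = &b;
    b.next = &b;
    ctx_push_split(&a);
    ctx_push_rematerialized(&b);
    return a.next == b.next &&
           std::memcmp(a.arr, b.arr, sizeof a.arr) == 0;
}
```

The load of `f->next` has to come before the clear because source and destination are the same object. In the GIMPLE above, that ordering is already given by `_1 = f_4(D)->next` preceding the aggregate assignment.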


* [Bug tree-optimization/107263] Memcpy not elided when initializing struct
  2022-10-14 13:13 [Bug tree-optimization/107263] New: Memcpy not elided when initializing struct jmuizelaar at mozilla dot com
  2022-10-14 15:45 ` [Bug tree-optimization/107263] " pinskia at gcc dot gnu.org
  2022-10-17  7:43 ` rguenth at gcc dot gnu.org
@ 2022-10-17 13:28 ` jmuizelaar at mozilla dot com
  2024-03-19 17:27 ` hiraditya at msn dot com
  3 siblings, 0 replies; 5+ messages in thread
From: jmuizelaar at mozilla dot com @ 2022-10-17 13:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

--- Comment #2 from Jeff Muizelaar <jmuizelaar at mozilla dot com> ---
Even for small arrays clang does a noticeably better job:
https://gcc.godbolt.org/z/4d3TjGazY


* [Bug tree-optimization/107263] Memcpy not elided when initializing struct
  2022-10-14 13:13 [Bug tree-optimization/107263] New: Memcpy not elided when initializing struct jmuizelaar at mozilla dot com
                   ` (2 preceding siblings ...)
  2022-10-17 13:28 ` jmuizelaar at mozilla dot com
@ 2024-03-19 17:27 ` hiraditya at msn dot com
  3 siblings, 0 replies; 5+ messages in thread
From: hiraditya at msn dot com @ 2024-03-19 17:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107263

AK <hiraditya at msn dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hiraditya at msn dot com

--- Comment #3 from AK <hiraditya at msn dot com> ---
Seems like a duplicate of #59863 ?
