public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/107250] New: Load unnecessarily happens before malloc
@ 2022-10-13 14:54 jmuizelaar at mozilla dot com
  2022-10-13 16:04 ` [Bug tree-optimization/107250] " amonakov at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: jmuizelaar at mozilla dot com @ 2022-10-13 14:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250

            Bug ID: 107250
           Summary: Load unnecessarily happens before malloc
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jmuizelaar at mozilla dot com
  Target Milestone: ---

The following code is compiled with the load of 'next' happening before the
call to malloc. This generates worse code than if the load were delayed until
after the call to malloc.


#include <stdlib.h>

struct Foo {
    Foo* next;
};

void ctx_push(Foo* f) {
    Foo tmp = { f->next };
    Foo *n = (Foo*)malloc(sizeof(Foo));
    *n = tmp;
    f->next = n;
}

Manually moving the load in this example improves the generated code:

#include <stdlib.h>

struct Foo {
    Foo* next;
};

void ctx_push(Foo* f) {
    Foo *n = (Foo*)malloc(sizeof(Foo));
    Foo tmp = { f->next };
    *n = tmp;
    f->next = n;
}

https://gcc.godbolt.org/z/TnMj1c636
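For reference, the two orderings are observably equivalent (malloc cannot read
or write *f), which is why the compiler is free to schedule the load on either
side of the call. A minimal harness checking that both versions splice the new
node the same way (function names here are mine, not from the report):

```cpp
#include <cstdlib>

struct Foo {
    Foo* next;
};

// Ordering from the report: load of f->next before the call.
void ctx_push_before(Foo* f) {
    Foo tmp = { f->next };
    Foo* n = (Foo*)std::malloc(sizeof(Foo));
    *n = tmp;
    f->next = n;
}

// Manually reordered: load of f->next after the call.
void ctx_push_after(Foo* f) {
    Foo* n = (Foo*)std::malloc(sizeof(Foo));
    Foo tmp = { f->next };
    *n = tmp;
    f->next = n;
}
```

Both functions insert a fresh node between f and its old successor; only the
generated prologue/epilogue differs.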

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug tree-optimization/107250] Load unnecessarily happens before malloc
  2022-10-13 14:54 [Bug tree-optimization/107250] New: Load unnecessarily happens before malloc jmuizelaar at mozilla dot com
@ 2022-10-13 16:04 ` amonakov at gcc dot gnu.org
  2022-10-14  7:37 ` [Bug target/107250] " rguenth at gcc dot gnu.org
  2022-10-14  7:55 ` amonakov at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: amonakov at gcc dot gnu.org @ 2022-10-13 16:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
On the other hand, dispatching the load before malloc is useful if you expect
it to miss in the caches. If you wrote the code with that in mind, and the
compiler moved the load anyway, a manual workaround to *that* would be more
invasive.
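A sketch of one way to keep the early dispatch explicit while leaving the
dependent load after the call: GCC's __builtin_prefetch issues the fetch
without creating a value live across the call. (This variant is mine, not
from the thread; whether it actually helps depends on the cache behavior.)

```cpp
#include <cstdlib>

struct Foo {
    Foo* next;
};

void ctx_push(Foo* f) {
    __builtin_prefetch(&f->next);             // start fetching f's line early
    Foo* n = (Foo*)std::malloc(sizeof(Foo));
    n->next = f->next;                        // load placed after the call
    f->next = n;
}
```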


* [Bug target/107250] Load unnecessarily happens before malloc
  2022-10-13 14:54 [Bug tree-optimization/107250] New: Load unnecessarily happens before malloc jmuizelaar at mozilla dot com
  2022-10-13 16:04 ` [Bug tree-optimization/107250] " amonakov at gcc dot gnu.org
@ 2022-10-14  7:37 ` rguenth at gcc dot gnu.org
  2022-10-14  7:55 ` amonakov at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-10-14  7:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target
            Version|unknown                     |13.0
   Last reconfirmed|                            |2022-10-14
             Status|UNCONFIRMED                 |NEW
             Target|                            |x86_64-*-*
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The question is _why_ we generate worse code ... looks like pro/epilogue
generation differs:

@@ -11,15 +11,19 @@
 Attempting shrink-wrapping optimization.
 Block 2 needs prologue due to insn 2:
 (insn 2 4 3 2 (set (reg/v/f:DI 3 bx [orig:84 f ] [84])
-        (reg:DI 5 di [87])) "t.c":4:23 82 {*movdi_internal}
+        (reg:DI 5 di [86])) "t.c":4:23 82 {*movdi_internal}
      (nil))
 After wrapping required blocks, PRO is now 2
 Avoiding non-duplicatable blocks, PRO is now 2
 Bumping back to anticipatable blocks, PRO is now 2
...
     1: NOTE_INSN_DELETED
     4: NOTE_INSN_BASIC_BLOCK 2
-   18: [--sp:DI]=bx:DI
-   19: NOTE_INSN_PROLOGUE_END
+   18: [--sp:DI]=bp:DI
+   19: [--sp:DI]=bx:DI
+   20: {sp:DI=sp:DI-0x8;clobber flags:CC;clobber [scratch];}
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI-0x8
+   21: NOTE_INSN_PROLOGUE_END
     2: bx:DI=di:DI
     3: NOTE_INSN_FUNCTION_BEG
-    6: di:DI=0x8
-    7: ax:DI=call [`malloc'] argc:0
+    6: bp:DI=[bx:DI]
+    7: di:DI=0x8
+    8: ax:DI=call [`malloc'] argc:0
       REG_CALL_DECL `malloc'
       REG_EH_REGION 0
-   10: dx:DI=[bx:DI]
-      REG_EQUIV [bx:DI]
-   11: [ax:DI]=dx:DI
+   11: [ax:DI]=bp:DI
    12: [bx:DI]=ax:DI
-   20: NOTE_INSN_EPILOGUE_BEG
-   21: bx:DI=[sp:DI++]
+   22: NOTE_INSN_EPILOGUE_BEG
+   23: {sp:DI=sp:DI+0x8;clobber flags:CC;clobber [scratch];}
       REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
-   22: simple_return
-   25: barrier
+   24: bx:DI=[sp:DI++]
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
+   25: bp:DI=[sp:DI++]
+      REG_CFA_ADJUST_CFA sp:DI=sp:DI+0x8
+   26: simple_return
+   29: barrier
    17: NOTE_INSN_DELETED


* [Bug target/107250] Load unnecessarily happens before malloc
  2022-10-13 14:54 [Bug tree-optimization/107250] New: Load unnecessarily happens before malloc jmuizelaar at mozilla dot com
  2022-10-13 16:04 ` [Bug tree-optimization/107250] " amonakov at gcc dot gnu.org
  2022-10-14  7:37 ` [Bug target/107250] " rguenth at gcc dot gnu.org
@ 2022-10-14  7:55 ` amonakov at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: amonakov at gcc dot gnu.org @ 2022-10-14  7:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250

--- Comment #3 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Well, obviously because in one function both 'f' and 'tmp' are live across the
call, while in the other function only 'f' is. The difference is literally
pushing one register vs. two, plus an extra 8 bytes to preserve the
ABI-mandated 16-byte stack alignment.
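Back-of-envelope arithmetic for that cost (assumptions mine: x86-64 SysV ABI,
8-byte pushes, stack 16-byte aligned at each call site; the helper name is
hypothetical):

```cpp
// Stack bytes a prologue must allocate so that the stack is 16-byte
// aligned at the call to malloc, given N callee-saved register pushes.
int stack_bytes_for(int callee_saved_pushes) {
    int bytes = 8 /* return address */ + 8 * callee_saved_pushes;
    int padded = (bytes + 15) / 16 * 16;  // round up to 16-byte alignment
    return padded - 8;                    // exclude the return address itself
}
```

One push (bx only) needs 8 bytes and no padding; two pushes (bx and bp) need
16 more: the second push plus 8 bytes of alignment padding, matching the diff
in comment #2.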


end of thread, other threads:[~2022-10-14  7:55 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-13 14:54 [Bug tree-optimization/107250] New: Load unnecessarily happens before malloc jmuizelaar at mozilla dot com
2022-10-13 16:04 ` [Bug tree-optimization/107250] " amonakov at gcc dot gnu.org
2022-10-14  7:37 ` [Bug target/107250] " rguenth at gcc dot gnu.org
2022-10-14  7:55 ` amonakov at gcc dot gnu.org
