[Bug rtl-optimization/88770] Redundant load opt. or CSE pessimizes code

From: peter at cordes dot ca @ 2021-06-05  6:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #2 from Peter Cordes <peter at cordes dot ca> ---
Note that mov r64, imm64 is a 10-byte instruction, and can be slow to read from
the uop-cache on Sandybridge-family.

The crap involving OR is clearly sub-optimal, but *if* you already have two
spare call-preserved registers across this call, the following is actually
smaller code-size:

        movabs  rdi, 21474836483
        mov     rbp, rdi
        movabs  rsi, 39743127552
        mov     rbx, rsi
        call    test
        mov     rdi, rbp
        mov     rsi, rbx
        call    test

This is more total uops for the back-end though (movabs is still single-uop,
but takes 2 entries in the uop cache on Sandybridge-family;
https://agner.org/optimize/).  So saving x86 machine-code size this way does
limit the ability of out-of-order exec to see farther, if the front-end isn't
the bottleneck.  And it's highly unlikely to be worth saving/restoring two regs
to enable this.  (Or to push rdi / push rsi before call, then pop after!)
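
To make the code-size claim concrete, here is a rough byte tally for the two
alternatives. It is only a sketch: it assumes the usual x86-64 encodings
(10 bytes for mov r64, imm64; 3 bytes for a 64-bit register-to-register mov;
5 bytes for a direct call rel32), and the function names are made up for
illustration. A call through the GOT/PLT, or any prologue/epilogue code needed
to save rbp and rbx, would change the totals.

```c
/* Assumed x86-64 instruction lengths in bytes. */
enum {
    MOVABS_R64_IMM64 = 10, /* REX.W + opcode + 8-byte immediate */
    MOV_R64_R64      = 3,  /* REX.W + opcode + ModRM */
    CALL_REL32       = 5   /* opcode + 4-byte displacement */
};

/* Sequence from the comment: build each constant once, keep a copy in a
 * call-preserved register, and copy it back before the second call. */
int bytes_with_saved_copies(void)
{
    return 2 * MOVABS_R64_IMM64   /* movabs rdi, ... / movabs rsi, ... */
         + 2 * MOV_R64_R64        /* copies into rbp and rbx */
         + CALL_REL32
         + 2 * MOV_R64_R64        /* copies back into rdi and rsi */
         + CALL_REL32;
}

/* Alternative: rebuild both constants with movabs before each call. */
int bytes_with_rematerialization(void)
{
    return 2 * (2 * MOVABS_R64_IMM64 + CALL_REL32);
}
```

Under those assumptions the saved-copy version comes to 42 bytes against 50
for rematerializing, at the cost of six register-setup instructions instead of
four.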

Setting up the wrong value and then fixing it twice with OR is obviously
terrible and never has any advantage, but the general idea to CSE large
constants isn't totally crazy.  (But it's profitable only in such limited cases
that it might not be worth looking for, especially if it's only helpful at -Os.)


From: crazylht at gmail dot com @ 2021-06-07  5:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
Wouldn't pass_store_merging be a better place to handle this optimization?
Currently store-merging only merges .a and .b; it fails to merge .c and .d.

Dump from 202t.store-merging:

void caller ()
{
  struct guu D.4030;
  struct guu D.4029;

  <bb 2> [local count: 1073741824]:
  MEM <unsigned long> [(int *)&D.4029] = 21474836483;
  D.4029.c = 7.0e+0;
  D.4029.d = 9;
  test (D.4029);
  MEM <unsigned long> [(int *)&D.4030] = 21474836483;
  D.4030.c = 7.0e+0;
  D.4030.d = 9;
  test (D.4030);
  D.4029 ={v} {CLOBBER};
  D.4030 ={v} {CLOBBER};
  return;
}
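
As a sanity check on what a full merge would produce, the two 64-bit
immediates seen in this thread can be rebuilt from the field values. This is a
sketch under assumptions: the layout of struct guu is not shown here, so it
assumes two 32-bit ints (a, b) in the first eight bytes, then a 32-bit IEEE
float (c) and a one-byte integer (d) at offset 12, on a little-endian target,
with padding bytes taken as zero. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");

/* First int in the low 32 bits, second int in the high 32 bits. */
uint64_t pack_int_pair(int32_t lo, int32_t hi)
{
    return ((uint64_t)(uint32_t)hi << 32) | (uint32_t)lo;
}

/* Float bit pattern in the low 32 bits, the byte-sized field at byte 4,
 * remaining (padding) bytes zero. */
uint64_t pack_float_byte(float f, unsigned char b)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read float bits */
    return ((uint64_t)b << 32) | bits;
}
```

With a = 3 and b = 5, pack_int_pair(3, 5) gives 21474836483, the constant that
store-merging already emits. pack_float_byte(7.0f, 9) gives 39743127552, which
matches the second movabs immediate in comment #2 and is what merging .c and
.d would store.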

