[Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

public inbox for gcc-bugs@sourceware.org

From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22
Date: Thu, 27 Jan 2022 07:42:49 +0000
Message-ID: <bug-102178-4-UtgVfeOvr6@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102178-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |rtl-optimization
                 CC|                            |vmakarov at gcc dot gnu.org
           Keywords|                            |ra

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
I see a lot more GPR <-> XMM moves in the 'after' case:

    1035 :   401c8b: vaddsd %xmm1,%xmm0,%xmm0
    1953 :   401c8f: vmovq  %rcx,%xmm1
     305 :   401c94: vaddsd %xmm8,%xmm1,%xmm1
    3076 :   401c99: vmovq  %xmm0,%r14
     590 :   401c9e: vmovq  %r11,%xmm0
     267 :   401ca3: vmovq  %xmm1,%r8
     136 :   401ca8: vmovq  %rdx,%xmm1
     448 :   401cad: vaddsd %xmm1,%xmm0,%xmm1
    1703 :   401cb1: vmovq  %xmm1,%r9   (*)
     834 :   401cb6: vmovq  %r8,%xmm1
    1719 :   401cbb: vmovq  %r9,%xmm0   (*)
    2782 :   401cc0: vaddsd %xmm0,%xmm1,%xmm1
   22135 :   401cc4: vmovsd %xmm1,%xmm1,%xmm0
    1261 :   401cc8: vmovq  %r14,%xmm1
     646 :   401ccd: vaddsd %xmm0,%xmm1,%xmm0
   18136 :   401cd1: vaddsd %xmm2,%xmm5,%xmm1
     629 :   401cd5: vmovq  %xmm1,%r8
     142 :   401cda: vaddsd %xmm6,%xmm3,%xmm1
     177 :   401cde: vmovq  %xmm0,%r14
     288 :   401ce3: vmovq  %xmm1,%r9
     177 :   401ce8: vmovq  %r8,%xmm1
     174 :   401ced: vmovq  %r9,%xmm0

Those look like RA / spilling artifacts, and IIRC I saw Hongtao posting
patches to regcprop in this area, I think?  The above is definitely
bad; for example, (*) seems to swap %xmm0 and %xmm1 via %r9.
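
To make the cost of the (*) pair concrete, here is a minimal C sketch of what
such a round trip does semantically. The helper names are hypothetical and the
model assumes a 64-bit IEEE double; it only illustrates that the two vmovq
instructions carry a value from %xmm1 to %xmm0 through %r9 without changing
any bits, so they do no arithmetic work.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is 64 bits wide, as on x86-64. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Models `vmovq %xmmN, %rM`: copy the bits of a double into an integer. */
static uint64_t xmm_to_gpr(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Models `vmovq %rM, %xmmN`: copy the same bits back into a double. */
static double gpr_to_xmm(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Models the (*) pair: the value leaves %xmm1 and arrives in %xmm0 via %r9. */
static double move_via_gpr(double xmm1)
{
    uint64_t r9 = xmm_to_gpr(xmm1);  /* vmovq %xmm1,%r9 */
    double xmm0 = gpr_to_xmm(r9);    /* vmovq %r9,%xmm0 */
    return xmm0;
}
```

The result is bit-identical to the input, which is why a direct
register-to-register move (or no move at all) would have been preferable here.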

The function is LBM_performStreamCollide.  The sinking pass does nothing wrong:
it moves the unconditionally executed statements

-  _948 = _861 + _867;
-  _957 = _944 + _948;
-  _912 = _861 + _873;
...
-  _981 = _853 + _865;
-  _989 = _977 + _981;
-  _916 = _853 + _857;
-  _924 = _912 + _916;

into a conditionally executed block.  But that increases register pressure
by 5 FP regs (if I counted correctly) in that area.  So this would be the
usual issue of GIMPLE transforms not being register-pressure aware.
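
A minimal C sketch of the effect, using hypothetical function and variable
names rather than the actual lbm.c source: both functions compute the same
result, but in the second one the additions have been sunk into the
conditional block.  Assuming the operands are not needed elsewhere, the first
form only has to keep two sums live up to the branch, while the second has to
keep all four operands live.  Whether pressure really goes up depends on which
operands are still live for other reasons.

```c
#include <assert.h>

/* Before sinking: the sums are computed unconditionally, so a, b, c and d
   can die early and only s1 and s2 stay live until the branch. */
static double collide_before(double a, double b, double c, double d, int cond)
{
    double s1 = a + b;
    double s2 = c + d;
    double result = 0.0;
    if (cond)
        result = s1 * s2;
    return result;
}

/* After sinking: the additions run only when cond is true, but a, b, c and d
   must all stay live until the branch. */
static double collide_after(double a, double b, double c, double d, int cond)
{
    double result = 0.0;
    if (cond) {
        double s1 = a + b;
        double s2 = c + d;
        result = s1 * s2;
    }
    return result;
}
```

Sinking saves the additions on the path where the condition is false, which is
why the transform looks profitable at the GIMPLE level even though it lengthens
live ranges.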

-fschedule-insns -fsched-pressure seems to be able to somewhat mitigate this
(though I think EBB scheduling cannot undo such movement).

In postreload I see transforms like

-(insn 466 410 411 7 (set (reg:DF 0 ax [530])
-        (mem/u/c:DF (symbol_ref/u:DI ("*.LC10") [flags 0x2]) [0  S8 A64])) "lbm.c":241:5 141 {*movdf_internal}
-     (expr_list:REG_EQUAL (const_double:DF 9.939744999999999830464503247640095651149749755859375e-1 [0x0.fe751ce28ed5fp+0])
-        (nil)))
-(insn 411 466 467 7 (set (reg:DF 25 xmm5 [orig:123 prephitmp_643 ] [123])
+(insn 411 410 467 7 (set (reg:DF 25 xmm5 [orig:123 prephitmp_643 ] [123])
         (reg:DF 0 ax [530])) "lbm.c":241:5 141 {*movdf_internal}
      (nil))

which suggests we could have reloaded %xmm5 directly from .LC10.  But the
spilling to GPRs seems to be present already after LRA, and cprop_hardreg
doesn't do anything bad either.

The differences can be seen on trunk with -Ofast -march=znver2
[-fdisable-tree-sink2].

We have X86_TUNE_INTER_UNIT_MOVES_TO_VEC/X86_TUNE_INTER_UNIT_MOVES_FROM_VEC
and the interesting thing is that when I disable them I do see some
spilling to the stack but also quite a few re-materialized constants
(loads from .LC*, as seen in the opportunity above).

It might be interesting to benchmark with
-mtune-ctrl=^inter_unit_moves_from_vec,^inter_unit_moves_to_vec and to find a
way to set the costs so that IRA/LRA prefer re-materialization of constants
from the constant pool over spilling to GPRs (if that's possible at all -
Vlad?)
