public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
@ 2022-02-17 11:33 rguenth at gcc dot gnu.org
  2022-02-17 11:33 ` [Bug target/104581] [12 Regression] " rguenth at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 11:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

            Bug ID: 104581
           Summary: Huge compile-time regression building SPEC 2017
                    538.imagick_r with PGO
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Building magick/enhance.c with -Ofast -march=znver2 -fprofile-generate spends
all compile time in mdreorg:

 machine dep reorg                  : 505.62 ( 93%)   0.03 (  5%) 509.15 ( 92%)
  193k (  0%)
 TOTAL                              : 543.25          0.63        551.52       
  236M

https://lnt.opensuse.org/db_default/v4/SPEC/graph?highlight_run=23626&plot.507=14.507.8

shows similar behavior on Kabylake

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
@ 2022-02-17 11:33 ` rguenth at gcc dot gnu.org
  2022-02-17 11:47 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 11:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-* i?86-*-*
   Target Milestone|---                         |12.0
           Keywords|                            |compile-time-hog
            Summary|Huge compile-time           |[12 Regression] Huge
                   |regression building SPEC    |compile-time regression
                   |2017 538.imagick_r with PGO |building SPEC 2017
                   |                            |538.imagick_r with PGO

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
  2022-02-17 11:33 ` [Bug target/104581] [12 Regression] " rguenth at gcc dot gnu.org
@ 2022-02-17 11:47 ` rguenth at gcc dot gnu.org
  2022-02-17 11:56 ` rguenth at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 11:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-17

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Samples: 2M of event 'cycles', Event count (approx.): 2071823178483             
Overhead  Command  Shared Object                     Symbol                     
  92.82%  cc1      cc1                               [.]
ix86_avx_u128_mode_need
   0.70%  cc1      cc1                               [.]
update_conflict_hard_re

GCC 11 compiles this file in 36 seconds.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
  2022-02-17 11:33 ` [Bug target/104581] [12 Regression] " rguenth at gcc dot gnu.org
  2022-02-17 11:47 ` rguenth at gcc dot gnu.org
@ 2022-02-17 11:56 ` rguenth at gcc dot gnu.org
  2022-02-17 12:35 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 11:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suspect that

          FOR_EACH_SUBRTX (iter, array, src, NONCONST)
            if (ix86_check_avx_upper_register (*iter))
              {
                int status = ix86_avx_u128_mode_source (insn, *iter);
                if (status == AVX_U128_DIRTY)
                  return status;
              }

and ix86_avx_u128_mode_source walking defs via DF and doing for each def

        /* Check if DEF_INSN is before INSN.  */
        rtx_insn *next;
        for (next = NEXT_INSN (def_insn);
             next != nullptr && next != end && next != insn;
             next = NEXT_INSN (next))
          ;

is ending up with quadraticness.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-02-17 11:56 ` rguenth at gcc dot gnu.org
@ 2022-02-17 12:35 ` rguenth at gcc dot gnu.org
  2022-02-17 12:43 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Since targetm.mode_switching.needed is called as part of the mode-switching
dataflow problem I don't see how looking at reg refs should be needed at all.
What might be needed and what I'm not sure is available, is the mode-switching
state of uses.  The relevant call is

          FOR_BB_INSNS (bb, insn)
            {
              if (INSN_P (insn))
                {
                  int mode = targetm.mode_switching.needed (e, insn);

and indeed all the time is spent in

        /* Check if DEF_INSN is before INSN.  */
        rtx_insn *next;
        for (next = NEXT_INSN (def_insn);
             next != nullptr && next != end && next != insn;
             next = NEXT_INSN (next))
          ;

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2022-02-17 12:35 ` rguenth at gcc dot gnu.org
@ 2022-02-17 12:43 ` rguenth at gcc dot gnu.org
  2022-02-17 12:54 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
                 CC|                            |hjl.tools at gmail dot com
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
For just the special-case of AVX loads from constant zero this is a quite
expensive thing to do.

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index cf246e74e57..e4b42fbba6f 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14520,11 +14452,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
        {
          FOR_EACH_SUBRTX (iter, array, src, NONCONST)
            if (ix86_check_avx_upper_register (*iter))
-             {
-               int status = ix86_avx_u128_mode_source (insn, *iter);
-               if (status == AVX_U128_DIRTY)
-                 return status;
-             }
+             return AVX_U128_DIRTY;
        }

       /* This isn't YMM/ZMM load/store.  */

fixes this and makes the compile finish in 16s.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2022-02-17 12:43 ` rguenth at gcc dot gnu.org
@ 2022-02-17 12:54 ` jakub at gcc dot gnu.org
  2022-02-17 12:57 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-17 12:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Therefore regressed with r12-7125-g5390a2f191682dae3c6d1e1deac20e05be413514

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2022-02-17 12:54 ` jakub at gcc dot gnu.org
@ 2022-02-17 12:57 ` rguenth at gcc dot gnu.org
  2022-02-17 13:03 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-17 12:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #5)
> Therefore regressed with r12-7125-g5390a2f191682dae3c6d1e1deac20e05be413514

The insn walk was there before btw (the first jump in the LNT graph).  This
just made it even worse.  The walk was introduced with
r12-2571-g9775e465c1fbfc32656de77c618c61acf5bd905d

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2022-02-17 12:57 ` rguenth at gcc dot gnu.org
@ 2022-02-17 13:03 ` jakub at gcc dot gnu.org
  2022-02-17 13:04 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-17 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Ah, you're right.
So, can't it instead of the quadratic walk just compare DF_INSN_LUID?
If it isn't right after df_analyze and some insns could have been added in
between, it would need to maintain the luids somehow (perhaps e.g. in the way
how we do it in tree-ssa-reassoc.cc, if we add insns, we must set their uid to
either the previous or next insn's uid and then can do some IL walk, but only
as long as the uid is the same, so unless everything in the bb changes it
should be still cheap.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-02-17 13:03 ` jakub at gcc dot gnu.org
@ 2022-02-17 13:04 ` jakub at gcc dot gnu.org
  2022-02-17 13:11 ` rguenther at suse dot de
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-17 13:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Or if we have in df some way how to determine which insns have been added, just
ignore those and look for their last predecessor or first successor that isn't
dirty/without df computed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-02-17 13:04 ` jakub at gcc dot gnu.org
@ 2022-02-17 13:11 ` rguenther at suse dot de
  2022-02-18  7:50 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: rguenther at suse dot de @ 2022-02-17 13:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 17 Feb 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581
> 
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Ah, you're right.
> So, can't it instead of the quadratic walk just compare DF_INSN_LUID?
> If it isn't right after df_analyze and some insns could have been added in
> between, it would need to maintain the luids somehow (perhaps e.g. in the way
> how we do it in tree-ssa-reassoc.cc, if we add insns, we must set their uid to
> either the previous or next insn's uid and then can do some IL walk, but only
> as long as the uid is the same, so unless everything in the bb changes it
> should be still cheap.

I think this "feature" needs to be better integrated with the 
mode-switching data-flow.  It's too much bolted-on and thus has
unnecessarily high complexity.

In particular this is the initial local analysis run of mode-switching
where it just computes mode_in/mode_out of a BB (but the target hooks
do not have the pass meta data available).

The first calls to the hook are from 

  if (targetm.mode_switching.entry && targetm.mode_switching.exit)
    {
      /* Split the edge from the entry block, so that we can note that
         there NORMAL_MODE is supplied.  */
      post_entry = split_edge (single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN 
(cfun)));
      pre_exit = create_pre_exit (n_entities, entity_map, num_modes);

which even runs before df_analyze () so if the DF walk would be
triggered there it definitely looks bogus.  I also see nothing
adding the chain problem (I dn't remember which ones are added by
default though).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-02-17 13:11 ` rguenther at suse dot de
@ 2022-02-18  7:50 ` cvs-commit at gcc dot gnu.org
  2022-02-18  7:51 ` rguenth at gcc dot gnu.org
  2022-02-18 18:41 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-18  7:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:fe79d652c96b53384ddfa43e312cb0010251391b

commit r12-7293-gfe79d652c96b53384ddfa43e312cb0010251391b
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Feb 17 14:40:16 2022 +0100

    target/104581 - compile-time regression in mode-switching

    The x86 backend piggy-backs on mode-switching for insertion of
    vzeroupper.  A recent improvement there was implemented in a way
    to walk possibly the whole basic-block for all DF reg def definitions
    in its mode_needed hook which is called for each instruction in
    a basic-block during mode-switching local analysis.

    The following mostly reverts this improvement.  It needs to be
    re-done in a way more consistent with a local dataflow which
    probably means making targets aware of the state of the local
    dataflow analysis.

    2022-02-17  Richard Biener  <rguenther@suse.de>

            PR target/104581
            * config/i386/i386.cc (ix86_avx_u128_mode_source): Remove.
            (ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY instead
            of calling ix86_avx_u128_mode_source which would eventually
            have returned AVX_U128_ANY in some very special case.

            * gcc.target/i386/pr101456-1.c: XFAIL.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-02-18  7:50 ` cvs-commit at gcc dot gnu.org
@ 2022-02-18  7:51 ` rguenth at gcc dot gnu.org
  2022-02-18 18:41 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-18  7:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/104581] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with PGO
  2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2022-02-18  7:51 ` rguenth at gcc dot gnu.org
@ 2022-02-18 18:41 ` cvs-commit at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-18 18:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104581

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:1931cbad498e625b1e24452dcfffe02539b12224

commit r12-7295-g1931cbad498e625b1e24452dcfffe02539b12224
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri Feb 18 10:36:53 2022 -0800

    pieces-memset-21.c: Expect vzeroupper for ia32

    Update gcc.target/i386/pieces-memset-21.c to expect vzeroupper for ia32
    caused by

    commit fe79d652c96b53384ddfa43e312cb0010251391b
    Author: Richard Biener <rguenther@suse.de>
    Date:   Thu Feb 17 14:40:16 2022 +0100

        target/104581 - compile-time regression in mode-switching

            PR target/104581
            * gcc.target/i386/pieces-memset-21.c: Expect vzeroupper for ia32.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-02-18 18:41 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-17 11:33 [Bug target/104581] New: Huge compile-time regression building SPEC 2017 538.imagick_r with PGO rguenth at gcc dot gnu.org
2022-02-17 11:33 ` [Bug target/104581] [12 Regression] " rguenth at gcc dot gnu.org
2022-02-17 11:47 ` rguenth at gcc dot gnu.org
2022-02-17 11:56 ` rguenth at gcc dot gnu.org
2022-02-17 12:35 ` rguenth at gcc dot gnu.org
2022-02-17 12:43 ` rguenth at gcc dot gnu.org
2022-02-17 12:54 ` jakub at gcc dot gnu.org
2022-02-17 12:57 ` rguenth at gcc dot gnu.org
2022-02-17 13:03 ` jakub at gcc dot gnu.org
2022-02-17 13:04 ` jakub at gcc dot gnu.org
2022-02-17 13:11 ` rguenther at suse dot de
2022-02-18  7:50 ` cvs-commit at gcc dot gnu.org
2022-02-18  7:51 ` rguenth at gcc dot gnu.org
2022-02-18 18:41 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).