public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/108738] New: compile-time and memory-hog in mdreorg
@ 2023-02-09  8:10 rguenth at gcc dot gnu.org
  2023-02-09  8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09  8:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

            Bug ID: 108738
           Summary: compile-time and memory-hog in mdreorg
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The compiler.i testcase from PR26854 at -O2 exhibits

 machine dep reorg                  : 550.15 ( 71%)   0.00 (  0%) 550.16 ( 70%)
   24  (  0%)
 TOTAL                              : 770.35         14.29        784.73       
 1475M
770.35user 14.33system 13:04.78elapsed 99%CPU (0avgtext+0avgdata
27915760maxresident)k
0inputs+18424outputs (0major+26040556minor)pagefaults 0swaps

I've fixed compile-time issues in that in the past but it seems another such
issue has re-surfaced.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
@ 2023-02-09  8:47 ` rguenth at gcc dot gnu.org
  2023-02-09  9:11 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09  8:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-*

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
A perf report isn't very conclusive, but at least points at STV:

Samples: 3M of event 'cycles:u', Event count (approx.): 4455606089800           
     Samples  Command  Shared Object       Symbol                               
     2153935  cc1      cc1                 [.] bitmap_bit_p                     
      288315  cc1      cc1                 [.] (anonymous
namespace)::scalar_chain::analyze_register_chain
      145980  cc1      cc1                 [.] bitmap_equal_p                   
       85953  cc1      cc1                 [.] pop_scope                        
       70985  cc1      cc1                 [.] bitmap_ior_into                  
       87675  cc1      cc1                 [.] df_chain_create

with -O2 -mno-stv we get to

 machine dep reorg                  :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 TOTAL                              : 202.90          8.46        211.41       
 1475M

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
  2023-02-09  8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
@ 2023-02-09  9:11 ` rguenth at gcc dot gnu.org
  2023-02-09  9:24 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09  9:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2023-02-09
     Ever confirmed|0                           |1
      Known to fail|                            |10.4.0
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Happens at least since GCC 10.

There's a comment already:

  /* ???  The following is quadratic since analyze_register_chain
     iterates over all refs to look for dual-mode regs.  Instead this
     should be done separately for all regs mentioned in the chain once.  */
  df_ref ref;
  for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
    if (!HARD_REGISTER_P (DF_REF_REG (ref)))
      analyze_register_chain (candidates, ref);

but the walk also adds to the queue, so it's more interwinded and the suggested
solution isn't the proper.  Instead it might be possible to just record
already visited regs.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
  2023-02-09  8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
  2023-02-09  9:11 ` rguenth at gcc dot gnu.org
@ 2023-02-09  9:24 ` rguenth at gcc dot gnu.org
  2023-02-09 12:48 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09  9:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
That brings it down to

 machine dep reorg                  :  87.09 ( 28%)

let me see if there's something else obvious to do.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-02-09  9:24 ` rguenth at gcc dot gnu.org
@ 2023-02-09 12:48 ` rguenth at gcc dot gnu.org
  2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-02-09 12:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The odd thing is that DF_REF_CHAIN of a USE ref contains _all_ definitions of
the pseudo while DF_REF_CHAIN of a DEF ref contains only reaching uses!?

Probably an artifact of how df_chain_create_bb_process_use operates.

That makes this much harder to fix ...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-02-09 12:48 ` rguenth at gcc dot gnu.org
@ 2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
  2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-15  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:ec23e9e25eb64bb066dc408fd498861b8587bec8

commit r13-5994-gec23e9e25eb64bb066dc408fd498861b8587bec8
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Feb 9 11:03:25 2023 +0100

    target/108738 - STV bitmap operations compile-time hog

    When the set of candidates becomes very large then repeated
    bit checks on it during the build of an actual chain can become
    slow because of the O(n) nature of bitmap tests.  The following
    switches the candidates bitmaps to the tree representation before
    building the chains to get O(log n) amortized behavior.

    For the testcase at hand this improves STV time by 50%.

            PR target/108738
            * config/i386/i386-features.cc (convert_scalars_to_vector):
            Switch candidates bitmaps to tree view before building the chains.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
@ 2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
  2023-03-02 11:38 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-02-15  7:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:e1dfac7e71056e879f101fef1c5ecb8ff6be1a1f

commit r13-5995-ge1dfac7e71056e879f101fef1c5ecb8ff6be1a1f
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Feb 9 13:40:43 2023 +0100

    target/108738 - optimize bit operations in STV

    The following does low-hanging optimizations, combining bitmap
    test and set and removing redundant operations.

            PR target/108738
            * config/i386/i386-features.cc (scalar_chain::add_to_queue):
            Combine bitmap test and set.
            (scalar_chain::add_insn): Likewise.
            (scalar_chain::analyze_register_chain): Remove redundant
            attempt to add to queue and instead strengthen assert.
            Sink common attempts to mark the def dual-mode.
            (scalar_chain::add_to_queue): Remove redundant insn bitmap
            check.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
@ 2023-03-02 11:38 ` rguenth at gcc dot gnu.org
  2023-03-03  7:31 ` cvs-commit at gcc dot gnu.org
  2023-03-03  7:33 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-02 11:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it's now better than before but still quadratic.

Finding a strathegic place to limit the search with some --param might be a
solution, but there's no easy
point to hook that into.  You'd not want to disable the whole pass but
terminate the greedy search and axe the candidates sofar processed (to not run
into the same ones again), which might then result in "odd" STV decisions if
the remains are picked up.  To avoid this maybe maintain a "too big" set of
candidates
and if a further greedy search lands at a insn in that set, axe that search as
well.  Note it's not the size of the set but the complexity of the search that
needs limiting, so count the number of ref visits through
analyze_register_chain
for an invocation of scalar_chain::build and limit that to some --param.

I'm trying to prototype that.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-03-02 11:38 ` rguenth at gcc dot gnu.org
@ 2023-03-03  7:31 ` cvs-commit at gcc dot gnu.org
  2023-03-03  7:33 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-03-03  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:6010189923908501ca5b02bd1f4aee05d2283118

commit r13-6439-g6010189923908501ca5b02bd1f4aee05d2283118
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Mar 2 13:43:51 2023 +0100

    target/108738 - limit STV chain discovery

    The following puts a hard limit on the inherently quadratic STV chain
    discovery.  Without a limit for the compiler.i testcase in PR26854
    we see at -O2

     machine dep reorg                  : 574.45 ( 53%)

    with release checking while with the proposed limit it's

     machine dep reorg                  :   2.86 (  1%)

            PR target/108738
            * config/i386/i386.opt (--param x86-stv-max-visits): New param.
            * doc/invoke.texi (--param x86-stv-max-visits): Document it.
            * config/i386/i386-features.h (scalar_chain::max_visits): New.
            (scalar_chain::build): Add bitmap parameter, return boolean.
            (scalar_chain::add_insn): Likewise.
            (scalar_chain::analyze_register_chain): Likewise.
            * config/i386/i386-features.cc (scalar_chain::scalar_chain):
            Initialize max_visits.
            (scalar_chain::analyze_register_chain): When we exhaust
            max_visits, abort.  Also abort when running into any
            disallowed insn.
            (scalar_chain::add_insn): Propagate abort.
            (scalar_chain::build): Likewise.  When aborting amend
            the set of disallowed insn with the insns set.
            (convert_scalars_to_vector): Adjust.  Do not convert aborted
            chains.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/108738] compile-time and memory-hog in mdreorg
  2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2023-03-03  7:31 ` cvs-commit at gcc dot gnu.org
@ 2023-03-03  7:33 ` rguenth at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-03-03  7:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108738

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
           Keywords|                            |compile-time-hog
   Target Milestone|---                         |13.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
This has been mitigated for GCC 13, in theory backporting is possible but
there's a workaround available (-mno-stv), so I think that's not strictly
necessary since the issue is quite old already.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-03-03  7:33 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-09  8:10 [Bug target/108738] New: compile-time and memory-hog in mdreorg rguenth at gcc dot gnu.org
2023-02-09  8:47 ` [Bug target/108738] " rguenth at gcc dot gnu.org
2023-02-09  9:11 ` rguenth at gcc dot gnu.org
2023-02-09  9:24 ` rguenth at gcc dot gnu.org
2023-02-09 12:48 ` rguenth at gcc dot gnu.org
2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
2023-02-15  7:28 ` cvs-commit at gcc dot gnu.org
2023-03-02 11:38 ` rguenth at gcc dot gnu.org
2023-03-03  7:31 ` cvs-commit at gcc dot gnu.org
2023-03-03  7:33 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).