public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
@ 2021-10-18 17:32 dimhen at gmail dot com
  2021-10-18 21:06 ` [Bug tree-optimization/102814] " pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: dimhen at gmail dot com @ 2021-10-18 17:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

            Bug ID: 102814
           Summary: [12 regression] quadratique/exponential time
                    complexity for max-jump-thread-duplication-stmts
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dimhen at gmail dot com
  Target Milestone: ---

r12-4256 FAST
r12-4444 SLOW

g++ -fpreprocessed -std=c++98 -O2 --param max-jump-thread-duplication-stmts=NNN
-c x.ii

     r12-4256 r12-4444
180    0.5s     0.21s
181    0.8s     > 100s
190    1.0s     52s

cat x.ii # cvise'd from proprietary codebase
struct a {
  a operator+(int);
};
void *operator new(unsigned long, void *c) { return c; }
struct d {
  void h(int *j) { new (j) int(); }
};
struct k {
  struct {
    int e;
    int f;
  } g;
};
struct n : k {
  a m_fn2();
  a o();
  int m_fn4();
  int operator[](int);
  int ae;
  void s() {
    if (g.e != g.f) {
      d l;
      int *b;
      l.h(b);
    }
    a m = o();
    t(m, ae);
  }
  template <typename af> void u(a, af, af);
  void t(a, unsigned char);
};
int ah;
struct v {
  n w();
};
void x() {
  v p;
  n ak = p.w(), al, rapdu;
  int q(ak.m_fn4());
  int r;
  for (int i = 0; q; i++) {
    if (i == q - 1) {
      al.s();
      r = ak.m_fn4();
    }
    al.s();
    al.s();
    al.s();
    al.s();
    a am, an = ak.m_fn2(), ao = an + 1, ap = ak.m_fn2(), aq = ap + ah,
          ar = aq + r;
    al.u(am, ao, ar);
    if (i == q - 1)
      al.s();
    i &&rapdu[1];
  }
}

g++ -v
Using built-in specs.
COLLECT_GCC=/home/dimhen/arch-gcc/gcc_current/bin/g++
COLLECT_LTO_WRAPPER=/home/dimhen/arch-gcc/gcc_current/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: /home/dimhen/src/gcc_current/configure
--prefix=/home/dimhen/arch-gcc/gcc_current
--enable-checking=yes,df,fold,rtl,extra --enable-languages=c,c++,lto
--disable-multilib --enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-gnu-unique-object --enable-linker-build-id
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl
--enable-offload-targets=nvptx-none --without-cuda-driver
--enable-gnu-indirect-function --enable-cet --with-tune=native
--enable-libstdcxx-debug
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211015 (experimental) [master r12-4444-ga01704fc45a] (GCC)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
@ 2021-10-18 21:06 ` pinskia at gcc dot gnu.org
  2021-10-19  7:20 ` aldyh at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-18 21:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
           Keywords|                            |compile-time-hog
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |pinskia at gcc dot gnu.org

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
  2021-10-18 21:06 ` [Bug tree-optimization/102814] " pinskia at gcc dot gnu.org
@ 2021-10-19  7:20 ` aldyh at gcc dot gnu.org
  2021-10-19  7:40 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-19  7:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Dmitry G. Dyachenko from comment #0)
> r12-4256 FAST
> r12-4444 SLOW
> 
> g++ -fpreprocessed -std=c++98 -O2 --param
> max-jump-thread-duplication-stmts=NNN -c x.ii

Well, you are basically eliding the fail safe we put in specifically for this
code explosion:

  /* Threading through an empty latch would cause code to be added to
     the latch.  This could alter the loop form sufficiently to cause
     loop optimizations to fail.  Disable these threads until after
     loop optimizations have run.  */
  if ((threaded_through_latch
       || (taken_edge && taken_edge->dest == loop->latch))
      && !(cfun->curr_properties & PROP_loop_opts_done)
      && empty_block_p (loop->latch))
    {
      if (dump_file && (dump_flags & TDF_DETAILS))
        fprintf (dump_file,
                 "  FAIL: Thread through latch before loop opts would create
non-empty latch\n");
      return false;

The default is 15, and you're pushing it to > 180.  Nothing good can come of
that.

That being said, in this very specific case, thread3 is creating some monster
PHIs which then thread4 further explodes.  Luckily this combination is
disallowed by the pending threading restrictions in the presence of loops here:

https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581894.html

But really, bad things can happen when you disable the fail-safe mechanisms
we've put in place.

Question to the larger audience... do we support bug reports against internal
--param constructs?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
  2021-10-18 21:06 ` [Bug tree-optimization/102814] " pinskia at gcc dot gnu.org
  2021-10-19  7:20 ` aldyh at gcc dot gnu.org
@ 2021-10-19  7:40 ` rguenth at gcc dot gnu.org
  2021-10-19  7:45 ` aldyh at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-19  7:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Aldy Hernandez from comment #1)
> Question to the larger audience... do we support bug reports against
> internal --param constructs?

Yes.  Generally 'max-jump-thread-duplication-stmts' would suggest this is
a parameter limiting code size growth and one that might affect compile-time
in a linear fashion - exponential growth here is unexpected.  The reporter
states a 180 -> 181 parameter change trips this over unexpectedly which is
a case worth investigating (it suggests a limit elsewhere is missing).

For example the alias walking code counts the amount of "work" it does and
has a limit on that, allowing linear growth parametrization.  Not sure if
there's sth in the threader and/or ranger that would support accumulating
a work budget and stop after it is exhausted, but something like that would
be very useful (not sure if that's the problem at hand in this case).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
                   ` (2 preceding siblings ...)
  2021-10-19  7:40 ` rguenth at gcc dot gnu.org
@ 2021-10-19  7:45 ` aldyh at gcc dot gnu.org
  2021-10-20  7:10 ` aldyh at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-19  7:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |aldyh at gcc dot gnu.org
                 CC|                            |amacleod at redhat dot com
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-10-19
             Status|UNCONFIRMED                 |NEW

--- Comment #3 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> (In reply to Aldy Hernandez from comment #1)
> > Question to the larger audience... do we support bug reports against
> > internal --param constructs?
> 
> Yes.  Generally 'max-jump-thread-duplication-stmts' would suggest this is
> a parameter limiting code size growth and one that might affect compile-time
> in a linear fashion - exponential growth here is unexpected.  The reporter
> states a 180 -> 181 parameter change trips this over unexpectedly which is
> a case worth investigating (it suggests a limit elsewhere is missing).
> 
> For example the alias walking code counts the amount of "work" it does and
> has a limit on that, allowing linear growth parametrization.  Not sure if
> there's sth in the threader and/or ranger that would support accumulating
> a work budget and stop after it is exhausted, but something like that would
> be very useful (not sure if that's the problem at hand in this case).

ISTR Andrew had some patches to stop solving for some combination of extremely
large PHIs.  Not sure whether this is the issue at hand, but perhaps it's worth
looking at.

I'll put this in the back burner for now, since the loop threading restrictions
make this a non-issue, but I'll come back to it later.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
                   ` (3 preceding siblings ...)
  2021-10-19  7:45 ` aldyh at gcc dot gnu.org
@ 2021-10-20  7:10 ` aldyh at gcc dot gnu.org
  2021-10-20  7:10 ` aldyh at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-20  7:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #4 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
*** Bug 102852 has been marked as a duplicate of this bug. ***

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
                   ` (4 preceding siblings ...)
  2021-10-20  7:10 ` aldyh at gcc dot gnu.org
@ 2021-10-20  7:10 ` aldyh at gcc dot gnu.org
  2021-10-20  9:09 ` cvs-commit at gcc dot gnu.org
  2021-10-20  9:10 ` aldyh at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-20  7:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814
Bug 102814 depends on bug 102852, which changed state.

Bug 102852 Summary: [12 Regression] Compile time hog since r12-4421-g0bd68793921ecf3b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102852

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
                   ` (5 preceding siblings ...)
  2021-10-20  7:10 ` aldyh at gcc dot gnu.org
@ 2021-10-20  9:09 ` cvs-commit at gcc dot gnu.org
  2021-10-20  9:10 ` aldyh at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-10-20  9:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Aldy Hernandez <aldyh@gcc.gnu.org>:

https://gcc.gnu.org/g:82cd78f2c31db1664ca154d7fcd24e9eaee1427f

commit r12-4532-g82cd78f2c31db1664ca154d7fcd24e9eaee1427f
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Wed Oct 20 09:05:23 2021 +0200

    Restore --param=max-fsm-thread-length

    The removal of --param=max-fsm-thread-length is causing code
    explosion.  I thought that --param=max-fsm-thread-path-insns was a
    better gague for path profitability than raw BB length, but it turns
    out that we don't take into account PHIs when estimating the number of
    statements.

    In this PR, we have a sequence of very large PHIs that have us
    traversing extremely large paths that blow up the compilation.

    We could fix this a couple of different ways.  We could avoid
    traversing more than a certain number of PHI arguments, or ignore
    large PHIs altogether.  The old implementation certainly had this
    knob, and we could cut things off before we even got to the ranger.
    We could also adjust the instruction estimation to take into account
    PHIs, but I'm sure we'll mess something else in the process ;-).

    The easiest thing to do is just restore the knob.

    At a later time we could tweak this further, for instance,
    disregarding empty blocks in the count.  BTW, this is the reason I
    didn't chop things off in the lowlevel registry for all threaders: the
    forward threader can't really explore too deep paths, but it could
    theoretically get there while threading over empty blocks.

    This fixes 102814, 102852, and I bet it solves the Linux kernel cross
    compile issue.

    Tested on x86-64 Linux.

    gcc/ChangeLog:

            PR tree-optimization/102814
            * doc/invoke.texi: Document --param=max-fsm-thread-length.
            * params.opt: Add --param=max-fsm-thread-length.
            * tree-ssa-threadbackward.c
            (back_threader_profitability::profitable_path_p): Fail on paths
            longer than max-fsm-thread-length.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
  2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
                   ` (6 preceding siblings ...)
  2021-10-20  9:09 ` cvs-commit at gcc dot gnu.org
@ 2021-10-20  9:10 ` aldyh at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-20  9:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814

Aldy Hernandez <aldyh at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
fixed

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-10-20  9:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-18 17:32 [Bug tree-optimization/102814] New: [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts dimhen at gmail dot com
2021-10-18 21:06 ` [Bug tree-optimization/102814] " pinskia at gcc dot gnu.org
2021-10-19  7:20 ` aldyh at gcc dot gnu.org
2021-10-19  7:40 ` rguenth at gcc dot gnu.org
2021-10-19  7:45 ` aldyh at gcc dot gnu.org
2021-10-20  7:10 ` aldyh at gcc dot gnu.org
2021-10-20  7:10 ` aldyh at gcc dot gnu.org
2021-10-20  9:09 ` cvs-commit at gcc dot gnu.org
2021-10-20  9:10 ` aldyh at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).