[Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/55731] Issue with complete innermost loop unrolling (cunrolli)
Date: Wed, 19 Dec 2012 10:27:00 -0000	[thread overview]
Message-ID: <bug-55731-4-DZGskrPB3K@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-55731-4@http.gcc.gnu.org/bugzilla/>


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55731

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> 2012-12-19 10:27:22 UTC ---
(In reply to comment #4)
> (In reply to comment #3)
> > The reason is that unrolling early can be harmful to for example vectorization
> > and thus cunrolli restricts itself to "obviously" profitable cases.
> > 
> > In this case the loop is not an "inner" loop - it doesn't have a containing
> > loop and so growth is not allowed even with -O3 (we otherwise will fail
> > to vectorize if the unrolled body ends up as part of other basic-blocks).
> > 
> Richard,
> 
> It looks that you did not see attached testcases.

I did - I even compiled them as you did and looked at the dump file and
the unroller source.

> I can't agree with your statement since
> 1. Loop in problem (t.c) has only 3 iterations and in any case it should not be
> considered as candidate for vectorization.

That's target dependend knowledge the unroller does not have (with
two element vectors you can produce one vectorized and one scalar iteration).

> 2. Loop contains calls of functions that do not have vectorizable counterparts.

The unroller does not have this detailed knowledge of the vectorizers
capabilities - it simply considers all loops vectorizable.

> 3. Loop contains comparisons with loop control variable as
>     if (i == 0) etc.
> and cunrolli phase determines it:
> 
>  BB: 7, after_exit: 1
>   size:   2 if (i_1 == 1)
>    Constant conditional.
>  BB: 5, after_exit: 1
>   size:   2 foo4 (k_15(D));
>   size:   2 if (i_1 == 0)
>    Constant conditional.
> 
> It means that these tests will be completely eliminated by loop unroller and
> some bb will become unreachable.

So?  Fact is:

      FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
        {
          struct loop *loop_father = loop_outer (loop);

          if (may_increase_size && optimize_loop_nest_for_speed_p (loop)
              /* Unroll outermost loops only if asked to do so or they do
                 not cause code growth.  */
              && (unroll_outer || loop_outer (loop_father)))
            ul = UL_ALL;
          else
            ul = UL_NO_GROWTH;

will end up with ul == UL_NO_GROWTH for t.c.  Because loop_outer (loop_father)
is NULL (and unroll_outer is false).

I stated the reason for this "heuristic" (-> this loop may no longer be
a loop after unrolling and thus not vectorizable).

> I also added another testcase (t2.c) for which cunrolli does correct size
> estimation and completely unroll it (it has only 2 iterations).

size: 14-5, last_iteration: 2-0
  Loop size: 14
  Estimated size after unrolling: 13

doesn't grow thus is ok to unroll.

> So I assume that size estimation algorithm in unroller is not perfect and must
> be re-written.

Haha ;)  Of course - it can't be "perfect" - you cannot reasonably pre-compute
the outcome of all subsequent optimizations correctly without ever pessimizing
in one or another way (either estimate a too small or a too large size).

But you are of course free to propose a patch!

> And at last if customer provides gcc with "-funroll-loop" option we should not
> consider "possible size growth" as reason of unroll rejection. 

As I said above, cunrolli is supposed to only unroll inner loops.  Your
loop isn't an inner (nested loop).  This restriction is relaxed if unrolling
does not increase size.

> > It's a know issue that after cunroll there is no strong value-numbering
> > pass that handles memory (there is DOM which only has weak memory handling).
> > 
> > So, it's a trade-off we make, mostly for the sake of loop optimizations
> > that do not handle unrolled loops well.
> 
> Best regards.
> Yuri.

next prev parent reply	other threads:[~2012-12-19 10:27 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-18 14:13 [Bug tree-optimization/55731] New: " ysrumyan at gmail dot com
2012-12-18 14:23 ` [Bug tree-optimization/55731] " ysrumyan at gmail dot com
2012-12-18 14:24 ` ysrumyan at gmail dot com
2012-12-18 15:35 ` rguenth at gcc dot gnu.org
2012-12-19  9:17 ` ysrumyan at gmail dot com
2012-12-19 10:27 ` rguenth at gcc dot gnu.org [this message]
2021-12-12  9:45 ` pinskia at gcc dot gnu.org

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-55731-4-DZGskrPB3K@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).