From: guojiufu <guojiufu@linux.ibm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org, wschmidt@linux.ibm.com,
segher@kernel.crashing.org, dje.gcc@gmail.com, jlaw@tachyum.com
Subject: Re: [PATCH] split loop for NE condition.
Date: Fri, 07 May 2021 16:27:09 +0800
Message-ID: <8335b451836fd14e60d18046ea76c06e@imap.linux.ibm.com>
In-Reply-To: <nycvar.YFH.7.76.2105061017520.9200@zhemvz.fhfr.qr>
On 2021-05-06 16:27, Richard Biener wrote:
> On Thu, 6 May 2021, guojiufu wrote:
>
>> On 2021-05-03 20:18, Richard Biener wrote:
>> > On Thu, 29 Apr 2021, Jiufu Guo wrote:
>> >
>> >> When there is the possibility that overflow may happen on the loop index,
>> >> a few optimizations would not happen. For example code:
>> >>
>> >> foo (int *a, int *b, unsigned k, unsigned n)
>> >> {
>> >> while (++k != n)
>> >> a[k] = b[k] + 1;
>> >> }
>> >>
>> >> For this code, if "k > n", overflow may happen. If "k < n" at the beginning,
>> >> it could be optimized (e.g. vectorization).
>> >>
>> >> We can split the loop into two loops:
>> >>
>> >> while (++k > n)
>> >> a[k] = b[k] + 1;
>> >> while (k++ < n)
>> >> a[k] = b[k] + 1;
>> >>
>> >> then for the second loop, it could be optimized.
>> >>
>> >> This patch is splitting this kind of small loop to achieve better
>> >> performance.
>> >>
>> >> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>> >
>> > Do you have any statistics on how often this splits a loop during
>> > bootstrap (use --with-build-config=bootstrap-O3)? Or alternatively
>> > on SPEC?
>>
>> In SPEC2017, ~240 loops are split, and I saw some performance
>> improvement on xz.
>> I would try bootstrap-O3 (it encounters an ICE).
Without this patch, the ICE is also there when building with
bootstrap-O3 on ppc64le.
>>
>> >
>> > Actual comments on the patch inline.
>> >
>> >> Thanks!
>> >>
>> >> Jiufu Guo.
>> >>
>> >> gcc/ChangeLog:
>> >>
>> >> 2021-04-29 Jiufu Guo <guojiufu@linux.ibm.com>
>> >>
>> >> * params.opt (max-insns-ne-cond-split): New.
>> >> * tree-ssa-loop-split.c (connect_loop_phis): Add new param.
>> >> (get_ne_cond_branch): New function.
>> >> (split_ne_loop): New function.
>> >> (split_loop_on_ne_cond): New function.
>> >> (tree_ssa_split_loops): Use split_loop_on_ne_cond.
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >> 2021-04-29 Jiufu Guo <guojiufu@linux.ibm.com>
>> >>
>> >> * gcc.dg/loop-split1.c: New test.
>> >>
>> >> ---
>> >> gcc/params.opt | 4 +
>> >> gcc/testsuite/gcc.dg/loop-split1.c | 28 ++++
>> >> gcc/tree-ssa-loop-split.c | 219
>> >> ++++++++++++++++++++++++++++-
>> >> 3 files changed, 247 insertions(+), 4 deletions(-)
>> >> create mode 100644 gcc/testsuite/gcc.dg/loop-split1.c
>> >>
>> >> diff --git a/gcc/params.opt b/gcc/params.opt
>> >> index 2e4cbdd7a71..900b59b5136 100644
>> >> --- a/gcc/params.opt
>> >> +++ b/gcc/params.opt
>> >> @@ -766,6 +766,10 @@ Min. ratio of insns to prefetches to enable
>> >> prefetching for a loop with an unkno
>> >> Common Joined UInteger Var(param_min_loop_cond_split_prob) Init(30)
>> >> IntegerRange(0, 100) Param Optimization
>> >> The minimum threshold for probability of semi-invariant condition statement
>> >> to trigger loop split.
>> >>
>> >> +-param=max-insns-ne-cond-split=
>> >> +Common Joined UInteger Var(param_max_insn_ne_cond_split) Init(64) Param
>> >> Optimization
>> >> +The maximum threshold for insnstructions number of a loop with ne
>> >> condition to split.
>> >> +
>> >> -param=min-nondebug-insn-uid=
>> >> Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
>> >> The minimum UID to be used for a nondebug insn.
>> >> diff --git a/gcc/testsuite/gcc.dg/loop-split1.c
>> >> b/gcc/testsuite/gcc.dg/loop-split1.c
>> >> new file mode 100644
>> >> index 00000000000..4c466aa9f54
>> >> --- /dev/null
>> >> +++ b/gcc/testsuite/gcc.dg/loop-split1.c
>> >> @@ -0,0 +1,28 @@
>> >> +/* { dg-do compile } */
>> >> +/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
>> >> +
>> >> +void
>> >> +foo (int *a, int *b, unsigned l, unsigned n)
>> >> +{
>> >> + while (++l != n)
>> >> + a[l] = b[l] + 1;
>> >> +}
>> >> +
>> >> +void
>> >> +foo1 (int *a, int *b, unsigned l, unsigned n)
>> >> +{
>> >> + while (l++ != n)
>> >> + a[l] = b[l] + 1;
>> >> +}
>> >> +
>> >> +unsigned
>> >> +foo2 (char *a, char *b, unsigned l, unsigned n)
>> >> +{
>> >> + while (++l != n)
>> >> + if (a[l] != b[l])
>> >> + break;
>> >> +
>> >> + return l;
>> >> +}
>> >> +
>> >> +/* { dg-final { scan-tree-dump-times "Loop split" 3 "lsplit" } } */
>> >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>> >> index b80b6a75e62..a6d28078e5e 100644
>> >> --- a/gcc/tree-ssa-loop-split.c
>> >> +++ b/gcc/tree-ssa-loop-split.c
>> >> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3. If not see
>> >> #include "cfghooks.h"
>> >> #include "gimple-fold.h"
>> >> #include "gimplify-me.h"
>> >> +#include "tree-ssa-loop-ivopts.h"
>> >>
>> >> /* This file implements two kinds of loop splitting.
>> >>
>> >> @@ -233,7 +234,8 @@ easy_exit_values (class loop *loop)
>> >> this. The loops need to fulfill easy_exit_values(). */
>> >>
>> >> static void
>> >> -connect_loop_phis (class loop *loop1, class loop *loop2, edge new_e)
>> >> +connect_loop_phis (class loop *loop1, class loop *loop2, edge new_e,
>> >> + bool use_prev = false)
>> >> {
>> >> basic_block rest = loop_preheader_edge (loop2)->src;
>> >> gcc_assert (new_e->dest == rest);
>> >> @@ -248,13 +250,14 @@ connect_loop_phis (class loop *loop1, class loop
>> >> *loop2, edge new_e)
>> >> !gsi_end_p (psi_first);
>> >> gsi_next (&psi_first), gsi_next (&psi_second))
>> >> {
>> >> - tree init, next, new_init;
>> >> + tree init, next, new_init, prev;
>> >> use_operand_p op;
>> >> gphi *phi_first = psi_first.phi ();
>> >> gphi *phi_second = psi_second.phi ();
>> >>
>> >> init = PHI_ARG_DEF_FROM_EDGE (phi_first, firste);
>> >> next = PHI_ARG_DEF_FROM_EDGE (phi_first, firstn);
>> >> + prev = PHI_RESULT (phi_first);
>> >> op = PHI_ARG_DEF_PTR_FROM_EDGE (phi_second, seconde);
>> >> gcc_assert (operand_equal_for_phi_arg_p (init, USE_FROM_PTR
>> >> (op)));
>> >>
>> >> @@ -279,7 +282,7 @@ connect_loop_phis (class loop *loop1, class loop
>> >> *loop2, edge new_e)
>> >>
>> >> gphi * newphi = create_phi_node (new_init, rest);
>> >> add_phi_arg (newphi, init, skip_first, UNKNOWN_LOCATION);
>> >> - add_phi_arg (newphi, next, new_e, UNKNOWN_LOCATION);
>> >> + add_phi_arg (newphi, use_prev ? prev : next, new_e,
>> >> UNKNOWN_LOCATION);
>> >> SET_USE (op, new_init);
>> >> }
>> >> }
>> >> @@ -1599,6 +1602,213 @@ split_loop_on_cond (struct loop *loop)
>> >> return do_split;
>> >> }
>> >>
>> >> +/* Check if the LOOP exit branch likes "if (idx != bound)".
>> >> + if INV is not NULL and the branch is "if (bound != idx)", set *INV to
>> >> true.
>> >
>> > If
>> Ok, I will update accordingly.
>> >
>> >> + return the branch edge which exit loop. */
>> >
>> > Return
>> Thanks.
>>
>> >
>> >> +
>> >> +static edge
>> >> +get_ne_cond_branch (struct loop *loop, bool *inv)
>> >> +{
>> >> + int i;
>> >> + edge e;
>> >> +
>> >> + auto_vec<edge> edges = get_loop_exit_edges (loop);
>> >> + FOR_EACH_VEC_ELT (edges, i, e)
>> >> + {
>> >> + basic_block bb = e->src;
>> >> +
>> >> + /* Check gcond. */
>> >> + gimple *last = last_stmt (bb);
>> >> + if (!last || gimple_code (last) != GIMPLE_COND)
>> >> + continue;
>> >> + gcond *cond = as_a<gcond *> (last);
>> >> + enum tree_code code = gimple_cond_code (cond);
>> >> + if (code != NE_EXPR)
>> >> + continue;
>> >
>> > I'm not sure we canonicalize the case with code == EQ_EXPR,
>> > at least for
>> >
>> > void bar();
>> > void foo(unsigned n)
>> > {
>> > unsigned i = 0;
>> > do
>> > {
>> > if (i == n)
>> > return;
>> > bar();
>> > ++i;
>> > }
>> > while (1);
>> > }
>> >
>> > we don't. Since you return the exit edge can this case be
>> > handled transparently?
>>
>> Oh, this case was not handled; the patch can be enhanced to handle
>> this kind of case.
>>
>>
>> >
>> >> +
>> >> + /* Make sure idx and bound. */
>> >> + tree idx = gimple_cond_lhs (cond);
>> >> + tree bnd = gimple_cond_rhs (cond);
>> >> + if (expr_invariant_in_loop_p (loop, idx))
>> >> + {
>> >> + std::swap (idx, bnd);
>> >> + if (inv)
>> >> + *inv = true;
>> >> + }
>> >> + else if (!expr_invariant_in_loop_p (loop, bnd))
>> >> + continue;
>> >
>> > We canonicalize i < UINT_MAX to i != UINT_MAX so you want to
>> > detect that and not split the loop if 'bnd' is the maximum
>> > or minimum value of the type I think.
>> Yeap!
>>
>> >
>> >> + /* Extract conversion. */
>> >> + if (TREE_CODE (idx) == SSA_NAME)
>> >> + {
>> >> + gimple *stmt = SSA_NAME_DEF_STMT (idx);
>> >> + if (is_gimple_assign (stmt)
>> >> + && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (stmt))
>> >> + && flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
>> >> + idx = gimple_assign_rhs1 (stmt);
>> >> + }
>> >
>> > This skips arbitrary extensions and truncations - is that intended?
>> Yes. This could handle code like:
>>
>> _6 = (sizetype) idx;
>> if (_6 != n)
>
> That looks like widening, but the code above also does
>
> _6 = (char) idx;
> if (_6 != n)
>
Yes, this is also expected, because truncation also needs to clear the
high bits.
When a wider index is compared against a narrower bound, a wrap may happen.
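A minimal sketch of the truncation case, to make the wrap concrete. The
function name and the LIMIT safety cap are hypothetical and only for
illustration; they are not part of the patch. The index is a wider
unsigned value, but the exit test compares its truncated value with the
bound, so the loop keeps running past the range of the narrower type
until the low bits match.

```c
#include <assert.h>

/* Counts iterations of "for (i = start; (unsigned char) i != bound; i++)".
   LIMIT is a hypothetical safety cap so the sketch always terminates.  */
unsigned
count_iters_truncated (unsigned start, unsigned char bound, unsigned limit)
{
  assert (limit > 0);
  unsigned iters = 0;
  /* The comparison only sees the low 8 bits of i.  */
  for (unsigned i = start; (unsigned char) i != bound && iters < limit; i++)
    iters++;
  return iters;
}
```

Starting at 250 with bound 5, the truncated value has to pass 255 and
wrap to 0 before it can reach 5, while the wider index itself simply
keeps increasing.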
>>
>> >
>> >> + /* Make sure idx is iv. */
>> >> + class loop *useloop = loop_containing_stmt (cond);
>> >> + affine_iv iv;
>> >> + if (!simple_iv (loop, useloop, idx, &iv, false))
>> >> + continue;
>> >> +
>> >> + /* No need to split loop, if base is know value.
>> >> + Or check range info. */
>> >> + if (TREE_CODE (iv.base) == INTEGER_CST)
>> >> + continue;
>> >
>> > I think it would be better to check iv.no_overflow? Also looking
>> > it might be possible to use simple_iv_with_niters with IV_NITERS
>> > not NULL for most of this analysis?
>>
>> Yes, it may be better to use iv.no_overflow directly.
>> simple_iv invokes simple_iv_with_niters with IV_NITERS=NULL.
>> Because a wrap may happen, IV_NITERS may not be computable.
>>
>> Thanks!
>> >
>> >> + /* There is type conversion on idx(or rhs of idx's def).
>> >> + And there is converting shorter to longer type. */
>> >> + tree type = TREE_TYPE (idx);
>> >> + if (!INTEGRAL_TYPE_P (type) || TREE_CODE (idx) != SSA_NAME
>> >> + || !TYPE_UNSIGNED (type)
>> >> + || TYPE_PRECISION (type) == TYPE_PRECISION (sizetype))
>> >> + continue;
>> >
>> > That check can be done before the (expensive) simple_iv check.
>> Ok, will update accordingly.
>>
>> > I wonder what the 'sizetype' precision check is about? The
>> > function level comment should probably clarify what kind of
>> > conversions we handle (and why).
>> Currently, this patch cares more about the conversions
>> which may generate ext/trunc stmts.
>> I will try to remove this check, since we may split the loop
>> if there is a possible wrap/overflow.
>
> As said, it should be documented what conversions we look through
> and why we can do that. I know that eventually simple_iv does
> not return true on the converted IV but does on the conversion
> src - is that why you do this in the first place?
Thanks, sure. The function comments will be updated.
The code checks for conversions which may cause overflow/wrap.
The code would look like:
/* Check if wrap/overflow may happen during type conversion. */
tree type = TREE_TYPE (idx);
if (....)
I want to support both cases: the converted IV and the conversion source.
>
> Note getting this and the "overflow" check correct is the most
> important piece of the transform since in the end we want to
> enable followup transforms on the split parts which otherwise
> run into this very problem, no?
Right, we need to check 'wrap/overflow' correctly to enable follow-up
optimizations for the split parts. This is the basic intention of this
patch.
>
>> >
>> >> + /* Check loop is simple to split. */
>> >> + gcc_assert (bb != loop->latch);
>> >> +
>> >> + if (single_pred_p (loop->latch)
>> >> + && single_pred_edge (loop->latch)->src == bb
>> >> + && empty_block_p (loop->latch))
>> >> + return e;
>> >> +
>> >> + /* Splitting is cheap for idx increase header. */
>> >> + if (bb == loop->header)
>> >> + {
>> >> + if (get_virtual_phi (loop->header))
>> >> + continue;
>> >> +
>> >> + /* In loop header: i++ or ++i. */
>> >> + gimple_stmt_iterator gsi = gsi_start_bb (bb);
>> >> + if (gsi_end_p (gsi))
>> >> + return e;
>> >> +
>> >> + gimple *s1 = gsi_stmt (gsi);
>> >> + if (!(is_gimple_assign (s1)
>> >> + && (idx == gimple_assign_lhs (s1)
>> >> + || idx == gimple_assign_rhs1 (s1))))
>> >> + continue;
>> >> +
>> >> + gsi_next (&gsi);
>> >> + if (!gsi_end_p (gsi) && gsi_stmt (gsi) == cond)
>> >> + return e;
>> >> + }
>> >
>> > I wonder if these "cheapness" heuristics should simply fold
>> > into the cost of the extra duplication of the header/tail
>> > in the overall stmt limit?
>> Without this heuristic, the loop can be split only if the exit branch
>> is located at the end of the loop (just before the empty latch).
>
> why? Btw, to avoid code-generation differences with -g vs. -g0
> you have to skip debug insns, thus use gsi_start_nondebug_bb and
> gsi_next_nondebug
Thanks, a bb should be treated as 'empty' if it contains only debug stmts.
If the interesting "cond branch" is in the middle of the loop:
LH:
B1
B2
if(X != N)
goto LM
else
goto exit
LM:
B3
B4 (may also branch exit)
latch
goto LH:
When the first loop exits at "if (X > N)", B1 and B2 have already been
executed. After the first loop exits, we therefore cannot jump to the
header of the second split loop, because that would re-run B1 and B2;
so these cases are not supported.
If the header is simple (e.g. i++, ++i), we may support it without too
much cost.
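A minimal C sketch of the simple-header case, assuming an 8-bit index so
that the wrap is easy to exercise. The function names and the VISITED
trace buffer are hypothetical; this models the idea at source level and
is not the GIMPLE the patch generates. When the first loop exits, the
second loop is entered with the index value from before the header's
increment, so re-running the header only costs one extra add.

```c
#include <assert.h>

/* Original loop: the exit test "++k != n" sits in the header.
   VISITED (at least 255 entries) records the index seen by each body run.  */
unsigned
run_original (unsigned char k, unsigned char n, unsigned char *visited)
{
  unsigned count = 0;
  while (++k != n)
    {
      assert (count < 255);
      visited[count++] = k;
    }
  return count;
}

/* Split form: loop 1 uses ">", loop 2 uses "<".  */
unsigned
run_split (unsigned char k, unsigned char n, unsigned char *visited)
{
  unsigned count = 0;
  /* Loop 1: runs while the incremented index is above the bound,
     i.e. until the index reaches the bound or wraps.  */
  for (;;)
    {
      unsigned char prev = k;
      ++k;
      if (!(k > n))
        {
          /* Hand the pre-increment value to loop 2, which re-runs
             the header increment.  */
          k = prev;
          break;
        }
      assert (count < 255);
      visited[count++] = k;
    }
  /* Loop 2: a plain "<" loop with no wrap, which later passes can
     analyze.  */
  while (++k < n)
    {
      assert (count < 255);
      visited[count++] = k;
    }
  return count;
}
```

If the header contained more than the increment (B1 and B2 above), the
same hand-over would run those statements twice, which is why that shape
is not handled.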
>
>> With the heuristics, the loop can be split if the branch is at the
>> simple header.
>> And I feel the cost of rerunning the duplicated simple header is small:
>> maybe just 'move' instructions or one add instruction.
>>
>> >
>> >> + }
>> >> +
>> >> + return NULL;
>> >> +}
>> >> +
>> >> +/* Split the LOOP with NE_EXPR into two loops with GT_EXPR and LT_EXPR.
>> >> */
>> >> +
>> >> +static bool
>> >> +split_ne_loop (struct loop *loop, edge cond_e)
>> >> +{
>> >> + initialize_original_copy_tables ();
>> >> +
>> >> + struct loop *loop2 = loop_version (loop, boolean_true_node, NULL,
>> >> + profile_probability::always (),
>> >> + profile_probability::never (),
>> >> + profile_probability::always (),
>> >> + profile_probability::always (), true);
>> >> +
>> >> + gcc_assert (loop2);
>> >> + update_ssa (TODO_update_ssa);
>> >> +
>> >> + free_original_copy_tables ();
>> >> +
>> >> + /* Change if (i != n) to LOOP1:if (i > n) and LOOP2:if (i < n) */
>> >> + bool inv = false;
>> >> + edge dup_cond = get_ne_cond_branch (loop2, &inv);
>> >
>> > I don't think you should rely in pattern-matching to detect the same
>> > condition in the versioned loop - instead you can use the copy
>> > tables, do
>> >
>> > 2nd_loop_exit_block = get_bb_copy (cond_e->src);
>> >
>> > to get to the block with the COND_EXPR (before free_original_copy_tables
>> > obviously).
>> >
>> Thanks! Your suggestion is great, it would save a lot of time to get
>> the new exit branch.
>>
>> >> + enum tree_code up_code = inv ? LT_EXPR : GT_EXPR;
>> >> + enum tree_code down_code = inv ? GT_EXPR : LT_EXPR;
>> >> +
>> >> + gcond *gc = as_a<gcond *> (last_stmt (cond_e->src));
>> >> + gimple_cond_set_code (gc, up_code);
>> >> +
>> >> + gcond *dup_gc = as_a<gcond *> (last_stmt (dup_cond->src));
>> >> + gimple_cond_set_code (dup_gc, down_code);
>> >> +
>> >> + /* Link the exit cond edge to new loop. */
>> >> + gcond *break_cond = as_a<gcond *> (gimple_copy (gc));
>> >> + edge pred_e = single_pred_edge (loop->latch);
>> >> + gcc_assert (pred_e);
>> >> + bool simple_loop = pred_e->src == cond_e->src && empty_block_p
>> >> (loop->latch);
>> >> + if (simple_loop)
>> >> + gimple_cond_set_code (break_cond, down_code);
>> >> + else
>> >> + gimple_cond_make_true (break_cond);
>> >> +
>> >> + basic_block break_bb = split_edge (cond_e);
>> >> + gimple_stmt_iterator gsi = gsi_last_bb (break_bb);
>> >> + gsi_insert_after (&gsi, break_cond, GSI_NEW_STMT);
>> >> +
>> >> + edge to_exit = single_succ_edge (break_bb);
>> >> + edge to_new_loop = make_edge (break_bb, loop_preheader_edge
>> >> (loop2)->src, 0);
>> >> + to_new_loop->flags |= EDGE_TRUE_VALUE;
>> >> + to_exit->flags |= EDGE_FALSE_VALUE;
>> >> + to_exit->flags &= ~EDGE_FALLTHRU;
>> >> + to_exit->probability = cond_e->probability;
>> >> + to_new_loop->probability = to_exit->probability.invert ();
>> >> +
>> >> + update_ssa (TODO_update_ssa);
>> >> +
>> >> + connect_loop_phis (loop, loop2, to_new_loop, !simple_loop);
>> >> +
>> >> + rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_USE, loop);
>> >> + if (dump_file && (dump_flags & TDF_DETAILS))
>> >> + fprintf (dump_file, ";; Loop split.\n");
>> >
>> > Maybe ";; Loop split on != condition.\n"?
>> Ok!
>> >
>> >> +
>> >> + return true;
>> >> +}
>> >> +
>> >> +/* Checks if LOOP contains a suitable NE_EXPR conditional block to split.
>> >> +L_H:
>> >> + if (i!=N)
>> >> + S;
>> >> + i++;
>> >> + goto L_H;
>> >> +
>> >> +The "i!=N" is like "i>N || i<N", then it could be transform to:
>> >> +
>> >> +L_H:
>> >> + if (i>N)
>> >> + S;
>> >> + i++;
>> >> + goto L_H;
>> >> +L1_H:
>> >> + if (i<N)
>> >> + S;
>> >> + i++;
>> >> + goto L1_H;
>> >> +
>> >> +The loop with "i<N" is in favor both GIMPLE and RTL passes. */
>> >> +
>> >> +static bool
>> >> +split_loop_on_ne_cond (class loop *loop)
>> >> +{
>> >> + if (!can_duplicate_loop_p (loop))
>> >> + return false;
>> >> +
>> >> + int num = 0;
>> >> + basic_block *bbs = get_loop_body (loop);
>> >
>> > To avoid repeated DFS walks of the loop body do the can_duplicate_loop_p
>> > check here as
>> >
>> > if (!can_copy_bbs_p (bbs, loop->num_nodes))
>> > {
>> > free (bbs);
>> > return false;
>> > }
>> >
>> > (see split_loop)
>> >
>> Thanks! This could also save gcc runtime.
>>
>> >> + for (unsigned i = 0; i < loop->num_nodes; i++)
>> >> + num += estimate_num_insns_seq (bb_seq (bbs[i]), &eni_size_weights);
>> >> + free (bbs);
>> >
>> > So with this and the suggestion above it is maybe possible to re-use
>> > compute_added_num_insns? That already seems to handle splitting at
>> > aribtrary branches (but maybe not loops with multiple exits?).
>> > The code using this computation uses param_max_peeled_insns - is
>> > that limit not sufficient for your case (to avoid another param?)?
>> > It's default is 100, a bit higher than yours (64).
>>
>> This patch uses this simple code to calculate and does not re-use
>> compute_added_num_insns, because not many stmts in the loops would
>> be deleted.
>>
>> I introduced a new param because I think it can be used independently
>> to control a different optimization part.
>> And the name of param_max_peeled_insns seems designed for other
>> behavior:)
>> Do we prefer to reuse existing params for new optimizations? If so, we
>> would reuse it.
>
> It depends. values for new params tend to fall out of the air while
> olders might have been tuned carefully. If they limit related things
> (code growth) then re-using older might be beneficial.
>
>> >
>> > I'd really like to see some numbers on how much this triggers since
>> > the splitting itself is O (number-of-split-loops * function-size)
>> > complexity and thus worse than quadratic as we do update_ssa for each
>> > split loop. You can experience this by simply concating one of your
>> > testcase loops N times in a function ...
>>
>> If I duplicate the test case loop 15 times in the function,
>> it triggers 15 ";; Loop split".
>>
>> Concatenated times, compile time, gap
>> 4: 0m0.106s
>> 8: 0m0.186s 0.080s
>> 12: 0m0.267s 0.082s
>> 16: 0m0.350s 0.083s
>> 20: 0m0.434s 0.084s
>> 24: 0m0.522s 0.088s
>> 28: 0m0.609s 0.087s
>> 32: 0m0.705s 0.086s
>>
>> It seems near a linear complexity.
>
> I see, though 32 is not a lot - I would suggest to try 1000s or even
> more ;) update_ssa time is linear in the size of the function _at
> least_.
Thanks for your suggestion!
It is interesting: when the numbers are bigger (e.g. in the 1000s),
it shows non-linear complexity.
With this patch:
1000 56.57
2000 215.59
3000 486.92
4000 869.46
Without this patch:
1000 9.61
2000 25.54
3000 49.70
4000 80.31
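As a rough check of the growth rate, here is a small sketch that compares
the measured time ratio against the ratio expected for quadratic growth.
The helper names are hypothetical. Going from 1000 to 4000 copies
multiplies the time by about 15.4 with the patch, close to the 16 that
quadratic growth would give, and by about 8.4 without it.

```c
#include <assert.h>

/* Ratio of the larger run's compile time to the smaller run's.  */
double
time_ratio (double t_small, double t_large)
{
  assert (t_small > 0.0);
  return t_large / t_small;
}

/* Ratio expected if compile time grew as the square of the loop count.  */
double
quadratic_ratio (double n_small, double n_large)
{
  assert (n_small > 0.0);
  double r = n_large / n_small;
  return r * r;
}
```

For example, time_ratio (56.57, 869.46) can be compared with
quadratic_ratio (1000, 4000) for the patched compiler, and
time_ratio (9.61, 80.31) for the unpatched one.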
Thanks again!
Jiufu Guo.
>
> Richard.
>
>>
>> Thanks a lot for review!
>>
>> Jiufu Guo.
>>
>> >
>> >> + if (num > param_max_insn_ne_cond_split)
>> >> + return false;
>> >> +
>> >> + edge branch_edge = get_ne_cond_branch (loop, NULL);
>> >> + if (branch_edge && split_ne_loop (loop, branch_edge))
>> >> + return true;
>> >> +
>> >> + return false;
>> >> +}
>> >> +
>> >> /* Main entry point. Perform loop splitting on all suitable loops. */
>> >>
>> >> static unsigned int
>> >> @@ -1628,7 +1838,8 @@ tree_ssa_split_loops (void)
>> >> if (optimize_loop_for_size_p (loop))
>> >> continue;
>> >>
>> >> - if (split_loop (loop) || split_loop_on_cond (loop))
>> >> + if (split_loop (loop) || split_loop_on_cond (loop)
>> >> + || split_loop_on_ne_cond (loop))
>> >> {
>> >> /* Mark our containing loop as having had some split inner loops.
>> >> */
>> >> loop_outer (loop)->aux = loop;
>> >>
>>