From: Richard Biener
Date: Thu, 20 Jul 2023 15:03:54 +0200
Subject: Re: loop-ch improvements, part 3
To: Jan Hubicka
Cc: gcc-patches@gcc.gnu.org

On Thu, Jul 20, 2023 at 9:10 AM Jan Hubicka via Gcc-patches wrote:
>
> Hi,
> this patch makes tree-ssa-loop-ch understand if-combined conditionals
> (which are quite common) and removes the IV-derived heuristic.  That
> heuristic is quite dubious because every variable with a PHI in the header
> of integral or pointer type is seen as an IV, so in the first basic block
> we match all loop invariants as invariants and everything that changes in
> the loop as IV-like.
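[Editor's sketch, not part of the original mail: a hypothetical C loop
illustrating the over-approximation described above.  Under the removed
heuristic any integral (or pointer) PHI in the loop header counted as an
"IV", so a value like 'flip' below, which merely alternates each iteration,
was still treated as IV-derived; the exact pass behaviour on this example is
an assumption for illustration only.]

  int count_until_flip (int n, int start)
  {
    int flip = start & 1;   /* integral header PHI -> classified as "IV" */
    int cnt = 0;
    for (int i = 0; i < n; i++)
      {
        if (flip)           /* loop-variant test that is not IV-based */
          break;
        flip ^= 1;          /* alternates each iteration; not a real IV */
        cnt++;
      }
    return cnt;
  }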
>
> I think the heuristic was mostly there to make header duplication happen
> when the exit conditional is constant false in the first iteration, and
> with ranger we can work this out with good enough precision.
>
> The patch adds the notion of a "combined exit", whose conditional is an
> and/or/xor of a loop-invariant exit and an exit known to be false in the
> first iteration.  Copying these is a win since the loop conditional will
> simplify in both copies.
>
> Those are usually bit or/and/xor, and the code size accounting is correct
> only when the values have at most one bit set or when the static-constant
> and invariant versions are simple (such as all zeros).  I am not testing
> this, so the code may be optimistic here.  I think it is not common enough
> to matter, and I cannot think of a correct condition that is not overly
> complex.
>
> I also improved the code size estimate by not accounting non-conditionals
> that are known to be constant in the peeled copy, and improved the debug
> output.
>
> This requires testsuite compensation.  uninit-pred-loop-1_c.C does:
>
> /* { dg-do compile } */
> /* { dg-options "-Wuninitialized -O2 -std=c++98" } */
>
> extern int bar();
> int foo(int n, int m)
> {
>   for (;;) {
>     int err = ({int _err;
>       for (int i = 0; i < 16; ++i) {
>         if (m+i > n)
>           break;
>         _err = 17;
>         _err = bar();
>       }
>       _err;
>     });
>
>     if (err == 0) return 17;
>   }
>
> Before the patch we duplicate
>
>   if (m+i > n)
>
> which makes the maybe-uninitialized warning not be output.  I do not quite
> see why copying this out would be a win, since it won't simplify.  Also I
> think the warning is correct: if m > n the loop will bail out before
> initializing _err and it will be used uninitialized.  I think it is a bug
> elsewhere that header duplication suppresses this.
>
> copy-headers-7.c does:
>
> int is_sorted(int *a, int n, int m, int k)
> {
>   for (int i = 0; i < n - 1 && m && k > i; i++)
>     if (a[i] > a[i + 1])
>       return 0;
>   return 1;
> }
>
> It tests that all three for-statement conditionals are duplicated.  With
> the patch we no longer duplicate k > i since it is not going to simplify.
> So I added a test ensuring that k is positive.  Also the test requires
> disabling if-combining and vrp to avoid the conditionals becoming combined
> ones.  So I added a new version of the test checking that we now behave
> correctly also with if-combine.
>
> ivopt_mult_2.c and ivopt_mult_1.c seem to require loop header duplication
> for ivopts to behave a particular way, so I also ensured by value range
> that the header is duplicated.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> gcc/ChangeLog:
>
> 	* tree-ssa-loop-ch.cc (edge_range_query): Rename to ...
> 	(get_range_query): ... this one.
> 	(static_loop_exit): Add query parameter; turn ranger into a reference.
> 	(loop_static_stmt_p): New function.
> 	(loop_static_op_p): New function.
> 	(loop_iv_derived_p): Remove.
> 	(loop_combined_static_and_iv_p): New function.
> 	(should_duplicate_loop_header_p): Discover combined conditionals;
> 	do not track IV-derived; improve dumps.
> 	(pass_ch::execute): Fix whitespace.
>
> gcc/testsuite/ChangeLog:
>
> 	* g++.dg/uninit-pred-loop-1_c.C: Allow warning.
> 	* gcc.dg/tree-ssa/copy-headers-7.c: Add tests so exit condition is
> 	static; update template.
> 	* gcc.dg/tree-ssa/ivopt_mult_1.c: Add test so exit condition is static.
> 	* gcc.dg/tree-ssa/ivopt_mult_2.c: Add test so exit condition is static.
> 	* gcc.dg/tree-ssa/copy-headers-8.c: New test.
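[Editor's sketch, not part of the original mail: a small, hypothetical C loop
with the kind of "combined exit" described above.  After if-combining, the two
tests typically end up as a single conditional on _1 & _2; the header copy in
the preheader can then fold down to the invariant test on n, while later
passes can reduce the in-loop copy to the i < 16 test, as the comments in the
patch describe.]

  int sum16 (int *a, int n)
  {
    int s = 0;
    /* i < 16 is statically true in the first iteration (i == 0);
       n is loop invariant; if-combine can merge the two tests.  */
    for (int i = 0; i < 16 && n; i++)
      s += a[i];
    return s;
  }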
>
> diff --git a/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C b/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C
> index 711812aae1b..1ee1615526f 100644
> --- a/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C
> +++ b/gcc/testsuite/g++.dg/uninit-pred-loop-1_c.C
> @@ -15,7 +15,7 @@ int foo(int n, int m)
>        _err;
>      });
>
> -    if (err == 0) return 17;
> +    if (err == 0) return 17; /* { dg-warning "uninitialized" "warning" } */
>    }
>
>    return 18;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
> index 3c9b3807041..e2a6c75f2e9 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
> @@ -3,9 +3,10 @@
>
>  int is_sorted(int *a, int n, int m, int k)
>  {
> -  for (int i = 0; i < n - 1 && m && k > i; i++)
> -    if (a[i] > a[i + 1])
> -      return 0;
> +  if (k > 0)
> +    for (int i = 0; i < n - 1 && m && k > i; i++)
> +      if (a[i] > a[i + 1])
> +        return 0;
>    return 1;
>  }
>
> @@ -13,4 +14,8 @@ int is_sorted(int *a, int n, int m, int k)
>     the invariant test, not the alternate exit test.  */
>
>  /* { dg-final { scan-tree-dump "is now do-while loop" "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Conditional combines static and invariant" 0 "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Will elliminate invariant exit" 1 "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Will eliminate peeled conditional" 1 "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Not duplicating bb .: condition based on non-IV loop variant." 1 "ch2" } } */
>  /* { dg-final { scan-tree-dump-times "Will duplicate bb" 3 "ch2" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c
> new file mode 100644
> index 00000000000..8b4b5e7ea81
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-8.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-ch2-details" } */
> +
> +int is_sorted(int *a, int n, int m, int k)
> +{
> +  if (k > 0)
> +    for (int i = 0; i < n - 1 && m && k > i; i++)
> +      if (a[i] > a[i + 1])
> +        return 0;
> +  return 1;
> +}
> +
> +/* Verify we apply loop header copying but only copy the IV tests and
> +   the invariant test, not the alternate exit test.  */
> +
> +/* { dg-final { scan-tree-dump "is now do-while loop" "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Conditional combines static and invariant" 1 "ch2" } } */
> +/* { dg-final { scan-tree-dump-times "Will duplicate bb" 2 "ch2" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c
> index adfe371c7ce..faed9114f6f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_1.c
> @@ -9,6 +9,7 @@ long foo(long* p, long* p2, int N1, int N2)
>    long* p_limit = p + N1;
>    long* p_limit2 = p2 + N2;
>    long s = 0;
> +  if (p2 <= p_limit2)
>    while (p <= p_limit)
>      {
>        p++;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c
> index 50d0cc5d2ae..6a82aeb0268 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ivopt_mult_2.c
> @@ -8,15 +8,16 @@ long foo(long* p, long* p2, int N1, int N2)
>    int i = 0;
>    long* p_limit2 = p2 + N2;
>    long s = 0;
> -  while (i < N1)
> -    {
> -      p++;
> -      p2++;
> -      i++;
> -      if (p2 > p_limit2)
> -        break;
> -      s += (*p);
> -    }
> +  if (p2 <= p_limit2)
> +    while (i < N1)
> +      {
> +        p++;
> +        p2++;
> +        i++;
> +        if (p2 > p_limit2)
> +          break;
> +        s += (*p);
> +      }
>
>    return s;
>  }
> diff --git a/gcc/tree-ssa-loop-ch.cc b/gcc/tree-ssa-loop-ch.cc
> index e0139cb432c..f3dc3d998e3 100644
> --- a/gcc/tree-ssa-loop-ch.cc
> +++ b/gcc/tree-ssa-loop-ch.cc
> @@ -38,34 +38,33 @@ along with GCC; see the file COPYING3.  If not see
>  #include "value-range.h"
>  #include "gimple-range.h"
>  #include "gimple-range-path.h"
> +#include "gimple-pretty-print.h"
>  #include "cfganal.h"
>
> -/* Duplicates headers of loops if they are small enough, so that the statements
> -   in the loop body are always executed when the loop is entered.  This
> -   increases effectiveness of code motion optimizations, and reduces the need
> -   for loop preconditioning.  */
> +/* Return path query insteance for testing ranges of statements
> +   in headers of LOOP contained in basic block BB.
> +   Use RANGER instance.  */
>
> -/* Given a path through edge E, whose last statement is COND, return
> -   the range of the solved conditional in R.  */
> -
> -static void
> -edge_range_query (irange &r, class loop *loop, gcond *cond, gimple_ranger &ranger)
> +static path_range_query *
> +get_range_query (class loop *loop,
> +                 basic_block bb,
> +                 gimple_ranger &ranger)
>  {
>    auto_vec<basic_block> path;
> -  for (basic_block bb = gimple_bb (cond); bb != loop->header; bb = single_pred_edge (bb)->src)
> +  for (; bb != loop->header; bb = single_pred_edge (bb)->src)
>      path.safe_push (bb);
>    path.safe_push (loop->header);
>    path.safe_push (loop_preheader_edge (loop)->src);
> -  path_range_query query (ranger, path);
> -  if (!query.range_of_stmt (r, cond))
> -    r.set_varying (boolean_type_node);
> +  return new path_range_query (ranger, path);
>  }
>
>  /* Return edge that is true in the first iteration of the loop
> -   and NULL otherwise.  */
> +   and NULL otherwise.
> +   Formulate corrent ranger query to RANGER.  */
>
>  static edge
> -static_loop_exit (class loop *l, basic_block bb, gimple_ranger *ranger)
> +static_loop_exit (class loop *l, basic_block bb, gimple_ranger &ranger,
> +                  path_range_query *&query)
>  {
>    gcond *last = safe_dyn_cast <gcond *> (*gsi_last_bb (bb));
>    edge ret_e;
> @@ -83,21 +82,48 @@ static_loop_exit (class loop *l, basic_block bb, gimple_ranger *ranger)
>
>    int_range<1> desired_static_range;
>    if (loop_exit_edge_p (l, true_e))
> -  {
> +    {
>        desired_static_range = range_false ();
>        ret_e = true_e;
> -  }
> +    }
>    else
> -  {
> -      desired_static_range = range_true ();
> -      ret_e = false_e;
> -  }
> +    {
> +      desired_static_range = range_true ();
> +      ret_e = false_e;
> +    }
> +
> +  if (!query)
> +    query = get_range_query (l, gimple_bb (last), ranger);
>
>    int_range<2> r;
> -  edge_range_query (r, l, last, *ranger);
> +  if (!query->range_of_stmt (r, last))
> +    return NULL;
>    return r == desired_static_range ? ret_e : NULL;
>  }
>
> +/* Return true if STMT is static in LOOP.  This means that its value
> +   is constant in the first iteration.
> +   Use RANGER and formulate query cached in QUERY.  */
> +
> +static bool
> +loop_static_stmt_p (class loop *loop,
> +                    gimple_ranger &ranger,
> +                    path_range_query *&query,
> +                    gimple *stmt)
> +{
> +  tree type = gimple_range_type (stmt);
> +  if (!type || !Value_Range::supports_type_p (type))
> +    return false;
> +
> +  if (!query)
> +    query = get_range_query (loop, gimple_bb (stmt), ranger);
> +
> +  Value_Range r (gimple_range_type (stmt));
> +  if (!query->range_of_stmt (r, stmt))
> +    return false;
> +  return r.singleton_p ();
> +}
> +
>  /* Return true if OP is invariant.  */
>
>  static bool
> @@ -109,21 +135,37 @@ loop_invariant_op_p (class loop *loop,
>    if (SSA_NAME_IS_DEFAULT_DEF (op)
>        || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (op))))
>      return true;
> +  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1;
> +}
> +
> +/* Return true if OP combines outcome of static and
> +   loop invariant conditional.  */
> +
> +static bool
> +loop_static_op_p (class loop *loop, tree op)
> +{
> +  /* Always check for invariant first.  */
> +  gcc_checking_assert (!is_gimple_min_invariant (op)
> +                       && !SSA_NAME_IS_DEFAULT_DEF (op)
> +                       && flow_bb_inside_loop_p (loop,
> +                               gimple_bb (SSA_NAME_DEF_STMT (op))));
>    return gimple_uid (SSA_NAME_DEF_STMT (op)) & 2;
>  }
>
> -/* Return true if OP looks like it is derived from IV.  */
> +
> +/* Return true if OP combines outcome of static and
> +   loop invariant conditional.  */
>
>  static bool
> -loop_iv_derived_p (class loop *loop,
> -                   tree op)
> +loop_combined_static_and_iv_p (class loop *loop,
> +                               tree op)
>  {
>    /* Always check for invariant first.  */
>    gcc_checking_assert (!is_gimple_min_invariant (op)
>                         && !SSA_NAME_IS_DEFAULT_DEF (op)
>                         && flow_bb_inside_loop_p (loop,
>                                 gimple_bb (SSA_NAME_DEF_STMT (op))));
> -  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 1;
> +  return gimple_uid (SSA_NAME_DEF_STMT (op)) & 4;
>  }
>
>  /* Check whether we should duplicate HEADER of LOOP.  At most *LIMIT
> @@ -182,25 +224,18 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>        return false;
>      }
>
> +  path_range_query *query = NULL;
>    for (gphi_iterator psi = gsi_start_phis (header); !gsi_end_p (psi);
>         gsi_next (&psi))
> -    /* If this is actual loop header PHIs indicate that the SSA_NAME
> -       may be IV.  Otherwise just give up.  */
> -    if (header == loop->header)
> +    if (!virtual_operand_p (gimple_phi_result (psi.phi ())))
>        {
> -        gphi *phi = psi.phi ();
> -        tree res = gimple_phi_result (phi);
> -        if (INTEGRAL_TYPE_P (TREE_TYPE (res))
> -            || POINTER_TYPE_P (TREE_TYPE (res)))
> -          gimple_set_uid (phi, 1 /* IV */);
> -        else
> -          gimple_set_uid (phi, 0);
> +        bool static_p = loop_static_stmt_p (loop, *ranger,
> +                                            query, psi.phi ());
> +        gimple_set_uid (psi.phi (), static_p ? 2 : 0);
>        }
> -    else
> -      gimple_set_uid (psi.phi (), 0);
>
>    /* Count number of instructions and punt on calls.
> -     Populate stmts INV/IV flag to later apply heuristics to the
> +     Populate stmts INV flag to later apply heuristics to the
>       kind of conditions we want to copy.  */
>    for (bsi = gsi_start_bb (header); !gsi_end_p (bsi); gsi_next (&bsi))
>      {
> @@ -215,6 +250,12 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>        if (gimple_code (last) == GIMPLE_COND)
>          break;
>
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        {
> +          fprintf (dump_file, "  Analyzing: ");
> +          print_gimple_stmt (dump_file, last, 0, TDF_SLIM);
> +        }
> +
>        if (gimple_code (last) == GIMPLE_CALL
>            && (!gimple_inexpensive_call_p (as_a <gcall *> (last))
>                /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed
> @@ -225,53 +266,152 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>              fprintf (dump_file,
>                       "  Not duplicating bb %i: it contains call.\n",
>                       header->index);
> +          if (query)
> +            delete query;
>            return false;
>          }
>
> -      /* Classify the stmt based on whether its computation is based
> -         on a IV or whether it is invariant in the loop.  */
> +      /* Classify the stmt is invariant in the loop.  */
>        gimple_set_uid (last, 0);
>        if (!gimple_vuse (last)
>            && gimple_code (last) != GIMPLE_ASM
>            && (gimple_code (last) != GIMPLE_CALL
>                || gimple_call_flags (last) & ECF_CONST))
>          {
> -          bool inv = true;
> -          bool iv = true;
> +          bool inv = true, static_p = false;
>            ssa_op_iter i;
>            tree op;
>            FOR_EACH_SSA_TREE_OPERAND (op, last, i, SSA_OP_USE)
>              if (!loop_invariant_op_p (loop, op))
> -              {
> -                if (!loop_iv_derived_p (loop, op))
> -                  {
> -                    inv = false;
> -                    iv = false;
> -                    break;
> -                  }
> -                else
> -                  inv = false;
> -              }
> -          gimple_set_uid (last, (iv ? 1 : 0) | (inv ? 2 : 0));
> +              inv = false;
> +          /* Assume that code is reasonably optimized and invariant
> +             constants are already identified.  */
> +          if (!inv)
> +            static_p = loop_static_stmt_p (loop, *ranger, query, last);
> +          gimple_set_uid (last, (inv ? 1 : 0) | (static_p ? 2 : 0));
> +          if (dump_file && (dump_flags & TDF_DETAILS))
> +            {
> +              if (inv)
> +                fprintf (dump_file, "    Stmt is loop invariant\n");
> +              if (static_p)
> +                fprintf (dump_file, "    Stmt is static "
> +                         "(constant in the first iteration)\n");
> +            }
>            /* Loop invariants will be optimized out in loop body after
>               duplication; do not account invariant computation in code
> -             size costs.  */
> -          if (inv)
> +             size costs.
> +
> +             Similarly static computations will be optimized out in the
> +             duplicatd header.  */
> +          if (inv || static_p)
>              continue;
> +
> +          /* Match the following:
> +             _1 = i_1 < 10    <- static condtion
> +             _2 = n != 0      <- loop invariant condition
> +             _3 = _1 & _2     <- combined static and iv statement.  */
> +          if (gimple_code (last) == GIMPLE_ASSIGN
> +              && (gimple_assign_rhs_code (last) == TRUTH_AND_EXPR
> +                  || gimple_assign_rhs_code (last) == TRUTH_OR_EXPR
> +                  || gimple_assign_rhs_code (last) == TRUTH_XOR_EXPR
> +                  || gimple_assign_rhs_code (last) == BIT_AND_EXPR
> +                  || gimple_assign_rhs_code (last) == BIT_IOR_EXPR
> +                  || gimple_assign_rhs_code (last) == BIT_XOR_EXPR))

Please make this cheaper by doing something like

  gassign *last_ass = dyn_cast <gassign *> (last);
  enum tree_code code;
  if (last_ass
      && ((code = gimple_assign_rhs_code (last_ass), true))
      && (code == TRUTH_ ....))

and use last_ass in the following block.  You save checking code for each
call, and especially gimple_assign_rhs_code isn't very cheap (though it
should be all inlined and optimized eventually).

Note TRUTH_* can never appear here, those are not valid GIMPLE codes.

Otherwise looks good to me.

Thanks,
Richard.

> +            {
> +              tree op1 = gimple_assign_rhs1 (last);
> +              tree op2 = gimple_assign_rhs2 (last);
> +
> +              if ((loop_invariant_op_p (loop, op1)
> +                   || loop_combined_static_and_iv_p (loop, op1)
> +                   || loop_static_op_p (loop, op1))
> +                  && (loop_invariant_op_p (loop, op2)
> +                      || loop_combined_static_and_iv_p (loop, op2)
> +                      || loop_static_op_p (loop, op2)))
> +                {
> +                  /* Duplicating loop header with combned conditional will
> +                     remove this statement in each copy.  But we account for
> +                     that later when seeing that condition.
> +
> +                     Note that this may be overly optimistic for bit operations
> +                     where the static parameter may still result in non-trivial
> +                     bit operation.  */
> +                  gimple_set_uid (last, 4);
> +                  if (dump_file && (dump_flags & TDF_DETAILS))
> +                    fprintf (dump_file,
> +                             "    Stmt combines static and invariant op\n");
> +                  continue;
> +                }
> +            }
>          }
>
> -      *limit -= estimate_num_insns (last, &eni_size_weights);
> +      int insns = estimate_num_insns (last, &eni_size_weights);
> +      *limit -= insns;
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        fprintf (dump_file,
> +                 "    Acconting stmt as %i insns\n", insns);
>        if (*limit < 0)
>          {
>            if (dump_file && (dump_flags & TDF_DETAILS))
>              fprintf (dump_file,
>                       "  Not duplicating bb %i contains too many insns.\n",
>                       header->index);
> +          if (query)
> +            delete query;
>            return false;
>          }
>      }
>
> -  edge static_exit = static_loop_exit (loop, header, ranger);
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    {
> +      fprintf (dump_file, "  Analyzing: ");
> +      print_gimple_stmt (dump_file, last, 0, TDF_SLIM);
> +    }
> +
> +  /* If the condition tests a non-IV loop variant we do not want to rotate
> +     the loop further.  Unless this is the original loop header.  */
> +  tree lhs = gimple_cond_lhs (last);
> +  tree rhs = gimple_cond_rhs (last);
> +  bool lhs_invariant = loop_invariant_op_p (loop, lhs);
> +  bool rhs_invariant = loop_invariant_op_p (loop, rhs);
> +
> +  /* Combined conditional is a result of if combining:
> +
> +     _1 = i_1 < 10      <- static condtion
> +     _2 = n != 0        <- loop invariant condition
> +     _3 = _1 & _2       <- combined static and iv statement
> +     if (_3 != 0)       <- combined conditional
> +
> +     Combined conditionals will not be optimized out in either copy.
> +     However duplicaed header simplifies to:
> +
> +     if (n < 10)
> +
> +     while loop body to
> +
> +     if (i_1 < 10)
> +
> +     So effectively the resulting code sequence will be of same length as
> +     the original code.
> +
> +     Combined conditionals are never static or invariant, so save some work
> +     below.  */
> +  if (lhs_invariant != rhs_invariant
> +      && (lhs_invariant
> +          || loop_combined_static_and_iv_p (loop, lhs))
> +      && (rhs_invariant
> +          || loop_combined_static_and_iv_p (loop, rhs)))
> +    {
> +      if (query)
> +        delete query;
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +        fprintf (dump_file,
> +                 "  Conditional combines static and invariant op.\n");
> +      return true;
> +    }
> +
> +  edge static_exit = static_loop_exit (loop, header, *ranger, query);
> +  if (query)
> +    delete query;
>
>    if (static_exit && static_exits)
>      {
> @@ -282,13 +422,6 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>                   static_exit->src->index);
>        /* Still look for invariant exits; exit may be both.  */
>      }
> -
> -  /* If the condition tests a non-IV loop variant we do not want to rotate
> -     the loop further.  Unless this is the original loop header.  */
> -  tree lhs = gimple_cond_lhs (last);
> -  tree rhs = gimple_cond_rhs (last);
> -  bool lhs_invariant = loop_invariant_op_p (loop, lhs);
> -  bool rhs_invariant = loop_invariant_op_p (loop, rhs);
>    if (lhs_invariant && rhs_invariant)
>      {
>        if (invariant_exits)
> @@ -312,7 +445,11 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>      return true;
>
>    /* We was not able to prove that conditional will be eliminated.  */
> -  *limit -= estimate_num_insns (last, &eni_size_weights);
> +  int insns = estimate_num_insns (last, &eni_size_weights);
> +  *limit -= insns;
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    fprintf (dump_file,
> +             "    Acconting stmt as %i insns\n", insns);
>    if (*limit < 0)
>      {
>        if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -322,12 +459,6 @@ should_duplicate_loop_header_p (basic_block header, class loop *loop,
>        return false;
>      }
>
> -  /* TODO: This is heuristics that claims that IV based ocnditionals will
> -     likely be optimized out in duplicated header.  We could use ranger
> -     query instead to tell this more precisely.  */
> -  if ((lhs_invariant || loop_iv_derived_p (loop, lhs))
> -      && (rhs_invariant || loop_iv_derived_p (loop, rhs)))
> -    return true;
>    if (header != loop->header)
>      {
>        if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -550,7 +681,7 @@ public:
>
>    /* opt_pass methods: */
>    bool gate (function *) final override { return flag_tree_ch != 0; }
> -
> +
>    /* Initialize and finalize loop structures, copying headers inbetween.  */
>    unsigned int execute (function *) final override;
>
> @@ -590,7 +721,7 @@ public:
>      return flag_tree_ch != 0
>             && (flag_tree_loop_vectorize != 0 || fun->has_force_vectorize_loops);
>    }
> -
> +
>    /* Just copy headers, no initialization/finalization of loop structures.  */
>    unsigned int execute (function *) final override;
>
> @@ -973,7 +1104,7 @@ pass_ch::execute (function *fun)
>  /* Assume an earlier phase has already initialized all the loop structures that
>     we need here (and perhaps others too), and that these will be finalized by
>     a later phase.  */
> -
> +
>  unsigned int
>  pass_ch_vect::execute (function *fun)
>  {
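[Editor's sketch, not part of the original mail: one possible shape of the
restructuring Richard asks for above, shown only as an illustration; the
exact form is up to the patch author.  It casts to gassign once via dyn_cast,
reads the rhs code once, and drops the TRUTH_* codes that cannot occur in
GIMPLE.]

  gassign *last_ass = dyn_cast <gassign *> (last);
  enum tree_code code;
  if (last_ass
      /* Read the rhs code once; the comma expression keeps the assignment
         usable inside the condition.  */
      && ((code = gimple_assign_rhs_code (last_ass)), true)
      && (code == BIT_AND_EXPR
          || code == BIT_IOR_EXPR
          || code == BIT_XOR_EXPR))
    {
      tree op1 = gimple_assign_rhs1 (last_ass);
      tree op2 = gimple_assign_rhs2 (last_ass);
      /* ... same invariant/static/combined checks as in the patch ...  */
    }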