public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Bin.Cheng" <amker.cheng@gmail.com>
To: "Kewen.Lin" <linkw@linux.ibm.com>
Cc: gcc-patches List <gcc-patches@gcc.gnu.org>,
		Segher Boessenkool <segher@kernel.crashing.org>,
	Bill Schmidt <wschmidt@linux.ibm.com>,
		Richard Guenther <rguenther@suse.de>
Subject: Re: [PATCH v6 3/3] PR80791 Consider doloop cmp use in ivopts
Date: Fri, 23 Aug 2019 10:43:00 -0000	[thread overview]
Message-ID: <CAHFci2_xTa20gDMmf3omvDxki=cL7Cc02iFX_8P3bMGoiFSisQ@mail.gmail.com> (raw)
In-Reply-To: <ca838609-51dc-37b0-f2d4-f6299aa99e8d@linux.ibm.com>

On Fri, Aug 23, 2019 at 4:27 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi Bin
>
> on 2019/8/23 上午10:19, Bin.Cheng wrote:
> > On Thu, Aug 22, 2019 at 3:09 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>
> >> Hi Bin,
> >>
> >> on 2019/8/22 下午1:46, Bin.Cheng wrote:
> >>> On Thu, Aug 22, 2019 at 11:18 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>
> >>>> Hi Bin,
> >>>>
> >>>> Thanks for your time!
> >>>>
> >>>> on 2019/8/21 下午8:32, Bin.Cheng wrote:
> >>>>> On Wed, Aug 14, 2019 at 3:23 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>>>>
> >>>>>> Hi!
> >>>>>>
> >>>>>> Comparing to the previous versions of implementation mainly based on the
> >>>>>> existing IV cands but zeroing the related group/use cost, this new one is based
> >>>>>> on Richard and Segher's suggestion introducing one doloop dedicated IV cand.
> >>>>>>
> >>>>>> Some key points are listed below:
> >>>>>>   1) New field doloop_p in struct iv_cand to indicate doloop dedicated IV cand.
> >>>>>>   2) Special name "doloop" assigned.
> >>>>>>   3) Doloop IV cand with form (niter+1, +, -1)
> >>>>>>   4) For doloop IV cand, no extra one cost like BIV, assign zero cost for step.
> >>>>>>   5) Support may_be_zero (regressed PR is in this case), the base of doloop IV
> >>>>>>      can be COND_EXPR, add handlings in cand_value_at and may_eliminate_iv.
> >>>>>>   6) Add more expr support in force_expr_to_var_cost for reasonable cost
> >>>>>>      calculation on the IV base with may_be_zero (like COND_EXPR).
> >>>>>>   7) Set zero cost when using doloop IV cand for doloop use.
> >>>>>>   8) Add three hooks (should we merge _generic and _address?).
> >>>>>>     *) have_count_reg_decr_p, is to indicate the target has special hardware
> >>>>>>        count register, we shouldn't consider the impact of doloop IV when
> >>>>>>        calculating register pressures.
> >>>>>>     *) doloop_cost_for_generic, is the extra cost when using doloop IV cand for
> >>>>>>        generic type IV use.
> >>>>>>     *) doloop_cost_for_address, is the extra cost when using doloop IV cand for
> >>>>>>        address type IV use.
>
> >> The new patch addressing the comments is attached.
> >> Could you please have a look again?  Thanks in advance!
> > Thanks for working on this.  A bit more nit-pickings.
> >
> > -    add_candidate_1 (data, base, step, important,
> > -                    IP_NORMAL, use, NULL, orig_iv);
> > -  if (ip_end_pos (data->current_loop)
> > +    add_candidate_1 (data, base, step, important, IP_NORMAL, use, NULL, doloop,
> > +                    orig_iv);
> > +  if (!doloop && ip_end_pos (data->current_loop)
> > Could you add some comments elaborating why ip_end_pos candidate
> > shouldn't be added for doloop case?  Because the increment position is
> > wrong.
> >
> > Also if you make doloop the last default parameter of add_candidate_1,
> > you can save more unnecessary changes to calls to add_candidate?
> >
> > -    cost = get_computation_cost (data, use, cand, false,
> > -                                &inv_vars, NULL, &inv_expr);
> > +    {
> > +      cost = get_computation_cost (data, use, cand, false, &inv_vars, NULL,
> > +                                  &inv_expr);
> > +      if (cand->doloop_p)
> > +       cost += targetm.doloop_cost_for_generic;
> > +    }
> > This adjustment
> >
> >    cost = get_computation_cost (data, use, cand, true,
> >                                &inv_vars, &can_autoinc, &inv_expr);
> >
> > +  if (cand->doloop_p)
> > +    cost += targetm.doloop_cost_for_address;
> > +
> > and this adjustment can be moved into get_computation_cost where all
> > cost adjustments are done.
> >
> > +      /* For doloop candidate/use pair, adjust to zero cost.  */
> > +      if (group->doloop_p && cand->doloop_p)
> > +       cost = no_cost;
> > Note above code handles comparing against zero case and decreases the
> > cost by one (which prefers the same kind candidate as doloop one),
> > it's very possible to have -1 cost for doloop cand here.  how about
> > just set to no_cost if it's positive?  Your call.
> >
> > +/* For the targets which support doloop, to predict whether later RTL doloop
> > +   transformation will perform on this loop, further detect the doloop use and
> > +   mark the flag doloop_use_p if predicted.  */
> > +
> > +void
> > +predict_and_process_doloop (struct ivopts_data *data)
> > A better name here? Sorry I don't have another candidate in mind...
> >
> > +  data->doloop_use_p = false;
> > This can be moved to the beginning of above
> > 'predict_and_process_doloop' function.
> >
> > Lastly, could you please add some brief description/comment about
> > doloop handling as a subsection in the file head comment?
> >
> > Otherwise, the ivopt changes look good to me.
> >
> > Thanks,
> > bin
> >
>
> Thanks for your prompt reply!  I've updated the code as your comments,
> the updated version is attached.  Looking forward to your review again.

Sorry to bother.

-      return get_scaled_computation_cost_at (data, at, cost);
+      cost = get_scaled_computation_cost_at (data, at, cost);
+      /* For doloop IV cand, add on the extra cost.  */
+      cost += cand->doloop_p ? targetm.doloop_cost_for_address : 0;
+      return cost;
Here the cost is adjusted after scaling, while:

+  /* For doloop IV cand, add on the extra cost.  */
+  if (cand->doloop_p && use->type == USE_NONLINEAR_EXPR)
+    cost += targetm.doloop_cost_for_generic;
+
   return get_scaled_computation_cost_at (data, at, cost);
is adjusted before scaling.  Please work consistently.

+      /* Simply use 1.5 * add cost for now, FIXME if there is some
more accurate
+        cost evaluation way.  */
+      cost = comp_cost (1.5 * add_cost (speed, mode), 0);
+      break;
Is 1.5 important for some test cases?  Can we simply use 1 instead?
Or at least use xxx * 2 / 3 in order to avoid floating number.

Not sure if non-ivopts parts are already approved?  If so, the patch
is okay with above issues addressed.

Thanks very much for your time!

Thanks,
bin
>
>
> Thanks,
> Kewen
>
> -----
>
> gcc/ChangeLog
>
> 2019-08-23  Kewen Lin  <linkw@gcc.gnu.org>
>
>         PR middle-end/80791
>         * config/rs6000/rs6000.c (TARGET_HAVE_COUNT_REG_DECR_P): New macro.
>         (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>         (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>         * target.def (have_count_reg_decr_p): New hook.
>         (doloop_cost_for_generic): Likewise.
>         (doloop_cost_for_address): Likewise.
>         * doc/tm.texi.in (TARGET_HAVE_COUNT_REG_DECR_P): Likewise.
>         (TARGET_DOLOOP_COST_FOR_GENERIC): Likewise.
>         (TARGET_DOLOOP_COST_FOR_ADDRESS): Likewise.
>         * doc/tm.texi: Regenerate.
>         * tree-ssa-loop-ivopts.c (comp_cost::operator+=): Consider infinite cost
>         addend.
>         (record_group): Init doloop_p.
>         (add_candidate_1): Add optional argument doloop, change the handlings
>         accordingly.
>         (add_candidate): Likewise.
>         (generic_predict_doloop_p): Update attribute.
>         (force_expr_to_var_cost): Add costing for expressions COND_EXPR/LT_EXPR/
>         LE_EXPR/GT_EXPR/GE_EXPR/EQ_EXPR/NE_EXPR/UNORDERED_EXPR/ORDERED_EXPR/
>         UNLT_EXPR/UNLE_EXPR/UNGT_EXPR/UNGE_EXPR/UNEQ_EXPR/LTGT_EXPR/MAX_EXPR/
>         MIN_EXPR.
>         (get_computation_cost): Update for doloop IV cand extra cost.
>         (determine_group_iv_cost_cond): Update for doloop IV cand.
>         (determine_iv_cost): Likewise.
>         (ivopts_estimate_reg_pressure): Likewise.
>         (may_eliminate_iv): Update handlings for doloop IV cand.
>         (add_iv_candidate_for_doloop): New function.
>         (find_iv_candidates): Call function add_iv_candidate_for_doloop.
>         (iv_ca_set_no_cp): Update for doloop IV cand.
>         (iv_ca_set_cp): Likewise.
>         (iv_ca_dump): Dump register cost.
>         (find_doloop_use): New function.
>         (analyze_and_mark_doloop_use): Likewise.
>         (tree_ssa_iv_optimize_loop): Call function analyze_and_mark_doloop_use.
>
> gcc/testsuite/ChangeLog
>
> 2019-08-23  Kewen Lin  <linkw@gcc.gnu.org>
>
>         PR middle-end/80791
>         * gcc.dg/tree-ssa/ivopts-3.c: Adjust for doloop change.
>         * gcc.dg/tree-ssa/ivopts-lt.c: Likewise.
>         * gcc.dg/tree-ssa/pr32044.c: Likewise.
>
>

  reply	other threads:[~2019-08-23  9:43 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-14  3:10 [PATCH v2 3/3] " linkw
2019-05-14  7:26 ` Richard Biener
2019-05-15  5:03   ` Kewen.Lin
2019-05-15  8:47     ` Richard Biener
2019-05-15 16:17       ` Segher Boessenkool
2019-05-16  7:25         ` Richard Biener
2019-05-16 17:35           ` Segher Boessenkool
2019-05-16  3:53       ` Kewen.Lin
2019-05-16 18:41       ` Jeff Law
2019-05-16 21:42         ` Segher Boessenkool
2019-06-19 11:47 ` [PATCH v3 3/3] PR80791 " Kewen.Lin
2019-06-20  9:09   ` Segher Boessenkool
2019-06-20 12:08     ` Kewen.Lin
2019-06-20 12:17       ` Kewen.Lin
2019-07-10  2:31         ` [PING^1][PATCH v4 " Kewen.Lin
2019-07-12 12:40           ` Richard Biener
2019-07-12 14:10             ` Segher Boessenkool
2019-07-15  6:40             ` Kewen.Lin
2019-07-15  6:50             ` Bin.Cheng
2019-07-21  9:06   ` [PATCH v3 " Bin.Cheng
2019-07-22  5:42     ` Kewen.Lin
2019-07-22  6:53       ` Segher Boessenkool
2019-07-22  7:18         ` Kewen.Lin
2019-07-22  8:02         ` Richard Biener
2019-07-22 21:47           ` Segher Boessenkool
2019-07-23  6:14             ` Kewen.Lin
2019-07-23  7:38             ` Richard Biener
2019-07-23  6:09           ` Kewen.Lin
2019-07-23  8:05             ` Richard Biener
2019-07-23  6:28       ` [PATCH v5 " Kewen.Lin
2019-08-14  7:48         ` [PATCH v6 " Kewen.Lin
2019-08-21 13:42           ` Bin.Cheng
2019-08-22  7:09             ` Kewen.Lin
2019-08-22  8:07               ` Bin.Cheng
2019-08-22  9:16                 ` Kewen.Lin
2019-08-23  5:31                   ` Bin.Cheng
2019-08-23  9:57                     ` Kewen.Lin
2019-08-23 10:43                       ` Bin.Cheng [this message]
2019-08-23 11:02                         ` Segher Boessenkool
2019-09-11  6:18                           ` Kewen.Lin
2019-09-12  8:14                             ` Richard Biener
2019-09-14  9:35                               ` Kewen.Lin
2019-08-24 22:43                         ` Kewen.Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHFci2_xTa20gDMmf3omvDxki=cL7Cc02iFX_8P3bMGoiFSisQ@mail.gmail.com' \
    --to=amker.cheng@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=linkw@linux.ibm.com \
    --cc=rguenther@suse.de \
    --cc=segher@kernel.crashing.org \
    --cc=wschmidt@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).