From: "Bin.Cheng" <amker.cheng@gmail.com>
Date: Mon, 06 May 2019 01:50:00 -0000
Message-ID: <CAHFci2_ug9HdMUO0eeDbgJu6qRWM3Fo2w=Z+XW9+e7ZLw9D2bA@mail.gmail.com>
Subject: Re: [PATCH, RFC, rs6000] PR80791 Consider doloop in ivopts
To: "Kewen.Lin" <linkw@linux.ibm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, "bin.cheng" <bin.cheng@linux.alibaba.com>, 	Segher Boessenkool <segher@kernel.crashing.org>, Bill Schmidt <wschmidt@linux.ibm.com>, 	Richard Guenther <rguenther@suse.de>, Jakub Jelinek <jakub@redhat.com>

On Sun, May 5, 2019 at 2:02 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2019/5/5 12:04 PM, Bin.Cheng wrote:
> > On Sun, May 5, 2019 at 11:23 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
> >>>> +  /* Some compare iv_use is probably useless once the doloop optimization
> >>>> +     performs.  */
> >>>> +  if (tailor_cmp_p)
> >>>> +    tailor_cmp_uses (data);
> >>> Function tailor_cmp_uses sets iv_use->zero_cost_p under some
> >>> conditions.  Once the flag is set, though the iv_use participates in cost
> >>> computation in determine_group_iv_cost_cond, the resulting cost is
> >>> hard-wired to ZERO (which means cost computation for such iv_use is
> >>> a waste of time).
> >>
> >> Yes, it can be improved by some early check and return.
> >> But it's still helpful to make it call with may_eliminate_iv.
> >> gcc.dg/no-strict-overflow-6.c is one example, with may_eliminate_iv
> >> consideration it exposes more opportunities for downstream optimization.
> > Hmm, I wasn't suggesting early check and return.  On the contrary, we
> > can better integrate doloop/cost stuff in the overall model.  See more
> > in the following comments.
>
> Sorry, I didn't state it clearly.  The previous comment was meant to say
> that the call with may_eliminate_iv is not completely a "waste of time",
> and an early return can save time.  :)
>
> And yes, it's not an issue any more with your proposed idea.
>
> >>
> >>> Also the iv_use rewriting process is skipped for related
> >>> ivs preserved explicitly by preserve_ivs_for_use.
> >>> Note IVOPTs already adds candidates for original ivs.  So why not
> >>> detect the doloop iv_use, adjust its cost with the corresponding
> >>> original iv candidate, then let the general cost model make the
> >>> decision?  I believe this better fits existing infrastructure and
> >>> requires less change, for example, preserve_ivs_for_use won't be
> >>> necessary.
> >> I agree adjusting the cost of the original iv candidate of the iv_use
> >> requires less change, but on one hand, I thought removing the cmp iv
> >> use of interest or making it zero cost is closer to the facts.  Since
> >> it's eventually eliminated by the doloop optimization, it should not be
> >> considered in cost modeling.  This way looks more exact.
> > Whether doloop transformation should be considered or be bypassed in
> > cost model isn't a problem.  Actually we can bind doloop iv_cand to
> > cmp iv_use in order to force the transformation. My concern is the
> > patch specially handles doloop by setting the special flag, then
> > checking it.  We generally avoid such special-case handling since it
> > hurts long-term maintenance.  The pass was very unstable in the past
> > because of such issues.
> >
>
> OK, I understand your concerns now. Thanks for explanation!
>
> >> On the other hand, I assumed your suggestion is to increase the
> >> cost for the pair (original iv cand, cmp iv use), the increased cost
> >> seems to be a heuristic value?  It seems it's possible to sacrifice
> > Decrease the cost so that the iv_cand is preferred?  The comment
> > wasn't made on top of implementing doloop in ivopts.  Anyway, you can
> > still bypass cost model by binding the "correct" iv_cand to cmp
> > iv_use.
> >
>
> Decreasing the cost isn't workable for this case: it makes the original
> iv cand preferred over the other iv cand for the memory iv use, so the
> desirable memory-based iv cand won't be chosen.
> If we increase the cost, one basic iv cand is chosen for the cmp use and
> the memory-based iv cand is chosen for the memory use, instead of the
> original iv for both.
Sorry for the mistake.  I meant decreasing the use cost of whatever "correct"
iv_cand for the cmp iv_use would enable the doloop optimization; it doesn't
have to be the original iv_cand.
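
As a toy illustration of the cost-adjustment route: the sketch below is not
IVOPTs code, and the candidate numbering, the cost values, and the exhaustive
search are all made up for the example.  It only shows that lowering the cmp
use cost for the doloop-friendly candidate can be enough to flip the selection
while the memory use still gets its memory-based candidate.

```cpp
#include <cassert>

// Toy model only: names and numbers are hypothetical, not GCC's.
constexpr int kUses = 2;   // use 0: loop-exit compare, use 1: memory access
constexpr int kCands = 3;  // cand 0: original iv, 1: pointer iv, 2: doloop counter

struct ToyModel {
  int use_cost[kUses][kCands];  // cost of expressing use u with candidate c
  int cand_cost[kCands];        // cost of keeping candidate c in the loop
};

struct Choice {
  int cmp_cand;
  int mem_cand;
  int cost;
};

// Try every (compare, memory) candidate pair and keep the cheapest; a
// candidate shared by both uses is only paid for once.
Choice choose(const ToyModel &m) {
  Choice best = {-1, -1, -1};
  for (int c0 = 0; c0 < kCands; ++c0)
    for (int c1 = 0; c1 < kCands; ++c1) {
      int cost = m.use_cost[0][c0] + m.use_cost[1][c1] + m.cand_cost[c0];
      if (c1 != c0)
        cost += m.cand_cost[c1];
      if (best.cost < 0 || cost < best.cost)
        best = {c0, c1, cost};
    }
  assert(best.cost >= 0);
  return best;
}

// Example numbers under which the original iv wins both uses by default.
ToyModel example_model() {
  return {{{2, 7, 3}, {4, 1, 9}}, {4, 4, 4}};
}
```

With these example numbers, choose() picks the original iv for both uses;
after setting use_cost[0][2] to 0 it picks the doloop counter for the compare
and the pointer iv for the memory access.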

>
> Could you help to explain the "binding" more?  Does it mean cost modeling
> decision making can simply bypass the cmp iv_use (we have artificially
> assigned one "correct" cand iv to it) and also doesn't count the "correct"
> iv cand cost into total iv cost? Is my understanding correct?
For example, if the heuristic finds the "correct" doloop iv_cand, we can
force that iv_cand to be chosen for the cmp iv_use and bypass the
candidate-choosing algorithm:
struct iv_group {
  //...
  /* Candidate forced for this group by the doloop heuristic, or NULL.  */
  struct iv_cand *bind_cand;
};
then set this bind_cand directly in struct iv_ca::cand_for_group.  As a result,
the iv_use rewriting code takes care of everything, and no special handling
(such as preserve_ivs_for_use) is needed.
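
To make the binding idea a bit more concrete, here is a minimal standalone
sketch.  The struct names mirror the ones mentioned above, but these are
simplified stand-ins, not the real definitions in tree-ssa-loop-ivopts.c; the
per-group cost vector and the assign_candidates helper are hypothetical and
only serve the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for the IVOPTs structures; the real ones carry far
// more state.
struct iv_cand {
  int id;
};

struct iv_group {
  std::vector<int> cost;  // cost[i]: hypothetical cost of using candidate i
  iv_cand *bind_cand;     // non-null: candidate forced by the doloop heuristic
};

struct iv_ca {
  std::vector<iv_cand *> cand_for_group;  // chosen candidate per group
};

// Fill in cand_for_group: bound groups take their bind_cand directly, all
// others fall back to a (toy) cheapest-candidate search.
iv_ca assign_candidates(std::vector<iv_group> &groups,
                        std::vector<iv_cand> &cands) {
  assert(!cands.empty());
  iv_ca ca;
  for (iv_group &g : groups) {
    assert(g.cost.size() == cands.size());
    if (g.bind_cand) {
      ca.cand_for_group.push_back(g.bind_cand);
      continue;
    }
    std::size_t best = 0;
    for (std::size_t i = 1; i < cands.size(); ++i)
      if (g.cost[i] < g.cost[best])
        best = i;
    ca.cand_for_group.push_back(&cands[best]);
  }
  return ca;
}
```

The point of the sketch is that the bound group never enters the search, so
whatever consumes cand_for_group afterwards needs no doloop-specific path.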

Whether to let the cost model decide the "correct" iv_cand or to bind it
yourself depends on how good your heuristic check is.  It's your call. :)

>
> >>> tuning;  2) the doloop decision can still be canceled by cost model if
> >>> it's really not beneficial.  With the current patch, it can't be undone once
> >>> the decision is made (at a very early stage in IVOPTs).
> >>
> >> I can't really follow this.  If it's predicted to be transformed to doloop,
> >> I think it should not be undone any more, since it's useless to consider
> >> this cmp iv use. Whatever IVOPTS does, the comp at loop closing should not
> >> be changed (although possible to use other iv), right?  Am I missing something?
> > As mentioned, the previous comment wasn't made on top of implementing
> > doloop in ivopts.  That would be nice but a different story.
> > Before we can do that, we'd better be conservative and only make the
> > (doloop) decision in ivopts when you are sure.  As you mentioned, it's
> > hard to do the same checks at gimple as RTL, right?  In this case,
> > making it a (conservative) heuristic capturing certain beneficial
> > cases sounds better than capturing all cases but failing in later RTL
> > passes.
> >
>
> Yes, I agree we should be conservative.  But it's hard to determine which is
> better in practice.  Even when capturing all cases, we still try our best
> to be conservative, excluding any suspicious factor that could make it fail
> in later RTL checking.  One example is that the patch won't predict a doloop
> once it finds a switch statement.  It depends on how many "bad" cases we
> don't catch, how serious their impact is, and whether it's easy to improve.
Sure, I don't know ppc, so I trust your decision here.

Thanks for your patience.

Thanks,
bin