From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-patches-return-504143-listarch-gcc-patches=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 102733 invoked by alias); 2 Jul 2019 08:51:14 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 102690 invoked by uid 89); 2 Jul 2019 08:51:14 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-5.8 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.1 spammy=releasing
X-HELO: mx1.suse.de
Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 02 Jul 2019 08:51:12 +0000
Received: from relay2.suse.de (unknown [195.135.220.254])	by mx1.suse.de (Postfix) with ESMTP id 02CF0AE32;	Tue,  2 Jul 2019 08:51:09 +0000 (UTC)
Date: Tue, 02 Jul 2019 08:51:00 -0000
From: Richard Biener <rguenther@suse.de>
To: Li Jia He <helijia@linux.ibm.com>
cc: Andrew Pinski <pinskia@gmail.com>, Jeff Law <law@redhat.com>,     GCC Patches <gcc-patches@gcc.gnu.org>,     Segher Boessenkool <segher@kernel.crashing.org>, wschmidt@linux.ibm.com
Subject: Re: [PATCH][middle-end/88784] Middle end is missing some optimizations about unsigned
In-Reply-To: <alpine.LSU.2.20.1907021007050.2976@zhemvz.fhfr.qr>
Message-ID: <alpine.LSU.2.20.1907021031040.2976@zhemvz.fhfr.qr>
References: <1561615913-22109-1-git-send-email-helijia@linux.ibm.com> <6fb28248-5134-cec5-5045-45253e4d2eb0@redhat.com> <6d333ccf-9905-e929-c2dc-fc611ff929f1@linux.ibm.com> <CA+=Sn1mgUXkEBQTzp5Nv5gcANtpZeutisEtQbez_L7xgg39ppw@mail.gmail.com> <alpine.LSU.2.20.1907010920230.10704@zhemvz.fhfr.qr> <bc69c05f-d966-cab7-b79f-a188523f9258@linux.ibm.com> <alpine.LSU.2.20.1907021007050.2976@zhemvz.fhfr.qr>
User-Agent: Alpine 2.20 (LSU 67 2015-01-07)
MIME-Version: 1.0
Content-Type: multipart/mixed; BOUNDARY="-1609908220-1984626218-1562057469=:2976"
X-SW-Source: 2019-07/txt/msg00111.txt.bz2

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

---1609908220-1984626218-1562057469=:2976
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Content-length: 8585

On Tue, 2 Jul 2019, Richard Biener wrote:

> On Tue, 2 Jul 2019, Li Jia He wrote:
> 
> > 
> > 
> > On 2019/7/1 3:30 PM, Richard Biener wrote:
> > > On Fri, 28 Jun 2019, Andrew Pinski wrote:
> > > 
> > > > On Thu, Jun 27, 2019 at 9:55 PM Li Jia He <helijia@linux.ibm.com> wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 2019/6/27 11:48 PM, Jeff Law wrote:
> > > > > > On 6/27/19 12:11 AM, Li Jia He wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > According to the optimizable case described by Qi Feng on
> > > > > > > issue 88784, we can combine the cases into the following:
> > > > > > > 
> > > > > > > 1. x >  y  &&  x != XXX_MIN  -->   x > y
> > > > > > > 2. x >  y  &&  x == XXX_MIN  -->   false
> > > > > > > 3. x <= y  &&  x == XXX_MIN  -->   x == XXX_MIN
> > > > > > > 
> > > > > > > 4. x <  y  &&  x != XXX_MAX  -->   x < y
> > > > > > > 5. x <  y  &&  x == XXX_MAX  -->   false
> > > > > > > 6. x >= y  &&  x == XXX_MAX  -->   x == XXX_MAX
> > > > > > > 
> > > > > > > 7. x >  y  ||  x != XXX_MIN  -->   x != XXX_MIN
> > > > > > > 8. x <= y  ||  x != XXX_MIN  -->   true
> > > > > > > 9. x <= y  ||  x == XXX_MIN  -->   x <= y
> > > > > > > 
> > > > > > > 10. x <  y  ||  x != XXX_MAX  -->   x != UXXX_MAX
> > > > > > > 11. x >= y  ||  x != XXX_MAX  -->   true
> > > > > > > 12. x >= y  ||  x == XXX_MAX  -->   x >= y
> > > > > > > 
> > > > > > > Note: XXX_MIN represents the minimum value of type x.
> > > > > > >         XXX_MAX represents the maximum value of type x.
> > > > > > > 
> > > > > > > Here we don't need to care about whether the operation is
> > > > > > > signed or unsigned.  For example, in the below equation:
> > > > > > > 
> > > > > > > 'x >  y  &&  x != XXX_MIN  -->   x > y'
> > > > > > > 
> > > > > > > If the x type is signed int and XXX_MIN is INT_MIN, we can
> > > > > > > optimize it to 'x > y'.  However, if the type of x is unsigned
> > > > > > > int and XXX_MIN is 0, we can still optimize it to 'x > y'.
> > > > > > > 
> > > > > > > The regression testing for the patch was done on GCC mainline on
> > > > > > > 
> > > > > > >       powerpc64le-unknown-linux-gnu (Power 9 LE)
> > > > > > > 
> > > > > > > with no regressions.  Is it OK for trunk ?
> > > > > > > 
> > > > > > > Thanks,
> > > > > > > Lijia He
> > > > > > > 
> > > > > > > gcc/ChangeLog
> > > > > > > 
> > > > > > > 2019-06-27  Li Jia He  <helijia@linux.ibm.com>
> > > > > > >           Qi Feng  <ffengqi@linux.ibm.com>
> > > > > > > 
> > > > > > >       PR middle-end/88784
> > > > > > >       * gimple-fold.c (and_comparisons_contain_equal_operands): New
> > > > > > > function.
> > > > > > >       (and_comparisons_1): Use
> > > > > > > and_comparisons_contain_equal_operands.
> > > > > > >       (or_comparisons_contain_equal_operands): New function.
> > > > > > >       (or_comparisons_1): Use or_comparisons_contain_equal_operands.
> > > > > > Would this be better done via match.pd?  ISTM this transformation
> > > > > > would
> > > > > > be well suited for that framework.
> > > > > 
> > > > > Hi, Jeff
> > > > > 
> > > > > I did this because of the following test case:
> > > > > `
> > > > > _Bool comp(unsigned x, unsigned y)
> > > > > {
> > > > >     return x > y && x != 0;
> > > > > }
> > > > > `
> > > > > The gimple file dumped on the power platform is:
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > >     int iftmp.0;
> > > > > 
> > > > >     if (x > y) goto <D.2841>; else goto <D.2839>;
> > > > >     <D.2841>:
> > > > >     if (x != 0) goto <D.2842>; else goto <D.2839>;
> > > > >     <D.2842>:
> > > > >     iftmp.0 = 1;
> > > > >     goto <D.2840>;
> > > > >     <D.2839>:
> > > > >     iftmp.0 = 0;
> > > > >     <D.2840>:
> > > > >     D.2837 = (_Bool) iftmp.0;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > > However, the gimple file dumped on x86 is
> > > > > `
> > > > > comp (unsigned int x, unsigned int y)
> > > > > {
> > > > >     _Bool D.2837;
> > > > > 
> > > > >     _1 = x > y;
> > > > >     _2 = x != 0;
> > > > >     _3 = _1 & _2;
> > > > >     _4 = (int) _3;
> > > > >     D.2837 = (_Bool) _4;
> > > > >     return D.2837;
> > > > > }
> > > > > `
> > > > > 
> > > > > The reason for the inconsistency between these two behaviors is param
> > > > > logical-op-non-short-circuit.  If we add the pattern to the match.pd
> > > > > file, we can only optimize the situation in which the statement is in
> > > > > the same basic block (logical-op-non-short-circuit=1, x86).  But for
> > > > > a cross-basic block (logical-op-non-short-circuit=0, power), match.pd
> > > > > can't handle this situation.
> > > > > 
> > > > > Another reason is that I found out maybe_fold_and_comparisons and
> > > > > maybe_fold_or_comparisons are not only called by ifcombine pass but
> > > > > also by reassoc pass. Using this method can basically unify param
> > > > > logical-op-non-short-circuit=0 or 1.
> > > > 
> > > > 
> > > > As mentioned before ifcombine pass should be using gimple-match
> > > > instead of fold_build.  Try converting ifcombine over to gimple-match
> > > > infrastructure and add these to match.pd.
> > > 
> > > Yes, I mentioned that in the PR.  The issue is that at the moment
> > > to combine x > y with x <= y you'd have to build GENERIC trees
> > > for both or temporary GIMPLE assign with a SSA def (and then feed
> > > that into the GENERIC or GIMPLE match.pd path).
> > 
> > Hi,
> > 
> > I did some experimentation using âtemporary GIMPLE with a SSA def (and then
> > feed that into the GIMPLE match.pd pathâ.  Could we consider the code in the
> > attachment(I did a test and the code took effect)?
> 
> No, that's excessive overhead IMHO - the whole point of these functions
> originally was to avoid build2 (...) on both conditionals.  If we
> now create two SSA names and two GIMPLE statements that's too much.
> 
> As said it shouldn't be rocket science to teach genmatch to
> auto-generate those functions from match.pd patterns but it certainly
> is some work (less so for somebody with genmatch knowledge).
> I'll give it a poke...

OK, it's not so easy.  I guess we could lower the cost of building
SSA names / gimple stmts significantly if we allocated them on the
stack like via

gimple *stmt1 = XALLOCAVEC (char, gimple_size (GIMPLE_ASSIGN) + 2 * 
sizeof (tree));
memset (stmt1, 0, ...);
gimple_set_code (stmt1, GIMPLE_ASSIGN);
gimple_set_num_ops (stmt1, 3);
gimple_init_singleton (stmt1);

gimple_stmt_iterator gsi = gsi_for_stmt (stmt1);
gimple_assign_set_rhs_with_ops (&gsi, actual operation...);

tree lhs1 = XALLOCA (tree_ssa_name);
memset (lhs1, 0, sizeof (tree_ssa_name));
TREE_SET_CODE (lhs1, SSA_NAME);
TREE_TYPE (lhs1) = ...;

gimple_assing_set_lhs (stmt1, lhs1);

it's all a bit ugly and some factoring in the generic gimple machinery
might easen this kind of hacks, but well...

With the above you could then use

  gimple_match_op op (gimple_match_cond::UNCOND, BIT_AND_EXPR /* or OR */,
		      boolean_type_node, lhs1, lhs2);
  if (gimple_resimplify2 (NULL, &op, follow_all_ssa_edges))
    ... successfully simplified into 'op' ...

with the simplification path then extracting the simplified result
from 'op'.  Note this deliberately leaves out passing a gimple_seq
as storage for more complex simplification results to match
existing behavior.

Your patch has the GC allocation and SSA name (and namespace!)
allocation overhead and most definitely misses releasing the
SSA names you allocated.

Note with the above hack the simplified result has to be checked
for mentions of lhs1 or lhs2 - a case we have to reject because
their definitions are transitional.

Richard.

> Richard.
> 
> > Thanks,
> > Lijia He
> > 
> > > 
> > > maybe_fold_and/or_comparisons handle two exploded binary expressions
> > > while the current match.pd entries handle at most one exploded one
> > > (the outermost then, either AND or OR).  But it would be definitely
> > > doable to auto-generate maybe_fold_and/or_comparisons from match.pd
> > > patterns which is what I'd ultimatively suggest to do (in some more
> > > generalized form maybe).  Either with a separate genmatch invocation
> > > or as part of the --gimple processing (not sure what is more feasible
> > > here).
> > > 
> > > I told Li Jia He that I don't expect him to do this work.
> > > 
> > > Note I didn't review the actual patch yet.
> > > 
> > > Thanks,
> > > Richard.
> > > 
> > 
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix ImendÃ¶rffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG NÃ¼rnberg)
---1609908220-1984626218-1562057469=:2976--