From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from server.nextmovesoftware.com (server.nextmovesoftware.com [162.254.253.69]) by sourceware.org (Postfix) with ESMTPS id A552A385841D for ; Wed, 1 Dec 2021 20:02:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A552A385841D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=nextmovesoftware.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nextmovesoftware.com DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nextmovesoftware.com; s=default; h=Content-Transfer-Encoding:Content-Type: MIME-Version:Message-ID:Date:Subject:In-Reply-To:References:Cc:To:From:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=ZlQthu69IKYYSw0qARK0YL/VGJFpb48sxe2IsvFa3bU=; b=px6/7/A7h8fdTiALMQy9/lfWBO QkMgAAbadkIFtAGlGO4GiFgjr0OCzglq5did/a0CMMeZ2jg21FeOHlAELGDf4RhLCY3+fsnpMTmQl 9lItdJW578ENiiiiNuQt6fjLb+cwZwJ0inHpNP5FEIWAZd7FDEaQomx8HZbKLn3wBYpgfWW1Fx1e2 XiFyVIg/sD/ChhDiUK9nHDNPsglShpNu46oJuQeKJXM5JM0utuPr3BSKeL4o6eVUTeZdXBX0fDVCO HS1VzIwI+WGJH5uq0WWHqpTRK73IIrut7HO2jk0GC9aJSdsPfzUgbJuN1eF6092FEZo4KmVrWplNj 0O3wNf1Q==; Received: from [185.62.158.67] (port=49828 helo=Dell) by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1msVo0-0006ah-0p; Wed, 01 Dec 2021 15:02:16 -0500 From: "Roger Sayle" To: "'Richard Biener'" Cc: "'GCC Patches'" References: <016001d7e500$2628ae80$727a0b80$@nextmovesoftware.com> In-Reply-To: Subject: RE: [PATCH] Final value replacement improvements for until-wrap loops. Date: Wed, 1 Dec 2021 20:02:14 -0000 Message-ID: <003c01d7e6ee$56902090$03b061b0$@nextmovesoftware.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Mailer: Microsoft Outlook 16.0 Thread-Index: AQEdH5XXZESJtXikdVguo/YJUPCLOQIN5QjNrYMsQOA= Content-Language: en-gb X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com X-AntiAbuse: Original Domain - gcc.gnu.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - nextmovesoftware.com X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id: roger@nextmovesoftware.com X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com X-Source: X-Source-Args: X-Source-Dir: X-Spam-Status: No, score=-6.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 01 Dec 2021 20:02:18 -0000 Hi Richard, Many thanks for the review. Here's the final version that I've = committed, including=20 your suggested improvements, following another make bootstrap and make = -k check on x86_64-pc-linux-gnu with no new failures. Thanks again. 2021-12-01 Roger Sayle Richard Biener gcc/ChangeLog * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): Check if simplify_using_initial_conditions allows us to simplify the expression for may_be_zero. * match.pd (X !=3D C ? -X : -C -> -X): New transform. (X !=3D C ? ~X : ~C -> ~X): Likewise. ((X+1) > Y ? -X : 1 -> X >=3D Y ? -X : 1): Likewise. gcc/testsuite/ChangeLog * gcc.dg/fold-condneg-1.c: New test case. * gcc.dg/fold-condneg-2.c: New test case. * gcc.dg/fold-condnot-1.c: New test case. * gcc.dg/pr101145-1.c: New test case. * gcc.dg/pr101145-2.c: New test case. Roger -- > -----Original Message----- > From: Richard Biener > Sent: 30 November 2021 10:00 > To: Roger Sayle > Cc: GCC Patches > Subject: Re: [PATCH] Final value replacement improvements for = until-wrap > loops. >=20 > On Mon, Nov 29, 2021 at 10:07 AM Roger Sayle > wrote: > > > > > > This middle-end patch is inspired by Richard Biener's until-wrap = loop > > example in PR tree-optimization/101145. > > > > unsigned foo(unsigned val, unsigned start) { > > unsigned cnt =3D 0; > > for (unsigned i =3D start; i > val; ++i) > > cnt++; > > return cnt; > > } > > > > For this loop, the tree optimizers currently generate: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > unsigned int _1; > > unsigned int _5; > > > > [local count: 118111600]: > > if (start_3(D) > val_4(D)) > > goto ; [89.00%] > > else > > goto ; [11.00%] > > > > [local count: 105119324]: > > _1 =3D start_3(D) + 1; > > _5 =3D -start_3(D); > > cnt_2 =3D _1 > val_4(D) ? _5 : 1; > > > > [local count: 118111600]: > > # cnt_11 =3D PHI > > return cnt_11; > > } > > > > or perhaps slightly easier to read: > > > > if (start > val) { > > cnt =3D (start+1) > val ? -start : 1; > > } else cnt =3D 0; > > > > In this snippet, if we know start > val, then (start+1) > val unless > > start+1 overflows, i.e. (start+1) =3D=3D 0 and start =3D=3D ~0. > > We can use this (loop header) context to simplify the ternary > > expression to "(start !=3D -1) ? -start : 1", which with a little = help > > from match.pd can be folded to -start. Hence the optimal final = value > > replacement should be: > > > > cnt =3D (start > val) ? -start : 0; > > > > Or as now generated by this patch: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > > > [local count: 118111600]: > > if (start_3(D) > val_4(D)) > > goto ; [89.00%] > > else > > goto ; [11.00%] > > > > [local count: 105119324]: > > cnt_2 =3D -start_3(D); > > > > [local count: 118111600]: > > # cnt_11 =3D PHI > > return cnt_11; > > } > > > > > > We can also improve until-wrap loops that don't have a (suitable) = loop > > header, as determined by simplify_using_initial_conditions. > > > > unsigned bar(unsigned val, unsigned start) { > > unsigned cnt =3D 0; > > unsigned i =3D start; > > do { > > cnt++; > > i++; > > } while (i > val); > > return cnt; > > } > > > > which is currently optimized to: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > unsigned int _9; > > unsigned int _10; > > > > [local count: 118111600]: > > _9 =3D start_4(D) + 1; > > _10 =3D -start_4(D); > > cnt_3 =3D val_7(D) < _9 ? _10 : 1; > > return cnt_3; > > } > > > > Here we have "val < (start+1) ? -start : 1", which again with the = help > > of match.pd can be slightly simplified to "val <=3D start ? -start : = 1" > > when dealing with unsigned types, because at the complicating value > > where start =3D=3D ~0, we fortunately have -start =3D=3D 1, hence it = doesn't > > matter whether the second or third operand of the ternary operator = is > returned. > > > > To summarize, this patch (in addition to tweaking may_be_zero in > > number_of_iterations_until_wrap) adds three new constant folding > > transforms to match.pd. > > > > X !=3D C1 ? -X : C2 simplifies to -X when -C1 =3D=3D C2. > > which is the generalized form of the simplification above. > > > > X !=3D C1 ? ~X : C2 simplifies to ~X when ~C1 =3D=3D C2. > > which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case. > > > > and the "until-wrap final value replacement without context": > > > > (X + 1) > Y ? -X : 1 simplifies to X >=3D Y ? -X : 1 when X is = unsigned, > > as when X + 1 overflows, X is -1, so -X =3D=3D 1. > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make = bootstrap > > and make -k check with no new failures. Ok for mainline? >=20 > +/* X !=3D C1 ? -X : C2 simplifies to -X when -C1 =3D=3D C2. */ = (simplify > +(cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2) (if > +((!wi::only_sign_bit_p (wi::to_wide (@1)) > + || TYPE_UNSIGNED (type) > + || TYPE_OVERFLOW_WRAPS (type) > + || (!TYPE_OVERFLOW_SANITIZED (type) > + && !TYPE_OVERFLOW_TRAPS (type) > + && !TYPE_SATURATING (type))) >=20 > I'm wondering about TYPE_UNSIGNED && TYPE_SATURATING. > Also I think unsigned cannot trap or be sanitized and with > TYPE_OVERFLOW_WRAPS sanitizing or trapping cannot be true either. = Also > unsigned types always wrap on overflow. So maybe >=20 > (if (!TYPE_SATURATING (type) > && (TYPE_OVERFLOW_WRAPS (type) > || !wi::only_sign_bit_p (..))) >=20 > ? >=20 > + && wi::eq_p (wi::neg (wi::to_wide (@1)), wi::to_wide (@2))) > + @3)) >=20 > +/* (X + 1) > Y ? -X : 1 simplifies to X >=3D Y ? -X : 1 when > + X is unsigned, as when X + 1 overflows, X is -1, so -X =3D=3D 1. = */ > +(simplify (cond (gt (plus @0 integer_onep) @1) (negate @0) > +integer_onep@2) (if (TYPE_UNSIGNED (type) > + && TYPE_UNSIGNED (TREE_TYPE (@0))) >=20 > I think the second test is redundant since @0 participates in both the = comparison > and the condition value. >=20 > + (cond (ge @0 @1) (negate @0) @2))) >=20 > Otherwise looks OK to me. >=20 > Thanks, > Richard. >=20 > > > > 2021-11-29 Roger Sayle > > > > gcc/ChangeLog > > * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): > > Check if simplify_using_initial_conditions allows us to > > simplify the expression for may_be_zero. > > * match.pd (X !=3D C ? -X : -C -> -X): New transform. > > (X !=3D C ? ~X : ~C -> ~X): Likewise. > > ((X+1) > Y ? -X : 1 -> X >=3D Y ? -X : 1): Likewise. > > > > gcc/testsuite/ChangeLog > > * gcc.dg/fold-condneg-1.c: New test case. > > * gcc.dg/fold-condneg-2.c: New test case. > > * gcc.dg/fold-condnot-1.c: New test case. > > * gcc.dg/pr101145-1.c: New test case. > > * gcc.dg/pr101145-2.c: New test case. > > > > > > Thanks in advance, > > Roger > > -- > >