From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x531.google.com (mail-ed1-x531.google.com [IPv6:2a00:1450:4864:20::531]) by sourceware.org (Postfix) with ESMTPS id 1B03A3851AA7 for ; Wed, 14 Sep 2022 08:25:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1B03A3851AA7 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-ed1-x531.google.com with SMTP id z13so7105764edb.13 for ; Wed, 14 Sep 2022 01:25:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date; bh=is61Z7OQVMD6H7fzpeCm1YGLS2L4un3rk+cgpOV0gaI=; b=ps+AApJ8eHHqJKl3ROHq2IsDsOubJDdR/qRs5p3KNIsrAhmDyAlkCqIjBwC3PxzRze B6D8BEYom49q7M3MGs/nAkacMl2gjQM2cA7wlSRRcVzXUWk/Hx7wxP112RFA6M3PTSqu p9JPQ71Mw6osu7SuHOURLaldjpzFccex63m5kefGzuRlfZ1CnfZcq5nF0m3t8lj15VMw aBP3JEt+/pn2EuoMo/ssbqADoG1qK6RUPbsYmVAmiuvu6gu29sTXYqElcGILwr3aUPjy 6lb4FHyY70eUJjEM3PH3Th0/yObX6ZPqaCSzBdCyFPDL0E3NvqOQLawA4KPqPfT3HeNd wh4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date; bh=is61Z7OQVMD6H7fzpeCm1YGLS2L4un3rk+cgpOV0gaI=; b=R3yDdFcj6/Weq8Llxch+PLzFFjtzZZXJioXx166DLBtXmegnH32rU7rmEU9snY4nZo CqsLm68cheTv2kliR8Zk3+wZgQCosGXqSq5IzN4rUz7Gb/1XMpCEe3MRKjz13zZv4dIi 7dAKVAmXTqawi488YeJ28jWpJHCdhDM3I8FP+gS8dKXB1mn3cd+Xd54q87m3LZnytcHo jDh38PAg0EAIObZtt9aVqez8rZKwIxWeNwAx6m1z2+tOV8e/8ruxgOJwy0ec5lVht/AK +jDDxexbE+MiVAZOclA4j8hcXPg7IaO2LNZZEdgishejqO3hoSZevhpyOY32USah05ej WUmg== X-Gm-Message-State: ACgBeo2R8tYut6jEyo29opwDeV/wj25KYdK+TyiMnI5PCdG0up/NKQ6m 4Wq2NOGTC4reO7lhXsYfaHp7yDQ+bm/CFvh04EDz00mZoCs= X-Google-Smtp-Source: AA6agR4bE1z7gS02nagAgsaF5ahoshAOCU3T2L/qw/7jyuG/95Ko5hIOQcusRYr3Yenbp8pmPofn4JNUaACE/PXlejg= X-Received: by 2002:a05:6402:3786:b0:451:24da:f8c9 with SMTP id et6-20020a056402378600b0045124daf8c9mr19384081edb.250.1663143898672; Wed, 14 Sep 2022 01:24:58 -0700 (PDT) MIME-Version: 1.0 References: <00a501d8aafd$e10b6da0$a32248e0$@nextmovesoftware.com> <000e01d8c799$f1d2fe10$d578fa30$@nextmovesoftware.com> In-Reply-To: <000e01d8c799$f1d2fe10$d578fa30$@nextmovesoftware.com> From: Richard Biener Date: Wed, 14 Sep 2022 10:24:46 +0200 Message-ID: Subject: Re: [PATCH] PR tree-optimization/71343: Value number X<<2 as X*4. To: Roger Sayle Cc: GCC Patches Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, Sep 13, 2022 at 7:55 PM Roger Sayle wrote: > > > This patch is the second part of a fix for PR tree-optimization/71343, > that implements Richard Biener's suggestion of using tree-ssa's value > numbering instead of match.pd. The change is that when assigning a > value number for the expression X< the value number for the multiplication X*(1< handles the fact that we (intentionally) don't canonicalize these as > equivalent in GIMPLE, and the optimization/equivalence in PR 71343 now > happens by (tree-ssa SCCVN) magic. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-32}, > with no new failures. Ok for mainline? Note that "insertion" is quite limited, in particular does not support inserting a MULT_EXPR (see eliminate_dom_walker::eliminate_insert). Your testcases have all expressions in one C statement, if you break things down to enforce the various evaluation orders you seem to test I think you'd see that CSEing tem1 = (a + b) << 2; tem2 = (a + b ) * 4; does not actually work? Amending eliminate_insert by for example changing the BIT_AND_EXPR handling (with constant rhs2) to covert all tcc_binary should make it work. Can you double-check? Otherwise this looks OK. Thanks, Richard. > > 2022-09-13 Roger Sayle > > gcc/ChangeLog > PR tree-optimization/71343 > * tree-ssa-sccvn.cc (visit_nary_op) : Make > the value number of the expression X << C the same as the value > number for the multiplication X * (1< > gcc/testsuite/ChangeLog > PR tree-optimization/71343 > * gcc.dg/pr71343-2.c: New test case. > > > Thanks in advance, > Roger > -- > > > -----Original Message----- > > From: Richard Biener > > Sent: 08 August 2022 12:42 > > To: Roger Sayle > > Cc: GCC Patches > > Subject: Re: [PATCH] PR tree-optimization/71343: Optimize (X< > (X&Y)< > > > On Mon, Aug 8, 2022 at 10:07 AM Roger Sayle > > wrote: > > > > > > This patch resolves PR tree-optimization/71343, a missed-optimization > > > enhancement request where GCC fails to see that (a<<2)+(b<<2) == a*4+b*4. > > > This requires two related (sets of) optimizations to be added to match.pd. > > > > > > The first is that (X< > > for many binary operators, including AND, IOR, XOR, and (if overflow > > > isn't an issue) PLUS and MINUS. Likewise, the right shifts (both > > > logical and arithmetic) and bit-wise logical operators can be > > > simplified in a similar fashion. These all reduce the number of > > > GIMPLE binary operations from 3 to 2, by combining/eliminating a shift > > operation. > > > > > > The second optimization reflects that the middle-end doesn't impose a > > > canonical form on multiplications by powers of two, vs. left shifts, > > > instead leaving these operations as specified by the programmer unless > > > there's a good reason to change them. Hence, GIMPLE code may contain > > > the expressions "X * 8" and "X << 3" even though these represent the > > > same value/computation. The tweak to match.pd is that comparison > > > operations whose operands are equivalent non-canonical expressions can > > > be taught their equivalence. Hence "(X * 8) == (X << 3)" will always > > > evaluate to true, and "(X<<2) > 4*X" will always evaluate to false. > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > > and make -k check, both with and without --target_board=unix{-m32}, > > > with no new failures. Ok for mainline? > > > > +/* Shifts by constants distribute over several binary operations, > > + hence (X << C) + (Y << C) can be simplified to (X + Y) << C. */ > > +(for op (plus minus) > > + (simplify > > + (op (lshift:s @0 INTEGER_CST@1) (lshift:s @2 INTEGER_CST@1)) > > + (if (INTEGRAL_TYPE_P (type) > > + && TYPE_OVERFLOW_WRAPS (type) > > + && !TYPE_SATURATING (type) > > + && tree_fits_shwi_p (@1) > > + && tree_to_shwi (@1) > 0 > > + && tree_to_shwi (@1) < TYPE_PRECISION (type)) > > > > I do wonder why we need to restrict this to shifts by constants? > > Any out-of-bound shift was already there, no? > > > > +/* Some tree expressions are intentionally non-canonical. > > + We handle the comparison of the equivalent forms here. */ (for cmp > > +(eq le ge) > > + (simplify > > + (cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2)) > > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > > + && tree_fits_shwi_p (@1) > > + && tree_to_shwi (@1) > 0 > > + && tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0)) > > + && wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2))) > > + { constant_boolean_node (true, type); }))) > > + > > +(for cmp (ne lt gt) > > + (simplify > > + (cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2)) > > + (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > > + && tree_fits_shwi_p (@1) > > + && tree_to_shwi (@1) > 0 > > + && tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0)) > > + && wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2))) > > + { constant_boolean_node (false, type); }))) > > > > hmm. I wonder if it makes more sense to handle this in value-numbering. > > tree-ssa-sccvn.cc:visit_nary_op handles some cases that are not exactly > > canonicalization issues but the shift vs mult could be handled there by just > > performing the alternate lookup. That would also enable CSE and by means of > > that of course the comparisons you do above. > > > > Thanks, > > Richard. > > > > > > > > 2022-08-08 Roger Sayle > > > > > > gcc/ChangeLog > > > PR tree-optimization/71343 > > > * match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the > > > expression (X< > > (op (rshift @0 @1) (rshift @2 @1)): Likwise, simplify (X>>C)^(Y>>C) > > > to (X^Y)>>C for binary logical operators, AND, IOR and XOR. > > > (cmp:c (lshift @0) (mult @1)): Optimize comparisons between > > > shifts by integer constants and multiplications by powers of 2. > > > > > > gcc/testsuite/ChangeLog > > > PR tree-optimization/71343 > > > * gcc.dg/pr71343-1.c: New test case. > > > * gcc.dg/pr71343-2.c: Likewise. > > > > > > > > > Thanks in advance, > > > Roger > > > -- >