From: Zhao Wei Liew
Subject: Re: [PATCH] tree-optimization: [PR103855] Fold (type)X / (type)Y
Date: Sat, 19 Feb 2022 18:05:24 +0800
To: GCC Patches
Message-Id: <5D56A123-525E-4180-A7C6-0862E6D5C76B@gmail.com>

> On 19 Feb 2022, at 5:36 PM, Zhao Wei Liew wrote:
>
> This pattern converts (trunc_div (convert a) (convert b)) to
> (convert (trunc_div a b)) when:
>
> 1. type, a, and b all have unsigned integral types
> 2. a and b have the same type precision
> 3. type has type precision at least as large as that of a and b
>
> This is useful as wider divisions are typically more expensive.
>
> To illustrate the effects, consider the following code snippet:
>
> unsigned long long f(unsigned int a, unsigned int b) {
>     unsigned long long all = a;
>     return all / b;
> }
>
> Without the pattern, g++ -std=c++20 -O2 generates the following
> assembly:
>
> f(unsigned int, unsigned int):
>         mov     eax, edi
>         mov     esi, esi
>         xor     edx, edx
>         div     rsi
>         ret
>
> With the pattern, it generates this:
>
> f(unsigned int, unsigned int):
>         mov     eax, edi
>         xor     edx, edx
>         div     esi
>         ret
>
> This is identical to what clang++ -std=c++20 -O2 generates.
>
> Signed-off-by: Zhao Wei Liew
>
>         PR tree-optimization/103855
>
> gcc/ChangeLog:
>
>         * match.pd: Add pattern for (type)X / (type)Y.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/divide-8.c: New test.
>         * gcc.dg/tree-ssa/divide-9.c: New test.
> ---
>  gcc/match.pd                             | 15 +++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/divide-8.c |  9 +++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/divide-9.c | 10 ++++++++++
>  3 files changed, 34 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 10f62284862..393b43756dd 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -684,6 +684,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>    (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>     (convert (trunc_mod @0 @1))))
>
> +/* (type)X / (type)Y -> (type)(X / Y)
> +   when the resulting type is at least as precise as the original types
> +   and when all the types are unsigned integral types.  */
> +(simplify
> + (trunc_div (convert @0) (convert @1))
> + (if (INTEGRAL_TYPE_P (type)
> +      && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +      && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> +      && TYPE_UNSIGNED (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && TYPE_UNSIGNED (TREE_TYPE (@1))
> +      && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1))
> +      && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0)))
> +  (convert (trunc_div @0 @1))))
> +
>  /* x * (1 + y / x) - y -> x - y % x */
>  (simplify
>   (minus (mult:cs @0 (plus:s (trunc_div:s @1 @0) integer_onep)) @1)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> new file mode 100644
> index 00000000000..489604c4eb6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> @@ -0,0 +1,9 @@
> +/* PR tree-optimization/103855 */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +unsigned int f(unsigned int a, unsigned int b) {
> +    unsigned long long all = a;
> +    return all / b;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "\(unsigned long long int)" "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
> new file mode 100644
> index 00000000000..3e75a49b509
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
> @@ -0,0 +1,10 @@
> +/* PR tree-optimization/103855 */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +unsigned long long f(unsigned int a, unsigned int b) {
> +    unsigned long long all = a;
> +    return all / b;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\\\(unsigned long long int\\\)" 1 "optimized" } } */
> +
> --
> 2.35.1
>

Sorry, I noticed issues with the test cases when running a regression
test. I’ll complete regression testing before uploading a v2.
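
For anyone following the thread, here is a rough C sketch of the identity the quoted pattern relies on, and of why condition 1 limits it to unsigned types. It is only an illustration under stated assumptions: the helper names (div_wide, div_narrow, check_unsigned_samples, signed_wide_min_by_minus_one) are hypothetical and are not part of the patch or of GCC, and the sample check is a spot check on boundary values, not a proof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wide form: both operands are zero-extended to 64 bits, then divided.
   Corresponds to (trunc_div (convert a) (convert b)).  */
static uint64_t div_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a / (uint64_t)b;
}

/* Narrow form: divide in 32 bits, then zero-extend the quotient.
   Corresponds to (convert (trunc_div a b)).  */
static uint64_t div_narrow(uint32_t a, uint32_t b)
{
    return (uint64_t)(a / b);
}

/* Spot check (hypothetical helper): the two forms agree on a handful of
   boundary values.  Every sample is nonzero, so no division by zero.  */
static void check_unsigned_samples(void)
{
    static const uint32_t samples[] = {
        1u, 2u, 3u, 7u, 0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
    };
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            assert(div_wide(samples[i], samples[j])
                   == div_narrow(samples[i], samples[j]));
}

/* Signed counterexample: dividing INT32_MIN by -1 in 64 bits is well
   defined, but the quotient does not fit in int32_t, so the same
   division could not be carried out in the narrow signed type.  */
static int64_t signed_wide_min_by_minus_one(void)
{
    return (int64_t)INT32_MIN / (int64_t)-1;
}
```

Calling check_unsigned_samples() should pass silently. signed_wide_min_by_minus_one() returns a value larger than INT32_MAX, which is the kind of case that narrowing a signed division would get wrong.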