public inbox for gcc-patches@gcc.gnu.org
From: Zhao Wei Liew <zhaoweiliew@gmail.com>
To: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] tree-optimization: [PR103855] Fold (type)X / (type)Y
Date: Sat, 19 Feb 2022 18:05:24 +0800
Message-ID: <5D56A123-525E-4180-A7C6-0862E6D5C76B@gmail.com>
In-Reply-To: <A55A18E0-1A02-4FB6-A25A-CCEB7C8E57E9@gmail.com>


> On 19 Feb 2022, at 5:36 PM, Zhao Wei Liew <zhaoweiliew@gmail.com> wrote:
> 
> This pattern converts (trunc_div (convert a) (convert b)) to
> (convert (trunc_div a b)) when:
> 
> 1. type, a, and b all have unsigned integral types
> 2. a and b have the same type precision
> 3. type has type precision at least as large as that of a and b
> 
> This is useful as wider divisions are typically more expensive.
> 
> To illustrate the effects, consider the following code snippet:
> 
> unsigned long long f(unsigned int a, unsigned int b) {
> 	unsigned long long all = a;
> 	return all / b;
> }
> 
> Without the pattern, g++ -std=c++20 -O2 generates the following
> assembly:
> 
> f(unsigned int, unsigned int):
> 	mov eax, edi
> 	mov esi, esi
> 	xor edx, edx
> 	div rsi
> 	ret
> 
> With the pattern, it generates this:
> 
> f(unsigned int, unsigned int):
> 	mov eax, edi
> 	xor edx, edx
> 	div esi
> 	ret
> 
> This is identical to what clang++ -std=c++20 -O2 generates.
> 
> Signed-off-by: Zhao Wei Liew <zhaoweiliew@gmail.com>
> 
> 	PR tree-optimization/103855
> 
> gcc/ChangeLog:
> 
> 	* match.pd: Add pattern for (type)X / (type)Y.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/tree-ssa/divide-8.c: New test.
> 	* gcc.dg/tree-ssa/divide-9.c: New test.
> ---
> gcc/match.pd                             | 15 +++++++++++++++
> gcc/testsuite/gcc.dg/tree-ssa/divide-8.c |  9 +++++++++
> gcc/testsuite/gcc.dg/tree-ssa/divide-9.c | 10 ++++++++++
> 3 files changed, 34 insertions(+)
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 10f62284862..393b43756dd 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -684,6 +684,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (if (INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>   (convert (trunc_mod @0 @1))))
> 
> +/* (type)X / (type)Y -> (type)(X / Y)
> +   when the resulting type is at least as precise as the original types
> +   and when all the types are unsigned integral types. */
> +(simplify
> + (trunc_div (convert @0) (convert @1))
> + (if (INTEGRAL_TYPE_P (type)
> +      && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +      && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> +      && TYPE_UNSIGNED (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && TYPE_UNSIGNED (TREE_TYPE (@1))
> +      && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1))
> +      && TYPE_PRECISION (type) >= TYPE_PRECISION (TREE_TYPE (@0)))
> +  (convert (trunc_div @0 @1))))
> +
> /* x * (1 + y / x) - y -> x - y % x */
> (simplify
>  (minus (mult:cs @0 (plus:s (trunc_div:s @1 @0) integer_onep)) @1)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> new file mode 100644
> index 00000000000..489604c4eb6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> @@ -0,0 +1,9 @@
> +/* PR tree-optimization/103855 */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +unsigned int f(unsigned int a, unsigned int b) {
> +    unsigned long long all = a;
> +    return all / b;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "\(unsigned long long int)" "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
> new file mode 100644
> index 00000000000..3e75a49b509
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-9.c
> @@ -0,0 +1,10 @@
> +/* PR tree-optimization/103855 */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +
> +unsigned long long f(unsigned int a, unsigned int b) {
> +    unsigned long long all = a;
> +    return all / b;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\\\(unsigned long long int\\\)" 1 "optimized" } } */
> +
> -- 
> 2.35.1
> 

Sorry, I noticed issues with the test cases when running a regression test.
I’ll complete regression testing before uploading a v2.
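
For readers following the three conditions in the quoted patch, here is a minimal C sketch (not part of the patch) of why the fold is safe for unsigned operands and why it would not be for signed ones. The helper names div_wide, div_narrow, sdiv_wide and check_unsigned_samples are made up for illustration, and the sample values are arbitrary. For zero-extended 32-bit operands the 64-bit quotient can never exceed the dividend, so it always fits in 32 bits. In the signed case, INT32_MIN / -1 is representable in 64 bits but overflows a 32-bit division.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wide form: both operands are zero-extended to 64 bits, then divided. */
static uint64_t div_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a / (uint64_t)b;
}

/* Narrow form: divide in 32 bits, then zero-extend the quotient. */
static uint64_t div_narrow(uint32_t a, uint32_t b)
{
    return (uint64_t)(a / b);
}

/* Signed analogue of the wide form: sign-extend, then divide in 64 bits. */
static int64_t sdiv_wide(int32_t a, int32_t b)
{
    return (int64_t)a / (int64_t)b;
}

/* Compare both unsigned forms over a few hand-picked edge values.
   Returns the number of (a, b) pairs checked; zero divisors are skipped. */
static int check_unsigned_samples(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 2u, 7u, 0x7fffffffu, 0x80000000u, 0xfffffffeu, UINT32_MAX
    };
    const size_t n = sizeof samples / sizeof samples[0];
    int checked = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (samples[j] == 0u)
                continue;
            assert(div_wide(samples[i], samples[j])
                   == div_narrow(samples[i], samples[j]));
            checked++;
        }
    }
    return checked;
}
```

Calling check_unsigned_samples() from a test driver exercises the unsigned equivalence. sdiv_wide(INT32_MIN, -1) yields 2147483648, which does not fit in int32_t, and that is presumably why the pattern is restricted to unsigned types.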



Thread overview: 5+ messages
2022-02-19  9:36 Zhao Wei Liew
2022-02-19 10:05 ` Zhao Wei Liew [this message]
2022-02-22  3:57 Zhao Wei Liew
2022-02-22  4:00 ` Zhao Wei Liew
2022-02-22  7:53   ` Richard Biener
