public inbox for gcc-patches@gcc.gnu.org
* [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024]
@ 2023-12-14 22:27 Jakub Jelinek
  2023-12-15  7:45 ` Richard Biener
  2023-12-15 18:30 ` Richard Sandiford
  0 siblings, 2 replies; 3+ messages in thread
From: Jakub Jelinek @ 2023-12-14 22:27 UTC
  To: Richard Biener; +Cc: gcc-patches

Hi!

While looking at a bitint ICE, I've noticed that in the f1 and f5 functions
below we don't optimize the 2 casts into just one at GIMPLE, even though we
optimize it in convert_to_integer if it appears in the same stmt.  The large
match.pd simplification of two conversions in a row has many complex rules
and, as the testcase shows, everything else from the
narrowest -> widest -> prec_in_between all-integer conversions
is already handled, either because the inside_unsignedp == inter_unsignedp
rule kicks in, or the
         && ((inter_unsignedp && inter_prec > inside_prec)
             == (final_unsignedp && final_prec > inter_prec))
one.  But there is no reason why sign extension from the narrowest to the
widest type followed by truncation to something in between can't be
done just as sign extension from the narrowest to the final type.  After all,
if the widest type is signed rather than unsigned, we already handle it that
way regardless of the final type signedness.
And since PR93044 we also handle it if the final precision is not wider
than the inside precision.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-12-14  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113024
	* match.pd (two conversions in a row): Simplify scalar integer
	sign-extension followed by truncation.

	* gcc.dg/tree-ssa/pr113024.c: New test.

--- gcc/match.pd.jj	2023-12-14 11:59:28.000000000 +0100
+++ gcc/match.pd	2023-12-14 18:25:00.457961975 +0100
@@ -4754,11 +4754,14 @@ (define_operator_list SYNC_FETCH_AND_AND
     /* If we have a sign-extension of a zero-extended value, we can
        replace that by a single zero-extension.  Likewise if the
        final conversion does not change precision we can drop the
-       intermediate conversion.  */
+       intermediate conversion.  Similarly truncation of a sign-extension
+       can be replaced by a single sign-extension.  */
     (if (inside_int && inter_int && final_int
 	 && ((inside_prec < inter_prec && inter_prec < final_prec
 	      && inside_unsignedp && !inter_unsignedp)
-	     || final_prec == inter_prec))
+	     || final_prec == inter_prec
+	     || (inside_prec < inter_prec && inter_prec > final_prec
+		 && !inside_unsignedp && inter_unsignedp)))
      (ocvt @0))
 
     /* Two conversions in a row are not needed unless:
--- gcc/testsuite/gcc.dg/tree-ssa/pr113024.c.jj	2023-12-14 18:35:30.652225327 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr113024.c	2023-12-14 18:37:42.056403418 +0100
@@ -0,0 +1,22 @@
+/* PR tree-optimization/113024 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop1" } */
+/* Make sure we have just a single cast per function rather than 2 casts in some cases.  */
+/* { dg-final { scan-tree-dump-times " = \\\(\[a-z \]*\\\) \[xy_\]" 16 "forwprop1" { target { ilp32 || lp64 } } } } */
+
+unsigned int f1 (signed char x) { unsigned long long y = x; return y; }
+unsigned int f2 (unsigned char x) { unsigned long long y = x; return y; }
+unsigned int f3 (signed char x) { long long y = x; return y; }
+unsigned int f4 (unsigned char x) { long long y = x; return y; }
+int f5 (signed char x) { unsigned long long y = x; return y; }
+int f6 (unsigned char x) { unsigned long long y = x; return y; }
+int f7 (signed char x) { long long y = x; return y; }
+int f8 (unsigned char x) { long long y = x; return y; }
+unsigned int f9 (signed char x) { return (unsigned long long) x; }
+unsigned int f10 (unsigned char x) { return (unsigned long long) x; }
+unsigned int f11 (signed char x) { return (long long) x; }
+unsigned int f12 (unsigned char x) { return (long long) x; }
+int f13 (signed char x) { return (unsigned long long) x; }
+int f14 (unsigned char x) { return (unsigned long long) x; }
+int f15 (signed char x) { return (long long) x; }
+int f16 (unsigned char x) { return (long long) x; }

	Jakub



* Re: [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024]
  2023-12-14 22:27 [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024] Jakub Jelinek
@ 2023-12-15  7:45 ` Richard Biener
  2023-12-15 18:30 ` Richard Sandiford
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Biener @ 2023-12-15  7:45 UTC
  To: Jakub Jelinek; +Cc: gcc-patches

On Thu, 14 Dec 2023, Jakub Jelinek wrote:

> Hi!
> 
> While looking at a bitint ICE, I've noticed that in the f1 and f5 functions
> below we don't optimize the 2 casts into just one at GIMPLE, even though we
> optimize it in convert_to_integer if it appears in the same stmt.  The large
> match.pd simplification of two conversions in a row has many complex rules
> and, as the testcase shows, everything else from the
> narrowest -> widest -> prec_in_between all-integer conversions
> is already handled, either because the inside_unsignedp == inter_unsignedp
> rule kicks in, or the
>          && ((inter_unsignedp && inter_prec > inside_prec)
>              == (final_unsignedp && final_prec > inter_prec))
> one.  But there is no reason why sign extension from the narrowest to the
> widest type followed by truncation to something in between can't be
> done just as sign extension from the narrowest to the final type.  After all,
> if the widest type is signed rather than unsigned, we already handle it that
> way regardless of the final type signedness.
> And since PR93044 we also handle it if the final precision is not wider
> than the inside precision.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


* Re: [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024]
  2023-12-14 22:27 [PATCH] match.pd: Optimize sign-extension followed by truncation [PR113024] Jakub Jelinek
  2023-12-15  7:45 ` Richard Biener
@ 2023-12-15 18:30 ` Richard Sandiford
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Sandiford @ 2023-12-15 18:30 UTC
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:
> Hi!
>
> While looking at a bitint ICE, I've noticed that in the f1 and f5 functions
> below we don't optimize the 2 casts into just one at GIMPLE, even though we
> optimize it in convert_to_integer if it appears in the same stmt.  The large
> match.pd simplification of two conversions in a row has many complex rules
> and, as the testcase shows, everything else from the
> narrowest -> widest -> prec_in_between all-integer conversions
> is already handled, either because the inside_unsignedp == inter_unsignedp
> rule kicks in, or the
>          && ((inter_unsignedp && inter_prec > inside_prec)
>              == (final_unsignedp && final_prec > inter_prec))
> one.  But there is no reason why sign extension from the narrowest to the
> widest type followed by truncation to something in between can't be
> done just as sign extension from the narrowest to the final type.  After all,
> if the widest type is signed rather than unsigned, we already handle it that
> way regardless of the final type signedness.
> And since PR93044 we also handle it if the final precision is not wider
> than the inside precision.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2023-12-14  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR tree-optimization/113024
> 	* match.pd (two conversions in a row): Simplify scalar integer
> 	sign-extension followed by truncation.
>
> 	* gcc.dg/tree-ssa/pr113024.c: New test.
>
> --- gcc/match.pd.jj	2023-12-14 11:59:28.000000000 +0100
> +++ gcc/match.pd	2023-12-14 18:25:00.457961975 +0100
> @@ -4754,11 +4754,14 @@ (define_operator_list SYNC_FETCH_AND_AND
>      /* If we have a sign-extension of a zero-extended value, we can
>         replace that by a single zero-extension.  Likewise if the
>         final conversion does not change precision we can drop the
> -       intermediate conversion.  */
> +       intermediate conversion.  Similarly truncation of a sign-extension
> +       can be replaced by a single sign-extension.  */
>      (if (inside_int && inter_int && final_int
>  	 && ((inside_prec < inter_prec && inter_prec < final_prec
>  	      && inside_unsignedp && !inter_unsignedp)
> -	     || final_prec == inter_prec))
> +	     || final_prec == inter_prec
> +	     || (inside_prec < inter_prec && inter_prec > final_prec
> +		 && !inside_unsignedp && inter_unsignedp)))

Just curious: is the inter_unsignedp part needed for correctness?
If it's bigger than both the initial and final type then I wouldn't
have expected its signedness to matter.

Thanks,
Richard

>       (ocvt @0))
