[PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).

public inbox for gcc-patches@gcc.gnu.org

* [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).
@ 2023-10-10 12:28 Roger Sayle
  2023-10-10 14:20 ` Jeff Law
  2023-10-10 14:41 ` Michael Matz
  0 siblings, 2 replies; 4+ messages in thread
From: Roger Sayle @ 2023-10-10 12:28 UTC
  To: gcc-patches; +Cc: 'Jeff Law'


This patch is the middle-end piece of an improvement to PRs 101955 and
106245, that adds a missing simplification to the RTL optimizers.
This transformation is to simplify (char)(x << 7) != 0 as x & 1.
More precisely, the cast can be any truncation, provided the shift is by
one less than the narrower type's precision, so that the only bit to
survive the truncation (its most significant bit) comes from the least
significant bit of x.
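
As a quick sanity check of the identity above, here is a minimal C sketch
(not part of the patch) that compares the two forms for the 8-bit case.
It uses unsigned arithmetic so the shift is well defined, and the helper
names trunc_shift_ne, low_bit and count_mismatches are made up purely for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side: truncate (x << 7) to 8 bits and compare with zero.
   Unsigned types keep the shift well defined in ISO C.  */
static int trunc_shift_ne (uint32_t x)
{
  return (uint8_t) (x << 7) != 0;
}

/* Right-hand side: the replacement form, just the low bit of x.  */
static int low_bit (uint32_t x)
{
  return x & 1;
}

/* Count the inputs in [0, limit) for which the two forms disagree.  */
static unsigned count_mismatches (uint32_t limit)
{
  unsigned bad = 0;
  for (uint32_t x = 0; x < limit; x++)
    if (trunc_shift_ne (x) != low_bit (x))
      bad++;
  return bad;
}

/* Illustrative self-check over the first 65536 inputs.  */
static void check_low_bit_identity (void)
{
  assert (count_mismatches (UINT32_C (1) << 16) == 0);
}
```

Only bit 0 of x can land in bit 7 of the truncated value, which is why
the comparison collapses to a single AND.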

This transformation applies to any target, but it's easy to see
(and add a new test case) on x86, where the following function:

int f(int a) { return (a << 31) >> 31; }

currently gets compiled with -O2 to:

f:      movl    %edi, %eax
        sall    $7, %eax
        sarb    $7, %al
        movsbl  %al, %eax
        ret

but with this patch, we now generate the slightly simpler:

f:      movl    %edi, %eax
        sall    $31, %eax
        sarl    $31, %eax
        ret
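
For reference, the function f above replicates bit 0 of its argument
across the whole word, so it returns either 0 or -1.  The sketch below
(again not part of the patch) spells that out.  It assumes two's
complement conversion to int32_t and an arithmetic right shift of
negative values, as GCC implements them; f_shift, f_mask and
check_f_forms are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* What f computes, written so that the left shift is done on an
   unsigned value.  Assumes modular conversion to int32_t and an
   arithmetic right shift of negative values (GCC behaviour).  */
static int32_t f_shift (int32_t a)
{
  return (int32_t) ((uint32_t) a << 31) >> 31;
}

/* Equivalent form: 0 if bit 0 is clear, -1 (all ones) if it is set.  */
static int32_t f_mask (int32_t a)
{
  return -(a & 1);
}

/* Illustrative self-check over a small range plus the extremes.  */
static void check_f_forms (void)
{
  for (int32_t a = -1000; a <= 1000; a++)
    assert (f_shift (a) == f_mask (a));
  assert (f_shift (INT32_MIN) == f_mask (INT32_MIN));
  assert (f_shift (INT32_MAX) == f_mask (INT32_MAX));
}
```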


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2023-10-10  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR middle-end/101955
        PR tree-optimization/106245
        * simplify-rtx.cc (simplify_relational_operation_1): Simplify
        the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).

gcc/testsuite/ChangeLog
        * gcc.target/i386/pr106245-1.c: New test case.


Thanks in advance,
Roger
--


[-- Attachment #2: patchsr.txt --]
[-- Type: text/plain, Size: 1358 bytes --]

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index bd9443d..69d8757 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -6109,6 +6109,23 @@ simplify_context::simplify_relational_operation_1 (rtx_code code,
 	break;
       }
 
+  /* (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) -> (and:SI x 1).  */
+  if (code == NE
+      && op1 == const0_rtx
+      && (op0code == TRUNCATE
+	  || (partial_subreg_p (op0)
+	      && subreg_lowpart_p (op0)))
+      && SCALAR_INT_MODE_P (mode)
+      && STORE_FLAG_VALUE == 1)
+    {
+      rtx tmp = XEXP (op0, 0);
+      if (GET_CODE (tmp) == ASHIFT
+	  && GET_MODE (tmp) == mode
+	  && CONST_INT_P (XEXP (tmp, 1))
+	  && is_int_mode (GET_MODE (op0), &int_mode)
+	  && INTVAL (XEXP (tmp, 1)) == GET_MODE_PRECISION (int_mode) - 1)
+	return simplify_gen_binary (AND, mode, XEXP (tmp, 0), const1_rtx);
+    }
   return NULL_RTX;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr106245-1.c b/gcc/testsuite/gcc.target/i386/pr106245-1.c
new file mode 100644
index 0000000..a0403e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106245-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int f(int a)
+{
+    return (a << 31) >> 31;
+}
+
+/* { dg-final { scan-assembler-not "sarb" } } */
+/* { dg-final { scan-assembler-not "movsbl" } } */


* Re: [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).
  2023-10-10 12:28 [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1) Roger Sayle
@ 2023-10-10 14:20 ` Jeff Law
  2023-10-10 14:41 ` Michael Matz
  1 sibling, 0 replies; 4+ messages in thread
From: Jeff Law @ 2023-10-10 14:20 UTC
  To: Roger Sayle, gcc-patches



On 10/10/23 06:28, Roger Sayle wrote:
> 
> This patch is the middle-end piece of an improvement to PRs 101955 and
> 106245, that adds a missing simplification to the RTL optimizers.
> This transformation is to simplify (char)(x << 7) != 0 as x & 1.
> More precisely, the cast can be any truncation, provided the shift is by
> one less than the narrower type's precision, so that the only bit to
> survive the truncation (its most significant bit) comes from the least
> significant bit of x.
> 
> This transformation applies to any target, but it's easy to see
> (and add a new test case) on x86, where the following function:
> 
> int f(int a) { return (a << 31) >> 31; }
> 
> currently gets compiled with -O2 to:
> 
> f:      movl    %edi, %eax
>          sall    $7, %eax
>          sarb    $7, %al
>          movsbl  %al, %eax
>          ret
> 
> but with this patch, we now generate the slightly simpler:
> 
> f:      movl    %edi, %eax
>          sall    $31, %eax
>          sarl    $31, %eax
>          ret
> 
> 
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
> 
> 
> 2023-10-10  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
>          PR middle-end/101955
>          PR tree-optimization/106245
>          * simplify-rtx.cc (simplify_relational_operation_1): Simplify
>          the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).
> 
> gcc/testsuite/ChangeLog
>          * gcc.target/i386/pr106245-1.c: New test case.
OK.  Thanks!  I must admit, I'm a bit surprised this wasn't already handled.

jeff


* Re: [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).
  2023-10-10 12:28 [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1) Roger Sayle
  2023-10-10 14:20 ` Jeff Law
@ 2023-10-10 14:41 ` Michael Matz
  2023-10-10 14:50   ` Jeff Law
  1 sibling, 1 reply; 4+ messages in thread
From: Michael Matz @ 2023-10-10 14:41 UTC
  To: Roger Sayle; +Cc: gcc-patches, 'Jeff Law'


On Tue, 10 Oct 2023, Roger Sayle wrote:

> 
> This patch is the middle-end piece of an improvement to PRs 101955 and
> 106245, that adds a missing simplification to the RTL optimizers.
> This transformation is to simplify (char)(x << 7) != 0 as x & 1.

Random observation:

So, why restrict to shifts of LEN-1 and mask 1?  It's always the case that
(type-of-LEN)(x << S) != 0  ===  (x & ((1 << (LEN - S)) - 1)) != 0.

E.g. (char)(x << 5) != 0  ===  (x & 7) != 0.

(Eventually the mask will be a constant that's too costly to compute if S 
is target-dependently too small, but all else being equal avoiding shifts 
seems sensible)
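
To make the general identity concrete, here is a small C sketch that
checks it exhaustively for LEN of 8 and 16 and every S below LEN.  It is
only an illustration under the assumption 0 <= S < LEN < 32; truncation
to LEN bits is modelled with a mask, and the function names are invented
for this example.

```c
#include <assert.h>
#include <stdint.h>

/* (type-of-LEN)(x << S) != 0, with truncation modelled by masking
   to the low LEN bits.  Requires 0 <= s < len < 32.  */
static int shifted_trunc_nonzero (uint32_t x, unsigned len, unsigned s)
{
  uint32_t trunc_mask = (UINT32_C (1) << len) - 1;
  return ((x << s) & trunc_mask) != 0;
}

/* (x & ((1 << (LEN - S)) - 1)) != 0, the shift-free form.  */
static int masked_nonzero (uint32_t x, unsigned len, unsigned s)
{
  uint32_t mask = (UINT32_C (1) << (len - s)) - 1;
  return (x & mask) != 0;
}

/* Compare both forms for LEN in {8, 16}, all valid S, and all x
   below 2^17 (so that bits above LEN are exercised too).  */
static void check_generalisation (void)
{
  static const unsigned lens[] = { 8, 16 };
  for (unsigned i = 0; i < 2; i++)
    for (unsigned s = 0; s < lens[i]; s++)
      for (uint32_t x = 0; x < (UINT32_C (1) << 17); x++)
        assert (shifted_trunc_nonzero (x, lens[i], s)
                == masked_nonzero (x, lens[i], s));
}
```

With LEN = 8 and S = 5 the mask is 7, matching the (x & 7) example; with
S = LEN - 1 it degenerates to the mask 1 handled by the patch.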


Ciao,
Michael.


* Re: [PATCH] Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1).
  2023-10-10 14:41 ` Michael Matz
@ 2023-10-10 14:50   ` Jeff Law
  0 siblings, 0 replies; 4+ messages in thread
From: Jeff Law @ 2023-10-10 14:50 UTC
  To: Michael Matz, Roger Sayle; +Cc: gcc-patches



On 10/10/23 08:41, Michael Matz wrote:
> 
> On Tue, 10 Oct 2023, Roger Sayle wrote:
> 
>>
>> This patch is the middle-end piece of an improvement to PRs 101955 and
>> 106245, that adds a missing simplification to the RTL optimizers.
>> This transformation is to simplify (char)(x << 7) != 0 as x & 1.
> 
> Random observation:
> 
> So, why restrict to shifts of LEN-1 and mask 1?  It's always the case that
> (type-of-LEN)(x << S) != 0  ===  (x & ((1 << (LEN - S)) - 1)) != 0.
> 
> E.g. (char)(x << 5) != 0  ===  (x & 7) != 0.
Yea, it probably could be extended as a followup.

> 
> (Eventually the mask will be a constant that's too costly to compute if S
> is target-dependently too small, but all else being equal avoiding shifts
> seems sensible)
Agreed, though it's nowhere near as important as it was 20+ years ago ;-)

jeff

