public inbox for gcc-patches@gcc.gnu.org
* [ARC PATCH] Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.
@ 2023-10-28 16:47 Roger Sayle
  2023-10-30  3:05 ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Roger Sayle @ 2023-10-28 16:47 UTC
  To: gcc-patches; +Cc: 'Claudiu Zissulescu'

[-- Attachment #1: Type: text/plain, Size: 1089 bytes --]


This patch optimizes PR middle-end/101955 for the ARC backend.  On ARC
CPUs with a barrel shifter, using two shifts is (probably) optimal as:

        asl_s   r0,r0,31
        asr_s   r0,r0,31

but without a barrel shifter, GCC -O2 -mcpu=em currently generates:

        and     r2,r0,1
        ror     r2,r2
        add.f   0,r2,r2
        sbc     r0,r0,r0

with this patch, we now generate the smaller, faster and non-flags
clobbering:

        bmsk_s  r0,r0,0
        neg_s   r0,r0
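
For readers less familiar with the idiom, here is a minimal C sketch of why the three sequences agree. The function names (via_shifts, via_and_neg, via_carry, check_equivalence) are hypothetical and not part of the patch, and via_carry is only a rough model of the old sequence, assuming that add.f sets the carry flag from the carry-out of the addition and that sbc computes b - c - C. The shift version assumes GCC's implementation-defined behaviour for converting to a signed type and for right-shifting negative values.

```c
#include <assert.h>
#include <stdint.h>

/* The source-level idiom from the test case: (a << 31) >> 31.
   The left shift is done on an unsigned value to avoid signed overflow;
   the conversion back to int32_t and the right shift of a negative value
   are implementation-defined, and this sketch assumes GCC's behaviour
   (modulo wraparound, arithmetic right shift).  */
static int32_t via_shifts(int32_t a)
{
    uint32_t u = (uint32_t)a << 31;
    return (int32_t)u >> 31;
}

/* What the new split produces: AND with 1 (bmsk_s r0,r0,0), then NEG.  */
static int32_t via_and_neg(int32_t a)
{
    return -(a & 1);
}

/* Rough model of the old four-instruction sequence (an assumption about
   the instruction semantics, not taken from the patch).  */
static int32_t via_carry(int32_t a)
{
    uint32_t r2 = (uint32_t)a & 1u;     /* and   r2,r0,1                  */
    r2 = (r2 >> 1) | (r2 << 31);        /* ror   r2,r2   (rotate by one)  */
    uint32_t carry = r2 >> 31;          /* add.f 0,r2,r2 (C = carry-out)  */
    return -(int32_t)carry;             /* sbc   r0,r0,r0 (r0 - r0 - C)   */
}

/* Compare all three on a handful of edge cases.  */
static void check_equivalence(void)
{
    static const int32_t samples[] =
        { 0, 1, 2, 3, -1, -2, INT32_MAX, INT32_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        assert(via_shifts(samples[i]) == via_and_neg(samples[i]));
        assert(via_carry(samples[i]) == via_and_neg(samples[i]));
    }
}
```

Calling check_equivalence() from a test driver exercises the three variants; each returns 0 for even inputs and -1 for odd ones.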

Tested with a cross-compiler to arc-linux hosted on x86_64,
with no new (compile-only) regressions from make -k check.
Ok for mainline if this passes Claudiu's nightly testing?


2023-10-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR middle-end/101955
        * config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
        to convert sign extract of the least significant bit into an
        AND $1 then a NEG when !TARGET_BARREL_SHIFTER.

gcc/testsuite/ChangeLog
        PR middle-end/101955
        * gcc.target/arc/pr101955.c: New test case.


Thanks again,
Roger
--


[-- Attachment #2: patchar3.txt --]
[-- Type: text/plain, Size: 1388 bytes --]

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index ee43887..6471344 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -5873,6 +5873,20 @@ archs4x, archs4xd"
 		   (zero_extract:SI (match_dup 1) (match_dup 5) (match_dup 7)))])
    (match_dup 1)])
 
+;; Split sign-extension of single least significant bit as and x,$1;neg x
+(define_insn_and_split "*extvsi_1_0"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(sign_extract:SI (match_operand:SI 1 "register_operand" "0")
+			 (const_int 1)
+			 (const_int 0)))]
+  "!TARGET_BARREL_SHIFTER"
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (and:SI (match_dup 1) (const_int 1)))
+   (set (match_dup 0) (neg:SI (match_dup 0)))]
+  ""
+  [(set_attr "length" "8")])
+
 (define_insn_and_split "rotlsi3_cnt1"
   [(set (match_operand:SI 0 "dest_reg_operand"            "=r")
 	(rotate:SI (match_operand:SI 1 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/arc/pr101955.c b/gcc/testsuite/gcc.target/arc/pr101955.c
new file mode 100644
index 0000000..74bca3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/pr101955.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=em" } */
+
+int f(int a)
+{
+    return (a << 31) >> 31;
+}
+
+/* { dg-final { scan-assembler "msk_s\\s+r0,r0,0" } } */
+/* { dg-final { scan-assembler "neg_s\\s+r0,r0" } } */


* Re: [ARC PATCH] Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.
  2023-10-28 16:47 [ARC PATCH] Convert (signed<<31)>>31 to -(signed&1) without barrel shifter Roger Sayle
@ 2023-10-30  3:05 ` Jeff Law
  2023-10-30 13:50   ` Claudiu Zissulescu Ianculescu
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2023-10-30  3:05 UTC
  To: Roger Sayle, gcc-patches; +Cc: 'Claudiu Zissulescu'



On 10/28/23 10:47, Roger Sayle wrote:
> 
> This patch optimizes PR middle-end/101955 for the ARC backend.  On ARC
> CPUs with a barrel shifter, using two shifts is (probably) optimal as:
> 
>          asl_s   r0,r0,31
>          asr_s   r0,r0,31
> 
> but without a barrel shifter, GCC -O2 -mcpu=em currently generates:
> 
>          and     r2,r0,1
>          ror     r2,r2
>          add.f   0,r2,r2
>          sbc     r0,r0,r0
> 
> with this patch, we now generate the smaller, faster and non-flags
> clobbering:
> 
>          bmsk_s  r0,r0,0
>          neg_s   r0,r0
> 
> Tested with a cross-compiler to arc-linux hosted on x86_64,
> with no new (compile-only) regressions from make -k check.
> Ok for mainline if this passes Claudiu's nightly testing?
> 
> 
> 2023-10-28  Roger Sayle  <roger@nextmovesoftware.com>
> 
> gcc/ChangeLog
>          PR middle-end/101955
>          * config/arc/arc.md (*extvsi_1_0): New define_insn_and_split
>          to convert sign extract of the least significant bit into an
>          AND $1 then a NEG when !TARGET_BARREL_SHIFTER.
> 
> gcc/testsuite/ChangeLog
>          PR middle-end/101955
>          * gcc.target/arc/pr101955.c: New test case.
Good catch.  Looking to do something very similar on the H8 based on 
your work here.

On the H8 we can use bld to load a bit from an 8-bit register into the
C flag.  Then we use subtract with carry to get an 8-bit 0/-1, which we
can then sign-extend to 16 or 32 bits.  That covers bit positions 0..15
of an SImode input.

For bits 16..31 we can move the high half into the low half, then use the
bld sequence.

For bit zero the and+neg is the same number of clocks and size as the
bld-based sequence.  But it'll simulate faster, so it's special cased.
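
As a rough illustration of the steps described above, the following C sketch models them at the value level. It is not H8 code and not taken from any patch; the helper names sign_extract_bit and sign_extract_bit0 are hypothetical, and the "subtract with carry" step is modelled as r - r - C on an 8-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: sign-extract the single bit at position pos (0..31)
   of x, following the described steps rather than real H8 instructions.  */
static int32_t sign_extract_bit(uint32_t x, unsigned pos)
{
    assert(pos < 32);

    /* Bits 16..31: move the high half into the low half first.  */
    uint16_t half = (pos >= 16) ? (uint16_t)(x >> 16) : (uint16_t)x;
    unsigned p = pos & 15;

    /* Select the 8-bit register that holds the bit of interest.  */
    uint8_t byte = (p >= 8) ? (uint8_t)(half >> 8) : (uint8_t)half;

    /* "bld": load the selected bit into the carry flag.  */
    unsigned carry = (byte >> (p & 7)) & 1u;

    /* Subtract with carry of a register from itself: r - r - C,
       giving an 8-bit 0 or -1.  */
    int8_t r8 = (int8_t)-(int)carry;

    /* Sign-extend the 8-bit result to 32 bits.  */
    return (int32_t)r8;
}

/* The bit-zero special case: and + neg.  */
static int32_t sign_extract_bit0(uint32_t x)
{
    return -(int32_t)(x & 1u);
}
```

For any input, sign_extract_bit(x, 0) and sign_extract_bit0(x) should return the same value, which is the equivalence the special case relies on.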


Jeff



* Re: [ARC PATCH] Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.
  2023-10-30  3:05 ` Jeff Law
@ 2023-10-30 13:50   ` Claudiu Zissulescu Ianculescu
  0 siblings, 0 replies; 3+ messages in thread
From: Claudiu Zissulescu Ianculescu @ 2023-10-30 13:50 UTC
  To: Jeff Law; +Cc: Roger Sayle, gcc-patches

Hi Roger,

Did you mean bmsk_s rather than msk_s here:
+/* { dg-final { scan-assembler "msk_s\\s+r0,r0,0" } } */

Anyhow, the patch looks good. Proceed with your commit.

Thank you,
Claudiu
