public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl
@ 2023-07-13 17:10 Roger Sayle
  2023-07-13 18:21 ` Uros Bizjak
  0 siblings, 1 reply; 4+ messages in thread
From: Roger Sayle @ 2023-07-13 17:10 UTC (permalink / raw)
  To: gcc-patches; +Cc: 'Uros Bizjak'

[-- Attachment #1: Type: text/plain, Size: 1072 bytes --]


This patch resolves PR target/110588 to catch another case in combine
where the i386 backend should be generating a btl instruction.  This adds
another define_insn_and_split to recognize the RTL representation for this
case.

I also noticed that two related define_insn_and_split weren't using the
preferred string style for single statement preparation-statements, so
I've reformatted these to be consistent in style with the new one.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-07-13  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/110588
        * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form
        preparation statement over braces for a single statement.
        (*bt<mode>_setncqi): Likewise.
        (*bt<mode>_setncqi_2): New define_insn_and_split.

gcc/testsuite/ChangeLog
        PR target/110588
        * gcc.target/i386/pr110588.c: New test case.


Thanks again,
Roger
--


[-- Attachment #2: patchbt.txt --]
[-- Type: text/plain, Size: 2587 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e47ced1..04eca049 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -16170,9 +16170,7 @@
          (const_int 0)))
    (set (match_dup 0)
         (eq:QI (reg:CCC FLAGS_REG) (const_int 0)))]
-{
-  operands[2] = lowpart_subreg (SImode, operands[2], QImode);
-})
+  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
 
 ;; Help combine recognize bt followed by setnc
 (define_insn_and_split "*bt<mode>_setncqi"
@@ -16193,9 +16191,7 @@
          (const_int 0)))
    (set (match_dup 0)
         (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
-{
-  operands[2] = lowpart_subreg (SImode, operands[2], QImode);
-})
+  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
 
 (define_insn_and_split "*bt<mode>_setnc<mode>"
   [(set (match_operand:SWI48 0 "register_operand")
@@ -16219,6 +16215,27 @@
   operands[2] = lowpart_subreg (SImode, operands[2], QImode);
   operands[3] = gen_reg_rtx (QImode);
 })
+
+;; Help combine recognize bt followed by setnc (PR target/110588)
+(define_insn_and_split "*bt<mode>_setncqi_2"
+  [(set (match_operand:QI 0 "register_operand")
+	(eq:QI
+	  (zero_extract:SWI48
+ 	    (match_operand:SWI48 1 "register_operand")
+	    (const_int 1)
+	    (zero_extend:SI (match_operand:QI 2 "register_operand")))
+	  (const_int 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_USE_BT && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (reg:CCC FLAGS_REG)
+        (compare:CCC
+         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
+         (const_int 0)))
+   (set (match_dup 0)
+        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
+  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
 \f
 ;; Store-flag instructions.
 
diff --git a/gcc/testsuite/gcc.target/i386/pr110588.c b/gcc/testsuite/gcc.target/i386/pr110588.c
new file mode 100644
index 0000000..4505c87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110588.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=core2" } */
+
+unsigned char foo (unsigned char x, int y)
+{
+  int _1 = (int) x;
+  int _2 = _1 >> y;
+  int _3 = _2 & 1;
+  unsigned char _8 = (unsigned char) _3;
+  unsigned char _6 = _8 ^ 1;
+  return _6;
+}
+
+/* { dg-final { scan-assembler "btl" } } */
+/* { dg-final { scan-assembler "setnc" } } */
+/* { dg-final { scan-assembler-not "sarl" } } */
+/* { dg-final { scan-assembler-not "andl" } } */
+/* { dg-final { scan-assembler-not "xorl" } } */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl
  2023-07-13 17:10 [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl Roger Sayle
@ 2023-07-13 18:21 ` Uros Bizjak
  2023-07-14  9:27   ` Roger Sayle
  0 siblings, 1 reply; 4+ messages in thread
From: Uros Bizjak @ 2023-07-13 18:21 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc-patches

On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch resolves PR target/110588 to catch another case in combine
> where the i386 backend should be generating a btl instruction.  This adds
> another define_insn_and_split to recognize the RTL representation for this
> case.
>
> I also noticed that two related define_insn_and_split weren't using the
> preferred string style for single statement preparation-statements, so
> I've reformatted these to be consistent in style with the new one.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-07-13  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/110588
>         * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form
>         preparation statement over braces for a single statement.
>         (*bt<mode>_setncqi): Likewise.
>         (*bt<mode>_setncqi_2): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog
>         PR target/110588
>         * gcc.target/i386/pr110588.c: New test case.

+;; Help combine recognize bt followed by setnc (PR target/110588)
+(define_insn_and_split "*bt<mode>_setncqi_2"
+  [(set (match_operand:QI 0 "register_operand")
+ (eq:QI
+  (zero_extract:SWI48
+    (match_operand:SWI48 1 "register_operand")
+    (const_int 1)
+    (zero_extend:SI (match_operand:QI 2 "register_operand")))
+  (const_int 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_USE_BT && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (reg:CCC FLAGS_REG)
+        (compare:CCC
+         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
+         (const_int 0)))
+   (set (match_dup 0)
+        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
+  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")

I don't think the above transformation is 100% correct, mainly due to
the use of paradoxical subreg.

The combined instruction is operating with a zero_extended QImode
register, so all bits of the register are well defined. You are
splitting using paradoxical subreg, so you don't know what garbage is
there in the highpart of the count register. However, BTL/BTQ uses
modulo 64 (or 32) of this register, so even with a slightly invalid
RTX, everything checks out.

+  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")

You probably need <SWI48>mode instead of SImode here.

Otherwise OK.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl
  2023-07-13 18:21 ` Uros Bizjak
@ 2023-07-14  9:27   ` Roger Sayle
  2023-07-14 10:03     ` Uros Bizjak
  0 siblings, 1 reply; 4+ messages in thread
From: Roger Sayle @ 2023-07-14  9:27 UTC (permalink / raw)
  To: 'Uros Bizjak'; +Cc: gcc-patches


> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: 13 July 2023 19:21
> 
> On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle <roger@nextmovesoftware.com>
> wrote:
> >
> > This patch resolves PR target/110588 to catch another case in combine
> > where the i386 backend should be generating a btl instruction.  This
> > adds another define_insn_and_split to recognize the RTL representation
> > for this case.
> >
> > I also noticed that two related define_insn_and_split weren't using
> > the preferred string style for single statement
> > preparation-statements, so I've reformatted these to be consistent in style with
> the new one.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> >
> > 2023-07-13  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR target/110588
> >         * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form
> >         preparation statement over braces for a single statement.
> >         (*bt<mode>_setncqi): Likewise.
> >         (*bt<mode>_setncqi_2): New define_insn_and_split.
> >
> > gcc/testsuite/ChangeLog
> >         PR target/110588
> >         * gcc.target/i386/pr110588.c: New test case.
> 
> +;; Help combine recognize bt followed by setnc (PR target/110588)
> +(define_insn_and_split "*bt<mode>_setncqi_2"
> +  [(set (match_operand:QI 0 "register_operand")  (eq:QI
> +  (zero_extract:SWI48
> +    (match_operand:SWI48 1 "register_operand")
> +    (const_int 1)
> +    (zero_extend:SI (match_operand:QI 2 "register_operand")))
> +  (const_int 0)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "TARGET_USE_BT && ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(set (reg:CCC FLAGS_REG)
> +        (compare:CCC
> +         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> +         (const_int 0)))
> +   (set (match_dup 0)
> +        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
> +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> 
> I don't think the above transformation is 100% correct, mainly due to the use of
> paradoxical subreg.
> 
> The combined instruction is operating with a zero_extended QImode register, so
> all bits of the register are well defined. You are splitting using paradoxical subreg,
> so you don't know what garbage is there in the highpart of the count register.
> However, BTL/BTQ uses modulo 64 (or 32) of this register, so even with a slightly
> invalid RTX, everything checks out.
> 
> +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> 
> You probably need <SWI48>mode instead of SImode here.

The define_insn for *bt is:

(define_insn "*bt<mode>"
  [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extract:SWI48
            (match_operand:SWI48 0 "nonimmediate_operand" "r,m")
            (const_int 1)
            (match_operand:SI 1 "nonmemory_operand" "r<S>,<S>"))
          (const_int 0)))]

So <SWI48> isn't appropriate here.

But now you've made me think about it, it's inconsistent that all of the shifts
and rotates in i386.md standardize on QImode for shift counts, but the bit test
instructions use SImode?  I think this explains where the paradoxical SUBREGs
come from, and in theory any_extend from QImode to SImode here could/should 
be handled/unnecessary.

Is it worth investigating a follow-up patch to convert all ZERO_EXTRACTs and
SIGN_EXTRACTs in i386.md to use QImode (instead of SImode)?

Thanks in advance,
Roger
--



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl
  2023-07-14  9:27   ` Roger Sayle
@ 2023-07-14 10:03     ` Uros Bizjak
  0 siblings, 0 replies; 4+ messages in thread
From: Uros Bizjak @ 2023-07-14 10:03 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc-patches

On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> > From: Uros Bizjak <ubizjak@gmail.com>
> > Sent: 13 July 2023 19:21
> >
> > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle <roger@nextmovesoftware.com>
> > wrote:
> > >
> > > This patch resolves PR target/110588 to catch another case in combine
> > > where the i386 backend should be generating a btl instruction.  This
> > > adds another define_insn_and_split to recognize the RTL representation
> > > for this case.
> > >
> > > I also noticed that two related define_insn_and_split weren't using
> > > the preferred string style for single statement
> > > preparation-statements, so I've reformatted these to be consistent in style with
> > the new one.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32}
> > > with no new failures.  Ok for mainline?
> > >
> > >
> > > 2023-07-13  Roger Sayle  <roger@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         PR target/110588
> > >         * config/i386/i386.md (*bt<mode>_setcqi): Prefer string form
> > >         preparation statement over braces for a single statement.
> > >         (*bt<mode>_setncqi): Likewise.
> > >         (*bt<mode>_setncqi_2): New define_insn_and_split.
> > >
> > > gcc/testsuite/ChangeLog
> > >         PR target/110588
> > >         * gcc.target/i386/pr110588.c: New test case.
> >
> > +;; Help combine recognize bt followed by setnc (PR target/110588)
> > +(define_insn_and_split "*bt<mode>_setncqi_2"
> > +  [(set (match_operand:QI 0 "register_operand")  (eq:QI
> > +  (zero_extract:SWI48
> > +    (match_operand:SWI48 1 "register_operand")
> > +    (const_int 1)
> > +    (zero_extend:SI (match_operand:QI 2 "register_operand")))
> > +  (const_int 0)))
> > +   (clobber (reg:CC FLAGS_REG))]
> > +  "TARGET_USE_BT && ix86_pre_reload_split ()"
> > +  "#"
> > +  "&& 1"
> > +  [(set (reg:CCC FLAGS_REG)
> > +        (compare:CCC
> > +         (zero_extract:SWI48 (match_dup 1) (const_int 1) (match_dup 2))
> > +         (const_int 0)))
> > +   (set (match_dup 0)
> > +        (ne:QI (reg:CCC FLAGS_REG) (const_int 0)))]
> > +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> >
> > I don't think the above transformation is 100% correct, mainly due to the use of
> > paradoxical subreg.
> >
> > The combined instruction is operating with a zero_extended QImode register, so
> > all bits of the register are well defined. You are splitting using paradoxical subreg,
> > so you don't know what garbage is there in the highpart of the count register.
> > However, BTL/BTQ uses modulo 64 (or 32) of this register, so even with a slightly
> > invalid RTX, everything checks out.
> >
> > +  "operands[2] = lowpart_subreg (SImode, operands[2], QImode);")
> >
> > You probably need <SWI48>mode instead of SImode here.
>
> The define_insn for *bt is:
>
> (define_insn "*bt<mode>"
>   [(set (reg:CCC FLAGS_REG)
>         (compare:CCC
>           (zero_extract:SWI48
>             (match_operand:SWI48 0 "nonimmediate_operand" "r,m")
>             (const_int 1)
>             (match_operand:SI 1 "nonmemory_operand" "r<S>,<S>"))
>           (const_int 0)))]
>
> So <SWI48> isn't appropriate here.
>
> But now you've made me think about it, it's inconsistent that all of the shifts
> and rotates in i386.md standardize on QImode for shift counts, but the bit test
> instructions use SImode?  I think this explains where the paradoxical SUBREGs
> come from, and in theory any_extend from QImode to SImode here could/should
> be handled/unnecessary.
>
> Is it worth investigating a follow-up patch to convert all ZERO_EXTRACTs and
> SIGN_EXTRACTs in i386.md to use QImode (instead of SImode)?

IIRC, zero_extract was moved from modeless to a pattern with defined
mode a while ago. Perhaps SImode is just because of these ancient
times, and BT pattern was written that way to satisfy combine. I think
it is definitely worth investigating; perhaps some BT-related pattern
will become obsolete because of the change.

Uros.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-07-14 10:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-13 17:10 [x86 PATCH] PR target/110588: Add *bt<mode>_setncqi_2 to generate btl Roger Sayle
2023-07-13 18:21 ` Uros Bizjak
2023-07-14  9:27   ` Roger Sayle
2023-07-14 10:03     ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).