public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [x86 PATCH] PR target/106273: Add earlyclobber to *andn<dwi>3_doubleword_bmi
@ 2022-07-15 13:27 Roger Sayle
  2022-07-15 15:47 ` Uros Bizjak
  0 siblings, 1 reply; 2+ messages in thread
From: Roger Sayle @ 2022-07-15 13:27 UTC (permalink / raw)
  To: 'GCC Patches'

[-- Attachment #1: Type: text/plain, Size: 2761 bytes --]

 

This patch resolves PR target/106273 which is a wrong code regression

caused by the recent reorganization to split doubleword operations after

reload on x86.  For the failing test case, the constraints on the

andnti3_doubleword_bmi pattern allow reload to allocate the output and

operand in overlapping but non-identical registers, i.e.

 

(insn 45 44 66 2 (parallel [

            (set (reg/v:TI 5 di [orig:96 i ] [96])

                (and:TI (not:TI (reg:TI 39 r11 [orig:83 _2 ] [83]))

                    (reg/v:TI 4 si [orig:100 i ] [100])))

            (clobber (reg:CC 17 flags))

        ]) "pr106273.c":13:5 562 {*andnti3_doubleword_bmi}

 

where the output is in registers 5 and 6, and the second operand is

registers 4 and 5, which then leads to the incorrect split:

 

(insn 113 44 114 2 (parallel [

            (set (reg:DI 5 di [orig:96 i ] [96])

                (and:DI (not:DI (reg:DI 39 r11 [orig:83 _2 ] [83]))

                    (reg:DI 4 si [orig:100 i ] [100])))

            (clobber (reg:CC 17 flags))

        ]) "pr106273.c":13:5 566 {*andndi_1}

 

(insn 114 113 66 2 (parallel [

            (set (reg:DI 6 bp [ i+8 ])

                (and:DI (not:DI (reg:DI 40 r12 [ _2+8 ]))

                    (reg:DI 5 di [ i+8 ])))

            (clobber (reg:CC 17 flags))

        ]) "pr106273.c":13:5 566 {*andndi_1}

 

[Notice that reg:DI 5 is set in the first instruction, but assumed

to have its original value in the second].  My first thought was

that this could be fixed by swapping the order of the split instructions

(which works in this case), but in the general case, it's impossible

to handle (set (reg:TI x) (op (reg:TI x+1) (reg:TI x-1)).  Hence for

correctness this pattern needs an earlyclobber "=&r", but we can also

allow cases where the output is the same as one of the operands (using

constraint "0").  The other binary logic operations (AND, IOR, XOR)

are unaffected as they constrain the output to match the first

operand, but BMI's andn is a three-operand instruction which can

lead to the overlapping cases described above.

 

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap

and make -k check, both with and without --target_board=unix{-m32} with

no new failures.  Ok for mainline?

 

2022-07-15  Roger Sayle  <roger@nextmovesoftware.com>

 

gcc/ChangeLog

        PR target/106273

        * config/i386/i386.md (*andn<dwi>3_doubleword_bmi): Update the

        constraints to reflect the output is earlyclobber, unless it is

        the same register (pair) as one of the operands.

 

gcc/testsuite/ChangeLog

        PR target/106273

        * gcc.target/i386/pr106273.c: New test case.

 

 

Thanks again, and sorry for the inconvenience.

Roger

--

 


[-- Attachment #2: patchan.txt --]
[-- Type: text/plain, Size: 1496 bytes --]

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 3b02d0c..585b2d5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10423,10 +10423,10 @@
 })
 
 (define_insn_and_split "*andn<dwi>3_doubleword_bmi"
-  [(set (match_operand:<DWI> 0 "register_operand" "=r")
+  [(set (match_operand:<DWI> 0 "register_operand" "=&r,r,r")
 	(and:<DWI>
-	  (not:<DWI> (match_operand:<DWI> 1 "register_operand" "r"))
-	  (match_operand:<DWI> 2 "nonimmediate_operand" "ro")))
+	  (not:<DWI> (match_operand:<DWI> 1 "register_operand" "r,0,r"))
+	  (match_operand:<DWI> 2 "nonimmediate_operand" "ro,ro,0")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI"
   "#"
diff --git a/gcc/testsuite/gcc.target/i386/pr106273.c b/gcc/testsuite/gcc.target/i386/pr106273.c
new file mode 100644
index 0000000..8c2fbbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106273.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Og -march=cascadelake" } */
+typedef unsigned char u8;
+typedef unsigned short u16;
+typedef unsigned long long u64;
+
+u8 g;
+
+void
+foo (__int128 i, u8 *r)
+{
+  u16 a = __builtin_sub_overflow_p (0, i * g, 0);
+  i ^= g & i;
+  u64 s = (i >> 64) + i;
+  *r = ((union { u16 a; u8 b[2]; }) a).b[1] + s;
+}
+
+int
+main (void)
+{
+  u8 x;
+  foo (5, &x);
+  if (x != 5)
+    __builtin_abort ();
+  return 0;
+}
+/* { dg-final { scan-assembler-not "andn\[ \\t\]+%rdi, %r11, %rdi" } } */

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [x86 PATCH] PR target/106273: Add earlyclobber to *andn<dwi>3_doubleword_bmi
  2022-07-15 13:27 [x86 PATCH] PR target/106273: Add earlyclobber to *andn<dwi>3_doubleword_bmi Roger Sayle
@ 2022-07-15 15:47 ` Uros Bizjak
  0 siblings, 0 replies; 2+ messages in thread
From: Uros Bizjak @ 2022-07-15 15:47 UTC (permalink / raw)
  To: Roger Sayle; +Cc: GCC Patches

On Fri, Jul 15, 2022 at 3:28 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
>
> This patch resolves PR target/106273 which is a wrong code regression
>
> caused by the recent reorganization to split doubleword operations after
>
> reload on x86.  For the failing test case, the constraints on the
>
> andnti3_doubleword_bmi pattern allow reload to allocate the output and
>
> operand in overlapping but non-identical registers, i.e.
>
>
>
> (insn 45 44 66 2 (parallel [
>
>             (set (reg/v:TI 5 di [orig:96 i ] [96])
>
>                 (and:TI (not:TI (reg:TI 39 r11 [orig:83 _2 ] [83]))
>
>                     (reg/v:TI 4 si [orig:100 i ] [100])))
>
>             (clobber (reg:CC 17 flags))
>
>         ]) "pr106273.c":13:5 562 {*andnti3_doubleword_bmi}
>
>
>
> where the output is in registers 5 and 6, and the second operand is
>
> registers 4 and 5, which then leads to the incorrect split:
>
>
>
> (insn 113 44 114 2 (parallel [
>
>             (set (reg:DI 5 di [orig:96 i ] [96])
>
>                 (and:DI (not:DI (reg:DI 39 r11 [orig:83 _2 ] [83]))
>
>                     (reg:DI 4 si [orig:100 i ] [100])))
>
>             (clobber (reg:CC 17 flags))
>
>         ]) "pr106273.c":13:5 566 {*andndi_1}
>
>
>
> (insn 114 113 66 2 (parallel [
>
>             (set (reg:DI 6 bp [ i+8 ])
>
>                 (and:DI (not:DI (reg:DI 40 r12 [ _2+8 ]))
>
>                     (reg:DI 5 di [ i+8 ])))
>
>             (clobber (reg:CC 17 flags))
>
>         ]) "pr106273.c":13:5 566 {*andndi_1}
>
>
>
> [Notice that reg:DI 5 is set in the first instruction, but assumed
>
> to have its original value in the second].  My first thought was
>
> that this could be fixed by swapping the order of the split instructions
>
> (which works in this case), but in the general case, it's impossible
>
> to handle (set (reg:TI x) (op (reg:TI x+1) (reg:TI x-1)).  Hence for
>
> correctness this pattern needs an earlyclobber "=&r", but we can also
>
> allow cases where the output is the same as one of the operands (using
>
> constraint "0").  The other binary logic operations (AND, IOR, XOR)
>
> are unaffected as they constrain the output to match the first
>
> operand, but BMI's andn is a three-operand instruction which can
>
> lead to the overlapping cases described above.
>
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>
> and make -k check, both with and without --target_board=unix{-m32} with
>
> no new failures.  Ok for mainline?
>
>
>
> 2022-07-15  Roger Sayle  <roger@nextmovesoftware.com>
>
>
>
> gcc/ChangeLog
>
>         PR target/106273
>
>         * config/i386/i386.md (*andn<dwi>3_doubleword_bmi): Update the
>
>         constraints to reflect the output is earlyclobber, unless it is
>
>         the same register (pair) as one of the operands.
>
>
>
> gcc/testsuite/ChangeLog
>
>         PR target/106273
>
>         * gcc.target/i386/pr106273.c: New test case.

OK with a small testcase adjustment.

Thanks,
Uros.

+int
+main (void)

If possible, please name this function "bar", the "main" function
assumes a runtime test.

+{
+  u8 x;
+  foo (5, &x);
+  if (x != 5)
+    __builtin_abort ();
+  return 0;
+}
>
>
>
>
>
> Thanks again, and sorry for the inconvenience.
>
> Roger
>
> --
>
>

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2022-07-15 15:47 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-15 13:27 [x86 PATCH] PR target/106273: Add earlyclobber to *andn<dwi>3_doubleword_bmi Roger Sayle
2022-07-15 15:47 ` Uros Bizjak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).