public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Fix PR rtl-optimization/89795
@ 2019-09-11 10:44 Eric Botcazou
  2019-09-11 11:39 ` Jakub Jelinek
  2019-09-11 12:05 ` Segher Boessenkool
  0 siblings, 2 replies; 5+ messages in thread
From: Eric Botcazou @ 2019-09-11 10:44 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 989 bytes --]

This fixes a wrong code generation on the ARM in very peculiar circumstances ( 
-O2 -fno-dce -fno-forward-propagate -fno-sched-pressure) present on all active 
branches in the form of a missing zero-extension.  It turns out that not all 
parts of the RTL middle-end agree on the assumptions valid on RISC machines as 
encoded by WORD_REGISTER_OPERATIONS, so the nonzero_bits machinery needs to be 
more conservative with them.  As witnessed by the added XFAIL, this will add 
back some redundant zero-extensions on WORD_REGISTER_OPERATIONS targets.

Tested on SPARC/Solaris and SPARC64/Linux, applied on all active branches.


2019-09-11  Eric Botcazou  <ebotcazou@adacore.com>

	PR rtl-optimization/89795
	* rtlanal.c (nonzero_bits1) <SUBREG>: Do not propagate results from
	inner REGs to paradoxical SUBREGs if WORD_REGISTER_OPERATIONS is set.


2019-09-11  Eric Botcazou  <ebotcazou@adacore.com>

	* gcc.target/sparc/20161111-1.c: XFAIL redundant zero-extension test.

-- 
Eric Botcazou

[-- Attachment #2: pr89795.diff --]
[-- Type: text/x-patch, Size: 1033 bytes --]

Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 275484)
+++ rtlanal.c	(working copy)
@@ -4810,7 +4810,7 @@ nonzero_bits1 (const_rtx x, scalar_int_m
 	       || ((extend_op = load_extend_op (inner_mode)) == SIGN_EXTEND
 		   ? val_signbit_known_set_p (inner_mode, nonzero)
 		   : extend_op != ZERO_EXTEND)
-	       || (!MEM_P (SUBREG_REG (x)) && !REG_P (SUBREG_REG (x))))
+	       || !MEM_P (SUBREG_REG (x)))
 	      && xmode_width > inner_width)
 	    nonzero
 	      |= (GET_MODE_MASK (GET_MODE (x)) & ~GET_MODE_MASK (inner_mode));
Index: testsuite/gcc.target/sparc/20161111-1.c
===================================================================
--- testsuite/gcc.target/sparc/20161111-1.c	(revision 275484)
+++ testsuite/gcc.target/sparc/20161111-1.c	(working copy)
@@ -14,4 +14,4 @@ unsigned char ee_isdigit2(unsigned int i
   return retval;
 }
 
-/* { dg-final { scan-assembler-not "and\t%" } } */
+/* { dg-final { scan-assembler-not "and\t%" { xfail *-*-* } } } */

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Fix PR rtl-optimization/89795
  2019-09-11 10:44 Fix PR rtl-optimization/89795 Eric Botcazou
@ 2019-09-11 11:39 ` Jakub Jelinek
  2019-09-11 12:05 ` Segher Boessenkool
  1 sibling, 0 replies; 5+ messages in thread
From: Jakub Jelinek @ 2019-09-11 11:39 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Wed, Sep 11, 2019 at 12:39:51PM +0200, Eric Botcazou wrote:
> This fixes a wrong code generation on the ARM in very peculiar circumstances ( 
> -O2 -fno-dce -fno-forward-propagate -fno-sched-pressure) present on all active 
> branches in the form of a missing zero-extension.  It turns out that not all 
> parts of the RTL middle-end agree on the assumptions valid on RISC machines as 
> encoded by WORD_REGISTER_OPERATIONS, so the nonzero_bits machinery needs to be 
> more conservative with them.  As witnessed by the added XFAIL, this will add 
> back some redundant zero-extensions on WORD_REGISTER_OPERATIONS targets.
> 
> Tested on SPARC/Solaris and SPARC64/Linux, applied on all active branches.

Thanks.  I've added testcase for the various PRs that are fixed by this
change, verified on armv4 and riscv64 on the first and last testcases that
the rtlanal.c change makes the desired change in the assembly, plus tested
on x86_64-linux -m32/-m64, committed to trunk as obvious.

2019-09-11  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/89435
	PR rtl-optimization/89795
	PR rtl-optimization/91720
	* gcc.dg/pr89435.c: New test.
	* gcc.dg/pr89795.c: New test.
	* gcc.dg/pr91720.c: New test.

--- gcc/testsuite/gcc.dg/pr89435.c.jj	2019-09-11 13:34:09.630165981 +0200
+++ gcc/testsuite/gcc.dg/pr89435.c	2019-09-11 13:33:37.038664072 +0200
@@ -0,0 +1,21 @@
+/* PR rtl-optimization/89435 */
+/* { dg-do run } */
+/* { dg-options "-O1 -fno-forward-propagate -fno-tree-forwprop -fno-tree-ccp" } */
+
+unsigned short a;
+unsigned int b, c, d, e, f;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8 && __SIZEOF_INT__ == 4
+  unsigned char g = e = __builtin_mul_overflow_p (5, 542624702, 0);
+  d = __builtin_bswap64 (a);
+  b = __builtin_sub_overflow ((unsigned char) -e, (unsigned int) d, &g);
+  e = __builtin_mul_overflow (b, c, &a);
+  f = g + e;
+  if (f != 0xff)
+    __builtin_abort ();
+#endif
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr89795.c.jj	2019-09-11 13:05:58.405033868 +0200
+++ gcc/testsuite/gcc.dg/pr89795.c	2019-09-11 13:04:52.508042619 +0200
@@ -0,0 +1,25 @@
+/* PR rtl-optimization/89795 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-forward-propagate -fno-sched-pressure" } */
+
+unsigned char a;
+unsigned b, c, d;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8
+  unsigned x;
+  int e, f;
+  unsigned char g;
+  e = __builtin_bswap32 (a);
+  f = __builtin_ffs (~(unsigned short) e);
+  a = __builtin_mul_overflow ((unsigned char) 0xf7, f, &g);
+  a |= __builtin_sub_overflow_p (c, 0, (unsigned char) 0);
+  d = g + b;
+  x = d;
+  if (x != 0xf7)
+    __builtin_abort ();
+#endif
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr91720.c.jj	2019-09-11 13:06:08.909873060 +0200
+++ gcc/testsuite/gcc.dg/pr91720.c	2019-09-11 13:00:50.070754162 +0200
@@ -0,0 +1,22 @@
+/* PR rtl-optimization/91720 */
+/* { dg-do run } */
+/* { dg-options "-Og -fno-forward-propagate -frerun-cse-after-loop -fno-tree-fre" } */
+
+unsigned a, b;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8
+  unsigned c = 1;
+  unsigned long long d = 0;
+  unsigned char e = 0;
+  e = __builtin_sub_overflow (d, e, &a) ? 0 : 0x80;
+  e = e << 7 | e >> c;
+  __builtin_memmove (&d, &a, 2);
+  b = e;
+  if (b != 0x40)
+    __builtin_abort ();
+#endif
+  return 0;
+}


	Jakub

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Fix PR rtl-optimization/89795
  2019-09-11 10:44 Fix PR rtl-optimization/89795 Eric Botcazou
  2019-09-11 11:39 ` Jakub Jelinek
@ 2019-09-11 12:05 ` Segher Boessenkool
  2019-09-11 12:18   ` Eric Botcazou
  1 sibling, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2019-09-11 12:05 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

Hi Eric,

On Wed, Sep 11, 2019 at 12:39:51PM +0200, Eric Botcazou wrote:
> This fixes a wrong code generation on the ARM in very peculiar circumstances ( 
> -O2 -fno-dce -fno-forward-propagate -fno-sched-pressure) present on all active 
> branches in the form of a missing zero-extension.  It turns out that not all 
> parts of the RTL middle-end agree on the assumptions valid on RISC machines as 
> encoded by WORD_REGISTER_OPERATIONS, so the nonzero_bits machinery needs to be 
> more conservative with them.  As witnessed by the added XFAIL, this will add 
> back some redundant zero-extensions on WORD_REGISTER_OPERATIONS targets.

But what are those assumptions, _exactly_?  It's not clear to me.


WORD_REGISTER_OPERATIONS
  Define this macro if operations between registers with integral mode
  smaller than a word are always performed on the entire register. Most
  RISC machines have this property and most CISC machines do not.


What does it mean for
  (set (reg:QI) (const_int -128))
for example?  This is the canonical form of moving positive 128 into a
register as well, of course.

On a target with 64-bit registers, does it mean that
  (set (reg:SI d) (mult:SI (reg:SI a) (reg:SI b)))
is actually done as
  (set (reg:DI d) (mult:DI (reg:DI a) (reg:DI b)))
?  I can't think of any hardware where that is true.

W_R_O needs many more restrictions if it can work at all, but those aren't
documented.


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Fix PR rtl-optimization/89795
  2019-09-11 12:05 ` Segher Boessenkool
@ 2019-09-11 12:18   ` Eric Botcazou
  2019-09-12  0:42     ` Segher Boessenkool
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Botcazou @ 2019-09-11 12:18 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches

> What does it mean for
>   (set (reg:QI) (const_int -128))
> for example?  This is the canonical form of moving positive 128 into a
> register as well, of course.

Nothing, since word_register_operation_p returns false on CONST_INTs .

> W_R_O needs many more restrictions if it can work at all, but those aren't
> documented.

The problematic issue is more basic and pertains to mere SUBREGs.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Fix PR rtl-optimization/89795
  2019-09-11 12:18   ` Eric Botcazou
@ 2019-09-12  0:42     ` Segher Boessenkool
  0 siblings, 0 replies; 5+ messages in thread
From: Segher Boessenkool @ 2019-09-12  0:42 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches

On Wed, Sep 11, 2019 at 02:18:36PM +0200, Eric Botcazou wrote:
> > What does it mean for
> >   (set (reg:QI) (const_int -128))
> > for example?  This is the canonical form of moving positive 128 into a
> > register as well, of course.
> 
> Nothing, since word_register_operation_p returns false on CONST_INTs .
> 
> > W_R_O needs many more restrictions if it can work at all, but those aren't
> > documented.
> 
> The problematic issue is more basic and pertains to mere SUBREGs.

I'd say this is more basic, but heh...  It's an example.  If it isn't
documented what W_R_O really *does*, how can we ever hope to make it
works stably?


Segher

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-09-12  0:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-11 10:44 Fix PR rtl-optimization/89795 Eric Botcazou
2019-09-11 11:39 ` Jakub Jelinek
2019-09-11 12:05 ` Segher Boessenkool
2019-09-11 12:18   ` Eric Botcazou
2019-09-12  0:42     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).