public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095)
@ 2019-04-16  8:07 Jakub Jelinek
  2019-04-16  8:34 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-04-16  8:07 UTC (permalink / raw)
  To: Richard Biener, Jeff Law, Eric Botcazou; +Cc: gcc-patches

Hi!

In expand_mul_overflow, we emit (depending on operand ranges etc.) more than
one multiplication to optimize for the common case.  One of those is where
we are guaranteed that both operands are either sign or zero extended from
half precision types to the full precision, we can then use much shorter
sequence.  opN_small_p is a compile time predicate whether the corresponding
argument is known to be in that range.  If yes, we just emit the simple
code, otherwise we emit a runtime conditional to check (for uns compare the
high half of bits against zero, for !uns against signbit of the low part
shifted arithmetically right).  In both cases we use SUBREG_PROMOTED_VAR_P
on the SUBREG to propagate this information down to the rest of the
expansion code.

Unfortunately, seems the RTL hoist pass hapilly hoists the subreg/s/v in
this case before the conditional branch, the r259649 change in
simplify-rtx.c optimizes more than in the past a ZERO_EXTENSION from
subreg/s/v into just a pseudo copy and later on yet another pass optimizes
the comparison away.

The following patch fixes that by emitting slightly different code if
!opN_small_p - instead of adding a subreg/s/v on the original pseudo, it
adds a new pseudo, copies original pseudo to it only in the bb guarded with
the condition and makes subreg/s/v on that new pseudo.  That is enough to
tell the hoisting that it shouldn't hoist it, if even that wouldn't work,
I'd say that SUBREG_PROMOTED_VAR_P would be pretty much useless.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-04-16  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/90095
	* internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
	on lowpart SUBREG of op0 if !op0_small_p, instead copy op0 into a new
	pseudo and set SUBREG_PROMOTED_VAR_P on a lowpart SUBREG of that
	pseudo.  Similarly for op1.

	* gcc.dg/pr90095.c: New test.

--- gcc/internal-fn.c.jj	2019-01-07 09:47:33.000000000 +0100
+++ gcc/internal-fn.c	2019-04-15 17:41:50.915039293 +0200
@@ -1756,13 +1756,35 @@ expand_mul_overflow (location_t loc, tre
 	  rtx lopart0s = lopart0, lopart1s = lopart1;
 	  if (GET_CODE (lopart0) == SUBREG)
 	    {
-	      lopart0s = shallow_copy_rtx (lopart0);
+	      if (!op0_small_p)
+		{
+		  /* If !op0_small_p, the SUBREG_PROMOTED_VAR_P state we
+		     want to set is just local to the containing basic block,
+		     guarded with the above conditional branch.  Make sure
+		     to create a new pseudo, so that hoisting doesn't hoist
+		     the promoted subreg before the conditional and optimize
+		     away the conditional incorrectly.  See PR90095 for
+		     details.  */
+		  rtx op0_copy = gen_reg_rtx (mode);
+		  emit_move_insn (op0_copy, op0);
+		  lopart0s = convert_modes (hmode, mode, op0_copy, uns);
+		  gcc_assert (GET_CODE (lopart0s) == SUBREG);
+		}
+	      lopart0s = shallow_copy_rtx (lopart0s);
 	      SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
 	      SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
 	    }
 	  if (GET_CODE (lopart1) == SUBREG)
 	    {
-	      lopart1s = shallow_copy_rtx (lopart1);
+	      if (!op1_small_p)
+		{
+		  /* See above PR90095 comment.  */
+		  rtx op1_copy = gen_reg_rtx (mode);
+		  emit_move_insn (op1_copy, op1);
+		  lopart1s = convert_modes (hmode, mode, op1_copy, uns);
+		  gcc_assert (GET_CODE (lopart1s) == SUBREG);
+		}
+	      lopart1s = shallow_copy_rtx (lopart1s);
 	      SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
 	      SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
 	    }
--- gcc/testsuite/gcc.dg/pr90095.c.jj	2019-04-15 17:47:21.402673528 +0200
+++ gcc/testsuite/gcc.dg/pr90095.c	2019-04-15 17:52:38.111531221 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/90095 */
+/* { dg-do run } */
+/* { dg-options "-Os -fno-tree-bit-ccp" } */
+
+unsigned long long a;
+unsigned int b;
+
+int
+main ()
+{
+  unsigned int c = 255, d = c |= b;
+  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
+    return 0;
+  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a);
+  if (d != 0)
+    __builtin_abort ();
+  return 0;
+}

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095)
  2019-04-16  8:07 [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095) Jakub Jelinek
@ 2019-04-16  8:34 ` Richard Biener
  2019-04-16 10:22   ` Eric Botcazou
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2019-04-16  8:34 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Jeff Law, Eric Botcazou, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5621 bytes --]

On Tue, 16 Apr 2019, Jakub Jelinek wrote:

> Hi!
> 
> In expand_mul_overflow, we emit (depending on operand ranges etc.) more than
> one multiplication to optimize for the common case.  One of those is where
> we are guaranteed that both operands are either sign or zero extended from
> half precision types to the full precision, we can then use much shorter
> sequence.  opN_small_p is a compile time predicate whether the corresponding
> argument is known to be in that range.  If yes, we just emit the simple
> code, otherwise we emit a runtime conditional to check (for uns compare the
> high half of bits against zero, for !uns against signbit of the low part
> shifted arithmetically right).  In both cases we use SUBREG_PROMOTED_VAR_P
> on the SUBREG to propagate this information down to the rest of the
> expansion code.
> 
> Unfortunately, seems the RTL hoist pass hapilly hoists the subreg/s/v in
> this case before the conditional branch, the r259649 change in
> simplify-rtx.c optimizes more than in the past a ZERO_EXTENSION from
> subreg/s/v into just a pseudo copy and later on yet another pass optimizes
> the comparison away.
> 
> The following patch fixes that by emitting slightly different code if
> !opN_small_p - instead of adding a subreg/s/v on the original pseudo, it
> adds a new pseudo, copies original pseudo to it only in the bb guarded with
> the condition and makes subreg/s/v on that new pseudo.  That is enough to
> tell the hoisting that it shouldn't hoist it, if even that wouldn't work,
> I'd say that SUBREG_PROMOTED_VAR_P would be pretty much useless.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

If I remember correctly and read the cfgexpand.c comment

  /* For a promoted variable, X will not be used directly but wrapped in a
     SUBREG with SUBREG_PROMOTED_VAR_P set, which means that the RTL land
     will assume that its upper bits can be inferred from its lower bits.
     Therefore, if X isn't initialized on every path from the entry, then
     we must do it manually in order to fulfill the above assumption.  */

you run into a similar situation but no, I don't think your patch is
really fixing this given SUBREG_PROMOTED_VAR_P is only true depending
on a condition (and not a data-dependence).

Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
to be information that is valid everywhere in the function unless
data dependences force its motion (thus a conditional doesn't do).

Eric should know for sure though.

(oh, and yes, I think SUBREG_PROMOTED_VAR_P is a dangerous thing,
but so is VRP info on SSA names as we've learned...)

Richard.

> 2019-04-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/90095
> 	* internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
> 	on lowpart SUBREG of op0 if !op0_small_p, instead copy op0 into a new
> 	pseudo and set SUBREG_PROMOTED_VAR_P on a lowpart SUBREG of that
> 	pseudo.  Similarly for op1.
> 
> 	* gcc.dg/pr90095.c: New test.
> 
> --- gcc/internal-fn.c.jj	2019-01-07 09:47:33.000000000 +0100
> +++ gcc/internal-fn.c	2019-04-15 17:41:50.915039293 +0200
> @@ -1756,13 +1756,35 @@ expand_mul_overflow (location_t loc, tre
>  	  rtx lopart0s = lopart0, lopart1s = lopart1;
>  	  if (GET_CODE (lopart0) == SUBREG)
>  	    {
> -	      lopart0s = shallow_copy_rtx (lopart0);
> +	      if (!op0_small_p)
> +		{
> +		  /* If !op0_small_p, the SUBREG_PROMOTED_VAR_P state we
> +		     want to set is just local to the containing basic block,
> +		     guarded with the above conditional branch.  Make sure
> +		     to create a new pseudo, so that hoisting doesn't hoist
> +		     the promoted subreg before the conditional and optimize
> +		     away the conditional incorrectly.  See PR90095 for
> +		     details.  */
> +		  rtx op0_copy = gen_reg_rtx (mode);
> +		  emit_move_insn (op0_copy, op0);
> +		  lopart0s = convert_modes (hmode, mode, op0_copy, uns);
> +		  gcc_assert (GET_CODE (lopart0s) == SUBREG);
> +		}
> +	      lopart0s = shallow_copy_rtx (lopart0s);
>  	      SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
>  	      SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
>  	    }
>  	  if (GET_CODE (lopart1) == SUBREG)
>  	    {
> -	      lopart1s = shallow_copy_rtx (lopart1);
> +	      if (!op1_small_p)
> +		{
> +		  /* See above PR90095 comment.  */
> +		  rtx op1_copy = gen_reg_rtx (mode);
> +		  emit_move_insn (op1_copy, op1);
> +		  lopart1s = convert_modes (hmode, mode, op1_copy, uns);
> +		  gcc_assert (GET_CODE (lopart1s) == SUBREG);
> +		}
> +	      lopart1s = shallow_copy_rtx (lopart1s);
>  	      SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
>  	      SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
>  	    }
> --- gcc/testsuite/gcc.dg/pr90095.c.jj	2019-04-15 17:47:21.402673528 +0200
> +++ gcc/testsuite/gcc.dg/pr90095.c	2019-04-15 17:52:38.111531221 +0200
> @@ -0,0 +1,18 @@
> +/* PR middle-end/90095 */
> +/* { dg-do run } */
> +/* { dg-options "-Os -fno-tree-bit-ccp" } */
> +
> +unsigned long long a;
> +unsigned int b;
> +
> +int
> +main ()
> +{
> +  unsigned int c = 255, d = c |= b;
> +  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
> +    return 0;
> +  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a);
> +  if (d != 0)
> +    __builtin_abort ();
> +  return 0;
> +}
> 
> 	Jakub
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG NÌrnberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095)
  2019-04-16  8:34 ` Richard Biener
@ 2019-04-16 10:22   ` Eric Botcazou
  2019-04-16 13:07     ` Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2019-04-16 10:22 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Jakub Jelinek, Jeff Law

> If I remember correctly and read the cfgexpand.c comment
> 
>   /* For a promoted variable, X will not be used directly but wrapped in a
>      SUBREG with SUBREG_PROMOTED_VAR_P set, which means that the RTL land
>      will assume that its upper bits can be inferred from its lower bits.
>      Therefore, if X isn't initialized on every path from the entry, then
>      we must do it manually in order to fulfill the above assumption.  */
> 
> you run into a similar situation but no, I don't think your patch is
> really fixing this given SUBREG_PROMOTED_VAR_P is only true depending
> on a condition (and not a data-dependence).

If the new pseudo is initialized with the above semantics on every possible 
path where it is used, then I think it's OK.

> Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
> zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
> to be information that is valid everywhere in the function unless
> data dependences force its motion (thus a conditional doesn't do).

SUBREG_PROMOTED_VAR_P is fundamentally a property of the inner pseudo and not 
of the SUBREG it is put on, hence...

> (oh, and yes, I think SUBREG_PROMOTED_VAR_P is a dangerous thing,
> but so is VRP info on SSA names as we've learned...)

...yes, it is delicate to deal with.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095)
  2019-04-16 10:22   ` Eric Botcazou
@ 2019-04-16 13:07     ` Jakub Jelinek
  2019-04-16 16:29       ` Eric Botcazou
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-04-16 13:07 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, gcc-patches, Jeff Law

On Tue, Apr 16, 2019 at 12:20:16PM +0200, Eric Botcazou wrote:
> > If I remember correctly and read the cfgexpand.c comment
> > 
> >   /* For a promoted variable, X will not be used directly but wrapped in a
> >      SUBREG with SUBREG_PROMOTED_VAR_P set, which means that the RTL land
> >      will assume that its upper bits can be inferred from its lower bits.
> >      Therefore, if X isn't initialized on every path from the entry, then
> >      we must do it manually in order to fulfill the above assumption.  */
> > 
> > you run into a similar situation but no, I don't think your patch is
> > really fixing this given SUBREG_PROMOTED_VAR_P is only true depending
> > on a condition (and not a data-dependence).
> 
> If the new pseudo is initialized with the above semantics on every possible 
> path where it is used, then I think it's OK.

The patch changed (expand dump with some linebreaks removed):
 ;; _12 = .MUL_OVERFLOW (_3, _6);
 (insn 7 6 8 (set (reg:DI 96) (zero_extend:DI (reg/v:SI 91 [ c ]))) "pr90095.c":14:32 -1 (nil))
 (insn 8 7 9 (parallel [ (set (reg:DI 97) (neg:DI (reg:DI 96))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 9 8 10 (parallel [ (set (reg:QI 98) (neg:QI (subreg:QI (reg/v:SI 91 [ c ]) 0))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:56 -1 (nil))
 (insn 10 9 11 (set (reg:SI 99) (zero_extend:SI (reg:QI 98))) "pr90095.c":14:7 -1 (nil))
 (insn 11 10 12 (parallel [ (set (reg:DI 100) (sign_extend:DI (reg:SI 99))) (clobber (reg:CC 17 flags)) (clobber (scratch:SI)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 12 11 13 (set (reg:DI 93 [ _12+8 ]) (const_int 0 [0])) "pr90095.c":14:7 -1 (nil))
 (insn 13 12 14 (parallel [ (set (reg:DI 101) (lshiftrt:DI (reg:DI 97) (const_int 32 [0x20]))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 14 13 15 (parallel [ (set (reg:DI 102) (lshiftrt:DI (reg:DI 100) (const_int 32 [0x20]))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 15 14 16 (set (reg:CCZ 17 flags) (compare:CCZ (subreg:SI (reg:DI 101) 0) (const_int 0 [0]))) "pr90095.c":14:7 -1 (nil))
-(jump_insn 16 15 18 (set (pc) (if_then_else (ne (reg:CCZ 17 flags) (const_int 0 [0])) (label_ref 28) (pc))) "pr90095.c":14:7 -1 (int_list:REG_BR_PROB 214748364 (nil)))
-(insn 18 16 19 (parallel [ (set (reg:DI 105) (mult:DI (zero_extend:DI (subreg/s/v:SI (reg:DI 97) 0)) (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 0)))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
+(jump_insn 16 15 17 (set (pc) (if_then_else (ne (reg:CCZ 17 flags) (const_int 0 [0])) (label_ref 28) (pc))) "pr90095.c":14:7 -1 (int_list:REG_BR_PROB 214748364 (nil)))
+(insn 17 16 18 (set (reg:DI 104) (reg:DI 97)) "pr90095.c":14:7 -1 (nil))
+(insn 18 17 19 (parallel [ (set (reg:DI 105) (mult:DI (zero_extend:DI (subreg/s/v:SI (reg:DI 104) 0)) (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 0)))) (clobber (reg:CC 17 flags)) ]) "pr90095.c":14:7 -1 (nil))
 (insn 19 18 20 (set (reg:DI 103) (reg:DI 105)) "pr90095.c":14:7 -1 (nil))
 (jump_insn 20 19 21 (set (pc) (label_ref 68)) "pr90095.c":14:7 -1 (nil))
 (barrier 21 20 22)

The pseudo 97 (op0) is initialized before the comparison of its high 32 bits
(insn 15), in the second basic block starting after jump_insn 16 previously
the first insn was the multiplication with (subreg/s/v:SI (reg:DI 97) 0)
(the only spot where such a subreg appears), with the patch we first copy
the pseudo 97 into another pseudo 104 and use (subreg/s/v:SI (reg:DI 104) 0)
in the multiplication instead.  Nothing after jump_insn 20 uses the pseudo
104 or that subreg/s/v.
The runtime check assures that at runtime, the upper 32 bits of pseudo 104
must be always 0 (in this case, in some other case could be sign bit
copies).

The question is if it would be valid say for forward propagation to first
propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104) 0),
then hoisting it before the jump_insn 16, have the subreg optimized away and
miscompile later on.

On the testcase that won't happen e.g. because of the subreg1 pass, which
lowers that insn 17 into:
(insn 99 86 100 3 (set (reg:SI 137)
        (subreg:SI (reg:DI 97) 0)) "pr90095.c":14:7 67 {*movsi_internal}
     (nil))
(insn 100 99 18 3 (set (reg:SI 138 [+4 ])
        (subreg:SI (reg:DI 97) 4)) "pr90095.c":14:7 67 {*movsi_internal}
     (nil))
(insn 18 100 101 3 (parallel [
            (set (reg:DI 105)
                (mult:DI (zero_extend:DI (reg:SI 137))
                    (zero_extend:DI (subreg/s/v:SI (reg:DI 100) 0))))
            (clobber (reg:CC 17 flags))
        ]) "pr90095.c":14:7 334 {*umulsidi3_1}
     (nil))
and so the problematic subreg/s/v is gone (the other is fine, op1_small_p is
true).  But if I add -fno-split-wide-types to the option set, the testcase
aborts again even with the patch.

That means either that the hoisting pass is buggy, or that SUBREG_PROMOTED_*
is only safe at the function boundary (function arguments and return value)
and not elsewhere.

	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095)
  2019-04-16 13:07     ` Jakub Jelinek
@ 2019-04-16 16:29       ` Eric Botcazou
  2019-04-16 19:08         ` [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2) Jakub Jelinek
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2019-04-16 16:29 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, Richard Biener, Jeff Law

> The runtime check assures that at runtime, the upper 32 bits of pseudo 104
> must be always 0 (in this case, in some other case could be sign bit
> copies).

OK, as Richard pointed out, that's not sufficient if we allow...

> The question is if it would be valid say for forward propagation to first
> propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104)
> 0), then hoisting it before the jump_insn 16, have the subreg optimized
> away and miscompile later on.

...this to happen.  So we could clear SUBREG_PROMOTED_VAR_P as soon as the 
SUBREG is rewritten, but this looks quite fragile.  The safest route is 
probably not to use SUBREG_PROMOTED_VAR_P in this conditional context.

> That means either that the hoisting pass is buggy, or that SUBREG_PROMOTED_*
> is only safe at the function boundary (function arguments and return value)
> and not elsewhere.

I think that Richard's characterization is correct:

"Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
to be information that is valid everywhere in the function unless
data dependences force its motion (thus a conditional doesn't do)."

i.e. this also works for a local variable that is always accessed with the 
SUBREG_PROMOTED_VAR_P semantics.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2)
  2019-04-16 16:29       ` Eric Botcazou
@ 2019-04-16 19:08         ` Jakub Jelinek
  2019-04-17  7:50           ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Jakub Jelinek @ 2019-04-16 19:08 UTC (permalink / raw)
  To: Eric Botcazou, Richard Biener, Jeff Law; +Cc: gcc-patches

On Tue, Apr 16, 2019 at 06:21:25PM +0200, Eric Botcazou wrote:
> > The runtime check assures that at runtime, the upper 32 bits of pseudo 104
> > must be always 0 (in this case, in some other case could be sign bit
> > copies).
> 
> OK, as Richard pointed out, that's not sufficient if we allow...
> 
> > The question is if it would be valid say for forward propagation to first
> > propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104)
> > 0), then hoisting it before the jump_insn 16, have the subreg optimized
> > away and miscompile later on.
> 
> ...this to happen.  So we could clear SUBREG_PROMOTED_VAR_P as soon as the 
> SUBREG is rewritten, but this looks quite fragile.  The safest route is 
> probably not to use SUBREG_PROMOTED_VAR_P in this conditional context.
> 
> > That means either that the hoisting pass is buggy, or that SUBREG_PROMOTED_*
> > is only safe at the function boundary (function arguments and return value)
> > and not elsewhere.
> 
> I think that Richard's characterization is correct:
> 
> "Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
> zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
> to be information that is valid everywhere in the function unless
> data dependences force its motion (thus a conditional doesn't do)."
> 
> i.e. this also works for a local variable that is always accessed with the 
> SUBREG_PROMOTED_VAR_P semantics.

Ok, here is a patch that just removes all of that SUBREG_PROMOTED_SET then,
as even for the opN_small_p we can't actually guarantee that for the whole
function, only for where the pseudo with the SSA_NAME for which we get the
range appears.  On the bright side, the generated code at least for the
particular testcase has somewhat different RA decisions, but isn't
significantly worse.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-04-16  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/90095
	* internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
	on lowpart SUBREGs.

	* gcc.dg/pr90095-1.c: New test.
	* gcc.dg/pr90095-2.c: New test.

--- gcc/internal-fn.c.jj	2019-04-15 19:45:22.384444646 +0200
+++ gcc/internal-fn.c	2019-04-16 15:18:56.614708804 +0200
@@ -1753,22 +1753,9 @@ expand_mul_overflow (location_t loc, tre
 	  /* If both op0 and op1 are sign (!uns) or zero (uns) extended from
 	     hmode to mode, the multiplication will never overflow.  We can
 	     do just one hmode x hmode => mode widening multiplication.  */
-	  rtx lopart0s = lopart0, lopart1s = lopart1;
-	  if (GET_CODE (lopart0) == SUBREG)
-	    {
-	      lopart0s = shallow_copy_rtx (lopart0);
-	      SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
-	      SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
-	    }
-	  if (GET_CODE (lopart1) == SUBREG)
-	    {
-	      lopart1s = shallow_copy_rtx (lopart1);
-	      SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
-	      SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
-	    }
 	  tree halfstype = build_nonstandard_integer_type (hprec, uns);
-	  ops.op0 = make_tree (halfstype, lopart0s);
-	  ops.op1 = make_tree (halfstype, lopart1s);
+	  ops.op0 = make_tree (halfstype, lopart0);
+	  ops.op1 = make_tree (halfstype, lopart1);
 	  ops.code = WIDEN_MULT_EXPR;
 	  ops.type = type;
 	  rtx thisres
--- gcc/testsuite/gcc.dg/pr90095-1.c.jj	2019-04-16 13:45:22.614772955 +0200
+++ gcc/testsuite/gcc.dg/pr90095-1.c	2019-04-16 13:45:22.614772955 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/90095 */
+/* { dg-do run } */
+/* { dg-options "-Os -fno-tree-bit-ccp" } */
+
+unsigned long long a;
+unsigned int b;
+
+int
+main ()
+{
+  unsigned int c = 255, d = c |= b;
+  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
+    return 0;
+  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a);
+  if (d != 0)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr90095-2.c.jj	2019-04-16 15:20:14.728414325 +0200
+++ gcc/testsuite/gcc.dg/pr90095-2.c	2019-04-16 15:20:29.597167928 +0200
@@ -0,0 +1,5 @@
+/* PR middle-end/90095 */
+/* { dg-do run } */
+/* { dg-options "-Os -fno-tree-bit-ccp -fno-split-wide-types" } */
+
+#include "pr90095-1.c"


	Jakub

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2)
  2019-04-16 19:08         ` [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2) Jakub Jelinek
@ 2019-04-17  7:50           ` Richard Biener
  0 siblings, 0 replies; 7+ messages in thread
From: Richard Biener @ 2019-04-17  7:50 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Eric Botcazou, Jeff Law, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4685 bytes --]

On Tue, 16 Apr 2019, Jakub Jelinek wrote:

> On Tue, Apr 16, 2019 at 06:21:25PM +0200, Eric Botcazou wrote:
> > > The runtime check assures that at runtime, the upper 32 bits of pseudo 104
> > > must be always 0 (in this case, in some other case could be sign bit
> > > copies).
> > 
> > OK, as Richard pointed out, that's not sufficient if we allow...
> > 
> > > The question is if it would be valid say for forward propagation to first
> > > propagate (or combine) the pseudo 97 into the (subreg/s/v:SI (reg:DI 104)
> > > 0), then hoisting it before the jump_insn 16, have the subreg optimized
> > > away and miscompile later on.
> > 
> > ...this to happen.  So we could clear SUBREG_PROMOTED_VAR_P as soon as the 
> > SUBREG is rewritten, but this looks quite fragile.  The safest route is 
> > probably not to use SUBREG_PROMOTED_VAR_P in this conditional context.
> > 
> > > That means either that the hoisting pass is buggy, or that SUBREG_PROMOTED_*
> > > is only safe at the function boundary (function arguments and return value)
> > > and not elsewhere.
> > 
> > I think that Richard's characterization is correct:
> > 
> > "Note that likely SUBREG_PROMOTED_VAR_P wasn't designed to communicate
> > zero-extend info (can't you use a REG_EQUIV note somehow?) but it has
> > to be information that is valid everywhere in the function unless
> > data dependences force its motion (thus a conditional doesn't do)."
> > 
> > i.e. this also works for a local variable that is always accessed with the 
> > SUBREG_PROMOTED_VAR_P semantics.
> 
> Ok, here is a patch that just removes all of that SUBREG_PROMOTED_SET then,
> as even for the opN_small_p we can't actually guarantee that for the whole
> function, only for where the pseudo with the SSA_NAME for which we get the
> range appears.  On the bright side, the generated code at least for the
> particular testcase has somewhat different RA decisions, but isn't
> significantly worse.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2019-04-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/90095
> 	* internal-fn.c (expand_mul_overflow): Don't set SUBREG_PROMOTED_VAR_P
> 	on lowpart SUBREGs.
> 
> 	* gcc.dg/pr90095-1.c: New test.
> 	* gcc.dg/pr90095-2.c: New test.
> 
> --- gcc/internal-fn.c.jj	2019-04-15 19:45:22.384444646 +0200
> +++ gcc/internal-fn.c	2019-04-16 15:18:56.614708804 +0200
> @@ -1753,22 +1753,9 @@ expand_mul_overflow (location_t loc, tre
>  	  /* If both op0 and op1 are sign (!uns) or zero (uns) extended from
>  	     hmode to mode, the multiplication will never overflow.  We can
>  	     do just one hmode x hmode => mode widening multiplication.  */
> -	  rtx lopart0s = lopart0, lopart1s = lopart1;
> -	  if (GET_CODE (lopart0) == SUBREG)
> -	    {
> -	      lopart0s = shallow_copy_rtx (lopart0);
> -	      SUBREG_PROMOTED_VAR_P (lopart0s) = 1;
> -	      SUBREG_PROMOTED_SET (lopart0s, uns ? SRP_UNSIGNED : SRP_SIGNED);
> -	    }
> -	  if (GET_CODE (lopart1) == SUBREG)
> -	    {
> -	      lopart1s = shallow_copy_rtx (lopart1);
> -	      SUBREG_PROMOTED_VAR_P (lopart1s) = 1;
> -	      SUBREG_PROMOTED_SET (lopart1s, uns ? SRP_UNSIGNED : SRP_SIGNED);
> -	    }
>  	  tree halfstype = build_nonstandard_integer_type (hprec, uns);
> -	  ops.op0 = make_tree (halfstype, lopart0s);
> -	  ops.op1 = make_tree (halfstype, lopart1s);
> +	  ops.op0 = make_tree (halfstype, lopart0);
> +	  ops.op1 = make_tree (halfstype, lopart1);
>  	  ops.code = WIDEN_MULT_EXPR;
>  	  ops.type = type;
>  	  rtx thisres
> --- gcc/testsuite/gcc.dg/pr90095-1.c.jj	2019-04-16 13:45:22.614772955 +0200
> +++ gcc/testsuite/gcc.dg/pr90095-1.c	2019-04-16 13:45:22.614772955 +0200
> @@ -0,0 +1,18 @@
> +/* PR middle-end/90095 */
> +/* { dg-do run } */
> +/* { dg-options "-Os -fno-tree-bit-ccp" } */
> +
> +unsigned long long a;
> +unsigned int b;
> +
> +int
> +main ()
> +{
> +  unsigned int c = 255, d = c |= b;
> +  if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4 || __SIZEOF_LONG_LONG__ != 8)
> +    return 0;
> +  d = __builtin_mul_overflow (-(unsigned long long) d, (unsigned char) - c, &a);
> +  if (d != 0)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/pr90095-2.c.jj	2019-04-16 15:20:14.728414325 +0200
> +++ gcc/testsuite/gcc.dg/pr90095-2.c	2019-04-16 15:20:29.597167928 +0200
> @@ -0,0 +1,5 @@
> +/* PR middle-end/90095 */
> +/* { dg-do run } */
> +/* { dg-options "-Os -fno-tree-bit-ccp -fno-split-wide-types" } */
> +
> +#include "pr90095-1.c"
> 
> 
> 	Jakub
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG NÌrnberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-04-17  7:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-16  8:07 [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095) Jakub Jelinek
2019-04-16  8:34 ` Richard Biener
2019-04-16 10:22   ` Eric Botcazou
2019-04-16 13:07     ` Jakub Jelinek
2019-04-16 16:29       ` Eric Botcazou
2019-04-16 19:08         ` [PATCH] Fix __builtin_*mul*_overflow* expansion (PR middle-end/90095, take 2) Jakub Jelinek
2019-04-17  7:50           ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).