Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
@ 2019-03-01 14:58 Wilco Dijkstra
  2019-03-01 15:07 ` Jakub Jelinek
  0 siblings, 1 reply; 14+ messages in thread
From: Wilco Dijkstra @ 2019-03-01 14:58 UTC (permalink / raw)
  To: Jakub Jelinek, Kyrill Tkachov, Richard Earnshaw, Ramana Radhakrishnan
  Cc: GCC Patches, nd

Hi Jakub,

> Well, with the patch the decision which insn is chosen is done in C code.
> So it could look like:
>  if (operands[2] == const0_rtx
>      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
>    subs; // mandatory
>  else if (TARGET_THUMB2
>	   && arm_immediate_operand (operands[2], SImode)
>	   && arm_neg_immediate_operand (operands[2], SImode))
>    {
>      // Both adds and subs can do the job here, and unlike
>      // non-thumb mode, the instructions can have different
>      // sizes.  Pick the shorter one.
>    }
>  else if (which_alternative == 1)
>    subs;
>  else
>    adds;
>
> This can be done incrementally, I just have no idea what the rules
> for thumb2 constant encoding are.

This is overcomplicating something simple - adds/subs are completely
symmetrical on all Arm targets. So for addition you simply place adds 
alternative first and then subs. For comparison/subtraction place the 
subs alternative first, then adds. Now all special cases will be correct
automatically without needing any extra work or code.

In terms of codesize you need one little tweak for Thumb-2 given that
add(s) and sub(s) of -1 would use a 32-bit rather than 16-bit encoding.
This is handled by not allowing it as a valid immediate on Thumb-2, so
that add -1 is emitted as sub 1 instead (this is not really a loss given that
AND -1, OR -1, EOR -1 etc are all redundant).

Wilco

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01 14:58 [PATCH] ARM cmpsi2_addneg fixes (PR target/89506) Wilco Dijkstra
@ 2019-03-01 15:07 ` Jakub Jelinek
  2019-03-01 15:41   ` Wilco Dijkstra
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-01 15:07 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: Kyrill Tkachov, Richard Earnshaw, Ramana Radhakrishnan, GCC Patches, nd

On Fri, Mar 01, 2019 at 02:57:57PM +0000, Wilco Dijkstra wrote:
> > This can be done incrementally, I just have no idea what the rules
> > for thumb2 constant encoding are.
> 
> This is overcomplicating something simple - adds/subs are completely
> symmetrical on all Arm targets. So for addition you simply place adds 
> alternative first and then subs. For comparison/subtraction place the 

As I wrote, that is what I have done, and regtest revealed two code size
regressions because of that.  Is -1 vs. 1 the only case of immediate
valid for both "I" and "L" constraints where the former is longer than the
latter?

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01 15:07 ` Jakub Jelinek
@ 2019-03-01 15:41   ` Wilco Dijkstra
  2019-03-01 16:04     ` Jakub Jelinek
  2019-03-04  9:04     ` [PATCH] ARM cmpsi2_addneg fix follow-up " Jakub Jelinek
  0 siblings, 2 replies; 14+ messages in thread
From: Wilco Dijkstra @ 2019-03-01 15:41 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Kyrill Tkachov, Richard Earnshaw, Ramana Radhakrishnan, GCC Patches, nd

Hi Jakub,

>> This is overcomplicating something simple - adds/subs are completely
>> symmetrical on all Arm targets. So for addition you simply place adds 
>> alternative first and then subs. For comparison/subtraction place the 
>
> As I wrote, that is what I have done, 

That's not what your proposed patch does:

-  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
-  "@
-   adds%?\\t%0, %1, %3
-   subs%?\\t%0, %1, #%n3"
+  "TARGET_32BIT
+   && (INTVAL (operands[2])
+       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
+{
+  /* For 0 and INT_MIN it is essential that we use subs, as adds
+     will result in different condition codes (like cmn rather than
+     like cmp).  For other immediates, we should choose whatever
+     will have smaller encoding.  */
+  if (operands[2] == const0_rtx
+      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
+      || which_alternative == 1)
+    return "subs%?\\t%0, %1, #%n3";
+  else
+    return "adds%?\\t%0, %1, %3";
+}

Adds/subs were in the incorrect order before and should simply be swapped
rather than replaced by complex C code (which would be unique just to this pattern
when there are lot of similar patterns that do the right thing already).

> and regtest revealed two code size
> regressions because of that.  Is -1 vs. 1 the only case of immediate
> valid for both "I" and "L" constraints where the former is longer than the
> latter?

Yes -1 is the only case which can result in larger code on Thumb-2, so -1 should
be disallowed by the I constraint (or even better, the underlying query). That way
it will work correctly for all add/sub patterns, not just this one.

Wilco
    

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01 15:41   ` Wilco Dijkstra
@ 2019-03-01 16:04     ` Jakub Jelinek
  2019-03-01 16:18       ` Jakub Jelinek
  2019-03-04  9:04     ` [PATCH] ARM cmpsi2_addneg fix follow-up " Jakub Jelinek
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-01 16:04 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: Kyrill Tkachov, Richard Earnshaw, Ramana Radhakrishnan, GCC Patches, nd

On Fri, Mar 01, 2019 at 03:41:33PM +0000, Wilco Dijkstra wrote:
> >> This is overcomplicating something simple - adds/subs are completely
> >> symmetrical on all Arm targets. So for addition you simply place adds 
> >> alternative first and then subs. For comparison/subtraction place the 
> >
> > As I wrote, that is what I have done, 
> 
> That's not what your proposed patch does:

But the earlier version of that patch did, see the PR, in particular
https://gcc.gnu.org/bugzilla/attachment.cgi?id=45840

> 
> -  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
> -  "@
> -   adds%?\\t%0, %1, %3
> -   subs%?\\t%0, %1, #%n3"
> +  "TARGET_32BIT
> +   && (INTVAL (operands[2])
> +       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
> +{
> +  /* For 0 and INT_MIN it is essential that we use subs, as adds
> +     will result in different condition codes (like cmn rather than
> +     like cmp).  For other immediates, we should choose whatever
> +     will have smaller encoding.  */
> +  if (operands[2] == const0_rtx
> +      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
> +      || which_alternative == 1)
> +    return "subs%?\\t%0, %1, #%n3";
> +  else
> +    return "adds%?\\t%0, %1, %3";
> +}
> 
> Adds/subs were in the incorrect order before and should simply be swapped
> rather than replaced by complex C code (which would be unique just to this pattern
> when there are lot of similar patterns that do the right thing already).

I don't see what's wrong on that, the code isn't really complex and actually
explains what it does and why.  If -1/1 is the only case where thumb2 cares,
it will be trivial to add that.  For operands[2] == const1_rtx we then want
to use always adds and for operands[2] == constm1_rtx we want to use subs.

Or swap the alternatives (then 0 and 0x80000000 will come out naturaly out
of it) and just special case the operands[2] == const1_rtx case.
That would then be for this hunk only following.

Changing the "I" constraint or adding new constraints seems too risky or
overkill.

--- gcc/config/arm/arm.md.jj	2019-03-01 09:05:56.911097368 +0100
+++ gcc/config/arm/arm.md	2019-03-01 17:00:23.284076436 +0100
@@ -863,14 +863,24 @@ (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
 	 (match_operand:SI 1 "s_register_operand" "r,r")
-	 (match_operand:SI 2 "arm_addimm_operand" "L,I")))
+	 (match_operand:SI 2 "arm_addimm_operand" "I,L")))
    (set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(plus:SI (match_dup 1)
-		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
-  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
-  "@
-   adds%?\\t%0, %1, %3
-   subs%?\\t%0, %1, #%n3"
+		 (match_operand:SI 3 "arm_addimm_operand" "L,I")))]
+  "TARGET_32BIT
+   && (INTVAL (operands[2])
+       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
+{
+  /* For 0 and INT_MIN it is essential that we use subs, as adds
+     will result in different condition codes (like cmn rather than
+     like cmp).  Both alternatives can match also for -1/1 with
+     TARGET_THUMB2, prefer instruction with #1 in that case as it
+     is shorter.  */
+  if (which_alternative == 0 && operands[1] != const1_rtx)
+    return "subs%?\\t%0, %1, #%n3";
+  else
+    return "adds%?\\t%0, %1, %3";
+}
   [(set_attr "conds" "set")
    (set_attr "type" "alus_sreg")]
 )

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01 16:04     ` Jakub Jelinek
@ 2019-03-01 16:18       ` Jakub Jelinek
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-01 16:18 UTC (permalink / raw)
  To: Wilco Dijkstra
  Cc: Kyrill Tkachov, Richard Earnshaw, Ramana Radhakrishnan, GCC Patches, nd

On Fri, Mar 01, 2019 at 05:04:24PM +0100, Jakub Jelinek wrote:
> +  /* For 0 and INT_MIN it is essential that we use subs, as adds
> +     will result in different condition codes (like cmn rather than
> +     like cmp).  Both alternatives can match also for -1/1 with
> +     TARGET_THUMB2, prefer instruction with #1 in that case as it
> +     is shorter.  */
> +  if (which_alternative == 0 && operands[1] != const1_rtx)
				   ^^^ operands[3] != const1_rtx
				    or operands[2] != constm1_rtx.
Sorry.

> +    return "subs%?\\t%0, %1, #%n3";
> +  else
> +    return "adds%?\\t%0, %1, %3";
> +}
>    [(set_attr "conds" "set")
>     (set_attr "type" "alus_sreg")]
>  )

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)
  2019-03-01 15:41   ` Wilco Dijkstra
  2019-03-01 16:04     ` Jakub Jelinek
@ 2019-03-04  9:04     ` Jakub Jelinek
  2019-03-12 11:50       ` Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)) Jakub Jelinek
  2019-03-19 10:16       ` [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506) Ramana Radhakrishnan
  1 sibling, 2 replies; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-04  9:04 UTC (permalink / raw)
  To: Kyrylo Tkachov, Richard Earnshaw, Ramana Radhakrishnan
  Cc: Wilco Dijkstra, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1562 bytes --]

On Fri, Mar 01, 2019 at 03:41:33PM +0000, Wilco Dijkstra wrote:
> > and regtest revealed two code size
> > regressions because of that.Â  Is -1 vs. 1 the only case of immediate
> > valid for both "I" and "L" constraints where the former is longer than the
> > latter?
> 
> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 should
> be disallowed by the I constraint (or even better, the underlying query). That way
> it will work correctly for all add/sub patterns, not just this one.

So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
following two possible follow-ups which handle the -1 and 1 cases right
(prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
for others use subs if both constraints match, otherwise adds.

The first one uses constraints and no C code in the output, I believe it is
actually more expensive for compile time, because if one just reads what
constrain_operands needs to do for another constraint, it is quite a lot.
I've tried to at least not introduce new constraints for this, there is no
constraint for number 1 (or for number -1).
The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
I constraint for the negation of it, i.e. -8..-1, only -1 from this is
valid for I.  If that matches, we emit adds with #1, otherwise just prefer
subs over adds.

The other swaps the alternatives similarly to the above, but for the special
case of desirable adds with #1 uses C code instead of another alternative.

Ok for trunk (which one)?

	Jakub

[-- Attachment #2: T676 --]
[-- Type: text/plain, Size: 2517 bytes --]

2019-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
	another alternative with I constraint for operands[2] and Pu
	for operands[3] and emit adds in that case, don't use C code to
	emit the instruction.

--- gcc/config/arm/arm.md.jj	2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md	2019-03-02 17:08:13.036725812 +0100
@@ -857,31 +857,31 @@ (define_insn "*compare_negsi_si"
    (set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
+;; For 0 and INT_MIN it is essential that we use subs, as adds will result
+;; in different condition codes (like cmn rather than like cmp), so that
+;; alternative comes first.  Both I and L constraints can match for any
+;; 0x??000000 where except for 0 and INT_MIN it doesn't matter what we choose,
+;; and also for -1 and 1 with TARGET_THUMB2, in that case prefer instruction
+;; with #1 as it is shorter.  The first alternative will use adds ?, ?, #1 over
+;; subs ?, ?, #-1, the second alternative will use subs for #0 or #2147483648
+;; or any other case where both I and L constraints match.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
-	 (match_operand:SI 1 "s_register_operand" "r,r")
-	 (match_operand:SI 2 "arm_addimm_operand" "L,I")))
-   (set (match_operand:SI 0 "s_register_operand" "=r,r")
+	 (match_operand:SI 1 "s_register_operand" "r,r,r")
+	 (match_operand:SI 2 "arm_addimm_operand" "I,I,L")))
+   (set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 	(plus:SI (match_dup 1)
-		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+		 (match_operand:SI 3 "arm_addimm_operand" "Pu,L,I")))]
   "TARGET_32BIT
    && (INTVAL (operands[2])
        == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
-{
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
-     will result in different condition codes (like cmn rather than
-     like cmp).  For other immediates, we should choose whatever
-     will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
-      || which_alternative == 1)
-    return "subs%?\\t%0, %1, #%n3";
-  else
-    return "adds%?\\t%0, %1, %3";
-}
+  "@
+   adds%?\\t%0, %1, %3
+   subs%?\\t%0, %1, #%n3
+   adds%?\\t%0, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "type" "alus_sreg")]
 )

[-- Attachment #3: T676a --]
[-- Type: text/plain, Size: 2042 bytes --]

2019-03-04  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
	subs for the first alternative except when operands[3] is 1.

--- gcc/config/arm/arm.md.jj	2019-03-02 09:04:25.550794239 +0100
+++ gcc/config/arm/arm.md	2019-03-02 09:41:03.501404694 +0100
@@ -857,27 +857,27 @@ (define_insn "*compare_negsi_si"
    (set_attr "type" "alus_sreg")]
 )
 
-;; This is the canonicalization of addsi3_compare0_for_combiner when the
+;; This is the canonicalization of subsi3_compare when the
 ;; addend is a constant.
 (define_insn "cmpsi2_addneg"
   [(set (reg:CC CC_REGNUM)
 	(compare:CC
 	 (match_operand:SI 1 "s_register_operand" "r,r")
-	 (match_operand:SI 2 "arm_addimm_operand" "L,I")))
+	 (match_operand:SI 2 "arm_addimm_operand" "I,L")))
    (set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(plus:SI (match_dup 1)
-		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
+		 (match_operand:SI 3 "arm_addimm_operand" "L,I")))]
   "TARGET_32BIT
    && (INTVAL (operands[2])
        == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
 {
-  /* For 0 and INT_MIN it is essential that we use subs, as adds
-     will result in different condition codes (like cmn rather than
-     like cmp).  For other immediates, we should choose whatever
-     will have smaller encoding.  */
-  if (operands[2] == const0_rtx
-      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
-      || which_alternative == 1)
+  /* For 0 and INT_MIN it is essential that we use subs, as adds will result
+     in different condition codes (like cmn rather than like cmp), so that
+     alternative comes first.  Both alternatives can match for any 0x??000000
+     where except for 0 and INT_MIN it doesn't matter what we choose, and also
+     for -1 and 1 with TARGET_THUMB2, in that case prefer instruction with #1
+     as it is shorter.  */
+  if (which_alternative == 0 && operands[3] != const1_rtx)
     return "subs%?\\t%0, %1, #%n3";
   else
     return "adds%?\\t%0, %1, %3";

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506))
  2019-03-04  9:04     ` [PATCH] ARM cmpsi2_addneg fix follow-up " Jakub Jelinek
@ 2019-03-12 11:50       ` Jakub Jelinek
  2019-03-19  7:57         ` Patch ping^2 " Jakub Jelinek
  2019-03-19 10:16       ` [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506) Ramana Radhakrishnan
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-12 11:50 UTC (permalink / raw)
  To: Kyrylo Tkachov, Richard Earnshaw, Ramana Radhakrishnan
  Cc: Wilco Dijkstra, gcc-patches

On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?

I'd like to ping these patches:

> 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/89506
> 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
> 	another alternative with I constraint for operands[2] and Pu
> 	for operands[3] and emit adds in that case, don't use C code to
> 	emit the instruction.

or:

> 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/89506
> 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
> 	subs for the first alternative except when operands[3] is 1.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Patch ping^2 (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506))
  2019-03-12 11:50       ` Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)) Jakub Jelinek
@ 2019-03-19  7:57         ` Jakub Jelinek
  0 siblings, 0 replies; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-19  7:57 UTC (permalink / raw)
  To: Kyrylo Tkachov, Richard Earnshaw, Ramana Radhakrishnan
  Cc: Wilco Dijkstra, gcc-patches

On Tue, Mar 12, 2019 at 12:43:29PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 04, 2019 at 10:04:01AM +0100, Jakub Jelinek wrote:
> > The first one uses constraints and no C code in the output, I believe it is
> > actually more expensive for compile time, because if one just reads what
> > constrain_operands needs to do for another constraint, it is quite a lot.
> > I've tried to at least not introduce new constraints for this, there is no
> > constraint for number 1 (or for number -1).
> > The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> > I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> > valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> > subs over adds.
> > 
> > The other swaps the alternatives similarly to the above, but for the special
> > case of desirable adds with #1 uses C code instead of another alternative.
> > 
> > Ok for trunk (which one)?
> 
> I'd like to ping these patches:
> 
> > 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR target/89506
> > 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives, add
> > 	another alternative with I constraint for operands[2] and Pu
> > 	for operands[3] and emit adds in that case, don't use C code to
> > 	emit the instruction.
> 
> or:
> 
> > 2019-03-04  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR target/89506
> > 	* config/arm/arm.md (cmpsi2_addneg): Swap the alternatives and use
> > 	subs for the first alternative except when operands[3] is 1.

Ping.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)
  2019-03-04  9:04     ` [PATCH] ARM cmpsi2_addneg fix follow-up " Jakub Jelinek
  2019-03-12 11:50       ` Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)) Jakub Jelinek
@ 2019-03-19 10:16       ` Ramana Radhakrishnan
  1 sibling, 0 replies; 14+ messages in thread
From: Ramana Radhakrishnan @ 2019-03-19 10:16 UTC (permalink / raw)
  To: Jakub Jelinek, Kyrylo Tkachov, Richard Earnshaw
  Cc: Wilco Dijkstra, gcc-patches, nd

On 04/03/2019 09:04, Jakub Jelinek wrote:
> On Fri, Mar 01, 2019 at 03:41:33PM +0000, Wilco Dijkstra wrote:
>> > and regtest revealed two code size
>> > regressions because of that.  Is -1 vs. 1 the only case of immediate
>> > valid for both "I" and "L" constraints where the former is longer than the
>> > latter?
>> 
>> Yes -1 is the only case which can result in larger code on Thumb-2, so -1 should
>> be disallowed by the I constraint (or even better, the underlying query). That way
>> it will work correctly for all add/sub patterns, not just this one.
> 
> So, over the weekend I've bootstrapped/regtested on armv7hl-linux-gnueabi
> following two possible follow-ups which handle the -1 and 1 cases right
> (prefer the instruction with #1 for thumb2), 0 and INT_MIN (use subs) and
> for others use subs if both constraints match, otherwise adds.
> 
> The first one uses constraints and no C code in the output, I believe it is
> actually more expensive for compile time, because if one just reads what
> constrain_operands needs to do for another constraint, it is quite a lot.
> I've tried to at least not introduce new constraints for this, there is no
> constraint for number 1 (or for number -1).
> The Pu constraint is thumb2 only for numbers 1..8, and the alternative uses
> I constraint for the negation of it, i.e. -8..-1, only -1 from this is
> valid for I.  If that matches, we emit adds with #1, otherwise just prefer
> subs over adds.
> 
> The other swaps the alternatives similarly to the above, but for the special
> case of desirable adds with #1 uses C code instead of another alternative.
> 
> Ok for trunk (which one)?
> 
>          Jakub


Option #2 is better (the C code variant)- for something like this adding 
one more constraint is a bit painful.

Ok and watch out for any regressions as usual.

Ramana

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01  9:42     ` Kyrill Tkachov
@ 2019-03-01  9:43       ` Kyrill Tkachov
  0 siblings, 0 replies; 14+ messages in thread
From: Kyrill Tkachov @ 2019-03-01  9:43 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Earnshaw, Ramana Radhakrishnan, gcc-patches


On 3/1/19 9:42 AM, Kyrill Tkachov wrote:
>
> On 3/1/19 9:36 AM, Jakub Jelinek wrote:
> > On Fri, Mar 01, 2019 at 09:21:16AM +0000, Kyrill Tkachov wrote:
> >>> Ok for trunk?
> >> Ok.
> > Thanks.Â  I'll wait for my regtest, previously I've regtested it only 
> with
> > an older version of this patch.
> >
> >>> Or is there an easy way to estimate if a constant satisfies both 
> "I" and "L"
> >>> constraints at the same time and is not one of 0 and INT_MIN, which
> >>> of the two (positive or negated) would have smaller encoding and 
> decide
> >>> based on that?
> >> Don't think there's a cleaner way of doing that. There are some helper
> >> functions in arm.c that can estimate the number instructions neeedd to
> >> synthesise a constant, but nothing that takes size into account. The
> >> encoding size can only change for TARGET_THUMB2 (not TARGET_ARM) 
> but it's
> >> not worth gating that decision on TARGET_THUMB2 as well IMO.
> > Well, with the patch the decision which insn is chosen is done in C 
> code.
> > So it could look like:
> >Â Â Â  if (operands[2] == const0_rtx
> >Â Â Â Â Â Â Â  || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
> >Â Â Â Â Â  subs; // mandatory
> >Â Â Â  else if (TARGET_THUMB2
> >Â Â Â Â Â Â Â Â Â Â  && arm_immediate_operand (operands[2], SImode)
> >Â Â Â Â Â Â Â Â Â Â  && arm_neg_immediate_operand (operands[2], SImode))
> >Â Â Â Â Â  {
> >Â Â Â Â Â Â Â  // Both adds and subs can do the job here, and unlike
> >Â Â Â Â Â Â Â  // non-thumb mode, the instructions can have different
> >Â Â Â Â Â Â Â  // sizes.Â  Pick the shorter one.
> >Â Â Â Â Â  }
> >Â Â Â  else if (which_alternative == 1)
> >Â Â Â Â Â  subs;
> >Â Â Â  else
> >Â Â Â Â Â  adds;
> >
> > This can be done incrementally, I just have no idea what the rules
> > for thumb2 constant encoding are.
>
>
> Yeah, I had considered this. But I don't think it's worth the extra
> code. I'm not aware of any implementation where adds and subs differ in
> performance and I'd be surprised so when there's no size advantage I
> wouldn't be picky.


Two thoughts collided in my head on the last sentence. I meant to say:

"I'm not aware of any implementation where adds and subs differ in
performance so when there's no size advantage I wouldn't be picky."

Thanks,

Kyrill

>
> Thanks,
>
> Kyrill
>
>
> >
> >Â Â Â Â Â Â Â  Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01  9:36   ` Jakub Jelinek
@ 2019-03-01  9:42     ` Kyrill Tkachov
  2019-03-01  9:43       ` Kyrill Tkachov
  0 siblings, 1 reply; 14+ messages in thread
From: Kyrill Tkachov @ 2019-03-01  9:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Earnshaw, Ramana Radhakrishnan, gcc-patches


On 3/1/19 9:36 AM, Jakub Jelinek wrote:
> On Fri, Mar 01, 2019 at 09:21:16AM +0000, Kyrill Tkachov wrote:
>>> Ok for trunk?
>> Ok.
> Thanks.  I'll wait for my regtest, previously I've regtested it only with
> an older version of this patch.
>
>>> Or is there an easy way to estimate if a constant satisfies both "I" and "L"
>>> constraints at the same time and is not one of 0 and INT_MIN, which
>>> of the two (positive or negated) would have smaller encoding and decide
>>> based on that?
>> Don't think there's a cleaner way of doing that. There are some helper
>> functions in arm.c that can estimate the number instructions neeedd to
>> synthesise a constant, but nothing that takes size into account. The
>> encoding size can only change for TARGET_THUMB2 (not TARGET_ARM) but it's
>> not worth gating that decision on TARGET_THUMB2 as well IMO.
> Well, with the patch the decision which insn is chosen is done in C code.
> So it could look like:
>    if (operands[2] == const0_rtx
>        || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
>      subs; // mandatory
>    else if (TARGET_THUMB2
> 	   && arm_immediate_operand (operands[2], SImode)
> 	   && arm_neg_immediate_operand (operands[2], SImode))
>      {
>        // Both adds and subs can do the job here, and unlike
>        // non-thumb mode, the instructions can have different
>        // sizes.  Pick the shorter one.
>      }
>    else if (which_alternative == 1)
>      subs;
>    else
>      adds;
>
> This can be done incrementally, I just have no idea what the rules
> for thumb2 constant encoding are.


Yeah, I had considered this. But I don't think it's worth the extra 
code. I'm not aware of any implementation where adds and subs differ in 
performance and I'd be surprised so when there's no size advantage I 
wouldn't be picky.

Thanks,

Kyrill


>
> 	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-03-01  9:21 ` Kyrill Tkachov
@ 2019-03-01  9:36   ` Jakub Jelinek
  2019-03-01  9:42     ` Kyrill Tkachov
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2019-03-01  9:36 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Richard Earnshaw, Ramana Radhakrishnan, gcc-patches

On Fri, Mar 01, 2019 at 09:21:16AM +0000, Kyrill Tkachov wrote:
> > Ok for trunk?
> 
> Ok.

Thanks.  I'll wait for my regtest, previously I've regtested it only with
an older version of this patch.

> > Or is there an easy way to estimate if a constant satisfies both "I" and "L"
> > constraints at the same time and is not one of 0 and INT_MIN, which
> > of the two (positive or negated) would have smaller encoding and decide
> > based on that?
> 
> Don't think there's a cleaner way of doing that. There are some helper
> functions in arm.c that can estimate the number instructions neeedd to
> synthesise a constant, but nothing that takes size into account. The
> encoding size can only change for TARGET_THUMB2 (not TARGET_ARM) but it's
> not worth gating that decision on TARGET_THUMB2 as well IMO.

Well, with the patch the decision which insn is chosen is done in C code.
So it could look like:
  if (operands[2] == const0_rtx
      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
    subs; // mandatory
  else if (TARGET_THUMB2
	   && arm_immediate_operand (operands[2], SImode)
	   && arm_neg_immediate_operand (operands[2], SImode))
    {
      // Both adds and subs can do the job here, and unlike
      // non-thumb mode, the instructions can have different
      // sizes.  Pick the shorter one.
    }
  else if (which_alternative == 1)
    subs;
  else
    adds;

This can be done incrementally, I just have no idea what the rules
for thumb2 constant encoding are.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
  2019-02-28 22:16 [PATCH] ARM cmpsi2_addneg fixes " Jakub Jelinek
@ 2019-03-01  9:21 ` Kyrill Tkachov
  2019-03-01  9:36   ` Jakub Jelinek
  0 siblings, 1 reply; 14+ messages in thread
From: Kyrill Tkachov @ 2019-03-01  9:21 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Earnshaw, Ramana Radhakrishnan, gcc-patches

Hi Jakub,

On 2/28/19 9:43 PM, Jakub Jelinek wrote:
> Hi!
>
> The following testcase ICEs on ARM, because the backend creates CONST_INTs
> that aren't valid for SImode, in which they are used (0x80000000 rather than
> the canonical -0x80000000).  This is fixed by the 3 gen_int_mode calls
> instead of just GEN_INT.
> Once that is fixed, we ICE because if both the CONST_INTs are -0x80000000,
> then INTVAL (operands[2]) == -INTVAL (operands[3]) is no longer true and
> thus cmpsi2_addneg doesn't match and no other pattern does.
> Fixing that is pretty obvious thing to do as well, similarly to the recent
> *subsi3_carryin_compare_const fix.
>
> The next problem is that the result doesn't bootstrap.  The problem is that
> the instruction emitted for that -0x80000000 immediate - adds reg, reg, #-0x80000000
> actually doesn't do what the RTL pattern for it says.
> The pattern is like subsi3_compare, except that the MINUS with
> constant second argument is canonicalized as RTL normally does into PLUS of
> the negated constant.  The mode on the condition code register is CCmode,
> so all Z, N, C and V bits should be valid, and the pattern is like that of
> a cmp instruction, so the behavior vs. condition codes should be like cmp
> instruction, which is what the subs instruction does.  The adds r1, r2, #N
> and subs r1, r2, #-N instructions actually behave the same most of the time.
> The result is the same, Z and N flags are always the same as well.  And,
> it seems that for all N except for 0 and 0x80000000 also the C and V bits
> are the same (I've proved that by using the valgrind subs condition code
> algorithm (which is the same as cmp) vs. valgrind adds condition code
> algorithm (which is the same as cmn) and even emulated it using 8-bit x
> 8-bit exhaustive check).  For 0 and 0x80000000 it is different and as can be
> seen by the bootstrap failure, it is significant.
> Previously before the above change we got by by the pattern never triggering
> by the combiner for 0x80000000 (because the combiner would always try
> canonical CONST_INTs for both and those don't have INTVAL == -INTVAL), but
> it worked for the *compare_scc splitter/*negscc/*thumb2_negscc patterns
> (which created invalid RTL and used adds rather than subs, but didn't care,
> as it only tested the Z bit using ne condition).
>
> My next attempt was just to switch the two alternatives, so that if
> operands[2] matches both "I" and "L" constraints and we can choose, we'd
> use subs.  That cured the bootstrap failure, but introduced
> +FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54
> +FAIL: gcc.target/arm/thumb2-cmpneg2add-2.c scan-assembler adds
> regressions, in those 2 testcases neither the problematic 0 nor INT_MIN
> is used, so both adds and subs are equivalent, but
> -	adds	r2, r4, #1
> +	subs	r2, r4, #-1
> results in larger code.
> Note, I've seen the patch creating smaller code in some cases too,
> when the combiner matched the cmpsi2_addneg pattern with INT_MIN and
> previously emitted loading the constant into some register and performing
> subs with 3 registers instead of 2 and immediate.
>
> So, this is another alternative I'm proposing, just make sure we use
> subs for #0 and #-2147483648 (where it is essential for correctness)
> and for others keep previous behavior.
>
> Ok for trunk?

Ok.


>
> Or is there an easy way to estimate if a constant satisfies both "I" and "L"
> constraints at the same time and is not one of 0 and INT_MIN, which
> of the two (positive or negated) would have smaller encoding and decide
> based on that?

Don't think there's a cleaner way of doing that. There are some helper 
functions in arm.c that can estimate the number instructions neeedd to 
synthesise a constant, but nothing that takes size into account. The 
encoding size can only change for TARGET_THUMB2 (not TARGET_ARM) but 
it's not worth gating that decision on TARGET_THUMB2 as well IMO.

Thanks,

Kyrill

>
> 2019-02-28  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR target/89506
> 	* config/arm/arm.md (cmpsi2_addneg): Use
> 	trunc_int_for_mode (-INTVAL (...), SImode) instead of -INTVAL (...).
> 	If operands[2] is 0 or INT_MIN, force use of subs.
> 	(*compare_scc splitter): Use gen_int_mode.
> 	(*negscc): Likewise.
> 	* config/arm/thumb2.md (*thumb2_negscc): Likewise.
>
> 	* gcc.dg/pr89506.c: New test.
>
> --- gcc/config/arm/arm.md.jj	2019-02-28 14:13:08.368267536 +0100
> +++ gcc/config/arm/arm.md	2019-02-28 22:09:03.089588596 +0100
> @@ -867,10 +867,21 @@ (define_insn "cmpsi2_addneg"
>      (set (match_operand:SI 0 "s_register_operand" "=r,r")
>   	(plus:SI (match_dup 1)
>   		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
> -  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
> -  "@
> -   adds%?\\t%0, %1, %3
> -   subs%?\\t%0, %1, #%n3"
> +  "TARGET_32BIT
> +   && (INTVAL (operands[2])
> +       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
> +{
> +  /* For 0 and INT_MIN it is essential that we use subs, as adds
> +     will result in different condition codes (like cmn rather than
> +     like cmp).  For other immediates, we should choose whatever
> +     will have smaller encoding.  */
> +  if (operands[2] == const0_rtx
> +      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
> +      || which_alternative == 1)
> +    return "subs%?\\t%0, %1, #%n3";
> +  else
> +    return "adds%?\\t%0, %1, %3";
> +}
>     [(set_attr "conds" "set")
>      (set_attr "type" "alus_sreg")]
>   )
> @@ -9302,7 +9313,7 @@ (define_split
>      (cond_exec (ne:CC (reg:CC CC_REGNUM) (const_int 0))
>   	      (set (match_dup 0) (const_int 1)))]
>   {
> -  operands[3] = GEN_INT (-INTVAL (operands[2]));
> +  operands[3] = gen_int_mode (-INTVAL (operands[2]), SImode);
>   })
>   
>   (define_split
> @@ -10082,7 +10093,8 @@ (define_insn_and_split "*negscc"
>           /* Emit subs\\t%0, %1, %2\;mvnne\\t%0, #0 */
>           if (CONST_INT_P (operands[2]))
>             emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
> -                                        GEN_INT (- INTVAL (operands[2]))));
> +                                        gen_int_mode (-INTVAL (operands[2]),
> +						      SImode)));
>           else
>             emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));
>   
> --- gcc/config/arm/thumb2.md.jj	2019-02-28 08:14:55.970600405 +0100
> +++ gcc/config/arm/thumb2.md	2019-02-28 21:51:20.387902673 +0100
> @@ -913,7 +913,8 @@ (define_insn_and_split "*thumb2_negscc"
>           /* Emit subs\\t%0, %1, %2\;it\\tne\;mvnne\\t%0, #0 */
>           if (CONST_INT_P (operands[2]))
>             emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
> -                                        GEN_INT (- INTVAL (operands[2]))));
> +                                        gen_int_mode (-INTVAL (operands[2]),
> +						      SImode)));
>           else
>             emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));
>   
> --- gcc/testsuite/gcc.dg/pr89506.c.jj	2019-02-28 21:51:20.387902673 +0100
> +++ gcc/testsuite/gcc.dg/pr89506.c	2019-02-28 21:51:20.387902673 +0100
> @@ -0,0 +1,14 @@
> +/* PR target/89506 */
> +/* { dg-do compile } */
> +/* { dg-options "-Og -g -w" } */
> +
> +long long a;
> +int c;
> +
> +int
> +foo (long long d, short e)
> +{
> +  __builtin_sub_overflow (0xffffffff, c, &a);
> +  e >>= ~2147483647 != (int) a;
> +  return d + e;
> +}
>
> 	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] ARM cmpsi2_addneg fixes (PR target/89506)
@ 2019-02-28 22:16 Jakub Jelinek
  2019-03-01  9:21 ` Kyrill Tkachov
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2019-02-28 22:16 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Richard Earnshaw, Ramana Radhakrishnan, gcc-patches

Hi!

The following testcase ICEs on ARM, because the backend creates CONST_INTs
that aren't valid for SImode, in which they are used (0x80000000 rather than
the canonical -0x80000000).  This is fixed by the 3 gen_int_mode calls
instead of just GEN_INT.
Once that is fixed, we ICE because if both the CONST_INTs are -0x80000000,
then INTVAL (operands[2]) == -INTVAL (operands[3]) is no longer true and
thus cmpsi2_addneg doesn't match and no other pattern does.
Fixing that is pretty obvious thing to do as well, similarly to the recent
*subsi3_carryin_compare_const fix.

The next problem is that the result doesn't bootstrap.  The problem is that
the instruction emitted for that -0x80000000 immediate - adds reg, reg, #-0x80000000
actually doesn't do what the RTL pattern for it says.
The pattern is like subsi3_compare, except that the MINUS with
constant second argument is canonicalized as RTL normally does into PLUS of
the negated constant.  The mode on the condition code register is CCmode,
so all Z, N, C and V bits should be valid, and the pattern is like that of
a cmp instruction, so the behavior vs. condition codes should be like cmp
instruction, which is what the subs instruction does.  The adds r1, r2, #N
and subs r1, r2, #-N instructions actually behave the same most of the time.
The result is the same, Z and N flags are always the same as well.  And,
it seems that for all N except for 0 and 0x80000000 also the C and V bits
are the same (I've proved that by using the valgrind subs condition code
algorithm (which is the same as cmp) vs. valgrind adds condition code
algorithm (which is the same as cmn) and even emulated it using 8-bit x
8-bit exhaustive check).  For 0 and 0x80000000 it is different and as can be
seen by the bootstrap failure, it is significant.
Previously before the above change we got by by the pattern never triggering
by the combiner for 0x80000000 (because the combiner would always try
canonical CONST_INTs for both and those don't have INTVAL == -INTVAL), but
it worked for the *compare_scc splitter/*negscc/*thumb2_negscc patterns
(which created invalid RTL and used adds rather than subs, but didn't care,
as it only tested the Z bit using ne condition).

My next attempt was just to switch the two alternatives, so that if
operands[2] matches both "I" and "L" constraints and we can choose, we'd
use subs.  That cured the bootstrap failure, but introduced
+FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54
+FAIL: gcc.target/arm/thumb2-cmpneg2add-2.c scan-assembler adds
regressions, in those 2 testcases neither the problematic 0 nor INT_MIN
is used, so both adds and subs are equivalent, but
-	adds	r2, r4, #1
+	subs	r2, r4, #-1
results in larger code.
Note, I've seen the patch creating smaller code in some cases too,
when the combiner matched the cmpsi2_addneg pattern with INT_MIN and
previously emitted loading the constant into some register and performing
subs with 3 registers instead of 2 and immediate.

So, this is another alternative I'm proposing, just make sure we use
subs for #0 and #-2147483648 (where it is essential for correctness)
and for others keep previous behavior.

Ok for trunk?

Or is there an easy way to estimate if a constant satisfies both "I" and "L"
constraints at the same time and is not one of 0 and INT_MIN, which
of the two (positive or negated) would have smaller encoding and decide
based on that?

2019-02-28  Jakub Jelinek  <jakub@redhat.com>

	PR target/89506
	* config/arm/arm.md (cmpsi2_addneg): Use
	trunc_int_for_mode (-INTVAL (...), SImode) instead of -INTVAL (...).
	If operands[2] is 0 or INT_MIN, force use of subs.
	(*compare_scc splitter): Use gen_int_mode.
	(*negscc): Likewise.
	* config/arm/thumb2.md (*thumb2_negscc): Likewise.

	* gcc.dg/pr89506.c: New test.

--- gcc/config/arm/arm.md.jj	2019-02-28 14:13:08.368267536 +0100
+++ gcc/config/arm/arm.md	2019-02-28 22:09:03.089588596 +0100
@@ -867,10 +867,21 @@ (define_insn "cmpsi2_addneg"
    (set (match_operand:SI 0 "s_register_operand" "=r,r")
 	(plus:SI (match_dup 1)
 		 (match_operand:SI 3 "arm_addimm_operand" "I,L")))]
-  "TARGET_32BIT && INTVAL (operands[2]) == -INTVAL (operands[3])"
-  "@
-   adds%?\\t%0, %1, %3
-   subs%?\\t%0, %1, #%n3"
+  "TARGET_32BIT
+   && (INTVAL (operands[2])
+       == trunc_int_for_mode (-INTVAL (operands[3]), SImode))"
+{
+  /* For 0 and INT_MIN it is essential that we use subs, as adds
+     will result in different condition codes (like cmn rather than
+     like cmp).  For other immediates, we should choose whatever
+     will have smaller encoding.  */
+  if (operands[2] == const0_rtx
+      || INTVAL (operands[2]) == -HOST_WIDE_INT_C (0x80000000)
+      || which_alternative == 1)
+    return "subs%?\\t%0, %1, #%n3";
+  else
+    return "adds%?\\t%0, %1, %3";
+}
   [(set_attr "conds" "set")
    (set_attr "type" "alus_sreg")]
 )
@@ -9302,7 +9313,7 @@ (define_split
    (cond_exec (ne:CC (reg:CC CC_REGNUM) (const_int 0))
 	      (set (match_dup 0) (const_int 1)))]
 {
-  operands[3] = GEN_INT (-INTVAL (operands[2]));
+  operands[3] = gen_int_mode (-INTVAL (operands[2]), SImode);
 })

 (define_split
@@ -10082,7 +10093,8 @@ (define_insn_and_split "*negscc"
         /* Emit subs\\t%0, %1, %2\;mvnne\\t%0, #0 */
         if (CONST_INT_P (operands[2]))
           emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
-                                        GEN_INT (- INTVAL (operands[2]))));
+                                        gen_int_mode (-INTVAL (operands[2]),
+						      SImode)));
         else
           emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));

--- gcc/config/arm/thumb2.md.jj	2019-02-28 08:14:55.970600405 +0100
+++ gcc/config/arm/thumb2.md	2019-02-28 21:51:20.387902673 +0100
@@ -913,7 +913,8 @@ (define_insn_and_split "*thumb2_negscc"
         /* Emit subs\\t%0, %1, %2\;it\\tne\;mvnne\\t%0, #0 */
         if (CONST_INT_P (operands[2]))
           emit_insn (gen_cmpsi2_addneg (operands[0], operands[1], operands[2],
-                                        GEN_INT (- INTVAL (operands[2]))));
+                                        gen_int_mode (-INTVAL (operands[2]),
+						      SImode)));
         else
           emit_insn (gen_subsi3_compare (operands[0], operands[1], operands[2]));

--- gcc/testsuite/gcc.dg/pr89506.c.jj	2019-02-28 21:51:20.387902673 +0100
+++ gcc/testsuite/gcc.dg/pr89506.c	2019-02-28 21:51:20.387902673 +0100
@@ -0,0 +1,14 @@
+/* PR target/89506 */
+/* { dg-do compile } */
+/* { dg-options "-Og -g -w" } */
+
+long long a;
+int c;
+
+int
+foo (long long d, short e)
+{
+  __builtin_sub_overflow (0xffffffff, c, &a);
+  e >>= ~2147483647 != (int) a;
+  return d + e;
+}

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-03-19 10:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-01 14:58 [PATCH] ARM cmpsi2_addneg fixes (PR target/89506) Wilco Dijkstra
2019-03-01 15:07 ` Jakub Jelinek
2019-03-01 15:41   ` Wilco Dijkstra
2019-03-01 16:04     ` Jakub Jelinek
2019-03-01 16:18       ` Jakub Jelinek
2019-03-04  9:04     ` [PATCH] ARM cmpsi2_addneg fix follow-up " Jakub Jelinek
2019-03-12 11:50       ` Patch ping (Re: [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506)) Jakub Jelinek
2019-03-19  7:57         ` Patch ping^2 " Jakub Jelinek
2019-03-19 10:16       ` [PATCH] ARM cmpsi2_addneg fix follow-up (PR target/89506) Ramana Radhakrishnan
  -- strict thread matches above, loose matches on Subject: below --
2019-02-28 22:16 [PATCH] ARM cmpsi2_addneg fixes " Jakub Jelinek
2019-03-01  9:21 ` Kyrill Tkachov
2019-03-01  9:36   ` Jakub Jelinek
2019-03-01  9:42     ` Kyrill Tkachov
2019-03-01  9:43       ` Kyrill Tkachov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).