public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][ARM,ifcvt] Improve use of conditional execution in thumb mode.
@ 2012-02-14 17:00 Andrew Stubbs
  2012-02-14 17:32 ` Richard Earnshaw
  0 siblings, 1 reply; 12+ messages in thread
From: Andrew Stubbs @ 2012-02-14 17:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: patches

[-- Attachment #1: Type: text/plain, Size: 1572 bytes --]

Hi all,

I've discovered that GCC does not use ARM conditional execution for 
16-bit Thumb opcodes in many cases. It's fine for individual 
instructions, but if-conversion of basic blocks with more than one 
instruction fails.

E.g.

int
foo (int a, int b)
{
   if (a != b)
     {
       a = a << b;
       a = a >> 1;
     }

   return a + b;
}

The current compiler gives:

foo:
         cmp     r0, r1
         beq     .L2
         lsls    r0, r0, r1
         asrs    r0, r0, #1
.L2:
         adds    r0, r0, r1
         bx      lr

With my patch I get this:

foo:
         cmp     r0, r1
         itt     ne
         lslne   r0, r0, r1
         asrne   r0, r0, #1
         adds    r0, r0, r1
         bx      lr

The problem comes from the fact that the compiler prefers "lsls" over 
"lsl" because the former is a 16-bit encoding, and the latter a 32-bit 
encoding. There's actually a peephole optimization defined to make this 
happen wherever the CC register is not live.

This is fine in unconditional code, but the CC register clobber means 
that it's only possible to convert it to conditional code if it is the 
last instruction in the IT block, so if-conversion fails on the above 
example.

My patch introduces a new target hook "IFCVT_OVERRIDE_MODIFIED_TEST" 
that allows the CC clobber to be ignored on such instructions, and uses 
IFCVT_MODIFY_INSN to convert from "lsls" to "lsl<c>" where possible.

I've also introduced a new instruction attribute "it_cc" to indicate 
which instruction patterns are affected.

OK for trunk, once stage 1 reopens?

Andrew

[-- Attachment #2: ifcvt_modify_insn.patch --]
[-- Type: text/x-patch, Size: 11264 bytes --]

2012-02-14  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/arm-protos.h (arm_ifcvt_modify_insn): New prototype.
	(arm_ifcvt_override_modified_test): New prototype.
	* config/arm/arm.c (thumb_insn_suitable_for_ifcvt): New function.
	(arm_ifcvt_override_modified_test): New function.
	(arm_ifcvt_modify_insn): New function.
	* config/arm/arm.h (IFCVT_OVERRIDE_MODIFIED_TEST): New macro.
	(IFCVT_MODIFY_INSN): New macro.
	* config/arm/thumb2.md (it_cc): New attribute.
	(thumb2_alusi3_short): Set it_cc attribute.
	(thumb2_shiftsi3_short, thumb2_mov<mode>_shortim): Likewise.
	(thumb2_addsi_short, thumb2_subsi_short): Likewise.
	(thumb2_mulsi_short, thumb2_one_cmplsi2_short): Likewise.
	(thumb2_negsi2_short): Likewise.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in (IFCVT_OVERRIDE_MODIFIED_TEST): Document.
	* ifcvt.c (cond_exec_process_insns): Add IFCVT_OVERRIDE_MODIFIED_TEST.

	gcc/testsuite/
	* gcc.target/arm/thumb-ifcvt.c: New test case.

---
 gcc/config/arm/arm-protos.h                |    5 ++
 gcc/config/arm/arm.c                       |   67 ++++++++++++++++++++++++++++
 gcc/config/arm/arm.h                       |    5 ++
 gcc/config/arm/thumb2.md                   |   11 +++++
 gcc/doc/tm.texi                            |   13 +++++
 gcc/doc/tm.texi.in                         |   13 +++++
 gcc/ifcvt.c                                |   14 +++++-
 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c |   19 ++++++++
 8 files changed, 146 insertions(+), 1 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 296550a..67396f0 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -244,4 +244,9 @@ extern const struct tune_params *current_tune;
 extern int vfp3_const_double_for_fract_bits (rtx);
 #endif /* RTX_CODE */
 
+#ifdef BB_HEAD
+extern int arm_ifcvt_override_modified_test (rtx, rtx);
+extern rtx arm_ifcvt_modify_insn (ce_if_block_t *, rtx, rtx);
+#endif
+
 #endif /* ! GCC_ARM_PROTOS_H */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0bded8d..803e1c9 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -25139,5 +25139,72 @@ vfp3_const_double_for_fract_bits (rtx operand)
   return 0;
 }
 
+/* Find the portion of INSN that is suitable for if-conversion.
+
+   Some Thumb mode 16-bit instructions have two variants: one that is
+   unconditional but clobbers the condition flags, and one that is
+   conditional (in an IT block) and does not clobber anything.
+
+   There are 32-bit variants that are unconditional but don't clobber
+   anything, so the peephole2 pass adds the clobber in order to use the
+   smaller instruction encoding.  Unfortunately this defeats the
+   if-conversion pass since CC must not be modified in an IT block.
+
+   The peephole can be reversed, and the instruction converted to
+   conditional execution without affecting the size optimization.
+
+   This function detects these instructions and returns the sub-expression
+   that contains the operation, without the clobber.  */
+
+static rtx
+thumb_insn_suitable_for_ifcvt (rtx pattern, rtx insn)
+{
+  if (GET_CODE (pattern) == PARALLEL
+      && XVECLEN (pattern, 0) == 2)
+    {
+      rtx op = XVECEXP (pattern, 0, 0);
+      rtx clobber = XVECEXP (pattern, 0, 1);
+
+      if (GET_CODE (clobber) == CLOBBER
+	  && GET_CODE (XEXP (clobber, 0)) == REG
+	  && REGNO (XEXP (clobber, 0)) == CC_REGNUM
+	  && get_attr_it_cc (insn) == IT_CC_NOCLOBBER)
+	return op;
+    }
+
+  return NULL_RTX;
+}
+
+/* Prevent if-conversion thinking that certain Thumb1 instructions
+   clobber CC.  These instruction in fact do not when in an IT block.  */
+int
+arm_ifcvt_override_modified_test (rtx test, rtx insn)
+{
+  if (thumb_insn_suitable_for_ifcvt (PATTERN (insn), insn) != NULL_RTX)
+    return 0; /* Force *not* modified.  */
+
+  return -1;
+}
+
+/* Modify insns to enable conditional execution.
+
+   This function removes CC clobers that impede if-conversion.  See the
+   comments on thumb_insn_suitable_for_ifcvt.  */
+rtx
+arm_ifcvt_modify_insn (ce_if_block_t *ce, rtx pattern, rtx insn)
+{
+  rtx op, cond;
+
+  gcc_assert (GET_CODE (pattern) == COND_EXEC);
+
+  cond = XEXP (pattern, 0);
+  op = thumb_insn_suitable_for_ifcvt (XEXP (pattern, 1), insn);
+
+  if (op != NULL_RTX)
+    return gen_rtx_COND_EXEC (VOIDmode, cond, op);
+
+  return pattern;
+}
+
 #include "gt-arm.h"
 
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 5a78125..0b7f332 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2220,4 +2220,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 
 #define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS
 
+#define IFCVT_OVERRIDE_MODIFIED_TEST(TEST,INSN) \
+  arm_ifcvt_override_modified_test ((TEST), (INSN))
+#define IFCVT_MODIFY_INSN(CE,PATTERN,INSN) \
+  (PATTERN) = arm_ifcvt_modify_insn ((CE), (PATTERN), (INSN))
+
 #endif /* ! GCC_ARM_H */
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 05585da..454c42f 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -24,6 +24,9 @@
 ;; changes made in armv5t as "thumb2".  These are considered part
 ;; the 16-bit Thumb-1 instruction set.
 
+;; Does an insn clobber CC if it appears in an IT block.  */
+(define_attr "it_cc" "undef,clobber,noclobber" (const_string "undef"))
+
 (define_insn "*thumb2_incscc"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r")
         (plus:SI (match_operator:SI 2 "arm_comparison_operator"
@@ -674,6 +677,7 @@
    && GET_CODE(operands[3]) != MINUS"
   "%I3%!\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
@@ -708,6 +712,7 @@
        || REG_P(operands[2]))"
   "* return arm_output_shift(operands, 2);"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "shift" "1")
    (set_attr "length" "2")
    (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
@@ -736,6 +741,7 @@
   "TARGET_THUMB2 && reload_completed"
   "mov%!\t%0, %1"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
@@ -778,6 +784,7 @@
       return \"add%!\\t%0, %1, %2\";
   "
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
@@ -789,6 +796,7 @@
   "TARGET_THUMB2 && reload_completed"
   "sub%!\\t%0, %1, %2"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
@@ -905,6 +913,7 @@
   "TARGET_THUMB2 && optimize_size && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")
    (set_attr "insn" "muls")])
 
@@ -999,6 +1008,7 @@
   "TARGET_THUMB2 && reload_completed"
   "mvn%!\t%0, %1"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
@@ -1022,6 +1032,7 @@
   "TARGET_THUMB2 && reload_completed"
   "neg%!\t%0, %1"
   [(set_attr "predicable" "yes")
+   (set_attr "it_cc" "noclobber")
    (set_attr "length" "2")]
 )
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6d41cee..7d09ddb 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10879,6 +10879,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
 being processed and about to be turned into a condition.
 @end defmac
 
+@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
+Used if the target has instructions that modify the condition
+register when executed unconditionally, but do not do so when executed
+conditionally, or vice-versa.  @var{TEST} contains the condition test, and
+@var{INSN} contains the original insn to be conditionalized.  The macro
+should return minus-one if there is to be no override, zero to disregard
+an apparent modified condition register, or any other value to indicate
+that the instruction does modify the condition register, even if it does
+not appear to.  Additionally, if conditionalizing the instruction does
+change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
+adjustments to match.
+@end defmac
+
 @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
 A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
 be converted to conditional execution format.  @var{ce_info} points to
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 396f244..52d9e96 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -10759,6 +10759,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
 being processed and about to be turned into a condition.
 @end defmac
 
+@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
+Used if the target has instructions that modify the condition
+register when executed unconditionally, but do not do so when executed
+conditionally, or vice-versa.  @var{TEST} contains the condition test, and
+@var{INSN} contains the original insn to be conditionalized.  The macro
+should return minus-one if there is to be no override, zero to disregard
+an apparent modified condition register, or any other value to indicate
+that the instruction does modify the condition register, even if it does
+not appear to.  Additionally, if conditionalizing the instruction does
+change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
+adjustments to match.
+@end defmac
+
 @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
 A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
 be converted to conditional execution format.  @var{ce_info} points to
diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index ce60ce2..ff620da 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -310,6 +310,7 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
   rtx insn;
   rtx xtest;
   rtx pattern;
+  int override_modified = -1;
 
   if (!start || !end)
     return FALSE;
@@ -338,7 +339,18 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
       if (must_be_last)
 	return FALSE;
 
-      if (modified_in_p (test, insn))
+      /* Some instructions modify CC flags when executed unconditionally
+	 but not when executed conditionally.
+         Return values:
+	    -1 = no override
+	    0 = force false
+	    other = force true.  */
+#ifdef IFCVT_OVERRIDE_MODIFIED_TEST
+      override_modified = IFCVT_OVERRIDE_MODIFIED_TEST (test, insn);
+#endif
+
+      if ((override_modified == -1 && modified_in_p (test, insn))
+	  || override_modified)
 	{
 	  if (!mod_ok)
 	    return FALSE;
diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
new file mode 100644
index 0000000..b03bbce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
@@ -0,0 +1,19 @@
+/* Check that Thumb 16-bit shifts can be if-converted.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  if (a != b)
+    {
+      a = a << b;
+      a = a >> 1;
+    }
+
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "lslne" } } */
+/* { dg-final { scan-assembler "asrne" } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM,ifcvt] Improve use of conditional execution in thumb mode.
  2012-02-14 17:00 [PATCH][ARM,ifcvt] Improve use of conditional execution in thumb mode Andrew Stubbs
@ 2012-02-14 17:32 ` Richard Earnshaw
  2012-02-14 17:40   ` Richard Earnshaw
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Earnshaw @ 2012-02-14 17:32 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches, patches

On 14/02/12 16:53, Andrew Stubbs wrote:
> Hi all,
> 
> I've discovered that GCC does not use ARM conditional execution for 
> 16-bit Thumb opcodes in many cases. It's fine for individual 
> instructions, but if-conversion of basic blocks with more than one 
> instruction fails.
> 
> E.g.
> 
> int
> foo (int a, int b)
> {
>    if (a != b)
>      {
>        a = a << b;
>        a = a >> 1;
>      }
> 
>    return a + b;
> }
> 
> The current compiler gives:
> 
> foo:
>          cmp     r0, r1
>          beq     .L2
>          lsls    r0, r0, r1
>          asrs    r0, r0, #1
> .L2:
>          adds    r0, r0, r1
>          bx      lr
> 
> With my patch I get this:
> 
> foo:
>          cmp     r0, r1
>          itt     ne
>          lslne   r0, r0, r1
>          asrne   r0, r0, #1
>          adds    r0, r0, r1
>          bx      lr
> 
> The problem comes from the fact that the compiler prefers "lsls" over 
> "lsl" because the former is a 16-bit encoding, and the latter a 32-bit 
> encoding. There's actually a peephole optimization defined to make this 
> happen wherever the CC register is not live.
> 
> This is fine in unconditional code, but the CC register clobber means 
> that it's only possible to convert it to conditional code if it is the 
> last instruction in the IT block, so if-conversion fails on the above 
> example.
> 
> My patch introduces a new target hook "IFCVT_OVERRIDE_MODIFIED_TEST" 
> that allows the CC clobber to be ignored on such instructions, and uses 
> IFCVT_MODIFY_INSN to convert from "lsls" to "lsl<c>" where possible.
> 
> I've also introduced a new instruction attribute "it_cc" to indicate 
> which instruction patterns are affected.
> 
> OK for trunk, once stage 1 reopens?
> 
> Andrew
> 

Bernds checked in a patch last year (or maybe even before that) to make
the selection of flags clobbered insns run very late (certainly after
condexec had run), so I'm not sure why you're not seeing this.

R.

> 
> ifcvt_modify_insn.patch
> 
> 
> 2012-02-14  Andrew Stubbs  <ams@codesourcery.com>
> 
> 	gcc/
> 	* config/arm/arm-protos.h (arm_ifcvt_modify_insn): New prototype.
> 	(arm_ifcvt_override_modified_test): New prototype.
> 	* config/arm/arm.c (thumb_insn_suitable_for_ifcvt): New function.
> 	(arm_ifcvt_override_modified_test): New function.
> 	(arm_ifcvt_modify_insn): New function.
> 	* config/arm/arm.h (IFCVT_OVERRIDE_MODIFIED_TEST): New macro.
> 	(IFCVT_MODIFY_INSN): New macro.
> 	* config/arm/thumb2.md (it_cc): New attribute.
> 	(thumb2_alusi3_short): Set it_cc attribute.
> 	(thumb2_shiftsi3_short, thumb2_mov<mode>_shortim): Likewise.
> 	(thumb2_addsi_short, thumb2_subsi_short): Likewise.
> 	(thumb2_mulsi_short, thumb2_one_cmplsi2_short): Likewise.
> 	(thumb2_negsi2_short): Likewise.
> 	* doc/tm.texi: Regenerate.
> 	* doc/tm.texi.in (IFCVT_OVERRIDE_MODIFIED_TEST): Document.
> 	* ifcvt.c (cond_exec_process_insns): Add IFCVT_OVERRIDE_MODIFIED_TEST.
> 
> 	gcc/testsuite/
> 	* gcc.target/arm/thumb-ifcvt.c: New test case.
> 
> ---
>  gcc/config/arm/arm-protos.h                |    5 ++
>  gcc/config/arm/arm.c                       |   67 ++++++++++++++++++++++++++++
>  gcc/config/arm/arm.h                       |    5 ++
>  gcc/config/arm/thumb2.md                   |   11 +++++
>  gcc/doc/tm.texi                            |   13 +++++
>  gcc/doc/tm.texi.in                         |   13 +++++
>  gcc/ifcvt.c                                |   14 +++++-
>  gcc/testsuite/gcc.target/arm/thumb-ifcvt.c |   19 ++++++++
>  8 files changed, 146 insertions(+), 1 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 296550a..67396f0 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -244,4 +244,9 @@ extern const struct tune_params *current_tune;
>  extern int vfp3_const_double_for_fract_bits (rtx);
>  #endif /* RTX_CODE */
>  
> +#ifdef BB_HEAD
> +extern int arm_ifcvt_override_modified_test (rtx, rtx);
> +extern rtx arm_ifcvt_modify_insn (ce_if_block_t *, rtx, rtx);
> +#endif
> +
>  #endif /* ! GCC_ARM_PROTOS_H */
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 0bded8d..803e1c9 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25139,5 +25139,72 @@ vfp3_const_double_for_fract_bits (rtx operand)
>    return 0;
>  }
>  
> +/* Find the portion of INSN that is suitable for if-conversion.
> +
> +   Some Thumb mode 16-bit instructions have two variants: one that is
> +   unconditional but clobbers the condition flags, and one that is
> +   conditional (in an IT block) and does not clobber anything.
> +
> +   There are 32-bit variants that are unconditional but don't clobber
> +   anything, so the peephole2 pass adds the clobber in order to use the
> +   smaller instruction encoding.  Unfortunately this defeats the
> +   if-conversion pass since CC must not be modified in an IT block.
> +
> +   The peephole can be reversed, and the instruction converted to
> +   conditional execution without affecting the size optimization.
> +
> +   This function detects these instructions and returns the sub-expression
> +   that contains the operation, without the clobber.  */
> +
> +static rtx
> +thumb_insn_suitable_for_ifcvt (rtx pattern, rtx insn)
> +{
> +  if (GET_CODE (pattern) == PARALLEL
> +      && XVECLEN (pattern, 0) == 2)
> +    {
> +      rtx op = XVECEXP (pattern, 0, 0);
> +      rtx clobber = XVECEXP (pattern, 0, 1);
> +
> +      if (GET_CODE (clobber) == CLOBBER
> +	  && GET_CODE (XEXP (clobber, 0)) == REG
> +	  && REGNO (XEXP (clobber, 0)) == CC_REGNUM
> +	  && get_attr_it_cc (insn) == IT_CC_NOCLOBBER)
> +	return op;
> +    }
> +
> +  return NULL_RTX;
> +}
> +
> +/* Prevent if-conversion thinking that certain Thumb1 instructions
> +   clobber CC.  These instruction in fact do not when in an IT block.  */
> +int
> +arm_ifcvt_override_modified_test (rtx test, rtx insn)
> +{
> +  if (thumb_insn_suitable_for_ifcvt (PATTERN (insn), insn) != NULL_RTX)
> +    return 0; /* Force *not* modified.  */
> +
> +  return -1;
> +}
> +
> +/* Modify insns to enable conditional execution.
> +
> +   This function removes CC clobers that impede if-conversion.  See the
> +   comments on thumb_insn_suitable_for_ifcvt.  */
> +rtx
> +arm_ifcvt_modify_insn (ce_if_block_t *ce, rtx pattern, rtx insn)
> +{
> +  rtx op, cond;
> +
> +  gcc_assert (GET_CODE (pattern) == COND_EXEC);
> +
> +  cond = XEXP (pattern, 0);
> +  op = thumb_insn_suitable_for_ifcvt (XEXP (pattern, 1), insn);
> +
> +  if (op != NULL_RTX)
> +    return gen_rtx_COND_EXEC (VOIDmode, cond, op);
> +
> +  return pattern;
> +}
> +
>  #include "gt-arm.h"
>  
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 5a78125..0b7f332 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -2220,4 +2220,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>  
>  #define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS
>  
> +#define IFCVT_OVERRIDE_MODIFIED_TEST(TEST,INSN) \
> +  arm_ifcvt_override_modified_test ((TEST), (INSN))
> +#define IFCVT_MODIFY_INSN(CE,PATTERN,INSN) \
> +  (PATTERN) = arm_ifcvt_modify_insn ((CE), (PATTERN), (INSN))
> +
>  #endif /* ! GCC_ARM_H */
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index 05585da..454c42f 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -24,6 +24,9 @@
>  ;; changes made in armv5t as "thumb2".  These are considered part
>  ;; the 16-bit Thumb-1 instruction set.
>  
> +;; Does an insn clobber CC if it appears in an IT block.  */
> +(define_attr "it_cc" "undef,clobber,noclobber" (const_string "undef"))
> +
>  (define_insn "*thumb2_incscc"
>    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>          (plus:SI (match_operator:SI 2 "arm_comparison_operator"
> @@ -674,6 +677,7 @@
>     && GET_CODE(operands[3]) != MINUS"
>    "%I3%!\\t%0, %1, %2"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> @@ -708,6 +712,7 @@
>         || REG_P(operands[2]))"
>    "* return arm_output_shift(operands, 2);"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "shift" "1")
>     (set_attr "length" "2")
>     (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
> @@ -736,6 +741,7 @@
>    "TARGET_THUMB2 && reload_completed"
>    "mov%!\t%0, %1"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> @@ -778,6 +784,7 @@
>        return \"add%!\\t%0, %1, %2\";
>    "
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> @@ -789,6 +796,7 @@
>    "TARGET_THUMB2 && reload_completed"
>    "sub%!\\t%0, %1, %2"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> @@ -905,6 +913,7 @@
>    "TARGET_THUMB2 && optimize_size && reload_completed"
>    "mul%!\\t%0, %2, %0"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")
>     (set_attr "insn" "muls")])
>  
> @@ -999,6 +1008,7 @@
>    "TARGET_THUMB2 && reload_completed"
>    "mvn%!\t%0, %1"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> @@ -1022,6 +1032,7 @@
>    "TARGET_THUMB2 && reload_completed"
>    "neg%!\t%0, %1"
>    [(set_attr "predicable" "yes")
> +   (set_attr "it_cc" "noclobber")
>     (set_attr "length" "2")]
>  )
>  
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 6d41cee..7d09ddb 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -10879,6 +10879,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
>  being processed and about to be turned into a condition.
>  @end defmac
>  
> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
> +Used if the target has instructions that modify the condition
> +register when executed unconditionally, but do not do so when executed
> +conditionally, or vice-versa.  @var{TEST} contains the condition test, and
> +@var{INSN} contains the original insn to be conditionalized.  The macro
> +should return minus-one if there is to be no override, zero to disregard
> +an apparent modified condition register, or any other value to indicate
> +that the instruction does modify the condition register, even if it does
> +not appear to.  Additionally, if conditionalizing the instruction does
> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
> +adjustments to match.
> +@end defmac
> +
>  @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>  A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>  be converted to conditional execution format.  @var{ce_info} points to
> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
> index 396f244..52d9e96 100644
> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -10759,6 +10759,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
>  being processed and about to be turned into a condition.
>  @end defmac
>  
> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
> +Used if the target has instructions that modify the condition
> +register when executed unconditionally, but do not do so when executed
> +conditionally, or vice-versa.  @var{TEST} contains the condition test, and
> +@var{INSN} contains the original insn to be conditionalized.  The macro
> +should return minus-one if there is to be no override, zero to disregard
> +an apparent modified condition register, or any other value to indicate
> +that the instruction does modify the condition register, even if it does
> +not appear to.  Additionally, if conditionalizing the instruction does
> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
> +adjustments to match.
> +@end defmac
> +
>  @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>  A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>  be converted to conditional execution format.  @var{ce_info} points to
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index ce60ce2..ff620da 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -310,6 +310,7 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
>    rtx insn;
>    rtx xtest;
>    rtx pattern;
> +  int override_modified = -1;
>  
>    if (!start || !end)
>      return FALSE;
> @@ -338,7 +339,18 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
>        if (must_be_last)
>  	return FALSE;
>  
> -      if (modified_in_p (test, insn))
> +      /* Some instructions modify CC flags when executed unconditionally
> +	 but not when executed conditionally.
> +         Return values:
> +	    -1 = no override
> +	    0 = force false
> +	    other = force true.  */
> +#ifdef IFCVT_OVERRIDE_MODIFIED_TEST
> +      override_modified = IFCVT_OVERRIDE_MODIFIED_TEST (test, insn);
> +#endif
> +
> +      if ((override_modified == -1 && modified_in_p (test, insn))
> +	  || override_modified)
>  	{
>  	  if (!mod_ok)
>  	    return FALSE;
> diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
> new file mode 100644
> index 0000000..b03bbce
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
> @@ -0,0 +1,19 @@
> +/* Check that Thumb 16-bit shifts can be if-converted.  */
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-options "-O2" } */
> +
> +int
> +foo (int a, int b)
> +{
> +  if (a != b)
> +    {
> +      a = a << b;
> +      a = a >> 1;
> +    }
> +
> +  return a + b;
> +}
> +
> +/* { dg-final { scan-assembler "lslne" } } */
> +/* { dg-final { scan-assembler "asrne" } } */


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM,ifcvt] Improve use of conditional execution in thumb mode.
  2012-02-14 17:32 ` Richard Earnshaw
@ 2012-02-14 17:40   ` Richard Earnshaw
  2012-02-14 18:02     ` [PATCH][ARM] " Andrew Stubbs
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Earnshaw @ 2012-02-14 17:40 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches, patches

On 14/02/12 17:30, Richard Earnshaw wrote:
> On 14/02/12 16:53, Andrew Stubbs wrote:
>> Hi all,
>>
>> I've discovered that GCC does not use ARM conditional execution for
>> 16-bit Thumb opcodes in many cases. It's fine for individual
>> instructions, but if-conversion of basic blocks with more than one
>> instruction fails.
>>
>> E.g.
>>
>> int
>> foo (int a, int b)
>> {
>>    if (a != b)
>>      {
>>        a = a << b;
>>        a = a >> 1;
>>      }
>>
>>    return a + b;
>> }
>>
>> The current compiler gives:
>>
>> foo:
>>          cmp     r0, r1
>>          beq     .L2
>>          lsls    r0, r0, r1
>>          asrs    r0, r0, #1
>> .L2:
>>          adds    r0, r0, r1
>>          bx      lr
>>
>> With my patch I get this:
>>
>> foo:
>>          cmp     r0, r1
>>          itt     ne
>>          lslne   r0, r0, r1
>>          asrne   r0, r0, #1
>>          adds    r0, r0, r1
>>          bx      lr
>>
>> The problem comes from the fact that the compiler prefers "lsls" over
>> "lsl" because the former is a 16-bit encoding, and the latter a 32-bit
>> encoding. There's actually a peephole optimization defined to make this
>> happen wherever the CC register is not live.
>>
>> This is fine in unconditional code, but the CC register clobber means
>> that it's only possible to convert it to conditional code if it is the
>> last instruction in the IT block, so if-conversion fails on the above
>> example.
>>
>> My patch introduces a new target hook "IFCVT_OVERRIDE_MODIFIED_TEST"
>> that allows the CC clobber to be ignored on such instructions, and uses
>> IFCVT_MODIFY_INSN to convert from "lsls" to "lsl<c>" where possible.
>>
>> I've also introduced a new instruction attribute "it_cc" to indicate
>> which instruction patterns are affected.
>>
>> OK for trunk, once stage 1 reopens?
>>
>> Andrew
>>
> 
> Bernds checked in a patch last year (or maybe even before that) to make
> the selection of flags clobbered insns run very late (certainly after
> condexec had run), so I'm not sure why you're not seeing this.
> 

Hm, you mentioned some peepholes.  Try removing them....

R.

> R.
> 
>>
>> ifcvt_modify_insn.patch
>>
>>
>> 2012-02-14  Andrew Stubbs  <ams@codesourcery.com>
>>
>>       gcc/
>>       * config/arm/arm-protos.h (arm_ifcvt_modify_insn): New prototype.
>>       (arm_ifcvt_override_modified_test): New prototype.
>>       * config/arm/arm.c (thumb_insn_suitable_for_ifcvt): New function.
>>       (arm_ifcvt_override_modified_test): New function.
>>       (arm_ifcvt_modify_insn): New function.
>>       * config/arm/arm.h (IFCVT_OVERRIDE_MODIFIED_TEST): New macro.
>>       (IFCVT_MODIFY_INSN): New macro.
>>       * config/arm/thumb2.md (it_cc): New attribute.
>>       (thumb2_alusi3_short): Set it_cc attribute.
>>       (thumb2_shiftsi3_short, thumb2_mov<mode>_shortim): Likewise.
>>       (thumb2_addsi_short, thumb2_subsi_short): Likewise.
>>       (thumb2_mulsi_short, thumb2_one_cmplsi2_short): Likewise.
>>       (thumb2_negsi2_short): Likewise.
>>       * doc/tm.texi: Regenerate.
>>       * doc/tm.texi.in (IFCVT_OVERRIDE_MODIFIED_TEST): Document.
>>       * ifcvt.c (cond_exec_process_insns): Add IFCVT_OVERRIDE_MODIFIED_TEST.
>>
>>       gcc/testsuite/
>>       * gcc.target/arm/thumb-ifcvt.c: New test case.
>>
>> ---
>>  gcc/config/arm/arm-protos.h                |    5 ++
>>  gcc/config/arm/arm.c                       |   67 ++++++++++++++++++++++++++++
>>  gcc/config/arm/arm.h                       |    5 ++
>>  gcc/config/arm/thumb2.md                   |   11 +++++
>>  gcc/doc/tm.texi                            |   13 +++++
>>  gcc/doc/tm.texi.in                         |   13 +++++
>>  gcc/ifcvt.c                                |   14 +++++-
>>  gcc/testsuite/gcc.target/arm/thumb-ifcvt.c |   19 ++++++++
>>  8 files changed, 146 insertions(+), 1 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>>
>> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>> index 296550a..67396f0 100644
>> --- a/gcc/config/arm/arm-protos.h
>> +++ b/gcc/config/arm/arm-protos.h
>> @@ -244,4 +244,9 @@ extern const struct tune_params *current_tune;
>>  extern int vfp3_const_double_for_fract_bits (rtx);
>>  #endif /* RTX_CODE */
>>
>> +#ifdef BB_HEAD
>> +extern int arm_ifcvt_override_modified_test (rtx, rtx);
>> +extern rtx arm_ifcvt_modify_insn (ce_if_block_t *, rtx, rtx);
>> +#endif
>> +
>>  #endif /* ! GCC_ARM_PROTOS_H */
>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>> index 0bded8d..803e1c9 100644
>> --- a/gcc/config/arm/arm.c
>> +++ b/gcc/config/arm/arm.c
>> @@ -25139,5 +25139,72 @@ vfp3_const_double_for_fract_bits (rtx operand)
>>    return 0;
>>  }
>>
>> +/* Find the portion of INSN that is suitable for if-conversion.
>> +
>> +   Some Thumb mode 16-bit instructions have two variants: one that is
>> +   unconditional but clobbers the condition flags, and one that is
>> +   conditional (in an IT block) and does not clobber anything.
>> +
>> +   There are 32-bit variants that are unconditional but don't clobber
>> +   anything, so the peephole2 pass adds the clobber in order to use the
>> +   smaller instruction encoding.  Unfortunately this defeats the
>> +   if-conversion pass since CC must not be modified in an IT block.
>> +
>> +   The peephole can be reversed, and the instruction converted to
>> +   conditional execution without affecting the size optimization.
>> +
>> +   This function detects these instructions and returns the sub-expression
>> +   that contains the operation, without the clobber.  */
>> +
>> +static rtx
>> +thumb_insn_suitable_for_ifcvt (rtx pattern, rtx insn)
>> +{
>> +  if (GET_CODE (pattern) == PARALLEL
>> +      && XVECLEN (pattern, 0) == 2)
>> +    {
>> +      rtx op = XVECEXP (pattern, 0, 0);
>> +      rtx clobber = XVECEXP (pattern, 0, 1);
>> +
>> +      if (GET_CODE (clobber) == CLOBBER
>> +       && GET_CODE (XEXP (clobber, 0)) == REG
>> +       && REGNO (XEXP (clobber, 0)) == CC_REGNUM
>> +       && get_attr_it_cc (insn) == IT_CC_NOCLOBBER)
>> +     return op;
>> +    }
>> +
>> +  return NULL_RTX;
>> +}
>> +
>> +/* Prevent if-conversion thinking that certain Thumb1 instructions
>> +   clobber CC.  These instruction in fact do not when in an IT block.  */
>> +int
>> +arm_ifcvt_override_modified_test (rtx test, rtx insn)
>> +{
>> +  if (thumb_insn_suitable_for_ifcvt (PATTERN (insn), insn) != NULL_RTX)
>> +    return 0; /* Force *not* modified.  */
>> +
>> +  return -1;
>> +}
>> +
>> +/* Modify insns to enable conditional execution.
>> +
>> +   This function removes CC clobers that impede if-conversion.  See the
>> +   comments on thumb_insn_suitable_for_ifcvt.  */
>> +rtx
>> +arm_ifcvt_modify_insn (ce_if_block_t *ce, rtx pattern, rtx insn)
>> +{
>> +  rtx op, cond;
>> +
>> +  gcc_assert (GET_CODE (pattern) == COND_EXEC);
>> +
>> +  cond = XEXP (pattern, 0);
>> +  op = thumb_insn_suitable_for_ifcvt (XEXP (pattern, 1), insn);
>> +
>> +  if (op != NULL_RTX)
>> +    return gen_rtx_COND_EXEC (VOIDmode, cond, op);
>> +
>> +  return pattern;
>> +}
>> +
>>  #include "gt-arm.h"
>>
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 5a78125..0b7f332 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -2220,4 +2220,9 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>>
>>  #define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS
>>
>> +#define IFCVT_OVERRIDE_MODIFIED_TEST(TEST,INSN) \
>> +  arm_ifcvt_override_modified_test ((TEST), (INSN))
>> +#define IFCVT_MODIFY_INSN(CE,PATTERN,INSN) \
>> +  (PATTERN) = arm_ifcvt_modify_insn ((CE), (PATTERN), (INSN))
>> +
>>  #endif /* ! GCC_ARM_H */
>> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
>> index 05585da..454c42f 100644
>> --- a/gcc/config/arm/thumb2.md
>> +++ b/gcc/config/arm/thumb2.md
>> @@ -24,6 +24,9 @@
>>  ;; changes made in armv5t as "thumb2".  These are considered part
>>  ;; the 16-bit Thumb-1 instruction set.
>>
>> +;; Does an insn clobber CC if it appears in an IT block.  */
>> +(define_attr "it_cc" "undef,clobber,noclobber" (const_string "undef"))
>> +
>>  (define_insn "*thumb2_incscc"
>>    [(set (match_operand:SI 0 "s_register_operand" "=r,r")
>>          (plus:SI (match_operator:SI 2 "arm_comparison_operator"
>> @@ -674,6 +677,7 @@
>>     && GET_CODE(operands[3]) != MINUS"
>>    "%I3%!\\t%0, %1, %2"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> @@ -708,6 +712,7 @@
>>         || REG_P(operands[2]))"
>>    "* return arm_output_shift(operands, 2);"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "shift" "1")
>>     (set_attr "length" "2")
>>     (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
>> @@ -736,6 +741,7 @@
>>    "TARGET_THUMB2 && reload_completed"
>>    "mov%!\t%0, %1"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> @@ -778,6 +784,7 @@
>>        return \"add%!\\t%0, %1, %2\";
>>    "
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> @@ -789,6 +796,7 @@
>>    "TARGET_THUMB2 && reload_completed"
>>    "sub%!\\t%0, %1, %2"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> @@ -905,6 +913,7 @@
>>    "TARGET_THUMB2 && optimize_size && reload_completed"
>>    "mul%!\\t%0, %2, %0"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")
>>     (set_attr "insn" "muls")])
>>
>> @@ -999,6 +1008,7 @@
>>    "TARGET_THUMB2 && reload_completed"
>>    "mvn%!\t%0, %1"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> @@ -1022,6 +1032,7 @@
>>    "TARGET_THUMB2 && reload_completed"
>>    "neg%!\t%0, %1"
>>    [(set_attr "predicable" "yes")
>> +   (set_attr "it_cc" "noclobber")
>>     (set_attr "length" "2")]
>>  )
>>
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 6d41cee..7d09ddb 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> @@ -10879,6 +10879,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
>>  being processed and about to be turned into a condition.
>>  @end defmac
>>
>> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
>> +Used if the target has instructions that modify the condition
>> +register when executed unconditionally, but do not do so when executed
>> +conditionally, or vice-versa.  @var{TEST} contains the condition test, and
>> +@var{INSN} contains the original insn to be conditionalized.  The macro
>> +should return minus-one if there is to be no override, zero to disregard
>> +an apparent modified condition register, or any other value to indicate
>> +that the instruction does modify the condition register, even if it does
>> +not appear to.  Additionally, if conditionalizing the instruction does
>> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
>> +adjustments to match.
>> +@end defmac
>> +
>>  @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>>  A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>>  be converted to conditional execution format.  @var{ce_info} points to
>> diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
>> index 396f244..52d9e96 100644
>> --- a/gcc/doc/tm.texi.in
>> +++ b/gcc/doc/tm.texi.in
>> @@ -10759,6 +10759,19 @@ if-statements into conditions combined by @code{and} and @code{or} operations.
>>  being processed and about to be turned into a condition.
>>  @end defmac
>>
>> +@defmac IFCVT_OVERRIDE_MODIFIED_TEST (@var{test}, @var{insn})
>> +Used if the target has instructions that modify the condition
>> +register when executed unconditionally, but do not do so when executed
>> +conditionally, or vice-versa.  @var{TEST} contains the condition test, and
>> +@var{INSN} contains the original insn to be conditionalized.  The macro
>> +should return minus-one if there is to be no override, zero to disregard
>> +an apparent modified condition register, or any other value to indicate
>> +that the instruction does modify the condition register, even if it does
>> +not appear to.  Additionally, if conditionalizing the instruction does
>> +change its behaviour, @code{IFCVT_MODIFY_INSN} should make the appropriate
>> +adjustments to match.
>> +@end defmac
>> +
>>  @defmac IFCVT_MODIFY_INSN (@var{ce_info}, @var{pattern}, @var{insn})
>>  A C expression to modify the @var{PATTERN} of an @var{INSN} that is to
>>  be converted to conditional execution format.  @var{ce_info} points to
>> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
>> index ce60ce2..ff620da 100644
>> --- a/gcc/ifcvt.c
>> +++ b/gcc/ifcvt.c
>> @@ -310,6 +310,7 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
>>    rtx insn;
>>    rtx xtest;
>>    rtx pattern;
>> +  int override_modified = -1;
>>
>>    if (!start || !end)
>>      return FALSE;
>> @@ -338,7 +339,18 @@ cond_exec_process_insns (ce_if_block_t *ce_info ATTRIBUTE_UNUSED,
>>        if (must_be_last)
>>       return FALSE;
>>
>> -      if (modified_in_p (test, insn))
>> +      /* Some instructions modify CC flags when executed unconditionally
>> +      but not when executed conditionally.
>> +         Return values:
>> +         -1 = no override
>> +         0 = force false
>> +         other = force true.  */
>> +#ifdef IFCVT_OVERRIDE_MODIFIED_TEST
>> +      override_modified = IFCVT_OVERRIDE_MODIFIED_TEST (test, insn);
>> +#endif
>> +
>> +      if ((override_modified == -1 && modified_in_p (test, insn))
>> +       || override_modified)
>>       {
>>         if (!mod_ok)
>>           return FALSE;
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>> new file mode 100644
>> index 0000000..b03bbce
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
>> @@ -0,0 +1,19 @@
>> +/* Check that Thumb 16-bit shifts can be if-converted.  */
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-options "-O2" } */
>> +
>> +int
>> +foo (int a, int b)
>> +{
>> +  if (a != b)
>> +    {
>> +      a = a << b;
>> +      a = a >> 1;
>> +    }
>> +
>> +  return a + b;
>> +}
>> +
>> +/* { dg-final { scan-assembler "lslne" } } */
>> +/* { dg-final { scan-assembler "asrne" } } */
> 
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-14 17:40   ` Richard Earnshaw
@ 2012-02-14 18:02     ` Andrew Stubbs
  2012-02-14 18:02       ` Andrew Stubbs
  2012-02-17 16:06       ` Andrew Stubbs
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Stubbs @ 2012-02-14 18:02 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, patches

On Tue 14 Feb 2012 17:35:36 GMT, Richard Earnshaw wrote:
>> Bernds checked in a patch last year (or maybe even before that) to make
>> the selection of flags clobbered insns run very late (certainly after
>> condexec had run), so I'm not sure why you're not seeing this.
>
> Hm, you mentioned some peepholes.  Try removing them....

Hmmm, it seems you're right. The machine reorg pass now takes care of 
this. Well ... that was easy!

Here's a patch to remove the obsolete peepholes. I've confirmed it works 
with the testcase.

OK?

Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-14 18:02     ` [PATCH][ARM] " Andrew Stubbs
@ 2012-02-14 18:02       ` Andrew Stubbs
  2012-02-14 18:40         ` Richard Earnshaw
  2012-02-17 16:06       ` Andrew Stubbs
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Stubbs @ 2012-02-14 18:02 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, patches

[-- Attachment #1: Type: text/plain, Size: 648 bytes --]

And again with the attachment.

On Tue 14 Feb 2012 18:00:57 GMT, Andrew Stubbs wrote:
> On Tue 14 Feb 2012 17:35:36 GMT, Richard Earnshaw wrote:
>>> Bernds checked in a patch last year (or maybe even before that) to make
>>> the selection of flags clobbered insns run very late (certainly after
>>> condexec had run), so I'm not sure why you're not seeing this.
>>
>> Hm, you mentioned some peepholes. Try removing them....
>
> Hmmm, it seems you're right. The machine reorg pass now takes care of
> this. Well ... that was easy!
>
> Here's a patch to remove the obsolete peepholes. I've confirmed it
> works with the testcase.
>
> OK?
>
> Andrew


[-- Attachment #2: thumb-ifcvt2.patch --]
[-- Type: text/x-patch, Size: 5797 bytes --]

2012-02-14  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/thumb2.md: Delete obsolete peepholes.

	gcc/testsuite/
	* gcc.target/arm/thumb-ifcvt.c: New test case.

---
 gcc/config/arm/thumb2.md                   |  107 ----------------------------
 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c |   19 +++++
 2 files changed, 19 insertions(+), 107 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c

diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 05585da..799a3df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -677,26 +677,6 @@
    (set_attr "length" "2")]
 )
 
-;; Similarly for 16-bit shift instructions
-;; There is no 16-bit rotate by immediate instruction.
-(define_peephole2
-  [(set (match_operand:SI   0 "low_register_operand" "")
-	(match_operator:SI  3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand" "")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "")]))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_op_dup 3
-	   [(match_dup 1)
-	    (match_dup 2)]))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_shiftsi3_short"
   [(set (match_operand:SI   0 "low_register_operand" "=l")
 	(match_operator:SI  3 "shift_operator"
@@ -715,20 +695,6 @@
 		      (const_string "alu_shift_reg")))]
 )
 
-;; 16-bit load immediate
-(define_peephole2
-  [(set (match_operand:QHSI 0 "low_register_operand" "")
-	(match_operand:QHSI 1 "const_int_operand" ""))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && (unsigned HOST_WIDE_INT) INTVAL(operands[1]) < 256"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_dup 1))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mov<mode>_shortim"
   [(set (match_operand:QHSI 0 "low_register_operand" "=l")
 	(match_operand:QHSI 1 "const_int_operand" "I"))
@@ -739,24 +705,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit add/sub immediate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(plus:SI (match_operand:SI 1 "low_register_operand" "")
-		 (match_operand:SI 2 "const_int_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((rtx_equal_p(operands[0], operands[1])
-	&& INTVAL(operands[2]) > -256 && INTVAL(operands[2]) < 256)
-       || (INTVAL(operands[2]) > -8 && INTVAL(operands[2]) < 8))"
-  [(parallel
-    [(set (match_dup 0)
-	  (plus:SI (match_dup 1)
-		   (match_dup 2)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_addsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l,l")
 	(plus:SI (match_operand:SI 1 "low_register_operand" "l,0")
@@ -868,35 +816,6 @@
    (set_attr "length" "2,4")]
 )
 
-;; 16-bit encodings of "muls" and "mul<c>".  We only use these when
-;; optimizing for size since "muls" is slow on all known
-;; implementations and since "mul<c>" will be generated by
-;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
-;; for "mul<c>" whenever possible anyhow.
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_operand:SI 1 "low_register_operand" "")
-                 (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_dup 0)
-	         (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mulsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
@@ -979,19 +898,6 @@
 	    (const_int 8)))]
 )
 
-;; 16-bit complement
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(not:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (not:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_one_cmplsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(not:SI (match_operand:SI 1 "low_register_operand" "l")))
@@ -1002,19 +908,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit negate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(neg:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (neg:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_negsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(neg:SI (match_operand:SI 1 "low_register_operand" "l")))
diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
new file mode 100644
index 0000000..b03bbce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
@@ -0,0 +1,19 @@
+/* Check that Thumb 16-bit shifts can be if-converted.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  if (a != b)
+    {
+      a = a << b;
+      a = a >> 1;
+    }
+
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "lslne" } } */
+/* { dg-final { scan-assembler "asrne" } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-14 18:02       ` Andrew Stubbs
@ 2012-02-14 18:40         ` Richard Earnshaw
  2012-02-15 12:15           ` Andrew Stubbs
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Earnshaw @ 2012-02-14 18:40 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches, patches

On 14/02/12 18:01, Andrew Stubbs wrote:
> And again with the attachment.
> 
> On Tue 14 Feb 2012 18:00:57 GMT, Andrew Stubbs wrote:
>> On Tue 14 Feb 2012 17:35:36 GMT, Richard Earnshaw wrote:
>>>> Bernds checked in a patch last year (or maybe even before that) to make
>>>> the selection of flags clobbered insns run very late (certainly after
>>>> condexec had run), so I'm not sure why you're not seeing this.
>>>
>>> Hm, you mentioned some peepholes. Try removing them....
>>
>> Hmmm, it seems you're right. The machine reorg pass now takes care of
>> this. Well ... that was easy!
>>
>> Here's a patch to remove the obsolete peepholes. I've confirmed it
>> works with the testcase.
>>
>> OK?
>>
>> Andrew
>>
>> thumb-ifcvt2.patch
>>
>>
>> 2012-02-14  Andrew Stubbs  <ams@codesourcery.com>
>>
>> 	gcc/
>> 	* config/arm/thumb2.md: Delete obsolete peepholes.
>>
>> 	gcc/testsuite/
>> 	* gcc.target/arm/thumb-ifcvt.c: New test case.
>>

s/obsolete peepholes/obsolete flag-clobbering peepholes/

Otherwise ok for stage1

The 'Similarly' that you delete suggests there might still be more of
them...

R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-14 18:40         ` Richard Earnshaw
@ 2012-02-15 12:15           ` Andrew Stubbs
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Stubbs @ 2012-02-15 12:15 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, patches

On 14/02/12 18:30, Richard Earnshaw wrote:
> s/obsolete peepholes/obsolete flag-clobbering peepholes/
>
> Otherwise ok for stage1

Thanks.

> The 'Similarly' that you delete suggests there might still be more of
> them...

You'd think, but I can't figure out what that refers to.

There are some other peepholes that might be obsolete - I've not 
checked; I only deleted the ones I recognised.

Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-14 18:02     ` [PATCH][ARM] " Andrew Stubbs
  2012-02-14 18:02       ` Andrew Stubbs
@ 2012-02-17 16:06       ` Andrew Stubbs
  2012-03-08 15:49         ` Andrew Stubbs
  1 sibling, 1 reply; 12+ messages in thread
From: Andrew Stubbs @ 2012-02-17 16:06 UTC (permalink / raw)
  Cc: Richard Earnshaw, gcc-patches, patches

[-- Attachment #1: Type: text/plain, Size: 1213 bytes --]

On 14/02/12 18:00, Andrew Stubbs wrote:
> On Tue 14 Feb 2012 17:35:36 GMT, Richard Earnshaw wrote:
>>> Bernds checked in a patch last year (or maybe even before that) to make
>>> the selection of flags clobbered insns run very late (certainly after
>>> condexec had run), so I'm not sure why you're not seeing this.
>>
>> Hm, you mentioned some peepholes. Try removing them....
>
> Hmmm, it seems you're right. The machine reorg pass now takes care of
> this. Well ... that was easy!
>
> Here's a patch to remove the obsolete peepholes. I've confirmed it works
> with the testcase.

Testing revealed that the new thumb_reorg code was not as thorough as 
the old peepholes at converting insns to 16-bit encodings.

This updated patch rewrites thumb_reorg to fix up all the missing cases, 
and adds a testcase to make sure it's all good. The parts that were in 
the old patch are unchanged.

I did initially try to insert the new cases into the old code, but the 
if conditions got way out of hand, and/or I ended up duplicating the 
active code. I've therefore recast the code to make it more scalable, 
but it boils down to the exact same thing.

I've got a full test run going again.

OK for 4.8, again?

Andrew

[-- Attachment #2: thumb-ifcvt2.patch --]
[-- Type: text/x-patch, Size: 13563 bytes --]

2012-02-17  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/arm.c (thumb2_reorg): Add complete support
	for 16-bit instructions.
	* config/arm/thumb2.md: Delete obsolete flag-clobbering peepholes.

	gcc/testsuite/
	* gcc.target/arm/thumb-16bit-ops.c: New file.
	* gcc.target/arm/thumb-ifcvt.c: New file.

---
 gcc/config/arm/arm.c                           |  132 +++++++++++++++----
 gcc/config/arm/thumb2.md                       |  107 ----------------
 gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c |  164 ++++++++++++++++++++++++
 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c     |   19 +++
 4 files changed, 286 insertions(+), 136 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0bded8d..c3a19e4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13246,47 +13246,121 @@ thumb2_reorg (void)
       FOR_BB_INSNS_REVERSE (bb, insn)
 	{
 	  if (NONJUMP_INSN_P (insn)
-	      && !REGNO_REG_SET_P (&live, CC_REGNUM))
+	      && !REGNO_REG_SET_P (&live, CC_REGNUM)
+	      && GET_CODE (PATTERN (insn)) == SET)
 	    {
+	      enum {SKIP, CONV, SWAP_CONV} action = SKIP;
 	      rtx pat = PATTERN (insn);
-	      if (GET_CODE (pat) == SET
-		  && low_register_operand (XEXP (pat, 0), SImode)
-		  && thumb_16bit_operator (XEXP (pat, 1), SImode)
-		  && low_register_operand (XEXP (XEXP (pat, 1), 0), SImode)
-		  && low_register_operand (XEXP (XEXP (pat, 1), 1), SImode))
+	      rtx dst = XEXP (pat, 0);
+	      rtx src = XEXP (pat, 1);
+	      rtx op0 = NULL_RTX, op1 = NULL_RTX;
+
+	      if (!OBJECT_P (src))
+		  op0 = XEXP (src, 0);
+
+	      if (BINARY_P (src))
+		  op1 = XEXP (src, 1);
+
+	      if (low_register_operand (dst, SImode))
 		{
-		  rtx dst = XEXP (pat, 0);
-		  rtx src = XEXP (pat, 1);
-		  rtx op0 = XEXP (src, 0);
-		  rtx op1 = (GET_RTX_CLASS (GET_CODE (src)) == RTX_COMM_ARITH
-			     ? XEXP (src, 1) : NULL);
-
-		  if (rtx_equal_p (dst, op0)
-		      || GET_CODE (src) == PLUS || GET_CODE (src) == MINUS)
+		  switch (GET_CODE (src))
 		    {
-		      rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
-		      rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
-		      rtvec vec = gen_rtvec (2, pat, clobber);
-
-		      PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
-		      INSN_CODE (insn) = -1;
+		    case PLUS:
+		    case MINUS:
+		      if (low_register_operand (op0, SImode))
+			{
+			  /* ADDS <Rd>,<Rn>,<Rm>  */
+			  if (low_register_operand (op1, SImode))
+			    action = CONV;
+			  /* ADDS <Rdn>,#<imm8>  */
+			  else if (rtx_equal_p (dst, op0)
+				   && CONST_INT_P (op1)
+				   && IN_RANGE (INTVAL (op1), 0, 255))
+			    action = CONV;
+			  /* ADDS <Rd>,<Rn>,#<imm3>  */
+			  else if (CONST_INT_P (op1)
+				   && IN_RANGE (INTVAL (op1), 0, 7))
+			    action = CONV;
+			}
+		      break;
+		    case MULT:
+		      /* MULS <Rdm>,<Rn>,<Rdm>
+			 As an exception to the rule, this is only used
+			 when optimizing for size since MULS is slow on all
+			 known implementations.  */
+		      if (!optimize_function_for_size_p (cfun))
+			break;
+		      /* else fall through.  */
+		    case AND:
+		    case IOR:
+		    case XOR:
+		      /* ANDS <Rdn>,<Rm>  */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      else if (rtx_equal_p (dst, op1)
+			       && low_register_operand (op0, SImode))
+			action = SWAP_CONV;
+		      break;
+		    case ASHIFTRT:
+		    case ASHIFT:
+		    case LSHIFTRT:
+		      /* ASRS <Rdn>,<Rm> */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      /* ASRS <Rd>,<Rm>,#<imm5> */
+		      else if (low_register_operand (op0, SImode)
+			       && CONST_INT_P (op1)
+			       && IN_RANGE (INTVAL (op1), 0, 31))
+			action = CONV;
+		      break;
+		    case ROTATERT:
+		      /* RORS <Rdn>,<Rm>  */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      break;
+		    case NOT:
+		    case NEG:
+		      /* MVNS <Rd>,<Rm>  */
+		      if (low_register_operand (op0, SImode))
+			action = CONV;
+		      break;
+		    case CONST_INT:
+		      /* MOVS <Rd>,#<imm8>  */
+		      if (CONST_INT_P (src)
+			  && IN_RANGE (INTVAL (src), 0, 255))
+			action = CONV;
+		      break;
+		    case REG:
+		      /* MOVS and MOV<c> with registers have different
+			 encodings, so are not relevant here.  */
+		      break;
+		    default:
+		      break;
 		    }
-		  /* We can also handle a commutative operation where the
-		     second operand matches the destination.  */
-		  else if (op1 && rtx_equal_p (dst, op1))
-		    {
-		      rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
-		      rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
-		      rtvec vec;
+		}
+
+	      if (action != SKIP)
+		{
+		  rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
+		  rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
+		  rtvec vec;
 
+		  if (action == SWAP_CONV)
+		    {
 		      src = copy_rtx (src);
 		      XEXP (src, 0) = op1;
 		      XEXP (src, 1) = op0;
 		      pat = gen_rtx_SET (VOIDmode, dst, src);
 		      vec = gen_rtvec (2, pat, clobber);
-		      PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
-		      INSN_CODE (insn) = -1;
 		    }
+		  else /* action == CONV */
+		    vec = gen_rtvec (2, pat, clobber);
+
+		  PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
+		  INSN_CODE (insn) = -1;
 		}
 	    }
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 05585da..799a3df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -677,26 +677,6 @@
    (set_attr "length" "2")]
 )
 
-;; Similarly for 16-bit shift instructions
-;; There is no 16-bit rotate by immediate instruction.
-(define_peephole2
-  [(set (match_operand:SI   0 "low_register_operand" "")
-	(match_operator:SI  3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand" "")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "")]))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_op_dup 3
-	   [(match_dup 1)
-	    (match_dup 2)]))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_shiftsi3_short"
   [(set (match_operand:SI   0 "low_register_operand" "=l")
 	(match_operator:SI  3 "shift_operator"
@@ -715,20 +695,6 @@
 		      (const_string "alu_shift_reg")))]
 )
 
-;; 16-bit load immediate
-(define_peephole2
-  [(set (match_operand:QHSI 0 "low_register_operand" "")
-	(match_operand:QHSI 1 "const_int_operand" ""))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && (unsigned HOST_WIDE_INT) INTVAL(operands[1]) < 256"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_dup 1))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mov<mode>_shortim"
   [(set (match_operand:QHSI 0 "low_register_operand" "=l")
 	(match_operand:QHSI 1 "const_int_operand" "I"))
@@ -739,24 +705,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit add/sub immediate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(plus:SI (match_operand:SI 1 "low_register_operand" "")
-		 (match_operand:SI 2 "const_int_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((rtx_equal_p(operands[0], operands[1])
-	&& INTVAL(operands[2]) > -256 && INTVAL(operands[2]) < 256)
-       || (INTVAL(operands[2]) > -8 && INTVAL(operands[2]) < 8))"
-  [(parallel
-    [(set (match_dup 0)
-	  (plus:SI (match_dup 1)
-		   (match_dup 2)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_addsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l,l")
 	(plus:SI (match_operand:SI 1 "low_register_operand" "l,0")
@@ -868,35 +816,6 @@
    (set_attr "length" "2,4")]
 )
 
-;; 16-bit encodings of "muls" and "mul<c>".  We only use these when
-;; optimizing for size since "muls" is slow on all known
-;; implementations and since "mul<c>" will be generated by
-;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
-;; for "mul<c>" whenever possible anyhow.
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_operand:SI 1 "low_register_operand" "")
-                 (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_dup 0)
-	         (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mulsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
@@ -979,19 +898,6 @@
 	    (const_int 8)))]
 )
 
-;; 16-bit complement
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(not:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (not:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_one_cmplsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(not:SI (match_operand:SI 1 "low_register_operand" "l")))
@@ -1002,19 +908,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit negate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(neg:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (neg:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_negsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(neg:SI (match_operand:SI 1 "low_register_operand" "l")))
diff --git a/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
new file mode 100644
index 0000000..3e43e4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
@@ -0,0 +1,164 @@
+/* Check that the compiler properly uses 16-bit encodings where available.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-Os -fno-builtin" } */
+
+int
+f (int a, int b )
+{
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r0, r1" } } */
+
+int
+g1 (int a)
+{
+  return a + 255;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r0, #255" } } */
+
+int
+g2 (int a)
+{
+  return a + 256;
+}
+
+/* { dg-final { scan-assembler "add	r0, r0, #256" } } */
+
+int
+h1 (int a, int b)
+{
+  return b + 7;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r1, #7" } } */
+
+int
+h2 (int a, int b)
+{
+  return b + 8;
+}
+
+/* { dg-final { scan-assembler "add	r0, r1, #8" } } */
+
+int
+i (int a, int b)
+{
+  return b;
+}
+
+/* { dg-final { scan-assembler "mov	r0, r1" } } */
+
+int
+j1 ()
+{
+  return 255;
+}
+
+/* { dg-final { scan-assembler "movs	r0, #255" } } */
+
+int
+j2 ()
+{
+  return 256;
+}
+
+/* { dg-final { scan-assembler "mov	r0, #256" } } */
+
+int
+k (int a, int b)
+{
+  return b << 15;
+}
+
+/* { dg-final { scan-assembler "lsls	r0, r1, #15" } } */
+
+int
+l1 (int a, int b)
+{
+  return a << b;
+}
+
+/* { dg-final { scan-assembler "lsls	r0, r0, r1" } } */
+
+int
+l2 (int a, int b, int c)
+{
+  return b << c;
+}
+
+/* { dg-final { scan-assembler "lsl	r0, r1, r2" } } */
+
+int
+m (int a, int b)
+{
+  return b >> 15;
+}
+
+/* { dg-final { scan-assembler "asrs	r0, r1, #15" } } */
+
+int
+n1 (int a, int b)
+{
+  return a >> b;
+}
+
+/* { dg-final { scan-assembler "asrs	r0, r0, r1" } } */
+
+int
+n2 (int a, int b, int c)
+{
+  return b >> c;
+}
+
+/* { dg-final { scan-assembler "asr	r0, r1, r2" } } */
+
+unsigned int
+o (unsigned int a, unsigned int b)
+{
+  return b >> 15;
+}
+
+/* { dg-final { scan-assembler "lsrs	r0, r1, #15" } } */
+
+unsigned int
+p1 (unsigned int a, unsigned int b)
+{
+  return a >> b;
+}
+
+/* { dg-final { scan-assembler "lsrs	r0, r0, r1" } } */
+
+unsigned int
+p2 (unsigned int a, unsigned int b, unsigned int c)
+{
+  return b >> c;
+}
+
+/* { dg-final { scan-assembler "lsr	r0, r1, r2" } } */
+
+int
+q (int a, int b)
+{
+  return b * a;
+}
+
+/* { dg-final { scan-assembler "muls	r0, r1, r0" } } */
+
+int
+r (int a, int b)
+{
+  return ~b;
+}
+
+/* { dg-final { scan-assembler "mvns	r0, r1" } } */
+
+int
+s (int a, int b)
+{
+  return -b;
+}
+
+/* { dg-final { scan-assembler "negs	r0, r1" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
new file mode 100644
index 0000000..b03bbce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
@@ -0,0 +1,19 @@
+/* Check that Thumb 16-bit shifts can be if-converted.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  if (a != b)
+    {
+      a = a << b;
+      a = a >> 1;
+    }
+
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "lslne" } } */
+/* { dg-final { scan-assembler "asrne" } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-02-17 16:06       ` Andrew Stubbs
@ 2012-03-08 15:49         ` Andrew Stubbs
  2012-03-19 14:31           ` Andrew Stubbs
  2012-03-19 14:49           ` Richard Earnshaw
  0 siblings, 2 replies; 12+ messages in thread
From: Andrew Stubbs @ 2012-03-08 15:49 UTC (permalink / raw)
  Cc: Richard Earnshaw, gcc-patches, patches

[-- Attachment #1: Type: text/plain, Size: 327 bytes --]

On 17/02/12 15:30, Andrew Stubbs wrote:
> I've got a full test run going again.
>
> OK for 4.8, again?

The test run revealed some bugs handling MINUS.

This update has been tested and passes a bootstrap and test with no 
regressions. Indeed, it has actually corrected a failure in 
gcc.target/arm/combine-movs.c.

OK?

Andrew

[-- Attachment #2: thumb-ifcvt2.patch --]
[-- Type: text/x-patch, Size: 14689 bytes --]

2012-03-08  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/arm/arm.c (thumb2_reorg): Add complete support
	for 16-bit instructions.
	* config/arm/thumb2.md: Delete obsolete flag-clobbering peepholes.

	gcc/testsuite/
	* gcc.target/arm/thumb-16bit-ops.c: New file.
	* gcc.target/arm/thumb-ifcvt.c: New file.

---
 gcc/config/arm/arm.c                           |  157 ++++++++++++++++---
 gcc/config/arm/thumb2.md                       |  107 -------------
 gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c |  196 ++++++++++++++++++++++++
 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c     |   19 ++
 4 files changed, 344 insertions(+), 135 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
 create mode 100644 gcc/testsuite/gcc.target/arm/thumb-ifcvt.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 0bded8d..44f99c1 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13246,47 +13246,148 @@ thumb2_reorg (void)
       FOR_BB_INSNS_REVERSE (bb, insn)
 	{
 	  if (NONJUMP_INSN_P (insn)
-	      && !REGNO_REG_SET_P (&live, CC_REGNUM))
+	      && !REGNO_REG_SET_P (&live, CC_REGNUM)
+	      && GET_CODE (PATTERN (insn)) == SET)
 	    {
+	      enum {SKIP, CONV, SWAP_CONV} action = SKIP;
 	      rtx pat = PATTERN (insn);
-	      if (GET_CODE (pat) == SET
-		  && low_register_operand (XEXP (pat, 0), SImode)
-		  && thumb_16bit_operator (XEXP (pat, 1), SImode)
-		  && low_register_operand (XEXP (XEXP (pat, 1), 0), SImode)
-		  && low_register_operand (XEXP (XEXP (pat, 1), 1), SImode))
+	      rtx dst = XEXP (pat, 0);
+	      rtx src = XEXP (pat, 1);
+	      rtx op0 = NULL_RTX, op1 = NULL_RTX;
+
+	      if (!OBJECT_P (src))
+		  op0 = XEXP (src, 0);
+
+	      if (BINARY_P (src))
+		  op1 = XEXP (src, 1);
+
+	      if (low_register_operand (dst, SImode))
 		{
-		  rtx dst = XEXP (pat, 0);
-		  rtx src = XEXP (pat, 1);
-		  rtx op0 = XEXP (src, 0);
-		  rtx op1 = (GET_RTX_CLASS (GET_CODE (src)) == RTX_COMM_ARITH
-			     ? XEXP (src, 1) : NULL);
-
-		  if (rtx_equal_p (dst, op0)
-		      || GET_CODE (src) == PLUS || GET_CODE (src) == MINUS)
+		  switch (GET_CODE (src))
 		    {
-		      rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
-		      rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
-		      rtvec vec = gen_rtvec (2, pat, clobber);
+		    case PLUS:
+		      if (low_register_operand (op0, SImode))
+			{
+			  /* ADDS <Rd>,<Rn>,<Rm>  */
+			  if (low_register_operand (op1, SImode))
+			    action = CONV;
+			  /* ADDS <Rdn>,#<imm8>  */
+			  /* SUBS <Rdn>,#<imm8>  */
+			  else if (rtx_equal_p (dst, op0)
+				   && CONST_INT_P (op1)
+				   && IN_RANGE (INTVAL (op1), -255, 255))
+			    action = CONV;
+			  /* ADDS <Rd>,<Rn>,#<imm3>  */
+			  /* SUBS <Rd>,<Rn>,#<imm3>  */
+			  else if (CONST_INT_P (op1)
+				   && IN_RANGE (INTVAL (op1), -7, 7))
+			    action = CONV;
+			}
+		      break;
+
+		    case MINUS:
+		      /* RSBS <Rd>,<Rn>,#0  
+			 Not handled here: see NEG below.  */
+		      /* SUBS <Rd>,<Rn>,#<imm3>
+			 SUBS <Rdn>,#<imm8>
+			 Not handled here: see PLUS above.  */
+		      /* SUBS <Rd>,<Rn>,<Rm>  */
+		      if (low_register_operand (op0, SImode)
+			  && low_register_operand (op1, SImode))
+			    action = CONV;
+		      break;
+
+		    case MULT:
+		      /* MULS <Rdm>,<Rn>,<Rdm>
+			 As an exception to the rule, this is only used
+			 when optimizing for size since MULS is slow on all
+			 known implementations.  We do not even want to use
+			 MULS in cold code, if optimizing for speed, so we
+			 test the global flag here.  */
+		      if (!optimize_size)
+			break;
+		      /* else fall through.  */
+		    case AND:
+		    case IOR:
+		    case XOR:
+		      /* ANDS <Rdn>,<Rm>  */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      else if (rtx_equal_p (dst, op1)
+			       && low_register_operand (op0, SImode))
+			action = SWAP_CONV;
+		      break;
+
+		    case ASHIFTRT:
+		    case ASHIFT:
+		    case LSHIFTRT:
+		      /* ASRS <Rdn>,<Rm> */
+		      /* LSRS <Rdn>,<Rm> */
+		      /* LSLS <Rdn>,<Rm> */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      /* ASRS <Rd>,<Rm>,#<imm5> */
+		      /* LSRS <Rd>,<Rm>,#<imm5> */
+		      /* LSLS <Rd>,<Rm>,#<imm5> */
+		      else if (low_register_operand (op0, SImode)
+			       && CONST_INT_P (op1)
+			       && IN_RANGE (INTVAL (op1), 0, 31))
+			action = CONV;
+		      break;
+
+		    case ROTATERT:
+		      /* RORS <Rdn>,<Rm>  */
+		      if (rtx_equal_p (dst, op0)
+			  && low_register_operand (op1, SImode))
+			action = CONV;
+		      break;
 
-		      PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
-		      INSN_CODE (insn) = -1;
+		    case NOT:
+		    case NEG:
+		      /* MVNS <Rd>,<Rm>  */
+		      /* NEGS <Rd>,<Rm>  (a.k.a RSBS)  */
+		      if (low_register_operand (op0, SImode))
+			action = CONV;
+		      break;
+
+		    case CONST_INT:
+		      /* MOVS <Rd>,#<imm8>  */
+		      if (CONST_INT_P (src)
+			  && IN_RANGE (INTVAL (src), 0, 255))
+			action = CONV;
+		      break;
+
+		    case REG:
+		      /* MOVS and MOV<c> with registers have different
+			 encodings, so are not relevant here.  */
+		      break;
+
+		    default:
+		      break;
 		    }
-		  /* We can also handle a commutative operation where the
-		     second operand matches the destination.  */
-		  else if (op1 && rtx_equal_p (dst, op1))
-		    {
-		      rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
-		      rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
-		      rtvec vec;
+		}
 
+	      if (action != SKIP)
+		{
+		  rtx ccreg = gen_rtx_REG (CCmode, CC_REGNUM);
+		  rtx clobber = gen_rtx_CLOBBER (VOIDmode, ccreg);
+		  rtvec vec;
+
+		  if (action == SWAP_CONV)
+		    {
 		      src = copy_rtx (src);
 		      XEXP (src, 0) = op1;
 		      XEXP (src, 1) = op0;
 		      pat = gen_rtx_SET (VOIDmode, dst, src);
 		      vec = gen_rtvec (2, pat, clobber);
-		      PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
-		      INSN_CODE (insn) = -1;
 		    }
+		  else /* action == CONV */
+		    vec = gen_rtvec (2, pat, clobber);
+
+		  PATTERN (insn) = gen_rtx_PARALLEL (VOIDmode, vec);
+		  INSN_CODE (insn) = -1;
 		}
 	    }
 
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 05585da..799a3df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -677,26 +677,6 @@
    (set_attr "length" "2")]
 )
 
-;; Similarly for 16-bit shift instructions
-;; There is no 16-bit rotate by immediate instruction.
-(define_peephole2
-  [(set (match_operand:SI   0 "low_register_operand" "")
-	(match_operator:SI  3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand" "")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "")]))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_op_dup 3
-	   [(match_dup 1)
-	    (match_dup 2)]))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_shiftsi3_short"
   [(set (match_operand:SI   0 "low_register_operand" "=l")
 	(match_operator:SI  3 "shift_operator"
@@ -715,20 +695,6 @@
 		      (const_string "alu_shift_reg")))]
 )
 
-;; 16-bit load immediate
-(define_peephole2
-  [(set (match_operand:QHSI 0 "low_register_operand" "")
-	(match_operand:QHSI 1 "const_int_operand" ""))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && (unsigned HOST_WIDE_INT) INTVAL(operands[1]) < 256"
-  [(parallel
-    [(set (match_dup 0)
-	  (match_dup 1))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mov<mode>_shortim"
   [(set (match_operand:QHSI 0 "low_register_operand" "=l")
 	(match_operand:QHSI 1 "const_int_operand" "I"))
@@ -739,24 +705,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit add/sub immediate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(plus:SI (match_operand:SI 1 "low_register_operand" "")
-		 (match_operand:SI 2 "const_int_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)
-   && ((rtx_equal_p(operands[0], operands[1])
-	&& INTVAL(operands[2]) > -256 && INTVAL(operands[2]) < 256)
-       || (INTVAL(operands[2]) > -8 && INTVAL(operands[2]) < 8))"
-  [(parallel
-    [(set (match_dup 0)
-	  (plus:SI (match_dup 1)
-		   (match_dup 2)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_addsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l,l")
 	(plus:SI (match_operand:SI 1 "low_register_operand" "l,0")
@@ -868,35 +816,6 @@
    (set_attr "length" "2,4")]
 )
 
-;; 16-bit encodings of "muls" and "mul<c>".  We only use these when
-;; optimizing for size since "muls" is slow on all known
-;; implementations and since "mul<c>" will be generated by
-;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
-;; for "mul<c>" whenever possible anyhow.
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_operand:SI 1 "low_register_operand" "")
-                 (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-        (mult:SI (match_dup 0)
-	         (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-           (mult:SI (match_dup 0) (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_mulsi_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
@@ -979,19 +898,6 @@
 	    (const_int 8)))]
 )
 
-;; 16-bit complement
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(not:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (not:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_one_cmplsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(not:SI (match_operand:SI 1 "low_register_operand" "l")))
@@ -1002,19 +908,6 @@
    (set_attr "length" "2")]
 )
 
-;; 16-bit negate
-(define_peephole2
-  [(set (match_operand:SI 0 "low_register_operand" "")
-	(neg:SI (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2
-   && peep2_regno_dead_p(0, CC_REGNUM)"
-  [(parallel
-    [(set (match_dup 0)
-	  (neg:SI (match_dup 1)))
-     (clobber (reg:CC CC_REGNUM))])]
-  ""
-)
-
 (define_insn "*thumb2_negsi2_short"
   [(set (match_operand:SI 0 "low_register_operand" "=l")
 	(neg:SI (match_operand:SI 1 "low_register_operand" "l")))
diff --git a/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
new file mode 100644
index 0000000..2b71238
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-16bit-ops.c
@@ -0,0 +1,196 @@
+/* Check that the compiler properly uses 16-bit encodings where available.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-Os -fno-builtin" } */
+
+int
+f (int a, int b )
+{
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r0, r1" } } */
+
+int
+g1 (int a)
+{
+  return a + 255;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r0, #255" } } */
+
+int
+g2 (int a)
+{
+  return a + 256;
+}
+
+/* { dg-final { scan-assembler "add	r0, r0, #256" } } */
+
+int
+g3 (int a)
+{
+  return a - 255;
+}
+
+/* { dg-final { scan-assembler "subs	r0, r0, #255" } } */
+
+int
+g4 (int a)
+{
+  return a - 256;
+}
+
+/* { dg-final { scan-assembler "sub	r0, r0, #256" } } */
+
+int
+h1 (int a, int b)
+{
+  return b + 7;
+}
+
+/* { dg-final { scan-assembler "adds	r0, r1, #7" } } */
+
+int
+h2 (int a, int b)
+{
+  return b + 8;
+}
+
+/* { dg-final { scan-assembler "add	r0, r1, #8" } } */
+
+int
+h3 (int a, int b)
+{
+  return b - 7;
+}
+
+/* { dg-final { scan-assembler "subs	r0, r1, #7" } } */
+
+int
+h4 (int a, int b)
+{
+  return b - 8;
+}
+
+/* { dg-final { scan-assembler "sub	r0, r1, #8" } } */
+
+int
+i (int a, int b)
+{
+  return b;
+}
+
+/* { dg-final { scan-assembler "mov	r0, r1" } } */
+
+int
+j1 ()
+{
+  return 255;
+}
+
+/* { dg-final { scan-assembler "movs	r0, #255" } } */
+
+int
+j2 ()
+{
+  return 256;
+}
+
+/* { dg-final { scan-assembler "mov	r0, #256" } } */
+
+int
+k (int a, int b)
+{
+  return b << 15;
+}
+
+/* { dg-final { scan-assembler "lsls	r0, r1, #15" } } */
+
+int
+l1 (int a, int b)
+{
+  return a << b;
+}
+
+/* { dg-final { scan-assembler "lsls	r0, r0, r1" } } */
+
+int
+l2 (int a, int b, int c)
+{
+  return b << c;
+}
+
+/* { dg-final { scan-assembler "lsl	r0, r1, r2" } } */
+
+int
+m (int a, int b)
+{
+  return b >> 15;
+}
+
+/* { dg-final { scan-assembler "asrs	r0, r1, #15" } } */
+
+int
+n1 (int a, int b)
+{
+  return a >> b;
+}
+
+/* { dg-final { scan-assembler "asrs	r0, r0, r1" } } */
+
+int
+n2 (int a, int b, int c)
+{
+  return b >> c;
+}
+
+/* { dg-final { scan-assembler "asr	r0, r1, r2" } } */
+
+unsigned int
+o (unsigned int a, unsigned int b)
+{
+  return b >> 15;
+}
+
+/* { dg-final { scan-assembler "lsrs	r0, r1, #15" } } */
+
+unsigned int
+p1 (unsigned int a, unsigned int b)
+{
+  return a >> b;
+}
+
+/* { dg-final { scan-assembler "lsrs	r0, r0, r1" } } */
+
+unsigned int
+p2 (unsigned int a, unsigned int b, unsigned int c)
+{
+  return b >> c;
+}
+
+/* { dg-final { scan-assembler "lsr	r0, r1, r2" } } */
+
+int
+q (int a, int b)
+{
+  return b * a;
+}
+
+/* { dg-final { scan-assembler "muls	r0, r1, r0" } } */
+
+int
+r (int a, int b)
+{
+  return ~b;
+}
+
+/* { dg-final { scan-assembler "mvns	r0, r1" } } */
+
+int
+s (int a, int b)
+{
+  return -b;
+}
+
+/* { dg-final { scan-assembler "negs	r0, r1" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
new file mode 100644
index 0000000..b03bbce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/thumb-ifcvt.c
@@ -0,0 +1,19 @@
+/* Check that Thumb 16-bit shifts can be if-converted.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-options "-O2" } */
+
+int
+foo (int a, int b)
+{
+  if (a != b)
+    {
+      a = a << b;
+      a = a >> 1;
+    }
+
+  return a + b;
+}
+
+/* { dg-final { scan-assembler "lslne" } } */
+/* { dg-final { scan-assembler "asrne" } } */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-03-08 15:49         ` Andrew Stubbs
@ 2012-03-19 14:31           ` Andrew Stubbs
  2012-03-19 14:49           ` Richard Earnshaw
  1 sibling, 0 replies; 12+ messages in thread
From: Andrew Stubbs @ 2012-03-19 14:31 UTC (permalink / raw)
  Cc: Richard Earnshaw, gcc-patches, patches

Ping.

On 08/03/12 15:48, Andrew Stubbs wrote:
> On 17/02/12 15:30, Andrew Stubbs wrote:
>> I've got a full test run going again.
>>
>> OK for 4.8, again?
>
> The test run revealed some bugs handling MINUS.
>
> This update has been tested and passes a bootstrap and test with no
> regressions. Indeed, it has actually corrected a failure in
> gcc.target/arm/combine-movs.c.
>
> OK?
>
> Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-03-08 15:49         ` Andrew Stubbs
  2012-03-19 14:31           ` Andrew Stubbs
@ 2012-03-19 14:49           ` Richard Earnshaw
  2012-03-21 10:42             ` Andrew Stubbs
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Earnshaw @ 2012-03-19 14:49 UTC (permalink / raw)
  To: Andrew Stubbs; +Cc: gcc-patches, patches

On 08/03/12 15:48, Andrew Stubbs wrote:
> On 17/02/12 15:30, Andrew Stubbs wrote:
>> I've got a full test run going again.
>>
>> OK for 4.8, again?
> 
> The test run revealed some bugs handling MINUS.
> 
> This update has been tested and passes a bootstrap and test with no 
> regressions. Indeed, it has actually corrected a failure in 
> gcc.target/arm/combine-movs.c.
> 
> OK?
> 

OK.

R.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH][ARM] Improve use of conditional execution in thumb mode.
  2012-03-19 14:49           ` Richard Earnshaw
@ 2012-03-21 10:42             ` Andrew Stubbs
  0 siblings, 0 replies; 12+ messages in thread
From: Andrew Stubbs @ 2012-03-21 10:42 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches, patches

On 19/03/12 14:48, Richard Earnshaw wrote:
> OK.

Committed.

Andrew

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-03-21 10:42 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-14 17:00 [PATCH][ARM,ifcvt] Improve use of conditional execution in thumb mode Andrew Stubbs
2012-02-14 17:32 ` Richard Earnshaw
2012-02-14 17:40   ` Richard Earnshaw
2012-02-14 18:02     ` [PATCH][ARM] " Andrew Stubbs
2012-02-14 18:02       ` Andrew Stubbs
2012-02-14 18:40         ` Richard Earnshaw
2012-02-15 12:15           ` Andrew Stubbs
2012-02-17 16:06       ` Andrew Stubbs
2012-03-08 15:49         ` Andrew Stubbs
2012-03-19 14:31           ` Andrew Stubbs
2012-03-19 14:49           ` Richard Earnshaw
2012-03-21 10:42             ` Andrew Stubbs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).