public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
@ 2012-11-16  5:38 Bin Cheng
  0 siblings, 0 replies; 7+ messages in thread
From: Bin Cheng @ 2012-11-16  5:38 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2262 bytes --]

Hi,
This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and prefers
short circuit for armv6-m and Thumb2+Os.

I tested the patch on arm-none-eabi on armv6-m/Thumb2 for both Os/O2. The
patch introduces new fails on ARMv6-m:

gcc/testsuite/gcc.dg/pr19105.c
gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c
gcc/testsuite/gcc.dg/tree-ssa/vrp47.c
	These tests depends on non-short-circuit IR, which is not generated
because we prefer short circuit on ARMv6-m now. I modified the tests to skip
them on ARMv6-m.

gcc/testsuite/gcc.dg/uninit-pred-8_b.c
	Root cause: tree-ssa-uninit.c computes control dependent chain for
uses/def of variable and checks whether each use is guarded by def. It has a
upper bound on the number of control dependent chains(MAX_NUM_CHAINS==8) and
just retreat to false warning if the number of chains exceeds
MAX_NUM_CHAINS. In our scenario, the number of chains exceeds MAX_NUM_CHAINS
because we prefer short circuit now, resulting in false warning information.
These false warning cannot be fully removed if the MAX_NUM_CHAINS exists,
but we can improve it in following way: There are lots of invalid control
dependent chains computed in tree-ssa-uninit.c now and should be pruned. I
have already implemented a quick fix and it works for our scenario. I can
take this problem once I get some time.


This is an backend patch and not a bug fix, I am not sure whether it can go
in trunk or I have to wait for GCC4.9?

Thanks

2012-11-16  Bin Cheng  <bin.cheng@arm.com>

	* config/arm/arm-cores.def (cortex-m1, cortex-m0)
	(cortex-m0plus): Use v6m.
	* config/arm/arm-protos.h (tune_params): Add
	logical_op_non_short_circuit.
	* config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
	(arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune)
	(arm_cortex_tune, arm_cortex_a15_tune, arm_cortex_a5_tune)
	(arm_cortex_a9_tune, arm_fa726te_tune): Set
	logical_op_non_short_circuit field.
	(arm_v6m_tune): New tune_params struct.
	* config/arm/arm.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog
2012-11-16  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/pr19105.c: Skip on armv6-m.
	* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Ditto.
	* gcc.dg/tree-ssa/vrp47.c: Ditto.

[-- Attachment #2: short-circuit-20121115.txt --]
[-- Type: text/plain, Size: 9949 bytes --]

Index: gcc/testsuite/gcc.dg/pr19105.c
===================================================================
--- gcc/testsuite/gcc.dg/pr19105.c	(revision 193494)
+++ gcc/testsuite/gcc.dg/pr19105.c	(working copy)
@@ -1,6 +1,10 @@
 /* PR tree-optimization/19105 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-reassoc1-details" } */
+/* Skip on ARM Cortex-M0, where LOGICAL_OP_NON_SHORT_CIRCUIT is set to false,
+   leading to two conditional jumps when evaluating an && condition,  causing
+   the case failed.  */
+/* { dg-skip-if "" { arm_cortex_m && arm_thumb1} } */
 
 enum e
 {
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c	(revision 193494)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c	(working copy)
@@ -59,7 +59,9 @@ bitmap_ior_and_compl (bitmap dst, const_bitmap a,
    code we missed the edge when the first conditional is false
    (b_elt is zero, which means the second conditional is always
    zero.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 3 "dom1" { target { ! mips*-*-* } } } } */
+/* ARM Cortex-M0 defined LOGICAL_OP_NON_SHORT_CIRCUIT to false,
+   so skip below test.  */
+/* { dg-final { scan-tree-dump-times "Threaded" 3 "dom1" { target { ! { mips*-*-* || { arm_cortex_m && arm_thumb1 } } } } } } */
 /* MIPS defines LOGICAL_OP_NON_SHORT_CIRCUIT to 0, so we split var1 || var2
    into two conditions, rather than use (var1 != 0) | (var2 != 0).  */
 /* { dg-final { scan-tree-dump-times "Threaded" 4 "dom1" { target mips*-*-* } } } */
Index: gcc/testsuite/gcc.dg/tree-ssa/vrp47.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/vrp47.c	(revision 193494)
+++ gcc/testsuite/gcc.dg/tree-ssa/vrp47.c	(working copy)
@@ -6,6 +6,10 @@
 /* { dg-do compile { target { ! "mips*-*-* s390*-*-*  avr-*-* mn10300-*-*" } } } */
 /* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-dom1 -fdump-tree-dom2" } */
 /* { dg-additional-options "-march=i586" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
+/* Skip on ARM Cortex-M0, where LOGICAL_OP_NON_SHORT_CIRCUIT is set to false,
+   leading to two conditional jumps when evaluating an && condition.  VRP is
+   not able to optimize this.  */
+/* { dg-skip-if "" { arm_cortex_m && arm_thumb1} } */
 
 int h(int x, int y)
 {
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 193494)
+++ gcc/config/arm/arm.c	(working copy)
@@ -893,7 +893,8 @@ const struct tune_params arm_slowmul_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_fastmul_tune =
@@ -905,7 +906,8 @@ const struct tune_params arm_fastmul_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 /* StrongARM has early execution of branches, so a sequence that is worth
@@ -920,7 +922,8 @@ const struct tune_params arm_strongarm_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_xscale_tune =
@@ -932,7 +935,8 @@ const struct tune_params arm_xscale_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_9e_tune =
@@ -944,7 +948,8 @@ const struct tune_params arm_9e_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_v6t2_tune =
@@ -956,7 +961,8 @@ const struct tune_params arm_v6t2_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   false,					/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 /* Generic Cortex tuning.  Use more specific tunings if appropriate.  */
@@ -969,7 +975,8 @@ const struct tune_params arm_cortex_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   false,					/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_cortex_a15_tune =
@@ -981,7 +988,8 @@ const struct tune_params arm_cortex_a15_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   false,					/* Prefer constant pool.  */
   arm_default_branch_cost,
-  true                                          /* Prefer LDRD/STRD.  */
+  true,						/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
@@ -996,7 +1004,8 @@ const struct tune_params arm_cortex_a5_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   false,					/* Prefer constant pool.  */
   arm_cortex_a5_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {false, false},				/* Prefer non short circuit.  */
 };
 
 const struct tune_params arm_cortex_a9_tune =
@@ -1008,9 +1017,25 @@ const struct tune_params arm_cortex_a9_tune =
   ARM_PREFETCH_BENEFICIAL(4,32,32),
   false,					/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
+/* The arm_v6m_tune is duplicated from arm_cortex_tune, rather than
+   arm_v6t2_tune. It is used for cortex-m0, cortex-m1 and cortex-m0plus.  */
+const struct tune_params arm_v6m_tune =
+{
+  arm_9e_rtx_costs,
+  NULL,
+  1,						/* Constant limit.  */
+  5,						/* Max cond insns.  */
+  ARM_PREFETCH_NOT_BENEFICIAL,
+  false,					/* Prefer constant pool.  */
+  arm_default_branch_cost,
+  false,					/* Prefer LDRD/STRD.  */
+  {false, false},				/* Prefer non short circuit.  */
+};
+
 const struct tune_params arm_fa726te_tune =
 {
   arm_9e_rtx_costs,
@@ -1020,7 +1045,8 @@ const struct tune_params arm_fa726te_tune =
   ARM_PREFETCH_NOT_BENEFICIAL,
   true,						/* Prefer constant pool.  */
   arm_default_branch_cost,
-  false                                         /* Prefer LDRD/STRD.  */
+  false,					/* Prefer LDRD/STRD.  */
+  {true, true},					/* Prefer non short circuit.  */
 };
 
 
Index: gcc/config/arm/arm.h
===================================================================
--- gcc/config/arm/arm.h	(revision 193494)
+++ gcc/config/arm/arm.h	(working copy)
@@ -2012,10 +2012,16 @@ enum arm_auto_incmodes
    || (X) == arg_pointer_rtx)
 
 /* Try to generate sequences that don't involve branches, we can then use
-   conditional instructions */
+   conditional instructions.  */
 #define BRANCH_COST(speed_p, predictable_p) \
   (current_tune->branch_cost (speed_p, predictable_p))
 
+/* False if short circuit operation is preferred.  */
+#define LOGICAL_OP_NON_SHORT_CIRCUIT				\
+  ((optimize_size)						\
+   ? (TARGET_THUMB ? false : true)				\
+   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
+
 \f
 /* Position Independent Code.  */
 /* We decide which register to use based on the compilation options and
Index: gcc/config/arm/arm-cores.def
===================================================================
--- gcc/config/arm/arm-cores.def	(revision 193494)
+++ gcc/config/arm/arm-cores.def	(working copy)
@@ -135,6 +135,6 @@ ARM_CORE("cortex-r4f",	  cortexr4f,	7R,				 FL_LDS
 ARM_CORE("cortex-r5",	  cortexr5,	7R,				 FL_LDSCHED | FL_ARM_DIV, cortex)
 ARM_CORE("cortex-m4",	  cortexm4,	7EM,				 FL_LDSCHED, cortex)
 ARM_CORE("cortex-m3",	  cortexm3,	7M,				 FL_LDSCHED, cortex)
-ARM_CORE("cortex-m1",	  cortexm1,	6M,				 FL_LDSCHED, cortex)
-ARM_CORE("cortex-m0",	  cortexm0,	6M,				 FL_LDSCHED, cortex)
-ARM_CORE("cortex-m0plus", cortexm0plus,	6M,				 FL_LDSCHED, cortex)
+ARM_CORE("cortex-m1",	  cortexm1,	6M,				 FL_LDSCHED, v6m)
+ARM_CORE("cortex-m0",	  cortexm0,	6M,				 FL_LDSCHED, v6m)
+ARM_CORE("cortex-m0plus", cortexm0plus,	6M,				 FL_LDSCHED, v6m)
Index: gcc/config/arm/arm-protos.h
===================================================================
--- gcc/config/arm/arm-protos.h	(revision 193494)
+++ gcc/config/arm/arm-protos.h	(working copy)
@@ -243,6 +243,10 @@ struct tune_params
   int (*branch_cost) (bool, bool);
   /* Prefer STRD/LDRD instructions over PUSH/POP/LDM/STM.  */
   bool prefer_ldrd_strd;
+  /* The preference for non short cirtcuit operation when optimizing for
+     performance. The first element covers Thumb state and the second one
+     is for ARM state.  */
+  bool logical_op_non_short_circuit[2];
 };
 
 extern const struct tune_params *current_tune;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
  2012-11-20 22:01 ` Ramana Radhakrishnan
@ 2012-11-21  3:54   ` Bin Cheng
  0 siblings, 0 replies; 7+ messages in thread
From: Bin Cheng @ 2012-11-21  3:54 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches



> -----Original Message-----
> From: Ramana Radhakrishnan [mailto:ramana.gcc@googlemail.com]
> Sent: Wednesday, November 21, 2012 6:02 AM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
> 
> On Fri, Nov 16, 2012 at 5:37 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> > Hi,
> > This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and
> > prefers short circuit for armv6-m and Thumb2+Os.
> >
> > I tested the patch on arm-none-eabi on armv6-m/Thumb2 for both Os/O2.
> > The patch introduces new fails on ARMv6-m:
> >
> > gcc/testsuite/gcc.dg/pr19105.c
> > gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c
> > gcc/testsuite/gcc.dg/tree-ssa/vrp47.c
> >         These tests depends on non-short-circuit IR, which is not
> > generated because we prefer short circuit on ARMv6-m now. I modified
> > the tests to skip them on ARMv6-m.
> >
> > gcc/testsuite/gcc.dg/uninit-pred-8_b.c
> >         Root cause: tree-ssa-uninit.c computes control dependent chain
> > for uses/def of variable and checks whether each use is guarded by
> > def. It has a upper bound on the number of control dependent
> > chains(MAX_NUM_CHAINS==8) and just retreat to false warning if the
> > number of chains exceeds MAX_NUM_CHAINS. In our scenario, the number
> > of chains exceeds MAX_NUM_CHAINS because we prefer short circuit now,
> resulting in false warning information.
> > These false warning cannot be fully removed if the MAX_NUM_CHAINS
> > exists, but we can improve it in following way: There are lots of
> > invalid control dependent chains computed in tree-ssa-uninit.c now and
> > should be pruned. I have already implemented a quick fix and it works
> > for our scenario. I can take this problem once I get some time.
> 
> Please keep track of this in a separate bugzilla entry.
I will file a bug about this later.

> 
> >
> >
> > This is an backend patch and not a bug fix, I am not sure whether it
> > can go in trunk or I have to wait for GCC4.9?
> 
> 
> OK for trunk.
> 
> Ramana
> 
> >
> > Thanks
> >
> > 2012-11-16  Bin Cheng  <bin.cheng@arm.com>
> >
> >         * config/arm/arm-cores.def (cortex-m1, cortex-m0)
> >         (cortex-m0plus): Use v6m.
> >         * config/arm/arm-protos.h (tune_params): Add
> >         logical_op_non_short_circuit.
> >         * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
> >         (arm_strongarm_tune, arm_xscale_tune, arm_9e_tune,
arm_v6t2_tune)
> >         (arm_cortex_tune, arm_cortex_a15_tune, arm_cortex_a5_tune)
> >         (arm_cortex_a9_tune, arm_fa726te_tune): Set
> >         logical_op_non_short_circuit field.
> >         (arm_v6m_tune): New tune_params struct.
> >         * config/arm/arm.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
> >
> > gcc/testsuite/ChangeLog
> > 2012-11-16  Bin Cheng  <bin.cheng@arm.com>
> >
> >         * gcc.dg/pr19105.c: Skip on armv6-m.
> >         * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Ditto.
> >         * gcc.dg/tree-ssa/vrp47.c: Ditto.

Committed as r193687 in TRUNK.
I did not change test case gcc.dg/pr19105.c because Jakub had a patch also
fixing this and already got approved/committed. See
http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00361.html

Thanks



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
       [not found] <50a5d14b.a750420a.35e3.2d81SMTPIN_ADDED@mx.google.com>
@ 2012-11-20 22:01 ` Ramana Radhakrishnan
  2012-11-21  3:54   ` Bin Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Ramana Radhakrishnan @ 2012-11-20 22:01 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches

On Fri, Nov 16, 2012 at 5:37 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and prefers
> short circuit for armv6-m and Thumb2+Os.
>
> I tested the patch on arm-none-eabi on armv6-m/Thumb2 for both Os/O2. The
> patch introduces new fails on ARMv6-m:
>
> gcc/testsuite/gcc.dg/pr19105.c
> gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c
> gcc/testsuite/gcc.dg/tree-ssa/vrp47.c
>         These tests depends on non-short-circuit IR, which is not generated
> because we prefer short circuit on ARMv6-m now. I modified the tests to skip
> them on ARMv6-m.
>
> gcc/testsuite/gcc.dg/uninit-pred-8_b.c
>         Root cause: tree-ssa-uninit.c computes control dependent chain for
> uses/def of variable and checks whether each use is guarded by def. It has a
> upper bound on the number of control dependent chains(MAX_NUM_CHAINS==8) and
> just retreat to false warning if the number of chains exceeds
> MAX_NUM_CHAINS. In our scenario, the number of chains exceeds MAX_NUM_CHAINS
> because we prefer short circuit now, resulting in false warning information.
> These false warning cannot be fully removed if the MAX_NUM_CHAINS exists,
> but we can improve it in following way: There are lots of invalid control
> dependent chains computed in tree-ssa-uninit.c now and should be pruned. I
> have already implemented a quick fix and it works for our scenario. I can
> take this problem once I get some time.

Please keep track of this in a separate bugzilla entry.

>
>
> This is an backend patch and not a bug fix, I am not sure whether it can go
> in trunk or I have to wait for GCC4.9?


OK for trunk.

Ramana

>
> Thanks
>
> 2012-11-16  Bin Cheng  <bin.cheng@arm.com>
>
>         * config/arm/arm-cores.def (cortex-m1, cortex-m0)
>         (cortex-m0plus): Use v6m.
>         * config/arm/arm-protos.h (tune_params): Add
>         logical_op_non_short_circuit.
>         * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune)
>         (arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune)
>         (arm_cortex_tune, arm_cortex_a15_tune, arm_cortex_a5_tune)
>         (arm_cortex_a9_tune, arm_fa726te_tune): Set
>         logical_op_non_short_circuit field.
>         (arm_v6m_tune): New tune_params struct.
>         * config/arm/arm.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.
>
> gcc/testsuite/ChangeLog
> 2012-11-16  Bin Cheng  <bin.cheng@arm.com>
>
>         * gcc.dg/pr19105.c: Skip on armv6-m.
>         * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Ditto.
>         * gcc.dg/tree-ssa/vrp47.c: Ditto.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
  2012-11-19 12:20     ` Matthew Gretton-Dann
@ 2012-11-20  5:18       ` Bin Cheng
  0 siblings, 0 replies; 7+ messages in thread
From: Bin Cheng @ 2012-11-20  5:18 UTC (permalink / raw)
  To: 'Matthew Gretton-Dann'; +Cc: gcc-patches



> -----Original Message-----
> From: Matthew Gretton-Dann [mailto:matthew.gretton-dann@linaro.org]
> Sent: Monday, November 19, 2012 8:20 PM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
> 
> On 16 November 2012 12:22, Bin Cheng <bin.cheng@arm.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Matthew Gretton-Dann [mailto:matthew.gretton-dann@linaro.org]
> >> Sent: Friday, November 16, 2012 6:30 PM
> >> To: Bin Cheng
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM
> >> target
> >>
> >> On 16 November 2012 05:37, Bin Cheng <bin.cheng@arm.com> wrote:
> >> > Hi,
> >> > This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and
> >> > prefers short circuit for armv6-m and Thumb2+Os.
> >> >
> >>
> >> > ===================================================================
> >> > --- gcc/config/arm/arm.h (revision 193494)
> >> > +++ gcc/config/arm/arm.h (working copy)
> >> > @@ -2012,10 +2012,16 @@ enum arm_auto_incmodes
> >> >   || (X) == arg_pointer_rtx)
> >> >
> >> > /* Try to generate sequences that don't involve branches, we can
> >> > then
> > use
> >> > -   conditional instructions */
> >> > +   conditional instructions.  */
> >> > #define BRANCH_COST(speed_p, predictable_p) \
> >> >   (current_tune->branch_cost (speed_p, predictable_p))
> >> >
> >> > +/* False if short circuit operation is preferred.  */ #define
> >> > +LOGICAL_OP_NON_SHORT_CIRCUIT \
> >> > +  ((optimize_size) \
> >> > +   ? (TARGET_THUMB ? false : true) \
> >> > +   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
> >> > +
> >>
> >> This changes the definition of LOGICAL_OP_NON_SHORT_CIRCUIT for all
> >> cores supported by the ARM backend.
> >>
> >> In gcc/fold-const.c LOGICAL_OP_NON_SHORT_CIRCUIT is defined as follows:
> >>
> >> #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
> >> #define LOGICAL_OP_NON_SHORT_CIRCUIT \
> >>   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
> >> false) >= 2)
> >> #endif
> >>
> >> Now whilst this is probably wrong for most ARM cores, can you please
> >> keep
> > it
> >> as the default for cores which you haven't benchmarked the change on?
> >> The optimise for code size changes are probably on all cores without
> >> further testing.
> >>
> >
> > Thanks for your comments,
> > I am not sure what's the meaning of "probably wrong for most ARM
> > cores",
> 
> I meant that the default definition is not the best definition for ARM
cores.
> 
> > I
> > deduced the value of field "logical_op_non_short_circuit" from the
> > previously default macro and the BRANCH_COST for all arm tune_params,
> > so this patch should not change the behavior on ARM cores other than
> > v6m. Or did I miss something?
> 
> My issue was that you had changed all the ARM backend to not have the
> 'default' behaviour (in the sense that if the default was changed in fold-
> const.c then the ARM backend would no longer pick this up).
> However, this is of course not possible to achieve as once you've defined
the
> hook you have to use it for all cores in the ARM backend.
Yes, I think it's one of purposes to break the relation between
LOGICAL_OP_NON_SHORT_CIRCUIT and BRANCH_COST for ARM target, right?

Thanks




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
       [not found]   ` <50a62ff4.0258d80a.4f8f.ffffe7c3SMTPIN_ADDED@mx.google.com>
@ 2012-11-19 12:20     ` Matthew Gretton-Dann
  2012-11-20  5:18       ` Bin Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Gretton-Dann @ 2012-11-19 12:20 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches

On 16 November 2012 12:22, Bin Cheng <bin.cheng@arm.com> wrote:
>
>
>> -----Original Message-----
>> From: Matthew Gretton-Dann [mailto:matthew.gretton-dann@linaro.org]
>> Sent: Friday, November 16, 2012 6:30 PM
>> To: Bin Cheng
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
>>
>> On 16 November 2012 05:37, Bin Cheng <bin.cheng@arm.com> wrote:
>> > Hi,
>> > This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and
>> > prefers short circuit for armv6-m and Thumb2+Os.
>> >
>>
>> > ===================================================================
>> > --- gcc/config/arm/arm.h (revision 193494)
>> > +++ gcc/config/arm/arm.h (working copy)
>> > @@ -2012,10 +2012,16 @@ enum arm_auto_incmodes
>> >   || (X) == arg_pointer_rtx)
>> >
>> > /* Try to generate sequences that don't involve branches, we can then
> use
>> > -   conditional instructions */
>> > +   conditional instructions.  */
>> > #define BRANCH_COST(speed_p, predictable_p) \
>> >   (current_tune->branch_cost (speed_p, predictable_p))
>> >
>> > +/* False if short circuit operation is preferred.  */ #define
>> > +LOGICAL_OP_NON_SHORT_CIRCUIT \
>> > +  ((optimize_size) \
>> > +   ? (TARGET_THUMB ? false : true) \
>> > +   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
>> > +
>>
>> This changes the definition of LOGICAL_OP_NON_SHORT_CIRCUIT for all cores
>> supported by the ARM backend.
>>
>> In gcc/fold-const.c LOGICAL_OP_NON_SHORT_CIRCUIT is defined as follows:
>>
>> #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
>> #define LOGICAL_OP_NON_SHORT_CIRCUIT \
>>   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
>> false) >= 2)
>> #endif
>>
>> Now whilst this is probably wrong for most ARM cores, can you please keep
> it
>> as the default for cores which you haven't benchmarked the change on?  The
>> optimise for code size changes are probably on all cores without further
>> testing.
>>
>
> Thanks for your comments,
> I am not sure what's the meaning of "probably wrong for most ARM cores",

I meant that the default definition is not the best definition for ARM cores.

> I
> deduced the value of field "logical_op_non_short_circuit" from the
> previously default macro and the BRANCH_COST for all arm tune_params, so
> this patch should not change the behavior on ARM cores other than v6m. Or
> did I miss something?

My issue was that you had changed all the ARM backend to not have the
'default' behaviour (in the sense that if the default was changed in
fold-const.c then the ARM backend would no longer pick this up).
However, this is of course not possible to achieve as once you've
defined the hook you have to use it for all cores in the ARM backend.
Sorry.

Thanks,

Matt

--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann@linaro.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
  2012-11-16 10:30 ` Matthew Gretton-Dann
@ 2012-11-16 12:22   ` Bin Cheng
       [not found]   ` <50a62ff4.0258d80a.4f8f.ffffe7c3SMTPIN_ADDED@mx.google.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Bin Cheng @ 2012-11-16 12:22 UTC (permalink / raw)
  To: 'Matthew Gretton-Dann'; +Cc: gcc-patches



> -----Original Message-----
> From: Matthew Gretton-Dann [mailto:matthew.gretton-dann@linaro.org]
> Sent: Friday, November 16, 2012 6:30 PM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
> 
> On 16 November 2012 05:37, Bin Cheng <bin.cheng@arm.com> wrote:
> > Hi,
> > This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and
> > prefers short circuit for armv6-m and Thumb2+Os.
> >
> 
> > ===================================================================
> > --- gcc/config/arm/arm.h (revision 193494)
> > +++ gcc/config/arm/arm.h (working copy)
> > @@ -2012,10 +2012,16 @@ enum arm_auto_incmodes
> >   || (X) == arg_pointer_rtx)
> >
> > /* Try to generate sequences that don't involve branches, we can then
use
> > -   conditional instructions */
> > +   conditional instructions.  */
> > #define BRANCH_COST(speed_p, predictable_p) \
> >   (current_tune->branch_cost (speed_p, predictable_p))
> >
> > +/* False if short circuit operation is preferred.  */ #define
> > +LOGICAL_OP_NON_SHORT_CIRCUIT \
> > +  ((optimize_size) \
> > +   ? (TARGET_THUMB ? false : true) \
> > +   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
> > +
> 
> This changes the definition of LOGICAL_OP_NON_SHORT_CIRCUIT for all cores
> supported by the ARM backend.
> 
> In gcc/fold-const.c LOGICAL_OP_NON_SHORT_CIRCUIT is defined as follows:
> 
> #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
> #define LOGICAL_OP_NON_SHORT_CIRCUIT \
>   (BRANCH_COST (optimize_function_for_speed_p (cfun), \
> false) >= 2)
> #endif
> 
> Now whilst this is probably wrong for most ARM cores, can you please keep
it
> as the default for cores which you haven't benchmarked the change on?  The
> optimise for code size changes are probably on all cores without further
> testing.
> 

Thanks for your comments,
I am not sure what's the meaning of "probably wrong for most ARM cores", I
deduced the value of field "logical_op_non_short_circuit" from the
previously default macro and the BRANCH_COST for all arm tune_params, so
this patch should not change the behavior on ARM cores other than v6m. Or
did I miss something?

Thanks.



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target
       [not found] <50a5d145.a3e8440a.034d.288dSMTPIN_ADDED@mx.google.com>
@ 2012-11-16 10:30 ` Matthew Gretton-Dann
  2012-11-16 12:22   ` Bin Cheng
       [not found]   ` <50a62ff4.0258d80a.4f8f.ffffe7c3SMTPIN_ADDED@mx.google.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Gretton-Dann @ 2012-11-16 10:30 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches

On 16 November 2012 05:37, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> This patch defines LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target and prefers
> short circuit for armv6-m and Thumb2+Os.
>

> ===================================================================
> --- gcc/config/arm/arm.h (revision 193494)
> +++ gcc/config/arm/arm.h (working copy)
> @@ -2012,10 +2012,16 @@ enum arm_auto_incmodes
>   || (X) == arg_pointer_rtx)
>
> /* Try to generate sequences that don't involve branches, we can then use
> -   conditional instructions */
> +   conditional instructions.  */
> #define BRANCH_COST(speed_p, predictable_p) \
>   (current_tune->branch_cost (speed_p, predictable_p))
>
> +/* False if short circuit operation is preferred.  */
> +#define LOGICAL_OP_NON_SHORT_CIRCUIT \
> +  ((optimize_size) \
> +   ? (TARGET_THUMB ? false : true) \
> +   : (current_tune->logical_op_non_short_circuit[TARGET_ARM]))
> +

This changes the definition of LOGICAL_OP_NON_SHORT_CIRCUIT for all
cores supported by the ARM backend.

In gcc/fold-const.c LOGICAL_OP_NON_SHORT_CIRCUIT is defined as follows:

#ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
#define LOGICAL_OP_NON_SHORT_CIRCUIT \
  (BRANCH_COST (optimize_function_for_speed_p (cfun), \
false) >= 2)
#endif

Now whilst this is probably wrong for most ARM cores, can you please
keep it as the default for cores which you haven't benchmarked the
change on?  The optimise for code size changes are probably on all
cores without further testing.

Thanks,

Matt

--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann@linaro.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-11-21  3:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-16  5:38 [PATCH ARM]Define LOGICAL_OP_NON_SHORT_CIRCUIT for ARM target Bin Cheng
     [not found] <50a5d145.a3e8440a.034d.288dSMTPIN_ADDED@mx.google.com>
2012-11-16 10:30 ` Matthew Gretton-Dann
2012-11-16 12:22   ` Bin Cheng
     [not found]   ` <50a62ff4.0258d80a.4f8f.ffffe7c3SMTPIN_ADDED@mx.google.com>
2012-11-19 12:20     ` Matthew Gretton-Dann
2012-11-20  5:18       ` Bin Cheng
     [not found] <50a5d14b.a750420a.35e3.2d81SMTPIN_ADDED@mx.google.com>
2012-11-20 22:01 ` Ramana Radhakrishnan
2012-11-21  3:54   ` Bin Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).