public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
@ 2014-01-15  9:23 Terry Guo
  2014-01-15  9:54 ` Richard Earnshaw
  0 siblings, 1 reply; 9+ messages in thread
From: Terry Guo @ 2014-01-15  9:23 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]

Hi there,

With trunk enhancement at
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can properly
handle PLD rtx. Otherwise the PLD rtx will be treated as SET rtx and gcc
will end up with ICE. The attached patch intends to back port this
enhancement to 4.8 branch. Tested with gcc regression test, no new
regressions. Is it ok to back port?

BR,
Terry

2014-01-15  Terry Guo  <terry.guo@arm.com>

	Backported from mainline r204575 and applied to file arm.c.
	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/arm/aarch-common.c
	(search_term): New typedef.
	(shift_rtx_costs): New array.
	(arm_rtx_shift_left_p): New.
	(arm_find_sub_rtx_with_search_term): Likewise.
	(arm_find_sub_rtx_with_code): Likewise.
	(arm_early_load_addr_dep): Add sanity checking.
	(arm_no_early_alu_shift_dep): Likewise.
	(arm_no_early_alu_shift_value_dep): Likewise.
	(arm_no_early_mul_dep): Likewise.
	(arm_no_early_store_addr_dep): Likewise.

[-- Attachment #2: backport-204575.txt --]
[-- Type: text/plain, Size: 10787 bytes --]

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 206619)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,20 @@
+2014-01-15  Terry Guo  <terry.guo@arm.com>
+
+	Backported from mainline r204575 and applied to file arm.c.
+	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
+
+	* config/arm/aarch-common.c
+	(search_term): New typedef.
+	(shift_rtx_costs): New array.
+	(arm_rtx_shift_left_p): New.
+	(arm_find_sub_rtx_with_search_term): Likewise.
+	(arm_find_sub_rtx_with_code): Likewise.
+	(arm_early_load_addr_dep): Add sanity checking.
+	(arm_no_early_alu_shift_dep): Likewise.
+	(arm_no_early_alu_shift_value_dep): Likewise.
+	(arm_no_early_mul_dep): Likewise.
+	(arm_no_early_store_addr_dep): Likewise.
+
 2014-01-14  Uros Bizjak  <ubizjak@gmail.com>
 
 	Revert:
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 206619)
+++ gcc/config/arm/arm.c	(working copy)
@@ -1161,6 +1161,30 @@
   TLS_DESCSEQ	/* GNU scheme */
 };
 
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+
+  if (code == MULT && CONST_INT_P (XEXP (x, 1))
+      && exact_log2 (INTVAL (XEXP (x, 1))) > 0)
+    return true;
+
+  if (code == ASHIFT)
+    return true;
+
+  return false;
+}
+
 /* The maximum number of insns to be used when loading a constant.  */
 inline static int
 arm_constant_limit (bool size_p)
@@ -24604,62 +24628,116 @@
     *pretend_size = (NUM_ARG_REGS - nregs) * UNITS_PER_WORD;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does not need
-   PRODUCER's value to calculate the address.  */
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+    ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-int
-arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    addr = XVECEXP (addr, 0, 0);
-  addr = XEXP (addr, 0);
+  gcc_assert (pattern);
+  gcc_assert (st);
 
-  return !reg_overlap_mentioned_p (value, addr);
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+    return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st->find_any_shift)
+    {
+      unsigned i = 0;
+
+      /* Left shifts might have been canonicalized to a MULT of some
+	 power of two.  Make sure we catch them.  */
+      if (arm_rtx_shift_left_p (*pattern))
+	found = 1;
+      else
+	for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+	  if (pattern_code == shift_rtx_codes[i])
+	    found = 1;
+    }
+
+  if (pattern_code == st->search_code)
+    found = 1;
+
+  if (found)
+    st->search_result = *pattern;
+
+  return found;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does need
-   PRODUCER's value to calculate the address.  */
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+static rtx
+arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
+{
+  search_term st;
+  int result = 0;
 
-int
-arm_early_store_addr_dep (rtx producer, rtx consumer)
+  gcc_assert (pattern != NULL_RTX);
+  st.search_code = code;
+  st.search_result = NULL_RTX;
+  st.find_any_shift = find_any_shift;
+  result = for_each_rtx (&pattern, arm_find_sub_rtx_with_search_term, &st);
+  if (result)
+    return st.search_result;
+  else
+    return NULL_RTX;
+}
+
+/* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
+static rtx
+arm_find_shift_sub_rtx (rtx pattern)
 {
-  return !arm_no_early_store_addr_dep (producer, consumer);
+  return arm_find_sub_rtx_with_code (pattern, ASHIFT, true);
 }
 
+/* PRODUCER and CONSUMER are two potentially dependant RTX.  PRODUCER
+   (possibly) contains a SET which will provide a result we can access
+   using the SET_DEST macro.  We will place the RTX which would be
+   written by PRODUCER in SET_SOURCE.
+   Similarly, CONSUMER (possibly) contains a SET which has an operand
+   we can access using SET_SRC.  We place this operand in
+   SET_DESTINATION.
+
+   Return nonzero if we found the SET RTX we expected.  */
+static int
+arm_get_set_operands (rtx producer, rtx consumer,
+		      rtx *set_source, rtx *set_destination)
+{
+  rtx set_producer = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx set_consumer = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (set_producer && set_consumer)
+    {
+      *set_source = SET_DEST (set_producer);
+      *set_destination = SET_SRC (set_consumer);
+      return 1;
+    }
+  return 0;
+}
+
 /* Return nonzero if the CONSUMER instruction (a load) does need
    PRODUCER's value to calculate the address.  */
 
 int
 arm_early_load_addr_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  rtx value, addr;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    {
-      if (GET_CODE (XVECEXP (addr, 0, 0)) == RETURN)
-        addr = XVECEXP (addr, 0, 1);
-      else
-        addr = XVECEXP (addr, 0, 0);
-    }
-  addr = XEXP (addr, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &addr))
+    return 0;
 
   return reg_overlap_mentioned_p (value, addr);
 }
@@ -24671,29 +24749,21 @@
 int
 arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the whole shift
-     operation.  */
-  if (REG_P (early_op))
-    early_op = op;
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    {
+      if (REG_P (early_op))
+	early_op = op;
 
-  return !reg_overlap_mentioned_p (value, early_op);
+      return !reg_overlap_mentioned_p (value, early_op);
+    }
+
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER instruction (an ALU op) does not
@@ -24703,30 +24773,18 @@
 int
 arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    /* We want to check the value being shifted.  */
+    if (!reg_overlap_mentioned_p (value, XEXP (early_op, 0)))
+      return 1;
 
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the value being
-     shifted, in either case.  */
-  if (!REG_P (early_op))
-    early_op = XEXP (early_op, 0);
-
-  return !reg_overlap_mentioned_p (value, early_op);
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER (a mul or mac op) does not
@@ -24736,19 +24794,10 @@
 int
 arm_no_early_mul_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
   if (GET_CODE (op) == PLUS || GET_CODE (op) == MINUS)
     {
@@ -24761,6 +24810,36 @@
   return 0;
 }
 
+/* Return nonzero if the CONSUMER instruction (a store) does not need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  rtx value = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx addr = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (value)
+    value = SET_DEST (value);
+
+  if (addr)
+    addr = SET_DEST (addr);
+
+  if (!value || !addr)
+    return 0;
+
+  return !reg_overlap_mentioned_p (value, addr);
+}
+
+/* Return nonzero if the CONSUMER instruction (a store) does need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  return !arm_no_early_store_addr_dep (producer, consumer);
+}
+
 /* We can't rely on the caller doing the proper promotion when
    using APCS or ATPCS.  */
 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15  9:23 [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction Terry Guo
@ 2014-01-15  9:54 ` Richard Earnshaw
  2014-01-15 10:45   ` Terry Guo
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Earnshaw @ 2014-01-15  9:54 UTC (permalink / raw)
  To: Terry Guo; +Cc: gcc-patches

On 15/01/14 09:23, Terry Guo wrote:
> Hi there,
> 
> With trunk enhancement at
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can properly
> handle PLD rtx. Otherwise the PLD rtx will be treated as SET rtx and gcc
> will end up with ICE. The attached patch intends to back port this
> enhancement to 4.8 branch. Tested with gcc regression test, no new
> regressions. Is it ok to back port?
> 
> BR,
> Terry
> 
> 2014-01-15  Terry Guo  <terry.guo@arm.com>
> 
> 	Backported from mainline r204575 and applied to file arm.c.
> 	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
> 
> 	* config/arm/aarch-common.c
> 	(search_term): New typedef.
> 	(shift_rtx_costs): New array.
> 	(arm_rtx_shift_left_p): New.
> 	(arm_find_sub_rtx_with_search_term): Likewise.
> 	(arm_find_sub_rtx_with_code): Likewise.
> 	(arm_early_load_addr_dep): Add sanity checking.
> 	(arm_no_early_alu_shift_dep): Likewise.
> 	(arm_no_early_alu_shift_value_dep): Likewise.
> 	(arm_no_early_mul_dep): Likewise.
> 	(arm_no_early_store_addr_dep): Likewise.
> 

Is there a PR for this?



^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15  9:54 ` Richard Earnshaw
@ 2014-01-15 10:45   ` Terry Guo
  2014-01-15 11:33     ` Richard Earnshaw
  0 siblings, 1 reply; 9+ messages in thread
From: Terry Guo @ 2014-01-15 10:45 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches



> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, January 15, 2014 5:54 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
handle
> rtx of ARM PLD instruction
> 
> On 15/01/14 09:23, Terry Guo wrote:
> > Hi there,
> >
> > With trunk enhancement at
> > http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can
> > properly handle PLD rtx. Otherwise the PLD rtx will be treated as SET
> > rtx and gcc will end up with ICE. The attached patch intends to back
> > port this enhancement to 4.8 branch. Tested with gcc regression test,
> > no new regressions. Is it ok to back port?
> >
> > BR,
> > Terry
> >
> > 2014-01-15  Terry Guo  <terry.guo@arm.com>
> >
> > 	Backported from mainline r204575 and applied to file arm.c.
> > 	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
> >
> > 	* config/arm/aarch-common.c
> > 	(search_term): New typedef.
> > 	(shift_rtx_costs): New array.
> > 	(arm_rtx_shift_left_p): New.
> > 	(arm_find_sub_rtx_with_search_term): Likewise.
> > 	(arm_find_sub_rtx_with_code): Likewise.
> > 	(arm_early_load_addr_dep): Add sanity checking.
> > 	(arm_no_early_alu_shift_dep): Likewise.
> > 	(arm_no_early_alu_shift_value_dep): Likewise.
> > 	(arm_no_early_mul_dep): Likewise.
> > 	(arm_no_early_store_addr_dep): Likewise.
> >
> 
> Is there a PR for this?
> 

No. It is firstly found on arm/embedded-4_8-branch and then found on
upstream 4.8 branch. The trunk hasn't such issue. Do I need to report a PR
against 4.8? If so, I am willing to do it along with the test case.

BR,
Terry


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 10:45   ` Terry Guo
@ 2014-01-15 11:33     ` Richard Earnshaw
  2014-01-15 12:21       ` Terry Guo
  2014-01-15 12:37       ` Terry Guo
  0 siblings, 2 replies; 9+ messages in thread
From: Richard Earnshaw @ 2014-01-15 11:33 UTC (permalink / raw)
  To: Terry Guo; +Cc: gcc-patches

On 15/01/14 10:45, Terry Guo wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, January 15, 2014 5:54 PM
>> To: Terry Guo
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> handle
>> rtx of ARM PLD instruction
>>
>> On 15/01/14 09:23, Terry Guo wrote:
>>> Hi there,
>>>
>>> With trunk enhancement at
>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00533.html, gcc can
>>> properly handle PLD rtx. Otherwise the PLD rtx will be treated as SET
>>> rtx and gcc will end up with ICE. The attached patch intends to back
>>> port this enhancement to 4.8 branch. Tested with gcc regression test,
>>> no new regressions. Is it ok to back port?
>>>
>>> BR,
>>> Terry
>>>
>>> 2014-01-15  Terry Guo  <terry.guo@arm.com>
>>>
>>> 	Backported from mainline r204575 and applied to file arm.c.
>>> 	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
>>>
>>> 	* config/arm/aarch-common.c
>>> 	(search_term): New typedef.
>>> 	(shift_rtx_costs): New array.
>>> 	(arm_rtx_shift_left_p): New.
>>> 	(arm_find_sub_rtx_with_search_term): Likewise.
>>> 	(arm_find_sub_rtx_with_code): Likewise.
>>> 	(arm_early_load_addr_dep): Add sanity checking.
>>> 	(arm_no_early_alu_shift_dep): Likewise.
>>> 	(arm_no_early_alu_shift_value_dep): Likewise.
>>> 	(arm_no_early_mul_dep): Likewise.
>>> 	(arm_no_early_store_addr_dep): Likewise.
>>>
>>
>> Is there a PR for this?
>>
> 
> No. It is firstly found on arm/embedded-4_8-branch and then found on
> upstream 4.8 branch. The trunk hasn't such issue. Do I need to report a PR
> against 4.8? If so, I am willing to do it along with the test case.
> 
> BR,
> Terry
> 

Preferably, particularly since you haven't supplied a testcase.

R.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 11:33     ` Richard Earnshaw
@ 2014-01-15 12:21       ` Terry Guo
  2014-01-15 12:37       ` Terry Guo
  1 sibling, 0 replies; 9+ messages in thread
From: Terry Guo @ 2014-01-15 12:21 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

> 
> Preferably, particularly since you haven't supplied a testcase.
> 
> R.

Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
shall update the patch to include the test case.

BR,
Terry


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 11:33     ` Richard Earnshaw
  2014-01-15 12:21       ` Terry Guo
@ 2014-01-15 12:37       ` Terry Guo
  2014-01-15 14:30         ` Richard Earnshaw
  1 sibling, 1 reply; 9+ messages in thread
From: Terry Guo @ 2014-01-15 12:37 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 602 bytes --]



> -----Original Message-----
> From: Terry Guo [mailto:terry.guo@arm.com]
> Sent: Wednesday, January 15, 2014 8:21 PM
> To: Richard Earnshaw
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
handle
> rtx of ARM PLD instruction
> 
> >
> > Preferably, particularly since you haven't supplied a testcase.
> >
> > R.
> 
> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
shall
> update the patch to include the test case.
> 
> BR,
> Terry

Here is updated patch along with test case. Is it OK?

BR,
Terry

[-- Attachment #2: backport-204575-v2.txt --]
[-- Type: text/plain, Size: 12163 bytes --]

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 206619)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,21 @@
+2014-01-15  Terry Guo  <terry.guo@arm.com>
+
+	PR target/59826
+	Backported from mainline r204575 and applied to file arm.c.
+	2013-11-08  James Greenhalgh  <james.greenhalgh@arm.com>
+
+	* config/arm/aarch-common.c
+	(search_term): New typedef.
+	(shift_rtx_costs): New array.
+	(arm_rtx_shift_left_p): New.
+	(arm_find_sub_rtx_with_search_term): Likewise.
+	(arm_find_sub_rtx_with_code): Likewise.
+	(arm_early_load_addr_dep): Add sanity checking.
+	(arm_no_early_alu_shift_dep): Likewise.
+	(arm_no_early_alu_shift_value_dep): Likewise.
+	(arm_no_early_mul_dep): Likewise.
+	(arm_no_early_store_addr_dep): Likewise.
+
 2014-01-14  Uros Bizjak  <ubizjak@gmail.com>
 
 	Revert:
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	(revision 206619)
+++ gcc/config/arm/arm.c	(working copy)
@@ -1161,6 +1161,30 @@
   TLS_DESCSEQ	/* GNU scheme */
 };
 
+typedef struct
+{
+  rtx_code search_code;
+  rtx search_result;
+  bool find_any_shift;
+} search_term;
+
+/* Return TRUE if X is either an arithmetic shift left, or
+   is a multiplication by a power of two.  */
+bool
+arm_rtx_shift_left_p (rtx x)
+{
+  enum rtx_code code = GET_CODE (x);
+
+  if (code == MULT && CONST_INT_P (XEXP (x, 1))
+      && exact_log2 (INTVAL (XEXP (x, 1))) > 0)
+    return true;
+
+  if (code == ASHIFT)
+    return true;
+
+  return false;
+}
+
 /* The maximum number of insns to be used when loading a constant.  */
 inline static int
 arm_constant_limit (bool size_p)
@@ -24604,62 +24628,116 @@
     *pretend_size = (NUM_ARG_REGS - nregs) * UNITS_PER_WORD;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does not need
-   PRODUCER's value to calculate the address.  */
+static rtx_code shift_rtx_codes[] =
+  { ASHIFT, ROTATE, ASHIFTRT, LSHIFTRT,
+    ROTATERT, ZERO_EXTEND, SIGN_EXTEND };
 
-int
-arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+/* Callback function for arm_find_sub_rtx_with_code.
+   DATA is safe to treat as a SEARCH_TERM, ST.  This will
+   hold a SEARCH_CODE.  PATTERN is checked to see if it is an
+   RTX with that code.  If it is, write SEARCH_RESULT in ST
+   and return 1.  Otherwise, or if we have been passed a NULL_RTX
+   return 0.  If ST.FIND_ANY_SHIFT then we are interested in
+   anything which can reasonably be described as a SHIFT RTX.  */
+static int
+arm_find_sub_rtx_with_search_term (rtx *pattern, void *data)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  search_term *st = (search_term *) data;
+  rtx_code pattern_code;
+  int found = 0;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    addr = XVECEXP (addr, 0, 0);
-  addr = XEXP (addr, 0);
+  gcc_assert (pattern);
+  gcc_assert (st);
 
-  return !reg_overlap_mentioned_p (value, addr);
+  /* Poorly formed patterns can really ruin our day.  */
+  if (*pattern == NULL_RTX)
+    return 0;
+
+  pattern_code = GET_CODE (*pattern);
+
+  if (st->find_any_shift)
+    {
+      unsigned i = 0;
+
+      /* Left shifts might have been canonicalized to a MULT of some
+	 power of two.  Make sure we catch them.  */
+      if (arm_rtx_shift_left_p (*pattern))
+	found = 1;
+      else
+	for (i = 0; i < ARRAY_SIZE (shift_rtx_codes); i++)
+	  if (pattern_code == shift_rtx_codes[i])
+	    found = 1;
+    }
+
+  if (pattern_code == st->search_code)
+    found = 1;
+
+  if (found)
+    st->search_result = *pattern;
+
+  return found;
 }
 
-/* Return nonzero if the CONSUMER instruction (a store) does need
-   PRODUCER's value to calculate the address.  */
+/* Traverse PATTERN looking for a sub-rtx with RTX_CODE CODE.  */
+static rtx
+arm_find_sub_rtx_with_code (rtx pattern, rtx_code code, bool find_any_shift)
+{
+  search_term st;
+  int result = 0;
 
-int
-arm_early_store_addr_dep (rtx producer, rtx consumer)
+  gcc_assert (pattern != NULL_RTX);
+  st.search_code = code;
+  st.search_result = NULL_RTX;
+  st.find_any_shift = find_any_shift;
+  result = for_each_rtx (&pattern, arm_find_sub_rtx_with_search_term, &st);
+  if (result)
+    return st.search_result;
+  else
+    return NULL_RTX;
+}
+
+/* Traverse PATTERN looking for any sub-rtx which looks like a shift.  */
+static rtx
+arm_find_shift_sub_rtx (rtx pattern)
 {
-  return !arm_no_early_store_addr_dep (producer, consumer);
+  return arm_find_sub_rtx_with_code (pattern, ASHIFT, true);
 }
 
+/* PRODUCER and CONSUMER are two potentially dependant RTX.  PRODUCER
+   (possibly) contains a SET which will provide a result we can access
+   using the SET_DEST macro.  We will place the RTX which would be
+   written by PRODUCER in SET_SOURCE.
+   Similarly, CONSUMER (possibly) contains a SET which has an operand
+   we can access using SET_SRC.  We place this operand in
+   SET_DESTINATION.
+
+   Return nonzero if we found the SET RTX we expected.  */
+static int
+arm_get_set_operands (rtx producer, rtx consumer,
+		      rtx *set_source, rtx *set_destination)
+{
+  rtx set_producer = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx set_consumer = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (set_producer && set_consumer)
+    {
+      *set_source = SET_DEST (set_producer);
+      *set_destination = SET_SRC (set_consumer);
+      return 1;
+    }
+  return 0;
+}
+
 /* Return nonzero if the CONSUMER instruction (a load) does need
    PRODUCER's value to calculate the address.  */
 
 int
 arm_early_load_addr_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx addr = PATTERN (consumer);
+  rtx value, addr;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (addr) == COND_EXEC)
-    addr = COND_EXEC_CODE (addr);
-  if (GET_CODE (addr) == PARALLEL)
-    {
-      if (GET_CODE (XVECEXP (addr, 0, 0)) == RETURN)
-        addr = XVECEXP (addr, 0, 1);
-      else
-        addr = XVECEXP (addr, 0, 0);
-    }
-  addr = XEXP (addr, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &addr))
+    return 0;
 
   return reg_overlap_mentioned_p (value, addr);
 }
@@ -24671,29 +24749,21 @@
 int
 arm_no_early_alu_shift_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the whole shift
-     operation.  */
-  if (REG_P (early_op))
-    early_op = op;
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    {
+      if (REG_P (early_op))
+	early_op = op;
 
-  return !reg_overlap_mentioned_p (value, early_op);
+      return !reg_overlap_mentioned_p (value, early_op);
+    }
+
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER instruction (an ALU op) does not
@@ -24703,30 +24773,18 @@
 int
 arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
   rtx early_op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
-  early_op = XEXP (op, 0);
+  if ((early_op = arm_find_shift_sub_rtx (op)))
+    /* We want to check the value being shifted.  */
+    if (!reg_overlap_mentioned_p (value, XEXP (early_op, 0)))
+      return 1;
 
-  /* This is either an actual independent shift, or a shift applied to
-     the first operand of another operation.  We want the value being
-     shifted, in either case.  */
-  if (!REG_P (early_op))
-    early_op = XEXP (early_op, 0);
-
-  return !reg_overlap_mentioned_p (value, early_op);
+  return 0;
 }
 
 /* Return nonzero if the CONSUMER (a mul or mac op) does not
@@ -24736,19 +24794,10 @@
 int
 arm_no_early_mul_dep (rtx producer, rtx consumer)
 {
-  rtx value = PATTERN (producer);
-  rtx op = PATTERN (consumer);
+  rtx value, op;
 
-  if (GET_CODE (value) == COND_EXEC)
-    value = COND_EXEC_CODE (value);
-  if (GET_CODE (value) == PARALLEL)
-    value = XVECEXP (value, 0, 0);
-  value = XEXP (value, 0);
-  if (GET_CODE (op) == COND_EXEC)
-    op = COND_EXEC_CODE (op);
-  if (GET_CODE (op) == PARALLEL)
-    op = XVECEXP (op, 0, 0);
-  op = XEXP (op, 1);
+  if (!arm_get_set_operands (producer, consumer, &value, &op))
+    return 0;
 
   if (GET_CODE (op) == PLUS || GET_CODE (op) == MINUS)
     {
@@ -24761,6 +24810,36 @@
   return 0;
 }
 
+/* Return nonzero if the CONSUMER instruction (a store) does not need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_no_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  rtx value = arm_find_sub_rtx_with_code (producer, SET, false);
+  rtx addr = arm_find_sub_rtx_with_code (consumer, SET, false);
+
+  if (value)
+    value = SET_DEST (value);
+
+  if (addr)
+    addr = SET_DEST (addr);
+
+  if (!value || !addr)
+    return 0;
+
+  return !reg_overlap_mentioned_p (value, addr);
+}
+
+/* Return nonzero if the CONSUMER instruction (a store) does need
+   PRODUCER's value to calculate the address.  */
+
+int
+arm_early_store_addr_dep (rtx producer, rtx consumer)
+{
+  return !arm_no_early_store_addr_dep (producer, consumer);
+}
+
 /* We can't rely on the caller doing the proper promotion when
    using APCS or ATPCS.  */
 
Index: gcc/testsuite/ChangeLog
===================================================================
--- gcc/testsuite/ChangeLog	(revision 206619)
+++ gcc/testsuite/ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2014-01-15  Terry Guo  <terry.guo@arm.com>
+
+	* gcc.target/arm/pr59826.c: New test.
+
 2014-01-10  Yufeng Zhang  <yufeng.zhang@arm.com>
 
 	* gcc.target/arm/neon/vst1Q_laneu64-1.c: New test.
Index: gcc/testsuite/gcc.target/arm/pr59826.c
===================================================================
--- gcc/testsuite/gcc.target/arm/pr59826.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/pr59826.c	(working copy)
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb -mcpu=cortex-m4 -fprefetch-loop-arrays -O2" }  */
+
+typedef struct genxWriter_rec * genxWriter;
+typedef unsigned char * utf8;
+typedef const unsigned char * constUtf8;
+
+int genxScrubText(genxWriter w, constUtf8 in, utf8 out)
+{
+  int problems = 0;
+  constUtf8 last = in;
+
+  while (*in)
+  {
+    int c = genxNextUnicodeChar(&in);
+    if (c == -1)
+    {
+      problems++;
+      last = in;
+      continue;
+    }
+
+    if (!isXMLChar(w, c))
+    {
+      problems++;
+      last = in;
+      continue;
+    }
+
+    while (last < in)
+      *out++ = *last++;
+  }
+  *out = 0;
+  return problems;
+}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 12:37       ` Terry Guo
@ 2014-01-15 14:30         ` Richard Earnshaw
  2014-01-15 16:00           ` Terry Guo
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Earnshaw @ 2014-01-15 14:30 UTC (permalink / raw)
  To: Terry Guo; +Cc: gcc-patches

On 15/01/14 12:37, Terry Guo wrote:
> 
> 
>> -----Original Message-----
>> From: Terry Guo [mailto:terry.guo@arm.com]
>> Sent: Wednesday, January 15, 2014 8:21 PM
>> To: Richard Earnshaw
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> handle
>> rtx of ARM PLD instruction
>>
>>>
>>> Preferably, particularly since you haven't supplied a testcase.
>>>
>>> R.
>>
>> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826. I
> shall
>> update the patch to include the test case.
>>
>> BR,
>> Terry
> 
> Here is updated patch along with test case. Is it OK?
> 

I'm rather concerned about the complexity of this patch as a backport.
Furthermore, part of the problem is that the preload insn is
misclassified as an alu_reg operation, which it clearly isn't.

Instead of doing this, please could you try a simpler patch that simply
reclassifies the "type" of preload as "load1".  This would break the
alu->load/store dependency and thereby avoid the trigger of the problem

R.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 14:30         ` Richard Earnshaw
@ 2014-01-15 16:00           ` Terry Guo
  2014-01-15 16:40             ` Richard Earnshaw
  0 siblings, 1 reply; 9+ messages in thread
From: Terry Guo @ 2014-01-15 16:00 UTC (permalink / raw)
  To: Richard Earnshaw; +Cc: gcc-patches



> -----Original Message-----
> From: Richard Earnshaw
> Sent: Wednesday, January 15, 2014 10:30 PM
> To: Terry Guo
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
handle
> rtx of ARM PLD instruction
> 
> On 15/01/14 12:37, Terry Guo wrote:
> >
> >
> >> -----Original Message-----
> >> From: Terry Guo [mailto:terry.guo@arm.com]
> >> Sent: Wednesday, January 15, 2014 8:21 PM
> >> To: Richard Earnshaw
> >> Cc: gcc-patches@gcc.gnu.org
> >> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> > handle
> >> rtx of ARM PLD instruction
> >>
> >>>
> >>> Preferably, particularly since you haven't supplied a testcase.
> >>>
> >>> R.
> >>
> >> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826.
> >> I
> > shall
> >> update the patch to include the test case.
> >>
> >> BR,
> >> Terry
> >
> > Here is updated patch along with test case. Is it OK?
> >
> 
> I'm rather concerned about the complexity of this patch as a backport.
> Furthermore, part of the problem is that the preload insn is misclassified
as
> an alu_reg operation, which it clearly isn't.
> 
> Instead of doing this, please could you try a simpler patch that simply
> reclassifies the "type" of preload as "load1".  This would break the
> alu->load/store dependency and thereby avoid the trigger of the problem
> 
> R.

Thanks Richard and you are right. What you said should be the root cause for
this issue. I will implement another patch for trunk to correctly classify
the preload instruction into load1, and then back port to 4.8 branch.
Therefore please consider my request in this thread discarded.

BR,
Terry




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction
  2014-01-15 16:00           ` Terry Guo
@ 2014-01-15 16:40             ` Richard Earnshaw
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Earnshaw @ 2014-01-15 16:40 UTC (permalink / raw)
  To: Terry Guo; +Cc: gcc-patches

On 15/01/14 15:59, Terry Guo wrote:
> 
> 
>> -----Original Message-----
>> From: Richard Earnshaw
>> Sent: Wednesday, January 15, 2014 10:30 PM
>> To: Terry Guo
>> Cc: gcc-patches@gcc.gnu.org
>> Subject: Re: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
> handle
>> rtx of ARM PLD instruction
>>
>> On 15/01/14 12:37, Terry Guo wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Terry Guo [mailto:terry.guo@arm.com]
>>>> Sent: Wednesday, January 15, 2014 8:21 PM
>>>> To: Richard Earnshaw
>>>> Cc: gcc-patches@gcc.gnu.org
>>>> Subject: RE: [GCC, ARM] Backport trunk fix to 4.8 branch to properly
>>> handle
>>>> rtx of ARM PLD instruction
>>>>
>>>>>
>>>>> Preferably, particularly since you haven't supplied a testcase.
>>>>>
>>>>> R.
>>>>
>>>> Bug is reported at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59826.
>>>> I
>>> shall
>>>> update the patch to include the test case.
>>>>
>>>> BR,
>>>> Terry
>>>
>>> Here is updated patch along with test case. Is it OK?
>>>
>>
>> I'm rather concerned about the complexity of this patch as a backport.
>> Furthermore, part of the problem is that the preload insn is misclassified
> as
>> an alu_reg operation, which it clearly isn't.
>>
>> Instead of doing this, please could you try a simpler patch that simply
>> reclassifies the "type" of preload as "load1".  This would break the
>> alu->load/store dependency and thereby avoid the trigger of the problem
>>
>> R.
> 
> Thanks Richard and you are right. What you said should be the root cause for
> this issue. I will implement another patch for trunk to correctly classify
> the preload instruction into load1, and then back port to 4.8 branch.
> Therefore please consider my request in this thread discarded.
> 

Trunk has already reclassified it this way.

R.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-01-15 16:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-15  9:23 [GCC, ARM] Backport trunk fix to 4.8 branch to properly handle rtx of ARM PLD instruction Terry Guo
2014-01-15  9:54 ` Richard Earnshaw
2014-01-15 10:45   ` Terry Guo
2014-01-15 11:33     ` Richard Earnshaw
2014-01-15 12:21       ` Terry Guo
2014-01-15 12:37       ` Terry Guo
2014-01-15 14:30         ` Richard Earnshaw
2014-01-15 16:00           ` Terry Guo
2014-01-15 16:40             ` Richard Earnshaw

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).