public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
@ 2015-10-28 10:08 Kyrill Tkachov
  2015-10-29 13:50 ` Marcus Shawcroft
  2015-11-06 12:26 ` Nikolai Bozhenov
  0 siblings, 2 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2015-10-28 10:08 UTC (permalink / raw)
  To: GCC Patches
  Cc: Marcus Shawcroft, Ramana Radhakrishnan, Richard Earnshaw,
	James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]

Hi all,

This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding when processing an msubsi insn
with subregs:
(insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
         (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
             (mult:SI (subreg:SI (reg:DI 83) 0)
                 (subreg:SI (reg:DI 75 [ _20 ]) 0)))) schedice.c:10 357 {*msubsi}

The register_operand predicate for that pattern allows subregs (I think correctly).
The code in aarch_accumulator_forwarding doesn't take that into account and ends up
taking a REGNO of a SUBREG, causing a checking error.

This patch fixes that by stripping the subregs off the accumulator rtx before
checking that the inner expression is a REG and taking its REGNO.

The testcase now works fine with an aarch64-none-elf toolchain configure for RTL checking.

The testcase is taken verbatim from the BZ entry for PR 68088.
Since this function is shared between arm and aarch64 I've bootstrapped and tested it on both
and I'll need ok's for both ports.

Ok for trunk?

Thanks,
Kyrill

2015-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/68088
     * config/arm/aarch-common.c (aarch_strip_subreg): New function.
     (aarch_accumulator_forwarding): Strip subregs from accumulator rtx
     when appropriate.

2015-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/aarch64/pr68088_1.c: New test.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: aarch64-accum-rtl-check-fix.patch --]
[-- Type: text/x-patch; name=aarch64-accum-rtl-check-fix.patch, Size: 2777 bytes --]

commit 7ce1b9ec8b8486cab34071a9c120db13e7c3b96a
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Tue Oct 27 11:42:19 2015 +0000

    [ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index a940a02..2a21c4e 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -389,6 +389,15 @@ arm_mac_accumulator_is_result (rtx producer, rtx consumer)
           && !reg_overlap_mentioned_p (result, op1));
 }
 
+/* If X is a subreg return the value it contains, otherwise
+   return X unchanged.  */
+
+static rtx
+aarch_strip_subreg (rtx x)
+{
+  return GET_CODE (x) == SUBREG ? SUBREG_REG (x) : x;
+}
+
 /* Return non-zero if the destination of PRODUCER feeds the accumulator
    operand of an MLA-like operation.  */
 
@@ -420,14 +429,14 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
     case PLUS:
       /* Possibly an MADD.  */
       if (GET_CODE (XEXP (mla, 0)) == MULT)
-	accumulator = XEXP (mla, 1);
+	accumulator = aarch_strip_subreg (XEXP (mla, 1));
       else
 	return 0;
       break;
     case MINUS:
       /* Possibly an MSUB.  */
       if (GET_CODE (XEXP (mla, 1)) == MULT)
-	accumulator = XEXP (mla, 0);
+	accumulator = aarch_strip_subreg (XEXP (mla, 0));
       else
 	return 0;
       break;
@@ -441,7 +450,7 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
 
 	    {
 	      /* FMADD/FMSUB.  */
-	      accumulator = XEXP (mla, 2);
+	      accumulator = aarch_strip_subreg (XEXP (mla, 2));
 	    }
 	  else if (REG_P (XEXP (mla, 1))
 		   && GET_CODE (XEXP (mla, 2)) == NEG
@@ -449,7 +458,7 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
 		       || GET_CODE (XEXP (mla, 0)) == NEG))
 	    {
 	      /* FNMADD/FNMSUB.  */
-	      accumulator = XEXP (XEXP (mla, 2), 0);
+	      accumulator = aarch_strip_subreg (XEXP (XEXP (mla, 2), 0));
 	    }
 	  else
 	    return 0;
@@ -460,6 +469,9 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
 	return 0;
     }
 
+  if (!REG_P (accumulator))
+    return 0;
+
   return (REGNO (dest) == REGNO (accumulator));
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr68088_1.c b/gcc/testsuite/gcc.target/aarch64/pr68088_1.c
new file mode 100644
index 0000000..49c6aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr68088_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void bar (unsigned long);
+
+void
+foo (unsigned long aul, unsigned m, unsigned i)
+{
+  while (1)
+    {
+      aul += i;
+      i = aul % m;
+      bar (aul);
+    }
+}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-28 10:08 [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check Kyrill Tkachov
@ 2015-10-29 13:50 ` Marcus Shawcroft
  2015-10-29 14:00   ` Kyrill Tkachov
  2015-11-06 12:26 ` Nikolai Bozhenov
  1 sibling, 1 reply; 15+ messages in thread
From: Marcus Shawcroft @ 2015-10-29 13:50 UTC (permalink / raw)
  To: Kyrill Tkachov
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh

On 28 October 2015 at 10:07, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> Hi all,
>
> This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding
> when processing an msubsi insn
> with subregs:
> (insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
>         (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
>             (mult:SI (subreg:SI (reg:DI 83) 0)
>                 (subreg:SI (reg:DI 75 [ _20 ]) 0)))) schedice.c:10 357
> {*msubsi}
>
> The register_operand predicate for that pattern allows subregs (I think
> correctly).
> The code in aarch_accumulator_forwarding doesn't take that into account and
> ends up
> taking a REGNO of a SUBREG, causing a checking error.
>
> This patch fixes that by stripping the subregs off the accumulator rtx
> before
> checking that the inner expression is a REG and taking its REGNO.
>
> The testcase now works fine with an aarch64-none-elf toolchain configure for
> RTL checking.
>
> The testcase is taken verbatim from the BZ entry for PR 68088.
> Since this function is shared between arm and aarch64 I've bootstrapped and
> tested it on both
> and I'll need ok's for both ports.
>
> Ok for trunk?

rtl.h exposes reg_or_subregno() already doesn't that do what we need here?

The test case is not aarch64 specific therefore I think convention is
that it should go into a generic directory.

Cheers
/Marcus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-29 13:50 ` Marcus Shawcroft
@ 2015-10-29 14:00   ` Kyrill Tkachov
  2015-10-29 14:02     ` Marcus Shawcroft
  0 siblings, 1 reply; 15+ messages in thread
From: Kyrill Tkachov @ 2015-10-29 14:00 UTC (permalink / raw)
  To: Marcus Shawcroft
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh

Hi Marcus,

On 29/10/15 13:46, Marcus Shawcroft wrote:
> On 28 October 2015 at 10:07, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> Hi all,
>>
>> This RTL checking error occurs on aarch64 in aarch_accumulator_forwarding
>> when processing an msubsi insn
>> with subregs:
>> (insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
>>          (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
>>              (mult:SI (subreg:SI (reg:DI 83) 0)
>>                  (subreg:SI (reg:DI 75 [ _20 ]) 0)))) schedice.c:10 357
>> {*msubsi}
>>
>> The register_operand predicate for that pattern allows subregs (I think
>> correctly).
>> The code in aarch_accumulator_forwarding doesn't take that into account and
>> ends up
>> taking a REGNO of a SUBREG, causing a checking error.
>>
>> This patch fixes that by stripping the subregs off the accumulator rtx
>> before
>> checking that the inner expression is a REG and taking its REGNO.
>>
>> The testcase now works fine with an aarch64-none-elf toolchain configure for
>> RTL checking.
>>
>> The testcase is taken verbatim from the BZ entry for PR 68088.
>> Since this function is shared between arm and aarch64 I've bootstrapped and
>> tested it on both
>> and I'll need ok's for both ports.
>>
>> Ok for trunk?
> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?

reg_or_subregno assumes that what it's passed is REG or a SUBREG.
It will ICE on any other rtx. Here I want to strip the subreg if it is
a subreg, but leave it as it is otherwise.

>
> The test case is not aarch64 specific therefore I think convention is
> that it should go into a generic directory.

Ok, I'll put it in gcc.dg/

Thanks,
Kyrill

>
> Cheers
> /Marcus
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-29 14:00   ` Kyrill Tkachov
@ 2015-10-29 14:02     ` Marcus Shawcroft
  2015-10-29 14:18       ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Marcus Shawcroft @ 2015-10-29 14:02 UTC (permalink / raw)
  To: Kyrill Tkachov
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh

On 29 October 2015 at 13:50, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:

>>> Ok for trunk?
>>
>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>
>
> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
> It will ICE on any other rtx. Here I want to strip the subreg if it is
> a subreg, but leave it as it is otherwise.

OK, I follow.

>> The test case is not aarch64 specific therefore I think convention is
>> that it should go into a generic directory.
>
>
> Ok, I'll put it in gcc.dg/


OK with the test case moved. Thanks /Marcus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-29 14:02     ` Marcus Shawcroft
@ 2015-10-29 14:18       ` Kyrill Tkachov
  2015-11-06 11:39         ` Kyrill Tkachov
  2015-11-06 11:50         ` Ramana Radhakrishnan
  0 siblings, 2 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2015-10-29 14:18 UTC (permalink / raw)
  To: Marcus Shawcroft
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh

[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]


On 29/10/15 14:00, Marcus Shawcroft wrote:
> On 29 October 2015 at 13:50, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>
>>>> Ok for trunk?
>>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>>
>> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
>> It will ICE on any other rtx. Here I want to strip the subreg if it is
>> a subreg, but leave it as it is otherwise.
> OK, I follow.
>
>>> The test case is not aarch64 specific therefore I think convention is
>>> that it should go into a generic directory.
>>
>> Ok, I'll put it in gcc.dg/
>
> OK with the test case moved. Thanks /Marcus
>
Thanks, but I'd like to do a slight respin.
The testcase is moved to gcc.dg but I also avoid creating the new
helper function and just do the SUBREG extraction once at the very end.
This makes the patch smaller.

Since you're ok with the approach and this revision is logically equivalent,
I just need an ok from an arm perspective.

Thanks,
Kyrill

2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/68088
     * config/arm/aarch-common.c (aarch_accumulator_forwarding): Strip
     subregs from accumulator and make sure it's a register.

2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/68088
     * gcc.dg/pr68088_1.c: New test.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: aarch64-accum-rtl-check-fix.patch --]
[-- Type: text/x-patch; name=aarch64-accum-rtl-check-fix.patch, Size: 1180 bytes --]

commit aa4df340968b330544edbbb8ea706f2d56011381
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Tue Oct 27 11:42:19 2015 +0000

    [ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check

diff --git a/gcc/config/arm/aarch-common.c b/gcc/config/arm/aarch-common.c
index a940a02..e6668d5 100644
--- a/gcc/config/arm/aarch-common.c
+++ b/gcc/config/arm/aarch-common.c
@@ -460,6 +460,12 @@ aarch_accumulator_forwarding (rtx_insn *producer, rtx_insn *consumer)
 	return 0;
     }
 
+  if (GET_CODE (accumulator) == SUBREG)
+    accumulator = SUBREG_REG (accumulator);
+
+  if (!REG_P (accumulator))
+    return 0;
+
   return (REGNO (dest) == REGNO (accumulator));
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr68088_1.c b/gcc/testsuite/gcc.dg/pr68088_1.c
new file mode 100644
index 0000000..49c6aa1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr68088_1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void bar (unsigned long);
+
+void
+foo (unsigned long aul, unsigned m, unsigned i)
+{
+  while (1)
+    {
+      aul += i;
+      i = aul % m;
+      bar (aul);
+    }
+}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-29 14:18       ` Kyrill Tkachov
@ 2015-11-06 11:39         ` Kyrill Tkachov
  2015-11-06 11:50         ` Ramana Radhakrishnan
  1 sibling, 0 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2015-11-06 11:39 UTC (permalink / raw)
  To: Marcus Shawcroft
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg03170.html

Thanks,
Kyrill

On 29/10/15 14:14, Kyrill Tkachov wrote:
>
> On 29/10/15 14:00, Marcus Shawcroft wrote:
>> On 29 October 2015 at 13:50, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>
>>>>> Ok for trunk?
>>>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>>>
>>> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
>>> It will ICE on any other rtx. Here I want to strip the subreg if it is
>>> a subreg, but leave it as it is otherwise.
>> OK, I follow.
>>
>>>> The test case is not aarch64 specific therefore I think convention is
>>>> that it should go into a generic directory.
>>>
>>> Ok, I'll put it in gcc.dg/
>>
>> OK with the test case moved. Thanks /Marcus
>>
> Thanks, but I'd like to do a slight respin.
> The testcase is moved to gcc.dg but I also avoid creating the new
> helper function and just do the SUBREG extraction once at the very end.
> This makes the patch smaller.
>
> Since you're ok with the approach and this revision is logically equivalent,
> I just need an ok from an arm perspective.
>
> Thanks,
> Kyrill
>
> 2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     PR target/68088
>     * config/arm/aarch-common.c (aarch_accumulator_forwarding): Strip
>     subregs from accumulator and make sure it's a register.
>
> 2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     PR target/68088
>     * gcc.dg/pr68088_1.c: New test.
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-29 14:18       ` Kyrill Tkachov
  2015-11-06 11:39         ` Kyrill Tkachov
@ 2015-11-06 11:50         ` Ramana Radhakrishnan
  1 sibling, 0 replies; 15+ messages in thread
From: Ramana Radhakrishnan @ 2015-11-06 11:50 UTC (permalink / raw)
  To: Kyrill Tkachov, Marcus Shawcroft
  Cc: GCC Patches, Marcus Shawcroft, Ramana Radhakrishnan,
	Richard Earnshaw, James Greenhalgh



On 29/10/15 14:14, Kyrill Tkachov wrote:
> 
> On 29/10/15 14:00, Marcus Shawcroft wrote:
>> On 29 October 2015 at 13:50, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>>
>>>>> Ok for trunk?
>>>> rtl.h exposes reg_or_subregno() already doesn't that do what we need here?
>>>
>>> reg_or_subregno assumes that what it's passed is REG or a SUBREG.
>>> It will ICE on any other rtx. Here I want to strip the subreg if it is
>>> a subreg, but leave it as it is otherwise.
>> OK, I follow.
>>
>>>> The test case is not aarch64 specific therefore I think convention is
>>>> that it should go into a generic directory.
>>>
>>> Ok, I'll put it in gcc.dg/
>>
>> OK with the test case moved. Thanks /Marcus
>>
> Thanks, but I'd like to do a slight respin.
> The testcase is moved to gcc.dg but I also avoid creating the new
> helper function and just do the SUBREG extraction once at the very end.
> This makes the patch smaller.
> 
> Since you're ok with the approach and this revision is logically equivalent,
> I just need an ok from an arm perspective.

OK, looks good to me and assuming you've tested this with a regression test run targeting cortex-a53 to stress the function ;)


regards
Ramana
> 
> Thanks,
> Kyrill
> 
> 2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR target/68088
>     * config/arm/aarch-common.c (aarch_accumulator_forwarding): Strip
>     subregs from accumulator and make sure it's a register.
> 
> 2015-10-29  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR target/68088
>     * gcc.dg/pr68088_1.c: New test.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-10-28 10:08 [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check Kyrill Tkachov
  2015-10-29 13:50 ` Marcus Shawcroft
@ 2015-11-06 12:26 ` Nikolai Bozhenov
  2015-11-06 13:46   ` Ramana Radhakrishnan
  1 sibling, 1 reply; 15+ messages in thread
From: Nikolai Bozhenov @ 2015-11-06 12:26 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches
  Cc: Marcus Shawcroft, Ramana Radhakrishnan, Richard Earnshaw,
	James Greenhalgh

On 10/28/2015 01:07 PM, Kyrill Tkachov wrote:
> Hi all,
>
> This RTL checking error occurs on aarch64 in 
> aarch_accumulator_forwarding when processing an msubsi insn
> with subregs:
> (insn 15 14 16 3 (set (reg/v:SI 78 [ i ])
>         (minus:SI (subreg:SI (reg/v:DI 76 [ aul ]) 0)
>             (mult:SI (subreg:SI (reg:DI 83) 0)
>                 (subreg:SI (reg:DI 75 [ _20 ]) 0)))) schedice.c:10 357 
> {*msubsi}
>
> The register_operand predicate for that pattern allows subregs (I 
> think correctly).
> The code in aarch_accumulator_forwarding doesn't take that into 
> account and ends up
> taking a REGNO of a SUBREG, causing a checking error.
>
> This patch fixes that by stripping the subregs off the accumulator rtx 
> before
> checking that the inner expression is a REG and taking its REGNO.
>
> The testcase now works fine with an aarch64-none-elf toolchain 
> configure for RTL checking.
>
> The testcase is taken verbatim from the BZ entry for PR 68088.
> Since this function is shared between arm and aarch64 I've 
> bootstrapped and tested it on both
> and I'll need ok's for both ports.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     PR target/68088
>     * config/arm/aarch-common.c (aarch_strip_subreg): New function.
>     (aarch_accumulator_forwarding): Strip subregs from accumulator rtx
>     when appropriate.
>
> 2015-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/aarch64/pr68088_1.c: New test.

Hi!

I faced the same issue but I had somewhat different RTL for the consumer:

     (insn 20 15 21 2 (set (reg/i:SI 0 r0)
             (minus:SI (subreg:SI (reg:DI 117) 4)
                 (mult:SI (reg:SI 123)
                     (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})

where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
cycle in this case?

Thanks,
Nikolai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 12:26 ` Nikolai Bozhenov
@ 2015-11-06 13:46   ` Ramana Radhakrishnan
  2015-11-06 17:08     ` Nikolai Bozhenov
  0 siblings, 1 reply; 15+ messages in thread
From: Ramana Radhakrishnan @ 2015-11-06 13:46 UTC (permalink / raw)
  To: Nikolai Bozhenov, Kyrill Tkachov, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh


> Hi!
> 
> I faced the same issue but I had somewhat different RTL for the consumer:
> 
>     (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>             (minus:SI (subreg:SI (reg:DI 117) 4)
>                 (mult:SI (reg:SI 123)
>                     (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
> 
> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
> cycle in this case?

If the accumulator can be forwarded (i.e. a SImode register), there isn't a reason why a subreg:SI (reg:DI) will not get forwarded.

The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.

regards
Ramana


> 
> Thanks,
> Nikolai
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 13:46   ` Ramana Radhakrishnan
@ 2015-11-06 17:08     ` Nikolai Bozhenov
  2015-11-06 17:09       ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolai Bozhenov @ 2015-11-06 17:08 UTC (permalink / raw)
  To: Ramana Radhakrishnan, Kyrill Tkachov, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh

On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>> Hi!
>>
>> I faced the same issue but I had somewhat different RTL for the consumer:
>>
>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>                  (mult:SI (reg:SI 123)
>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>
>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>> cycle in this case?
> If the accumulator can be forwarded (i.e. a SImode register), there isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>
> The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.
>
I mean, in my example it is not the multiplication result that is
forwarded but its upper part. So, shouldn't we check that offset in a
subreg expression is zero? Or is it ok to forward only the upper part
of a multiplication?

Thanks,
Nikolai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 17:08     ` Nikolai Bozhenov
@ 2015-11-06 17:09       ` Kyrill Tkachov
  2015-11-06 17:16         ` Kyrill Tkachov
  2015-11-06 17:32         ` Nikolai Bozhenov
  0 siblings, 2 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2015-11-06 17:09 UTC (permalink / raw)
  To: Nikolai Bozhenov, Ramana Radhakrishnan, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh


On 06/11/15 17:07, Nikolai Bozhenov wrote:
> On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>>> Hi!
>>>
>>> I faced the same issue but I had somewhat different RTL for the consumer:
>>>
>>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>>                  (mult:SI (reg:SI 123)
>>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>>
>>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>>> cycle in this case?
>> If the accumulator can be forwarded (i.e. a SImode register), there isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>>
>> The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.
>>
> I mean, in my example it is not the multiplication result that is
> forwarded but its upper part. So, shouldn't we check that offset in a
> subreg expression is zero? Or is it ok to forward only the upper part
> of a multiplication?

Could you please post the full RTL instruction we're talking about here as it appears in the scheduler dump?
So that we're all on the same page about which case we're talking about.

Thanks,
Kyrill

>
> Thanks,
> Nikolai
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 17:09       ` Kyrill Tkachov
@ 2015-11-06 17:16         ` Kyrill Tkachov
  2015-11-09  8:14           ` Nikolai Bozhenov
  2015-11-06 17:32         ` Nikolai Bozhenov
  1 sibling, 1 reply; 15+ messages in thread
From: Kyrill Tkachov @ 2015-11-06 17:16 UTC (permalink / raw)
  To: Nikolai Bozhenov, Ramana Radhakrishnan, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh


On 06/11/15 17:09, Kyrill Tkachov wrote:
>
> On 06/11/15 17:07, Nikolai Bozhenov wrote:
>> On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>>>> Hi!
>>>>
>>>> I faced the same issue but I had somewhat different RTL for the consumer:
>>>>
>>>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>>>                  (mult:SI (reg:SI 123)
>>>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>>>
>>>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>>>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>>>> cycle in this case?
>>> If the accumulator can be forwarded (i.e. a SImode register), there isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>>>
>>> The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.
>>>
>> I mean, in my example it is not the multiplication result that is
>> forwarded but its upper part. So, shouldn't we check that offset in a
>> subreg expression is zero? Or is it ok to forward only the upper part
>> of a multiplication?
>
> Could you please post the full RTL instruction we're talking about here as it appears in the scheduler dump?
> So that we're all on the same page about which case we're talking about.
>

Sorry, missed the above instruction.
This subreg is just a pre-register allocation representation of the instruction and will go away after reload.
This particular function only really has a real effect in post-reload scheduling as it's only there when the final
register numbers are known.

Kyrill


> Thanks,
> Kyrill
>
>>
>> Thanks,
>> Nikolai
>>
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 17:09       ` Kyrill Tkachov
  2015-11-06 17:16         ` Kyrill Tkachov
@ 2015-11-06 17:32         ` Nikolai Bozhenov
  1 sibling, 0 replies; 15+ messages in thread
From: Nikolai Bozhenov @ 2015-11-06 17:32 UTC (permalink / raw)
  To: Kyrill Tkachov, Ramana Radhakrishnan, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh

On 11/06/2015 08:09 PM, Kyrill Tkachov wrote:
>
> On 06/11/15 17:07, Nikolai Bozhenov wrote:
>> On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>>>> Hi!
>>>>
>>>> I faced the same issue but I had somewhat different RTL for the 
>>>> consumer:
>>>>
>>>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>>>                  (mult:SI (reg:SI 123)
>>>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>>>
>>>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>>>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>>>> cycle in this case?
>>> If the accumulator can be forwarded (i.e. a SImode register), there 
>>> isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>>>
>>> The subreg:SI is an artifact before register allocation, thus it's a 
>>> representation issue that the patch is fixing here unless I 
>>> misunderstand your question.
>>>
>> I mean, in my example it is not the multiplication result that is
>> forwarded but its upper part. So, shouldn't we check that offset in a
>> subreg expression is zero? Or is it ok to forward only the upper part
>> of a multiplication?
>
> Could you please post the full RTL instruction we're talking about 
> here as it appears in the scheduler dump?
> So that we're all on the same page about which case we're talking about.
>

Sure. I'm talking about a sequence like this:

     (insn 8 13 11 2 (set (reg:DI 117)
             (mult:DI (zero_extend:DI (reg:SI 116))
                 (zero_extend:DI (reg:SI 118)))) test.c:2 54 {*umulsidi3_v6}
          (expr_list:REG_DEAD (reg:SI 118)
             (expr_list:REG_DEAD (reg:SI 116)
                 (expr_list:REG_EQUAL (mult:DI (zero_extend:DI (reg:SI 116))
                         (const_int 1374400 [0x14f8c0]))
                     (nil)))))
     (insn 11 8 12 2 (set (reg:DI 120)
             (mult:DI (zero_extend:DI (subreg:SI (reg:DI 117) 4))
                 (zero_extend:DI (reg:SI 121)))) test.c:2 54 {*umulsidi3_v6}
          (expr_list:REG_DEAD (reg:SI 121)
             (expr_list:REG_EQUAL (mult:DI (zero_extend:DI (subreg:SI 
(reg:DI 117) 4))
                     (const_int 3435973837 [0xcccccccd]))
                 (nil))))
     (insn 12 11 20 2 (set (reg:SI 114)
             (lshiftrt:SI (subreg:SI (reg:DI 120) 4)
                 (const_int 3 [0x3]))) test.c:2 126 {*arm_shiftsi3}
          (expr_list:REG_DEAD (reg:DI 120)
             (expr_list:REG_EQUAL (udiv:SI (subreg:SI (reg:DI 117) 4)
                     (const_int 10 [0xa]))
                 (nil))))
     (insn 20 12 21 2 (set (reg/i:SI 0 r0)
             (minus:SI (subreg:SI (reg:DI 117) 4)
                 (mult:SI (reg:SI 123)
                     (reg:SI 114)))) test.c:3 48 {*mulsi3subsi}
          (expr_list:REG_DEAD (reg:SI 123)
             (expr_list:REG_DEAD (reg:DI 117)
                 (expr_list:REG_DEAD (reg:SI 114)
                     (nil)))))

The last insn is a MLA-like operation which is tested by
aarch_accumulator_forwarding if it is a valid forwarding consumer.

Thanks,
Nikolai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-06 17:16         ` Kyrill Tkachov
@ 2015-11-09  8:14           ` Nikolai Bozhenov
  2015-11-09  9:22             ` Kyrill Tkachov
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolai Bozhenov @ 2015-11-09  8:14 UTC (permalink / raw)
  To: Kyrill Tkachov, Ramana Radhakrishnan, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh



On 11/06/2015 08:16 PM, Kyrill Tkachov wrote:
>
> On 06/11/15 17:09, Kyrill Tkachov wrote:
>>
>> On 06/11/15 17:07, Nikolai Bozhenov wrote:
>>> On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>>>>> Hi!
>>>>>
>>>>> I faced the same issue but I had somewhat different RTL for the 
>>>>> consumer:
>>>>>
>>>>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>>>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>>>>                  (mult:SI (reg:SI 123)
>>>>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>>>>
>>>>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>>>>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>>>>> cycle in this case?
>>>> If the accumulator can be forwarded (i.e. a SImode register), there 
>>>> isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>>>>
>>>> The subreg:SI is an artifact before register allocation, thus it's 
>>>> a representation issue that the patch is fixing here unless I 
>>>> misunderstand your question.
>>>>
>>> I mean, in my example it is not the multiplication result that is
>>> forwarded but its upper part. So, shouldn't we check that offset in a
>>> subreg expression is zero? Or is it ok to forward only the upper part
>>> of a multiplication?
>>
>> Could you please post the full RTL instruction we're talking about 
>> here as it appears in the scheduler dump?
>> So that we're all on the same page about which case we're talking about.
>>
>
> Sorry, missed the above instruction.
> This subreg is just a pre-register allocation representation of the 
> instruction and will go away after reload.
> This particular function only really has a real effect in post-reload 
> scheduling as it's only there when the final
> register numbers are known.
>

I see. aarch_accumulator_forwarding always returns 0 for virtual
registers. But isn't it overly pessimistic to assume that accumulator
forwarding is never possible at sched1? I wonder if it would be better
to be more optimistic about register allocation outcome. I mean, in
case of virtual registers we could assume forwarding from A to B if B
is the only consumer of A's result. Something like this:

     if (REGNO (dest) >= FIRST_VIRTUAL_REGISTER
         || REGNO (accumulator) >= FIRST_VIRTUAL_REGISTER)
       return (DF_REG_USE_COUNT (REGNO (dest)) == 1)
         && (DF_REF_INSN (DF_REG_USE_CHAIN (REGNO (dest))) == consumer);
     else
       return REGNO (dest) == REGNO (accumulator);

Thanks,
Nikolai

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check
  2015-11-09  8:14           ` Nikolai Bozhenov
@ 2015-11-09  9:22             ` Kyrill Tkachov
  0 siblings, 0 replies; 15+ messages in thread
From: Kyrill Tkachov @ 2015-11-09  9:22 UTC (permalink / raw)
  To: Nikolai Bozhenov, Ramana Radhakrishnan, GCC Patches
  Cc: Marcus Shawcroft, Richard Earnshaw, James Greenhalgh


On 09/11/15 08:14, Nikolai Bozhenov wrote:
>
>
> On 11/06/2015 08:16 PM, Kyrill Tkachov wrote:
>>
>> On 06/11/15 17:09, Kyrill Tkachov wrote:
>>>
>>> On 06/11/15 17:07, Nikolai Bozhenov wrote:
>>>> On 11/06/2015 04:46 PM, Ramana Radhakrishnan wrote:
>>>>>> Hi!
>>>>>>
>>>>>> I faced the same issue but I had somewhat different RTL for the consumer:
>>>>>>
>>>>>>      (insn 20 15 21 2 (set (reg/i:SI 0 r0)
>>>>>>              (minus:SI (subreg:SI (reg:DI 117) 4)
>>>>>>                  (mult:SI (reg:SI 123)
>>>>>>                      (reg:SI 114)))) gasman.c:4 48 {*mulsi3subsi})
>>>>>>
>>>>>> where (reg:DI 117) is produced by umulsidi3_v6 instruction. Is it
>>>>>> really true that (subreg:SI (reg:DI 117) 4) may be forwarded in one
>>>>>> cycle in this case?
>>>>> If the accumulator can be forwarded (i.e. a SImode register), there isn't a reason why a subreg:SI (reg:DI) will not get forwarded.
>>>>>
>>>>> The subreg:SI is an artifact before register allocation, thus it's a representation issue that the patch is fixing here unless I misunderstand your question.
>>>>>
>>>> I mean, in my example it is not the multiplication result that is
>>>> forwarded but its upper part. So, shouldn't we check that offset in a
>>>> subreg expression is zero? Or is it ok to forward only the upper part
>>>> of a multiplication?
>>>
>>> Could you please post the full RTL instruction we're talking about here as it appears in the scheduler dump?
>>> So that we're all on the same page about which case we're talking about.
>>>
>>
>> Sorry, missed the above instruction.
>> This subreg is just a pre-register allocation representation of the instruction and will go away after reload.
>> This particular function only really has a real effect in post-reload scheduling as it's only there when the final
>> register numbers are known.
>>
>
> I see. aarch_accumulator_forwarding always returns 0 for virtual
> registers. But isn't it overly pessimistic to assume that accumulator
> forwarding is never possible at sched1? I wonder if it would be better
> to be more optimistic about register allocation outcome. I mean, in
> case of virtual registers we could assume forwarding from A to B if B
> is the only consumer of A's result. Something like this:
>
>     if (REGNO (dest) >= FIRST_VIRTUAL_REGISTER
>         || REGNO (accumulator) >= FIRST_VIRTUAL_REGISTER)
>       return (DF_REG_USE_COUNT (REGNO (dest)) == 1)
>         && (DF_REF_INSN (DF_REG_USE_CHAIN (REGNO (dest))) == consumer);
>     else
>       return REGNO (dest) == REGNO (accumulator);
>

Interesting...
As far as I know sched1 tries to minimise live ranges before register allocation rather than trying to
perform the most exact pipeline modelling, since we only know the exact registers used in sched2.

What you're proposing is a heuristic, so it would need benchmarking results and analysis to be considered.

Thanks,
Kyrill


> Thanks,
> Nikolai
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-11-09  9:22 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-28 10:08 [PATCH][ARM/AArch64] PR 68088: Fix RTL checking ICE due to subregs inside accumulator forwarding check Kyrill Tkachov
2015-10-29 13:50 ` Marcus Shawcroft
2015-10-29 14:00   ` Kyrill Tkachov
2015-10-29 14:02     ` Marcus Shawcroft
2015-10-29 14:18       ` Kyrill Tkachov
2015-11-06 11:39         ` Kyrill Tkachov
2015-11-06 11:50         ` Ramana Radhakrishnan
2015-11-06 12:26 ` Nikolai Bozhenov
2015-11-06 13:46   ` Ramana Radhakrishnan
2015-11-06 17:08     ` Nikolai Bozhenov
2015-11-06 17:09       ` Kyrill Tkachov
2015-11-06 17:16         ` Kyrill Tkachov
2015-11-09  8:14           ` Nikolai Bozhenov
2015-11-09  9:22             ` Kyrill Tkachov
2015-11-06 17:32         ` Nikolai Bozhenov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).