public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
@ 2016-11-30 16:47 Kyrill Tkachov
  2016-12-08 11:55 ` Kyrill Tkachov
  2016-12-15 10:00 ` Richard Earnshaw (lists)
  0 siblings, 2 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2016-11-30 16:47 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ramana Radhakrishnan, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 3016 bytes --]

Hi all,

In this awkward ICE we have a *load_multiple pattern that is being transformed in reload from:
(insn 55 67 151 3 (parallel [
             (set (reg:SI 0 r0)
                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
             (set (reg:SI 158 [ c+4 ])
                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
         ]) arm-crash.c:25 393 {*load_multiple}
      (expr_list:REG_UNUSED (reg:SI 0 r0)
         (nil)))


into the invalid:
(insn 55 67 70 3 (parallel [
             (set (reg:SI 0 r0)
                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 S4 A32])
                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
         ]) arm-crash.c:25 393 {*load_multiple}
      (nil))

The operands of *load_multiple are not validated through constraints like LRA is used to, but rather through
a match_parallel predicate which ends up calling ldm_stm_operation_p to validate the multiple sets.
But this means that LRA cannot reason about the constraints properly.
This two-regiseter load should not have used *load_multiple anyway, it should have used *ldm2_ from ldmstm.md
and indeed it did until the loop2_invariant pass which copied the ldm2_ pattern:
(insn 27 23 28 4 (parallel [
             (set (reg:SI 0 r0)
                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
             (set (reg:SI 1 r1)
                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
         ]) "ldm.c":25 385 {*ldm2_}
      (nil))

into:
(insn 55 19 67 3 (parallel [
             (set (reg:SI 0 r0)
                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
             (set (reg:SI 158)
                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
         ]) "ldm.c":25 404 {*load_multiple}
      (expr_list:REG_UNUSED (reg:SI 0 r0)
         (nil)))

Note that it now got recognised as load_multiple because the second register is not a hard register but the pseudo 158.
In any case, the solution suggested in the PR (and I agree with it) is to restrict *load_multiple to after reload.
The similar pattern *load_multiple_with_writeback also has a similar condition and the comment above *load_multiple says that
it's used to generate epilogues, which is done after reload anyway. For pre-reload load-multiples the patterns in ldmstm.md
should do just fine.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Kyrill

2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/71436
     * config/arm/arm.md (*load_multiple): Add reload_completed to
     matching condition.

2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/71436
     * gcc.c-torture/compile/pr71436.c: New test.

[-- Attachment #2: arm-ldm.patch --]
[-- Type: text/x-patch, Size: 1781 bytes --]

commit 996d28e2353badd1b29ef000f94d40c7dab9010f
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date:   Tue Nov 29 15:07:30 2016 +0000

    [ARM] Restrict *load_multiple pattern till after LRA

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 74c44f3..22d2a84 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>"
 
 ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
 ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
+;; The operands are validated through the load_multiple_operation
+;; match_parallel predicate rather than through constraints so enable it only
+;; after reload.
 (define_insn "*load_multiple"
   [(match_parallel 0 "load_multiple_operation"
     [(set (match_operand:SI 2 "s_register_operand" "=rk")
           (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
         ])]
-  "TARGET_32BIT"
+  "TARGET_32BIT && reload_completed"
   "*
   {
     arm_output_multireg_pop (operands, /*return_pc=*/false,
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
new file mode 100644
index 0000000..ab08d5d
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
@@ -0,0 +1,35 @@
+/* PR target/71436.  */
+
+#pragma pack(1)
+struct S0
+{
+  volatile int f0;
+  short f2;
+};
+
+void foo (struct S0 *);
+int a, d;
+static struct S0 b[5];
+static struct S0 c;
+void fn1 ();
+void
+main ()
+{
+  {
+    struct S0 e;
+    for (; d; fn1 ())
+      {
+        {
+          a = 3;
+          for (; a >= 0; a -= 1)
+            {
+              {
+                e = c;
+              }
+              b[a] = e;
+            }
+        }
+      }
+  }
+  foo (b);
+}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-11-30 16:47 [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA Kyrill Tkachov
@ 2016-12-08 11:55 ` Kyrill Tkachov
  2016-12-15  9:55   ` Kyrill Tkachov
  2016-12-15 10:00 ` Richard Earnshaw (lists)
  1 sibling, 1 reply; 11+ messages in thread
From: Kyrill Tkachov @ 2016-12-08 11:55 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ramana Radhakrishnan, Richard Earnshaw

Ping.
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html

Thanks,
Kyrill

On 30/11/16 16:47, Kyrill Tkachov wrote:
> Hi all,
>
> In this awkward ICE we have a *load_multiple pattern that is being transformed in reload from:
> (insn 55 67 151 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 158 [ c+4 ])
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) arm-crash.c:25 393 {*load_multiple}
>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>         (nil)))
>
>
> into the invalid:
> (insn 55 67 70 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 S4 A32])
>                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) arm-crash.c:25 393 {*load_multiple}
>      (nil))
>
> The operands of *load_multiple are not validated through constraints like LRA is used to, but rather through
> a match_parallel predicate which ends up calling ldm_stm_operation_p to validate the multiple sets.
> But this means that LRA cannot reason about the constraints properly.
> This two-regiseter load should not have used *load_multiple anyway, it should have used *ldm2_ from ldmstm.md
> and indeed it did until the loop2_invariant pass which copied the ldm2_ pattern:
> (insn 27 23 28 4 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 1 r1)
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) "ldm.c":25 385 {*ldm2_}
>      (nil))
>
> into:
> (insn 55 19 67 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 158)
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) "ldm.c":25 404 {*load_multiple}
>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>         (nil)))
>
> Note that it now got recognised as load_multiple because the second register is not a hard register but the pseudo 158.
> In any case, the solution suggested in the PR (and I agree with it) is to restrict *load_multiple to after reload.
> The similar pattern *load_multiple_with_writeback also has a similar condition and the comment above *load_multiple says that
> it's used to generate epilogues, which is done after reload anyway. For pre-reload load-multiples the patterns in ldmstm.md
> should do just fine.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     PR target/71436
>     * config/arm/arm.md (*load_multiple): Add reload_completed to
>     matching condition.
>
> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     PR target/71436
>     * gcc.c-torture/compile/pr71436.c: New test.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-12-08 11:55 ` Kyrill Tkachov
@ 2016-12-15  9:55   ` Kyrill Tkachov
  0 siblings, 0 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2016-12-15  9:55 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ramana Radhakrishnan, Richard Earnshaw

Ping.

Thanks,
Kyrill

On 08/12/16 11:55, Kyrill Tkachov wrote:
> Ping.
> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
>
> Thanks,
> Kyrill
>
> On 30/11/16 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this awkward ICE we have a *load_multiple pattern that is being transformed in reload from:
>> (insn 55 67 151 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158 [ c+4 ])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>>
>> into the invalid:
>> (insn 55 67 70 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>>             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>>                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 S4 A32])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (nil))
>>
>> The operands of *load_multiple are not validated through constraints like LRA is used to, but rather through
>> a match_parallel predicate which ends up calling ldm_stm_operation_p to validate the multiple sets.
>> But this means that LRA cannot reason about the constraints properly.
>> This two-regiseter load should not have used *load_multiple anyway, it should have used *ldm2_ from ldmstm.md
>> and indeed it did until the loop2_invariant pass which copied the ldm2_ pattern:
>> (insn 27 23 28 4 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 1 r1)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 385 {*ldm2_}
>>      (nil))
>>
>> into:
>> (insn 55 19 67 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 404 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>> Note that it now got recognised as load_multiple because the second register is not a hard register but the pseudo 158.
>> In any case, the solution suggested in the PR (and I agree with it) is to restrict *load_multiple to after reload.
>> The similar pattern *load_multiple_with_writeback also has a similar condition and the comment above *load_multiple says that
>> it's used to generate epilogues, which is done after reload anyway. For pre-reload load-multiples the patterns in ldmstm.md
>> should do just fine.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
>> Thanks,
>> Kyrill
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * config/arm/arm.md (*load_multiple): Add reload_completed to
>>     matching condition.
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * gcc.c-torture/compile/pr71436.c: New test.
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-11-30 16:47 [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA Kyrill Tkachov
  2016-12-08 11:55 ` Kyrill Tkachov
@ 2016-12-15 10:00 ` Richard Earnshaw (lists)
  2016-12-15 10:06   ` Richard Earnshaw (lists)
  2016-12-15 16:40   ` Kyrill Tkachov
  1 sibling, 2 replies; 11+ messages in thread
From: Richard Earnshaw (lists) @ 2016-12-15 10:00 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches; +Cc: Ramana Radhakrishnan

On 30/11/16 16:47, Kyrill Tkachov wrote:
> Hi all,
> 
> In this awkward ICE we have a *load_multiple pattern that is being
> transformed in reload from:
> (insn 55 67 151 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 158 [ c+4 ])
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) arm-crash.c:25 393 {*load_multiple}
>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>         (nil)))
> 
> 
> into the invalid:
> (insn 55 67 70 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
> S4 A32])
>                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) arm-crash.c:25 393 {*load_multiple}
>      (nil))
> 
> The operands of *load_multiple are not validated through constraints
> like LRA is used to, but rather through
> a match_parallel predicate which ends up calling ldm_stm_operation_p to
> validate the multiple sets.
> But this means that LRA cannot reason about the constraints properly.
> This two-regiseter load should not have used *load_multiple anyway, it
> should have used *ldm2_ from ldmstm.md
> and indeed it did until the loop2_invariant pass which copied the ldm2_
> pattern:
> (insn 27 23 28 4 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 1 r1)
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) "ldm.c":25 385 {*ldm2_}
>      (nil))
> 
> into:
> (insn 55 19 67 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 158)
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) "ldm.c":25 404 {*load_multiple}
>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>         (nil)))
> 
> Note that it now got recognised as load_multiple because the second
> register is not a hard register but the pseudo 158.
> In any case, the solution suggested in the PR (and I agree with it) is
> to restrict *load_multiple to after reload.
> The similar pattern *load_multiple_with_writeback also has a similar
> condition and the comment above *load_multiple says that
> it's used to generate epilogues, which is done after reload anyway. For
> pre-reload load-multiples the patterns in ldmstm.md
> should do just fine.
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> 
> Ok for trunk?
> 

I don't think this is right.  Firstly, these patterns look to me like
the ones used for memcpy expansion, so not recognizing them could lead
to compiler aborts.

Secondly, the bug is when we generate

 (insn 55 67 70 3 (parallel [
             (set (reg:SI 0 r0)
                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
 S4 A32])
                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
         ]) arm-crash.c:25 393 {*load_multiple}
      (nil))

These patterns are supposed to enforce that the load (store) target
register is a hard register that is higher numbered than the hard
register in the memory slot that precedes it (thus satisfying the
constraints for a normal ldm/stm instruction.

The real question is why did ldm_stm_operation_p permit this modified
pattern through, or was the pattern validation bypassed incorrectly by
the loop2 invariant code when it copied the insn and made changes?

R.

> Thanks,
> Kyrill
> 
> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR target/71436
>     * config/arm/arm.md (*load_multiple): Add reload_completed to
>     matching condition.
> 
> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR target/71436
>     * gcc.c-torture/compile/pr71436.c: New test.
> 
> arm-ldm.patch
> 
> 
> commit 996d28e2353badd1b29ef000f94d40c7dab9010f
> Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> Date:   Tue Nov 29 15:07:30 2016 +0000
> 
>     [ARM] Restrict *load_multiple pattern till after LRA
> 
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 74c44f3..22d2a84 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>"
>  
>  ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
>  ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
> +;; The operands are validated through the load_multiple_operation
> +;; match_parallel predicate rather than through constraints so enable it only
> +;; after reload.
>  (define_insn "*load_multiple"
>    [(match_parallel 0 "load_multiple_operation"
>      [(set (match_operand:SI 2 "s_register_operand" "=rk")
>            (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
>          ])]
> -  "TARGET_32BIT"
> +  "TARGET_32BIT && reload_completed"
>    "*
>    {
>      arm_output_multireg_pop (operands, /*return_pc=*/false,
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
> new file mode 100644
> index 0000000..ab08d5d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
> @@ -0,0 +1,35 @@
> +/* PR target/71436.  */
> +
> +#pragma pack(1)
> +struct S0
> +{
> +  volatile int f0;
> +  short f2;
> +};
> +
> +void foo (struct S0 *);
> +int a, d;
> +static struct S0 b[5];
> +static struct S0 c;
> +void fn1 ();
> +void
> +main ()
> +{
> +  {
> +    struct S0 e;
> +    for (; d; fn1 ())
> +      {
> +        {
> +          a = 3;
> +          for (; a >= 0; a -= 1)
> +            {
> +              {
> +                e = c;
> +              }
> +              b[a] = e;
> +            }
> +        }
> +      }
> +  }
> +  foo (b);
> +}
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-12-15 10:00 ` Richard Earnshaw (lists)
@ 2016-12-15 10:06   ` Richard Earnshaw (lists)
  2016-12-19 15:04     ` Jakub Jelinek
  2016-12-15 16:40   ` Kyrill Tkachov
  1 sibling, 1 reply; 11+ messages in thread
From: Richard Earnshaw (lists) @ 2016-12-15 10:06 UTC (permalink / raw)
  To: Kyrill Tkachov, GCC Patches; +Cc: Ramana Radhakrishnan

On 15/12/16 09:55, Richard Earnshaw (lists) wrote:
> On 30/11/16 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this awkward ICE we have a *load_multiple pattern that is being
>> transformed in reload from:
>> (insn 55 67 151 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158 [ c+4 ])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>>
>> into the invalid:
>> (insn 55 67 70 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>>             (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>>                         (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>> S4 A32])
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) arm-crash.c:25 393 {*load_multiple}
>>      (nil))
>>
>> The operands of *load_multiple are not validated through constraints
>> like LRA is used to, but rather through
>> a match_parallel predicate which ends up calling ldm_stm_operation_p to
>> validate the multiple sets.
>> But this means that LRA cannot reason about the constraints properly.
>> This two-regiseter load should not have used *load_multiple anyway, it
>> should have used *ldm2_ from ldmstm.md
>> and indeed it did until the loop2_invariant pass which copied the ldm2_
>> pattern:
>> (insn 27 23 28 4 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 1 r1)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 385 {*ldm2_}
>>      (nil))
>>
>> into:
>> (insn 55 19 67 3 (parallel [
>>             (set (reg:SI 0 r0)
>>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>             (set (reg:SI 158)
>>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>         ]) "ldm.c":25 404 {*load_multiple}
>>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>>         (nil)))
>>
>> Note that it now got recognised as load_multiple because the second
>> register is not a hard register but the pseudo 158.
>> In any case, the solution suggested in the PR (and I agree with it) is
>> to restrict *load_multiple to after reload.
>> The similar pattern *load_multiple_with_writeback also has a similar
>> condition and the comment above *load_multiple says that
>> it's used to generate epilogues, which is done after reload anyway. For
>> pre-reload load-multiples the patterns in ldmstm.md
>> should do just fine.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
> 
> I don't think this is right.  Firstly, these patterns look to me like
> the ones used for memcpy expansion, so not recognizing them could lead
> to compiler aborts.
> 
> Secondly, the bug is when we generate
> 
>  (insn 55 67 70 3 (parallel [
>              (set (reg:SI 0 r0)
>                  (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>              (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>                          (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>  S4 A32])
>                  (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>          ]) arm-crash.c:25 393 {*load_multiple}
>       (nil))

sorry, pasted the wrong bit of code.

That should read when we generate:

(insn 55 19 67 3 (parallel [
            (set (reg:SI 0 r0)
                (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
            (set (reg:SI 158)
                (mem/u/c:SI (plus:SI (reg/f:SI 147)
                        (const_int 4 [0x4])) [2 c+4 S4 A32]))
        ]) "ldm.c":25 404 {*load_multiple}
     (expr_list:REG_UNUSED (reg:SI 0 r0)
        (nil)))

ie when we put a pseudo into the register load list.

> 
> These patterns are supposed to enforce that the load (store) target
> register is a hard register that is higher numbered than the hard
> register in the memory slot that precedes it (thus satisfying the
> constraints for a normal ldm/stm instruction.
> 
> The real question is why did ldm_stm_operation_p permit this modified
> pattern through, or was the pattern validation bypassed incorrectly by
> the loop2 invariant code when it copied the insn and made changes?
> 
> R.
> 
>> Thanks,
>> Kyrill
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * config/arm/arm.md (*load_multiple): Add reload_completed to
>>     matching condition.
>>
>> 2016-11-30  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>     PR target/71436
>>     * gcc.c-torture/compile/pr71436.c: New test.
>>
>> arm-ldm.patch
>>
>>
>> commit 996d28e2353badd1b29ef000f94d40c7dab9010f
>> Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>> Date:   Tue Nov 29 15:07:30 2016 +0000
>>
>>     [ARM] Restrict *load_multiple pattern till after LRA
>>
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index 74c44f3..22d2a84 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>"
>>  
>>  ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern covers
>>  ;; large lists without explicit writeback generated for APCS_FRAME epilogue.
>> +;; The operands are validated through the load_multiple_operation
>> +;; match_parallel predicate rather than through constraints so enable it only
>> +;; after reload.
>>  (define_insn "*load_multiple"
>>    [(match_parallel 0 "load_multiple_operation"
>>      [(set (match_operand:SI 2 "s_register_operand" "=rk")
>>            (mem:SI (match_operand:SI 1 "s_register_operand" "rk")))
>>          ])]
>> -  "TARGET_32BIT"
>> +  "TARGET_32BIT && reload_completed"
>>    "*
>>    {
>>      arm_output_multireg_pop (operands, /*return_pc=*/false,
>> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> new file mode 100644
>> index 0000000..ab08d5d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
>> @@ -0,0 +1,35 @@
>> +/* PR target/71436.  */
>> +
>> +#pragma pack(1)
>> +struct S0
>> +{
>> +  volatile int f0;
>> +  short f2;
>> +};
>> +
>> +void foo (struct S0 *);
>> +int a, d;
>> +static struct S0 b[5];
>> +static struct S0 c;
>> +void fn1 ();
>> +void
>> +main ()
>> +{
>> +  {
>> +    struct S0 e;
>> +    for (; d; fn1 ())
>> +      {
>> +        {
>> +          a = 3;
>> +          for (; a >= 0; a -= 1)
>> +            {
>> +              {
>> +                e = c;
>> +              }
>> +              b[a] = e;
>> +            }
>> +        }
>> +      }
>> +  }
>> +  foo (b);
>> +}
>>
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-12-15 10:00 ` Richard Earnshaw (lists)
  2016-12-15 10:06   ` Richard Earnshaw (lists)
@ 2016-12-15 16:40   ` Kyrill Tkachov
  1 sibling, 0 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2016-12-15 16:40 UTC (permalink / raw)
  To: Richard Earnshaw (lists), GCC Patches; +Cc: Ramana Radhakrishnan

[-- Attachment #1: Type: text/plain, Size: 4859 bytes --]


On 15/12/16 09:55, Richard Earnshaw (lists) wrote:
> On 30/11/16 16:47, Kyrill Tkachov wrote:
>> Hi all,
>>
>> In this awkward ICE we have a *load_multiple pattern that is being
>> transformed in reload from:
>> (insn 55 67 151 3 (parallel [
>>              (set (reg:SI 0 r0)
>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>              (set (reg:SI 158 [ c+4 ])
>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>          ]) arm-crash.c:25 393 {*load_multiple}
>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>          (nil)))
>>
>>
>> into the invalid:
>> (insn 55 67 70 3 (parallel [
>>              (set (reg:SI 0 r0)
>>                  (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>>              (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>>                          (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>> S4 A32])
>>                  (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>          ]) arm-crash.c:25 393 {*load_multiple}
>>       (nil))
>>
>> The operands of *load_multiple are not validated through constraints
>> like LRA is used to, but rather through
>> a match_parallel predicate which ends up calling ldm_stm_operation_p to
>> validate the multiple sets.
>> But this means that LRA cannot reason about the constraints properly.
>> This two-regiseter load should not have used *load_multiple anyway, it
>> should have used *ldm2_ from ldmstm.md
>> and indeed it did until the loop2_invariant pass which copied the ldm2_
>> pattern:
>> (insn 27 23 28 4 (parallel [
>>              (set (reg:SI 0 r0)
>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>              (set (reg:SI 1 r1)
>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>          ]) "ldm.c":25 385 {*ldm2_}
>>       (nil))
>>
>> into:
>> (insn 55 19 67 3 (parallel [
>>              (set (reg:SI 0 r0)
>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>              (set (reg:SI 158)
>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>          ]) "ldm.c":25 404 {*load_multiple}
>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>          (nil)))
>>
>> Note that it now got recognised as load_multiple because the second
>> register is not a hard register but the pseudo 158.
>> In any case, the solution suggested in the PR (and I agree with it) is
>> to restrict *load_multiple to after reload.
>> The similar pattern *load_multiple_with_writeback also has a similar
>> condition and the comment above *load_multiple says that
>> it's used to generate epilogues, which is done after reload anyway. For
>> pre-reload load-multiples the patterns in ldmstm.md
>> should do just fine.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>>
> I don't think this is right.  Firstly, these patterns look to me like
> the ones used for memcpy expansion, so not recognizing them could lead
> to compiler aborts.

I think the other patterns in ldmstm.md would catch these anyway but...


> Secondly, the bug is when we generate
>
>   (insn 55 67 70 3 (parallel [
>               (set (reg:SI 0 r0)
>                   (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32]))
>               (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp)
>                           (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12
>   S4 A32])
>                   (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147])
>                           (const_int 4 [0x4])) [2 c+4 S4 A32]))
>           ]) arm-crash.c:25 393 {*load_multiple}
>        (nil))
>
> These patterns are supposed to enforce that the load (store) target
> register is a hard register that is higher numbered than the hard
> register in the memory slot that precedes it (thus satisfying the
> constraints for a normal ldm/stm instruction.
>
> The real question is why did ldm_stm_operation_p permit this modified
> pattern through, or was the pattern validation bypassed incorrectly by
> the loop2 invariant code when it copied the insn and made changes?

... ldm_stm_operation_p doesn't check that the registers are hard registers,
which is an implicit requirement since it validates their ascending order.
So the safe easy fix is to require hard registers.

This patch does that.
Bootstrapped and tested on arm-none-linux-gnueabihf.
Ok for trunk?

Thanks,
Kyrill

2016-12-15  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/71436
     * config/arm/arm.c (ldm_stm_operation_p): Check that last register
     in the list is a hard reg.

2016-12-15  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     PR target/71436
     * gcc.c-torture/compile/pr71436.c: New test.

[-- Attachment #2: arm-ldm.patch --]
[-- Type: text/x-patch, Size: 1498 bytes --]

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 437da6fe3d34978e7a3a72f7ec39dc76a54d6408..b6c5fdf42ee359a0c0b0fab8de4d374ff39cd35b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12709,6 +12709,13 @@ ldm_stm_operation_p (rtx op, bool load, machine_mode mode,
         addr_reg_in_reglist = true;
     }
 
+  /* The ascending register number requirement only makes sense when dealing
+     with hard registers.  The code above already validated the ascending
+     requirement so we only need to validate the "hardness" of the last
+     register in the list.  */
+  if (regno >= FIRST_PSEUDO_REGISTER)
+    return false;
+
   if (load)
     {
       if (update && addr_reg_in_reglist)
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
new file mode 100644
index 0000000000000000000000000000000000000000..ab08d5d369c769fc9e411238fcf1379705e576f2
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c
@@ -0,0 +1,35 @@
+/* PR target/71436.  */
+
+#pragma pack(1)
+struct S0
+{
+  volatile int f0;
+  short f2;
+};
+
+void foo (struct S0 *);
+int a, d;
+static struct S0 b[5];
+static struct S0 c;
+void fn1 ();
+void
+main ()
+{
+  {
+    struct S0 e;
+    for (; d; fn1 ())
+      {
+        {
+          a = 3;
+          for (; a >= 0; a -= 1)
+            {
+              {
+                e = c;
+              }
+              b[a] = e;
+            }
+        }
+      }
+  }
+  foo (b);
+}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-12-15 10:06   ` Richard Earnshaw (lists)
@ 2016-12-19 15:04     ` Jakub Jelinek
  2017-01-18  9:53       ` Kyrill Tkachov
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2016-12-19 15:04 UTC (permalink / raw)
  To: Richard Earnshaw (lists)
  Cc: Kyrill Tkachov, GCC Patches, Ramana Radhakrishnan

On Thu, Dec 15, 2016 at 10:00:14AM +0000, Richard Earnshaw (lists) wrote:
> sorry, pasted the wrong bit of code.
> 
> That should read when we generate:
> 
> (insn 55 19 67 3 (parallel [
>             (set (reg:SI 0 r0)
>                 (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>             (set (reg:SI 158)
>                 (mem/u/c:SI (plus:SI (reg/f:SI 147)
>                         (const_int 4 [0x4])) [2 c+4 S4 A32]))
>         ]) "ldm.c":25 404 {*load_multiple}
>      (expr_list:REG_UNUSED (reg:SI 0 r0)
>         (nil)))
> 
> ie when we put a pseudo into the register load list.

We put a pseudo there because the predicate on the insn allows it:
(define_special_predicate "load_multiple_operation"
  (match_code "parallel")
{
 return ldm_stm_operation_p (op, /*load=*/true, SImode,
                                 /*consecutive=*/false,
                                 /*return_pc=*/false);
})
and the consecutive = false argument says that (almost) no verification
is performed on the SET_DEST, just that it is a REG and doesn't have
REGNO smaller than the first reg.
That said, RA is still not able to cope with such instructions, because
only the first set is represented with constraints, so if such an insn
needs any kind of reloading, it just will not happen.
So I think the posted patch makes lots of sense, otherwise if you use
such a pattern before reload, you just have to hope no reloading will be
needed on it.

	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2016-12-19 15:04     ` Jakub Jelinek
@ 2017-01-18  9:53       ` Kyrill Tkachov
  2017-02-07 14:50         ` Kyrill Tkachov
  0 siblings, 1 reply; 11+ messages in thread
From: Kyrill Tkachov @ 2017-01-18  9:53 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Earnshaw (lists); +Cc: GCC Patches, Ramana Radhakrishnan


On 19/12/16 14:53, Jakub Jelinek wrote:
> On Thu, Dec 15, 2016 at 10:00:14AM +0000, Richard Earnshaw (lists) wrote:
>> sorry, pasted the wrong bit of code.
>>
>> That should read when we generate:
>>
>> (insn 55 19 67 3 (parallel [
>>              (set (reg:SI 0 r0)
>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>              (set (reg:SI 158)
>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>          ]) "ldm.c":25 404 {*load_multiple}
>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>          (nil)))
>>
>> ie when we put a pseudo into the register load list.
> We put a pseudo there because the predicate on the insn allows it:
> (define_special_predicate "load_multiple_operation"
>    (match_code "parallel")
> {
>   return ldm_stm_operation_p (op, /*load=*/true, SImode,
>                                   /*consecutive=*/false,
>                                   /*return_pc=*/false);
> })
> and the consecutive = false argument says that (almost) no verification
> is performed on the SET_DEST, just that it is a REG and doesn't have
> REGNO smaller than the first reg.
> That said, RA is still not able to cope with such instructions, because
> only the first set is represented with constraints, so if such an insn
> needs any kind of reloading, it just will not happen.
> So I think the posted patch makes lots of sense, otherwise if you use
> such a pattern before reload, you just have to hope no reloading will be
> needed on it.

In that case I believe restricting the pattern till after LRA is the way to go.
However, we should still add the check from [1] to ldm_stm_operation_p for when
'consecutive' is true as consecutive order has no useful meaning on pseudos here.

Thanks,
Kyrill

[1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html

>
> 	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2017-01-18  9:53       ` Kyrill Tkachov
@ 2017-02-07 14:50         ` Kyrill Tkachov
  2017-03-14 14:07           ` Kyrill Tkachov
  2017-03-23  9:05           ` Ramana Radhakrishnan
  0 siblings, 2 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2017-02-07 14:50 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Earnshaw (lists); +Cc: GCC Patches, Ramana Radhakrishnan


On 18/01/17 09:49, Kyrill Tkachov wrote:
>
> On 19/12/16 14:53, Jakub Jelinek wrote:
>> On Thu, Dec 15, 2016 at 10:00:14AM +0000, Richard Earnshaw (lists) wrote:
>>> sorry, pasted the wrong bit of code.
>>>
>>> That should read when we generate:
>>>
>>> (insn 55 19 67 3 (parallel [
>>>              (set (reg:SI 0 r0)
>>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>>              (set (reg:SI 158)
>>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>>          ]) "ldm.c":25 404 {*load_multiple}
>>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>>          (nil)))
>>>
>>> ie when we put a pseudo into the register load list.
>> We put a pseudo there because the predicate on the insn allows it:
>> (define_special_predicate "load_multiple_operation"
>>    (match_code "parallel")
>> {
>>   return ldm_stm_operation_p (op, /*load=*/true, SImode,
>>                                   /*consecutive=*/false,
>>                                   /*return_pc=*/false);
>> })
>> and the consecutive = false argument says that (almost) no verification
>> is performed on the SET_DEST, just that it is a REG and doesn't have
>> REGNO smaller than the first reg.
>> That said, RA is still not able to cope with such instructions, because
>> only the first set is represented with constraints, so if such an insn
>> needs any kind of reloading, it just will not happen.
>> So I think the posted patch makes lots of sense, otherwise if you use
>> such a pattern before reload, you just have to hope no reloading will be
>> needed on it.
>
> In that case I believe restricting the pattern till after LRA is the way to go.
> However, we should still add the check from [1] to ldm_stm_operation_p for when
> 'consecutive' is true as consecutive order has no useful meaning on pseudos here.
>

In the interest of fixing this PR I think the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
is the way to go, that is disable this pattern until after reload.
It was intended to handle epilogue pops that is generated after reload anyway and all the general load/store multiple
patterns are handled by ldmstmd.md so not having this before reload is not risky.

I'll also propose a variant of https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html to tighten up the consecutivity
check when 'consecutive' is passed down as true, but that is indepdendent of this PR.

Is the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html ok for trunk?
It's been in my tree for a couple of months now getting testing and there's been no fallout.

Thanks,
Kyrill

>
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html
>
>>
>>     Jakub
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2017-02-07 14:50         ` Kyrill Tkachov
@ 2017-03-14 14:07           ` Kyrill Tkachov
  2017-03-23  9:05           ` Ramana Radhakrishnan
  1 sibling, 0 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2017-03-14 14:07 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Earnshaw (lists); +Cc: GCC Patches, Ramana Radhakrishnan


On 07/02/17 14:49, Kyrill Tkachov wrote:
>
> On 18/01/17 09:49, Kyrill Tkachov wrote:
>>
>> On 19/12/16 14:53, Jakub Jelinek wrote:
>>> On Thu, Dec 15, 2016 at 10:00:14AM +0000, Richard Earnshaw (lists) wrote:
>>>> sorry, pasted the wrong bit of code.
>>>>
>>>> That should read when we generate:
>>>>
>>>> (insn 55 19 67 3 (parallel [
>>>>              (set (reg:SI 0 r0)
>>>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>>>              (set (reg:SI 158)
>>>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>>>          ]) "ldm.c":25 404 {*load_multiple}
>>>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>>>          (nil)))
>>>>
>>>> ie when we put a pseudo into the register load list.
>>> We put a pseudo there because the predicate on the insn allows it:
>>> (define_special_predicate "load_multiple_operation"
>>>    (match_code "parallel")
>>> {
>>>   return ldm_stm_operation_p (op, /*load=*/true, SImode,
>>>                                   /*consecutive=*/false,
>>>                                   /*return_pc=*/false);
>>> })
>>> and the consecutive = false argument says that (almost) no verification
>>> is performed on the SET_DEST, just that it is a REG and doesn't have
>>> REGNO smaller than the first reg.
>>> That said, RA is still not able to cope with such instructions, because
>>> only the first set is represented with constraints, so if such an insn
>>> needs any kind of reloading, it just will not happen.
>>> So I think the posted patch makes lots of sense, otherwise if you use
>>> such a pattern before reload, you just have to hope no reloading will be
>>> needed on it.
>>
>> In that case I believe restricting the pattern till after LRA is the way to go.
>> However, we should still add the check from [1] to ldm_stm_operation_p for when
>> 'consecutive' is true as consecutive order has no useful meaning on pseudos here.
>>
>
> In the interest of fixing this PR I think the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
> is the way to go, that is disable this pattern until after reload.
> It was intended to handle epilogue pops that is generated after reload anyway and all the general load/store multiple
> patterns are handled by ldmstmd.md so not having this before reload is not risky.
>
> I'll also propose a variant of https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html to tighten up the consecutivity
> check when 'consecutive' is passed down as true, but that is indepdendent of this PR.
>
> Is the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html ok for trunk?
> It's been in my tree for a couple of months now getting testing and there's been no fallout.
>

Ping.
Thanks,
Kyrill

> Thanks,
> Kyrill
>
>>
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html
>>
>>>
>>>     Jakub
>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA
  2017-02-07 14:50         ` Kyrill Tkachov
  2017-03-14 14:07           ` Kyrill Tkachov
@ 2017-03-23  9:05           ` Ramana Radhakrishnan
  1 sibling, 0 replies; 11+ messages in thread
From: Ramana Radhakrishnan @ 2017-03-23  9:05 UTC (permalink / raw)
  To: Kyrill Tkachov, Jakub Jelinek, Richard Earnshaw (lists)
  Cc: GCC Patches, Ramana Radhakrishnan

On 07/02/17 14:49, Kyrill Tkachov wrote:
>
> On 18/01/17 09:49, Kyrill Tkachov wrote:
>>
>> On 19/12/16 14:53, Jakub Jelinek wrote:
>>> On Thu, Dec 15, 2016 at 10:00:14AM +0000, Richard Earnshaw (lists)
>>> wrote:
>>>> sorry, pasted the wrong bit of code.
>>>>
>>>> That should read when we generate:
>>>>
>>>> (insn 55 19 67 3 (parallel [
>>>>              (set (reg:SI 0 r0)
>>>>                  (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32]))
>>>>              (set (reg:SI 158)
>>>>                  (mem/u/c:SI (plus:SI (reg/f:SI 147)
>>>>                          (const_int 4 [0x4])) [2 c+4 S4 A32]))
>>>>          ]) "ldm.c":25 404 {*load_multiple}
>>>>       (expr_list:REG_UNUSED (reg:SI 0 r0)
>>>>          (nil)))
>>>>
>>>> ie when we put a pseudo into the register load list.
>>> We put a pseudo there because the predicate on the insn allows it:
>>> (define_special_predicate "load_multiple_operation"
>>>    (match_code "parallel")
>>> {
>>>   return ldm_stm_operation_p (op, /*load=*/true, SImode,
>>>                                   /*consecutive=*/false,
>>>                                   /*return_pc=*/false);
>>> })
>>> and the consecutive = false argument says that (almost) no verification
>>> is performed on the SET_DEST, just that it is a REG and doesn't have
>>> REGNO smaller than the first reg.
>>> That said, RA is still not able to cope with such instructions, because
>>> only the first set is represented with constraints, so if such an insn
>>> needs any kind of reloading, it just will not happen.
>>> So I think the posted patch makes lots of sense, otherwise if you use
>>> such a pattern before reload, you just have to hope no reloading will be
>>> needed on it.
>>
>> In that case I believe restricting the pattern till after LRA is the
>> way to go.
>> However, we should still add the check from [1] to ldm_stm_operation_p
>> for when
>> 'consecutive' is true as consecutive order has no useful meaning on
>> pseudos here.
>>
>
> In the interest of fixing this PR I think the patch at
> https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
> is the way to go, that is disable this pattern until after reload.
> It was intended to handle epilogue pops that is generated after reload
> anyway and all the general load/store multiple
> patterns are handled by ldmstmd.md so not having this before reload is
> not risky.
>
> I'll also propose a variant of
> https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html to tighten up
> the consecutivity
> check when 'consecutive' is passed down as true, but that is
> indepdendent of this PR.
>
> Is the patch at https://gcc.gnu.org/ml/gcc-patches/2016-11/msg03078.html
> ok for trunk?
> It's been in my tree for a couple of months now getting testing and
> there's been no fallout.
>


We shouldn't generate any ldm / stm operations for anything other than 
memcpy before reload and all of those are capped at 4 integer registers 
(MAX_LDM_STM_OPS in arm.h) at a time and that will be handled almost 
entirely by the patterns in ldmstm.md. The way to generate this 
generically this would be through movmem expanders or through the 
load_multiple expander in the backend. However all of this falls through 
the arm_gen_load_multiple interface all of which goes through 
arm_gen_multiple_op and we have an assert in arm_gen_multiple_op that 
the number of registers is <= MAX_LDM_STM_OPS.

Thus OK, please apply but be aware of any fallout

Thanks,
Ramana


> Thanks,
> Kyrill
>
>>
>>
>> [1] https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01400.html
>>
>>>
>>>     Jakub
>>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-03-23  9:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-30 16:47 [PATCH][ARM] PR target/71436: Restrict *load_multiple pattern till after LRA Kyrill Tkachov
2016-12-08 11:55 ` Kyrill Tkachov
2016-12-15  9:55   ` Kyrill Tkachov
2016-12-15 10:00 ` Richard Earnshaw (lists)
2016-12-15 10:06   ` Richard Earnshaw (lists)
2016-12-19 15:04     ` Jakub Jelinek
2017-01-18  9:53       ` Kyrill Tkachov
2017-02-07 14:50         ` Kyrill Tkachov
2017-03-14 14:07           ` Kyrill Tkachov
2017-03-23  9:05           ` Ramana Radhakrishnan
2016-12-15 16:40   ` Kyrill Tkachov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).