public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
@ 2020-09-10  3:44 Kito Cheng
  2020-09-10  7:03 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Kito Cheng @ 2020-09-10  3:44 UTC (permalink / raw)
  To: gcc-patches, kito.cheng, rguenther; +Cc: Kito Cheng

In g:70cdb21e579191fe9f0f1d45e328908e59c0179e, DECL/global variable has handled
misaligned stores, but it didn't handle PARALLEL values, and I refer the
other part of this function, I found the PARALLEL need handled by
emit_group_* functions, so I add a check, and using emit_group_store if
storing a PARALLEL value, also checked this change didn't break the
testcase(gcc.target/arm/unaligned-argument-3.c) added by the orginal changes.

For riscv64 target, struct S {int a; double b;} will pack into a parallel
value to return and it has TImode when misaligned access is supported,
however TImode required 16-byte align, but it only 8-byte align, so it go to
the misaligned stores handling, then it will try to generate move
instruction from a PARALLEL value.

Tested on following target without introduced new reguression:
  - riscv32/riscv64 elf
  - x86_64-linux
  - arm-eabi

gcc/ChangeLog:

	PR target/96759
	* expr.c (expand_assignment): Handle misaligned stores with PARALLEL
	value.

gcc/testsuite/ChangeLog:

	PR target/96759
	* g++.target/riscv/pr96759.C: New.
	* gcc.target/riscv/pr96759.c: New.
---
 gcc/expr.c                               |  9 ++++++++-
 gcc/testsuite/g++.target/riscv/pr96759.C |  8 ++++++++
 gcc/testsuite/gcc.target/riscv/pr96759.c | 13 +++++++++++++
 3 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/pr96759.C
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr96759.c

diff --git a/gcc/expr.c b/gcc/expr.c
index 1a15f24b3979..cc2bc52c457f 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -5173,7 +5173,14 @@ expand_assignment (tree to, tree from, bool nontemporal)
       if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to))
 	reg = flip_storage_order (mode, reg);
 
-      if (icode != CODE_FOR_nothing)
+      if (GET_CODE (reg) == PARALLEL)
+	{
+	  push_temp_slots ();
+	  emit_group_store (mem, reg, TREE_TYPE (from),
+			    int_size_in_bytes (TREE_TYPE (from)));
+	  pop_temp_slots ();
+	}
+      else if (icode != CODE_FOR_nothing)
 	{
 	  class expand_operand ops[2];
 
diff --git a/gcc/testsuite/g++.target/riscv/pr96759.C b/gcc/testsuite/g++.target/riscv/pr96759.C
new file mode 100644
index 000000000000..673999a4baf7
--- /dev/null
+++ b/gcc/testsuite/g++.target/riscv/pr96759.C
@@ -0,0 +1,8 @@
+/* { dg-options "-mno-strict-align -std=gnu++17" } */
+/* { dg-do compile } */
+struct S {
+  int a;
+  double b;
+};
+S GetNumbers();
+auto [globalC, globalD] = GetNumbers();
diff --git a/gcc/testsuite/gcc.target/riscv/pr96759.c b/gcc/testsuite/gcc.target/riscv/pr96759.c
new file mode 100644
index 000000000000..621c39196fca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr96759.c
@@ -0,0 +1,13 @@
+/* { dg-options "-mno-strict-align" } */
+/* { dg-do compile } */
+
+struct S {
+  int a;
+  double b;
+};
+struct S GetNumbers();
+struct S g;
+
+void foo(){
+  g = GetNumbers();
+}
-- 
2.28.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10  3:44 [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values Kito Cheng
@ 2020-09-10  7:03 ` Richard Biener
  2020-09-10  8:18   ` Kito Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2020-09-10  7:03 UTC (permalink / raw)
  To: Kito Cheng; +Cc: gcc-patches, kito.cheng

On Thu, 10 Sep 2020, Kito Cheng wrote:

> In g:70cdb21e579191fe9f0f1d45e328908e59c0179e, DECL/global variable has handled
> misaligned stores, but it didn't handle PARALLEL values, and I refer the
> other part of this function, I found the PARALLEL need handled by
> emit_group_* functions, so I add a check, and using emit_group_store if
> storing a PARALLEL value, also checked this change didn't break the
> testcase(gcc.target/arm/unaligned-argument-3.c) added by the orginal changes.
> 
> For riscv64 target, struct S {int a; double b;} will pack into a parallel
> value to return and it has TImode when misaligned access is supported,
> however TImode required 16-byte align, but it only 8-byte align, so it go to
> the misaligned stores handling, then it will try to generate move
> instruction from a PARALLEL value.
> 
> Tested on following target without introduced new reguression:
>   - riscv32/riscv64 elf
>   - x86_64-linux
>   - arm-eabi

riscv doesn't seem to have movmisalign and I don't see why movmisalign
should not support a TImode parallel RHS so at least you should
put this check after the icode != CODE_FOR_nothing check?

Also wouldn't it be better to support PARALLEL from within
store_bit_field?  After all this is a misaligned access and
since you didn't quote the RTL involved I'm guessing that
emit_group_store knows nothing about this fact.

Richard.

> gcc/ChangeLog:
> 
> 	PR target/96759
> 	* expr.c (expand_assignment): Handle misaligned stores with PARALLEL
> 	value.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR target/96759
> 	* g++.target/riscv/pr96759.C: New.
> 	* gcc.target/riscv/pr96759.c: New.
> ---
>  gcc/expr.c                               |  9 ++++++++-
>  gcc/testsuite/g++.target/riscv/pr96759.C |  8 ++++++++
>  gcc/testsuite/gcc.target/riscv/pr96759.c | 13 +++++++++++++
>  3 files changed, 29 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.target/riscv/pr96759.C
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr96759.c
> 
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 1a15f24b3979..cc2bc52c457f 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -5173,7 +5173,14 @@ expand_assignment (tree to, tree from, bool nontemporal)
>        if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to))
>  	reg = flip_storage_order (mode, reg);
>  
> -      if (icode != CODE_FOR_nothing)
> +      if (GET_CODE (reg) == PARALLEL)
> +	{
> +	  push_temp_slots ();
> +	  emit_group_store (mem, reg, TREE_TYPE (from),
> +			    int_size_in_bytes (TREE_TYPE (from)));
> +	  pop_temp_slots ();
> +	}
> +      else if (icode != CODE_FOR_nothing)
>  	{
>  	  class expand_operand ops[2];
>  
> diff --git a/gcc/testsuite/g++.target/riscv/pr96759.C b/gcc/testsuite/g++.target/riscv/pr96759.C
> new file mode 100644
> index 000000000000..673999a4baf7
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/riscv/pr96759.C
> @@ -0,0 +1,8 @@
> +/* { dg-options "-mno-strict-align -std=gnu++17" } */
> +/* { dg-do compile } */
> +struct S {
> +  int a;
> +  double b;
> +};
> +S GetNumbers();
> +auto [globalC, globalD] = GetNumbers();
> diff --git a/gcc/testsuite/gcc.target/riscv/pr96759.c b/gcc/testsuite/gcc.target/riscv/pr96759.c
> new file mode 100644
> index 000000000000..621c39196fca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr96759.c
> @@ -0,0 +1,13 @@
> +/* { dg-options "-mno-strict-align" } */
> +/* { dg-do compile } */
> +
> +struct S {
> +  int a;
> +  double b;
> +};
> +struct S GetNumbers();
> +struct S g;
> +
> +void foo(){
> +  g = GetNumbers();
> +}
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10  7:03 ` Richard Biener
@ 2020-09-10  8:18   ` Kito Cheng
  2020-09-10  9:09     ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Kito Cheng @ 2020-09-10  8:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Kito Cheng

Hi Richard:

Thanks for your review :)

> riscv doesn't seem to have movmisalign and I don't see why movmisalign
> should not support a TImode parallel RHS so at least you should
> put this check after the icode != CODE_FOR_nothing check?

RISC-V has an option `-mno-strict-align` to enable mis-aligned access,
but we didn't provide movmisalign pattern, I guess might be another issue,
I tried to add movmisalign pattern and then it will feed PARALLEL into that,
and got into a similar situation again, most targets are not except
got PARALLEL,
so that's the reason why I put before icode != CODE_FOR_nothing check.

> Also wouldn't it be better to support PARALLEL from within
> store_bit_field?
For the above reason, I think store_bit_field supports PARALLEL is  not enough.

> After all this is a misaligned access and
> since you didn't quote the RTL involved I'm guessing that
> emit_group_store knows nothing about this fact.

Oh, good point, how do you think about loading into temp and then
storing it in the DECL, so that we could use original logic to
handle misaligned store.

A PoC for this idea, didn't fully tested yet:

diff --git a/gcc/expr.c b/gcc/expr.c
index 1a15f24b3979..6ef97740c764 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -5166,8 +5166,19 @@ expand_assignment (tree to, tree from, bool nontemporal)
         || targetm.slow_unaligned_access (mode, align)))
    {
      rtx reg, mem;
+      push_temp_slots ();

      reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+
+      if (GET_CODE (reg) == PARALLEL)
+       {
+         rtx temp = assign_stack_temp (mode,
+                                       GET_MODE_SIZE (mode));
+         emit_group_store (temp, reg, TREE_TYPE (from),
+                          int_size_in_bytes (TREE_TYPE (from)));
+         reg = temp;
+       }
+
      reg = force_not_mem (reg);
      mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE);
      if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to))
@@ -5186,6 +5197,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
      else
       store_bit_field (mem, GET_MODE_BITSIZE (mode), 0, 0, 0, mode, reg,
                        false);
+      pop_temp_slots ();
      return;
    }

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10  8:18   ` Kito Cheng
@ 2020-09-10  9:09     ` Richard Biener
  2020-09-10  9:39       ` Eric Botcazou
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2020-09-10  9:09 UTC (permalink / raw)
  To: Kito Cheng; +Cc: GCC Patches, Kito Cheng, ebotcazou

On Thu, 10 Sep 2020, Kito Cheng wrote:

> Hi Richard:
> 
> Thanks for your review :)
> 
> > riscv doesn't seem to have movmisalign and I don't see why movmisalign
> > should not support a TImode parallel RHS so at least you should
> > put this check after the icode != CODE_FOR_nothing check?
> 
> RISC-V has an option `-mno-strict-align` to enable mis-aligned access,
> but we didn't provide movmisalign pattern, I guess might be another issue,
> I tried to add movmisalign pattern and then it will feed PARALLEL into that,
> and got into a similar situation again, most targets are not except
> got PARALLEL,
> so that's the reason why I put before icode != CODE_FOR_nothing check.
> 
> > Also wouldn't it be better to support PARALLEL from within
> > store_bit_field?
> For the above reason, I think store_bit_field supports PARALLEL is  not enough.
> 
> > After all this is a misaligned access and
> > since you didn't quote the RTL involved I'm guessing that
> > emit_group_store knows nothing about this fact.
> 
> Oh, good point, how do you think about loading into temp and then
> storing it in the DECL, so that we could use original logic to
> handle misaligned store.
> 
> A PoC for this idea, didn't fully tested yet:
> 
> diff --git a/gcc/expr.c b/gcc/expr.c
> index 1a15f24b3979..6ef97740c764 100644
> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -5166,8 +5166,19 @@ expand_assignment (tree to, tree from, bool nontemporal)
>          || targetm.slow_unaligned_access (mode, align)))
>     {
>       rtx reg, mem;
> +      push_temp_slots ();
> 
>       reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);

I'd say there must be (wishful thinking) some expand_expr
modifier that guarantees this?  Alternatively use

     reg = force_reg (mode, reg);

instead of assign_stack_temp & emit_group_store?

That said, I don't know very much about the fancy ways to handle
stores from PARALLEL - but doesn't PARALLEL mean we can just take
any of its members as source?  How does 'reg' look like exactly
in your case?  What does 'to' expand to?

Richard.

> +
> +      if (GET_CODE (reg) == PARALLEL)
> +       {
> +         rtx temp = assign_stack_temp (mode,
> +                                       GET_MODE_SIZE (mode));
> +         emit_group_store (temp, reg, TREE_TYPE (from),
> +                          int_size_in_bytes (TREE_TYPE (from)));
> +         reg = temp;
> +       }
> +
>       reg = force_not_mem (reg);
>       mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE);
>       if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to))
> @@ -5186,6 +5197,7 @@ expand_assignment (tree to, tree from, bool nontemporal)
>       else
>        store_bit_field (mem, GET_MODE_BITSIZE (mode), 0, 0, 0, mode, reg,
>                         false);
> +      pop_temp_slots ();
>       return;
>     }
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10  9:09     ` Richard Biener
@ 2020-09-10  9:39       ` Eric Botcazou
  2020-09-10 10:04         ` Kito Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Botcazou @ 2020-09-10  9:39 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kito Cheng, GCC Patches, Kito Cheng

> I'd say there must be (wishful thinking) some expand_expr
> modifier that guarantees this?  Alternatively use
> 
>      reg = force_reg (mode, reg);
> 
> instead of assign_stack_temp & emit_group_store?
> 
> That said, I don't know very much about the fancy ways to handle
> stores from PARALLEL - but doesn't PARALLEL mean we can just take
> any of its members as source?  How does 'reg' look like exactly
> in your case?  What does 'to' expand to?

No, PARALLEL describes a multi-register layout, typically a return value with 
aggregate type, and you need to use emit_group_store to decipher it properly.

emit_group_store should already be able to handle misaligned stores I think.

-- 
Eric Botcazou



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10  9:39       ` Eric Botcazou
@ 2020-09-10 10:04         ` Kito Cheng
  2020-09-10 10:05           ` Kito Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Kito Cheng @ 2020-09-10 10:04 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, Kito Cheng, GCC Patches

On Thu, Sep 10, 2020 at 5:41 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > I'd say there must be (wishful thinking) some expand_expr
> > modifier that guarantees this?  Alternatively use
> >
> >      reg = force_reg (mode, reg);

I did a try, seems like force_reg didn't handle PARALLE :(

> >
> > instead of assign_stack_temp & emit_group_store?
> >
> > That said, I don't know very much about the fancy ways to handle
> > stores from PARALLEL - but doesn't PARALLEL mean we can just take
> > any of its members as source?  How does 'reg' look like exactly
> > in your case?  What does 'to' expand to?

'reg' look like this;
(parallel:TI [
    (expr_list:REG_DEP_TRUE (reg:SI 72)
        (const_int 0 [0]))
    (expr_list:REG_DEP_TRUE (reg:DF 73)
        (const_int 8 [0x8]))])

'to' is global var decl as struct S {int a; double b;}, align to 4,
and TYPE_MODE is TImode,
and TImode requires 8 byte alignment.


>
> No, PARALLEL describes a multi-register layout, typically a return value with
> aggregate type, and you need to use emit_group_store to decipher it properly.
>
> emit_group_store should already be able to handle misaligned stores I think.
>
> --
> Eric Botcazou
>
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values.
  2020-09-10 10:04         ` Kito Cheng
@ 2020-09-10 10:05           ` Kito Cheng
  0 siblings, 0 replies; 7+ messages in thread
From: Kito Cheng @ 2020-09-10 10:05 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: Richard Biener, Kito Cheng, GCC Patches

Optimized version of the v2 patch, get rid of assign_stack_temp.

diff --git a/gcc/expr.c b/gcc/expr.c
index 1a15f24b3979..5f744a6c1b8d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -5168,7 +5168,16 @@ expand_assignment (tree to, tree from, bool nontemporal)
      rtx reg, mem;

      reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
-      reg = force_not_mem (reg);
+
+      if (GET_CODE (reg) == PARALLEL)
+       {
+         rtx temp = gen_reg_rtx (mode);
+         emit_group_store (temp, reg, TREE_TYPE (from),
+                          int_size_in_bytes (TREE_TYPE (from)));
+         reg = temp;
+       }
+
+      reg = force_not_mem (mode, reg);
      mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE);
      if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to))
       reg = flip_storage_order (mode, reg);

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-09-10 10:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-10  3:44 [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values Kito Cheng
2020-09-10  7:03 ` Richard Biener
2020-09-10  8:18   ` Kito Cheng
2020-09-10  9:09     ` Richard Biener
2020-09-10  9:39       ` Eric Botcazou
2020-09-10 10:04         ` Kito Cheng
2020-09-10 10:05           ` Kito Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).