* [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. @ 2020-09-10 3:44 Kito Cheng 2020-09-10 7:03 ` Richard Biener 0 siblings, 1 reply; 7+ messages in thread From: Kito Cheng @ 2020-09-10 3:44 UTC (permalink / raw) To: gcc-patches, kito.cheng, rguenther; +Cc: Kito Cheng In g:70cdb21e579191fe9f0f1d45e328908e59c0179e, DECL/global variable has handled misaligned stores, but it didn't handle PARALLEL values, and I refer the other part of this function, I found the PARALLEL need handled by emit_group_* functions, so I add a check, and using emit_group_store if storing a PARALLEL value, also checked this change didn't break the testcase(gcc.target/arm/unaligned-argument-3.c) added by the orginal changes. For riscv64 target, struct S {int a; double b;} will pack into a parallel value to return and it has TImode when misaligned access is supported, however TImode required 16-byte align, but it only 8-byte align, so it go to the misaligned stores handling, then it will try to generate move instruction from a PARALLEL value. Tested on following target without introduced new reguression: - riscv32/riscv64 elf - x86_64-linux - arm-eabi gcc/ChangeLog: PR target/96759 * expr.c (expand_assignment): Handle misaligned stores with PARALLEL value. gcc/testsuite/ChangeLog: PR target/96759 * g++.target/riscv/pr96759.C: New. * gcc.target/riscv/pr96759.c: New. --- gcc/expr.c | 9 ++++++++- gcc/testsuite/g++.target/riscv/pr96759.C | 8 ++++++++ gcc/testsuite/gcc.target/riscv/pr96759.c | 13 +++++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.target/riscv/pr96759.C create mode 100644 gcc/testsuite/gcc.target/riscv/pr96759.c diff --git a/gcc/expr.c b/gcc/expr.c index 1a15f24b3979..cc2bc52c457f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -5173,7 +5173,14 @@ expand_assignment (tree to, tree from, bool nontemporal) if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to)) reg = flip_storage_order (mode, reg); - if (icode != CODE_FOR_nothing) + if (GET_CODE (reg) == PARALLEL) + { + push_temp_slots (); + emit_group_store (mem, reg, TREE_TYPE (from), + int_size_in_bytes (TREE_TYPE (from))); + pop_temp_slots (); + } + else if (icode != CODE_FOR_nothing) { class expand_operand ops[2]; diff --git a/gcc/testsuite/g++.target/riscv/pr96759.C b/gcc/testsuite/g++.target/riscv/pr96759.C new file mode 100644 index 000000000000..673999a4baf7 --- /dev/null +++ b/gcc/testsuite/g++.target/riscv/pr96759.C @@ -0,0 +1,8 @@ +/* { dg-options "-mno-strict-align -std=gnu++17" } */ +/* { dg-do compile } */ +struct S { + int a; + double b; +}; +S GetNumbers(); +auto [globalC, globalD] = GetNumbers(); diff --git a/gcc/testsuite/gcc.target/riscv/pr96759.c b/gcc/testsuite/gcc.target/riscv/pr96759.c new file mode 100644 index 000000000000..621c39196fca --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr96759.c @@ -0,0 +1,13 @@ +/* { dg-options "-mno-strict-align" } */ +/* { dg-do compile } */ + +struct S { + int a; + double b; +}; +struct S GetNumbers(); +struct S g; + +void foo(){ + g = GetNumbers(); +} -- 2.28.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 3:44 [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values Kito Cheng @ 2020-09-10 7:03 ` Richard Biener 2020-09-10 8:18 ` Kito Cheng 0 siblings, 1 reply; 7+ messages in thread From: Richard Biener @ 2020-09-10 7:03 UTC (permalink / raw) To: Kito Cheng; +Cc: gcc-patches, kito.cheng On Thu, 10 Sep 2020, Kito Cheng wrote: > In g:70cdb21e579191fe9f0f1d45e328908e59c0179e, DECL/global variable has handled > misaligned stores, but it didn't handle PARALLEL values, and I refer the > other part of this function, I found the PARALLEL need handled by > emit_group_* functions, so I add a check, and using emit_group_store if > storing a PARALLEL value, also checked this change didn't break the > testcase(gcc.target/arm/unaligned-argument-3.c) added by the orginal changes. > > For riscv64 target, struct S {int a; double b;} will pack into a parallel > value to return and it has TImode when misaligned access is supported, > however TImode required 16-byte align, but it only 8-byte align, so it go to > the misaligned stores handling, then it will try to generate move > instruction from a PARALLEL value. > > Tested on following target without introduced new reguression: > - riscv32/riscv64 elf > - x86_64-linux > - arm-eabi riscv doesn't seem to have movmisalign and I don't see why movmisalign should not support a TImode parallel RHS so at least you should put this check after the icode != CODE_FOR_nothing check? Also wouldn't it be better to support PARALLEL from within store_bit_field? After all this is a misaligned access and since you didn't quote the RTL involved I'm guessing that emit_group_store knows nothing about this fact. Richard. > gcc/ChangeLog: > > PR target/96759 > * expr.c (expand_assignment): Handle misaligned stores with PARALLEL > value. > > gcc/testsuite/ChangeLog: > > PR target/96759 > * g++.target/riscv/pr96759.C: New. > * gcc.target/riscv/pr96759.c: New. > --- > gcc/expr.c | 9 ++++++++- > gcc/testsuite/g++.target/riscv/pr96759.C | 8 ++++++++ > gcc/testsuite/gcc.target/riscv/pr96759.c | 13 +++++++++++++ > 3 files changed, 29 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.target/riscv/pr96759.C > create mode 100644 gcc/testsuite/gcc.target/riscv/pr96759.c > > diff --git a/gcc/expr.c b/gcc/expr.c > index 1a15f24b3979..cc2bc52c457f 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -5173,7 +5173,14 @@ expand_assignment (tree to, tree from, bool nontemporal) > if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to)) > reg = flip_storage_order (mode, reg); > > - if (icode != CODE_FOR_nothing) > + if (GET_CODE (reg) == PARALLEL) > + { > + push_temp_slots (); > + emit_group_store (mem, reg, TREE_TYPE (from), > + int_size_in_bytes (TREE_TYPE (from))); > + pop_temp_slots (); > + } > + else if (icode != CODE_FOR_nothing) > { > class expand_operand ops[2]; > > diff --git a/gcc/testsuite/g++.target/riscv/pr96759.C b/gcc/testsuite/g++.target/riscv/pr96759.C > new file mode 100644 > index 000000000000..673999a4baf7 > --- /dev/null > +++ b/gcc/testsuite/g++.target/riscv/pr96759.C > @@ -0,0 +1,8 @@ > +/* { dg-options "-mno-strict-align -std=gnu++17" } */ > +/* { dg-do compile } */ > +struct S { > + int a; > + double b; > +}; > +S GetNumbers(); > +auto [globalC, globalD] = GetNumbers(); > diff --git a/gcc/testsuite/gcc.target/riscv/pr96759.c b/gcc/testsuite/gcc.target/riscv/pr96759.c > new file mode 100644 > index 000000000000..621c39196fca > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr96759.c > @@ -0,0 +1,13 @@ > +/* { dg-options "-mno-strict-align" } */ > +/* { dg-do compile } */ > + > +struct S { > + int a; > + double b; > +}; > +struct S GetNumbers(); > +struct S g; > + > +void foo(){ > + g = GetNumbers(); > +} > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 7:03 ` Richard Biener @ 2020-09-10 8:18 ` Kito Cheng 2020-09-10 9:09 ` Richard Biener 0 siblings, 1 reply; 7+ messages in thread From: Kito Cheng @ 2020-09-10 8:18 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches, Kito Cheng Hi Richard: Thanks for your review :) > riscv doesn't seem to have movmisalign and I don't see why movmisalign > should not support a TImode parallel RHS so at least you should > put this check after the icode != CODE_FOR_nothing check? RISC-V has an option `-mno-strict-align` to enable mis-aligned access, but we didn't provide movmisalign pattern, I guess might be another issue, I tried to add movmisalign pattern and then it will feed PARALLEL into that, and got into a similar situation again, most targets are not except got PARALLEL, so that's the reason why I put before icode != CODE_FOR_nothing check. > Also wouldn't it be better to support PARALLEL from within > store_bit_field? For the above reason, I think store_bit_field supports PARALLEL is not enough. > After all this is a misaligned access and > since you didn't quote the RTL involved I'm guessing that > emit_group_store knows nothing about this fact. Oh, good point, how do you think about loading into temp and then storing it in the DECL, so that we could use original logic to handle misaligned store. A PoC for this idea, didn't fully tested yet: diff --git a/gcc/expr.c b/gcc/expr.c index 1a15f24b3979..6ef97740c764 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -5166,8 +5166,19 @@ expand_assignment (tree to, tree from, bool nontemporal) || targetm.slow_unaligned_access (mode, align))) { rtx reg, mem; + push_temp_slots (); reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); + + if (GET_CODE (reg) == PARALLEL) + { + rtx temp = assign_stack_temp (mode, + GET_MODE_SIZE (mode)); + emit_group_store (temp, reg, TREE_TYPE (from), + int_size_in_bytes (TREE_TYPE (from))); + reg = temp; + } + reg = force_not_mem (reg); mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE); if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to)) @@ -5186,6 +5197,7 @@ expand_assignment (tree to, tree from, bool nontemporal) else store_bit_field (mem, GET_MODE_BITSIZE (mode), 0, 0, 0, mode, reg, false); + pop_temp_slots (); return; } ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 8:18 ` Kito Cheng @ 2020-09-10 9:09 ` Richard Biener 2020-09-10 9:39 ` Eric Botcazou 0 siblings, 1 reply; 7+ messages in thread From: Richard Biener @ 2020-09-10 9:09 UTC (permalink / raw) To: Kito Cheng; +Cc: GCC Patches, Kito Cheng, ebotcazou On Thu, 10 Sep 2020, Kito Cheng wrote: > Hi Richard: > > Thanks for your review :) > > > riscv doesn't seem to have movmisalign and I don't see why movmisalign > > should not support a TImode parallel RHS so at least you should > > put this check after the icode != CODE_FOR_nothing check? > > RISC-V has an option `-mno-strict-align` to enable mis-aligned access, > but we didn't provide movmisalign pattern, I guess might be another issue, > I tried to add movmisalign pattern and then it will feed PARALLEL into that, > and got into a similar situation again, most targets are not except > got PARALLEL, > so that's the reason why I put before icode != CODE_FOR_nothing check. > > > Also wouldn't it be better to support PARALLEL from within > > store_bit_field? > For the above reason, I think store_bit_field supports PARALLEL is not enough. > > > After all this is a misaligned access and > > since you didn't quote the RTL involved I'm guessing that > > emit_group_store knows nothing about this fact. > > Oh, good point, how do you think about loading into temp and then > storing it in the DECL, so that we could use original logic to > handle misaligned store. > > A PoC for this idea, didn't fully tested yet: > > diff --git a/gcc/expr.c b/gcc/expr.c > index 1a15f24b3979..6ef97740c764 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -5166,8 +5166,19 @@ expand_assignment (tree to, tree from, bool nontemporal) > || targetm.slow_unaligned_access (mode, align))) > { > rtx reg, mem; > + push_temp_slots (); > > reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); I'd say there must be (wishful thinking) some expand_expr modifier that guarantees this? Alternatively use reg = force_reg (mode, reg); instead of assign_stack_temp & emit_group_store? That said, I don't know very much about the fancy ways to handle stores from PARALLEL - but doesn't PARALLEL mean we can just take any of its members as source? How does 'reg' look like exactly in your case? What does 'to' expand to? Richard. > + > + if (GET_CODE (reg) == PARALLEL) > + { > + rtx temp = assign_stack_temp (mode, > + GET_MODE_SIZE (mode)); > + emit_group_store (temp, reg, TREE_TYPE (from), > + int_size_in_bytes (TREE_TYPE (from))); > + reg = temp; > + } > + > reg = force_not_mem (reg); > mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE); > if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to)) > @@ -5186,6 +5197,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > else > store_bit_field (mem, GET_MODE_BITSIZE (mode), 0, 0, 0, mode, reg, > false); > + pop_temp_slots (); > return; > } > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 9:09 ` Richard Biener @ 2020-09-10 9:39 ` Eric Botcazou 2020-09-10 10:04 ` Kito Cheng 0 siblings, 1 reply; 7+ messages in thread From: Eric Botcazou @ 2020-09-10 9:39 UTC (permalink / raw) To: Richard Biener; +Cc: Kito Cheng, GCC Patches, Kito Cheng > I'd say there must be (wishful thinking) some expand_expr > modifier that guarantees this? Alternatively use > > reg = force_reg (mode, reg); > > instead of assign_stack_temp & emit_group_store? > > That said, I don't know very much about the fancy ways to handle > stores from PARALLEL - but doesn't PARALLEL mean we can just take > any of its members as source? How does 'reg' look like exactly > in your case? What does 'to' expand to? No, PARALLEL describes a multi-register layout, typically a return value with aggregate type, and you need to use emit_group_store to decipher it properly. emit_group_store should already be able to handle misaligned stores I think. -- Eric Botcazou ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 9:39 ` Eric Botcazou @ 2020-09-10 10:04 ` Kito Cheng 2020-09-10 10:05 ` Kito Cheng 0 siblings, 1 reply; 7+ messages in thread From: Kito Cheng @ 2020-09-10 10:04 UTC (permalink / raw) To: Eric Botcazou; +Cc: Richard Biener, Kito Cheng, GCC Patches On Thu, Sep 10, 2020 at 5:41 PM Eric Botcazou <botcazou@adacore.com> wrote: > > > I'd say there must be (wishful thinking) some expand_expr > > modifier that guarantees this? Alternatively use > > > > reg = force_reg (mode, reg); I did a try, seems like force_reg didn't handle PARALLE :( > > > > instead of assign_stack_temp & emit_group_store? > > > > That said, I don't know very much about the fancy ways to handle > > stores from PARALLEL - but doesn't PARALLEL mean we can just take > > any of its members as source? How does 'reg' look like exactly > > in your case? What does 'to' expand to? 'reg' look like this; (parallel:TI [ (expr_list:REG_DEP_TRUE (reg:SI 72) (const_int 0 [0])) (expr_list:REG_DEP_TRUE (reg:DF 73) (const_int 8 [0x8]))]) 'to' is global var decl as struct S {int a; double b;}, align to 4, and TYPE_MODE is TImode, and TImode requires 8 byte alignment. > > No, PARALLEL describes a multi-register layout, typically a return value with > aggregate type, and you need to use emit_group_store to decipher it properly. > > emit_group_store should already be able to handle misaligned stores I think. > > -- > Eric Botcazou > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values. 2020-09-10 10:04 ` Kito Cheng @ 2020-09-10 10:05 ` Kito Cheng 0 siblings, 0 replies; 7+ messages in thread From: Kito Cheng @ 2020-09-10 10:05 UTC (permalink / raw) To: Eric Botcazou; +Cc: Richard Biener, Kito Cheng, GCC Patches Optimized version of the v2 patch, get rid of assign_stack_temp. diff --git a/gcc/expr.c b/gcc/expr.c index 1a15f24b3979..5f744a6c1b8d 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -5168,7 +5168,16 @@ expand_assignment (tree to, tree from, bool nontemporal) rtx reg, mem; reg = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); - reg = force_not_mem (reg); + + if (GET_CODE (reg) == PARALLEL) + { + rtx temp = gen_reg_rtx (mode); + emit_group_store (temp, reg, TREE_TYPE (from), + int_size_in_bytes (TREE_TYPE (from))); + reg = temp; + } + + reg = force_not_mem (mode, reg); mem = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE); if (TREE_CODE (to) == MEM_REF && REF_REVERSE_STORAGE_ORDER (to)) reg = flip_storage_order (mode, reg); ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-09-10 10:05 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-09-10 3:44 [PATCH] PR target/96759 - Handle global variable assignment from misaligned structure/PARALLEL return values Kito Cheng 2020-09-10 7:03 ` Richard Biener 2020-09-10 8:18 ` Kito Cheng 2020-09-10 9:09 ` Richard Biener 2020-09-10 9:39 ` Eric Botcazou 2020-09-10 10:04 ` Kito Cheng 2020-09-10 10:05 ` Kito Cheng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).