* [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr @ 2016-01-08 13:29 Christian Bruel 2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel 2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener 0 siblings, 2 replies; 11+ messages in thread From: Christian Bruel @ 2016-01-08 13:29 UTC (permalink / raw) To: kyrylo.tkachov, Richard.Earnshaw, ramana.radhakrishnan, richard.guenther, bschmidt, gcc-patches [-- Attachment #1: Type: text/plain, Size: 1496 bytes --] When compiling code with attribute targets on arm or aarch64, vector_type_mode returns different results (eg Vmode or BLKmode) depending on the current simd flags that are not set between functions. for example the following code: #include <arm_neon.h> extern int8x8_t a; extern int8x8_t b; int16x8_t __attribute__ ((target("fpu=neon"))) foo(void) { return vaddl_s8 (a, b); } Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins , because the mismatch and DECL_MODE current's TYPE_MODE used in expand_builtin for global variables. but the best explanation is in the vector_type_mode: /* Vector types need to re-check the target flags each time we report the machine mode. We need to do this because attribute target can change the result of vector_mode_supported_p and have_regs_of_mode on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can change on a per-function basis. */ I first tried to hack the 2 machine descriptions to insert convert_to_mode or relayout_decls here and there, but I found this very fragile. Instead a more central relayout the of type while expanding gave good results, as proposed here. bootstraped and tested with no regression for arm, aarch64 and i586. Does this look to be the right approach ? nb: for testing this patch is complementary with https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html thanks for your comments. [-- Attachment #2: pr68674.patch --] [-- Type: text/x-patch, Size: 2108 bytes --] 2016-01-06 Christian Bruel <christian.bruel@st.com> PR target/68674 * expr.c (expand_expr_real_1): Relayout VECTOR_TYPE expression. 2016-01-06 Christian Bruel <christian.bruel@st.com> PR target/68674 * gcc.target/arm/pr68674.c * gcc.target/aarch64/pr68674.c Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 232158) +++ gcc/expr.c (working copy) @@ -9602,8 +9602,17 @@ expand_expr_real_1 (tree exp, rtx target exp = SSA_NAME_VAR (ssa_name); goto expand_decl_rtl; - case PARM_DECL: case VAR_DECL: + /* Vector types need to re-check the target flags, + since DECL_MODE might change with attribute target. */ + if (TREE_CODE (type) == VECTOR_TYPE + && DECL_MODE (exp) != TYPE_MODE (type) + && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) + relayout_decl (exp); + + /* ... fall through ... */ + + case PARM_DECL: /* If a static var's type was incomplete when the decl was written, but the type is complete now, lay out the decl now. */ if (DECL_SIZE (exp) == 0 Index: gcc/testsuite/gcc.target/arm/pr68674.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr68674.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr68674.c (working copy) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp" } */ + +#include <arm_neon.h> + +int8x8_t a, b; +int16x8_t e; + +void +__attribute__ ((target("fpu=neon"))) +foo(void) +{ + e = (int16x8_t) vaddl_s8(a, b); +} Index: gcc/testsuite/gcc.target/aarch64/pr68674.c =================================================================== --- gcc/testsuite/gcc.target/aarch64/pr68674.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/pr68674.c (working copy) @@ -0,0 +1,15 @@ +/* { dg-do assemble } */ +/* { dg-options "-O2 -march=armv8-a+nosimd" } */ + +#include <arm_neon.h> + +int8x8_t a, b; +int16x8_t e; + +void +__attribute__ ((target("+simd"))) +foo(void) +{ + e = (int16x8_t) vaddl_s8(a, b); +} + ^ permalink raw reply [flat|nested] 11+ messages in thread
* ping:[PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel @ 2016-01-14 13:09 ` Christian Bruel 2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener 1 sibling, 0 replies; 11+ messages in thread From: Christian Bruel @ 2016-01-14 13:09 UTC (permalink / raw) To: richard.guenther, bschmidt Cc: kyrylo.tkachov, Richard.Earnshaw, ramana.radhakrishnan, gcc-patches https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00415.html thanks ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel 2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel @ 2016-01-18 11:36 ` Richard Biener 2016-01-19 15:01 ` Christian Bruel 1 sibling, 1 reply; 11+ messages in thread From: Richard Biener @ 2016-01-18 11:36 UTC (permalink / raw) To: Christian Bruel Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote: > When compiling code with attribute targets on arm or aarch64, > vector_type_mode returns different results (eg Vmode or BLKmode) depending > on the current simd flags that are not set between functions. > > for example the following code: > > #include <arm_neon.h> > > extern int8x8_t a; > extern int8x8_t b; > > int16x8_t > __attribute__ ((target("fpu=neon"))) > foo(void) > { > return vaddl_s8 (a, b); > } > > Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins , > because the mismatch and DECL_MODE current's TYPE_MODE used in > expand_builtin for global variables. > > but the best explanation is in the vector_type_mode: > /* Vector types need to re-check the target flags each time we report > the machine mode. We need to do this because attribute target can > change the result of vector_mode_supported_p and have_regs_of_mode > on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can > change on a per-function basis. */ > > I first tried to hack the 2 machine descriptions to insert convert_to_mode > or relayout_decls here and there, but I found this very fragile. Instead a > more central relayout the of type while expanding gave good results, as > proposed here. > > bootstraped and tested with no regression for arm, aarch64 and i586. > > Does this look to be the right approach ? > > nb: for testing this patch is complementary with > > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html > https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html > > thanks for your comments. A x86 specific testcase that ICEs as well: typedef int v8si __attribute__((vector_size(32))); v8si a; v8si __attribute__((target("avx"))) foo() { return a; } in your patch not using the shared DECL_RTL of the global var "fixes" this so I think a conceptually better fix would be to "adjust" DECL_RTL from globals via a adjust_address (or so). Also given that we do /* ... fall through ... */ case FUNCTION_DECL: case RESULT_DECL: decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); decl_rtl = copy_rtx (decl_rtl); thus always "unshare" DECL_RTL anyway it might be not so bad to simply do decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); instead of that to avoid one copy. Index: expr.c =================================================================== --- expr.c (revision 232496) +++ expr.c (working copy) @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); - decl_rtl = copy_rtx (decl_rtl); + if (MEM_P (decl_rtl)) + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); + else + decl_rtl = copy_rtx (decl_rtl); /* Record writes to register variables. */ if (modifier == EXPAND_WRITE && REG_P (decl_rtl) untested apart from on the x86_64 testcase (which it fixes). One could guard this further to only apply on vector typed decls with mismatched mode of course. I think that re-layouting globals is not very good design. Richard. > > > > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener @ 2016-01-19 15:01 ` Christian Bruel 2016-01-19 15:13 ` Christian Bruel 0 siblings, 1 reply; 11+ messages in thread From: Christian Bruel @ 2016-01-19 15:01 UTC (permalink / raw) To: Richard Biener Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches Hi Richard, thanks for your input, On 01/18/2016 12:36 PM, Richard Biener wrote: > On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote: >> When compiling code with attribute targets on arm or aarch64, >> vector_type_mode returns different results (eg Vmode or BLKmode) depending >> on the current simd flags that are not set between functions. >> >> for example the following code: >> >> #include <arm_neon.h> >> >> extern int8x8_t a; >> extern int8x8_t b; >> >> int16x8_t >> __attribute__ ((target("fpu=neon"))) >> foo(void) >> { >> return vaddl_s8 (a, b); >> } >> >> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins , >> because the mismatch and DECL_MODE current's TYPE_MODE used in >> expand_builtin for global variables. >> >> but the best explanation is in the vector_type_mode: >> /* Vector types need to re-check the target flags each time we report >> the machine mode. We need to do this because attribute target can >> change the result of vector_mode_supported_p and have_regs_of_mode >> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can >> change on a per-function basis. */ >> >> I first tried to hack the 2 machine descriptions to insert convert_to_mode >> or relayout_decls here and there, but I found this very fragile. Instead a >> more central relayout the of type while expanding gave good results, as >> proposed here. >> >> bootstraped and tested with no regression for arm, aarch64 and i586. >> >> Does this look to be the right approach ? >> >> nb: for testing this patch is complementary with >> >> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html >> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html >> >> thanks for your comments. > A x86 specific testcase that ICEs as well: > > typedef int v8si __attribute__((vector_size(32))); > v8si a; > v8si __attribute__((target("avx"))) foo() > { > return a; > } > > in your patch not using the shared DECL_RTL of the global var > "fixes" this so I think a conceptually better fix would be to > "adjust" DECL_RTL from globals via a adjust_address (or so). > > Also given that we do > > /* ... fall through ... */ > > case FUNCTION_DECL: > case RESULT_DECL: > decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > decl_rtl = copy_rtx (decl_rtl); > > thus always "unshare" DECL_RTL anyway it might be not so > bad to simply do > > decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); > > instead of that to avoid one copy. > > Index: expr.c > =================================================================== > --- expr.c (revision 232496) > +++ expr.c (working copy) > @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target > decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > - decl_rtl = copy_rtx (decl_rtl); > + if (MEM_P (decl_rtl)) > + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); > + else > + decl_rtl = copy_rtx (decl_rtl); > /* Record writes to register variables. */ > if (modifier == EXPAND_WRITE > && REG_P (decl_rtl) > > untested apart from on the x86_64 testcase (which it fixes). One could guard > this further to only apply on vector typed decls with mismatched mode of course. > > I think that re-layouting globals is not very good design. > > Richard. A few other ICEs with this implementation, for instance if the context is not in a function, such as typedef __simd64_int8_t int8x8_t; extern int8x8_t b; int8x8_t *a = &b; So, to avoid a var re-layout and a copy_rtx (implied by adjust_address btw). What about just calling 'change_address' ? like: (very lightly tested) Index: expr.c =================================================================== --- expr.c (revision 232564) +++ expr.c (working copy) @@ -9392,7 +9392,8 @@ enum expand_modifier modifier, rtx *alt_rtl, bool inner_reference_p) { - rtx op0, op1, temp, decl_rtl; + rtx op0, op1, temp; + rtx decl_rtl = NULL_RTX; tree type; int unsignedp; machine_mode mode, dmode; @@ -9590,11 +9591,22 @@ && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) layout_decl (exp, 0); + decl_rtl = DECL_RTL (exp); + + if (MEM_P (decl_rtl) + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) + { + if (current_function_decl + && (! reload_completed && !reload_in_progress)) + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); + } + /* ... fall through ... */ case FUNCTION_DECL: case RESULT_DECL: - decl_rtl = DECL_RTL (exp); + if (! decl_rtl) + decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); decl_rtl = copy_rtx (decl_rtl); I'm not sure that moving the code in the 'expand_decl_rtl' label is best, as we'd need to test for exp and the case should only happen for global vars (not functions or results) thanks, ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-19 15:01 ` Christian Bruel @ 2016-01-19 15:13 ` Christian Bruel 2016-01-19 15:18 ` Richard Biener 0 siblings, 1 reply; 11+ messages in thread From: Christian Bruel @ 2016-01-19 15:13 UTC (permalink / raw) To: Richard Biener Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On 01/19/2016 04:01 PM, Christian Bruel wrote: > Hi Richard, > > thanks for your input, > > On 01/18/2016 12:36 PM, Richard Biener wrote: >> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote: >>> When compiling code with attribute targets on arm or aarch64, >>> vector_type_mode returns different results (eg Vmode or BLKmode) depending >>> on the current simd flags that are not set between functions. >>> >>> for example the following code: >>> >>> #include <arm_neon.h> >>> >>> extern int8x8_t a; >>> extern int8x8_t b; >>> >>> int16x8_t >>> __attribute__ ((target("fpu=neon"))) >>> foo(void) >>> { >>> return vaddl_s8 (a, b); >>> } >>> >>> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins , >>> because the mismatch and DECL_MODE current's TYPE_MODE used in >>> expand_builtin for global variables. >>> >>> but the best explanation is in the vector_type_mode: >>> /* Vector types need to re-check the target flags each time we report >>> the machine mode. We need to do this because attribute target can >>> change the result of vector_mode_supported_p and have_regs_of_mode >>> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can >>> change on a per-function basis. */ >>> >>> I first tried to hack the 2 machine descriptions to insert convert_to_mode >>> or relayout_decls here and there, but I found this very fragile. Instead a >>> more central relayout the of type while expanding gave good results, as >>> proposed here. >>> >>> bootstraped and tested with no regression for arm, aarch64 and i586. >>> >>> Does this look to be the right approach ? >>> >>> nb: for testing this patch is complementary with >>> >>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html >>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html >>> >>> thanks for your comments. >> A x86 specific testcase that ICEs as well: >> >> typedef int v8si __attribute__((vector_size(32))); >> v8si a; >> v8si __attribute__((target("avx"))) foo() >> { >> return a; >> } >> >> in your patch not using the shared DECL_RTL of the global var >> "fixes" this so I think a conceptually better fix would be to >> "adjust" DECL_RTL from globals via a adjust_address (or so). >> >> Also given that we do >> >> /* ... fall through ... */ >> >> case FUNCTION_DECL: >> case RESULT_DECL: >> decl_rtl = DECL_RTL (exp); >> expand_decl_rtl: >> gcc_assert (decl_rtl); >> decl_rtl = copy_rtx (decl_rtl); >> >> thus always "unshare" DECL_RTL anyway it might be not so >> bad to simply do >> >> decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); >> >> instead of that to avoid one copy. >> >> Index: expr.c >> =================================================================== >> --- expr.c (revision 232496) >> +++ expr.c (working copy) >> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target >> decl_rtl = DECL_RTL (exp); >> expand_decl_rtl: >> gcc_assert (decl_rtl); >> - decl_rtl = copy_rtx (decl_rtl); >> + if (MEM_P (decl_rtl)) >> + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); >> + else >> + decl_rtl = copy_rtx (decl_rtl); >> /* Record writes to register variables. */ >> if (modifier == EXPAND_WRITE >> && REG_P (decl_rtl) >> >> untested apart from on the x86_64 testcase (which it fixes). One could guard >> this further to only apply on vector typed decls with mismatched mode of course. >> >> I think that re-layouting globals is not very good design. >> >> Richard. > A few other ICEs with this implementation, for instance if the context > is not in a function, such as > > typedef __simd64_int8_t int8x8_t; > > extern int8x8_t b; > int8x8_t *a = &b; > > So, to avoid a var re-layout and a copy_rtx (implied by adjust_address > btw). What about just calling 'change_address' ? like: (very lightly tested) > > Index: expr.c > =================================================================== > --- expr.c (revision 232564) > +++ expr.c (working copy) > @@ -9392,7 +9392,8 @@ > enum expand_modifier modifier, rtx *alt_rtl, > bool inner_reference_p) > { > - rtx op0, op1, temp, decl_rtl; > + rtx op0, op1, temp; > + rtx decl_rtl = NULL_RTX; > tree type; > int unsignedp; > machine_mode mode, dmode; > @@ -9590,11 +9591,22 @@ > && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) > layout_decl (exp, 0); > > + decl_rtl = DECL_RTL (exp); > + > + if (MEM_P (decl_rtl) > + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) > + { > + if (current_function_decl > + && (! reload_completed && !reload_in_progress)) > + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); > + } > + > /* ... fall through ... */ > > case FUNCTION_DECL: > case RESULT_DECL: > - decl_rtl = DECL_RTL (exp); > + if (! decl_rtl) > + decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > decl_rtl = copy_rtx (decl_rtl); > > I'm not sure that moving the code in the 'expand_decl_rtl' label is > best, as we'd need to test for exp and the case should only happen for > global vars (not functions or results) Here is the alternative implementation, shorter after all. testing in progress. Index: expr.c =================================================================== --- expr.c (revision 232570) +++ expr.c (working copy) @@ -9597,6 +9597,15 @@ decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); + + if (exp && code == VAR_DECL && MEM_P (decl_rtl) + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) + { + if (current_function_decl + && (! reload_completed && !reload_in_progress)) + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); + } + decl_rtl = copy_rtx (decl_rtl); /* Record writes to register variables. */ if (modifier == EXPAND_WRITE > > thanks, > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-19 15:13 ` Christian Bruel @ 2016-01-19 15:18 ` Richard Biener 2016-01-22 11:41 ` Christian Bruel 0 siblings, 1 reply; 11+ messages in thread From: Richard Biener @ 2016-01-19 15:18 UTC (permalink / raw) To: Christian Bruel Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On Tue, Jan 19, 2016 at 4:13 PM, Christian Bruel <christian.bruel@st.com> wrote: > > > On 01/19/2016 04:01 PM, Christian Bruel wrote: >> >> Hi Richard, >> >> thanks for your input, >> >> On 01/18/2016 12:36 PM, Richard Biener wrote: >>> >>> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> >>> wrote: >>>> >>>> When compiling code with attribute targets on arm or aarch64, >>>> vector_type_mode returns different results (eg Vmode or BLKmode) >>>> depending >>>> on the current simd flags that are not set between functions. >>>> >>>> for example the following code: >>>> >>>> #include <arm_neon.h> >>>> >>>> extern int8x8_t a; >>>> extern int8x8_t b; >>>> >>>> int16x8_t >>>> __attribute__ ((target("fpu=neon"))) >>>> foo(void) >>>> { >>>> return vaddl_s8 (a, b); >>>> } >>>> >>>> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins >>>> , >>>> because the mismatch and DECL_MODE current's TYPE_MODE used in >>>> expand_builtin for global variables. >>>> >>>> but the best explanation is in the vector_type_mode: >>>> /* Vector types need to re-check the target flags each time we report >>>> the machine mode. We need to do this because attribute target can >>>> change the result of vector_mode_supported_p and have_regs_of_mode >>>> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can >>>> change on a per-function basis. */ >>>> >>>> I first tried to hack the 2 machine descriptions to insert >>>> convert_to_mode >>>> or relayout_decls here and there, but I found this very fragile. Instead >>>> a >>>> more central relayout the of type while expanding gave good results, as >>>> proposed here. >>>> >>>> bootstraped and tested with no regression for arm, aarch64 and i586. >>>> >>>> Does this look to be the right approach ? >>>> >>>> nb: for testing this patch is complementary with >>>> >>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html >>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html >>>> >>>> thanks for your comments. >>> >>> A x86 specific testcase that ICEs as well: >>> >>> typedef int v8si __attribute__((vector_size(32))); >>> v8si a; >>> v8si __attribute__((target("avx"))) foo() >>> { >>> return a; >>> } >>> >>> in your patch not using the shared DECL_RTL of the global var >>> "fixes" this so I think a conceptually better fix would be to >>> "adjust" DECL_RTL from globals via a adjust_address (or so). >>> >>> Also given that we do >>> >>> /* ... fall through ... */ >>> >>> case FUNCTION_DECL: >>> case RESULT_DECL: >>> decl_rtl = DECL_RTL (exp); >>> expand_decl_rtl: >>> gcc_assert (decl_rtl); >>> decl_rtl = copy_rtx (decl_rtl); >>> >>> thus always "unshare" DECL_RTL anyway it might be not so >>> bad to simply do >>> >>> decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); >>> >>> instead of that to avoid one copy. >>> >>> Index: expr.c >>> =================================================================== >>> --- expr.c (revision 232496) >>> +++ expr.c (working copy) >>> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target >>> decl_rtl = DECL_RTL (exp); >>> expand_decl_rtl: >>> gcc_assert (decl_rtl); >>> - decl_rtl = copy_rtx (decl_rtl); >>> + if (MEM_P (decl_rtl)) >>> + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); >>> + else >>> + decl_rtl = copy_rtx (decl_rtl); >>> /* Record writes to register variables. */ >>> if (modifier == EXPAND_WRITE >>> && REG_P (decl_rtl) >>> >>> untested apart from on the x86_64 testcase (which it fixes). One could >>> guard >>> this further to only apply on vector typed decls with mismatched mode of >>> course. >>> >>> I think that re-layouting globals is not very good design. >>> >>> Richard. >> >> A few other ICEs with this implementation, for instance if the context >> is not in a function, such as >> >> typedef __simd64_int8_t int8x8_t; >> >> extern int8x8_t b; >> int8x8_t *a = &b; >> >> So, to avoid a var re-layout and a copy_rtx (implied by adjust_address >> btw). What about just calling 'change_address' ? like: (very lightly >> tested) >> >> Index: expr.c >> =================================================================== >> --- expr.c (revision 232564) >> +++ expr.c (working copy) >> @@ -9392,7 +9392,8 @@ >> enum expand_modifier modifier, rtx *alt_rtl, >> bool inner_reference_p) >> { >> - rtx op0, op1, temp, decl_rtl; >> + rtx op0, op1, temp; >> + rtx decl_rtl = NULL_RTX; >> tree type; >> int unsignedp; >> machine_mode mode, dmode; >> @@ -9590,11 +9591,22 @@ >> && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) >> layout_decl (exp, 0); >> >> + decl_rtl = DECL_RTL (exp); >> + >> + if (MEM_P (decl_rtl) >> + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) >> + { >> + if (current_function_decl >> + && (! reload_completed && !reload_in_progress)) >> + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); >> + } >> + >> /* ... fall through ... */ >> >> case FUNCTION_DECL: >> case RESULT_DECL: >> - decl_rtl = DECL_RTL (exp); >> + if (! decl_rtl) >> + decl_rtl = DECL_RTL (exp); >> expand_decl_rtl: >> gcc_assert (decl_rtl); >> decl_rtl = copy_rtx (decl_rtl); >> >> I'm not sure that moving the code in the 'expand_decl_rtl' label is >> best, as we'd need to test for exp and the case should only happen for >> global vars (not functions or results) > > > Here is the alternative implementation, shorter after all. testing in > progress. > > Index: expr.c > =================================================================== > --- expr.c (revision 232570) > +++ expr.c (working copy) > @@ -9597,6 +9597,15 @@ > decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > + > + if (exp && code == VAR_DECL && MEM_P (decl_rtl) > + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) > + { > + if (current_function_decl > + && (! reload_completed && !reload_in_progress)) maybe just if (currently_expanding_to_rtl)? But yes, this looks like a safe variant of the fix. Richard. > + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); > + } > + > decl_rtl = copy_rtx (decl_rtl); > /* Record writes to register variables. */ > if (modifier == EXPAND_WRITE > >> >> thanks, >> > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-19 15:18 ` Richard Biener @ 2016-01-22 11:41 ` Christian Bruel 2016-01-22 11:56 ` Richard Biener 0 siblings, 1 reply; 11+ messages in thread From: Christian Bruel @ 2016-01-22 11:41 UTC (permalink / raw) To: Richard Biener Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches [-- Attachment #1: Type: text/plain, Size: 440 bytes --] On 01/19/2016 04:18 PM, Richard Biener wrote: > maybe just if (currently_expanding_to_rtl)? > > But yes, this looks like a safe variant of the fix. > > Richard. > thanks, currently_expanding_to_rtl works perfectly. So the final version. I added a test for each target. bootstrapped / tested for : unix/-m32/-march=i586 unix arm-qemu/ arm-qemu//-mfpu=neon arm-qemu//-mfpu=neon-fp-armv8 aarch64-qemu [-- Attachment #2: pr68674.patch --] [-- Type: text/x-patch, Size: 2950 bytes --] 2016-01-21 Christian Bruel <christian.bruel@st.com> PR target/68674 * expr.c (expand_expr_real_1): Reset DECL_MODE if VECTOR_TYPE_P changed. 2016-01-21 Christian Bruel <christian.bruel@st.com> PR target/68674 * gcc.target/i386/pr68674.c * gcc.target/aarch64/pr68674.c * gcc.target/arm/pr68674.c Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 232724) +++ gcc/expr.c (working copy) @@ -9597,7 +9597,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_ decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); - decl_rtl = copy_rtx (decl_rtl); + + /* DECL_MODE might change when TYPE_MODE depends on attribute target + settings for VECTOR_TYPE_P that might switch for the function. */ + if (currently_expanding_to_rtl + && code == VAR_DECL && MEM_P (decl_rtl) + && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode) + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); + else + decl_rtl = copy_rtx (decl_rtl); + /* Record writes to register variables. */ if (modifier == EXPAND_WRITE && REG_P (decl_rtl) Index: gcc/testsuite/gcc.target/aarch64/pr68674.c =================================================================== --- gcc/testsuite/gcc.target/aarch64/pr68674.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/pr68674.c (working copy) @@ -0,0 +1,22 @@ +/* PR target/68674 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=generic+nosimd" } */ + +#include <arm_neon.h> + +int8x8_t a; +extern int8x8_t b; +int16x8_t e; + +void __attribute__((target("+simd"))) +foo1(void) +{ + e = (int16x8_t) vaddl_s8(a, b); +} + +int8x8_t __attribute__((target("+simd"))) +foo2(void) +{ + return a; +} + Index: gcc/testsuite/gcc.target/arm/pr68674.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr68674.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr68674.c (working copy) @@ -0,0 +1,26 @@ +/* PR target/68674 */ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp" } */ + +#pragma GCC target ("fpu=vfp") + +#include <arm_neon.h> + +int8x8_t a; +extern int8x8_t b; +int16x8_t e; + +void __attribute__((target("fpu=neon"))) +foo1(void) +{ + e = (int16x8_t) vaddl_s8(a, b); +} + +int8x8_t __attribute__((target("fpu=neon"))) +foo2(void) +{ + return b; +} + + Index: gcc/testsuite/gcc.target/i386/pr68674.c =================================================================== --- gcc/testsuite/gcc.target/i386/pr68674.c (revision 0) +++ gcc/testsuite/gcc.target/i386/pr68674.c (working copy) @@ -0,0 +1,15 @@ +/* PR target/68674 */ +/* { dg-do compile } */ +/* { dg-require-effective-target avx } */ +/* { dg-options "-O2" } */ + +typedef int v8si __attribute__((vector_size(32))); + +v8si a; + + __attribute__((target("avx"))) +v8si +foo() +{ + return a; +} ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-22 11:41 ` Christian Bruel @ 2016-01-22 11:56 ` Richard Biener 2016-01-25 19:06 ` Christophe Lyon 0 siblings, 1 reply; 11+ messages in thread From: Richard Biener @ 2016-01-22 11:56 UTC (permalink / raw) To: Christian Bruel Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel <christian.bruel@st.com> wrote: > > > On 01/19/2016 04:18 PM, Richard Biener wrote: >> >> maybe just if (currently_expanding_to_rtl)? >> >> But yes, this looks like a safe variant of the fix. >> >> Richard. >> > thanks, currently_expanding_to_rtl works perfectly. So the final version. > I added a test for each target. Ok. Thanks, Richard. > bootstrapped / tested for : > unix/-m32/-march=i586 > unix > > arm-qemu/ > arm-qemu//-mfpu=neon > arm-qemu//-mfpu=neon-fp-armv8 > > aarch64-qemu > > > > > > > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-22 11:56 ` Richard Biener @ 2016-01-25 19:06 ` Christophe Lyon 2016-01-26 8:17 ` Christian Bruel 2016-01-26 9:27 ` Kyrill Tkachov 0 siblings, 2 replies; 11+ messages in thread From: Christophe Lyon @ 2016-01-25 19:06 UTC (permalink / raw) To: Richard Biener Cc: Christian Bruel, kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches [-- Attachment #1: Type: text/plain, Size: 1012 bytes --] On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote: > On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel > <christian.bruel@st.com> wrote: >> >> >> On 01/19/2016 04:18 PM, Richard Biener wrote: >>> >>> maybe just if (currently_expanding_to_rtl)? >>> >>> But yes, this looks like a safe variant of the fix. >>> >>> Richard. >>> >> thanks, currently_expanding_to_rtl works perfectly. So the final version. >> I added a test for each target. > > Ok. > Hi, This small patch is needed to make the new test pass on arm hard-float targets (eg. arm-none-linux-gnueabihf). I'm not sure it counts as obvious, so here it is. OK? Christophe. DATE Christophe Lyon <christophe.lyon@linaro.org> * gcc.target/arm/pr68674.c: Check and use arm_fp effective target. > Thanks, > Richard. > >> bootstrapped / tested for : >> unix/-m32/-march=i586 >> unix >> >> arm-qemu/ >> arm-qemu//-mfpu=neon >> arm-qemu//-mfpu=neon-fp-armv8 >> >> aarch64-qemu >> >> >> >> >> >> >> [-- Attachment #2: pr68674-test.patch.txt --] [-- Type: text/plain, Size: 521 bytes --] diff --git a/gcc/testsuite/gcc.target/arm/pr68674.c b/gcc/testsuite/gcc.target/arm/pr68674.c index a31a88a..0b32374 100644 --- a/gcc/testsuite/gcc.target/arm/pr68674.c +++ b/gcc/testsuite/gcc.target/arm/pr68674.c @@ -1,7 +1,9 @@ /* PR target/68674 */ /* { dg-do compile } */ /* { dg-require-effective-target arm_neon_ok } */ -/* { dg-options "-O2 -mfloat-abi=softfp" } */ +/* { dg-require-effective-target arm_fp_ok } */ +/* { dg-options "-O2" } */ +/* { dg-add-options arm_fp } */ #pragma GCC target ("fpu=vfp") ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-25 19:06 ` Christophe Lyon @ 2016-01-26 8:17 ` Christian Bruel 2016-01-26 9:27 ` Kyrill Tkachov 1 sibling, 0 replies; 11+ messages in thread From: Christian Bruel @ 2016-01-26 8:17 UTC (permalink / raw) To: Christophe Lyon, Richard Biener Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On 01/25/2016 08:06 PM, Christophe Lyon wrote: > On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote: >> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel >> <christian.bruel@st.com> wrote: >>> >>> On 01/19/2016 04:18 PM, Richard Biener wrote: >>>> maybe just if (currently_expanding_to_rtl)? >>>> >>>> But yes, this looks like a safe variant of the fix. >>>> >>>> Richard. >>>> >>> thanks, currently_expanding_to_rtl works perfectly. So the final version. >>> I added a test for each target. >> Ok. >> > Hi, > > This small patch is needed to make the new test pass on arm hard-float > targets (eg. arm-none-linux-gnueabihf). > > I'm not sure it counts as obvious, so here it is. > OK? > > Christophe. > > DATE Christophe Lyon <christophe.lyon@linaro.org> > > * gcc.target/arm/pr68674.c: Check and use arm_fp effective target. At least that's fine with me. Just for the story, arm_neon.h ends up to include stubs.h from the sysroot. Depending on the glibc version it might assert that we cannot use softfp abi with a sysroot built with hardfp. thanks for spotting this Christophe, > >> Thanks, >> Richard. >> >>> bootstrapped / tested for : >>> unix/-m32/-march=i586 >>> unix >>> >>> arm-qemu/ >>> arm-qemu//-mfpu=neon >>> arm-qemu//-mfpu=neon-fp-armv8 >>> >>> aarch64-qemu >>> >>> >>> >>> >>> >>> >>> ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr 2016-01-25 19:06 ` Christophe Lyon 2016-01-26 8:17 ` Christian Bruel @ 2016-01-26 9:27 ` Kyrill Tkachov 1 sibling, 0 replies; 11+ messages in thread From: Kyrill Tkachov @ 2016-01-26 9:27 UTC (permalink / raw) To: Christophe Lyon, Richard Biener Cc: Christian Bruel, Richard Earnshaw, ramana.radhakrishnan, bschmidt, GCC Patches On 25/01/16 19:06, Christophe Lyon wrote: > On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote: >> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel >> <christian.bruel@st.com> wrote: >>> >>> On 01/19/2016 04:18 PM, Richard Biener wrote: >>>> maybe just if (currently_expanding_to_rtl)? >>>> >>>> But yes, this looks like a safe variant of the fix. >>>> >>>> Richard. >>>> >>> thanks, currently_expanding_to_rtl works perfectly. So the final version. >>> I added a test for each target. >> Ok. >> > Hi, > > This small patch is needed to make the new test pass on arm hard-float > targets (eg. arm-none-linux-gnueabihf). > > I'm not sure it counts as obvious, so here it is. > OK? Ok. Thanks, Kyrill > Christophe. > > DATE Christophe Lyon <christophe.lyon@linaro.org> > > * gcc.target/arm/pr68674.c: Check and use arm_fp effective target. > > >> Thanks, >> Richard. >> >>> bootstrapped / tested for : >>> unix/-m32/-march=i586 >>> unix >>> >>> arm-qemu/ >>> arm-qemu//-mfpu=neon >>> arm-qemu//-mfpu=neon-fp-armv8 >>> >>> aarch64-qemu >>> >>> >>> >>> >>> >>> >>> ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-01-26 9:27 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel 2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel 2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener 2016-01-19 15:01 ` Christian Bruel 2016-01-19 15:13 ` Christian Bruel 2016-01-19 15:18 ` Richard Biener 2016-01-22 11:41 ` Christian Bruel 2016-01-22 11:56 ` Richard Biener 2016-01-25 19:06 ` Christophe Lyon 2016-01-26 8:17 ` Christian Bruel 2016-01-26 9:27 ` Kyrill Tkachov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).