* [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
@ 2016-01-08 13:29 Christian Bruel
2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel
2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener
0 siblings, 2 replies; 11+ messages in thread
From: Christian Bruel @ 2016-01-08 13:29 UTC (permalink / raw)
To: kyrylo.tkachov, Richard.Earnshaw, ramana.radhakrishnan,
richard.guenther, bschmidt, gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1496 bytes --]
When compiling code with attribute targets on arm or aarch64,
vector_type_mode returns different results (eg Vmode or BLKmode)
depending on the current simd flags that are not set between functions.
for example the following code:
#include <arm_neon.h>
extern int8x8_t a;
extern int8x8_t b;
int16x8_t
__attribute__ ((target("fpu=neon")))
foo(void)
{
return vaddl_s8 (a, b);
}
Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins
, because the mismatch and DECL_MODE current's TYPE_MODE used in
expand_builtin for global variables.
but the best explanation is in the vector_type_mode:
/* Vector types need to re-check the target flags each time we report
the machine mode. We need to do this because attribute target can
change the result of vector_mode_supported_p and have_regs_of_mode
on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can
change on a per-function basis. */
I first tried to hack the 2 machine descriptions to insert
convert_to_mode or relayout_decls here and there, but I found this very
fragile. Instead a more central relayout the of type while expanding
gave good results, as proposed here.
bootstraped and tested with no regression for arm, aarch64 and i586.
Does this look to be the right approach ?
nb: for testing this patch is complementary with
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
thanks for your comments.
[-- Attachment #2: pr68674.patch --]
[-- Type: text/x-patch, Size: 2108 bytes --]
2016-01-06 Christian Bruel <christian.bruel@st.com>
PR target/68674
* expr.c (expand_expr_real_1): Relayout VECTOR_TYPE expression.
2016-01-06 Christian Bruel <christian.bruel@st.com>
PR target/68674
* gcc.target/arm/pr68674.c
* gcc.target/aarch64/pr68674.c
Index: gcc/expr.c
===================================================================
--- gcc/expr.c (revision 232158)
+++ gcc/expr.c (working copy)
@@ -9602,8 +9602,17 @@ expand_expr_real_1 (tree exp, rtx target
exp = SSA_NAME_VAR (ssa_name);
goto expand_decl_rtl;
- case PARM_DECL:
case VAR_DECL:
+ /* Vector types need to re-check the target flags,
+ since DECL_MODE might change with attribute target. */
+ if (TREE_CODE (type) == VECTOR_TYPE
+ && DECL_MODE (exp) != TYPE_MODE (type)
+ && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
+ relayout_decl (exp);
+
+ /* ... fall through ... */
+
+ case PARM_DECL:
/* If a static var's type was incomplete when the decl was written,
but the type is complete now, lay out the decl now. */
if (DECL_SIZE (exp) == 0
Index: gcc/testsuite/gcc.target/arm/pr68674.c
===================================================================
--- gcc/testsuite/gcc.target/arm/pr68674.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/pr68674.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mfloat-abi=softfp" } */
+
+#include <arm_neon.h>
+
+int8x8_t a, b;
+int16x8_t e;
+
+void
+__attribute__ ((target("fpu=neon")))
+foo(void)
+{
+ e = (int16x8_t) vaddl_s8(a, b);
+}
Index: gcc/testsuite/gcc.target/aarch64/pr68674.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr68674.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/pr68674.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -march=armv8-a+nosimd" } */
+
+#include <arm_neon.h>
+
+int8x8_t a, b;
+int16x8_t e;
+
+void
+__attribute__ ((target("+simd")))
+foo(void)
+{
+ e = (int16x8_t) vaddl_s8(a, b);
+}
+
^ permalink raw reply [flat|nested] 11+ messages in thread
* ping:[PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel
@ 2016-01-14 13:09 ` Christian Bruel
2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener
1 sibling, 0 replies; 11+ messages in thread
From: Christian Bruel @ 2016-01-14 13:09 UTC (permalink / raw)
To: richard.guenther, bschmidt
Cc: kyrylo.tkachov, Richard.Earnshaw, ramana.radhakrishnan, gcc-patches
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00415.html
thanks
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel
2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel
@ 2016-01-18 11:36 ` Richard Biener
2016-01-19 15:01 ` Christian Bruel
1 sibling, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-01-18 11:36 UTC (permalink / raw)
To: Christian Bruel
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote:
> When compiling code with attribute targets on arm or aarch64,
> vector_type_mode returns different results (eg Vmode or BLKmode) depending
> on the current simd flags that are not set between functions.
>
> for example the following code:
>
> #include <arm_neon.h>
>
> extern int8x8_t a;
> extern int8x8_t b;
>
> int16x8_t
> __attribute__ ((target("fpu=neon")))
> foo(void)
> {
> return vaddl_s8 (a, b);
> }
>
> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins ,
> because the mismatch and DECL_MODE current's TYPE_MODE used in
> expand_builtin for global variables.
>
> but the best explanation is in the vector_type_mode:
> /* Vector types need to re-check the target flags each time we report
> the machine mode. We need to do this because attribute target can
> change the result of vector_mode_supported_p and have_regs_of_mode
> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can
> change on a per-function basis. */
>
> I first tried to hack the 2 machine descriptions to insert convert_to_mode
> or relayout_decls here and there, but I found this very fragile. Instead a
> more central relayout the of type while expanding gave good results, as
> proposed here.
>
> bootstraped and tested with no regression for arm, aarch64 and i586.
>
> Does this look to be the right approach ?
>
> nb: for testing this patch is complementary with
>
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
>
> thanks for your comments.
A x86 specific testcase that ICEs as well:
typedef int v8si __attribute__((vector_size(32)));
v8si a;
v8si __attribute__((target("avx"))) foo()
{
return a;
}
in your patch not using the shared DECL_RTL of the global var
"fixes" this so I think a conceptually better fix would be to
"adjust" DECL_RTL from globals via a adjust_address (or so).
Also given that we do
/* ... fall through ... */
case FUNCTION_DECL:
case RESULT_DECL:
decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
gcc_assert (decl_rtl);
decl_rtl = copy_rtx (decl_rtl);
thus always "unshare" DECL_RTL anyway it might be not so
bad to simply do
decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
instead of that to avoid one copy.
Index: expr.c
===================================================================
--- expr.c (revision 232496)
+++ expr.c (working copy)
@@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
gcc_assert (decl_rtl);
- decl_rtl = copy_rtx (decl_rtl);
+ if (MEM_P (decl_rtl))
+ decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
+ else
+ decl_rtl = copy_rtx (decl_rtl);
/* Record writes to register variables. */
if (modifier == EXPAND_WRITE
&& REG_P (decl_rtl)
untested apart from on the x86_64 testcase (which it fixes). One could guard
this further to only apply on vector typed decls with mismatched mode of course.
I think that re-layouting globals is not very good design.
Richard.
>
>
>
>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener
@ 2016-01-19 15:01 ` Christian Bruel
2016-01-19 15:13 ` Christian Bruel
0 siblings, 1 reply; 11+ messages in thread
From: Christian Bruel @ 2016-01-19 15:01 UTC (permalink / raw)
To: Richard Biener
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
Hi Richard,
thanks for your input,
On 01/18/2016 12:36 PM, Richard Biener wrote:
> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote:
>> When compiling code with attribute targets on arm or aarch64,
>> vector_type_mode returns different results (eg Vmode or BLKmode) depending
>> on the current simd flags that are not set between functions.
>>
>> for example the following code:
>>
>> #include <arm_neon.h>
>>
>> extern int8x8_t a;
>> extern int8x8_t b;
>>
>> int16x8_t
>> __attribute__ ((target("fpu=neon")))
>> foo(void)
>> {
>> return vaddl_s8 (a, b);
>> }
>>
>> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins ,
>> because the mismatch and DECL_MODE current's TYPE_MODE used in
>> expand_builtin for global variables.
>>
>> but the best explanation is in the vector_type_mode:
>> /* Vector types need to re-check the target flags each time we report
>> the machine mode. We need to do this because attribute target can
>> change the result of vector_mode_supported_p and have_regs_of_mode
>> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can
>> change on a per-function basis. */
>>
>> I first tried to hack the 2 machine descriptions to insert convert_to_mode
>> or relayout_decls here and there, but I found this very fragile. Instead a
>> more central relayout the of type while expanding gave good results, as
>> proposed here.
>>
>> bootstraped and tested with no regression for arm, aarch64 and i586.
>>
>> Does this look to be the right approach ?
>>
>> nb: for testing this patch is complementary with
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
>>
>> thanks for your comments.
> A x86 specific testcase that ICEs as well:
>
> typedef int v8si __attribute__((vector_size(32)));
> v8si a;
> v8si __attribute__((target("avx"))) foo()
> {
> return a;
> }
>
> in your patch not using the shared DECL_RTL of the global var
> "fixes" this so I think a conceptually better fix would be to
> "adjust" DECL_RTL from globals via a adjust_address (or so).
>
> Also given that we do
>
> /* ... fall through ... */
>
> case FUNCTION_DECL:
> case RESULT_DECL:
> decl_rtl = DECL_RTL (exp);
> expand_decl_rtl:
> gcc_assert (decl_rtl);
> decl_rtl = copy_rtx (decl_rtl);
>
> thus always "unshare" DECL_RTL anyway it might be not so
> bad to simply do
>
> decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>
> instead of that to avoid one copy.
>
> Index: expr.c
> ===================================================================
> --- expr.c (revision 232496)
> +++ expr.c (working copy)
> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
> decl_rtl = DECL_RTL (exp);
> expand_decl_rtl:
> gcc_assert (decl_rtl);
> - decl_rtl = copy_rtx (decl_rtl);
> + if (MEM_P (decl_rtl))
> + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
> + else
> + decl_rtl = copy_rtx (decl_rtl);
> /* Record writes to register variables. */
> if (modifier == EXPAND_WRITE
> && REG_P (decl_rtl)
>
> untested apart from on the x86_64 testcase (which it fixes). One could guard
> this further to only apply on vector typed decls with mismatched mode of course.
>
> I think that re-layouting globals is not very good design.
>
> Richard.
A few other ICEs with this implementation, for instance if the context
is not in a function, such as
typedef __simd64_int8_t int8x8_t;
extern int8x8_t b;
int8x8_t *a = &b;
So, to avoid a var re-layout and a copy_rtx (implied by adjust_address
btw). What about just calling 'change_address' ? like: (very lightly tested)
Index: expr.c
===================================================================
--- expr.c (revision 232564)
+++ expr.c (working copy)
@@ -9392,7 +9392,8 @@
enum expand_modifier modifier, rtx *alt_rtl,
bool inner_reference_p)
{
- rtx op0, op1, temp, decl_rtl;
+ rtx op0, op1, temp;
+ rtx decl_rtl = NULL_RTX;
tree type;
int unsignedp;
machine_mode mode, dmode;
@@ -9590,11 +9591,22 @@
&& (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
layout_decl (exp, 0);
+ decl_rtl = DECL_RTL (exp);
+
+ if (MEM_P (decl_rtl)
+ && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
+ {
+ if (current_function_decl
+ && (! reload_completed && !reload_in_progress))
+ decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
+ }
+
/* ... fall through ... */
case FUNCTION_DECL:
case RESULT_DECL:
- decl_rtl = DECL_RTL (exp);
+ if (! decl_rtl)
+ decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
gcc_assert (decl_rtl);
decl_rtl = copy_rtx (decl_rtl);
I'm not sure that moving the code in the 'expand_decl_rtl' label is
best, as we'd need to test for exp and the case should only happen for
global vars (not functions or results)
thanks,
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-19 15:01 ` Christian Bruel
@ 2016-01-19 15:13 ` Christian Bruel
2016-01-19 15:18 ` Richard Biener
0 siblings, 1 reply; 11+ messages in thread
From: Christian Bruel @ 2016-01-19 15:13 UTC (permalink / raw)
To: Richard Biener
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
On 01/19/2016 04:01 PM, Christian Bruel wrote:
> Hi Richard,
>
> thanks for your input,
>
> On 01/18/2016 12:36 PM, Richard Biener wrote:
>> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com> wrote:
>>> When compiling code with attribute targets on arm or aarch64,
>>> vector_type_mode returns different results (eg Vmode or BLKmode) depending
>>> on the current simd flags that are not set between functions.
>>>
>>> for example the following code:
>>>
>>> #include <arm_neon.h>
>>>
>>> extern int8x8_t a;
>>> extern int8x8_t b;
>>>
>>> int16x8_t
>>> __attribute__ ((target("fpu=neon")))
>>> foo(void)
>>> {
>>> return vaddl_s8 (a, b);
>>> }
>>>
>>> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins ,
>>> because the mismatch and DECL_MODE current's TYPE_MODE used in
>>> expand_builtin for global variables.
>>>
>>> but the best explanation is in the vector_type_mode:
>>> /* Vector types need to re-check the target flags each time we report
>>> the machine mode. We need to do this because attribute target can
>>> change the result of vector_mode_supported_p and have_regs_of_mode
>>> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can
>>> change on a per-function basis. */
>>>
>>> I first tried to hack the 2 machine descriptions to insert convert_to_mode
>>> or relayout_decls here and there, but I found this very fragile. Instead a
>>> more central relayout the of type while expanding gave good results, as
>>> proposed here.
>>>
>>> bootstraped and tested with no regression for arm, aarch64 and i586.
>>>
>>> Does this look to be the right approach ?
>>>
>>> nb: for testing this patch is complementary with
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
>>>
>>> thanks for your comments.
>> A x86 specific testcase that ICEs as well:
>>
>> typedef int v8si __attribute__((vector_size(32)));
>> v8si a;
>> v8si __attribute__((target("avx"))) foo()
>> {
>> return a;
>> }
>>
>> in your patch not using the shared DECL_RTL of the global var
>> "fixes" this so I think a conceptually better fix would be to
>> "adjust" DECL_RTL from globals via a adjust_address (or so).
>>
>> Also given that we do
>>
>> /* ... fall through ... */
>>
>> case FUNCTION_DECL:
>> case RESULT_DECL:
>> decl_rtl = DECL_RTL (exp);
>> expand_decl_rtl:
>> gcc_assert (decl_rtl);
>> decl_rtl = copy_rtx (decl_rtl);
>>
>> thus always "unshare" DECL_RTL anyway it might be not so
>> bad to simply do
>>
>> decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>>
>> instead of that to avoid one copy.
>>
>> Index: expr.c
>> ===================================================================
>> --- expr.c (revision 232496)
>> +++ expr.c (working copy)
>> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
>> decl_rtl = DECL_RTL (exp);
>> expand_decl_rtl:
>> gcc_assert (decl_rtl);
>> - decl_rtl = copy_rtx (decl_rtl);
>> + if (MEM_P (decl_rtl))
>> + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>> + else
>> + decl_rtl = copy_rtx (decl_rtl);
>> /* Record writes to register variables. */
>> if (modifier == EXPAND_WRITE
>> && REG_P (decl_rtl)
>>
>> untested apart from on the x86_64 testcase (which it fixes). One could guard
>> this further to only apply on vector typed decls with mismatched mode of course.
>>
>> I think that re-layouting globals is not very good design.
>>
>> Richard.
> A few other ICEs with this implementation, for instance if the context
> is not in a function, such as
>
> typedef __simd64_int8_t int8x8_t;
>
> extern int8x8_t b;
> int8x8_t *a = &b;
>
> So, to avoid a var re-layout and a copy_rtx (implied by adjust_address
> btw). What about just calling 'change_address' ? like: (very lightly tested)
>
> Index: expr.c
> ===================================================================
> --- expr.c (revision 232564)
> +++ expr.c (working copy)
> @@ -9392,7 +9392,8 @@
> enum expand_modifier modifier, rtx *alt_rtl,
> bool inner_reference_p)
> {
> - rtx op0, op1, temp, decl_rtl;
> + rtx op0, op1, temp;
> + rtx decl_rtl = NULL_RTX;
> tree type;
> int unsignedp;
> machine_mode mode, dmode;
> @@ -9590,11 +9591,22 @@
> && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
> layout_decl (exp, 0);
>
> + decl_rtl = DECL_RTL (exp);
> +
> + if (MEM_P (decl_rtl)
> + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
> + {
> + if (current_function_decl
> + && (! reload_completed && !reload_in_progress))
> + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
> + }
> +
> /* ... fall through ... */
>
> case FUNCTION_DECL:
> case RESULT_DECL:
> - decl_rtl = DECL_RTL (exp);
> + if (! decl_rtl)
> + decl_rtl = DECL_RTL (exp);
> expand_decl_rtl:
> gcc_assert (decl_rtl);
> decl_rtl = copy_rtx (decl_rtl);
>
> I'm not sure that moving the code in the 'expand_decl_rtl' label is
> best, as we'd need to test for exp and the case should only happen for
> global vars (not functions or results)
Here is the alternative implementation, shorter after all. testing in
progress.
Index: expr.c
===================================================================
--- expr.c (revision 232570)
+++ expr.c (working copy)
@@ -9597,6 +9597,15 @@
decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
gcc_assert (decl_rtl);
+
+ if (exp && code == VAR_DECL && MEM_P (decl_rtl)
+ && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
+ {
+ if (current_function_decl
+ && (! reload_completed && !reload_in_progress))
+ decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
+ }
+
decl_rtl = copy_rtx (decl_rtl);
/* Record writes to register variables. */
if (modifier == EXPAND_WRITE
>
> thanks,
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-19 15:13 ` Christian Bruel
@ 2016-01-19 15:18 ` Richard Biener
2016-01-22 11:41 ` Christian Bruel
0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-01-19 15:18 UTC (permalink / raw)
To: Christian Bruel
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
On Tue, Jan 19, 2016 at 4:13 PM, Christian Bruel <christian.bruel@st.com> wrote:
>
>
> On 01/19/2016 04:01 PM, Christian Bruel wrote:
>>
>> Hi Richard,
>>
>> thanks for your input,
>>
>> On 01/18/2016 12:36 PM, Richard Biener wrote:
>>>
>>> On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel <christian.bruel@st.com>
>>> wrote:
>>>>
>>>> When compiling code with attribute targets on arm or aarch64,
>>>> vector_type_mode returns different results (eg Vmode or BLKmode)
>>>> depending
>>>> on the current simd flags that are not set between functions.
>>>>
>>>> for example the following code:
>>>>
>>>> #include <arm_neon.h>
>>>>
>>>> extern int8x8_t a;
>>>> extern int8x8_t b;
>>>>
>>>> int16x8_t
>>>> __attribute__ ((target("fpu=neon")))
>>>> foo(void)
>>>> {
>>>> return vaddl_s8 (a, b);
>>>> }
>>>>
>>>> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins
>>>> ,
>>>> because the mismatch and DECL_MODE current's TYPE_MODE used in
>>>> expand_builtin for global variables.
>>>>
>>>> but the best explanation is in the vector_type_mode:
>>>> /* Vector types need to re-check the target flags each time we report
>>>> the machine mode. We need to do this because attribute target can
>>>> change the result of vector_mode_supported_p and have_regs_of_mode
>>>> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can
>>>> change on a per-function basis. */
>>>>
>>>> I first tried to hack the 2 machine descriptions to insert
>>>> convert_to_mode
>>>> or relayout_decls here and there, but I found this very fragile. Instead
>>>> a
>>>> more central relayout the of type while expanding gave good results, as
>>>> proposed here.
>>>>
>>>> bootstraped and tested with no regression for arm, aarch64 and i586.
>>>>
>>>> Does this look to be the right approach ?
>>>>
>>>> nb: for testing this patch is complementary with
>>>>
>>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html
>>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html
>>>>
>>>> thanks for your comments.
>>>
>>> A x86 specific testcase that ICEs as well:
>>>
>>> typedef int v8si __attribute__((vector_size(32)));
>>> v8si a;
>>> v8si __attribute__((target("avx"))) foo()
>>> {
>>> return a;
>>> }
>>>
>>> in your patch not using the shared DECL_RTL of the global var
>>> "fixes" this so I think a conceptually better fix would be to
>>> "adjust" DECL_RTL from globals via a adjust_address (or so).
>>>
>>> Also given that we do
>>>
>>> /* ... fall through ... */
>>>
>>> case FUNCTION_DECL:
>>> case RESULT_DECL:
>>> decl_rtl = DECL_RTL (exp);
>>> expand_decl_rtl:
>>> gcc_assert (decl_rtl);
>>> decl_rtl = copy_rtx (decl_rtl);
>>>
>>> thus always "unshare" DECL_RTL anyway it might be not so
>>> bad to simply do
>>>
>>> decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>>>
>>> instead of that to avoid one copy.
>>>
>>> Index: expr.c
>>> ===================================================================
>>> --- expr.c (revision 232496)
>>> +++ expr.c (working copy)
>>> @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target
>>> decl_rtl = DECL_RTL (exp);
>>> expand_decl_rtl:
>>> gcc_assert (decl_rtl);
>>> - decl_rtl = copy_rtx (decl_rtl);
>>> + if (MEM_P (decl_rtl))
>>> + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0);
>>> + else
>>> + decl_rtl = copy_rtx (decl_rtl);
>>> /* Record writes to register variables. */
>>> if (modifier == EXPAND_WRITE
>>> && REG_P (decl_rtl)
>>>
>>> untested apart from on the x86_64 testcase (which it fixes). One could
>>> guard
>>> this further to only apply on vector typed decls with mismatched mode of
>>> course.
>>>
>>> I think that re-layouting globals is not very good design.
>>>
>>> Richard.
>>
>> A few other ICEs with this implementation, for instance if the context
>> is not in a function, such as
>>
>> typedef __simd64_int8_t int8x8_t;
>>
>> extern int8x8_t b;
>> int8x8_t *a = &b;
>>
>> So, to avoid a var re-layout and a copy_rtx (implied by adjust_address
>> btw). What about just calling 'change_address' ? like: (very lightly
>> tested)
>>
>> Index: expr.c
>> ===================================================================
>> --- expr.c (revision 232564)
>> +++ expr.c (working copy)
>> @@ -9392,7 +9392,8 @@
>> enum expand_modifier modifier, rtx *alt_rtl,
>> bool inner_reference_p)
>> {
>> - rtx op0, op1, temp, decl_rtl;
>> + rtx op0, op1, temp;
>> + rtx decl_rtl = NULL_RTX;
>> tree type;
>> int unsignedp;
>> machine_mode mode, dmode;
>> @@ -9590,11 +9591,22 @@
>> && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>> layout_decl (exp, 0);
>>
>> + decl_rtl = DECL_RTL (exp);
>> +
>> + if (MEM_P (decl_rtl)
>> + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
>> + {
>> + if (current_function_decl
>> + && (! reload_completed && !reload_in_progress))
>> + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
>> + }
>> +
>> /* ... fall through ... */
>>
>> case FUNCTION_DECL:
>> case RESULT_DECL:
>> - decl_rtl = DECL_RTL (exp);
>> + if (! decl_rtl)
>> + decl_rtl = DECL_RTL (exp);
>> expand_decl_rtl:
>> gcc_assert (decl_rtl);
>> decl_rtl = copy_rtx (decl_rtl);
>>
>> I'm not sure that moving the code in the 'expand_decl_rtl' label is
>> best, as we'd need to test for exp and the case should only happen for
>> global vars (not functions or results)
>
>
> Here is the alternative implementation, shorter after all. testing in
> progress.
>
> Index: expr.c
> ===================================================================
> --- expr.c (revision 232570)
> +++ expr.c (working copy)
> @@ -9597,6 +9597,15 @@
> decl_rtl = DECL_RTL (exp);
> expand_decl_rtl:
> gcc_assert (decl_rtl);
> +
> + if (exp && code == VAR_DECL && MEM_P (decl_rtl)
> + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode))
> + {
> + if (current_function_decl
> + && (! reload_completed && !reload_in_progress))
maybe just if (currently_expanding_to_rtl)?
But yes, this looks like a safe variant of the fix.
Richard.
> + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
> + }
> +
> decl_rtl = copy_rtx (decl_rtl);
> /* Record writes to register variables. */
> if (modifier == EXPAND_WRITE
>
>>
>> thanks,
>>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-19 15:18 ` Richard Biener
@ 2016-01-22 11:41 ` Christian Bruel
2016-01-22 11:56 ` Richard Biener
0 siblings, 1 reply; 11+ messages in thread
From: Christian Bruel @ 2016-01-22 11:41 UTC (permalink / raw)
To: Richard Biener
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
[-- Attachment #1: Type: text/plain, Size: 440 bytes --]
On 01/19/2016 04:18 PM, Richard Biener wrote:
> maybe just if (currently_expanding_to_rtl)?
>
> But yes, this looks like a safe variant of the fix.
>
> Richard.
>
thanks, currently_expanding_to_rtl works perfectly. So the final version.
I added a test for each target.
bootstrapped / tested for :
unix/-m32/-march=i586
unix
arm-qemu/
arm-qemu//-mfpu=neon
arm-qemu//-mfpu=neon-fp-armv8
aarch64-qemu
[-- Attachment #2: pr68674.patch --]
[-- Type: text/x-patch, Size: 2950 bytes --]
2016-01-21 Christian Bruel <christian.bruel@st.com>
PR target/68674
* expr.c (expand_expr_real_1): Reset DECL_MODE if VECTOR_TYPE_P changed.
2016-01-21 Christian Bruel <christian.bruel@st.com>
PR target/68674
* gcc.target/i386/pr68674.c
* gcc.target/aarch64/pr68674.c
* gcc.target/arm/pr68674.c
Index: gcc/expr.c
===================================================================
--- gcc/expr.c (revision 232724)
+++ gcc/expr.c (working copy)
@@ -9597,7 +9597,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_
decl_rtl = DECL_RTL (exp);
expand_decl_rtl:
gcc_assert (decl_rtl);
- decl_rtl = copy_rtx (decl_rtl);
+
+ /* DECL_MODE might change when TYPE_MODE depends on attribute target
+ settings for VECTOR_TYPE_P that might switch for the function. */
+ if (currently_expanding_to_rtl
+ && code == VAR_DECL && MEM_P (decl_rtl)
+ && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
+ decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
+ else
+ decl_rtl = copy_rtx (decl_rtl);
+
/* Record writes to register variables. */
if (modifier == EXPAND_WRITE
&& REG_P (decl_rtl)
Index: gcc/testsuite/gcc.target/aarch64/pr68674.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/pr68674.c (revision 0)
+++ gcc/testsuite/gcc.target/aarch64/pr68674.c (working copy)
@@ -0,0 +1,22 @@
+/* PR target/68674 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=generic+nosimd" } */
+
+#include <arm_neon.h>
+
+int8x8_t a;
+extern int8x8_t b;
+int16x8_t e;
+
+void __attribute__((target("+simd")))
+foo1(void)
+{
+ e = (int16x8_t) vaddl_s8(a, b);
+}
+
+int8x8_t __attribute__((target("+simd")))
+foo2(void)
+{
+ return a;
+}
+
Index: gcc/testsuite/gcc.target/arm/pr68674.c
===================================================================
--- gcc/testsuite/gcc.target/arm/pr68674.c (revision 0)
+++ gcc/testsuite/gcc.target/arm/pr68674.c (working copy)
@@ -0,0 +1,26 @@
+/* PR target/68674 */
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mfloat-abi=softfp" } */
+
+#pragma GCC target ("fpu=vfp")
+
+#include <arm_neon.h>
+
+int8x8_t a;
+extern int8x8_t b;
+int16x8_t e;
+
+void __attribute__((target("fpu=neon")))
+foo1(void)
+{
+ e = (int16x8_t) vaddl_s8(a, b);
+}
+
+int8x8_t __attribute__((target("fpu=neon")))
+foo2(void)
+{
+ return b;
+}
+
+
Index: gcc/testsuite/gcc.target/i386/pr68674.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr68674.c (revision 0)
+++ gcc/testsuite/gcc.target/i386/pr68674.c (working copy)
@@ -0,0 +1,15 @@
+/* PR target/68674 */
+/* { dg-do compile } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-O2" } */
+
+typedef int v8si __attribute__((vector_size(32)));
+
+v8si a;
+
+ __attribute__((target("avx")))
+v8si
+foo()
+{
+ return a;
+}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-22 11:41 ` Christian Bruel
@ 2016-01-22 11:56 ` Richard Biener
2016-01-25 19:06 ` Christophe Lyon
0 siblings, 1 reply; 11+ messages in thread
From: Richard Biener @ 2016-01-22 11:56 UTC (permalink / raw)
To: Christian Bruel
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel
<christian.bruel@st.com> wrote:
>
>
> On 01/19/2016 04:18 PM, Richard Biener wrote:
>>
>> maybe just if (currently_expanding_to_rtl)?
>>
>> But yes, this looks like a safe variant of the fix.
>>
>> Richard.
>>
> thanks, currently_expanding_to_rtl works perfectly. So the final version.
> I added a test for each target.
Ok.
Thanks,
Richard.
> bootstrapped / tested for :
> unix/-m32/-march=i586
> unix
>
> arm-qemu/
> arm-qemu//-mfpu=neon
> arm-qemu//-mfpu=neon-fp-armv8
>
> aarch64-qemu
>
>
>
>
>
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-22 11:56 ` Richard Biener
@ 2016-01-25 19:06 ` Christophe Lyon
2016-01-26 8:17 ` Christian Bruel
2016-01-26 9:27 ` Kyrill Tkachov
0 siblings, 2 replies; 11+ messages in thread
From: Christophe Lyon @ 2016-01-25 19:06 UTC (permalink / raw)
To: Richard Biener
Cc: Christian Bruel, kyrylo.tkachov, Richard Earnshaw,
ramana.radhakrishnan, bschmidt, GCC Patches
[-- Attachment #1: Type: text/plain, Size: 1012 bytes --]
On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote:
> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel
> <christian.bruel@st.com> wrote:
>>
>>
>> On 01/19/2016 04:18 PM, Richard Biener wrote:
>>>
>>> maybe just if (currently_expanding_to_rtl)?
>>>
>>> But yes, this looks like a safe variant of the fix.
>>>
>>> Richard.
>>>
>> thanks, currently_expanding_to_rtl works perfectly. So the final version.
>> I added a test for each target.
>
> Ok.
>
Hi,
This small patch is needed to make the new test pass on arm hard-float
targets (eg. arm-none-linux-gnueabihf).
I'm not sure it counts as obvious, so here it is.
OK?
Christophe.
DATE Christophe Lyon <christophe.lyon@linaro.org>
* gcc.target/arm/pr68674.c: Check and use arm_fp effective target.
> Thanks,
> Richard.
>
>> bootstrapped / tested for :
>> unix/-m32/-march=i586
>> unix
>>
>> arm-qemu/
>> arm-qemu//-mfpu=neon
>> arm-qemu//-mfpu=neon-fp-armv8
>>
>> aarch64-qemu
>>
>>
>>
>>
>>
>>
>>
[-- Attachment #2: pr68674-test.patch.txt --]
[-- Type: text/plain, Size: 521 bytes --]
diff --git a/gcc/testsuite/gcc.target/arm/pr68674.c b/gcc/testsuite/gcc.target/arm/pr68674.c
index a31a88a..0b32374 100644
--- a/gcc/testsuite/gcc.target/arm/pr68674.c
+++ b/gcc/testsuite/gcc.target/arm/pr68674.c
@@ -1,7 +1,9 @@
/* PR target/68674 */
/* { dg-do compile } */
/* { dg-require-effective-target arm_neon_ok } */
-/* { dg-options "-O2 -mfloat-abi=softfp" } */
+/* { dg-require-effective-target arm_fp_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_fp } */
#pragma GCC target ("fpu=vfp")
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-25 19:06 ` Christophe Lyon
@ 2016-01-26 8:17 ` Christian Bruel
2016-01-26 9:27 ` Kyrill Tkachov
1 sibling, 0 replies; 11+ messages in thread
From: Christian Bruel @ 2016-01-26 8:17 UTC (permalink / raw)
To: Christophe Lyon, Richard Biener
Cc: kyrylo.tkachov, Richard Earnshaw, ramana.radhakrishnan, bschmidt,
GCC Patches
On 01/25/2016 08:06 PM, Christophe Lyon wrote:
> On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel
>> <christian.bruel@st.com> wrote:
>>>
>>> On 01/19/2016 04:18 PM, Richard Biener wrote:
>>>> maybe just if (currently_expanding_to_rtl)?
>>>>
>>>> But yes, this looks like a safe variant of the fix.
>>>>
>>>> Richard.
>>>>
>>> thanks, currently_expanding_to_rtl works perfectly. So the final version.
>>> I added a test for each target.
>> Ok.
>>
> Hi,
>
> This small patch is needed to make the new test pass on arm hard-float
> targets (eg. arm-none-linux-gnueabihf).
>
> I'm not sure it counts as obvious, so here it is.
> OK?
>
> Christophe.
>
> DATE Christophe Lyon <christophe.lyon@linaro.org>
>
> * gcc.target/arm/pr68674.c: Check and use arm_fp effective target.
At least that's fine with me.
Just for the story, arm_neon.h ends up to include stubs.h from the
sysroot. Depending on the glibc version it might assert that we cannot
use softfp abi with a sysroot built with hardfp.
thanks for spotting this Christophe,
>
>> Thanks,
>> Richard.
>>
>>> bootstrapped / tested for :
>>> unix/-m32/-march=i586
>>> unix
>>>
>>> arm-qemu/
>>> arm-qemu//-mfpu=neon
>>> arm-qemu//-mfpu=neon-fp-armv8
>>>
>>> aarch64-qemu
>>>
>>>
>>>
>>>
>>>
>>>
>>>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr
2016-01-25 19:06 ` Christophe Lyon
2016-01-26 8:17 ` Christian Bruel
@ 2016-01-26 9:27 ` Kyrill Tkachov
1 sibling, 0 replies; 11+ messages in thread
From: Kyrill Tkachov @ 2016-01-26 9:27 UTC (permalink / raw)
To: Christophe Lyon, Richard Biener
Cc: Christian Bruel, Richard Earnshaw, ramana.radhakrishnan,
bschmidt, GCC Patches
On 25/01/16 19:06, Christophe Lyon wrote:
> On 22 January 2016 at 12:56, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Fri, Jan 22, 2016 at 12:41 PM, Christian Bruel
>> <christian.bruel@st.com> wrote:
>>>
>>> On 01/19/2016 04:18 PM, Richard Biener wrote:
>>>> maybe just if (currently_expanding_to_rtl)?
>>>>
>>>> But yes, this looks like a safe variant of the fix.
>>>>
>>>> Richard.
>>>>
>>> thanks, currently_expanding_to_rtl works perfectly. So the final version.
>>> I added a test for each target.
>> Ok.
>>
> Hi,
>
> This small patch is needed to make the new test pass on arm hard-float
> targets (eg. arm-none-linux-gnueabihf).
>
> I'm not sure it counts as obvious, so here it is.
> OK?
Ok.
Thanks,
Kyrill
> Christophe.
>
> DATE Christophe Lyon <christophe.lyon@linaro.org>
>
> * gcc.target/arm/pr68674.c: Check and use arm_fp effective target.
>
>
>> Thanks,
>> Richard.
>>
>>> bootstrapped / tested for :
>>> unix/-m32/-march=i586
>>> unix
>>>
>>> arm-qemu/
>>> arm-qemu//-mfpu=neon
>>> arm-qemu//-mfpu=neon-fp-armv8
>>>
>>> aarch64-qemu
>>>
>>>
>>>
>>>
>>>
>>>
>>>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-01-26 9:27 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-08 13:29 [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr Christian Bruel
2016-01-14 13:09 ` ping:[PATCH][ARM,AARCH64] " Christian Bruel
2016-01-18 11:36 ` [PATCH][ARM,AARCH64] " Richard Biener
2016-01-19 15:01 ` Christian Bruel
2016-01-19 15:13 ` Christian Bruel
2016-01-19 15:18 ` Richard Biener
2016-01-22 11:41 ` Christian Bruel
2016-01-22 11:56 ` Richard Biener
2016-01-25 19:06 ` Christophe Lyon
2016-01-26 8:17 ` Christian Bruel
2016-01-26 9:27 ` Kyrill Tkachov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).