* veclower: improve selection of vector mode when lowering [PR 112787]
@ 2023-12-06 18:49 Andre Vieira (lists)
2023-12-07 7:45 ` Richard Biener
0 siblings, 1 reply; 6+ messages in thread
From: Andre Vieira (lists) @ 2023-12-06 18:49 UTC (permalink / raw)
To: gcc-patches; +Cc: Richard Sandiford, Richard Biener
[-- Attachment #1: Type: text/plain, Size: 712 bytes --]
Hi,
This patch addresses the issue reported in PR target/112787 by improving the
compute type selection. We do this by not considering types with more
elements
than the type we are lowering since we'd reject such types anyway.
gcc/ChangeLog:
PR target/112787
* tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
control maximum amount of elements in resulting vector mode.
(get_compute_type): Restrict vector_compute_type to a mode no wider
than the original compute type.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr112787.c: New test.
Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
x86_64-pc-linux-gnu.
Is this OK for trunk?
Kind regards,
Andre Vieira
[-- Attachment #2: pr112787.patch --]
[-- Type: text/plain, Size: 2149 bytes --]
diff --git a/gcc/testsuite/gcc.target/aarch64/pr112787.c b/gcc/testsuite/gcc.target/aarch64/pr112787.c
new file mode 100644
index 0000000000000000000000000000000000000000..caca1bf7ef447e4489b2c134d7200a4afd16763f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr112787.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -march=armv8-a+sve -mcpu=neoverse-n2" } */
+
+typedef int __attribute__((__vector_size__ (64))) vec;
+
+vec fn (vec a, vec b)
+{
+ return a + b;
+}
+
+/* { dg-final { scan-assembler-times {add\tv[0-9]+} 4 } } */
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index a7e6cb87a5e31e3dd2a893ea5652eeebf8d5d214..2dbf3c8f5f64f2623944110dbc371fe0944198f0 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator *gsi)
TYPE, or NULL_TREE if none is found. */
static tree
-type_for_widest_vector_mode (tree type, optab op)
+type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits = 0)
{
machine_mode inner_mode = TYPE_MODE (type);
machine_mode best_mode = VOIDmode, mode;
@@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
FOR_EACH_MODE_FROM (mode, mode)
if (GET_MODE_INNER (mode) == inner_mode
&& maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
- && optab_handler (op, mode) != CODE_FOR_nothing)
+ && optab_handler (op, mode) != CODE_FOR_nothing
+ && (known_eq (max_nunits, 0)
+ || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
best_mode = mode, best_nunits = GET_MODE_NUNITS (mode);
if (best_mode == VOIDmode)
@@ -1702,7 +1704,8 @@ get_compute_type (enum tree_code code, optab op, tree type)
|| optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
{
tree vector_compute_type
- = type_for_widest_vector_mode (TREE_TYPE (type), op);
+ = type_for_widest_vector_mode (TREE_TYPE (type), op,
+ TYPE_VECTOR_SUBPARTS (compute_type));
if (vector_compute_type != NULL_TREE
&& subparts_gt (compute_type, vector_compute_type)
&& maybe_ne (TYPE_VECTOR_SUBPARTS (vector_compute_type), 1U)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: veclower: improve selection of vector mode when lowering [PR 112787]
2023-12-06 18:49 veclower: improve selection of vector mode when lowering [PR 112787] Andre Vieira (lists)
@ 2023-12-07 7:45 ` Richard Biener
2023-12-20 14:29 ` Andre Vieira (lists)
0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2023-12-07 7:45 UTC (permalink / raw)
To: Andre Vieira (lists); +Cc: gcc-patches, Richard Sandiford
On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:
> Hi,
>
> This patch addresses the issue reported in PR target/112787 by improving the
> compute type selection. We do this by not considering types with more
> elements
> than the type we are lowering since we'd reject such types anyway.
>
> gcc/ChangeLog:
>
> PR target/112787
> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
> control maximum amount of elements in resulting vector mode.
> (get_compute_type): Restrict vector_compute_type to a mode no wider
> than the original compute type.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/pr112787.c: New test.
>
> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> x86_64-pc-linux-gnu.
>
> Is this OK for trunk?
@@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator
*gsi)
TYPE, or NULL_TREE if none is found. */
Can you improve the function comment? It also doesn't mention OP ...
static tree
-type_for_widest_vector_mode (tree type, optab op)
+type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits =
0)
{
machine_mode inner_mode = TYPE_MODE (type);
machine_mode best_mode = VOIDmode, mode;
@@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
FOR_EACH_MODE_FROM (mode, mode)
if (GET_MODE_INNER (mode) == inner_mode
&& maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
- && optab_handler (op, mode) != CODE_FOR_nothing)
+ && optab_handler (op, mode) != CODE_FOR_nothing
+ && (known_eq (max_nunits, 0)
+ || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
max_nunits suggests that known_le would be appropriate instead.
I see the only other caller with similar "problems":
}
/* Can't use get_compute_type here, as supportable_convert_operation
doesn't necessarily use an optab and needs two arguments. */
tree vec_compute_type
= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
if (vec_compute_type
&& VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
&& subparts_gt (arg_type, vec_compute_type))
so please do not default to 0 but adjust this one as well. It also
seems you then can remove the subparts_gt guards on both
vec_compute_type uses.
I think the API would be cleaner if we'd pass the original vector type
we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.
No?
Thanks,
Richard.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: veclower: improve selection of vector mode when lowering [PR 112787]
2023-12-07 7:45 ` Richard Biener
@ 2023-12-20 14:29 ` Andre Vieira (lists)
2023-12-20 14:30 ` Richard Biener
0 siblings, 1 reply; 6+ messages in thread
From: Andre Vieira (lists) @ 2023-12-20 14:29 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, Richard Sandiford
[-- Attachment #1: Type: text/plain, Size: 3080 bytes --]
Thanks, fully agree with all comments.
gcc/ChangeLog:
PR target/112787
* tree-vect-generic (type_for_widest_vector_mode): Change function
to use original vector type and check widest vector mode has at
most
the same number of elements.
(get_compute_type): Pass original vector type rather than the element
type to type_for_widest_vector_mode and remove now obsolete check
for the number of elements.
On 07/12/2023 07:45, Richard Biener wrote:
> On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:
>
>> Hi,
>>
>> This patch addresses the issue reported in PR target/112787 by improving the
>> compute type selection. We do this by not considering types with more
>> elements
>> than the type we are lowering since we'd reject such types anyway.
>>
>> gcc/ChangeLog:
>>
>> PR target/112787
>> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
>> control maximum amount of elements in resulting vector mode.
>> (get_compute_type): Restrict vector_compute_type to a mode no wider
>> than the original compute type.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/aarch64/pr112787.c: New test.
>>
>> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
>> x86_64-pc-linux-gnu.
>>
>> Is this OK for trunk?
>
> @@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator
> *gsi)
> TYPE, or NULL_TREE if none is found. */
>
> Can you improve the function comment? It also doesn't mention OP ...
>
> static tree
> -type_for_widest_vector_mode (tree type, optab op)
> +type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits =
> 0)
> {
> machine_mode inner_mode = TYPE_MODE (type);
> machine_mode best_mode = VOIDmode, mode;
> @@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
> FOR_EACH_MODE_FROM (mode, mode)
> if (GET_MODE_INNER (mode) == inner_mode
> && maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
> - && optab_handler (op, mode) != CODE_FOR_nothing)
> + && optab_handler (op, mode) != CODE_FOR_nothing
> + && (known_eq (max_nunits, 0)
> + || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
>
> max_nunits suggests that known_le would be appropriate instead.
>
> I see the only other caller with similar "problems":
>
> }
> /* Can't use get_compute_type here, as supportable_convert_operation
> doesn't necessarily use an optab and needs two arguments. */
> tree vec_compute_type
> = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> if (vec_compute_type
> && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
> && subparts_gt (arg_type, vec_compute_type))
>
> so please do not default to 0 but adjust this one as well. It also
> seems you then can remove the subparts_gt guards on both
> vec_compute_type uses.
>
> I think the API would be cleaner if we'd pass the original vector type
> we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.
>
> No?
>
> Thanks,
> Richard.
[-- Attachment #2: pr112787v2.patch --]
[-- Type: text/plain, Size: 3373 bytes --]
diff --git a/gcc/testsuite/gcc.target/aarch64/pr112787.c b/gcc/testsuite/gcc.target/aarch64/pr112787.c
new file mode 100644
index 0000000000000000000000000000000000000000..caca1bf7ef447e4489b2c134d7200a4afd16763f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr112787.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -march=armv8-a+sve -mcpu=neoverse-n2" } */
+
+typedef int __attribute__((__vector_size__ (64))) vec;
+
+vec fn (vec a, vec b)
+{
+ return a + b;
+}
+
+/* { dg-final { scan-assembler-times {add\tv[0-9]+} 4 } } */
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index a7e6cb87a5e31e3dd2a893ea5652eeebf8d5d214..c906eb3521ea01fd2bdfc89c3476d02c555cf8cc 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1343,12 +1343,16 @@ optimize_vector_constructor (gimple_stmt_iterator *gsi)
gsi_replace (gsi, g, false);
}
\f
-/* Return a type for the widest vector mode whose components are of type
- TYPE, or NULL_TREE if none is found. */
+/* Return a type for the widest vector mode with the same element type as
+ type ORIGINAL_VECTOR_TYPE, with at most the same number of elements as type
+ ORIGINAL_VECTOR_TYPE and that is supported by the target for an operation
+ with optab OP, or return NULL_TREE if none is found. */
static tree
-type_for_widest_vector_mode (tree type, optab op)
+type_for_widest_vector_mode (tree original_vector_type, optab op)
{
+ gcc_assert (VECTOR_TYPE_P (original_vector_type));
+ tree type = TREE_TYPE (original_vector_type);
machine_mode inner_mode = TYPE_MODE (type);
machine_mode best_mode = VOIDmode, mode;
poly_int64 best_nunits = 0;
@@ -1371,7 +1375,9 @@ type_for_widest_vector_mode (tree type, optab op)
FOR_EACH_MODE_FROM (mode, mode)
if (GET_MODE_INNER (mode) == inner_mode
&& maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
- && optab_handler (op, mode) != CODE_FOR_nothing)
+ && optab_handler (op, mode) != CODE_FOR_nothing
+ && known_le (GET_MODE_NUNITS (mode),
+ TYPE_VECTOR_SUBPARTS (original_vector_type)))
best_mode = mode, best_nunits = GET_MODE_NUNITS (mode);
if (best_mode == VOIDmode)
@@ -1702,9 +1708,8 @@ get_compute_type (enum tree_code code, optab op, tree type)
|| optab_handler (op, TYPE_MODE (type)) == CODE_FOR_nothing))
{
tree vector_compute_type
- = type_for_widest_vector_mode (TREE_TYPE (type), op);
+ = type_for_widest_vector_mode (type, op);
if (vector_compute_type != NULL_TREE
- && subparts_gt (compute_type, vector_compute_type)
&& maybe_ne (TYPE_VECTOR_SUBPARTS (vector_compute_type), 1U)
&& (optab_handler (op, TYPE_MODE (vector_compute_type))
!= CODE_FOR_nothing))
@@ -1877,10 +1882,9 @@ expand_vector_conversion (gimple_stmt_iterator *gsi)
/* Can't use get_compute_type here, as supportable_convert_operation
doesn't necessarily use an optab and needs two arguments. */
tree vec_compute_type
- = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
+ = type_for_widest_vector_mode (arg_type, mov_optab);
if (vec_compute_type
- && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
- && subparts_gt (arg_type, vec_compute_type))
+ && VECTOR_MODE_P (TYPE_MODE (vec_compute_type)))
{
unsigned HOST_WIDE_INT nelts
= constant_lower_bound (TYPE_VECTOR_SUBPARTS (vec_compute_type));
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: veclower: improve selection of vector mode when lowering [PR 112787]
2023-12-20 14:29 ` Andre Vieira (lists)
@ 2023-12-20 14:30 ` Richard Biener
2024-02-19 14:00 ` Andre Vieira (lists)
0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2023-12-20 14:30 UTC (permalink / raw)
To: Andre Vieira (lists); +Cc: gcc-patches, Richard Sandiford
On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
> Thanks, fully agree with all comments.
>
> gcc/ChangeLog:
>
> PR target/112787
> * tree-vect-generic (type_for_widest_vector_mode): Change function
> to use original vector type and check widest vector mode has at most
> the same number of elements.
> (get_compute_type): Pass original vector type rather than the element
> type to type_for_widest_vector_mode and remove now obsolete check
> for the number of elements.
OK.
Richard.
> On 07/12/2023 07:45, Richard Biener wrote:
> > On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >>
> >> This patch addresses the issue reported in PR target/112787 by improving
> >> the
> >> compute type selection. We do this by not considering types with more
> >> elements
> >> than the type we are lowering since we'd reject such types anyway.
> >>
> >> gcc/ChangeLog:
> >>
> >> PR target/112787
> >> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
> >> control maximum amount of elements in resulting vector mode.
> >> (get_compute_type): Restrict vector_compute_type to a mode no wider
> >> than the original compute type.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.target/aarch64/pr112787.c: New test.
> >>
> >> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> >> x86_64-pc-linux-gnu.
> >>
> >> Is this OK for trunk?
> >
> > @@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator
> > *gsi)
> > TYPE, or NULL_TREE if none is found. */
> >
> > Can you improve the function comment? It also doesn't mention OP ...
> >
> > static tree
> > -type_for_widest_vector_mode (tree type, optab op)
> > +type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits =
> > 0)
> > {
> > machine_mode inner_mode = TYPE_MODE (type);
> > machine_mode best_mode = VOIDmode, mode;
> > @@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
> > FOR_EACH_MODE_FROM (mode, mode)
> > if (GET_MODE_INNER (mode) == inner_mode
> > && maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
> > - && optab_handler (op, mode) != CODE_FOR_nothing)
> > + && optab_handler (op, mode) != CODE_FOR_nothing
> > + && (known_eq (max_nunits, 0)
> > + || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
> >
> > max_nunits suggests that known_le would be appropriate instead.
> >
> > I see the only other caller with similar "problems":
> >
> > }
> > /* Can't use get_compute_type here, as supportable_convert_operation
> > doesn't necessarily use an optab and needs two arguments. */
> > tree vec_compute_type
> > = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> > if (vec_compute_type
> > && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
> > && subparts_gt (arg_type, vec_compute_type))
> >
> > so please do not default to 0 but adjust this one as well. It also
> > seems you then can remove the subparts_gt guards on both
> > vec_compute_type uses.
> >
> > I think the API would be cleaner if we'd pass the original vector type
> > we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.
> >
> > No?
> >
> > Thanks,
> > Richard.
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: veclower: improve selection of vector mode when lowering [PR 112787]
2023-12-20 14:30 ` Richard Biener
@ 2024-02-19 14:00 ` Andre Vieira (lists)
2024-02-19 14:40 ` Richard Biener
0 siblings, 1 reply; 6+ messages in thread
From: Andre Vieira (lists) @ 2024-02-19 14:00 UTC (permalink / raw)
To: Richard Biener; +Cc: gcc-patches, Richard Sandiford
Hi all,
OK to backport this to gcc-12 and gcc-13? Patch applies cleanly,
bootstrapped and regression tested on aarch64-unknown-linux-gnu. Only
change is in the testcase as I had to use -march=armv9-a because
-march=armv8-a+sve conflicts with -mcpu=neoverse-n2 in previous gcc
versions.
Kind Regards,
Andre
On 20/12/2023 14:30, Richard Biener wrote:
> On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
>
>> Thanks, fully agree with all comments.
>>
>> gcc/ChangeLog:
>>
>> PR target/112787
>> * tree-vect-generic (type_for_widest_vector_mode): Change function
>> to use original vector type and check widest vector mode has at most
>> the same number of elements.
>> (get_compute_type): Pass original vector type rather than the element
>> type to type_for_widest_vector_mode and remove now obsolete check
>> for the number of elements.
>
> OK.
>
> Richard.
>
>> On 07/12/2023 07:45, Richard Biener wrote:
>>> On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:
>>>
>>>> Hi,
>>>>
>>>> This patch addresses the issue reported in PR target/112787 by improving
>>>> the
>>>> compute type selection. We do this by not considering types with more
>>>> elements
>>>> than the type we are lowering since we'd reject such types anyway.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> PR target/112787
>>>> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
>>>> control maximum amount of elements in resulting vector mode.
>>>> (get_compute_type): Restrict vector_compute_type to a mode no wider
>>>> than the original compute type.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> * gcc.target/aarch64/pr112787.c: New test.
>>>>
>>>> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
>>>> x86_64-pc-linux-gnu.
>>>>
>>>> Is this OK for trunk?
>>>
>>> @@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator
>>> *gsi)
>>> TYPE, or NULL_TREE if none is found. */
>>>
>>> Can you improve the function comment? It also doesn't mention OP ...
>>>
>>> static tree
>>> -type_for_widest_vector_mode (tree type, optab op)
>>> +type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits =
>>> 0)
>>> {
>>> machine_mode inner_mode = TYPE_MODE (type);
>>> machine_mode best_mode = VOIDmode, mode;
>>> @@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
>>> FOR_EACH_MODE_FROM (mode, mode)
>>> if (GET_MODE_INNER (mode) == inner_mode
>>> && maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
>>> - && optab_handler (op, mode) != CODE_FOR_nothing)
>>> + && optab_handler (op, mode) != CODE_FOR_nothing
>>> + && (known_eq (max_nunits, 0)
>>> + || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
>>>
>>> max_nunits suggests that known_le would be appropriate instead.
>>>
>>> I see the only other caller with similar "problems":
>>>
>>> }
>>> /* Can't use get_compute_type here, as supportable_convert_operation
>>> doesn't necessarily use an optab and needs two arguments. */
>>> tree vec_compute_type
>>> = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
>>> if (vec_compute_type
>>> && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
>>> && subparts_gt (arg_type, vec_compute_type))
>>>
>>> so please do not default to 0 but adjust this one as well. It also
>>> seems you then can remove the subparts_gt guards on both
>>> vec_compute_type uses.
>>>
>>> I think the API would be cleaner if we'd pass the original vector type
>>> we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.
>>>
>>> No?
>>>
>>> Thanks,
>>> Richard.
>>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: veclower: improve selection of vector mode when lowering [PR 112787]
2024-02-19 14:00 ` Andre Vieira (lists)
@ 2024-02-19 14:40 ` Richard Biener
0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2024-02-19 14:40 UTC (permalink / raw)
To: Andre Vieira (lists); +Cc: gcc-patches, Richard Sandiford
On Mon, 19 Feb 2024, Andre Vieira (lists) wrote:
> Hi all,
>
> OK to backport this to gcc-12 and gcc-13? Patch applies cleanly, bootstrapped
> and regression tested on aarch64-unknown-linux-gnu. Only change is in the
> testcase as I had to use -march=armv9-a because -march=armv8-a+sve conflicts
> with -mcpu=neoverse-n2 in previous gcc versions.
Yes.
Thanks,
Richard.
> Kind Regards,
> Andre
>
> On 20/12/2023 14:30, Richard Biener wrote:
> > On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:
> >
> >> Thanks, fully agree with all comments.
> >>
> >> gcc/ChangeLog:
> >>
> >> PR target/112787
> >> * tree-vect-generic (type_for_widest_vector_mode): Change function
> >> to use original vector type and check widest vector mode has at
> >> most
> >> the same number of elements.
> >> (get_compute_type): Pass original vector type rather than the element
> >> type to type_for_widest_vector_mode and remove now obsolete check
> >> for the number of elements.
> >
> > OK.
> >
> > Richard.
> >
> >> On 07/12/2023 07:45, Richard Biener wrote:
> >>> On Wed, 6 Dec 2023, Andre Vieira (lists) wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> This patch addresses the issue reported in PR target/112787 by improving
> >>>> the
> >>>> compute type selection. We do this by not considering types with more
> >>>> elements
> >>>> than the type we are lowering since we'd reject such types anyway.
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>> PR target/112787
> >>>> * tree-vect-generic (type_for_widest_vector_mode): Add a parameter to
> >>>> control maximum amount of elements in resulting vector mode.
> >>>> (get_compute_type): Restrict vector_compute_type to a mode no wider
> >>>> than the original compute type.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>> * gcc.target/aarch64/pr112787.c: New test.
> >>>>
> >>>> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> >>>> x86_64-pc-linux-gnu.
> >>>>
> >>>> Is this OK for trunk?
> >>>
> >>> @@ -1347,7 +1347,7 @@ optimize_vector_constructor (gimple_stmt_iterator
> >>> *gsi)
> >>> TYPE, or NULL_TREE if none is found. */
> >>>
> >>> Can you improve the function comment? It also doesn't mention OP ...
> >>>
> >>> static tree
> >>> -type_for_widest_vector_mode (tree type, optab op)
> >>> +type_for_widest_vector_mode (tree type, optab op, poly_int64 max_nunits =
> >>> 0)
> >>> {
> >>> machine_mode inner_mode = TYPE_MODE (type);
> >>> machine_mode best_mode = VOIDmode, mode;
> >>> @@ -1371,7 +1371,9 @@ type_for_widest_vector_mode (tree type, optab op)
> >>> FOR_EACH_MODE_FROM (mode, mode)
> >>> if (GET_MODE_INNER (mode) == inner_mode
> >>> && maybe_gt (GET_MODE_NUNITS (mode), best_nunits)
> >>> - && optab_handler (op, mode) != CODE_FOR_nothing)
> >>> + && optab_handler (op, mode) != CODE_FOR_nothing
> >>> + && (known_eq (max_nunits, 0)
> >>> + || known_lt (GET_MODE_NUNITS (mode), max_nunits)))
> >>>
> >>> max_nunits suggests that known_le would be appropriate instead.
> >>>
> >>> I see the only other caller with similar "problems":
> >>>
> >>> }
> >>> /* Can't use get_compute_type here, as
> >>> supportable_convert_operation
> >>> doesn't necessarily use an optab and needs two arguments. */
> >>> tree vec_compute_type
> >>> = type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> >>> if (vec_compute_type
> >>> && VECTOR_MODE_P (TYPE_MODE (vec_compute_type))
> >>> && subparts_gt (arg_type, vec_compute_type))
> >>>
> >>> so please do not default to 0 but adjust this one as well. It also
> >>> seems you then can remove the subparts_gt guards on both
> >>> vec_compute_type uses.
> >>>
> >>> I think the API would be cleaner if we'd pass the original vector type
> >>> we can then extract TYPE_VECTOR_SUBPARTS from, avoiding the extra arg.
> >>>
> >>> No?
> >>>
> >>> Thanks,
> >>> Richard.
> >>
> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-02-19 14:40 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-06 18:49 veclower: improve selection of vector mode when lowering [PR 112787] Andre Vieira (lists)
2023-12-07 7:45 ` Richard Biener
2023-12-20 14:29 ` Andre Vieira (lists)
2023-12-20 14:30 ` Richard Biener
2024-02-19 14:00 ` Andre Vieira (lists)
2024-02-19 14:40 ` Richard Biener
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).