* [PATCH] target/104762 - vectorization costs of CONSTRUCTORs
@ 2022-03-11 12:42 Richard Biener
2022-03-11 12:56 ` Hongtao Liu
From: Richard Biener @ 2022-03-11 12:42 UTC (permalink / raw)
To: gcc-patches
After accounting for the GPR -> XMM move cost for vec_construct, the
base cost needs adjusting so that those moves are not costed twice.
This also lowers the cost when such a move is not necessary.
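The effect of the adjustment can be illustrated with a small standalone sketch. It is not the GCC code itself: `CostTable`, `vec_combine`, `old_vec_construct_cost` and `new_vec_construct_cost` are hypothetical names, and `vec_combine` stands in for `ix86_vec_cost (mode, ix86_cost->addss)`, which in GCC depends on the mode and tuning and is treated here as a single constant for simplicity.

```cpp
#include <cstdlib>

// Hypothetical stand-in for the two ix86_cost-derived values the patch uses.
struct CostTable {
  int sse_op;       // cost of one SSE element insert
  int vec_combine;  // assumed constant; models ix86_vec_cost (mode, addss)
};

// Base cost before the patch: one insert per element, plus the
// 128-bit combining steps for 256-bit and 512-bit vectors.
int old_vec_construct_cost(int n, int bits, const CostTable &c) {
  int cost = n * c.sse_op;
  if (bits == 256)
    cost += c.vec_combine;
  else if (bits == 512)
    cost += 3 * c.vec_combine;
  return cost;
}

// Base cost after the patch: the first lane of each 128-bit piece is no
// longer counted as an insert, so one sse_op is dropped per SSE piece.
int new_vec_construct_cost(int n, int bits, const CostTable &c) {
  if (bits <= 128)
    return (n - 1) * c.sse_op;
  if (bits == 256)
    return (n - 2) * c.sse_op + c.vec_combine;
  if (bits == 512)
    return (n - 4) * c.sse_op + 3 * c.vec_combine;
  std::abort();  // mirrors gcc_unreachable () for other sizes
}
```

With made-up values `sse_op = 4` and `vec_combine = 6`, an 8-element 256-bit construct drops from 38 to 30 under this model, that is, by one `sse_op` for each of its two SSE pieces.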
This fixes the observed 538.imagick_r and 525.x264_r regressions
for me on Zen2 with -Ofast -march=native.
Bootstrapped and tested on x86_64-unknown-linux-gnu.
OK for trunk?
Thanks,
Richard.
2022-03-11 Richard Biener <rguenther@suse.de>
PR target/104762
* config/i386/i386.cc (ix86_builtin_vectorization_cost): Do not
cost the first lane of SSE pieces as inserts for vec_construct.
---
gcc/config/i386/i386.cc | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 4121f986221..23bedea92bd 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22597,16 +22597,21 @@ ix86_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
       case vec_construct:
         {
-          /* N element inserts into SSE vectors. */
-          int cost = TYPE_VECTOR_SUBPARTS (vectype) * ix86_cost->sse_op;
+          int n = TYPE_VECTOR_SUBPARTS (vectype);
+          /* N - 1 element inserts into an SSE vector, the possible
+             GPR -> XMM move is accounted for in add_stmt_cost. */
+          if (GET_MODE_BITSIZE (mode) <= 128)
+            return (n - 1) * ix86_cost->sse_op;
           /* One vinserti128 for combining two SSE vectors for AVX256. */
-          if (GET_MODE_BITSIZE (mode) == 256)
-            cost += ix86_vec_cost (mode, ix86_cost->addss);
+          else if (GET_MODE_BITSIZE (mode) == 256)
+            return ((n - 2) * ix86_cost->sse_op
+                    + ix86_vec_cost (mode, ix86_cost->addss));
           /* One vinserti64x4 and two vinserti128 for combining SSE
              and AVX256 vectors to AVX512. */
           else if (GET_MODE_BITSIZE (mode) == 512)
-            cost += 3 * ix86_vec_cost (mode, ix86_cost->addss);
-          return cost;
+            return ((n - 4) * ix86_cost->sse_op
+                    + 3 * ix86_vec_cost (mode, ix86_cost->addss));
+          gcc_unreachable ();
         }
       default:
--
2.34.1
* Re: [PATCH] target/104762 - vectorization costs of CONSTRUCTORs
From: Hongtao Liu @ 2022-03-11 12:56 UTC (permalink / raw)
To: Richard Biener; +Cc: GCC Patches
On Fri, Mar 11, 2022 at 8:43 PM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> After accounting for the GPR -> XMM move cost for vec_construct, the
> base cost needs adjusting so that those moves are not costed twice.
> This also lowers the cost when such a move is not necessary.
>
> This fixes the observed 538.imagick_r and 525.x264_r regressions
> for me on Zen2 with -Ofast -march=native.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>
> OK for trunk?
LGTM.
--
BR,
Hongtao