From: Uros Bizjak <ubizjak@gmail.com>
To: liuhongt <hongtao.liu@intel.com>
Cc: gcc-patches@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH] Optimize vlddqu to vmovdqu for TARGET_AVX
Date: Thu, 20 Jul 2023 10:10:47 +0200
Message-ID: <CAFULd4asbMZkPF2u0=7DgF-xBR09Mfzc0z+jAYAkt7ZAch7DnQ@mail.gmail.com>
In-Reply-To: <20230720073516.2171485-1-hongtao.liu@intel.com>

On Thu, Jul 20, 2023 at 9:35 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> On Intel processors with AVX (TARGET_AVX), vmovdqu is as fast as
> vlddqu, so UNSPEC_LDDQU can be removed to enable more optimizations.
> Can someone confirm this with AMD folks?
> If AMD doesn't like this optimization, I'll put it under
> micro-architecture tuning.
The instruction is reachable only as __builtin_ia32_lddqu* (aka
_mm_lddqu_si*), so it was chosen by the programmer for a reason. I
think that in this case, the compiler should not be too smart and
change the instruction behind the programmer's back. The caveats are
also explained at length in the ISA manual.
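
To make the distinction concrete, here is a minimal sketch (not part of
the patch) that contrasts the explicit request with an ordinary
unaligned load. The helper names load_lddqu, load_movdqu, loads_agree
and check_example are hypothetical, and the target("sse3") attribute is
only an assumption that lets the file build without -msse3 on x86-64.
Both loads return the same 16 bytes; the difference is only in which
instruction the programmer asked for.

```c
#include <assert.h>
#include <immintrin.h>
#include <string.h>

/* Explicit request for LDDQU: the programmer picked this intrinsic. */
__attribute__((target("sse3")))
static __m128i load_lddqu(const void *p)
{
    return _mm_lddqu_si128((const __m128i *)p);
}

/* Ordinary unaligned load: the compiler is free to pick the encoding
   or fold the load into a later instruction. */
static __m128i load_movdqu(const void *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}

/* Returns 1 when both loads yield the same 16 bytes. */
__attribute__((target("sse3")))
static int loads_agree(const void *p)
{
    __m128i a = load_lddqu(p);
    __m128i b = load_movdqu(p);
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Small self-check on an address with no 16-byte alignment guarantee. */
static void check_example(void)
{
    unsigned char buf[32];
    for (int i = 0; i < 32; i++)
        buf[i] = (unsigned char)i;
    assert(loads_agree(buf + 1));
}
```

Calling check_example() from main is enough to exercise both paths. The
loaded values are identical, so the open question in this thread is only
whether the compiler may substitute one instruction for the other.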
Uros.
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> If AMD also likes this optimization, OK for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/sse.md (<sse3>_lddqu<avxsizesuffix>): Change to
> define_expand, expand as simple move when TARGET_AVX
> && (<MODE_SIZE> == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD).
> The original define_insn is renamed to ...
> (*<sse3>_lddqu<avxsizesuffix>): ... this.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/vlddqu_vinserti128.c: New test.
> ---
> gcc/config/i386/sse.md | 15 ++++++++++++++-
> .../gcc.target/i386/vlddqu_vinserti128.c | 11 +++++++++++
> 2 files changed, 25 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 2d81347c7b6..d571a78f4c4 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1835,7 +1835,20 @@ (define_peephole2
> [(set (match_dup 4) (match_dup 1))]
> "operands[4] = adjust_address (operands[0], V2DFmode, 0);")
>
> -(define_insn "<sse3>_lddqu<avxsizesuffix>"
> +(define_expand "<sse3>_lddqu<avxsizesuffix>"
> + [(set (match_operand:VI1 0 "register_operand")
> + (unspec:VI1 [(match_operand:VI1 1 "memory_operand")]
> + UNSPEC_LDDQU))]
> + "TARGET_SSE3"
> +{
> + if (TARGET_AVX && (<MODE_SIZE> == 16 || !TARGET_AVX256_SPLIT_UNALIGNED_LOAD))
> + {
> + emit_move_insn (operands[0], operands[1]);
> + DONE;
> + }
> +})
> +
> +(define_insn "*<sse3>_lddqu<avxsizesuffix>"
> [(set (match_operand:VI1 0 "register_operand" "=x")
> (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
> UNSPEC_LDDQU))]
> diff --git a/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c b/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
> new file mode 100644
> index 00000000000..29699a5fa7f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/vlddqu_vinserti128.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx2 -O2" } */
> +/* { dg-final { scan-assembler-times "vbroadcasti128" 1 } } */
> +/* { dg-final { scan-assembler-not {(?n)vlddqu.*xmm} } } */
> +
> +#include <immintrin.h>
> +__m256i foo(void *data) {
> + __m128i X1 = _mm_lddqu_si128((__m128i*)data);
> + __m256i V1 = _mm256_broadcastsi128_si256 (X1);
> + return V1;
> +}
> --
> 2.39.1.388.g2fc9e9ca3c
>
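
For readers who do not work with machine descriptions, the condition in
the proposed define_expand can be modelled in plain C. This is only a
sketch: struct target_flags, lddqu_becomes_plain_move and check_model
are hypothetical names, and the two booleans merely stand in for the
TARGET_AVX and TARGET_AVX256_SPLIT_UNALIGNED_LOAD macros used in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC target flags named in the patch. */
struct target_flags {
    bool avx;                         /* models TARGET_AVX */
    bool avx256_split_unaligned_load; /* models TARGET_AVX256_SPLIT_UNALIGNED_LOAD */
};

/* Returns true when the proposed expander would emit a plain move
   instead of keeping the UNSPEC_LDDQU pattern. mode_size is the vector
   size in bytes (16 or 32), modelling <MODE_SIZE>. */
static bool lddqu_becomes_plain_move(struct target_flags t, int mode_size)
{
    return t.avx && (mode_size == 16 || !t.avx256_split_unaligned_load);
}

/* Walks through the four interesting combinations. */
static void check_model(void)
{
    struct target_flags sse3_only = { false, false };
    struct target_flags avx_split = { true, true };
    struct target_flags avx_nosplit = { true, false };

    assert(!lddqu_becomes_plain_move(sse3_only, 16));  /* no AVX: keep lddqu */
    assert(lddqu_becomes_plain_move(avx_split, 16));   /* 128-bit: plain move */
    assert(!lddqu_becomes_plain_move(avx_split, 32));  /* 256-bit, split tuning: keep */
    assert(lddqu_becomes_plain_move(avx_nosplit, 32)); /* 256-bit, no split: plain move */
}
```

Under this model, a 128-bit lddqu request becomes a plain move whenever
AVX is enabled, while a 256-bit one is kept only on targets tuned to
split unaligned 256-bit loads.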