public inbox for gcc-patches@gcc.gnu.org
From: Richard Biener <richard.guenther@gmail.com>
To: Hongtao Liu <crazylht@gmail.com>,
	Kirill Yukhin <kirill.yukhin@gmail.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>, "H. J. Lu" <hjl.tools@gmail.com>
Subject: Re: [PATCH] [PR target/97194] [AVX2] Support variable index vec_set.
Date: Tue, 20 Oct 2020 09:36:27 +0200
Message-ID: <CAFiYyc1mNhhmSs3wEkbpd1pJg8VTn0K3U0+P5dKsm_dOqMXB3g@mail.gmail.com>
In-Reply-To: <CAMZc-bx17te3q+KvWsX_-zKfAcZFyYWLW541nV-Qs-Vg_kfYag@mail.gmail.com>

On Tue, Oct 20, 2020 at 4:35 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Oct 19, 2020 at 5:55 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> >
> > On Mon, Oct 19, 2020 at 11:37 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > >
> > > On Mon, Oct 19, 2020 at 5:07 PM Richard Biener
> > > <richard.guenther@gmail.com> wrote:
> > > >
> > > > On Mon, Oct 19, 2020 at 10:21 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > >
> > > > > Hi:
> > > > >   It's implemented as below:
> > > > > V setg (V v, int idx, T val)
> > > > >
> > > > > {
> > > > >   V idxv = (V){idx, idx, idx, idx, idx, idx, idx, idx};
> > > > >   V valv = (V){val, val, val, val, val, val, val, val};
> > > > >   V mask = ((V){0, 1, 2, 3, 4, 5, 6, 7} == idxv);
> > > > >   v = (v & ~mask) | (valv & mask);
> > > > >   return v;
> > > > > }
> > > > >
> > > > > Bootstrap is fine, regression test for i386/x86-64 backend is ok.
> > > > > Ok for trunk?
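
A minimal compilable sketch of the setg pseudocode quoted above, written
with the GCC vector extensions. The element type (int32_t), the lane count
(8) and the typedef name v8si are assumptions chosen for illustration; the
patch itself works on RTL inside the i386 backend and is not reproduced here.

```c
#include <stdint.h>

/* Eight 32-bit lanes in a 256-bit vector (GCC/Clang vector extension).
   The name v8si and the element type are assumptions for this sketch.  */
typedef int32_t v8si __attribute__ ((vector_size (32)));

/* Return a copy of V with lane IDX replaced by VAL.  */
v8si
setg (v8si v, int idx, int32_t val)
{
  /* Broadcast the index and the value to every lane.  */
  v8si idxv = {idx, idx, idx, idx, idx, idx, idx, idx};
  v8si valv = {val, val, val, val, val, val, val, val};

  /* A vector comparison yields all-ones (-1) in lanes that match
     and 0 in all other lanes.  */
  v8si mask = ((v8si){0, 1, 2, 3, 4, 5, 6, 7} == idxv);

  /* Keep V where the mask is clear, take VAL where it is set.  */
  return (v & ~mask) | (valv & mask);
}
```

An index outside 0..7 matches no lane, so the vector comes back unchanged.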
> > > >
> > > > Hmm, I guess you're trying to keep the code for !AVX512BW simple
> > > > but isn't just splitting the compare into
> > > >
> > > >  clow = {0, 1, 2, 3 ... } == idxv
> > > >  chigh = {16, 17, 18, ... } == idxv;
> > > >  cmp = {clow, chigh}
> > > >
> > >
> > > We also don't have 512-bit byte/word blend instructions without
> > > TARGET_AVX512BW, so how would we use a 512-bit compare?
> >
> > Oh, I see.  Guess two back-to-back vpternlog could emulate
>
> Yes, we could use something like vpternlogd %zmm0, %zmm1, %zmm2, 0xD8,
> but since we don't have a 512-bit byte/word broadcast instruction,
> it would need 2 broadcasts and 1 vec_concat to build one 512-bit vector.
> It wouldn't save many instructions compared to my version (below).
>
> ---
>         leal    -16(%rsi), %eax
>         vmovd   %edi, %xmm2
>         vmovdqa .LC0(%rip), %ymm4
>         vextracti64x4   $0x1, %zmm0, %ymm3
>         vmovd   %eax, %xmm1
>         vpbroadcastw    %xmm2, %ymm2
>         vpbroadcastw    %xmm1, %ymm1
>         vpcmpeqw        %ymm4, %ymm1, %ymm1
>         vpblendvb       %ymm1, %ymm2, %ymm3, %ymm3
>         vmovd   %esi, %xmm1
>         vpbroadcastw    %xmm1, %ymm1
>         vpcmpeqw        %ymm4, %ymm1, %ymm1
>         vpblendvb       %ymm1, %ymm2, %ymm0, %ymm0
>         vinserti64x4    $0x1, %ymm3, %zmm0, %zmm0
> ---
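
The listing above handles a 32-lane word vector as two 16-lane halves,
comparing the low half against idx and the high half against idx - 16.
A rough C rendering of that idea with the GCC vector extensions follows.
The names v16hi, v32hi, set_half and setg_v32hi are made up for this
sketch, and it models only the data flow, not the instruction selection.

```c
#include <stdint.h>
#include <string.h>

/* 16 and 32 lanes of 16-bit elements (GCC/Clang vector extension).
   Both typedef names are assumptions for this sketch.  */
typedef int16_t v16hi __attribute__ ((vector_size (32)));
typedef int16_t v32hi __attribute__ ((vector_size (64)));

/* Replace lane IDX of a 16-lane half with VAL; an IDX outside
   0..15 matches no lane and leaves the half unchanged.  */
static v16hi
set_half (v16hi half, int idx, int16_t val)
{
  const v16hi lanes = {0, 1, 2, 3, 4, 5, 6, 7,
                       8, 9, 10, 11, 12, 13, 14, 15};
  v16hi idxv, valv;

  /* Broadcast the index and the value to every lane.  */
  for (int i = 0; i < 16; i++)
    {
      idxv[i] = (int16_t) idx;
      valv[i] = val;
    }

  /* All-ones in the matching lane, zero elsewhere.  */
  v16hi mask = (lanes == idxv);
  return (half & ~mask) | (valv & mask);
}

/* Set lane IDX (0..31) of a 32-lane vector by working on each half.  */
v32hi
setg_v32hi (v32hi v, int idx, int16_t val)
{
  v16hi lo, hi;

  /* Split into halves (the extract step in the listing).  */
  memcpy (&lo, &v, sizeof lo);
  memcpy (&hi, (char *) &v + sizeof lo, sizeof hi);

  lo = set_half (lo, idx, val);       /* lanes 0..15 */
  hi = set_half (hi, idx - 16, val);  /* lanes 16..31 */

  /* Concatenate the halves again (the insert step in the listing).  */
  memcpy (&v, &lo, sizeof lo);
  memcpy ((char *) &v + sizeof lo, &hi, sizeof hi);
  return v;
}
```

Only one of the two set_half calls can match for a given idx, because
the other one sees an index outside 0..15.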
>
> > the blend?  Not sure if important - I recall only knl didn't have bw?
> >
>
> Yes, from SKX onward, all AVX512 targets support AVX512BW.
> And I don't think performance for V32HI/V64QI without AVX512BW is important.

True.

I have no further comments on the patch then - it still needs i386 maintainer
approval though.

Thanks,
Richard.

>
> > > Cut from i386-expand.c,
> > > in ix86_expand_sse_movcc:
> > >     case E_V64QImode:
> > >       gen = gen_avx512bw_blendmv64qi;  /* TARGET_AVX512BW needed */
> > >       break;
> > >     case E_V32HImode:
> > >       gen = gen_avx512bw_blendmv32hi;  /* TARGET_AVX512BW needed */
> > >       break;
> > >     case E_V16SImode:
> > >       gen = gen_avx512f_blendmv16si;
> > >       break;
> > >     case E_V8DImode:
> > >       gen = gen_avx512f_blendmv8di;
> > >       break;
> > >     case E_V8DFmode:
> > >
> > > > faster, smaller and eventually even easier during expansion?
> > > >
> > > > +  gcc_assert (ix86_expand_vector_init_duplicate (false, mode, valv, val));
> > > > +  gcc_assert (ix86_expand_vector_init_duplicate (false, cmp_mode,
> > > > +                                                 idxv, idx_tmp));
> > > >
> > > > Side effects in gcc_assert are considered bad style; use
> > > >
> > > >   ok = ix86_expand_vector_init_duplicate (false, mode, valv, val);
> > > >   gcc_assert (ok);
> > > >
> > > > +  vec[5] = constv;
> > > > +  ix86_expand_int_vcond (vec);
> > > >
> > > > This also returns a bool, which you should probably assert is true.
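
The reason for pulling the call out of the assertion can be shown with the
standard C assert, used here only as an analogy for gcc_assert: when
assertions are compiled out (NDEBUG), the argument expression is not
evaluated, so any work done inside it silently disappears.
expand_something is a hypothetical stand-in, not a GCC function.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the "expander" actually ran.  */
static int calls;

/* Hypothetical stand-in for an expander that does real work
   and reports success through its return value.  */
static bool
expand_something (void)
{
  calls++;
  return true;
}

/* Fragile: with NDEBUG defined, assert expands to nothing and the
   call to expand_something is dropped together with the check.  */
void
bad_style (void)
{
  assert (expand_something ());
}

/* Robust: the call always happens; only the check is optional.  */
void
good_style (void)
{
  bool ok = expand_something ();
  assert (ok);
  (void) ok;  /* Avoid an unused-variable warning under NDEBUG.  */
}
```

The same split applies to any call whose return value is only checked,
such as the ix86_expand_int_vcond call mentioned above.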
> > > >
> > >
> > > Yes, will change.
> > >
> > > > Otherwise thanks for tackling this.
> > > >
> > > > Richard.
> > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR target/97194
> > > > >         * config/i386/i386-expand.c (ix86_expand_vector_set_var): New function.
> > > > >         * config/i386/i386-protos.h (ix86_expand_vector_set_var): New Decl.
> > > > >         * config/i386/predicates.md (vec_setm_operand): New predicate,
> > > > >         true for const_int_operand or register_operand under TARGET_AVX2.
> > > > >         * config/i386/sse.md (vec_set<mode>): Support both constant
> > > > >         and variable index vec_set.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         * gcc.target/i386/avx2-vec-set-1.c: New test.
> > > > >         * gcc.target/i386/avx2-vec-set-2.c: New test.
> > > > >         * gcc.target/i386/avx512bw-vec-set-1.c: New test.
> > > > >         * gcc.target/i386/avx512bw-vec-set-2.c: New test.
> > > > >         * gcc.target/i386/avx512f-vec-set-2.c: New test.
> > > > >         * gcc.target/i386/avx512vl-vec-set-2.c: New test.
> > > > >
> > > > > --
> > > > > BR,
> > > > > Hongtao