public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Hongtao Liu <crazylht@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Kirill Yukhin <kirill.yukhin@gmail.com>,
	 Hongtao Liu <hongtao.liu@intel.com>
Subject: Re: [PATCH] x86: make better use of VBROADCASTSS / VPBROADCASTD
Date: Wed, 14 Jun 2023 15:41:55 +0800	[thread overview]
Message-ID: <CAMZc-bzx=SGWDOXnn9MMawjJ4XAjkUMmsq_c09OcHZW8heqRbQ@mail.gmail.com> (raw)
In-Reply-To: <48be2ae1-66d7-f87f-5997-b5307bd25fbc@suse.com>

On Wed, Jun 14, 2023 at 1:58 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> ... in vec_dupv4sf / *vec_dupv4si. The respective broadcast insns are
> never longer (yet sometimes shorter) than the corresponding VSHUFPS /
> VPSHUFD, due to the immediate operand of the shuffle insns balancing the
> need for VEX3 in the broadcast ones. When EVEX encoding is required the
> broadcast insns are always shorter.
>
> Add two new alternatives each, one covering the AVX2 case and one
> covering AVX512.
I think you can just change assemble output for this first alternative
when TARGET_AVX2, use vbroadcastss, else use vshufps since
vbroadcastss only accept register operand when TARGET_AVX2. And no
need to support 2 extra alternatives which doesn't make sense just
make RA more confused about the same meaning of different
alternatives.
>
> gcc/
>
>         * config/i386/sse.md (vec_dupv4sf): New AVX2 and AVX512F
>         alternatives using vbroadcastss.
>         (*vec_dupv4si): New AVX2 and AVX512F alternatives using
>         vpbroadcastd.
> ---
> I'm working from the assumption that the isa attributes to the original
> 1st and 2nd alternatives don't need further restricting (to sse2_noavx2
> or avx_noavx2 as applicable), as the new earlier alternatives cover all
> operand forms already when at least AVX2 is enabled.
>
> Isn't prefix_extra use bogus here? What extra prefix does vbroadcastss
> use? (Same further down in *vec_dupv4si and avx2_vbroadcasti128_<mode>
> and elsewhere.)
Not sure about this part. I grep prefix_extra, seems only used by
znver.md/znver4.md for schedule, and only for comi instructions(?the
reservation name seems so).
>
> Is use of Yv for the source operand really necessary in *vec_dupv4si?
> I.e. would scalar integer values be put in XMM{16...31} when AVX512VL
Yes, You can look at ix86_hard_regno_mode_ok, EXT_REX_SSE_REGNO is
allowed for scalar mode, but not for 128/256-bit vector modes.

20204      if (TARGET_AVX512F
20205          && (VALID_AVX512F_REG_OR_XI_MODE (mode)
20206              || VALID_AVX512F_SCALAR_MODE (mode)))
20207        return true;


> isn't enabled? If so (*movsi_internal / *movdi_internal suggest they
> might), wouldn't *vec_dupv2di need to use Yv as well in its 3rd
> alternative (or just m, as Yv is already covered by the 2nd one)?
I guess xm is more suitable since we still want to allocate
operands[1] to register when sse3_noavx.
It didn't hit any error since for avx and above, alternative 1(2rd
one) is always matched than alternative 2.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25798,38 +25798,42 @@
>         (const_int 1)))])
>
>  (define_insn "vec_dupv4sf"
> -  [(set (match_operand:V4SF 0 "register_operand" "=v,v,x")
> +  [(set (match_operand:V4SF 0 "register_operand" "=Yv,v,v,v,x")
>         (vec_duplicate:V4SF
> -         (match_operand:SF 1 "nonimmediate_operand" "Yv,m,0")))]
> +         (match_operand:SF 1 "nonimmediate_operand" "v,vm,Yv,m,0")))]
>    "TARGET_SSE"
>    "@
> +   vbroadcastss\t{%1, %0|%0, %1}
> +   vbroadcastss\t{%1, %g0|%g0, %1}
>     vshufps\t{$0, %1, %1, %0|%0, %1, %1, 0}
>     vbroadcastss\t{%1, %0|%0, %1}
>     shufps\t{$0, %0, %0|%0, %0, 0}"
> -  [(set_attr "isa" "avx,avx,noavx")
> -   (set_attr "type" "sseshuf1,ssemov,sseshuf1")
> -   (set_attr "length_immediate" "1,0,1")
> -   (set_attr "prefix_extra" "0,1,*")
> -   (set_attr "prefix" "maybe_evex,maybe_evex,orig")
> -   (set_attr "mode" "V4SF")])
> +  [(set_attr "isa" "avx2,avx512f,avx,avx,noavx")
> +   (set_attr "type" "ssemov,ssemov,sseshuf1,ssemov,sseshuf1")
> +   (set_attr "length_immediate" "0,0,1,0,1")
> +   (set_attr "prefix_extra" "*,*,0,1,*")
> +   (set_attr "prefix" "maybe_evex,evex,maybe_evex,maybe_evex,orig")
> +   (set_attr "mode" "V4SF,V16SF,V4SF,V4SF,V4SF")])
>
>  (define_insn "*vec_dupv4si"
> -  [(set (match_operand:V4SI 0 "register_operand"     "=v,v,x")
> +  [(set (match_operand:V4SI 0 "register_operand"     "=Yv,v,v,v,x")
>         (vec_duplicate:V4SI
> -         (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
> +         (match_operand:SI 1 "nonimmediate_operand" "vm,vm,Yv,m,0")))]
>    "TARGET_SSE"
>    "@
> +   vpbroadcastd\t{%1, %0|%0, %1}
> +   vpbroadcastd\t{%1, %g0|%g0, %1}
>     %vpshufd\t{$0, %1, %0|%0, %1, 0}
>     vbroadcastss\t{%1, %0|%0, %1}
>     shufps\t{$0, %0, %0|%0, %0, 0}"
> -  [(set_attr "isa" "sse2,avx,noavx")
> -   (set_attr "type" "sselog1,ssemov,sselog1")
> -   (set_attr "length_immediate" "1,0,1")
> -   (set_attr "prefix_extra" "0,1,*")
> -   (set_attr "prefix" "maybe_vex,maybe_evex,orig")
> -   (set_attr "mode" "TI,V4SF,V4SF")
> +  [(set_attr "isa" "avx2,avx512f,sse2,avx,noavx")
> +   (set_attr "type" "ssemov,ssemov,sselog1,ssemov,sselog1")
> +   (set_attr "length_immediate" "0,0,1,0,1")
> +   (set_attr "prefix_extra" "*,*,0,1,*")
> +   (set_attr "prefix" "maybe_evex,evex,maybe_vex,maybe_evex,orig")
> +   (set_attr "mode" "TI,XI,TI,V4SF,V4SF")
>     (set (attr "preferred_for_speed")
> -     (cond [(eq_attr "alternative" "1")
> +     (cond [(eq_attr "alternative" "3")
>               (symbol_ref "!TARGET_INTER_UNIT_MOVES_TO_VEC")
>            ]
>            (symbol_ref "true")))])



-- 
BR,
Hongtao

  reply	other threads:[~2023-06-14  7:42 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-14  5:57 Jan Beulich
2023-06-14  7:41 ` Hongtao Liu [this message]
2023-06-14  9:03   ` Jan Beulich
2023-06-15  5:23     ` Hongtao Liu
2023-06-15  5:35       ` Hongtao Liu
2023-06-15  6:40       ` Jan Beulich
2023-06-15  7:36         ` Hongtao Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZc-bzx=SGWDOXnn9MMawjJ4XAjkUMmsq_c09OcHZW8heqRbQ@mail.gmail.com' \
    --to=crazylht@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hongtao.liu@intel.com \
    --cc=jbeulich@suse.com \
    --cc=kirill.yukhin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).