public inbox for gcc-patches@gcc.gnu.org
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'Hongtao Liu'" <crazylht@gmail.com>
Cc: <gcc-patches@gcc.gnu.org>,
	"'Hongtao Liu'" <hongtao.liu@intel.com>,
	"'Uros Bizjak'" <ubizjak@gmail.com>
Subject: Re: [x86 PATCH] Improve V[48]QI shifts on AVX512
Date: Fri, 10 May 2024 08:41:05 +0100
Message-ID: <004701daa2ad$6b78c610$426a5230$@nextmovesoftware.com>
In-Reply-To: <CAMZc-byCzfn-Tbex05F8KBx7udCky4aVCL1eF2suYzNDw0rHsg@mail.gmail.com>


Many thanks for the speedy review and correction/improvement.
It's interesting that you spotted the ternlog "spill"...
I have a patch that rewrites ternlog handling, which has been
waiting for stage 1 and would also fix this memory operand
issue.  I hope to submit it for review this weekend.

Thanks again,
Roger

> From: Hongtao Liu <crazylht@gmail.com>
> On Fri, May 10, 2024 at 6:26 AM Roger Sayle <roger@nextmovesoftware.com>
> wrote:
> >
> >
> > The following one-line patch improves the code generated for V8QI and
> > V4QI shifts when AVX512BW and AVX512VL functionality is available.
> +      /* With AVX512 it's cheaper to do vpmovsxbw/op/vpmovwb.  */
> +      && !(TARGET_AVX512BW && TARGET_AVX512VL && TARGET_SSE4_1)
>        && ix86_expand_vec_shift_qihi_constant (code, qdest, qop1, qop2))
> I think TARGET_SSE4_1 is enough; with SSE4.1 and above it's always better
> not to go into ix86_expand_vec_shift_qihi_constant.
> The rest LGTM.
> >
> > For the testcase (from gcc.target/i386/vect-shiftv8qi.c):
> >
> > typedef signed char v8qi __attribute__ ((__vector_size__ (8)));
> > v8qi foo (v8qi x) { return x >> 5; }
> >
> > GCC with -O2 -march=cascadelake currently generates:
> >
> > foo:    movl    $67372036, %eax
> >         vpsraw  $5, %xmm0, %xmm2
> >         vpbroadcastd    %eax, %xmm1
> >         movl    $117901063, %eax
> >         vpbroadcastd    %eax, %xmm3
> >         vmovdqa %xmm1, %xmm0
> >         vmovdqa %xmm3, -24(%rsp)
> >         vpternlogd      $120, -24(%rsp), %xmm2, %xmm0
> This looks like a missed optimization under AVX512, but it's a separate issue.
> >         vpsubb  %xmm1, %xmm0, %xmm0
> >         ret
> >
> > With this patch we now generate the much improved:
> >
> > foo:    vpmovsxbw       %xmm0, %xmm0
> >         vpsraw  $5, %xmm0, %xmm0
> >         vpmovwb %xmm0, %xmm0
> >         ret
> >
> > This patch also fixes the FAILs of gcc.target/i386/vect-shiftv[48]qi.c
> > when run with the additional -march=cascadelake flag, by splitting
> > these tests into two: one form testing code generation with -msse2
> > (and -mno-avx512vl) as originally intended, and the other testing
> > AVX512 code generation with an explicit -march=cascadelake.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> >
> > 2024-05-09  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial):
> >         Don't attempt ix86_expand_vec_shift_qihi_constant on AVX512.
> >
> > gcc/testsuite/ChangeLog
> >         * gcc.target/i386/vect-shiftv4qi.c: Specify -mno-avx512vl.
> >         * gcc.target/i386/vect-shiftv8qi.c: Likewise.
> >         * gcc.target/i386/vect-shiftv4qi-2.c: New test case.
> >         * gcc.target/i386/vect-shiftv8qi-2.c: Likewise.
> >
> >
> > Thanks in advance,
> > Roger
> > --
> >
> --
> BR,
> Hongtao
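
For readers comparing the two assembly listings quoted above, the following is a minimal scalar sketch in C of what each sequence computes on a single byte lane. It is not taken from the patch: the function names (shift_widen, shift_mask, check_all_bytes) are hypothetical, and the model assumes that >> on a negative signed integer is an arithmetic shift, which is how GCC defines it. The constants come from the listing: 117901063 is 0x07070707 (the mask) and 67372036 is 0x04040404 (the sign bit of the 3-bit result), and vpternlogd with immediate 120 (0x78) evaluates A ^ (B & C).

```c
#include <assert.h>
#include <stdint.h>

/* Model of vpmovsxbw / vpsraw $5 / vpmovwb on one byte lane.  */
static int8_t shift_widen (int8_t b)
{
  int16_t w = (int16_t) b;   /* vpmovsxbw: sign-extend byte to word.  */
  w = (int16_t) (w >> 5);    /* vpsraw $5: arithmetic right shift (assumed).  */
  return (int8_t) w;         /* vpmovwb: truncate word back to byte.  */
}

/* Model of the vpsraw / vpternlogd / vpsubb sequence on one byte lane.  */
static int8_t shift_mask (int8_t b)
{
  /* The 0x07 mask keeps only the bits of a logical shift by 5.  */
  unsigned logical = ((unsigned) (uint8_t) b >> 5) & 0x07u;
  /* XOR with 0x04, then subtract 0x04: sign-extends from bit 2.  */
  return (int8_t) ((int) (logical ^ 0x04u) - 0x04);
}

/* Compare both models over every byte value; returns the number checked.  */
static int check_all_bytes (void)
{
  int checked = 0;
  for (int i = -128; i <= 127; i++)
    {
      assert (shift_widen ((int8_t) i) == shift_mask ((int8_t) i));
      checked++;
    }
  return checked;
}
```

Calling check_all_bytes () from a test driver exercises all 256 byte values. The sketch only illustrates why the two sequences agree; it says nothing about their relative cost, which is the point of the patch.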

