Re: [PATCH][PR target/97642] Fix incorrect replacement of vmovdqu32 with vpblendd.

public inbox for gcc-patches@gcc.gnu.org

From: Hongtao Liu <crazylht@gmail.com>
To: Jeff Law <law@redhat.com>
Cc: Kirill Yukhin <kirill.yukhin@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCH][PR target/97642] Fix incorrect replacement of vmovdqu32 with vpblendd.
Date: Tue, 24 Nov 2020 10:36:49 +0800
Message-ID: <CAMZc-bxTJjd0Mp+zxg4GuTW0wYqeuhMNoKWcFZ1kxSjatfxGBQ@mail.gmail.com>
In-Reply-To: <8643c8d6-5e88-0bf0-b174-cc56e4bcb024@redhat.com>

On Tue, Nov 24, 2020 at 4:27 AM Jeff Law <law@redhat.com> wrote:
>
>
>
> On 11/4/20 2:19 AM, Hongtao Liu via Gcc-patches wrote:
> > Hi:
> >   When programmers explicitly use masked-load intrinsics, don't
> > transform the instruction to vpblend{b,w,d,q}: if mem_addr points
> > to a memory region with less than a whole vector of accessible
> > memory, the mask prevents reading the inaccessible bytes and so
> > avoids a fault, whereas the blend would read the full vector.
> >
> >   Bootstrap is ok, gcc regress test for i386/x86_64 backend is ok.
> >   Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >         PR target/97642
> >         * config/i386/sse.md (UNSPEC_MASKLOAD): New unspec.
> >         (*<avx512>_load<mode>_mask): New define_insns for masked load
> >         instructions.
> >         (<avx512>_load<mode>_mask): Changed to define_expands which
> >         specifically handle memory operands.
> >         (<avx512>_blendm<mode>): Changed to define_insns which are same
> >         as original <avx512>_load<mode>_mask with adjustment of
> >         operands order.
> >         (*<avx512>_load<mode>): New define_insn_and_split which is
> >         used to optimize for masked load with all one mask.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/i386/avx512bw-vmovdqu16-1.c: Adjust testcase to
> >         make sure only masked load instruction is generated.
> >         * gcc.target/i386/avx512bw-vmovdqu8-1.c: Ditto.
> >         * gcc.target/i386/avx512f-vmovapd-1.c: Ditto.
> >         * gcc.target/i386/avx512f-vmovaps-1.c: Ditto.
> >         * gcc.target/i386/avx512f-vmovdqa32-1.c: Ditto.
> >         * gcc.target/i386/avx512f-vmovdqa64-1.c: Ditto.
> >         * gcc.target/i386/avx512vl-vmovapd-1.c: Ditto.
> >         * gcc.target/i386/avx512vl-vmovaps-1.c: Ditto.
> >         * gcc.target/i386/avx512vl-vmovdqa32-1.c: Ditto.
> >         * gcc.target/i386/avx512vl-vmovdqa64-1.c: Ditto.
> >         * gcc.target/i386/pr97642-1.c: New test.
> >         * gcc.target/i386/pr97642-2.c: New test.
> >
> >
> > 0001-Fix-incorrect-replacement-of-vmovdqu32-with-vpblendd.patch
> >
> > From 48cf0adcd55395653891888f4768b8bdc19786f2 Mon Sep 17 00:00:00 2001
> > From: liuhongt <hongtao.liu@intel.com>
> > Date: Tue, 3 Nov 2020 17:26:43 +0800
> > Subject: [PATCH] Fix incorrect replacement of vmovdqu32 with vpblendd which
> >  can cause fault.
> >
> > gcc/ChangeLog:
> >
> >       PR target/97642
> >       * config/i386/sse.md (UNSPEC_MASKLOAD): New unspec.
> >       (*<avx512>_load<mode>_mask): New define_insns for masked load
> >       instructions.
> >       (<avx512>_load<mode>_mask): Changed to define_expands which
> >       specifically handle memory operands.
> >       (<avx512>_blendm<mode>): Changed to define_insns which are same
> >       as original <avx512>_load<mode>_mask with adjustment of
> >       operands order.
> >       (*<avx512>_load<mode>): New define_insn_and_split which is
> >       used to optimize for masked load with all one mask.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/i386/avx512bw-vmovdqu16-1.c: Adjust testcase to
> >       make sure only masked load instruction is generated.
> >       * gcc.target/i386/avx512bw-vmovdqu8-1.c: Ditto.
> >       * gcc.target/i386/avx512f-vmovapd-1.c: Ditto.
> >       * gcc.target/i386/avx512f-vmovaps-1.c: Ditto.
> >       * gcc.target/i386/avx512f-vmovdqa32-1.c: Ditto.
> >       * gcc.target/i386/avx512f-vmovdqa64-1.c: Ditto.
> >       * gcc.target/i386/avx512vl-vmovapd-1.c: Ditto.
> >       * gcc.target/i386/avx512vl-vmovaps-1.c: Ditto.
> >       * gcc.target/i386/avx512vl-vmovdqa32-1.c: Ditto.
> >       * gcc.target/i386/avx512vl-vmovdqa64-1.c: Ditto.
> >       * gcc.target/i386/pr97642-1.c: New test.
> >       * gcc.target/i386/pr97642-2.c: New test.
> So in the BZ Jakub asked for the all-ones mask case to be specially
> handled to emit a normal load.  I don't see where we're handling that.
> ISTM that we'd want a test for that too.  Right?
>

An all-ones mask is simplified to a plain load, but with the unspec
still in SET_SRC. That form is handled by the following pattern:

+(define_insn_and_split "*<avx512>_load<mode>"
+  [(set (match_operand:V48_AVX512VL 0 "register_operand")
+ (unspec:V48_AVX512VL
+   [(match_operand:V48_AVX512VL 1 "memory_operand")]
+   UNSPEC_MASKLOAD))]
+  "TARGET_AVX512F"
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (match_dup 1))])
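
To illustrate why dropping the unspec is safe only in the all-ones
case, here is a rough scalar model (not part of the patch). The names
model_mask_load, model_load_then_blend and LANES are made up for this
sketch. Each function returns a bitmask of the memory lanes it reads.
A masked load reads only the selected lanes, while load-then-blend
reads every lane, so the two touch the same memory only when the mask
is all ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4   /* assumption: four 32-bit lanes, as in a 128-bit vector */

/* Hypothetical model of a masked load: lane i of MEM is read only when
   bit i of MASK is set; the other lanes are copied from SRC.
   Returns the set of memory lanes that were read.  */
static unsigned
model_mask_load (int32_t dst[LANES], const int32_t src[LANES],
                 unsigned mask, const int32_t *mem)
{
  unsigned touched = 0;
  assert (dst != NULL && src != NULL && mem != NULL);
  for (int i = 0; i < LANES; i++)
    {
      if (mask & (1u << i))
        {
          dst[i] = mem[i];
          touched |= 1u << i;
        }
      else
        dst[i] = src[i];
    }
  return touched;
}

/* Hypothetical model of a full load followed by a blend: every lane of
   MEM is read first, then MASK selects between the loaded value and SRC.
   Returns the set of memory lanes that were read (always all of them).  */
static unsigned
model_load_then_blend (int32_t dst[LANES], const int32_t src[LANES],
                       unsigned mask, const int32_t *mem)
{
  int32_t tmp[LANES];
  unsigned touched = 0;
  assert (dst != NULL && src != NULL && mem != NULL);
  for (int i = 0; i < LANES; i++)
    {
      tmp[i] = mem[i];
      touched |= 1u << i;
    }
  for (int i = 0; i < LANES; i++)
    dst[i] = (mask & (1u << i)) ? tmp[i] : src[i];
  return touched;
}
```

With mask 0x3 the first model reads lanes 0 and 1 only, while the
second reads all four. With mask 0xf both read all four lanes, which is
the case the split above turns into a plain move.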

The corresponding testcase for the all-ones case is:

new file   gcc/testsuite/gcc.target/i386/pr97642-1.c
@@ -0,0 +1,23 @@
+/* PR target/97642 */
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -O2" } */
+/* { dg-final { scan-assembler-not { k[0-8] } } } */
+
+#include <immintrin.h>
+__m128i
+foo1 (__m128i src, void const* P)
+{
+  return _mm_mask_loadu_epi32 (src, 15, P);
+}
+
+__m256i
+foo2 (__m256i src, void const* P)
+{
+  return _mm256_mask_loadu_epi32 (src, 255, P);
+}
+
+__m512i
+foo3 (__m512i src, void const* P)
+{
+  return _mm512_mask_loadu_epi32 (src, 65535 , P);
+}
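
For reference, the constants 15, 255 and 65535 in the testcase are the
all-ones masks for 4, 8 and 16 lanes of 32 bits. A small sketch of that
arithmetic follows. The helper all_ones_mask is hypothetical and exists
only for this illustration; it is not something in GCC or in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the all-ones mask for a vector of VECTOR_BITS
   bits split into ELEM_BITS-bit lanes, i.e. one mask bit per lane.
   Assumes at most 32 lanes.  */
static uint32_t
all_ones_mask (unsigned vector_bits, unsigned elem_bits)
{
  assert (elem_bits != 0 && vector_bits % elem_bits == 0);
  unsigned lanes = vector_bits / elem_bits;
  assert (lanes >= 1 && lanes <= 32);
  /* Shifting a 32-bit value by 32 is undefined, so handle it apart.  */
  return lanes == 32 ? UINT32_MAX : (UINT32_C (1) << lanes) - 1;
}
```

So all_ones_mask (128, 32), all_ones_mask (256, 32) and
all_ones_mask (512, 32) give the masks used in foo1, foo2 and foo3.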


> With that in place and tested, this is probably ready for the trunk.
>
> jeff
>
>


-- 
BR,
Hongtao
