From: Hongtao Liu
Date: Tue, 27 Jun 2023 15:46:22 +0800
Subject: Re: [PATCH] [x86] Refine maskstore patterns with UNSPEC_MASKMOV.
To: Richard Biener
Cc: liuhongt, gcc-patches@gcc.gnu.org
References: <20230627053806.2880955-1-hongtao.liu@intel.com>

On Tue, Jun 27, 2023 at 3:28 PM Hongtao Liu wrote:
>
> On Tue, Jun 27, 2023 at 3:20 PM Richard Biener via Gcc-patches
> wrote:
> >
> > On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
> > >
> > > At the rtl level, we cannot guarantee that the maskstore is not optimized
> > > to other full-memory accesses, as the current implementations are equivalent
> > > in terms of pattern, to solve this potential problem, this patch refines
> > > the pattern of the maskstore and the intrinsics with unspec.
> > >
> > > One thing I'm not sure is VCOND_EXPR, should VCOND_EXPR also expect
> > > fault suppression for masked-out elements?
> >
> > You mean the vcond and vcond_eq optabs?  No, those do not expect
> > fault suppression.
> Yes, vcond/vcond_eq, thanks for clarifying.
> >
> > >
> > > Currently we're still using vec_merge for both AVX2 and AVX512 target.
> > >
> > > ------------------------
> > > Similar like r14-2070-gc79476da46728e
> > >
> > > If mem_addr points to a memory region with less than whole vector size
> > > bytes of accessible memory and k is a mask that would prevent reading
> > > the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent
> > > it to be transformed to any other whole memory access instructions.
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > Ready to push to trunk.
I'm going to backport this patch and the maskload one [1] to GCC11/GCC12/GCC13.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622410.html
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR rtl-optimization/110237
> > >         * config/i386/sse.md (_store_mask): Refine with
> > >         UNSPEC_MASKMOV.
> > >         (maskstore > > (*_store_mask): New define_insn, it's renamed
> > >         from original _store_mask.
> > > ---
> > >  gcc/config/i386/sse.md | 69 ++++++++++++++++++++++++++++++++++--------
> > >  1 file changed, 57 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > index 3b50c7117f8..812cfca4b92 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ b/gcc/config/i386/sse.md
> > > @@ -1608,7 +1608,7 @@ (define_insn "_blendm"
> > >     (set_attr "prefix" "evex")
> > >     (set_attr "mode" "")])
> > >
> > > -(define_insn "_store_mask"
> > > +(define_insn "*_store_mask"
> > >    [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
> > >         (vec_merge:V48_AVX512VL
> > >           (match_operand:V48_AVX512VL 1 "register_operand" "v")
> > > @@ -1636,7 +1636,7 @@ (define_insn "_store_mask"
> > >     (set_attr "memory" "store")
> > >     (set_attr "mode" "")])
> > >
> > > -(define_insn "_store_mask"
> > > +(define_insn "*_store_mask"
> > >    [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
> > >         (vec_merge:VI12HFBF_AVX512VL
> > >           (match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> > > @@ -27008,21 +27008,66 @@ (define_expand "maskstore"
> > >    "TARGET_AVX")
> > >
> > >  (define_expand "maskstore"
> > > -  [(set (match_operand:V48H_AVX512VL 0 "memory_operand")
> > > -       (vec_merge:V48H_AVX512VL
> > > -         (match_operand:V48H_AVX512VL 1 "register_operand")
> > > -         (match_dup 0)
> > > -         (match_operand: 2 "register_operand")))]
> > > +  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
> > > +       (unspec:V48_AVX512VL
> > > +         [(match_operand:V48_AVX512VL 1 "register_operand")
> > > +          (match_dup 0)
> > > +          (match_operand: 2 "register_operand")]
> > > +         UNSPEC_MASKMOV))]
> > >    "TARGET_AVX512F")
> > >
> > >  (define_expand "maskstore"
> > > -  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
> > > -       (vec_merge:VI12_AVX512VL
> > > -         (match_operand:VI12_AVX512VL 1 "register_operand")
> > > -         (match_dup 0)
> > > -         (match_operand: 2 "register_operand")))]
> > > +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand")
> > > +       (unspec:VI12HFBF_AVX512VL
> > > +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand")
> > > +          (match_dup 0)
> > > +          (match_operand: 2 "register_operand")]
> > > +         UNSPEC_MASKMOV))]
> > >    "TARGET_AVX512BW")
> > >
> > > +(define_insn "_store_mask"
> > > +  [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
> > > +       (unspec:V48_AVX512VL
> > > +         [(match_operand:V48_AVX512VL 1 "register_operand" "v")
> > > +          (match_dup 0)
> > > +          (match_operand: 2 "register_operand" "Yk")]
> > > +         UNSPEC_MASKMOV))]
> > > +  "TARGET_AVX512F"
> > > +{
> > > +  if (FLOAT_MODE_P (GET_MODE_INNER (mode)))
> > > +    {
> > > +      if (misaligned_operand (operands[0], mode))
> > > +        return "vmovu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > > +      else
> > > +        return "vmova\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > > +    }
> > > +  else
> > > +    {
> > > +      if (misaligned_operand (operands[0], mode))
> > > +        return "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > > +      else
> > > +        return "vmovdqa\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > > +    }
> > > +}
> > > +  [(set_attr "type" "ssemov")
> > > +   (set_attr "prefix" "evex")
> > > +   (set_attr "memory" "store")
> > > +   (set_attr "mode" "")])
> > > +
> > > +(define_insn "_store_mask"
> > > +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
> > > +       (unspec:VI12HFBF_AVX512VL
> > > +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> > > +          (match_dup 0)
> > > +          (match_operand: 2 "register_operand" "Yk")]
> > > +         UNSPEC_MASKMOV))]
> > > +  "TARGET_AVX512BW"
> > > +  "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}"
> > > +  [(set_attr "type" "ssemov")
> > > +   (set_attr "prefix" "evex")
> > > +   (set_attr "memory" "store")
> > > +   (set_attr "mode" "")])
> > > +
> > >  (define_expand "cbranch4"
> > >    [(set (reg:CC FLAGS_REG)
> > >         (compare:CC (match_operand:VI48_AVX 1 "register_operand")
> > > --
> > > 2.39.1.388.g2fc9e9ca3c
> > >
> >
>
> --
> BR,
> Hongtao

-- 
BR,
Hongtao
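
For reference, the fault-suppression concern in the quoted patch description can be illustrated with a small toy model. This is only a sketch in Python, not GCC code: `Memory`, `Fault`, `masked_store` and `merge_then_full_store` are hypothetical names invented for the illustration, and "accessible" stands in for the part of the vector-sized region that is actually mapped. It shows why a store expressed as a merge with the old memory contents may be rewritten into a full-width access that faults, while a true masked store never touches masked-out elements.

```python
class Fault(Exception):
    """Raised when an access touches an element outside the accessible region."""


class Memory:
    """Toy memory: only the first `accessible` elements may be read or written."""

    def __init__(self, values, accessible):
        self.values = list(values)
        self.accessible = accessible

    def _check(self, index):
        if index >= self.accessible:
            raise Fault(f"access to inaccessible element {index}")

    def load(self, index):
        self._check(index)
        return self.values[index]

    def store(self, index, value):
        self._check(index)
        self.values[index] = value


def masked_store(mem, src, mask):
    """Masked store with fault suppression: masked-out lanes are never accessed."""
    for index, (value, enabled) in enumerate(zip(src, mask)):
        if enabled:
            mem.store(index, value)


def merge_then_full_store(mem, src, mask):
    """What a merge-with-memory form permits: full load, blend, full store."""
    old = [mem.load(index) for index in range(len(src))]
    merged = [new if enabled else prev
              for new, prev, enabled in zip(src, old, mask)]
    for index, value in enumerate(merged):
        mem.store(index, value)
```

When the whole region is accessible the two functions leave memory in the same state, which is why the two forms look equivalent at the pattern level. They differ only when part of the region is inaccessible: `masked_store` succeeds, `merge_then_full_store` raises `Fault`.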