From: Hongtao Liu
Date: Tue, 27 Jun 2023 15:28:54 +0800
Subject: Re: [PATCH] [x86] Refine maskstore patterns with UNSPEC_MASKMOV.
To: Richard Biener
Cc: liuhongt, gcc-patches@gcc.gnu.org
References: <20230627053806.2880955-1-hongtao.liu@intel.com>

On Tue, Jun 27, 2023 at 3:20 PM Richard Biener via Gcc-patches wrote:
>
> On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
> >
> > At the rtl level, we cannot guarantee that the maskstore is not optimized
> > to other full-memory accesses, as the current implementations are equivalent
> > in terms of pattern, to solve this potential problem, this patch refines
> > the pattern of the maskstore and the intrinsics with unspec.
> >
> > One thing I'm not sure is VCOND_EXPR, should VCOND_EXPR also expect
> > fault suppression for masked-out elements?
>
> You mean the vcond and vcond_eq optabs?  No, those do not expect
> fault suppression.
Yes, vcond/vcond_eq, thanks for clarifying.
>
> >
> > Currently we're still using vec_merge for both AVX2 and AVX512 target.
> >
> > ------------------------
> > Similar like r14-2070-gc79476da46728e
> >
> > If mem_addr points to a memory region with less than whole vector size
> > bytes of accessible memory and k is a mask that would prevent reading
> > the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent
> > it to be transformed to any other whole memory access instructions.
> >
> > Bootstrapped and regtested on x86_64-pc-linu-gnu{-m32,}.
> > Ready to push to trunk.
> >
> > gcc/ChangeLog:
> >
> >         PR rtl-optimization/110237
> >         * config/i386/sse.md (_store_mask): Refine with
> >         UNSPEC_MASKMOV.
> >         (maskstore > (*_store_mask): New define_insn, it's renamed
> >         from original _store_mask.
> > ---
> >  gcc/config/i386/sse.md | 69 ++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 57 insertions(+), 12 deletions(-)
> >
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 3b50c7117f8..812cfca4b92 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -1608,7 +1608,7 @@ (define_insn "_blendm"
> >     (set_attr "prefix" "evex")
> >     (set_attr "mode" "")])
> >
> > -(define_insn "_store_mask"
> > +(define_insn "*_store_mask"
> >    [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
> >         (vec_merge:V48_AVX512VL
> >           (match_operand:V48_AVX512VL 1 "register_operand" "v")
> > @@ -1636,7 +1636,7 @@ (define_insn "_store_mask"
> >     (set_attr "memory" "store")
> >     (set_attr "mode" "")])
> >
> > -(define_insn "_store_mask"
> > +(define_insn "*_store_mask"
> >    [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
> >         (vec_merge:VI12HFBF_AVX512VL
> >           (match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> > @@ -27008,21 +27008,66 @@ (define_expand "maskstore"
> >    "TARGET_AVX")
> >
> >  (define_expand "maskstore"
> > -  [(set (match_operand:V48H_AVX512VL 0 "memory_operand")
> > -       (vec_merge:V48H_AVX512VL
> > -         (match_operand:V48H_AVX512VL 1 "register_operand")
> > -         (match_dup 0)
> > -         (match_operand: 2 "register_operand")))]
> > +  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
> > +       (unspec:V48_AVX512VL
> > +         [(match_operand:V48_AVX512VL 1 "register_operand")
> > +          (match_dup 0)
> > +          (match_operand: 2 "register_operand")]
> > +         UNSPEC_MASKMOV))]
> >    "TARGET_AVX512F")
> >
> >  (define_expand "maskstore"
> > -  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
> > -       (vec_merge:VI12_AVX512VL
> > -         (match_operand:VI12_AVX512VL 1 "register_operand")
> > -         (match_dup 0)
> > -         (match_operand: 2 "register_operand")))]
> > +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand")
> > +       (unspec:VI12HFBF_AVX512VL
> > +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand")
> > +          (match_dup 0)
> > +          (match_operand: 2 "register_operand")]
> > +         UNSPEC_MASKMOV))]
> >    "TARGET_AVX512BW")
> >
> > +(define_insn "_store_mask"
> > +  [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
> > +       (unspec:V48_AVX512VL
> > +         [(match_operand:V48_AVX512VL 1 "register_operand" "v")
> > +          (match_dup 0)
> > +          (match_operand: 2 "register_operand" "Yk")]
> > +         UNSPEC_MASKMOV))]
> > +  "TARGET_AVX512F"
> > +{
> > +  if (FLOAT_MODE_P (GET_MODE_INNER (mode)))
> > +    {
> > +      if (misaligned_operand (operands[0], mode))
> > +       return "vmovu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > +      else
> > +       return "vmova\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > +    }
> > +  else
> > +    {
> > +      if (misaligned_operand (operands[0], mode))
> > +       return "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > +      else
> > +       return "vmovdqa\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> > +    }
> > +}
> > +  [(set_attr "type" "ssemov")
> > +   (set_attr "prefix" "evex")
> > +   (set_attr "memory" "store")
> > +   (set_attr "mode" "")])
> > +
> > +(define_insn "_store_mask"
> > +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
> > +       (unspec:VI12HFBF_AVX512VL
> > +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> > +          (match_dup 0)
> > +          (match_operand: 2 "register_operand" "Yk")]
> > +         UNSPEC_MASKMOV))]
> > +  "TARGET_AVX512BW"
> > +  "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}"
> > +  [(set_attr "type" "ssemov")
> > +   (set_attr "prefix" "evex")
> > +   (set_attr "memory" "store")
> > +   (set_attr "mode" "")])
> > +
> >  (define_expand "cbranch4"
> >    [(set (reg:CC FLAGS_REG)
> >         (compare:CC (match_operand:VI48_AVX 1 "register_operand")
> > --
> > 2.39.1.388.g2fc9e9ca3c
> >

-- 
BR,
Hongtao
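
The fault-suppression concern in the quoted commit message can be illustrated with a small scalar model in C. This is only a sketch under stated assumptions, not GCC code and not real hardware behaviour: `struct region`, `touch`, `masked_store` and `load_blend_store` are hypothetical names invented here, the vector is assumed to have 8 lanes of 32 bits with an 8-bit mask, and a "fault" is modelled as a flag that is set when a lane beyond the accessible part of the region is touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* assumed lane count for this model */

/* Hypothetical model of a destination where only the first
   `accessible` lanes may be touched without faulting. */
struct region {
    int32_t data[LANES];
    size_t accessible;
    bool faulted;
};

/* Record an access to one lane; set the flag if it is out of bounds. */
static void touch(struct region *r, size_t lane)
{
    assert(lane < LANES);
    if (lane >= r->accessible)
        r->faulted = true;
}

/* Masked store with fault suppression: masked-out lanes are never
   accessed, which is the behaviour the maskstore must preserve. */
static void masked_store(struct region *r, const int32_t *src, uint8_t mask)
{
    for (size_t i = 0; i < LANES; i++) {
        if (mask & (1u << i)) {
            touch(r, i);
            r->data[i] = src[i];
        }
    }
}

/* Whole-vector form: load everything, blend, store everything.
   The resulting data is identical, but every lane is accessed. */
static void load_blend_store(struct region *r, const int32_t *src, uint8_t mask)
{
    int32_t tmp[LANES];

    for (size_t i = 0; i < LANES; i++) {
        touch(r, i);
        tmp[i] = r->data[i];
    }
    for (size_t i = 0; i < LANES; i++) {
        if (mask & (1u << i))
            tmp[i] = src[i];
    }
    for (size_t i = 0; i < LANES; i++) {
        touch(r, i);
        r->data[i] = tmp[i];
    }
}
```

With `accessible` set to 4 and a mask of 0x0F, `masked_store` leaves `faulted` clear, while `load_blend_store` sets it even though both leave the same values in `data`. Rewriting the first form into the second is the kind of transformation the patch aims to rule out by using UNSPEC_MASKMOV.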