From: Richard Biener
Date: Tue, 27 Jun 2023 09:20:12 +0200
Subject: Re: [PATCH] [x86] Refine maskstore patterns with UNSPEC_MASKMOV.
To: liuhongt
Cc: gcc-patches@gcc.gnu.org
References: <20230627053806.2880955-1-hongtao.liu@intel.com>
In-Reply-To: <20230627053806.2880955-1-hongtao.liu@intel.com>

On Tue, Jun 27, 2023 at 7:38 AM liuhongt wrote:
>
> At the rtl level, we cannot guarantee that the maskstore is not optimized
> to other full-memory accesses, as the current implementations are equivalent
> in terms of pattern, to solve this potential problem, this patch refines
> the pattern of the maskstore and the intrinsics with unspec.
>
> One thing I'm not sure is VCOND_EXPR, should VCOND_EXPR also expect
> fault suppression for masked-out elements?

You mean the vcond and vcond_eq optabs?  No, those do not expect
fault suppression.

>
> Currently we're still using vec_merge for both AVX2 and AVX512 target.
>
> ------------------------
> Similar like r14-2070-gc79476da46728e
>
> If mem_addr points to a memory region with less than whole vector size
> bytes of accessible memory and k is a mask that would prevent reading
> the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent
> it to be transformed to any other whole memory access instructions.
>
> Bootstrapped and regtested on x86_64-pc-linu-gnu{-m32,}.
> Ready to push to trunk.
>
> gcc/ChangeLog:
>
>         PR rtl-optimization/110237
>         * config/i386/sse.md (_store_mask): Refine with
>         UNSPEC_MASKMOV.
>         (maskstore (*_store_mask): New define_insn, it's renamed
>         from original _store_mask.
> ---
>  gcc/config/i386/sse.md | 69 ++++++++++++++++++++++++++++++++++--------
>  1 file changed, 57 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 3b50c7117f8..812cfca4b92 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1608,7 +1608,7 @@ (define_insn "_blendm"
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "")])
>
> -(define_insn "_store_mask"
> +(define_insn "*_store_mask"
>    [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
>         (vec_merge:V48_AVX512VL
>           (match_operand:V48_AVX512VL 1 "register_operand" "v")
> @@ -1636,7 +1636,7 @@ (define_insn "_store_mask"
>     (set_attr "memory" "store")
>     (set_attr "mode" "")])
>
> -(define_insn "_store_mask"
> +(define_insn "*_store_mask"
>    [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
>         (vec_merge:VI12HFBF_AVX512VL
>           (match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> @@ -27008,21 +27008,66 @@ (define_expand "maskstore"
>    "TARGET_AVX")
>
>  (define_expand "maskstore"
> -  [(set (match_operand:V48H_AVX512VL 0 "memory_operand")
> -       (vec_merge:V48H_AVX512VL
> -         (match_operand:V48H_AVX512VL 1 "register_operand")
> -         (match_dup 0)
> -         (match_operand: 2 "register_operand")))]
> +  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
> +       (unspec:V48_AVX512VL
> +         [(match_operand:V48_AVX512VL 1 "register_operand")
> +          (match_dup 0)
> +          (match_operand: 2 "register_operand")]
> +         UNSPEC_MASKMOV))]
>    "TARGET_AVX512F")
>
>  (define_expand "maskstore"
> -  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
> -       (vec_merge:VI12_AVX512VL
> -         (match_operand:VI12_AVX512VL 1 "register_operand")
> -         (match_dup 0)
> -         (match_operand: 2 "register_operand")))]
> +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand")
> +       (unspec:VI12HFBF_AVX512VL
> +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand")
> +          (match_dup 0)
> +          (match_operand: 2 "register_operand")]
> +         UNSPEC_MASKMOV))]
>    "TARGET_AVX512BW")
>
> +(define_insn "_store_mask"
> +  [(set (match_operand:V48_AVX512VL 0 "memory_operand" "=m")
> +       (unspec:V48_AVX512VL
> +         [(match_operand:V48_AVX512VL 1 "register_operand" "v")
> +          (match_dup 0)
> +          (match_operand: 2 "register_operand" "Yk")]
> +         UNSPEC_MASKMOV))]
> +  "TARGET_AVX512F"
> +{
> +  if (FLOAT_MODE_P (GET_MODE_INNER (mode)))
> +    {
> +      if (misaligned_operand (operands[0], mode))
> +       return "vmovu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> +      else
> +       return "vmova\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> +    }
> +  else
> +    {
> +      if (misaligned_operand (operands[0], mode))
> +       return "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> +      else
> +       return "vmovdqa\t{%1, %0%{%2%}|%0%{%2%}, %1}";
> +    }
> +}
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "memory" "store")
> +   (set_attr "mode" "")])
> +
> +(define_insn "_store_mask"
> +  [(set (match_operand:VI12HFBF_AVX512VL 0 "memory_operand" "=m")
> +       (unspec:VI12HFBF_AVX512VL
> +         [(match_operand:VI12HFBF_AVX512VL 1 "register_operand" "v")
> +          (match_dup 0)
> +          (match_operand: 2 "register_operand" "Yk")]
> +         UNSPEC_MASKMOV))]
> +  "TARGET_AVX512BW"
> +  "vmovdqu\t{%1, %0%{%2%}|%0%{%2%}, %1}"
> +  [(set_attr "type" "ssemov")
> +   (set_attr "prefix" "evex")
> +   (set_attr "memory" "store")
> +   (set_attr "mode" "")])
> +
>  (define_expand "cbranch4"
>    [(set (reg:CC FLAGS_REG)
>         (compare:CC (match_operand:VI48_AVX 1 "register_operand")
> --
> 2.39.1.388.g2fc9e9ca3c
>