From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk1-xa36.google.com (mail-vk1-xa36.google.com [IPv6:2607:f8b0:4864:20::a36]) by sourceware.org (Postfix) with ESMTPS id AD2893858D34 for ; Sat, 18 Sep 2021 02:19:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AD2893858D34 Received: by mail-vk1-xa36.google.com with SMTP id h132so4421603vke.8 for ; Fri, 17 Sep 2021 19:19:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=6kosB7SYcxPQS7lnSgFK8/mZOtWza0+05v6SUIDQWgk=; b=wOkvmS5Vnryr2DRqlIv1iTrUBOVLIysHFLpsiv5Y3kV3TfeyETX8D51CssgYK6/sYE qd27LJ8sCk9XR4YEqD5cFN7IBMSVtFN8g/oEEMeBhQEXQ07D0EIkh4rkE4+j3gpLKFS6 KyLGvRdcwnzwWg2oAT4jV5Q+7rzIhkWknwsHH9VZsJLZW3AefTGANDZ6WVd6Qt7+Ht8a 1GBsx5eA1twjVYiSbM8q+ZUwa2jP3kALDBRxutSTF/M6+zPa1AWN8GqmFEiFRfl87Sh4 S6CHCxmvaM04c7sacHPdOikV8CA1noi751RKCKrzLiryUtWul6lHkXwEgu8cLpkioSmZ DY5g== X-Gm-Message-State: AOAM530ttjx5J0S9UM8t4V3eBVbZPscwqFDCBnpLJ50fMm+fBlsa5ztS mCjKLzrh2sTFXkNXLNW7dTzv7ubSYBRmhFi9lsI= X-Google-Smtp-Source: ABdhPJzagVSwkZMhtonlXnPfdJUo5f5GYbKE7ym2gkzlXHhQjVLa8sypQF1GGkWD+U0cu8PFRBLVrv+3ybCOnuYO3eE= X-Received: by 2002:a1f:257:: with SMTP id 84mr4423875vkc.19.1631931552200; Fri, 17 Sep 2021 19:19:12 -0700 (PDT) MIME-Version: 1.0 References: <20210915080951.10362-1-lili.cui@intel.com> <20210915080951.10362-4-lili.cui@intel.com> <20210917235014.GT304296@tucnak> In-Reply-To: <20210917235014.GT304296@tucnak> From: Hongtao Liu Date: Sat, 18 Sep 2021 10:25:27 +0800 Message-ID: Subject: Re: [PATCH 3/4] [PATCH 3/4] x86: Properly handle USE_VECTOR_FP_CONVERTS/USE_VECTOR_CONVERTS To: Jakub Jelinek Cc: Uros Bizjak , "gcc-patches@gcc.gnu.org" , "Liu, Hongtao" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Sep 2021 02:19:14 -0000 On Sat, Sep 18, 2021 at 7:50 AM Jakub Jelinek via Gcc-patches wrote: > > On Fri, Sep 17, 2021 at 08:35:57AM +0200, Uros Bizjak via Gcc-patches wrote: > > > > On Wed, Sep 15, 2021 at 10:10 AM wrote: > > > > > > > > > > From: "H.J. Lu" > > > > > > > > > > Check TARGET_USE_VECTOR_FP_CONVERTS or > > > > TARGET_USE_VECTOR_CONVERTS when > > > > > handling avx_partial_xmm_update attribute. Don't convert AVX partial > > > > > XMM register update if vector packed SSE conversion should be used. > > > > > > > > > > gcc/ > > > > > > > > > > PR target/101900 > > > > > * config/i386/i386-features.c (remove_partial_avx_dependency): > > > > > Check TARGET_USE_VECTOR_FP_CONVERTS and > > > > TARGET_USE_VECTOR_CONVERTS > > > > > before generating vxorps. > > > > > > > > > > gcc/ > > > > > > > > > > PR target/101900 > > > > > * testsuite/gcc.target/i386/pr101900-1.c: New test. > > > > > * testsuite/gcc.target/i386/pr101900-2.c: Likewise. > > > > > * testsuite/gcc.target/i386/pr101900-3.c: Likewise. > > > > > --- > > > > > gcc/config/i386/i386-features.c | 21 ++++++++++++++++++--- > > > > > gcc/testsuite/gcc.target/i386/pr101900-1.c | 18 ++++++++++++++++++ > > > > > gcc/testsuite/gcc.target/i386/pr101900-2.c | 18 ++++++++++++++++++ > > > > > gcc/testsuite/gcc.target/i386/pr101900-3.c | 19 +++++++++++++++++++ > > > > > 4 files changed, 73 insertions(+), 3 deletions(-) create mode 100644 > > > > > gcc/testsuite/gcc.target/i386/pr101900-1.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101900-2.c > > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr101900-3.c > > > > > > > > > > diff --git a/gcc/config/i386/i386-features.c > > > > > b/gcc/config/i386/i386-features.c index 5a99ea7c046..ae5ea02a002 > > > > > 100644 > > > > > --- a/gcc/config/i386/i386-features.c > > > > > +++ b/gcc/config/i386/i386-features.c > > > > > @@ -2210,15 +2210,30 @@ remove_partial_avx_dependency (void) > > > > > != AVX_PARTIAL_XMM_UPDATE_TRUE) > > > > > continue; > > > > > > > > > > - if (!v4sf_const0) > > > > > - v4sf_const0 = gen_reg_rtx (V4SFmode); > > > > > - > > > > > /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF, > > > > > SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and > > > > > vec_merge with subreg. */ > > > > > rtx src = SET_SRC (set); > > > > > rtx dest = SET_DEST (set); > > > > > machine_mode dest_mode = GET_MODE (dest); > > > > > + machine_mode src_mode; > > > > > + > > > > > + if (TARGET_USE_VECTOR_FP_CONVERTS) > > > > > + { > > > > > + src_mode = GET_MODE (XEXP (src, 0)); > > > > > + if (src_mode == E_SFmode || src_mode == E_DFmode) > > > > > + continue; > > > > > + } > > > > > + > > > > > + if (TARGET_USE_VECTOR_CONVERTS) > > > > > + { > > > > > + src_mode = GET_MODE (XEXP (src, 0)); > > > > > + if (src_mode == E_SImode || src_mode == E_DImode) > > > > > + continue; > > > > > + } > > > > > + > > > > > + if (!v4sf_const0) > > > > > + v4sf_const0 = gen_reg_rtx (V4SFmode); > > > > > > > > Please better move initialization of src_mode to the top of the new hunk, like: > > > > > > > > machine_mode src_mode = GET_MODE (XEXP (src, 0)); switch (src_mode) { > > > > case E_SFmode: > > > > case E_DFmode: > > > > if (TARGET_USE_VECTOR_FP_CONVERTS) > > > > continue; > > > > break; > > > > case E_SImode: > > > > case E_DImode: > > > > if (TARGET_USE_VECTOR_CONVERTS) > > > > continue; > > > > break; > > > > default: > > > > break; > > > > } > > > > > > > > or something like the above. > > > > > > Done, thanks for your good advice, I also rebased patch 4/4, since it is based on patch 3/4. > > The above change broke > +FAIL: gcc.target/i386/avx512f-vscalefpd-2.c (internal compiler error) > +FAIL: gcc.target/i386/avx512f-vscalefpd-2.c (test for excess errors) > +UNRESOLVED: gcc.target/i386/avx512f-vscalefpd-2.c compilation failed to produce executable > +FAIL: gcc.target/i386/avx512f-vscalefps-2.c (internal compiler error) > +FAIL: gcc.target/i386/avx512f-vscalefps-2.c (test for excess errors) > +UNRESOLVED: gcc.target/i386/avx512f-vscalefps-2.c compilation failed to produce executable > +FAIL: gcc.target/i386/avx512f-vscalefss-2.c (internal compiler error) > +FAIL: gcc.target/i386/avx512f-vscalefss-2.c (test for excess errors) > +UNRESOLVED: gcc.target/i386/avx512f-vscalefss-2.c compilation failed to produce executable > +FAIL: gcc.target/i386/avx512vl-vscalefpd-2.c (internal compiler error) > +FAIL: gcc.target/i386/avx512vl-vscalefpd-2.c (test for excess errors) > +UNRESOLVED: gcc.target/i386/avx512vl-vscalefpd-2.c compilation failed to produce executable > +FAIL: gcc.target/i386/avx512vl-vscalefps-2.c (internal compiler error) > +FAIL: gcc.target/i386/avx512vl-vscalefps-2.c (test for excess errors) > +UNRESOLVED: gcc.target/i386/avx512vl-vscalefps-2.c compilation failed to produce executable > when configured with --enable-checking=yes,rtl,extra, the error is: > during RTL pass: rpad > /home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/avx512f-vscalefpd-2.c:57:1: internal compiler error: RTL check: expected elt 0 type 'e' or 'u', have 'E' (rtx unspec) in remove_partial_avx_dependency, at config/i386/i386-features.c:2219 > 0x77541d rtl_check_failed_type2(rtx_def const*, int, int, int, char const*, int, char const*) > ../../gcc/rtl.c:898 > 0x84e731 remove_partial_avx_dependency > ../../gcc/config/i386/i386-features.c:2219 > 0x84e731 execute > ../../gcc/config/i386/i386-features.c:2389 > This is on > 2219 machine_mode src_mode = GET_MODE (XEXP (src, 0)); > and src is: > (unspec:DF [ > (mem:DF (plus:DI (reg/f:DI 149) > (reg:DI 103 [ ivtmp.65 ])) [3 MEM[(double *)&src2 + ivtmp.65_65 * 1]+0 S8 A64]) > (const_int 9 [0x9]) > ] UNSPEC_ROUND) > - whole insn > (insn 55 54 56 5 (set (reg:DF 99 [ _50 ]) > (unspec:DF [ > (mem:DF (plus:DI (reg/f:DI 149) > (reg:DI 103 [ ivtmp.65 ])) [3 MEM[(double *)&src2 + ivtmp.65_65 * 1]+0 S8 A64]) > (const_int 9 [0x9]) > ] UNSPEC_ROUND)) "/home/jakub/src/gcc/gcc/testsuite/gcc.target/i386/avx512f-vscalefpd-2.c":19:28 1076 {sse4_1_rounddf2} > (nil)) > so XEXP (src, 0) can't be used in that case. Let me fix this. > Looking at insns with avx_partial_xmm_update attribute, it seems > src is either FLOAT_EXTEND/FLOAT_TRUNCATE/FLOAT/UNSIGNED_FLOAT and > in that case it looks like a conversion and has different modes, > or it is UNSPEC (UNSPEC_{RCP,RSQRT,ROUND}) or SQRT and in that case it doesn't. Yes, i'll add comments to mention pass_rpad also handle rcp/round/sqrt/rsqrt. > > Jakub > -- BR, Hongtao