From: Richard Sandiford
To: Hongtao Liu via Gcc-patches
Cc: ebotcazou@libertysurf.fr, steven@gcc.gnu.org, Hongtao Liu, Jakub Jelinek
Subject: Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.
Date: Mon, 18 Jan 2021 10:18:52 +0000
In-Reply-To: (Hongtao Liu via Gcc-patches's message of "Mon, 18 Jan 2021 17:16:42 +0800")

Hongtao Liu via Gcc-patches writes:
> Hi:
>   If SRC had been assigned a mode narrower than the copy, we can't link
> DEST into the chain even they have same
> hard_regno_nregs(i.e. HImode/SImode in i386 backend).

In general, changes between modes within the same hard register are OK.
Could you explain in more detail what's going wrong?

Thanks,
Richard

>
> i.e
>         kmovw   %k0, %edi
>         vmovd   %edi, %xmm2
>         vpshuflw        $0, %xmm2, %xmm0
>         kmovw   %k0, %r8d
>         kmovd   %k0, %r9d
> ...
> -        movl    %r9d, %r11d
> +        vmovd   %xmm2, %r11d
>
>   Bootstrap and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
>
> gcc/ChangeLog:
>
>         PR rtl-optimization/98694
>         * regcprop.c (copy_value): If SRC had been assigned a mode
>         narrower than the copy, we can't link DEST into the chain even
>         they have same hard_regno_nregs(i.e. HImode/SImode in i386
>         backend).
>
> gcc/testsuite/ChangeLog:
>
>         PR rtl-optimization/98694
>         * gcc.target/i386/pr98694.c: New test.
>
> ---
>  gcc/regcprop.c                          |  3 +-
>  gcc/testsuite/gcc.target/i386/pr98694.c | 38 +++++++++++++++++++++++++
>  2 files changed, 40 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr98694.c
>
> diff --git a/gcc/regcprop.c b/gcc/regcprop.c
> index dd62cb36013..997516eca07 100644
> --- a/gcc/regcprop.c
> +++ b/gcc/regcprop.c
> @@ -355,7 +355,8 @@ copy_value (rtx dest, rtx src, struct value_data *vd)
>    /* If SRC had been assigned a mode narrower than the copy, we can't
>       link DEST into the chain, because not all of the pieces of the
>       copy came from oldest_regno.  */
> -  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode))
> +  else if (sn > hard_regno_nregs (sr, vd->e[sr].mode)
> +           || partial_subreg_p (vd->e[sr].mode, GET_MODE (src)))
>      return;
>
>    /* Link DR at the end of the value chain used by SR.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr98694.c b/gcc/testsuite/gcc.target/i386/pr98694.c
> new file mode 100644
> index 00000000000..611f9e77627
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr98694.c
> @@ -0,0 +1,38 @@
> +/* PR rtl-optimization/98694 */
> +/* { dg-do run { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx512bw" } */
> +/* { dg-require-effective-target avx512bw } */
> +
> +#include <immintrin.h>
> +typedef short v4hi __attribute__ ((vector_size (8)));
> +typedef int v2si __attribute__ ((vector_size (8)));
> +v4hi b;
> +
> +__attribute__ ((noipa))
> +v2si
> +foo (__m512i src1, __m512i src2)
> +{
> +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> +  short s = (short) m;
> +  int i = (int)m;
> +  b = __extension__ (v4hi) {s, s, s, s};
> +  return __extension__ (v2si) {i, i};
> +}
> +
> +int main ()
> +{
> +  __m512i src1 = _mm512_setzero_si512 ();
> +  __m512i src2 = _mm512_set_epi8 (0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1,
> +                                  0, 1, 0, 1, 0, 1, 0, 1);
> +  __mmask64 m = _mm512_cmpeq_epu8_mask (src1, src2);
> +  v2si a = foo (src1, src2);
> +  if (a[0] != (int)m)
> +    __builtin_abort ();
> +  return 0;
> +}
> --
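
As a reading aid for the quoted assembly, here is a minimal sketch in plain C of one way the substitution could go wrong. It is an interpretation of the thread, not something the patch confirms. It assumes that kmovw copies only the low 16 bits of the mask and zero-extends them, and that kmovd copies the low 32 bits. The helper names copy_himode, copy_simode and recorded_mode_is_narrower are hypothetical and are not GCC functions. The last one only compares bit widths and is a toy stand-in for the partial_subreg_p test added by the patch.

```c
#include <stdint.h>

/* Hypothetical model of an HImode copy out of a mask register
   (kmovw in the quoted assembly): keep the low 16 bits and
   zero-extend.  */
static uint32_t
copy_himode (uint64_t k)
{
  return (uint16_t) k;
}

/* Hypothetical model of an SImode copy out of the same mask
   register (kmovd): keep the low 32 bits.  */
static uint32_t
copy_simode (uint64_t k)
{
  return (uint32_t) k;
}

/* Toy stand-in for the added condition: nonzero when the mode
   recorded for the source is narrower than the mode of the copy.
   Widths are given in bits; this is not GCC's partial_subreg_p.  */
static int
recorded_mode_is_narrower (unsigned recorded_bits, unsigned copy_bits)
{
  return recorded_bits < copy_bits;
}
```

With an illustrative mask that has bits set above bit 15, such as 0xAAAAAAAAAAAAAAAA, copy_himode gives 0xAAAA and copy_simode gives 0xAAAAAAAA. Under these assumptions, a register filled through the 16-bit path cannot stand in for one filled through the 32-bit path, even though both modes occupy a single hard register. recorded_mode_is_narrower (16, 32) is nonzero for exactly that case.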