From: Richard Sandiford
To: Hongtao Liu
Cc: Hongtao Liu via Gcc-patches, ebotcazou@libertysurf.fr, steven@gcc.gnu.org, Jakub Jelinek
Subject: Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.
Date: Tue, 19 Jan 2021 12:38:47 +0000
In-Reply-To: (Hongtao Liu's message of "Tue, 19 Jan 2021 08:59:25 +0800")

Hongtao Liu writes:
> On Mon, Jan 18, 2021 at 7:10 PM Richard Sandiford
> wrote:
>>
>> Hongtao Liu writes:
>> > On Mon, Jan 18, 2021 at 6:18 PM Richard Sandiford
>> > wrote:
>> >>
>> >> Hongtao Liu via Gcc-patches writes:
>> >> > Hi:
>> >> > If SRC had been assigned a mode narrower than the copy, we can't link
>> >> > DEST into the chain even they have same
>> >> > hard_regno_nregs(i.e. HImode/SImode in i386 backend).
>> >>
>> >> In general, changes between modes within the same hard register are OK.
>> >> Could you explain in more detail what's going wrong?
>> >>
>> >
>> > cprop hardreg change
>> >
>> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>> > (reg:SI 37 r9 [orig:86 _11 ] [86])) "test.c":29:36 75 {*movsi_internal}
>> > (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
>> > (nil)))
>> >
>> > to
>> >
>> > (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
>> > (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "test.c":29:36 75
>> > {*movsi_internal}
>> > (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
>> > (nil)))
>> >
>> > since (reg:SI 22 xmm2) and (reg:SI r9) are in the same value chain in
>> > which the oldest regno is k0.
>> >
>> > but with xmm2 defined as
>> >
>> > kmovw %k0, %edi # 69 [c=4 l=4] *movhi_internal/6----- kmovw move the
>> > lower 16bits to %edi, and clear the upper 16 bits.
>> > vmovd %edi, %xmm2 # 489 *movsi_internal --- vmovd move 32bits from
>> > %edi to %xmm2.
>> >
>> > (insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
>> > (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
>> > {*movhi_internal}
>> > (nil))
>> >
>> > (insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
>> > (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
>> > (nil))
>>
>> The sequence is OK in itself, but insn 489 can't make any assumptions
>> about what's in the upper 16 bits of %edi. In other words, as far as
>> RTL semantics are concerned, insn 489 only leaves bits 0-15 of %xmm2
>> with defined values; the other bits are undefined.
>>
>> If the target wants all 32 bits of %edi to be carried over to insn 489
>> then it needs to make insn 69 an SImode set instead of a HImode set.
>>
>
> actually only the lower 16bits are needed, the original insn is like
>
> .294.r.ira
> (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76
> {*movhi_internal}
> (nil))
> (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52
> ]) 0)))) 1412 {*vec_dupv4hi}
> (nil))
>
> .295r.reload
> (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76
> {*movhi_internal}
> (nil))
> (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> (nil))
> (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297]))))
> 1412 {*vec_dupv4hi}
> (nil))
>
> and insn 489 is created by lra/reload which seems ok for the sequence,
> but problemistic with considering the logic of hardreg_cprop.

It looks OK even with the regcprop behaviour though:

- insn 69 defines only the low 16 bits of di,
- insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31 too
  (with unknown contents)
- insn 78 uses only the low 16 bits of xmm2 (the unknown contents
  introduced by insn 489 are truncated away)

So where do bits 16-31 become significant? What goes wrong if they're
not zero?

Thanks,
Richard
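
The three-step argument about insns 69, 489 and 78 can be modelled in plain C. This is only a minimal sketch and not GCC code: `struct reg_model`, `set_hi`, `copy_si`, `truncate_hi`, `run_sequence` and `upper_bits_matter` are hypothetical names introduced for illustration, and the `junk_upper` parameter stands in for whatever happens to sit in bits 16-31 after a HImode set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit hard register: its raw contents plus a
   mask of the bits that RTL semantics treat as defined.  */
struct reg_model {
  uint32_t value;
  uint32_t defined;
};

/* Models a HImode set such as insn 69: only bits 0-15 become defined.
   Bits 16-31 hold arbitrary junk as far as RTL is concerned.  */
static struct reg_model set_hi(uint16_t src, uint32_t junk_upper)
{
  struct reg_model r;
  r.value = (junk_upper & 0xffff0000u) | src;
  r.defined = 0x0000ffffu;
  assert((r.value & r.defined) == src);
  return r;
}

/* Models an SImode copy such as insn 489: all 32 bits are copied, but
   the destination is no more defined than the source was.  */
static struct reg_model copy_si(struct reg_model src)
{
  return src;
}

/* Models (truncate:HI ...) as used by insn 78: keeps bits 0-15 only.  */
static uint16_t truncate_hi(struct reg_model src)
{
  return (uint16_t)(src.value & 0xffffu);
}

/* Runs the 69 -> 489 -> 78 chain for a given low half and junk.  */
static uint16_t run_sequence(uint16_t k0_low, uint32_t junk_upper)
{
  struct reg_model di = set_hi(k0_low, junk_upper);
  struct reg_model xmm2 = copy_si(di);
  return truncate_hi(xmm2);
}

/* Returns nonzero if two different junk patterns change the result.  */
static int upper_bits_matter(uint16_t k0_low)
{
  return run_sequence(k0_low, 0x00000000u)
         != run_sequence(k0_low, 0xdead0000u);
}
```

In this model `run_sequence(0x1234, 0x00000000u)` and `run_sequence(0x1234, 0xdead0000u)` both yield `0x1234`, so the chain on its own does not depend on bits 16-31. A failure would need some other use that reads those bits without truncating them.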