Date: Tue, 19 Jan 2021 15:45:14 +0100
From: Jakub Jelinek
To: Hongtao Liu, Hongtao Liu via Gcc-patches, ebotcazou@libertysurf.fr, steven@gcc.gnu.org, richard.sandiford@arm.com
Subject: Re: [PATCH] [PR rtl/optimization/98694] Fix incorrect optimization by cprop_hardreg.
Message-ID: <20210119144514.GA4020736@tucnak>

On Tue, Jan 19, 2021 at 12:38:47PM +0000, Richard Sandiford via Gcc-patches wrote:
> > actually only the lower 16 bits are needed, the original insn is like
> >
> > .294r.ira
> > (insn 69 68 70 13 (set (reg:HI 96 [ _52 ])
> >         (subreg:HI (reg:DI 82 [ var_6.0_1 ]) 0)) "test.c":21:23 76 {*movhi_internal}
> >      (nil))
> > (insn 78 75 82 13 (set (reg:V4HI 140 [ _283 ])
> >         (vec_duplicate:V4HI (truncate:HI (subreg:SI (reg:HI 96 [ _52 ]) 0)))) 1412 {*vec_dupv4hi}
> >      (nil))
> >
> > .295r.reload
> > (insn 69 68 70 13 (set (reg:HI 5 di [orig:96 _52 ] [96])
> >         (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "test.c":21:23 76 {*movhi_internal}
> >      (nil))
> > (insn 489 75 78 13 (set (reg:SI 22 xmm2 [297])
> >         (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
> >      (nil))
> > (insn 78 489 490 13 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
> >         (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297])))) 1412 {*vec_dupv4hi}
> >      (nil))
> >
> > and insn 489 is created by lra/reload which seems ok for the sequence,
> > but problematic considering the logic of hardreg_cprop.
>
> It looks OK even with the regcprop behaviour though:
>
> - insn 69 defines only the low 16 bits of di,
> - insn 489 defines only the low 16 bits of xmm2, but copies bits 16-31
>   too (with unknown contents)
> - insn 78 uses only the low 16 bits of xmm2 (the unknown contents
>   introduced by insn 489 are truncated away)
>
> So where do bits 16-31 become significant?  What goes wrong if they're
> not zero?

The k0 register is initialized I believe with
(insn 20 2 21 2 (set (reg:DI 68 k0 [orig:82 var_6.0_1 ] [82])
        (mem/c:DI (symbol_ref:DI ("var_6") [flags 0x40] ) [3 var_6+0 S8 A64])) "pr98694.C":21:10 74 {*movdi_internal}
     (nil))
and so it contains all 64 bits, and then the code sometimes uses all the
bits, sometimes just the low 16 bits and sometimes the low 32 bits of that
value.
(insn 69 68 70 12 (set (reg:HI 5 di [orig:96 _52 ] [96])
        (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":27:23 76 {*movhi_internal}
     (nil))
(insn 74 73 75 12 (set (reg:SI 36 r8 [orig:149 _52 ] [149])
        (zero_extend:SI (reg:HI 68 k0 [orig:82 var_6.0_1 ] [82]))) 144 {*zero_extendhisi2}
     (nil))
(insn 489 75 78 12 (set (reg:SI 22 xmm2 [297])
        (reg:SI 5 di [orig:96 _52 ] [96])) 75 {*movsi_internal}
     (nil))
(insn 78 489 490 12 (set (reg:V4HI 20 xmm0 [orig:140 _283 ] [140])
        (vec_duplicate:V4HI (truncate:HI (reg:SI 22 xmm2 [297])))) 1412 {*vec_dupv4hi}
     (expr_list:REG_DEAD (reg:SI 22 xmm2 [297])
        (nil)))
are examples where it uses only the low 16 bits of that value, and
(insn 487 72 73 12 (set (reg:SI 1 dx [148])
        (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
     (nil))
(insn 85 84 491 13 (set (reg:SI 37 r9 [orig:86 _11 ] [86])
        (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) "pr98694.C":28:14 75 {*movsi_internal}
     (nil))
(insn 491 85 88 13 (set (reg:SI 3 bx [299])
        (reg:SI 68 k0 [orig:82 var_6.0_1 ] [82])) 75 {*movsi_internal}
     (nil))
(insn 88 491 89 13 (set (reg:CCNO 17 flags)
        (compare:CCNO (reg:SI 3 bx [299])
            (const_int 0 [0]))) 7 {*cmpsi_ccno_1}
     (expr_list:REG_DEAD (reg:SI 3 bx [299])
        (nil)))
(insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
        (reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
        (nil)))
are examples where it uses the low 32 bits of k0.

So the
 (insn 457 499 460 33 (set (reg:SI 39 r11 [orig:86 _11 ] [86])
-        (reg:SI 37 r9 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 {*movsi_internal}
-     (expr_list:REG_DEAD (reg:SI 37 r9 [orig:86 _11 ] [86])
+        (reg:SI 22 xmm2 [orig:86 _11 ] [86])) "pr98694.C":35:36 75 {*movsi_internal}
+     (expr_list:REG_DEAD (reg:SI 22 xmm2 [orig:86 _11 ] [86])
         (nil)))
cprop_hardreg change indeed looks bogus: while xmm2 has SImode, it holds
only the low 16 bits of the value and has the upper bits undefined, whereas
r9, which it is replacing, had all of the low 32 bits well defined.

	Jakub
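
For readers following the thread, the argument above can be sketched as a toy model. This is not how regcprop.c tracks copies; it only illustrates the reasoning, under the assumption that each hard register can be summarized by which value it copies and how many of its low bits are known to match that value. All names here (RegState, load, copy, can_substitute) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

# Bit widths of the machine modes that appear in the dumps above.
MODE_BITS = {"HI": 16, "SI": 32, "DI": 64}


@dataclass(frozen=True)
class RegState:
    value: str         # which source value this register is a copy of
    defined_bits: int  # number of low bits known to match that value


def load(regs, dst, mode, value):
    """Model a full-width load: every bit of the mode is well defined."""
    regs[dst] = RegState(value, MODE_BITS[mode])


def copy(regs, dst, src, mode):
    """Model a register copy in `mode`.

    The copy moves MODE_BITS[mode] bits, but it cannot make more bits
    well defined than the source register already had.
    """
    s = regs[src]
    regs[dst] = RegState(s.value, min(s.defined_bits, MODE_BITS[mode]))


def can_substitute(regs, old, new, mode):
    """Is it safe to read `new` instead of `old` in a use of width `mode`?"""
    o, n = regs[old], regs[new]
    return o.value == n.value and n.defined_bits >= MODE_BITS[mode]


def simulate():
    regs = {}
    load(regs, "k0", "DI", "var_6")  # insn 20:  k0:DI   = var_6
    copy(regs, "di", "k0", "HI")     # insn 69:  di:HI   = k0:HI
    copy(regs, "xmm2", "di", "SI")   # insn 489: xmm2:SI = di:SI
    copy(regs, "r9", "k0", "SI")     # insn 85:  r9:SI   = k0:SI
    return regs


regs = simulate()
# insn 78 reads only the low 16 bits of xmm2, so that use is fine.
print(can_substitute(regs, "r9", "xmm2", "HI"))  # True
# insn 457 reads all 32 bits of r9, so replacing r9 with xmm2 is not safe.
print(can_substitute(regs, "r9", "xmm2", "SI"))  # False
```

In this model xmm2 ends up with only 16 defined bits even though insn 489 is an SImode copy, which is why an SImode use of r9 cannot be redirected to it.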