From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id A0E5F3858CDA for ; Wed, 5 Apr 2023 09:17:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A0E5F3858CDA Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680686226; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type; bh=N9bk2wqLARoGL1nNOrbmF3SdaGPOjcmR9yGlwTMSXaM=; b=iPDAGsvhy9FhFROdWNf49F3NMdqPyMt7XBIn3NRh1Pj54w1HkwtSa0uUxnzHNkvNhQwcNe ZhfpfvAmZJX5a0ByxpYaqpbXBdJ4rUfQZ/szTFzOpTjM1b2tWFkC/maKem9MhWzm1EG14T JqW1p4aQPdrYY0eOaouuuQmOFDbbuaY= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-449-6Jx5IRpQMaGNZea3P-puqQ-1; Wed, 05 Apr 2023 05:16:58 -0400 X-MC-Unique: 6Jx5IRpQMaGNZea3P-puqQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5581085A5B1; Wed, 5 Apr 2023 09:16:58 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.16]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0540118EC7; Wed, 5 Apr 2023 09:16:57 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 3359Gss83972345 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Wed, 5 Apr 2023 11:16:55 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 3359GrPF3972344; Wed, 5 Apr 2023 11:16:53 +0200 Date: Wed, 5 Apr 2023 11:16:53 +0200 From: Jakub Jelinek To: Richard Biener , Jeff Law , Richard Sandiford , Eric Botcazou Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] dse: Handle SUBREGs of word REGs differently for WORD_REGISTER_OPERATIONS targets [PR109040] Message-ID: Reply-To: Jakub Jelinek MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi! The following testcase is miscompiled on riscv since the addition of *mvconst_internal define_insn_and_split. I believe the bug is in DSE. We have: (insn 36 35 39 2 (set (mem/c:SI (plus:SI (reg/f:SI 65 frame) (const_int -64 [0xffffffffffffffc0])) [2 S4 A128]) (reg:SI 166)) "pr109040.c":9:11 178 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 166) (nil))) (insn 39 36 40 2 (set (reg:SI 171) (zero_extend:SI (mem/c:HI (plus:SI (reg/f:SI 65 frame) (const_int -64 [0xffffffffffffffc0])) [0 S2 A128]))) "pr109040.c":9:11 111 {*zero_extendhisi2} (nil)) and RTL DSE's replace_read since r0-86337-g18b526e806ab6455 handles even different modes like in the above case, and so it optimizes it into: (insn 47 35 39 2 (set (reg:HI 175) (subreg:HI (reg:SI 166) 0)) "pr109040.c":9:11 179 {*movhi_internal} (expr_list:REG_DEAD (reg:SI 166) (nil))) (insn 39 47 40 2 (set (reg:SI 171) (zero_extend:SI (reg:HI 175))) "pr109040.c":9:11 111 {*zero_extendhisi2} (expr_list:REG_DEAD (reg:HI 175) (nil))) Pseudo 166 is result of AND with 0x8084c constant (forced into a register). Combine attempts to combine the AND with the insn 47 above created by DSE, and turns it because of WORD_REGISTER_OPERATIONS and its assumption that all the subword operations are actually done on word mode into: (set (subreg:SI (reg:HI 175) 0) (and:SI (reg:SI 167 [ m ]) (reg:SI 168))) and later on the ZERO_EXTEND is thrown away. The following patch changes DSE to instead emit (insn 47 35 39 2 (set (reg:SI 175) (reg:SI 166)) "pr109040.c":10:11 180 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 166) (nil))) (insn 39 47 40 2 (set (reg:SI 171) (zero_extend:SI (subreg:HI (reg:SI 175) 0))) "pr109040.c":10:11 111 {*zero_extendhisi2} (expr_list:REG_DEAD (reg:SI 175) (nil))) i.e. in the new insn copy whole reg rather than subword part of it where we don't really know anything about the upper bits. With this change, combine manages to do the right thing, optimies the (unsigned) (unsigned short) (reg & 0x8084c) into reg & 0x84c. Bootstrapped/regtested on x86_64-linux and i686-linux (admittedly not WORD_REGISTER_OPERATIONS targets) and tested using a cross to riscv on the testcase. I have unfortunately no way to bootstrap this on risc*-linux, could somebody do that? Ok for trunk if that passes/ 2023-04-05 Jakub Jelinek PR target/109040 * dse.cc (replace_read): If read_reg is a SUBREG of a word mode REG, for WORD_REGISTER_OPERATIONS copy SUBREG_REG of it into a new REG rather than the SUBREG. * gcc.c-torture/execute/pr109040.c: New test. --- gcc/dse.cc.jj 2023-01-02 09:32:50.369880943 +0100 +++ gcc/dse.cc 2023-04-04 22:17:22.906347794 +0200 @@ -2012,7 +2012,19 @@ replace_read (store_info *store_info, in } /* Force the value into a new register so that it won't be clobbered between the store and the load. */ - read_reg = copy_to_mode_reg (read_mode, read_reg); + if (WORD_REGISTER_OPERATIONS + && GET_CODE (read_reg) == SUBREG + && REG_P (SUBREG_REG (read_reg)) + && GET_MODE (SUBREG_REG (read_reg)) == word_mode) + { + /* For WORD_REGISTER_OPERATIONS with subreg of word_mode register + force SUBREG_REG into a new register rather than the SUBREG. */ + rtx r = copy_to_mode_reg (word_mode, SUBREG_REG (read_reg)); + read_reg = shallow_copy_rtx (read_reg); + SUBREG_REG (read_reg) = r; + } + else + read_reg = copy_to_mode_reg (read_mode, read_reg); insns = get_insns (); end_sequence (); --- gcc/testsuite/gcc.c-torture/execute/pr109040.c.jj 2023-04-04 16:57:31.739658564 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr109040.c 2023-04-04 16:57:15.238900638 +0200 @@ -0,0 +1,23 @@ +/* PR target/109040 */ + +typedef unsigned short __attribute__((__vector_size__ (32))) V; + +unsigned short a, b, c, d; + +void +foo (V m, unsigned short *ret) +{ + V v = 6 > ((V) { 2124, 8 } & m); + unsigned short uc = v[0] + a + b + c + d; + *ret = uc; +} + +int +main () +{ + unsigned short x; + foo ((V) { 0, 15 }, &x); + if (x != (unsigned short) ~0) + __builtin_abort (); + return 0; +} Jakub