Date: Fri, 22 Dec 2023 09:35:55 +0100
From: Jakub Jelinek
To: Eric Botcazou, Segher Boessenkool, Jeff Law
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] combine: Don't optimize paradoxical SUBREG AND CONST_INT on WORD_REGISTER_OPERATIONS targets [PR112758]

Hi!

As discussed in the PR, the following testcase is miscompiled on RISC-V
64-bit, because num_sign_bit_copies in one spot pretends the bits in
a paradoxical SUBREG beyond SUBREG_REG SImode are all sign bit copies:

      /* For paradoxical SUBREGs on machines where all register operations
         affect the entire register, just look inside.  Note that we are
         passing MODE to the recursive call, so the number of sign bit
         copies will remain relative to that mode, not the inner mode.

         This works only if loads sign extend.  Otherwise, if we get a
         reload for the inner part, it may be loaded from the stack, and
         then we lose all sign bit copies that existed before the store
         to the stack.  */
      if (WORD_REGISTER_OPERATIONS
          && load_extend_op (inner_mode) == SIGN_EXTEND
          && paradoxical_subreg_p (x)
          && MEM_P (SUBREG_REG (x)))

and then optimizes based on that in one place, but then the r7-1077
optimization kicks in, treats all the upper bits in the paradoxical
SUBREG as undefined, and performs another optimization based on that.

The r7-1077 optimization is done only if SUBREG_REG is either a REG or
a MEM.  From the discussions in the PR it seems that if it is a REG, the
upper bits in a paradoxical SUBREG on WORD_REGISTER_OPERATIONS targets
aren't really undefined, but we can't tell what values they have because
we don't see the operation which computed that REG.  For a MEM it depends
on load_extend_op: if it is SIGN_EXTEND, the upper bits are sign bit
copies and so not really usable for the optimization; if it is
ZERO_EXTEND, they are zeros and it is usable for the optimization; for
UNKNOWN I think it is better to punt as well.

So, the following patch basically disables the r7-1077 optimization on
WORD_REGISTER_OPERATIONS targets unless we know it is still ok for sure,
which is either if sub_width is >= BITS_PER_WORD, because then the
WORD_REGISTER_OPERATIONS rules don't apply, or if load_extend_op on a MEM
is ZERO_EXTEND.

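
To make the rule above easier to check case by case, here is a minimal
standalone sketch of just the new part of the guard.  It is not GCC code:
the enum load_ext, the function and_subreg_opt_ok and all of its
parameters are hypothetical stand-ins for load_extend_op,
WORD_REGISTER_OPERATIONS, BITS_PER_WORD and MEM_P, and it leaves out the
existing REG_P/MEM_P and sub_width < mode_width checks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the value returned by GCC's load_extend_op.  */
enum load_ext { LOAD_EXT_UNKNOWN, LOAD_EXT_SIGN, LOAD_EXT_ZERO };

/* Model of the added condition only.  Every parameter is a plain value
   standing in for a GCC target macro or RTL predicate; none of these
   names exist in GCC.  Returns true if the optimization may still be
   applied.  */
static bool
and_subreg_opt_ok (bool word_register_operations, int sub_width,
                   int bits_per_word, bool sub_is_mem, enum load_ext ext)
{
  /* Targets without WORD_REGISTER_OPERATIONS are unaffected.  */
  if (!word_register_operations)
    return true;

  /* At word width or wider the WORD_REGISTER_OPERATIONS rules
     don't apply.  */
  if (sub_width >= bits_per_word)
    return true;

  /* Otherwise only a MEM whose loads zero extend is known to have
     all-zero upper bits; REG, SIGN_EXTEND and UNKNOWN all punt.  */
  return sub_is_mem && ext == LOAD_EXT_ZERO;
}
```

With a 32-bit inner mode on a 64-bit word target this accepts only the
zero extending MEM case, which matches the description above.
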
Bootstrapped/regtested on x86_64-linux and i686-linux (neither of which
is a WORD_REGISTER_OPERATIONS target), tested on the testcase using a
cross to riscv64-linux, but I don't have easy access to a
WORD_REGISTER_OPERATIONS target to bootstrap/regtest it there.

Ok for trunk?

2023-12-22  Jakub Jelinek

	PR rtl-optimization/112758
	* combine.cc (make_compound_operation_int): Optimize AND of a SUBREG
	based on nonzero_bits of SUBREG_REG and constant mask on
	WORD_REGISTER_OPERATIONS targets only if it is a zero extending
	MEM load.

	* gcc.c-torture/execute/pr112758.c: New test.

--- gcc/combine.cc.jj	2023-12-11 23:52:03.528513943 +0100
+++ gcc/combine.cc	2023-12-21 20:25:45.461737423 +0100
@@ -8227,12 +8227,20 @@ make_compound_operation_int (scalar_int_
 	  int sub_width;
 	  if ((REG_P (sub) || MEM_P (sub))
 	      && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width)
-	      && sub_width < mode_width)
+	      && sub_width < mode_width
+	      && (!WORD_REGISTER_OPERATIONS
+		  || sub_width >= BITS_PER_WORD
+		  /* On WORD_REGISTER_OPERATIONS targets the bits
+		     beyond sub_mode aren't considered undefined,
+		     so optimize only if it is a MEM load when MEM loads
+		     zero extend, because then the upper bits are all zero.  */
+		  || (MEM_P (sub)
+		      && load_extend_op (sub_mode) == ZERO_EXTEND)))
 	    {
 	      unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode);
 	      unsigned HOST_WIDE_INT mask;
 
-	      /* original AND constant with all the known zero bits set */
+	      /* Original AND constant with all the known zero bits set.  */
 	      mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode));
 	      if ((mask & mode_mask) == mode_mask)
 		{
--- gcc/testsuite/gcc.c-torture/execute/pr112758.c.jj	2023-12-21 21:01:43.780755959 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr112758.c	2023-12-21 21:01:30.521940358 +0100
@@ -0,0 +1,15 @@
+/* PR rtl-optimization/112758 */
+
+int a = -__INT_MAX__ - 1;
+
+int
+main ()
+{
+  if (-__INT_MAX__ - 1U == 0x80000000ULL)
+    {
+      unsigned long long b = 0xffff00ffffffffffULL;
+      if ((b & a) != 0xffff00ff80000000ULL)
+	__builtin_abort ();
+    }
+  return 0;
+}

	Jakub
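
For reference, the arithmetic behind the constant the testcase expects
can be reproduced in isolation.  The sketch below is only an
illustration: the helper names widen_sign, widen_zero and and_mask are
made up here and appear nowhere in GCC or the testsuite.  It shows that
the expected value requires the upper 32 bits to be sign bit copies, and
that an all-zero upper half would give a different result.

```c
#include <stdint.h>

/* Widen a 32-bit value to 64 bits with the upper half filled with
   copies of the sign bit (what C's int -> unsigned long long
   conversion does for the global "a" in the testcase).  */
static uint64_t
widen_sign (int32_t v)
{
  return (uint64_t) (int64_t) v;
}

/* Widen a 32-bit value to 64 bits with the upper half all zero.  */
static uint64_t
widen_zero (int32_t v)
{
  return (uint64_t) (uint32_t) v;
}

/* Apply the 64-bit mask used in the testcase.  */
static uint64_t
and_mask (uint64_t widened)
{
  return widened & UINT64_C (0xffff00ffffffffff);
}
```

Here and_mask (widen_sign (INT32_MIN)) is 0xffff00ff80000000, the value
the test checks for, while and_mask (widen_zero (INT32_MIN)) is
0x80000000.
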