From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
        id E6B3E3861937; Wed, 4 Oct 2023 16:12:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E6B3E3861937
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
        s=default; t=1696435927;
        bh=MgElcFvuw9D1XF62s5t6BY+1Q9JpaQgoDOlddqlsgEM=;
        h=From:To:Subject:Date:In-Reply-To:References:From;
        b=UMsSBf3/X9PTKWAfPxrmHTf10i+R/ecupnFKi1zox2QvKMscK262QPzEVPZY9ThE4
         9hxjQxa0GhkwaU9Ww4cud+LOYZqfgAWvUeUNdc9T9aFjcvzyf+RFCdVTF42Mk8MP/C
         4AtuvGqZCv2jV8jFQkPqAJ076X8Pd0zKluP7BV9U=
From: "cvs-commit at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/110701] [14 Regression] Wrong code at -O1/2/3/s on x86_64-linux-gnu
Date: Wed, 04 Oct 2023 16:12:06 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: needs-bisection, wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: roger at nextmovesoftware dot com
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110701

--- Comment #8 from CVS Commits ---
The master branch has been updated by Roger Sayle:

https://gcc.gnu.org/g:263369b2f7f726a3d4b269678d2c13a9d06a041e

commit r14-4398-g263369b2f7f726a3d4b269678d2c13a9d06a041e
Author: Roger Sayle
Date:   Wed Oct 4 17:11:23 2023 +0100

PR rtl-optimization/110701: Fix SUBREG SET_DEST handling in combine.

This patch is my proposed fix to PR rtl-optimization/110701, a latent bug
in combine's record_dead_and_set_regs_1 exposed by recent improvements to
simplify_subreg.

The issue involves the handling of (normal) SUBREG SET_DESTs, as in the
instruction:

  (set (subreg:HI (reg:SI x) 0) (expr:HI y))

The semantics of this are that the bits specified by the SUBREG are set
to the SET_SRC, y, and that the other bits of the SET_DEST are left/become
undefined.  To simplify the explanation, we'll only consider lowpart
SUBREGs (though in theory non-lowpart SUBREGs could be handled), and
ignore the fact that bits outside of the lowpart WORD retain their
original values (treating these as undefined is a missed optimization
rather than an incorrect-code bug, and only affects targets with words
narrower than 64 bits).

The bug is that combine simulates the behaviour of the above instruction,
for calculating nonzero_bits and set_sign_bit_copies, in the function
record_value_for_reg, by using the equivalent of:

  (set (reg:SI x) (subreg:SI (expr:HI y)))

by calling gen_lowpart on the SET_SRC.

Alas, the semantics of this revised instruction aren't always equivalent
to the original.  In the test case for PR110701, the original instruction

  (set (subreg:HI (reg:SI x) 0)
       (and:HI (subreg:HI (reg:SI y) 0)
               (const_int 340)))

which (by definition) leaves the top bits of x undefined, is mistakenly
considered to be equivalent to

  (set (reg:SI x) (and:SI (reg:SI y) (const_int 340)))

where gen_lowpart's freedom to do anything with paradoxical SUBREG bits
has now cleared the high bits.  The same bug also triggers when the
SET_SRC is, say, (subreg:HI (reg:DI z)), where gen_lowpart transforms this
into (subreg:SI (reg:DI z)), which defines bits 16-31 to be the same as
bits 16-31 of z.
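
The mismatch described above can be made concrete with a small model.
This is only an illustrative sketch in Python, not GCC code: the helper
names (set_lowpart_hi, modelled_nonzero_bits, contradicts) are
hypothetical, and keeping the old upper bits is just one outcome that
"undefined" permits.

```python
HI_MASK = 0xFFFF        # mode mask of a 16-bit (HImode) value
SI_MASK = 0xFFFFFFFF    # mode mask of a 32-bit (SImode) value


def set_lowpart_hi(old_x, src_hi):
    """Model (set (subreg:HI (reg:SI x) 0) src).

    The low 16 bits take src.  The upper 16 bits are undefined; this
    model assumes they simply keep their previous contents, which is
    one of the outcomes the semantics allow.
    """
    return (old_x & SI_MASK & ~HI_MASK) | (src_hi & HI_MASK)


def modelled_nonzero_bits(const_mask):
    """Nonzero-bits mask implied by the SImode rewrite
    (set (reg:SI x) (and:SI (reg:SI y) (const_int const_mask))):
    every bit outside const_mask is claimed to be zero."""
    return const_mask & SI_MASK


def contradicts(value, nonzero_bits):
    """True if value has a bit set that nonzero_bits claims is zero."""
    return (value & ~nonzero_bits & SI_MASK) != 0
```

For example, with old_x = 0xDEAD0000 and y = 0x12345678,
set_lowpart_hi(old_x, y & 340) yields 0xDEAD0050, while
modelled_nonzero_bits(340) claims that only bits within 340 can be set,
so contradicts(0xDEAD0050, 340) is true.  Any later simplification that
trusts the recorded mask would be working from a false premise.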

The fix is that after calling record_value_for_reg, we need to mark the
bits that should be undefined as undefined, in case gen_lowpart, which
performs transforms appropriate for r-values, has changed the
interpretation of the SUBREG when used as an l-value.

2023-10-04  Roger Sayle

gcc/ChangeLog
        PR rtl-optimization/110701
        * combine.cc (record_dead_and_set_regs_1): Split comment into
        pieces placed before the relevant clauses.  When the SET_DEST is
        a partial_subreg_p, mark the bits outside of the updated portion
        of the destination as undefined.

gcc/testsuite/ChangeLog
        PR rtl-optimization/110701
        * gcc.target/i386/pr110701.c: New test case.
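
The idea of the fix (marking the bits outside the updated portion of the
destination as undefined) can be sketched as follows.  This is a minimal
Python model under stated assumptions, not the code in combine.cc: the
function names and the (nonzero_bits, sign_bit_copies) return shape are
hypothetical, and "undefined" is modelled conservatively as "any of
these bits may be nonzero" with no known sign-bit copies.

```python
def mode_mask(bits):
    """All-ones mask for a mode that is `bits` wide."""
    return (1 << bits) - 1


def record_partial_subreg_set(src_nonzero, subreg_bits, reg_bits):
    """Return the (nonzero_bits, sign_bit_copies) pair to record for the
    whole register after a lowpart SUBREG store.

    src_nonzero  -- nonzero-bits mask known for the stored value
    subreg_bits  -- width of the SUBREG's mode (the part written)
    reg_bits     -- width of the full register's mode
    """
    written = mode_mask(subreg_bits)
    whole = mode_mask(reg_bits)
    # Bits inside the written part: trust what is known about the source.
    # Bits outside it: undefined, so assume they may all be nonzero.
    nonzero = (src_nonzero & written) | (whole & ~written)
    # With the upper bits unknown, nothing can be said about sign-bit
    # copies; 1 is the most conservative value in this model.
    sign_bit_copies = 1
    return nonzero, sign_bit_copies
```

With the instruction from the PR, record_partial_subreg_set(340, 16, 32)
gives a nonzero-bits mask of 0xFFFF0154 rather than 340, so the upper
half of x is no longer assumed to be zero.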