From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/111231] [12/13/14 regression] armhf: Miscompilation with -O2/-fno-exceptions level (-fno-tree-vectorize is working)
Date: Tue, 16 Apr 2024 06:46:07 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.2.0
X-Bugzilla-Keywords: wrong-code
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.4
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111231

--- Comment #32 from Richard Biener ---
(In reply to Richard Earnshaw from comment #31)
> While that does seem to fix the bug, it's at the cost of 6 additional stores
> in the problematic test that are redundant other than changing the alias set
> view.

The alternative is to alter the earlier store MEM_ATTRs to use an alias-set
covering both which usually means using alias-set zero.
This will pessimize follow-up optimizations around the store, though it might
be a good trade-off if done only late - I'd say after sched2, but it doesn't
look like there's CSE/DSE after it. So maybe after sched1, which effectively
means after reload, but there's no regular CSE after reload either. The
latest CSE is pass_cse2.

IIRC a minor complication is that the earlier insn isn't readily available:
'dest' is copied/mangled and not necessarily the single original RTX of the
earlier SET_DEST (it's been some time).

OTOH I think that correctness trumps optimization, and if this is the
problematic transform then I don't see many options here. In the place where
CSE applies the transform we'd have to set MEM_ALIAS_SET to zero if the alias
set condition doesn't hold and clear MEM_EXPR if the MEM_EXPR condition
doesn't hold.

Note I can't get the cse.cc code to trigger with the full preprocessed source
and a cross to arm and using

  -O2 -fno-exceptions -march=armv7-a -mfpu=neon-vfpv4 -mfloat-abi=hard -mfp16-format=ieee -fmath-errno

You mention at one point an insn removed by postreload, but that doesn't use
alias_set_subset_of. I also don't remember postreload doing redundant store
removal.
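
The attribute adjustment described above (set MEM_ALIAS_SET to zero when the alias set condition fails, clear MEM_EXPR when the MEM_EXPR condition fails) can be sketched with a toy model. This is not cse.cc code and does not use GCC's RTL types: `toy_mem`, `toy_alias_sets` and `weaken_earlier_store` are hypothetical names, and the two conditions are assumptions about what the checks look like (later alias set equal to or a subset of the earlier one; earlier MEM has no MEM_EXPR or both are equivalent).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy stand-ins for illustration only; these are NOT GCC's RTL types.
using alias_set = int;  // 0 plays the role of "aliases everything"

struct toy_mem {
  alias_set alias;   // models MEM_ALIAS_SET
  std::string expr;  // models MEM_EXPR; empty string means "no MEM_EXPR"
};

// Hypothetical subset relation: children[s] lists the sets recorded as
// subsets of s.  Only direct children are consulted in this sketch.
struct toy_alias_sets {
  std::map<alias_set, std::set<alias_set>> children;

  bool subset_of(alias_set sub, alias_set super) const {
    if (super == 0 || sub == super)
      return true;
    auto it = children.find(super);
    return it != children.end() && it->second.count(sub) != 0;
  }
};

// Called when LATER is about to be dropped as redundant with EARLIER.
// Weakens EARLIER's attributes so it stays valid for both accesses.
// Returns true if EARLIER was changed.
bool weaken_earlier_store(const toy_alias_sets &sets, toy_mem &earlier,
                          const toy_mem &later) {
  bool changed = false;
  // Assumed alias set condition: LATER's set is covered by EARLIER's.
  if (!sets.subset_of(later.alias, earlier.alias)) {
    earlier.alias = 0;
    changed = true;
  }
  // Assumed MEM_EXPR condition: EARLIER has none, or both are equivalent
  // (string equality stands in for a real TBAA comparison).
  if (!earlier.expr.empty() && earlier.expr != later.expr) {
    earlier.expr.clear();
    changed = true;
  }
  return changed;
}

// Usage: a later store with an unrelated alias set forces the fallback.
void example() {
  toy_alias_sets sets;
  toy_mem earlier{5, "s.f"};
  toy_mem later{7, "t.g"};
  bool changed = weaken_earlier_store(sets, earlier, later);
  assert(changed && earlier.alias == 0 && earlier.expr.empty());
}
```

The sketch only weakens attributes and never strengthens them, which matches the "correctness trumps optimization" trade-off: the kept store becomes more conservative for later passes, but no extra stores are emitted.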