From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109441] missed optimization when all elements of vector are known
Date: Thu, 18 May 2023 05:34:46 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109441

--- Comment #4 from Richard Biener ---
(In reply to AK from comment #3)
> > But IMHO it's academic, right?
>
> yes. i was just messing with vector codegen. But in case all the elements of
> a vector/array are same, maybe the loop can be replaced with equivalent
> computation?

Yes. GCC doesn't currently have the ability to constant propagate or value-number defs defined by cycles [that actually iterate].
In general doing that is computationally expensive, and only in very few cases can you short-cut simulating all iterations (final value replacement does this, but the case in this PR is already too complicated because it involves a memory load).

For the case in this PR, even when simplified so that it does not require removing all the C++ abstraction via inlining, we'd need to handle a loopy memory definition and a loopy memory use. The loopy memory def we can probably pattern-match to a memset, and the loopy memory use could be (but currently isn't) identified as always zero. Pass ordering would still stand in the way, though.
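
To make the distinction concrete, here is a minimal sketch. It is not the testcase from the PR; the function names `scalar_cycle` and `memory_cycle` and the array size are made up for illustration. The first function is a scalar cycle whose final value has a closed form, which is the kind of thing final value replacement is meant for. The second has a loopy memory def followed by a loopy memory use, which is the shape described above.

```cpp
#include <cstddef>

// Hypothetical illustration only; not the code from the PR.

// Scalar cycle: after the loop, x equals 3 * n. No memory is involved,
// so the final value can be expressed without simulating the iterations.
unsigned scalar_cycle(unsigned n) {
    unsigned x = 0;
    for (unsigned i = 0; i < n; ++i)
        x += 3;
    return x;
}

// Memory cycle: one loop stores zeros, a second loop loads them back.
int memory_cycle() {
    constexpr std::size_t N = 64;  // arbitrary size chosen for the sketch
    int a[N];
    for (std::size_t i = 0; i < N; ++i)
        a[i] = 0;                  // loopy memory def (memset-like)
    int sum = 0;
    for (std::size_t i = 0; i < N; ++i)
        sum += a[i];               // loopy memory use (every load reads 0)
    return sum;
}
```

Folding `memory_cycle` to a constant would require knowing that every `a[i]` load sees the zero written by the first loop, which is the part that is not currently identified.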