From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id CCFB1386101E; Thu, 15 Apr 2021 07:17:23 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org CCFB1386101E
From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/100089] [11 Regression] 30% performance regression for denbench/mp2decoddata2 with -O3
Date: Thu, 15 Apr 2021 07:17:23 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 11.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: everconfirmed cf_reconfirmed_on bug_status short_desc target_milestone
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 15 Apr 2021 07:17:23 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100089

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-04-15
             Status|UNCONFIRMED                 |NEW
            Summary|[11 Performance regression  |[11 Regression] 30%
                   |] 30% for                   |performance regression for
                   |denbench/mp2decoddata2 with |denbench/mp2decoddata2 with
                   |-O3                         |-O3
   Target Milestone|---                         |11.0

--- Comment #1 from Richard Biener ---
Indeed loop vectorization throws if-converted bodies at the BB vectorizer as a
last resort
(because BB vectorization doesn't do if-conversion itself).  But the BB
vectorizer then uses the if-converted scalar code as the thing to cost against
(costing against the not if-converted loop body isn't really possible).

To quote

          /* If we applied if-conversion then try to vectorize the
             BB of innermost loops.
             ???  Ideally BB vectorization would learn to vectorize
             control flow by applying if-conversion on-the-fly, the
             following retains the if-converted loop body even when
             only non-if-converted parts took part in BB vectorization.  */
          if (flag_tree_slp_vectorize != 0
              && loop_vectorized_call
              && ! loop->inner)
            {

As a "hack" we could try to scalar-cost the always executed part of the not
if-converted loop body and apply the full bias of this cost vs. the scalar
cost of the if-converted body to the scalar cost of the BB vectorization.
But that's really apples-to-oranges in the end (as it is now).

Maybe we can cost the whole partly vectorized loop body in this mode and
compare it against the scalar cost of the original loop.  But even the loop
vectorizer costs the if-converted scalar loop, so it is off as well.

Long-term if-conversion needs to be integrated with vectorization so we can
at least keep track of which stmts were originally executed conditionally and
which were not.

Short-term I'm not sure we can do much.  Doing SLP on the if-converted body
does help in quite some cases.
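
For illustration only, here is a rough sketch in plain C of why the if-converted body is a skewed scalar baseline.  The loop, the function names and the one-unit-per-statement cost model are all made up for this example and have nothing to do with GCC's actual cost hooks: the point is just that the if-converted form executes every statement on every iteration, so its scalar cost is higher than that of the original loop whenever the condition is rarely true.

```c
#include <stddef.h>

/* Original loop body: the multiply/store on b[] runs only when c[i] != 0. */
void loop_original(int *a, int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i] + 1;            /* always executed */
        if (c[i])
            b[i] = b[i] * 3;        /* conditionally executed */
    }
}

/* Hand-written if-converted form: the branch becomes a select and every
   statement runs on every iteration.  This sketch assumes b[i] is always
   safe to read and write.  */
void loop_if_converted(int *a, int *b, const int *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i] + 1;
        int t = b[i] * 3;           /* now computed unconditionally */
        b[i] = c[i] ? t : b[i];     /* select instead of control flow */
    }
}

/* Hypothetical cost model (an assumption, not GCC's): one unit per scalar
   statement executed.  Original body: add + test, plus the multiply only
   when the condition holds.  */
unsigned cost_original(const int *c, size_t n)
{
    unsigned cost = 0;
    for (size_t i = 0; i < n; i++)
        cost += 2u + (c[i] ? 1u : 0u);
    return cost;
}

/* Same hypothetical model for the if-converted body: add + multiply +
   select, on every iteration regardless of the condition.  */
unsigned cost_if_converted(size_t n)
{
    return 3u * (unsigned) n;
}
```

Both loops compute the same result, but under this toy model the if-converted body is charged for the multiply even on iterations where the original loop skips it.  A vector cost compared against that inflated scalar number can look profitable while still losing against the original loop.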