From: "gcc at rabensky dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/104515] New: trivially-destructible destructors interfere with loop optimization - maybe related to lifetime-dse.
Date: Sat, 12 Feb 2022 20:16:20 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104515

            Bug ID: 104515
           Summary: trivially-destructible destructors interfere with loop
                    optimization - maybe related to lifetime-dse.
           Product: gcc
           Version: og11 (devel/omp/gcc-11)
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc at rabensky dot com
  Target Milestone: ---

This issue started in GCC-9.1, but a change in GCC-11 made it worse.
It didn't exist in GCC-7.1 to GCC-8.5.

Short description:
-----------------
When we have a loop that can be optimized out, calling the destructor for a
trivially-destructible type will prevent the optimization starting from
GCC-9.1.

These are loops that were correctly optimized out in GCC-7.1 to GCC-8.5.

This bug doesn't happen if we set -fno-lifetime-dse.

Interestingly enough - a non-trivial destructor doesn't necessarily prevent
the optimization.

How this became worse in GCC-11:
-------------------------------
In GCC-11 this also applies to calling the destructor of basic types (int,
long etc.)

So loops that optimized in GCC-7.1 to GCC-10.3 no longer optimize.

Short reproducing example:
-------------------------
NOTE: No `include`s are needed

```
using T = int;

struct Vec {
  T* end;
};

void pop_back_many(Vec& v, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    --v.end;
    v.end->~T();
  }
}
```

compiled with `-O3 -Wall`

In GCC-7 to GCC-10, `pop_back_many` optimizes out the loop (becomes
`v.end-=n`).

In GCC-11, the loop remains.

See https://godbolt.org/z/vTexxhxP9

NOTE that adding `-fno-lifetime-dse` will re-enable the loop optimization.

Why this matters
----------------
This prevents optimization of a loop over `std::vector::pop_back()`, which
is a very common use case!

Loops that optimize out in GCC-7.1 to GCC-10.3 will suddenly not optimize in
GCC-11.1/2, making existing code run MUCH slower! (O(n) instead of O(1))

NOTE: std::vector::resize is a lot slower than a loop over pop_back. A loop
over pop_back is currently the most efficient way to do pop_back_many!

More complete reproducing example:
---------------------------------
- We can replace the type `T` with a class that is trivially destructible.
  **In that case, the problem exists in previous versions of GCC as well**

- We can replace the type `T` with a class that has a user-supplied destructor.
  **In that case, the loop correctly optimizes out if possible**

Actual examples:
https://godbolt.org/z/7WqTPq3cE

compiled with `-O3 -Wall`

```
template <typename T>
struct Vec {
  T* end;
};

template <typename T>
void pop_back_many(Vec<T>& v, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    --v.end;
    v.end->~T();
  }
}

struct TrivialDestruct {
  ~TrivialDestruct() = default;
};

struct NoopDestruct {
  ~NoopDestruct() {}
};

unsigned count = 0;
struct CountDestruct {
  ~CountDestruct() { ++count; }
};

// Here loop optimization fails in GCC-11.1-11.2
// But succeeds in GCC 7.1-10.3
//
// NOTE that adding -fno-lifetime-dse re-enables the optimization
template void pop_back_many(Vec<int>&, unsigned);

// Here loop optimization fails in GCC-9.1-11.2
// But succeeds in GCC 7.1-8.5
//
// NOTE that adding -fno-lifetime-dse re-enables the optimization
template void pop_back_many(Vec<TrivialDestruct>&, unsigned);

// Here loop optimization succeeds in all versions
//
// NOTE that it's surprising that a no-op destructor can be optimized
// but a trivial destructor can't
template void pop_back_many(Vec<NoopDestruct>&, unsigned);

// Here loop optimization succeeds in all versions
//
// NOTE that it's surprising that a destructor with an action
// can be optimized, but a trivial destructor can't
template void pop_back_many(Vec<CountDestruct>&, unsigned);
```
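
For reference, here is a minimal sketch of the equivalence the optimizer is expected to find for `T = int`: the loop from the short example next to the closed form `v.end -= n` mentioned above. The names `pop_back_many_loop`, `pop_back_many_direct` and `same_end_pointer` are made up for this illustration and are not part of the report or the godbolt links. The check only covers observable behaviour. It says nothing about the code a particular GCC version generates.

```cpp
#include <cassert>

using T = int;

struct Vec {
  T* end;
};

// The loop from the short reproducing example: step back one element and
// call the (pseudo-)destructor, which has no observable effect for int.
void pop_back_many_loop(Vec& v, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    --v.end;
    v.end->~T();
  }
}

// The closed form that the loop is expected to collapse into.
void pop_back_many_direct(Vec& v, unsigned n) {
  v.end -= n;
}

// Hypothetical helper for this sketch: runs both versions from the same
// starting position and reports whether `end` lands on the same address.
bool same_end_pointer(unsigned n) {
  T buffer[16] = {};
  assert(n <= 16);  // stay inside the buffer
  Vec a{buffer + 16};
  Vec b{buffer + 16};
  pop_back_many_loop(a, n);
  pop_back_many_direct(b, n);
  return a.end == b.end;
}
```

Calling `same_end_pointer(n)` for any `n` up to 16 should return `true`, which is why an O(n) loop where an O(1) subtraction would do is a missed optimization and not a correctness problem.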