From: "freddie at witherden dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/95264] Infinite Loop When Compiling Templated C++ code at -O1 and above
Date: Fri, 22 May 2020 11:49:25 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95264

--- Comment #8 from Freddie Witherden ---
(In reply to rguenther@suse.de from comment #7)
>
> Instead of [[gnu::flatten]] you could use the
> __attribute__((always_inline)) attribute on the foo function definition
> if you didn't simplify the outline above too much to make that
> infeasible.  IIRC we do not have sth like
>
>   [[gnu::inline]] foo(i, ...);
>
> to force inlining of a specific call, nor [[gnu::noinline]] foo(i, ...);
> both which seem useful.  Not sure if the C++ syntax would support
> such placement of an attribute of course.

So this is exactly what we had in the pre-flatten version of the code:

https://github.com/PyFR/Polyquad/commit/f24366c059d2d693222985cdd9333238bd909ad3

The issue was that while GCC would inline the annotated functions, it would go
no further.  As such, if I recall correctly, all of the constructor calls to
the relatively simple Eigen vector types were no longer inlined.  Thus a line
of code which should translate into a few register-to-memory mov instructions
results in a constructor call, an assignment call, and some cleanup.  Since I
could not add the force inline attribute to the library types, I went in
search of an alternative.

For the T = bfloat eval_orthob instance, is the
"if (std::is_fundamental<T>::value)" considered before the body is inlined?
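
To make the always_inline versus flatten distinction above concrete, here is a
minimal sketch.  It is not taken from Polyquad: Vec3, step_forced, step_plain,
sum_always_inline and sum_flatten are all hypothetical names, and Vec3 merely
stands in for a small library vector type whose members I cannot annotate.

```cpp
// Hypothetical stand-in for a small library vector type (not Eigen itself).
// Its constructor and operator+ carry no inlining attributes, mirroring the
// situation where the library types cannot be annotated.
struct Vec3 {
    double x, y, z;
    Vec3(double a, double b, double c) : x(a), y(b), z(c) {}
    Vec3 operator+(const Vec3& o) const {
        return Vec3(x + o.x, y + o.y, z + o.z);
    }
};

// always_inline applies to this function only: it is inlined into its
// callers, but the Vec3 calls it makes are still left to the inliner's
// usual heuristics.
__attribute__((always_inline)) inline Vec3 step_forced(const Vec3& v, double s) {
    return v + Vec3(s, s, s);
}

// Same body with no attribute; used from the flattened caller below.
inline Vec3 step_plain(const Vec3& v, double s) {
    return v + Vec3(s, s, s);
}

// Caller relying on always_inline on the callee.
double sum_always_inline(int n) {
    Vec3 acc(0.0, 0.0, 0.0);
    for (int i = 0; i < n; ++i)
        acc = step_forced(acc, static_cast<double>(i));
    return acc.x + acc.y + acc.z;
}

// flatten is placed on the caller: GCC tries to inline every call in this
// body, including the calls made by the callees, where that is possible.
[[gnu::flatten]] double sum_flatten(int n) {
    Vec3 acc(0.0, 0.0, 0.0);
    for (int i = 0; i < n; ++i)
        acc = step_plain(acc, static_cast<double>(i));
    return acc.x + acc.y + acc.z;
}
```

Both functions compute the same value; only the inlining requests differ.
Comparing the output of g++ -O1 -S for the two callers is one way to check
whether the Vec3 constructor and operator+ calls survive in either case.  For
a type this small the heuristics will probably inline everything regardless,
so the sketch shows where the attributes go rather than reproducing the
behaviour I saw.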