From: "anlauf at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug fortran/30409] [fortran] missed optimization with pure function arguments
Date: Sun, 22 Oct 2023 19:18:42 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30409

--- Comment #8 from anlauf at gcc dot gnu.org ---
The suggested optimization needs to take into account that the evaluation of
the temporary expression might trap, or that allocatable variables might not
be allocated, etc. In the non-hoisted variant, such a failure would not occur
if the trip count of the loop is zero, so we need to make sure not to generate
failing code for the hoisted one.
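
To make the zero-trip-count concern concrete, here is a minimal sketch in C
(not gfortran output). The function names and the use of integer division as
the "may trap" expression are assumptions chosen purely for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a loop-invariant expression that may trap:
   integer division by zero is undefined behaviour in C. */
static int invariant(int x, int y) { return x / y; }

/* Non-hoisted variant: invariant() is only evaluated if the body runs. */
static long sum_plain(int n, int x, int y) {
    long s = 0;
    for (int i = 0; i < n; i++)
        s += invariant(x, y) + i;
    return s;
}

/* Hoisted variant: the evaluation is moved out of the loop, but kept
   under a trip-count guard so a zero-trip loop still evaluates nothing. */
static long sum_hoisted(int n, int x, int y) {
    long s = 0;
    if (n > 0) {
        const int t = invariant(x, y);  /* evaluated once */
        for (int i = 0; i < n; i++)
            s += t + i;
    }
    return s;
}

/* Small usage check: both variants agree, including for n == 0 with a
   divisor that would trap if the expression were evaluated. */
static void demo(void) {
    assert(sum_plain(3, 10, 2) == sum_hoisted(3, 10, 2));
    assert(sum_hoisted(0, 10, 0) == 0);
}
```

Without the `n > 0` guard, `sum_hoisted(0, 10, 0)` would divide by zero even
though the original loop never executes its body.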

Similarly, for conditional code in the loop body, like

  if (cond) then
     expression1 (..., 1/y)
  else
     expression2 (..., 1/z)
  end if

where cond protects from traps even for non-zero trip counts, these conditions
may also need to be identified, and an appropriately guarded block generated.

Some HPC compilers have directives (MOVE/NOMOVE) to annotate the respective
loops, and corresponding compiler options that are enabled only at aggressive
optimization levels for real-world code.

I wonder how much (or little) really needs to be done here, or if the task
can be split in a suitable way between FE and ME.

The tree-dump shows a __builtin_malloc/__builtin_free for the temporary
*within* the i-loop. Would it be possible to move this *management* just
one loop level up, if the size of the temporary is known to be constant?
(Which is the case here.) That is, attach it to the outer scope?
Maybe the middle end then better "sees" what can reasonably be done?
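
As a rough illustration of moving only the allocation management one loop
level up, here is a sketch in C. It is an analogue, not the code gfortran
generates; TMP_LEN, the function names and the loop contents are hypothetical,
and the temporary's size is assumed to be a compile-time constant.

```c
#include <assert.h>
#include <stdlib.h>

enum { TMP_LEN = 4 };  /* hypothetical constant size of the temporary */

/* Current shape: the temporary is allocated and freed inside the i-loop. */
static double alloc_inside(int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        double *tmp = malloc(TMP_LEN * sizeof *tmp);
        if (tmp == NULL) abort();
        for (int k = 0; k < TMP_LEN; k++) tmp[k] = (double)(i + k);
        for (int k = 0; k < TMP_LEN; k++) total += tmp[k];
        free(tmp);
    }
    return total;
}

/* Proposed shape: only the malloc/free pair is attached to the outer
   scope; the temporary is still refilled in every iteration. */
static double alloc_outside(int n) {
    double total = 0.0;
    if (n <= 0) return total;  /* zero-trip case performs no allocation */
    double *tmp = malloc(TMP_LEN * sizeof *tmp);
    if (tmp == NULL) abort();
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < TMP_LEN; k++) tmp[k] = (double)(i + k);
        for (int k = 0; k < TMP_LEN; k++) total += tmp[k];
    }
    free(tmp);
    return total;
}

/* Small usage check: both shapes compute the same result. */
static void demo_alloc(void) {
    assert(alloc_inside(3) == alloc_outside(3));
}
```

With the allocation outside the loop, what remains inside is ordinary
straight-line code on a fixed buffer, which may be easier for later passes to
analyse. Whether that actually helps the middle end is the open question here.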