From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48) id 3BC73385781F; Wed, 12 Jul 2023 21:27:38 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3BC73385781F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689197258; bh=fZmrX7x/tz4zGXAt0yvGj2sSi/vlPHsURD3dJ6JYc1k=; h=From:To:Subject:Date:From; b=pKkZRuz9TCjcEOMDHCQpfumSPyNUtL5+UG6UXAVQLS7E1Aar2H9gXKEIr5I1hegnl y9ybmd+w17iKi7rwRJuX3LlODBG7uR8HUdd5hA4oNVCnHdaLX2ETC5+qy9rYhneuaI OoJ7r2aoifwPcVe//c8EyghPfQOghu863HyeycWM=
From: "cfsteefel at arista dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/110648] New: Missed optimization for small returned optional leads to redundant memory accesses
Date: Wed, 12 Jul 2023 21:27:37 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c++
X-Bugzilla-Version: 11.3.1
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cfsteefel at arista dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110648

            Bug ID: 110648
           Summary: Missed optimization for small returned optional leads to
                    redundant memory accesses
           Product: gcc
           Version: 11.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cfsteefel at arista dot com
  Target Milestone: ---

The following code

#include <optional>
std::optional< int > foo( int x ) {
    return 1;
}

produces
x86_64 assembly which does two stores to the stack and then a load into rax,
rather than simply operating directly on rax, i.e.:

foo(int):
        mov     DWORD PTR [rsp-8], 1
        mov     BYTE PTR [rsp-4], 1
        mov     rax, QWORD PTR [rsp-8]
        ret

clang produces much more direct code:

foo(int):                                # @foo(int)
        movabs  rax, 4294967297
        ret

Since the value is always returned in rax, as the optional is small enough
(less than two registers wide), there is no reason for the memory accesses
here.

The code can be improved by naming the returned object, but this breaks down
again if there are any conditionals, i.e.:

std::optional< int > foo( int x ) {
    std::optional< int > ret = 1;
    return ret;
}

produces better code, but there is no way to get this better code once a
branch is introduced, i.e.

std::optional< int > foo( int x ) {
    std::optional< int > ret = 1;
    if ( x < 1 ) {
        return std::nullopt;
    }
    return ret;
}

The same applies with Godbolt's trunk version of gcc, as well as gcc11.3.
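
For reference, the constant 4294967297 in the clang output is 0x100000001:
the low four bytes hold the int value 1 and the fifth byte holds the
"engaged" flag, which matches the two stores gcc emits. The sketch below is
only an illustration of that packing, not compiler or library code.
packed_optional_image is a hypothetical helper, and it assumes a
little-endian target, the layout implied by the gcc stores (int payload at
offset 0, flag at offset 4), and zeroed padding bytes.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>

// The whole object fits in one 8-byte register on x86_64.
static_assert(sizeof(std::optional<int>) == 8,
              "sketch assumes an 8-byte std::optional<int>");

// Hypothetical helper: builds the 8-byte image that would end up in rax.
// Assumed layout: bytes 0..3 = int payload, byte 4 = engaged flag,
// bytes 5..7 = padding (zeroed here; a real object leaves them unspecified).
std::uint64_t packed_optional_image(int value, bool engaged) {
    unsigned char bytes[8] = {};                 // zero everything, padding included
    std::memcpy(bytes, &value, sizeof value);    // payload at offset 0
    bytes[4] = engaged ? 1 : 0;                  // engaged flag at offset 4
    std::uint64_t image;
    std::memcpy(&image, bytes, sizeof image);    // reinterpret as one 64-bit value
    return image;
}
```

Under those assumptions, packed_optional_image(1, true) yields 4294967297,
the immediate clang loads with a single movabs, so both stores and the
reload in the gcc output could in principle fold into that one constant.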