From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 6E20D385700F; Sat, 27 May 2023 18:47:04 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6E20D385700F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1685213224;
	bh=XdTafhHJPJHeAW18UtrNi8Md7qVFwn6qE/qED3moQLU=;
	h=From:To:Subject:Date:From;
	b=PWf4iCVfsaQE3ZuNufrXlVL36N/SPRb8kLHkcTBNvJrCaTyP+WQYoD2DAY68Bcgo1
	 CaDlC/zGrf3wGWMRnmnh+lAM7U6lQxV+7ZnrP7WXSRGI0bCMA52N5gpxU/HC9/yLxO
	 Q1pSNI+9RkWtgugmRh1fwUIbFaUgUu55mrghbeEI=
From: "lh_mouse at 126 dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/110008] New: early returns from functions result in suboptimal code
Date: Sat, 27 May 2023 18:47:04 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: lh_mouse at 126 dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110008

            Bug ID: 110008
           Summary: early returns from functions result in suboptimal code
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lh_mouse at 126 dot com
  Target Milestone: ---

https://gcc.godbolt.org/z/94Wf3Worq

```
int complex_one(int, int);

int test(int a, int b, int c)
{
  if(__builtin_expect(a, 0) == 0)
    return 0;

  int r = complex_one(a, b);
  r += complex_one(r, c);
  return r + a + b;
}
```

GCC:
```
test:
        push    rdi
        push    rsi
        push    rbx
        sub     rsp, 32
        mov     ebx, ecx
        mov     esi, edx
        test    ecx, ecx
        jne     .L7
        mov     eax, ebx
        add     rsp, 32
        pop     rbx
        pop     rsi
        pop     rdi
        ret
.L7:
        mov     DWORD PTR 80[rsp], r8d
        call    complex_one
        mov     edx, DWORD PTR 80[rsp]
        mov     ecx, eax
        mov     edi, eax
        call    complex_one
        add     edi, eax
        add     ebx, edi
        add     ebx, esi
        mov     eax, ebx
        add     rsp, 32
        pop     rbx
        pop     rsi
        pop     rdi
        ret
```

Clang:
```
test:                                   # @test
        xor     eax, eax
        test    edi, edi
        jne     .LBB0_1
        ret
.LBB0_1:
        push    rbp
        push    r15
        push    r14
        push    rbx
        push    rax
        mov     r14d, edx
        mov     ebx, esi
        mov     ebp, edi
        call    complex_one@PLT
        mov     r15d, eax
        mov     edi, eax
        mov     esi, r14d
        call    complex_one@PLT
        add     ebx, ebp
        add     ebx, r15d
        add     ebx, eax
        mov     eax, ebx
        add     rsp, 8
        pop     rbx
        pop     r14
        pop     r15
        pop     rbp
        ret
```

There are two issues in this code:

The first is that GCC apparently uses more space for temporary variables than Clang.

The other is that when `a` equals zero, Clang skips the normal function prologue, which pushes a lot of registers onto the stack. GCC performs the check only after the prologue, so in that case both the prologue and the epilogue are executed for nothing.
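
To make the second issue concrete, here is a minimal sketch of what Clang's output corresponds to at the source level: the `a == 0` check sits in a thin wrapper, and everything that needs values kept across the calls lives in a separate out-of-line function. The names `test_slow`, `test_split` and `check_same_result` are hypothetical and not part of the report, and the body of `complex_one` is a made-up stand-in so the example links. Whether a particular GCC version then emits a prologue-free fast path for the wrapper is an assumption that has to be checked in the generated assembly.

```c
#include <assert.h>

/* Stand-in for the external complex_one() from the report; its real
   body is unknown, so this definition is purely an assumption. */
int complex_one(int x, int y) { return x * 2 + y; }

/* The function as written in the report. */
int test(int a, int b, int c)
{
  if(__builtin_expect(a, 0) == 0)
    return 0;

  int r = complex_one(a, b);
  r += complex_one(r, c);
  return r + a + b;
}

/* Hypothetical out-of-line slow path: every value that must survive
   the two calls lives here, so the register saves can stay here too. */
__attribute__((noinline))
static int test_slow(int a, int b, int c)
{
  int r = complex_one(a, b);
  r += complex_one(r, c);
  return r + a + b;
}

/* Hypothetical wrapper: the early-return check itself does not need
   any callee-saved registers. */
int test_split(int a, int b, int c)
{
  if(__builtin_expect(a, 0) == 0)
    return 0;
  return test_slow(a, b, c);
}

/* The split version must return the same value as the original. */
void check_same_result(int a, int b, int c)
{
  assert(test(a, b, c) == test_split(a, b, c));
}
```

Calling `check_same_result` from a small `main` confirms that the split only moves code around and does not change the result. It is meant as an illustration of the expected code shape, not as a suggested fix for the compiler.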