From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 65DB73858C60; Thu, 25 Jan 2024 08:48:25 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 65DB73858C60
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1706172506;
	bh=X7uKMFDmnYvtP8k59UY56Jl2rkXmyeiYxqhhHP5f6ZY=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=gkXgttgDLDe9w8lKeJwe/2Gkn+coTTPrVf33X+A+17r9mret9ftMEK2tMOEHr/j1g
	 fOmyYrBFG6UkkgJTFESNW/b/kBhiKBoFjTAs7NnNFV9aW7FW668Xa27OB27vsOvTGF
	 FM6MGGaxTe9gQ+Q1x8ppTv6uuUM814F8gN6c4xCE=
From: "pinskia at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/90582] AArch64 stack-protector wastes an instruction on
 address-generation
Date: Thu, 25 Jan 2024 08:48:14 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 8.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: pinskia at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_severity
Message-ID: <bug-90582-4-hMkcD4DxGG@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-90582-4@http.gcc.gnu.org/bugzilla/>
References: <bug-90582-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90582

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #1)
> > I assume EOR / CBNZ is as at least as efficient as SUBS / BNE on
> > all/most AArch64 microarchitectures, but someone should check.
>
> It is similar as x86 with that respect on some cores (Marvell's cores
> mostly).
> That is ThunderX, ThunderX 2 and OcteonTX and OcteonTX2 all have the ability
> to do macro-combining of the two instructions into one micro-op.

Even on most non-Marvell cores now, subs/bne is better than eor/cbnz.
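
For illustration only, here is a rough C model of the two check sequences being
compared. The helper names are hypothetical and this is not GCC code; it only
shows that both forms detect a clobbered canary in the same cases, so the
choice between them is about cost, not correctness.

```c
#include <stdint.h>

/* Hypothetical helpers, not taken from GCC.
   Each returns 1 when the saved canary differs from the guard value. */

/* Models eor + cbnz: XOR is zero only when both values are equal. */
static int mismatch_xor(uint64_t saved, uint64_t guard)
{
    return (saved ^ guard) != 0;
}

/* Models subs + bne: unsigned subtraction is zero only when both
   values are equal (wraparound never produces a false zero). */
static int mismatch_sub(uint64_t saved, uint64_t guard)
{
    return (saved - guard) != 0;
}
```

Both helpers agree for every pair of inputs.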


Anyway, starting with GCC 10.3/9.4 we get:
        ldr     x2, [x0]
        subs    x1, x1, x2
        mov     x2, 0
        bne     .L5

Which we can't fuse anyway, since the mov sits between the subs and the bne.
I wonder if we should clobber x1 too.


As for the -fomit-frame-pointer issue, it is not really an issue: only
-momit-leaf-frame-pointer is turned on by default, and the function is now NOT
a leaf function because of the call to __stack_chk_fail.

>        mov     x1,0                            # and destroy the reg
>        mov     w1, 3                           # right before it's already destroyed

This is by design. GCC does not go back and check whether the zeroing could be
removed, because deleting it by accident might introduce a "security hole".
Always emitting it prevents that.


As for the other issue, the address formation, it is a small missed
optimization, and fixing it might not help in general or at all.