From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 860273856DC5; Wed, 24 Aug 2022 07:13:51 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 860273856DC5
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1661325231;
	bh=1yvcrNIxHRqHQA4d1vQChOZIsm12JhQfKU9PeHl1fl0=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=Y2MQNHdPxojD4jEKeCyMI3yCJzPYVBC418/Tw2C9Ie9gjPIsQ1UE0zdB7G7K8pbBt
	 Z2z0DcgJY7K70DeuZVPuHMkVb7DO92OKQk9vPltHUtNg8ueufz4n6av9M5/66c07+G
	 xGO7u28UUEWvWhQjUOpyxrs54sNw3IdOU/rN9KZI=
From: "linkw at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/99888] Add powerpc ELFv2 support for
 -fpatchable-function-entry*
Date: Wed, 24 Aug 2022 07:13:50 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: normal
X-Bugzilla-Who: linkw at gcc dot gnu.org
X-Bugzilla-Status: ASSIGNED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: linkw at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-99888-4-hHnrypJXq4@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99888-4@http.gcc.gnu.org/bugzilla/>
References: <bug-99888-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888
--- Comment #10 from Kewen Lin <linkw at gcc dot gnu.org> ---
Searching the history of this feature, I found that its initial versions,
such as v2[1], only proposed placing nops after the function entry. It was
then requested to be more generic, to handle an "exploited atomically"
requirement on RISC arches. Please see the content quoted below, posted by
Jose for SPARC[2], and more in [3] on extending it to preceding nops.

"
  How is this supposed to be exploited atomically in RISC arches such as
  sparc?  In such architectures you usually need to patch several
  instructions to load an absolute address into a register.

  If a general mechanism is what is intended I would suggest to offer the
  possibility of extending the nops _before_ the function entry point,
  like in:

  (a) nop   ! Load address
      nop   ! Load address
      nop   ! Load address
      nop   ! Load address
      nop   ! Jump to loaded address.
  entry:
  (b) nop   ! PC-relative jump to (a)
      save %sp, bleh, %sp
      ...

  So after the live-patcher patches the loading of the destination address
  and the jump, it can atomically patch (b) to effectively replace the
  implementation of `entry'.
"

So placing just one nop after the function entry, and leaving multiple nops
to be patched before the function entry, was meant to let it be exploited
atomically.
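
For illustration, the layout quoted above can be requested with the existing
patchable_function_entry attribute (or -fpatchable-function-entry=N,M), where
N is the total nop count and M is how many of them go before the entry. This
is only a minimal sketch, assuming a target where the attribute is already
supported (e.g. x86-64); the function name is made up.

```c
/* Hypothetical example function.  N = 6 nops in total, M = 5 of them
   placed before the entry label, which leaves 1 nop at the entry
   itself: the five slots at (a) and the single slot at (b) in the
   quoted layout.  */
__attribute__ ((patchable_function_entry (6, 5)))
int
patched_add (int a, int b)
{
  /* The nops do not change behaviour until a live-patcher rewrites
     them, so this still just returns the sum.  */
  return a + b;
}
```

Compiling this with -S should show the five nops ahead of the function label
and one after it, and a call such as patched_add (1, 2) behaves as usual.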

I'm not sure whether there will be this kind of requirement for future uses
of this feature on ppc64le. If we assume there is, we need to consider
whether the current proposal can support it, and support it easily.

Proposal 1) in #c1 is to place nops before and after the local entry point.
It allows three counts of preceding nops: 2, 6 and 14. IMHO, a count of 14
seems to be enough for most cases? But people can complain that it isn't
flexible enough for arbitrary counts, and it can take more space if the
required count doesn't exactly match an allowed count. Besides, it makes for
a bad user experience when users port their working cases here but get
errors.
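
To make the size concern concrete, here is a small sketch of how a required
count would map onto the three allowed counts under proposal 1). The helper
name, the -1 error convention and the handling of a zero count are made up
for illustration; only the counts 2, 6 and 14 come from the proposal.

```c
#include <stddef.h>

/* Preceding-nop counts allowed by proposal 1) in #c1.  */
static const int allowed_counts[] = { 2, 6, 14 };

/* Hypothetical helper: return the smallest allowed count that can hold
   REQUIRED preceding nops, or -1 if none can (the "users get errors"
   case).  Assumption: a request for 0 preceding nops needs no padding
   and is always fine.  */
static int
round_up_preceding_nops (int required)
{
  size_t n = sizeof allowed_counts / sizeof allowed_counts[0];

  if (required < 0)
    return -1;
  if (required == 0)
    return 0;
  for (size_t i = 0; i < n; i++)
    if (allowed_counts[i] >= required)
      return allowed_counts[i];
  return -1;
}
```

With this, a case needing 7 preceding nops is rounded up to 14, so 7 slots
are wasted, and a case needing 15 is rejected.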

The proposal in #5 has no restriction on the count of preceding nops, which
is a very good thing. The main concern is what Segher pointed out: the
patched nops are assumed to be consecutive. In the initial versions this is
more explicit, as the option name is "prolog-pad". Also, the separated nop
sequences are for different function entry points, not for "the" function
entry.

To offer more flexibility to users, the proposal in #5 looks better, but it
requires a documentation update describing the particularity of ppc64le,
that is, the dual entries and how the patching works.