Date: Tue, 9 Aug 2022 16:10:14 -0500
From: Segher Boessenkool
To: "Kewen.Lin"
Cc: GCC Patches, David Edelsohn, amodra@gmail.com
Subject: Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]
Message-ID: <20220809211013.GT25951@gate.crashing.org>
References: <20220809103504.GS25951@gate.crashing.org>

Hi!

On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote:
> on 2022/8/9 18:35, Segher Boessenkool wrote:
> >> +  /* As ELFv2 ABI shows, the allowable bytes past the global entry
> >> +     point are 0, 4, 8, 16, 32 and 64.  Considering there are two
> >> +     non-prefixed instructions for global entry (8 bytes), the count
> >> +     for patchable NOPs before local entry would be 2, 6 and 14.  */
> >
> > The other option is to allow other numbers of nops, but in that case not
> > have a local entry point (so, always use the global entry point).
>
> Good point, it's doable, but it means for the other counts of NOPs, the
> patched function has to pay the cost of TOC initialization all the time,
> IMHO it may not be what we want.

It isn't very expensive: the main benefit of the LEP is not avoiding
those two insns, but having the r2 setter earlier, allowing loads via
the TOC reg to execute earlier.

> > I don't know if that is useful for any users of this support (if there
> > even are such users :-P )
>
> Yeah, as the discussions in PR98125, powerpc linux kernel doesn't adopt
> this feature.  :-P

Right, -mprofile-kernel is more efficient.

So maybe just say in the comment that it is possible to support those
other nop pad sizes, by not doing a LEP at all?  Instead of saying it
cannot be done :-)

> >
> >> +      if (patch_area_entry > 0)
> >> +        {
> >> +          if (patch_area_entry != 2
> >> +              && patch_area_entry != 6
> >> +              && patch_area_entry != 14)
> >> +            error ("for %<-fpatchable-function-entry=%u,%u%>, patching "
> >> +                   "%u NOP(s) before function entry is invalid, it can "
> >> +                   "cause assembler error",
> >
> > I would not say "it can [etc.]" at all.  Oh, and "NOP" (capitals) isn't
> > a thing, it is not an acronym or such ;-)
> >
>
> Poor at wording. :(  Could you help to suggest some words here?

I'll try...

  "unsupported number of nops before function entry (%u)"

> >> +/* { dg-require-effective-target powerpc_elfv2 } */
> >> +/* Specify -mcpu=power9 to ensure global entry is needed.  */
> >> +/* { dg-options "-mdejagnu-cpu=power9" } */
> >
> > Why would it be needed for p9, and not older, or newer?
> >
>
> It can be p8 or p9, but not p10 and later.
>
> It's meant to exclude pc-relative feature which can make the case not
> generate a global entry point prologue and the test point will become
> unavailable.  I thought about adding -mno-pcrel, but guessed it's safer
> to use one cpu type which doesn't support pcrel at all, since it can
> exclude all possibilities that pcrel gets re-enabled.
>
> Do you think -mno-pcrel is more elegant and relatively safe?
> Or just update the comments to make it more meaningful?

Just use  { ! powerpc_pcrel }  ?  I don't think you can put that in a
dg-require-effective-target, but you can do for example
  dg-do compile { target { ! powerpc_pcrel } }
or similar.

Direct things are always much preferred.  There should always be a
comment saying what a non-obvious restriction is for, and it will be
simple and boring then (the code already says that pcrel is not okay,
just add a word or two "no TOC etc. with pcrel" or whatever :-)


Segher
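
For reference, the arithmetic behind the 2, 6 and 14 in the quoted comment can be sketched as below. This is only an illustration and not code from the patch: the helper name `nops_allow_local_entry` is made up, and it assumes 4-byte nops plus the 8-byte (two insn) global entry sequence that the comment describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of the patch): return true if NOPS
   4-byte nops placed after the two global entry insns still leave the
   local entry point at an offset the quoted comment lists as allowed.  */
static bool
nops_allow_local_entry (unsigned nops)
{
  /* Offsets past the global entry point, per the quoted comment.  */
  static const unsigned allowed[] = { 0, 4, 8, 16, 32, 64 };

  /* Anything above 14 nops would put the LEP past offset 64; checking
     this first also avoids unsigned wraparound below.  */
  if (nops > 14)
    return false;

  /* Two global entry insns (8 bytes) plus the nops themselves.  */
  unsigned offset = 8 + 4 * nops;

  for (size_t i = 0; i < sizeof allowed / sizeof allowed[0]; i++)
    if (allowed[i] == offset)
      return true;
  return false;
}
```

With those assumptions the only nonzero counts that pass are 2, 6 and 14 (offsets 16, 32 and 64); every other count is a case where the function would need to go without a LEP.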