From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id F38D63858419 for ; Wed, 10 Aug 2022 07:42:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org F38D63858419 Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 27A5ZrOa002273; Wed, 10 Aug 2022 07:42:45 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3hv65vv1vq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 10 Aug 2022 07:42:45 +0000 Received: from m0098399.ppops.net (m0098399.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 27A6fWAC025329; Wed, 10 Aug 2022 07:42:44 GMT Received: from ppma06fra.de.ibm.com (48.49.7a9f.ip4.static.sl-reverse.com [159.122.73.72]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3hv65vv1v0-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 10 Aug 2022 07:42:44 +0000 Received: from pps.filterd (ppma06fra.de.ibm.com [127.0.0.1]) by ppma06fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 27A7dWRX030527; Wed, 10 Aug 2022 07:42:42 GMT Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by ppma06fra.de.ibm.com with ESMTP id 3huww2g7v1-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 10 Aug 2022 07:42:42 +0000 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 27A7ge3o36962604 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 10 Aug 2022 07:42:40 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id ECD765204E; Wed, 10 Aug 2022 07:42:39 +0000 (GMT) Received: from [9.197.226.101] (unknown [9.197.226.101]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 7F8865204F; Wed, 10 Aug 2022 07:42:38 +0000 (GMT) Message-ID: <99e4c60b-0334-1e95-7629-1a2271512ed3@linux.ibm.com> Date: Wed, 10 Aug 2022 15:42:37 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [PATCH] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888] Content-Language: en-US To: Segher Boessenkool Cc: GCC Patches , David Edelsohn , amodra@gmail.com References: <20220809103504.GS25951@gate.crashing.org> <20220809211013.GT25951@gate.crashing.org> From: "Kewen.Lin" In-Reply-To: <20220809211013.GT25951@gate.crashing.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: kkvwhoO8KWuxZbT3bPSbY6MY6ozBbYnb X-Proofpoint-GUID: sFqQeBnsuMC0fO8-x1zJnfUeio5Nw7eQ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.883,Hydra:6.0.517,FMLib:17.11.122.1 definitions=2022-08-10_03,2022-08-09_02,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 mlxlogscore=999 impostorscore=0 lowpriorityscore=0 malwarescore=0 clxscore=1015 suspectscore=0 priorityscore=1501 phishscore=0 mlxscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2207270000 definitions=main-2208100021 X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, NICE_REPLY_A, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2022 07:42:48 -0000 on 2022/8/10 05:10, Segher Boessenkool wrote: > Hi! > > On Tue, Aug 09, 2022 at 08:51:59PM +0800, Kewen.Lin wrote: >> on 2022/8/9 18:35, Segher Boessenkool wrote: >>>> + /* As ELFv2 ABI shows, the allowable bytes past the global entry >>>> + point are 0, 4, 8, 16, 32 and 64. Considering there are two >>>> + non-prefixed instructions for global entry (8 bytes), the count >>>> + for patchable NOPs before local entry would be 2, 6 and 14. */ >>> >>> The other option is to allow other numbers of nops, but in that case not >>> have a local entry point (so, always use the global entry point). >> >> Good point, it's doable, but it means for the other counts of NOPs, the >> patched function has to pay the cost of TOC initialization all the time, >> IMHO it may not be what we want. > > It isn't very expensive: the main benefit of the LEP is not not having > to do those two insns, but having the r2 setter earlier, allowing loads > via the TOC reg to execute earlier. > OK. >>> I don't know if that is useful for any users of this support (if there >>> even are such users :-P ) >> >> Yeah, as the discussions in PR98125, powerpc linux kernel doesn't adopt >> this feature. :-P > > Right, -mprofile-kernel is more efficient. > > So maybe just say in the comment that it is possible to support those > other nop pad sizes, by not doing a LEP at all? Instead of sasying it > cannot be done :-) OK, I'll update the comments like: /* As ELFv2 ABI shows, the allowable bytes past the global entry point are 0, 4, 8, 16, 32 and 64 when there is a local entry. Considering there are two non-prefixed instructions for global entry (8 bytes), the count for patchable NOPs before local entry would be 2, 6 and 14. It's possible to support those other counts of NOPs by not doing a local entry at all, but we don't have clear user cases for them, so leave them unsupported for now. */ > >> >>> >>>> + if (patch_area_entry > 0) >>>> + { >>>> + if (patch_area_entry != 2 >>>> + && patch_area_entry != 6 >>>> + && patch_area_entry != 14) >>>> + error ("for %<-fpatchable-function-entry=%u,%u%>, patching " >>>> + "%u NOP(s) before function entry is invalid, it can " >>>> + "cause assembler error", >>> >>> I would not say "it can [etc.]" at all. Oh, and "NOP" (capitals) isn't >>> a thing, it is not an acronym or such ;-) >>> >> >> Poor at wording. :( Could you help to suggest some words here? > > I'll try... > > "unsupported number of nops before function entry (%u)" > Nice, will update with this. >>>> +/* { dg-require-effective-target powerpc_elfv2 } */ >>>> +/* Specify -mcpu=power9 to ensure global entry is needed. */ >>>> +/* { dg-options "-mdejagnu-cpu=power9" } */ >>> >>> Why would it be needed for p9, and not older, or newer? >>> >> >> It can be p8 or p9, but not p10 and later. >> >> It's meant to exclude pc-relative feature which can make the case not >> generate a global entry point prologue and the test point will become >> unavailable. I thought about adding -mno-pcrel, but guessed it's safer >> to use one cpu type which doesn't support pcrel at all, since it can >> exclude all possibilities that pcrel gets re-enabled. >> >> Do you think -mno-pcrel is more elegant and relatively safe? >> Or just update the comments to make it more meaningful? > > Just use { ! powerpc_pcrel } ? I don't think you can put that in a > dg-require-effective-target, but you can do for example > dg-do compile { target { ! powerpc_pcrel } } > or similar. > Good idea, I'll send out one new version of patch after some testings. BR, Kewen