From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 92D9E3858D32 for ; Thu, 25 May 2023 15:59:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 92D9E3858D32 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=us.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=us.ibm.com Received: from pps.filterd (m0353724.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 34PFqXkx016004; Thu, 25 May 2023 15:59:32 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : subject : from : to : cc : date : in-reply-to : references : content-type : mime-version : content-transfer-encoding; s=pp1; bh=kR6T033RGdYFiMxcOO8XauHVe/QoccMi55Tn94rI+6s=; b=phXBMJDzNFqGelSp/jHIs+8FBC+3CsLXq6QsAAD8efRsU8hEOQHoYB491bmpdoQutEPH BZGxLnvYl8XjSCsQOqT2pvUYhZII8Y4HZVKJGjS5LNg6KECkaLTHbefHeochDVWQf9DT +UZL7fUWx22F7iwyEYBq5OuRaXMnVweUYH2S1EHbasNBbPMoqObWR69ISNI3pgGQKl4/ prjHp4BhQb7Rz1qyO/KMMSVFF1W3k3gZwZ9IROw0snsBSKqNAl7UU42vJrOOFj112K2G RHPjqdoYxj2FJmtNW5nNnsLu2/4HxNqf9b1vOvmjxxukNZ0z7DhhkXO1so1T48S+3/P4 8Q== Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3qtappg4c9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 May 2023 15:59:31 +0000 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 34PAolSk010944; Thu, 25 May 2023 15:59:31 GMT Received: from smtprelay04.wdc07v.mail.ibm.com ([9.208.129.114]) by ppma03dal.us.ibm.com (PPS) with ESMTPS id 3qske2j6sh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 25 May 2023 15:59:30 +0000 Received: from smtpav04.wdc07v.mail.ibm.com (smtpav04.wdc07v.mail.ibm.com [10.39.53.231]) by smtprelay04.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 34PFxSR034669062 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 25 May 2023 15:59:28 GMT Received: from smtpav04.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 8116358050; Thu, 25 May 2023 15:59:28 +0000 (GMT) Received: from smtpav04.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DE21B58054; Thu, 25 May 2023 15:59:27 +0000 (GMT) Received: from li-e362e14c-2378-11b2-a85c-87d605f3c641.ibm.com (unknown [9.163.31.184]) by smtpav04.wdc07v.mail.ibm.com (Postfix) with ESMTP; Thu, 25 May 2023 15:59:27 +0000 (GMT) Message-ID: Subject: Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions From: Carl Love To: "Kewen.Lin" , cel@us.ibm.com Cc: Segher Boessenkool , gcc-patches@gcc.gnu.org, Peter Bergner Date: Thu, 25 May 2023 08:59:27 -0700 In-Reply-To: <2fab487a-df28-0a5b-dba8-bed417e039bb@linux.ibm.com> References: <4f1af7ba04999d0258354b8e6794621ee303ec53.camel@us.ibm.com> <2fab487a-df28-0a5b-dba8-bed417e039bb@linux.ibm.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5 (3.28.5-18.el8) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 3Cr1K1RFEn1hqPhke8W4FNT3CN6hAtaG X-Proofpoint-ORIG-GUID: 3Cr1K1RFEn1hqPhke8W4FNT3CN6hAtaG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.573,FMLib:17.11.176.26 definitions=2023-05-25_08,2023-05-25_03,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 suspectscore=0 bulkscore=0 adultscore=0 mlxscore=0 impostorscore=0 phishscore=0 spamscore=0 priorityscore=1501 mlxlogscore=999 lowpriorityscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2304280000 definitions=main-2305250128 X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Peter, Kewen: On Thu, 2023-05-25 at 13:28 +0800, Kewen.Lin wrote: > on 2023/5/24 23:20, Carl Love wrote: > > On Wed, 2023-05-24 at 13:32 +0800, Kewen.Lin wrote: > > > on 2023/5/24 06:30, Peter Bergner wrote: > > > > On 5/23/23 12:24 AM, Kewen.Lin wrote: > > > > > on 2023/5/23 01:31, Carl Love wrote: > > > > > > The builtins were requested for use in GLibC. As of > > > > > > version > > > > > > 2.31 they > > > > > > were added as inline asm. They requested a builtin so the > > > > > > asm > > > > > > could be > > > > > > removed. > > > > > > > > > > So IMHO we also want the similar support for mffscrn, that is > > > > > to > > > > > make > > > > > use of mffscrn and mffscrni on Power9 and later, but falls > > > > > back > > > > > to > > > > > __builtin_set_fpscr_rn + mffs similar on older platforms. > > > > > > > > So __builtin_set_fpscr_rn everything we want (sets the RN bits) > > > > and > > > > uses mffscrn/mffscrni on P9 and later and uses older insns on > > > > pre- > > > > P9. > > > > The only problem is we don't return the current FPSCR bits, as > > > > the > > > > bif > > > > is defined to return void. > > > > > > Yes. > > > > > > > Crazy idea, but could we extend the built-in > > > > with an overload that returns the FPSCR bits? > > > > > > So you agree that we should make this proposed new bif handle > > > pre-P9 > > > just > > > like some other existing bifs. :) I think extending it is good > > > and > > > doable, > > > but the only concern here is the bif name > > > "__builtin_set_fpscr_rn", > > > which > > > matches the existing behavior (only set rounding) but doesn't > > > match > > > the > > > proposed extending behavior (set rounding and get some env bits > > > back). > > > Maybe it's not a big deal if the documentation clarify it well. > > > > Extending the builtin to pre Power 9 is straight forward and I > > agree > > would make good sense to do. > > > > I am a bit concerned on how to extend __builtin_set_fpscr_rn to add > > the > > new functionality. Peter suggests overloading the builtin to > > either > > return void or returns FPSCR bits. It is my understanding that the > > return value for a given builtin had to be the same, i.e. you can't > > overload the return value. Maybe you can with Bill's new > > infrastructure? I recall having problems trying to overload the > > return > > value in the past and Bill said you couldn't do it. I play with > > this > > and see if I can overload the return value. > > Your understanding on that we fail to overload this for just > different > return types is correct. But previously I interpreted the extending > proposal as to extend > > void __builtin_set_fpscr_rn (int); > > to > > void __builtin_set_fpscr_rn (int, double*); > > The related address taken and store here can be optimized out > normally. I don't think that is correct. The current definition of the builtin is: void __builtin_set_fpscr_rn (int); The proposal by Peter was to change the return type to double, i.e. double __builtin_set_fpscr_rn (int); Peter also said the following: The built-in machinery can see that the usage is expecting a return value or not and for the pre-P9 code, can skip generating the ending mffs if we don't want the return value. Which I don't think we want. The mffscrn and mffscrni instructions return the contents of the control bits in the FPSCR, that is, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN), are placed into the corresponding bits in register FRT. All other bits in register FRT are set to 0. The instructions also updates the current RN field of the FPSCR with the new RN supplied the second argument of the instruction. So, the instructions update the RN field just like the __builtin_set_fpscr_rn. So, we can use the existing __builtin_set_fpscr_rn to update the RN for all ISAs, we just need to have __builtin_set_fpscr_rn always return a double with the desired fields from the FPSCR (the current RN). This will then emulate the behavior of the mffscrn and mffscrni instructions. The current uses of __builtin_set_fpscr_rn will just ignore the return value which is not a problem. The return value can be stored in the places were the user is currently using the inline asm for the mffscrn and mffscrni instructions. The __builtin_set_fpscr_rn builtin is currently using the mffscrn and mffscrni on Power 9 and throwing away the result from the instruction. We just need to change __builtin_set_fpscr_rn to return the value instead. For the pre Power 9 code, the builtin will need to read the full FPSCR, mask of the desired fields and return the fields. So, there is no need for the builtin to have to determine if the user is storing the result of the __builtin_set_fpscr_rn. The RN bits will always be updated by the __builtin_set_fpscr_rn builtin and the existing fields of the FPSCR will always be returned by the builtin. Please let me know if you agree. I think I have this sorted out correctly. Thanks. Carl