From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id B3BFA3858C66 for ; Wed, 21 Jun 2023 06:20:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B3BFA3858C66 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0353725.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 35L6HjsI005646 for ; Wed, 21 Jun 2023 06:20:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=EByPm6ak3T7qCnCrmeWJSSFJABuTqAo4vMHHY1XJ99U=; b=XRF5xZXECRMxW3OZgf82DMKD798odAFsKpm1z3c30ILa3VM968jeB/SCf1fyAP+XcpHL aApcmAhVI6Rcki/Dzuweu7+Pf5RaRT9XgL3NeMnojQDfRln3Ug40u+H+NZe53kc2GhxT yR5ZLrXvKKzjRyC+RgM4x9F3+CzfOLhxyR+88GI7fkscjUn4yMoWIfQm1Sh1+wgFWSCP S5icbgZNR0H2b7aZXWaG0ajexftV68En+IdfP9Pt9jjSXaouqkQnbVMTlzUSG1pAWkAa stL/I5WbZl7WGTNp3oc5Uab3YInaObg2oWNbvb47EEih+XHtf5LmTzJE+M07HfcKN8QT lQ== Received: from ppma03dal.us.ibm.com (b.bd.3ea9.ip4.static.sl-reverse.com [169.62.189.11]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3rbute02a5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Wed, 21 Jun 2023 06:20:50 +0000 Received: from pps.filterd (ppma03dal.us.ibm.com [127.0.0.1]) by ppma03dal.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 35KJZ7Ud005581 for ; Wed, 21 Jun 2023 06:20:50 GMT Received: from smtprelay07.wdc07v.mail.ibm.com ([9.208.129.116]) by ppma03dal.us.ibm.com (PPS) with ESMTPS id 3r94f5e5h3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT) for ; Wed, 21 Jun 2023 06:20:50 +0000 Received: from smtpav01.wdc07v.mail.ibm.com 
(smtpav01.wdc07v.mail.ibm.com [10.39.53.228]) by smtprelay07.wdc07v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 35L6Kl9261538568 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 21 Jun 2023 06:20:47 GMT Received: from smtpav01.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5B7995806B; Wed, 21 Jun 2023 06:20:47 +0000 (GMT) Received: from smtpav01.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 68B565805B; Wed, 21 Jun 2023 06:20:45 +0000 (GMT) Received: from [9.43.32.64] (unknown [9.43.32.64]) by smtpav01.wdc07v.mail.ibm.com (Postfix) with ESMTP; Wed, 21 Jun 2023 06:20:44 +0000 (GMT) Message-ID: <992ee573-a1f1-4d1e-5330-e3e2dd03a32e@linux.ibm.com> Date: Wed, 21 Jun 2023 11:50:43 +0530 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: Re: [PATCH] PowerPC: Influence hwcaps via cpu arch-level GLIBC_TUNABLES. To: Peter Bergner , libc-alpha@sourceware.org Cc: rajis@linux.ibm.com, Mahesh Bodapati References: <20230619080956.3187040-1-bmahi496@linux.ibm.com> <94be5917-3607-6e45-115f-a2f6db95b321@linux.ibm.com> <642ac186-f13b-8a0f-e007-5df706d9004b@linux.ibm.com> <2d81cc9e-acb2-b21f-8f18-1bffa354442f@linux.ibm.com> From: MAHESH BODAPATI In-Reply-To: <2d81cc9e-acb2-b21f-8f18-1bffa354442f@linux.ibm.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: NE2wPZDFLkvxLgB8Np1q2kw9V9YMBiIi X-Proofpoint-GUID: NE2wPZDFLkvxLgB8Np1q2kw9V9YMBiIi X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.957,Hydra:6.0.591,FMLib:17.11.176.26 definitions=2023-06-21_03,2023-06-16_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 mlxlogscore=817 suspectscore=0 phishscore=0 clxscore=1015 bulkscore=0 malwarescore=0 impostorscore=0 adultscore=0 spamscore=0 mlxscore=0 
 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2305260000 definitions=main-2306210051
X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

On 21/06/23 9:41 am, Peter Bergner wrote:
> On 6/20/23 12:45 PM, MAHESH BODAPATI wrote:
>> On 20/06/23 9:50 pm, Peter Bergner wrote:
>>> I'm all for allowing modifying full cpu specific hwcap tunables with one
>>> "cpu" option, but it's hard to tell whether this change allows modifying
>>> single HWCAP/HWCAP2 features too.  Say I only want to disable the VSX
>>> feature or the MMA feature and nothing else.  Does this patch support that?
>>> We *do* want that ability!
>>>
>> This patch will not support single HWCAP/HWCAP2 features. This is only for CPU arch-level features.
>> We can add tunable support for single HWCAP/HWCAP2 in a separate patch.
> Great to hear! Like I said, we do want/need that and I actually think
> that will be the most common usage for users.
>
>
>
>>>> +  if (disable_vsx)
>>>> +    cpu_features_curr.hwcap &= ~PPC_FEATURE_HAS_VSX;
>>> Why the special handling for the VSX feature here?  How is it different
>>> than say the Altivec feature or any of our other feature bits which don't
>>> have special handling?  It's not obvious to me why we need special handling,
>>> so it's probably not obvious to others either.  If we really do need special
>>> handling for this, you should add a comment explaining why.
>>>
>> On PowerPC32, The function selection happened through VSX feature on some libraries.
>> Say I set tunable as "power7,power6" then it should set to power6 but it's picking the power7 specific code
>> So I am disabling VSX feature on the machines which are lower than power7 and the code should work on the precedence as well,
>> For suppose "power6,power7" then it should set to power7.
> So for the "power7,power6" example, you're saying that handling the
> power7 tunable enables the VSX HWCAP bit, but when we handle power6,
> we need to disable it because last option wins? If that is the case,
> then the current code needs a lot more special handling! Take for
> example "power6,power5". In this case, power6 will enable the
> altivec bit, but power5 doesn't have altivec, so you'll need to
> disable that like you disable vsx. That's only one example, there
> are MANY more special cases. There are also cases where an older
> cpu has a feature that doesn't exist in new cpus (eg, htm is in
> power8, but not power10). Those too would have to be handled
> specially.

In powerpc32/*/ifunc-impl-list.c, function selection always goes
through the ISA and VSX bits and never through the altivec bits, so an
altivec check would be redundant there.

In powerpc64/*/ifunc-impl-list.c, function selection always goes
through the ISA, VSX and altivec bits, and altivec is always enabled on
the more capable machines, so I didn't make any changes for it.

> I think the whole issue here, is that you're updating cpu_features_curr
> hwcap and hwcap2 values inside the do-while loop, and you have to back
> out bits you set when you see another cpu in the tunables list that doesn't
> have those bits. It seems to me that if you wait until after the do-while
> loop to update cpu_features_curr with the hwcap and hwcap2 bits of the last
> cpu seen, then won't everything just work out without any special handling
> needed at all?
>
> So thinking out loud here, it seems when you see a new cpu in the
> tunables list, you want to clear out the temporary hwcap/hwcap2
> masks (which you're doing unconditionally right now) which throws
> away the hwcap/hwcap2 mask from any previous handled cpu.
> Then at the end of the do-while loop, you use those temp masks
> to set cpu_features_curr. In the future patch to add support for
> handling single feature tunables, you'd just reuse the current temp
> hwcap/hwcap2 masks without clearing it. That way, one could do
> "power5,altivec" and you'd get all the power5 hwcap/hwcap2 masks
> in addition to altivec. On the other hand, if you said "altivec,power5",
> you'd end up with just the power5 bits, which is what we'd want.
>

If the tunable also contains disable entries, for example
"power8,-power6,-power9", the result should be power8 with power6 and
power9 disabled. For those entries I disable only the ISA bits, not the
VSX and altivec bits, which is why VSX and altivec are not part of the
enable/disable sets inside the loop.

I did see one specific case on powerpc32 where some libraries select
functions on VSX alone, so I handled VSX separately. I can add altivec
the same way, but I felt it was redundant for a CPU arch-level tunable.

If you want me to integrate the single hwcap features (VSX, altivec),
I will do that and submit a new patch.

> Peter
>
>
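
For reference, here is a minimal sketch of the "update after the loop"
approach suggested above. It is not the patch code. The DEMO_* bit
values, the demo_cpus table and the demo_apply function are all made up
for illustration, and the "altivec" single-feature entry is only the
possible future extension discussed above. The sketch assumes a plain
comma-separated list with no "-" disable entries.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Made-up bit values for illustration; not the real PPC_FEATURE_* values.  */
#define DEMO_HAS_ALTIVEC 0x1u
#define DEMO_HAS_VSX     0x2u
#define DEMO_ARCH_2_05   0x4u   /* ISA level of power6.  */
#define DEMO_ARCH_2_06   0x8u   /* ISA level of power7.  */

struct demo_caps
{
  uint32_t hwcap;
  uint32_t hwcap2;
};

struct demo_cpu
{
  const char *name;
  struct demo_caps caps;
};

/* Hypothetical, heavily simplified per-CPU masks.  */
static const struct demo_cpu demo_cpus[] =
{
  { "power5", { 0, 0 } },
  { "power6", { DEMO_ARCH_2_05 | DEMO_HAS_ALTIVEC, 0 } },
  { "power7", { DEMO_ARCH_2_05 | DEMO_ARCH_2_06
                | DEMO_HAS_ALTIVEC | DEMO_HAS_VSX, 0 } },
};

/* Return nonzero if the LEN-byte token at TOK equals NAME.  */
static int
demo_token_is (const char *tok, size_t len, const char *name)
{
  return strlen (name) == len && memcmp (tok, name, len) == 0;
}

/* Walk the comma-separated TUNABLE list and update *CURR once, after
   the loop.  A CPU name replaces the temporary masks, so the last CPU
   wins; a single feature name is ORed into them.  */
static void
demo_apply (const char *tunable, struct demo_caps *curr)
{
  assert (tunable != NULL && curr != NULL);

  struct demo_caps tmp = *curr;
  const char *p = tunable;

  while (*p != '\0')
    {
      size_t len = strcspn (p, ",");

      for (size_t i = 0; i < sizeof demo_cpus / sizeof demo_cpus[0]; i++)
        if (demo_token_is (p, len, demo_cpus[i].name))
          tmp = demo_cpus[i].caps;

      if (demo_token_is (p, len, "altivec"))
        tmp.hwcap |= DEMO_HAS_ALTIVEC;

      p += len;
      if (*p == ',')
        p++;
    }

  /* Single commit point: no bits need to be backed out.  */
  *curr = tmp;
}
```

With this shape, "power7,power6" ends up with the power6 masks and no
VSX bit, and "power6,power5" ends up without altivec, with no
per-feature special casing inside the loop.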