Message-ID: <6deeb5cf-3b57-21a2-0d5f-56b48f8d147b@linux.ibm.com>
Date: Thu, 6 Jul 2023 15:38:21 -0500
Subject: Re: [PATCH v3] PowerPC: Influence cpu/arch hwcap features via GLIBC_TUNABLES.
To: Adhemerval Zanella Netto, bmahi496@linux.ibm.com, libc-alpha@sourceware.org
Cc: rajis@linux.ibm.com, Mahesh Bodapati
References: <20230706122544.175643-1-bmahi496@linux.ibm.com>
From: Peter Bergner

On 7/6/23 8:16 AM, Adhemerval Zanella Netto wrote:
> On 06/07/23 09:25, bmahi496--- via Libc-alpha wrote:
>>  struct cpu_features
>>  {
>>    bool use_cached_memopt;
>> +  unsigned long int hwcap;
>> +  unsigned long int hwcap2;
>> +};
>> +
>> +static const struct
>> +{
>> +  const char *name;
>> +  int mask;
>> +  bool id;
>> +} hwcap_tunables[] = {
>> +   /* AT_HWCAP tunable masks.  */
>> +   { "4xxmac", PPC_FEATURE_HAS_4xxMAC, 0 },
>
> This creates one extra dynamic relocation per entry:
>
> powerpc64le-linux-gnu-base$ powerpc64le-linux-gnu-readelf -a elf/ld.so
> [...]
> Relocation section '.relr.dyn' at offset 0xc68 contains 3 entries:
>   10 offsets
> [...]
>
> powerpc64le-linux-gnu-patch$ powerpc64le-linux-gnu-readelf -a elf/ld.so
> [...]
> Relocation section '.relr.dyn' at offset 0xc68 contains 5 entries:
>   53 offsets
> [...]

Good catch!  Especially since the plan is to add space for hwcap3 and
hwcap4 (possibly more) in the near future.  We don't need the extra hwcaps
yet, but we want to reserve the space in the TCB for them now, where
__builtin_cpu_supports() uses them, so that when we need them in the
future, the space is already there to use in distro glibcs.

> Which I think we should avoid since it is a small slowdown on every
> program invocation.  You can either define the name with a predefined
> size that fits every name (say 32), but this will waste some space.  Or
> you can specify the hwcap_tunables struct as pointing to an offset:
>
> static const char hwcap_names[] =
>   "4xxmac\0"
>   "altivec\0"
>   [...]
>
> static const struct
> {
>   unsigned short off;
>   int mask;
>   bool id;
> } hwcap_tunables[] = {
>   { 0, PPC_FEATURE_HAS_4xxMAC, 0 },
>   { 7, PPC_FEATURE_HAS_ALTIVEC, 0 },
>   [...]
> }
> And then you check the name as:
>
>   for (i = 0; i < array_length (hwcap_tunables); ++i)
>     {
>       const char *hwcap_name = hwcap_names + hwcap_tunables[i].off;
>       [...]
>     }
>
> The drawback is that getting the offsets right would require some
> preprocessor phase (something like we do for the signal and errno list).

I don't think we need the offset in the struct, since Mahesh's loop is
already calculating the strlen of hwcap_tunables[].name, so we can just
update the pointer as we go.  Something like:

  const char *hwcap_name = &hwcap_names[0];
  for (i = 0; i < array_length (hwcap_tunables); ++i)
    {
      size_t tunable_len = STRLEN_DEFAULT (hwcap_name);
      /* Check the tunable name on the supported list.  */
      if (tunable_len == feature_len
          && MEMCMP_DEFAULT (feature, hwcap_name, feature_len) == 0)
        {
          ...
        }
      hwcap_name += tunable_len + 1;
    }

Peter
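
For anyone who wants to try the pointer-walk idea outside of glibc, here
is a minimal standalone sketch.  It is not the patch: the demo_* names and
DEMO_FEATURE_* mask values are hypothetical stand-ins for the real
PPC_FEATURE_* bits, "vsx" is added only to give the loop a third entry, and
plain strlen/memcmp are used in place of STRLEN_DEFAULT/MEMCMP_DEFAULT.
It assumes the name table and the mask table are kept in the same order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in masks; the real PPC_FEATURE_* values are not
   reproduced here.  */
#define DEMO_FEATURE_4XXMAC   0x1
#define DEMO_FEATURE_ALTIVEC  0x2
#define DEMO_FEATURE_VSX      0x4

/* All names packed into one char array, each terminated by '\0'.  The
   array contains no pointers, so it needs no dynamic relocations.  */
static const char demo_hwcap_names[] =
  "4xxmac\0"
  "altivec\0"
  "vsx\0";

/* Masks, in the same order as the names above (an assumption of this
   sketch; nothing enforces it).  */
static const int demo_hwcap_masks[] = {
  DEMO_FEATURE_4XXMAC,
  DEMO_FEATURE_ALTIVEC,
  DEMO_FEATURE_VSX,
};

#define DEMO_ARRAY_LENGTH(a) (sizeof (a) / sizeof ((a)[0]))

/* Return the mask for FEATURE (FEATURE_LEN bytes, not necessarily
   NUL-terminated), or 0 if the name is not in the table.  */
static int
demo_lookup_hwcap (const char *feature, size_t feature_len)
{
  assert (feature != NULL);
  const char *name = demo_hwcap_names;
  for (size_t i = 0; i < DEMO_ARRAY_LENGTH (demo_hwcap_masks); ++i)
    {
      size_t name_len = strlen (name);
      if (name_len == feature_len
          && memcmp (feature, name, feature_len) == 0)
        return demo_hwcap_masks[i];
      /* Step over this name and its terminating NUL.  */
      name += name_len + 1;
    }
  return 0;
}
```

With this table, demo_lookup_hwcap ("altivec", 7) returns
DEMO_FEATURE_ALTIVEC and an unknown name returns 0.  The trade-off against
the explicit-offset struct is that a lookup can only walk the names in
order, which is all the tunable parsing loop does anyway.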