Date: Wed, 10 Nov 2021 08:43:48 -0600
From: "Paul A. Clarke"
To: Wilco Dijkstra
Cc: Adhemerval Zanella, "libc-alpha@sourceware.org", Tulio Magno Quites Machado Filho
Subject: Re: [PATCH v3 5/7] math: Remove powerpc e_hypot
Message-ID: <20211110144348.GB4930@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com>
References: <20211101202059.1026032-1-adhemerval.zanella@linaro.org> <20211101202059.1026032-6-adhemerval.zanella@linaro.org> <20211109192800.GA4930@li-24c3614c-2adc-11b2-a85c-85f334518bdb.ibm.com>
List-Id: Libc-alpha mailing list

On Wed, Nov 10, 2021 at 02:34:09PM +0000, Wilco Dijkstra wrote:
> > The patches see poorer performance on Power10, also:
>
> Was that with the last patch to remove the redirects as well?
> The latest version is still slower even with that on AArch64...

Sorry that wasn't clear. I tested with all 7 patches in the series applied.

PC

> I think the new algorithm will always be slower due to the dependent sqrt and
> division. So it's hard to improve unless we only use it for special cases (eg. when
> ax and ay are close). Returning sqrt (fma (ax, ax, ay * ay)) is about twice as fast
> and gives just over 1 ULP, so we're losing a lot of performance for a small ULP
> improvement.