From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id C853E384AB58 for ; Fri, 3 May 2024 19:16:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C853E384AB58 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C853E384AB58 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=148.163.158.5 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714763777; cv=none; b=I6zLYGvkfpxY/jwVRIXZcaOpGH1S7YEfPy/Z50EOaWPN+idHk5EXJF56DaWreEqW0hIGZ4MaVuBM02KV//z9F8i2Lit3R43vOLG9iwoZqFb2p7j1nyzuPcM020vMZYt0VqG6+6fJxpWNWFUb9hCOVtzUAF6H+hk3Tmi2JP649uM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714763777; c=relaxed/simple; bh=UX26Jp1UtuwuRh7jFgmkNcdJhlGe/NvTUhjoFb9qpog=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=r+588OZn8xibLR5n7YdjLEbpocpgextGLqN/202FE+X7RJg+GpSkpDyym6EtHLU+p3Wu0ZWHXp5DpCe8xkogaGleeo2hqnVcZ332DVlraGNuKKbxxSU7LpMpwpLqGDGosyQiXJ02lGypq3kZwfr6Cmm1PmxTMUBFE0m6uc0oYm4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from pps.filterd (m0353722.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 443IB8f6026056; Fri, 3 May 2024 18:30:45 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : cc : references : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=2q4SSz9/cD3SAo6rbscJZlOyRB7mMRxyUHHtlrWzHkU=; b=WQchIZYfHWXHGEYHldFAqS6lWASvIgXGZnPMm8bU2iDHhR1ZnUp9yzTN34BLtTYtuvvG MHGlGwRPsU01M+CQwEQKUFdOVNxFDAAaEovYHfYcNPkJyDvzSTQpRcns5gg4HKhA0Tpl gRl7VjA0q49Mi2mnOHAFMdlQ+8XggCJOFJTEO1axfEdwFGRSEJWFuFKGvba90WVgk6yO vnChvbsx9Ed38wS7ISerHSV2/uF3NUyoUgDSiIt4FFZZUG2fxyiv+CxKcAiQytqj5DCt u+b7McGHNC+uqBqnLUIfLKpMZPyDbWFpasVMLVP/R3gJzrR+xcmL1yrLjoX2+nqh1uhQ SA== Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3xw4ypg1qx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 03 May 2024 18:30:45 +0000 Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 443I0lCw011769; Fri, 3 May 2024 18:30:44 GMT Received: from smtprelay06.dal12v.mail.ibm.com ([172.16.1.8]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3xsdwmpw7t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 03 May 2024 18:30:44 +0000 Received: from smtpav01.wdc07v.mail.ibm.com (smtpav01.wdc07v.mail.ibm.com [10.39.53.228]) by smtprelay06.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 443IUgeP27263608 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 3 May 2024 18:30:44 GMT Received: from smtpav01.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0A13D58063; Fri, 3 May 2024 18:30:42 +0000 (GMT) Received: from smtpav01.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4C4F35804B; Fri, 3 May 2024 18:30:41 +0000 (GMT) Received: from [9.67.79.151] (unknown [9.67.79.151]) by smtpav01.wdc07v.mail.ibm.com (Postfix) with ESMTP; Fri, 3 May 2024 18:30:41 +0000 (GMT) Message-ID: Date: Fri, 3 May 2024 13:30:40 -0500 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] powerpc: Fix __fesetround_inline_nocheck on POWER9+ (BZ 31682) To: Adhemerval Zanella Netto , libc-alpha@sourceware.org Cc: Manjunath S Matti , Peter Bergner References: <20240430124011.12408-1-adhemerval.zanella@linaro.org> <17f748c4-399e-4f8a-9a62-a9b86f7a0aae@linaro.org> <4daa27e7-159f-4bf2-a682-5fd6bd00ad47@linux.ibm.com> <103f99dd-ac9e-4193-b67d-99d0e5f3eb9e@linaro.org> Content-Language: en-US From: Paul E Murphy In-Reply-To: <103f99dd-ac9e-4193-b67d-99d0e5f3eb9e@linaro.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: WrC4-4Zu-pRNVEBxQAxO7XomY-pI94vz X-Proofpoint-GUID: WrC4-4Zu-pRNVEBxQAxO7XomY-pI94vz X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1011,Hydra:6.0.650,FMLib:17.11.176.26 definitions=2024-05-03_12,2024-05-03_02,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 clxscore=1015 suspectscore=0 mlxlogscore=884 adultscore=0 malwarescore=0 bulkscore=0 impostorscore=0 lowpriorityscore=0 mlxscore=0 phishscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2404010000 definitions=main-2405030130 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,KAM_LOTSOFHASH,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 5/1/24 9:01 AM, Adhemerval Zanella Netto wrote: > > > On 30/04/24 18:58, Paul E Murphy wrote: >> >> >> On 4/30/24 1:34 PM, Adhemerval Zanella Netto wrote: >>> >>> >>> On 30/04/24 15:13, Paul E Murphy wrote: >>>> >>>> >>>> On 4/30/24 7:40 AM, Adhemerval Zanella wrote: >>>>> The e68b1151f7460d5fa88c3a567c13f66052da79a7 commit changed the >>>>> __fesetround_inline_nocheck implementation to use mffscrni >>>>> (through __fe_mffscrn) instead of mtfsfi.  For generic powerpc >>>>> ceil/floor/trunc, the function is supposed to not change the >>>>> Floating-Point Inexact Exception Enable bit, however mffscrni >>>>> clear bits 56:63 (VE, OE, UE, ZE, XE, NI, RN), different than mtfsfi. >>>> >>>> I don't think that explanation is correct. mffscrni should not alter the exception enable bits. It copies them into FRT, but does not clear them. >>> >>> Yeah, I forgot to add that mffscrni in this context clears because >>> there is no easy way to mask out the current bits 56:63 since only >>> the rounding bit is provided.  So maybe: >>> >>>    The e68b1151f7460d5fa88c3a567c13f66052da79a7 commit changed the >>>    __fesetround_inline_nocheck implementation to use mffscrni >>>    (through __fe_mffscrn) instead of mtfsfi.  For generic powerpc >>>    ceil/floor/trunc, the function is supposed to not change the >>>    Floating-Point Inexact Exception Enable bit, however mffscrni >>>    usage always clear the the bits 56:63 (VE, OE, UE, ZE, XE, NI, RN), >>>    since only the rounding mode is provided. >> >> I think the comment has mtfsfi and mffscrni swapped.  The usage of mtfsfi clears bits 60 (XE) and 61 (NI) of the fpscr.  mffscrni does not alter any fpscr bits besides RN. >> >> Should __fesetround_inline_nocheck be removed in favor of using __fesetround_inline?  The description of __fesetround_inline_nocheck is confusing. It fails to mention that XE and NI are cleared, as implied by the usage of mtfsfi. > The main issue is the use of __fe_mffscrn on POWER9, below it is a striped > testcase from the generic ceil. If you run by building against POWER8: > > $ powerpc64le-glibc-linux-gnu-gcc -O2 -g -std=gnu11 -mcpu=power8 -D_GNU_SOURCE test.c -o test-pwr8 -lm > $ ./test-pwr8 > 0000000000000000000000000000000000000000000000000000000000001000 > 0000000000000000000000000000000000000000000000000000000000000010 > > Now with POWER9: > $ powerpc64le-glibc-linux-gnu-gcc -O2 -g -std=gnu11 -mcpu=power9 -D_GNU_SOURCE test.c -o test-pwr9 -lm > $ ./test-pwr9 > 0000000000000000000000000000000000000000000000000000000000001000 > 0000000000000000000000000000000000000000000000000000000000001010 > Floating point exception > > You can see that for POWER8, the XE bits it cleared as expected by > __fesetround_inline_nocheck (because ceil/floor/trunc should not trigger > any floating-point exception), while for POWER9, as you said, only the > rounding bits are changed. The commit message is still hard to digest. The caller expects the inexact exception to be disable upon return. mffscrni does not change any exception enable bits. Can you also update the description (and maybe name) of the function to explicitly state this? Likewise, note that we expect the NI bit to be 0 always. Otherwise, this patch LGTM.