From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x234.google.com (mail-lj1-x234.google.com [IPv6:2a00:1450:4864:20::234]) by sourceware.org (Postfix) with ESMTPS id AD5603858C27 for ; Wed, 20 Sep 2023 13:33:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AD5603858C27 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-lj1-x234.google.com with SMTP id 38308e7fff4ca-2c007d6159aso55091791fa.3 for ; Wed, 20 Sep 2023 06:33:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1695216811; x=1695821611; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:organization:from:references :cc:to:content-language:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=jw5Pjm8A0axqO9DAic3fXsuueq9GgtEmK7Qpf/Uw+uI=; b=iVwNKdp5rmGficYiB9TWBpfHSxxlwkpFBntulmB1UozsbYpQ4BoCWv8Ig12r3b7u9w eqtQHw7RUn2y+TtanLMTnjmgU0TTYDEuowujS3JWa9D2r5IlFeMjWRBnBmNb3Jbsnma1 r0mh/dJla+mVIwBpWzCLpGh5C2lbRFO7eXnygpR66KchXT0H99YU1/78r1QB44wWPgJe zkCBTCVoaeodgSxmx21Hgwy6Pgq5cQlBFH6/zvFn0ZvCEnsnH8JpGwNSXtfDO2XYaPYA BMeOKWHtL4WeSzL3JNzO3ZDDOkb94KTHlCqhFPiWvlJfSf+4sZUExkTTlLHCVdtOkHiZ vOBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695216811; x=1695821611; h=content-transfer-encoding:in-reply-to:organization:from:references :cc:to:content-language:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=jw5Pjm8A0axqO9DAic3fXsuueq9GgtEmK7Qpf/Uw+uI=; b=qzd6J/ySUHbPgrrJlvH2Rkjn2z+I1NFPpkCOEqWkPTk42UW7udEMoDhNH/Rr2TY/lD i9MzbicICXQTLI0Lo3I24yaS0WdyYto+mU4Dh7Ep4m7q1hDOf8eiuXWDDpFJPMrmY5Fr B2PeuemEF9a5qrpJso2U4cX7b2hCOVWlSqBfe2rRaIiIdiPx5ypc5OyULt2Us41ExQJx TSyJA4ncNIBp0/MQ8irmfU4r28e7EquMzL5qBlR1euFyKO/Wm5Zw6qOOVh34NZ4PTQ2s Aqg+lvMtvvH8Le/1Pkw4vtIC15Vko13CuZcWDQM3NxNQSytEX0FzG0JWR97i9U4wQF5s xq0A== X-Gm-Message-State: AOJu0Yw8IeyDhGzW+PlAqlk5a6mxsYJ3/BpuwjpJIXLxTWtWkBiojWcN XRDp8fjhcw/9EiIong88059j8A== X-Google-Smtp-Source: AGHT+IEOHxjILq9pTfbabQulgFtZoNMXRRxvLWQwCAOGo054i+BP5qfQTQuCMWIfnlRZ5ZyyLvrBNA== X-Received: by 2002:a2e:884e:0:b0:2bc:39f5:ecb4 with SMTP id z14-20020a2e884e000000b002bc39f5ecb4mr2265213ljj.25.1695216810859; Wed, 20 Sep 2023 06:33:30 -0700 (PDT) Received: from [172.20.43.239] (static-212-193-78-212.thenetworkfactory.nl. [212.78.193.212]) by smtp.gmail.com with ESMTPSA id n25-20020a170906b31900b0099bcdfff7cbsm9448668ejz.160.2023.09.20.06.33.29 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 20 Sep 2023 06:33:30 -0700 (PDT) Message-ID: <3270195f-7700-423e-acb6-e16889984c12@linaro.org> Date: Wed, 20 Sep 2023 15:33:29 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] [powerpc] fegetenv_and_set_rn now uses the builtins provided by GCC. Content-Language: en-US To: Manjunath S Matti , Manjunath Matti , libc-alpha@sourceware.org Cc: rajis@linux.ibm.com, Carl Love References: <20230912082730.3951006-1-mmatti@linux.ibm.com> <526f2aac-a06c-d1cf-4c7b-1339846a1711@linaro.org> <866d7459-b031-883b-f356-b05fa2e5ad71@linux.vnet.ibm.com> From: Adhemerval Zanella Netto Organization: Linaro In-Reply-To: <866d7459-b031-883b-f356-b05fa2e5ad71@linux.vnet.ibm.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 20/09/23 08:28, Manjunath S Matti wrote: > > On 12/09/23 6:57 pm, Adhemerval Zanella Netto wrote: >> >> On 12/09/23 05:27, Manjunath Matti wrote: >>> On powerpc, SET_RESTORE_ROUND uses inline assembly to optimize the >>> prologue get/save/set rounding mode operations for POWER9 and >>> later by using 'mffscrn' where possible, this was introduced by >>> commit f1c56cdff09f650ad721fae026eb6a3651631f3d. >>> >>> GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn >>> which now returns the FPSCR fields in a double. This feature is >>> available on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro >>> is defined along with __builtin_set_fpscr_rn enabled. >>> GCC commit ef3bbc69d15707e4db6e2f198c621effb636cc26 adds >>> this feature. >>> >>> Changes are done to use __builtin_set_fpscr_rn instead of mffscrn >>> or mffscrni in __fe_mffscrn(rn). >>> >>> Suggested-by: Carl Love >>> --- >>>   sysdeps/powerpc/fpu/fenv_libc.h | 23 ++++++++++++++++++++--- >>>   1 file changed, 20 insertions(+), 3 deletions(-) >>> >>> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h >>> index fa5e1c697e..55484eb229 100644 >>> --- a/sysdeps/powerpc/fpu/fenv_libc.h >>> +++ b/sysdeps/powerpc/fpu/fenv_libc.h >>> @@ -84,8 +84,15 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; >>>       __fr.fenv;                                \ >>>     }) >>>   +/* GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn and >>> +   now returns the FPSCR fields in a double. This support is available >>> +   on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined. >>> +   To retain backward compatibility with older GCC, we still retain the >>> +   old inline assembly implementation.  */ >>> +#if defined _ARCH_PWR9 && defined __SET_FPSCR_RN_RETURNS_FPSCR__ >>> +#define fegetenv_and_set_rn(rn) __builtin_set_fpscr_rn (rn) >>> +#elif defined _ARCH_PWR9 >>>   /* Like fegetenv_control, but also sets the rounding mode.  */ >>> -#ifdef _ARCH_PWR9 >>>   #define fegetenv_and_set_rn(rn) __fe_mffscrn (rn) >>>   #else >>>   /* 'mffscrn' will decode to 'mffs' on ARCH < 3_00, which is still necessary >> I think the macro would be better defined as: >> >> #ifdef __SET_FPSCR_RN_RETURNS_FPSCR__ >> # define __fe_mffscrn(rn)  __builtin_set_fpscr_rn (rn) >> #else >> # define __fe_mffscrn(rn) [...] >> #endif >> >> Then there is no need to redefine fegetenv_and_set_rn nor __fesetround_inline. > > So this is what you are asking me to do right ? > > --- a/sysdeps/powerpc/fpu/fenv_libc.h > +++ b/sysdeps/powerpc/fpu/fenv_libc.h > @@ -89,22 +89,11 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; >     on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined. >     To retain backward compatibility with older GCC, we still retain the >     old inline assembly implementation.  */ > -#if defined _ARCH_PWR9 && defined __SET_FPSCR_RN_RETURNS_FPSCR__ > +#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__ >  #define fegetenv_and_set_rn(rn) __builtin_set_fpscr_rn (rn) > -#elif defined _ARCH_PWR9 > +#else >  /* Like fegetenv_control, but also sets the rounding mode.  */ >  #define fegetenv_and_set_rn(rn) __fe_mffscrn (rn) > -#else > -/* 'mffscrn' will decode to 'mffs' on ARCH < 3_00, which is still necessary > -   but not sufficient, because it does not set the rounding mode. > -   Explicitly set the rounding mode when 'mffscrn' actually doesn't.  */ > -#define fegetenv_and_set_rn(rn) \ > -  ({register fenv_union_t __fr;                                                \ > -    __fr.fenv = __fe_mffscrn (rn);                                     \ > -    if (__glibc_unlikely (!(GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)))        \ > -      __fesetround_inline (rn);                                                \ > - __fr.fenv; \ > -  }) >  #endif > > > I was under the impression that the redefine of fegetenv_and_set_rn and __fesetround_inline > > was needed for architectures below POWE9, i.e POWER8, 7, etc. > > Please correct me if I am wrong. Not really, my understanding is it only requires: diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h index fa5e1c697e..8cdce1d6e7 100644 --- a/sysdeps/powerpc/fpu/fenv_libc.h +++ b/sysdeps/powerpc/fpu/fenv_libc.h @@ -68,7 +68,12 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; __fr; \ }) -#define __fe_mffscrn(rn) \ +/* Starting with GCC 14 __builtin_set_fpscr_rn can be used to return the + FPSCR fields as a double. */ +#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__ +# define __fe_mffscrn(rn) __builtin_set_fpscr_rn (rn) +#else +# define __fe_mffscrn(rn) \ ({register fenv_union_t __fr; \ if (__builtin_constant_p (rn)) \ __asm__ __volatile__ ( \ @@ -83,6 +88,7 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; } \ __fr.fenv; \ }) +#endif /* Like fegetenv_control, but also sets the rounding mode. */ #ifdef _ARCH_PWR9 Since fegetenv_and_set_rn will call the __fe_mffscrn macro, and this will be used by libc_feresetround_ppc macros. > >>> @@ -148,7 +155,12 @@ typedef union >>>   static inline int >>>   __fesetround_inline (int round) >>>   { >>> -#ifdef _ARCH_PWR9 >>> +/* GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn and >>> +   now returns the FPSCR fields in a double. This support is available >>> +   on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined.  */ >>> +#if defined _ARCH_PWR9 && defined __SET_FPSCR_RN_RETURNS_FPSCR__ >>> +  __builtin_set_fpscr_rn (round); >>> +#elif defined _ARCH_PWR9 >>>     __fe_mffscrn (round); >>>   #else >>>     if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) >>> @@ -178,7 +190,12 @@ __fesetround_inline (int round) >>>   static inline void >>>   __fesetround_inline_nocheck (const int round) >>>   { >>> -#ifdef _ARCH_PWR9 >>> +/* GCC version 14 onwards supports builtins as __builtin_set_fpscr_rn and >>> +   now returns the FPSCR fields in a double. This support is available >>> +   on Power9 when the __SET_FPSCR_RN_RETURNS_FPSCR__ macro is defined.  */ >>> +#if defined _ARCH_PWR9 && defined __SET_FPSCR_RN_RETURNS_FPSCR__ >>> +  __builtin_set_fpscr_rn (round); >>> +#elif defined _ARCH_PWR9 >>>     __fe_mffscrn (round); >>>   #else >>>     if (__glibc_likely (GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)) > > > So the rest it OK ? I will update the patch with suggested changes. >