From: "H.J. Lu"
Date: Wed, 3 Apr 2024 13:03:38 -0700
Subject: Re: [PATCH 2/3] math: math: x86 floor traps when FE_INEXACT is enabled (BZ 31601)
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org, Joseph Myers
References: <20240403193919.1533786-1-adhemerval.zanella@linaro.org> <20240403193919.1533786-3-adhemerval.zanella@linaro.org>
In-Reply-To: <20240403193919.1533786-3-adhemerval.zanella@linaro.org>

On Wed, Apr 3, 2024 at 12:39 PM Adhemerval Zanella wrote:
>
> The implementations of floor functions using x87 floating point (i386 and
> x86_64 long double only) trap when FE_INEXACT is enabled.  Although
> this is a GNU extension outside the scope of the C standard, other
> architectures that also support traps do not show this behavior.
>
> The fix moves the implementation to a common one that holds any
> exceptions with a 'fnclex' (libc_feholdexcept_setround_387).
>
> Checked on x86_64-linux-gnu and i686-linux-gnu.
> ---
>  math/Makefile                 |  2 ++
>  math/test-floor-except-2.c    | 67 +++++++++++++++++++++++++++++++++++
>  sysdeps/i386/fpu/s_floor.S    | 34 ------------------
>  sysdeps/i386/fpu/s_floor.c    | 25 +++++++++++++
>  sysdeps/i386/fpu/s_floorf.S   | 34 ------------------
>  sysdeps/i386/fpu/s_floorf.c   | 25 +++++++++++++
>  sysdeps/i386/fpu/s_floorl.S   | 39 --------------------
>  sysdeps/x86/fpu/s_floorl.c    | 25 +++++++++++++
>  sysdeps/x86_64/fpu/s_floorl.S | 33 -----------------
>  9 files changed, 144 insertions(+), 140 deletions(-)
>  create mode 100644 math/test-floor-except-2.c
>  delete mode 100644 sysdeps/i386/fpu/s_floor.S
>  create mode 100644 sysdeps/i386/fpu/s_floor.c
>  delete mode 100644 sysdeps/i386/fpu/s_floorf.S
>  create mode 100644 sysdeps/i386/fpu/s_floorf.c
>  delete mode 100644 sysdeps/i386/fpu/s_floorl.S
>  create mode 100644 sysdeps/x86/fpu/s_floorl.c
>  delete mode 100644 sysdeps/x86_64/fpu/s_floorl.S
>
> diff --git a/math/Makefile b/math/Makefile
> index d2a740eebe..121fe2881a 100644
> --- a/math/Makefile
> +++ b/math/Makefile
> @@ -511,6 +511,7 @@ tests = \
>    test-fetestexceptflag \
>    test-fexcept \
>    test-fexcept-traps \
> +  test-floor-except-2 \
>    test-flt-eval-method \
>    test-fp-ilogb-constants \
>    test-fp-llogb-constants \
> @@ -991,6 +992,7 @@ CFLAGS-test-fe-snans-always-signal.c += $(config-cflags-signaling-nans)
>  CFLAGS-test-nan-const.c += -fno-builtin
>
>  CFLAGS-test-ceil-except-2.c += -fno-builtin
> +CFLAGS-test-floor-except-2.c += -fno-builtin
>
>  include ../Rules
>
> diff --git a/math/test-floor-except-2.c b/math/test-floor-except-2.c
> new file mode 100644
> index 0000000000..d99e835909
> --- /dev/null
> +++ b/math/test-floor-except-2.c
> @@ -0,0 +1,67 @@
> +/* Test floor functions do not disable exception traps.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +#include
> +#include
> +
> +#ifndef FE_INEXACT
> +# define FE_INEXACT 0
> +#endif
> +
> +#define TEST_FUNC(NAME, FLOAT, SUFFIX)                          \
> +static int                                                      \
> +NAME (void)                                                     \
> +{                                                               \
> +  int result = 0;                                               \
> +  volatile FLOAT a, b __attribute__ ((unused));                 \
> +  a = 1.5;                                                      \
> +  /* floor must work when traps on "inexact" are enabled.  */   \
> +  b = floor ## SUFFIX (a);                                      \
> +  /* And it must have left those traps enabled.  */             \
> +  if (fegetexcept () == FE_INEXACT)                             \
> +    puts ("PASS: " #FLOAT);                                     \
> +  else                                                          \
> +    {                                                           \
> +      puts ("FAIL: " #FLOAT);                                   \
> +      result = 1;                                               \
> +    }                                                           \
> +  return result;                                                \
> +}
> +
> +TEST_FUNC (float_test, float, f)
> +TEST_FUNC (double_test, double, )
> +TEST_FUNC (ldouble_test, long double, l)
> +
> +static int
> +do_test (void)
> +{
> +  if (feenableexcept (FE_INEXACT) == -1)
> +    {
> +      puts ("enabling FE_INEXACT traps failed, cannot test");
> +      return 77;
> +    }
> +  int result = float_test ();
> +  feenableexcept (FE_INEXACT);
> +  result |= double_test ();
> +  feenableexcept (FE_INEXACT);
> +  result |= ldouble_test ();
> +  return result;
> +}
> +
> +#include
> diff --git a/sysdeps/i386/fpu/s_floor.S b/sysdeps/i386/fpu/s_floor.S
> deleted file mode 100644
> index 7143fdcc9a..0000000000
> --- a/sysdeps/i386/fpu/s_floor.S
> +++ /dev/null
> @@ -1,34 +0,0 @@
> -/*
> - * Public domain.
> - */
> -
> -#include
> -#include
> -
> -RCSID("$NetBSD: s_floor.S,v 1.4 1995/05/09 00:01:59 jtc Exp $")
> -
> -ENTRY(__floor)
> -        fldl    4(%esp)
> -        subl    $32,%esp
> -        cfi_adjust_cfa_offset (32)
> -
> -        fnstenv 4(%esp)                 /* store fpu environment */
> -
> -        /* We use here %edx although only the low 1 bits are defined.
> -           But none of the operations should care and they are faster
> -           than the 16 bit operations.  */
> -        movl    $0x400,%edx             /* round towards -oo */
> -        orl     4(%esp),%edx
> -        andl    $0xf7ff,%edx
> -        movl    %edx,(%esp)
> -        fldcw   (%esp)                  /* load modified control word */
> -
> -        frndint                         /* round */
> -
> -        fldenv  4(%esp)                 /* restore original environment */
> -
> -        addl    $32,%esp
> -        cfi_adjust_cfa_offset (-32)
> -        ret
> -END (__floor)
> -libm_alias_double (__floor, floor)
> diff --git a/sysdeps/i386/fpu/s_floor.c b/sysdeps/i386/fpu/s_floor.c
> new file mode 100644
> index 0000000000..cc50e33b59
> --- /dev/null
> +++ b/sysdeps/i386/fpu/s_floor.c
> @@ -0,0 +1,25 @@
> +/* Return largest integral value not greater than argument.  i386 version.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +#define FUNC __floor
> +#define TYPE double
> +#define FE_OPTION FE_DOWNWARD
> +#include "s_nearestint_387_template.c"
> +libm_alias_double (__floor, floor)
> diff --git a/sysdeps/i386/fpu/s_floorf.S b/sysdeps/i386/fpu/s_floorf.S
> deleted file mode 100644
> index 8fad9c0698..0000000000
> --- a/sysdeps/i386/fpu/s_floorf.S
> +++ /dev/null
> @@ -1,34 +0,0 @@
> -/*
> - * Public domain.
> - */
> -
> -#include
> -#include
> -
> -RCSID("$NetBSD: s_floorf.S,v 1.3 1995/05/09 00:04:32 jtc Exp $")
> -
> -ENTRY(__floorf)
> -        flds    4(%esp)
> -        subl    $32,%esp
> -        cfi_adjust_cfa_offset (32)
> -
> -        fnstenv 4(%esp)                 /* store fpu environment */
> -
> -        /* We use here %edx although only the low 1 bits are defined.
> -           But none of the operations should care and they are faster
> -           than the 16 bit operations.  */
> -        movl    $0x400,%edx             /* round towards -oo */
> -        orl     4(%esp),%edx
> -        andl    $0xf7ff,%edx
> -        movl    %edx,(%esp)
> -        fldcw   (%esp)                  /* load modified control word */
> -
> -        frndint                         /* round */
> -
> -        fldenv  4(%esp)                 /* restore original environment */
> -
> -        addl    $32,%esp
> -        cfi_adjust_cfa_offset (-32)
> -        ret
> -END (__floorf)
> -libm_alias_float (__floor, floor)
> diff --git a/sysdeps/i386/fpu/s_floorf.c b/sysdeps/i386/fpu/s_floorf.c
> new file mode 100644
> index 0000000000..fa9454e56b
> --- /dev/null
> +++ b/sysdeps/i386/fpu/s_floorf.c
> @@ -0,0 +1,25 @@
> +/* Largest integral value not greater than argument.  i386 version.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +#define FUNC __floorf
> +#define TYPE float
> +#define FE_OPTION FE_DOWNWARD
> +#include "s_nearestint_387_template.c"
> +libm_alias_float (__floor, floor)
> diff --git a/sysdeps/i386/fpu/s_floorl.S b/sysdeps/i386/fpu/s_floorl.S
> deleted file mode 100644
> index 3ec28b477b..0000000000
> --- a/sysdeps/i386/fpu/s_floorl.S
> +++ /dev/null
> @@ -1,39 +0,0 @@
> -/*
> - * Public domain.
> - */
> -
> -#include
> -#include
> -
> -RCSID("$NetBSD: $")
> -
> -ENTRY(__floorl)
> -        fldt    4(%esp)
> -        subl    $32,%esp
> -        cfi_adjust_cfa_offset (32)
> -
> -        fnstenv 4(%esp)                 /* store fpu environment */
> -
> -        /* We use here %edx although only the low 1 bits are defined.
> -           But none of the operations should care and they are faster
> -           than the 16 bit operations.  */
> -        movl    $0x400,%edx             /* round towards -oo */
> -        orl     4(%esp),%edx
> -        andl    $0xf7ff,%edx
> -        movl    %edx,(%esp)
> -        fldcw   (%esp)                  /* load modified control word */
> -
> -        frndint                         /* round */
> -
> -        /* Preserve "invalid" exceptions from sNaN input.  */
> -        fnstsw
> -        andl    $0x1, %eax
> -        orl     %eax, 8(%esp)
> -
> -        fldenv  4(%esp)                 /* restore original environment */
> -
> -        addl    $32,%esp
> -        cfi_adjust_cfa_offset (-32)
> -        ret
> -END (__floorl)
> -libm_alias_ldouble (__floor, floor)
> diff --git a/sysdeps/x86/fpu/s_floorl.c b/sysdeps/x86/fpu/s_floorl.c
> new file mode 100644
> index 0000000000..9c92d33fbe
> --- /dev/null
> +++ b/sysdeps/x86/fpu/s_floorl.c
> @@ -0,0 +1,25 @@
> +/* Return largest integral value not greater than argument.  x86 version.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#include
> +
> +#define FUNC __floorl
> +#define TYPE long double
> +#define FE_OPTION FE_DOWNWARD
> +#include "s_nearestint_387_template.c"
> +libm_alias_ldouble (__floor, floor)
> diff --git a/sysdeps/x86_64/fpu/s_floorl.S b/sysdeps/x86_64/fpu/s_floorl.S
> deleted file mode 100644
> index b74d1a4d6b..0000000000
> --- a/sysdeps/x86_64/fpu/s_floorl.S
> +++ /dev/null
> @@ -1,33 +0,0 @@
> -/*
> - * Public domain.
> - */
> -
> -#include
> -#include
> -
> -ENTRY(__floorl)
> -        fldt    8(%rsp)
> -
> -        fnstenv -28(%rsp)               /* store fpu environment */
> -
> -        /* We use here %edx although only the low 1 bits are defined.
> -           But none of the operations should care and they are faster
> -           than the 16 bit operations.  */
> -        movl    $0x400,%edx             /* round towards -oo */
> -        orl     -28(%rsp),%edx
> -        andl    $0xf7ff,%edx
> -        movl    %edx,-32(%rsp)
> -        fldcw   -32(%rsp)               /* load modified control word */
> -
> -        frndint                         /* round */
> -
> -        /* Preserve "invalid" exceptions from sNaN input.  */
> -        fnstsw
> -        andl    $0x1, %eax
> -        orl     %eax, -24(%rsp)
> -
> -        fldenv  -28(%rsp)               /* restore original environment */
> -
> -        ret
> -END (__floorl)
> -libm_alias_ldouble (__floor, floor)
> --
> 2.34.1
>

LGTM.

Reviewed-by: H.J. Lu

Thanks.

--
H.J.
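
As a side note on the removed assembly, the control-word arithmetic
(or with 0x400, then and with 0xf7ff) can be checked in plain C.  The
sketch below only models the bit manipulation and does not touch the
FPU.  The helper names are hypothetical; the bit positions (rounding
control in bits 10-11, precision mask in bit 5, default control word
0x037f) are the documented x87 layout.

```c
#include <assert.h>
#include <stdint.h>

/* x87 control word: bit 5 is PM, the "inexact" (precision) mask; a
   clear bit means the exception is unmasked and will trap.  Bits 10-11
   are the rounding control; the value 1 rounds toward minus infinity.  */
enum { X87_CW_PM = 0x0020, X87_CW_RC_SHIFT = 10, X87_CW_RC_MASK = 0x3 };

/* Hypothetical model of "movl $0x400,%edx; orl cw,%edx; andl $0xf7ff,%edx".  */
static uint16_t
old_floor_control_word (uint16_t saved_cw)
{
  return (uint16_t) ((saved_cw | 0x0400u) & 0xf7ffu);
}

static unsigned int
rounding_control (uint16_t cw)
{
  return (cw >> X87_CW_RC_SHIFT) & X87_CW_RC_MASK;
}

static int
inexact_trap_enabled (uint16_t cw)
{
  return (cw & X87_CW_PM) == 0;
}

/* Small usage check, also hypothetical.  */
static void
control_word_self_check (void)
{
  /* Default control word: round-down is selected, inexact stays masked.  */
  assert (old_floor_control_word (0x037f) == 0x077f);
  assert (rounding_control (old_floor_control_word (0x037f)) == 1);
  /* Caller unmasked inexact (bit 5 clear): still unmasked for frndint.  */
  assert (inexact_trap_enabled (old_floor_control_word (0x035f)));
}
```

The second case appears to be the situation behind BZ 31601: the word
loaded by fldcw is derived from the caller's control word, so an
unmasked "inexact" stays unmasked while frndint runs.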