[PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE

public inbox for libc-alpha@sourceware.org

* [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
@ 2016-08-18 14:14 Stefan Liebler
  2016-08-18 15:59 ` Joseph Myers
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Liebler @ 2016-08-18 14:14 UTC (permalink / raw)
  To: libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]

Hi,

On s390, feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) also sets FE_INEXACT.
This patch uses the z196 zarch load-rounded instruction (ledbra), which
can suppress the FE_INEXACT exception.
If gcc has no z196 support in the configuration used, the FE_INEXACT flag
is cleared afterwards, unless it was already set before the call.
The gcc support is tested in a new configure check.

A comment in fsetexcptflg.c is corrected: when the fpc is set with the
_FPU_SETCW macro, newly set exceptions are not raised by the next
floating-point instruction. The comment appears to have been copied from
elsewhere, e.g. from the sysdeps/x86_64/fpu/fsetexcptflg.c file.

Ok to commit?

Bye
Stefan

ChangeLog:

	* config.h.in: (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
	New undefine.
	* sysdeps/s390/configure.ac: Add test for z196 zarch support.
	* sysdeps/s390/configure: Regenerated.
	* sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
	instruction for raising over-/underflow if z196 zarch is
	supported by default or clear inexact flag.
	* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
	Correct comment.

[-- Attachment #2: 20160818_feraiseexception.patch --]
[-- Type: text/x-patch, Size: 7690 bytes --]

commit 800ee0a868b9461a614c96bab965664668f4af68
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Thu Aug 18 15:51:47 2016 +0200

    S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
    
    On s390 feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) sets FE_INEXACT, too.
    This patch uses z196 zarch load rounded instruction which can suppress
    FE_INEXACT exception.
    If gcc has no z196 support in the configuration used, the FE_INEXACT flag
    is cleared afterwards, unless it was already set before the call.
    The gcc support is tested in a new configure-check.
    
    A comment in fsetexcptflg.c is corrected as new exceptions are not
    executed with the next floating-point instruction if fpc is set with
    _FPU_SETCW macro. It seems the comment was copied e.g. from
    sysdeps/x86_64/fpu/fsetexcptflg.c file.
    
    ChangeLog:
    
    	* config.h.in: (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
    	New undefine.
    	* sysdeps/s390/configure.ac: Add test for z196 zarch support.
    	* sysdeps/s390/configure: Regenerated.
    	* sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
    	instruction for raising over-/underflow if z196 zarch is supported
    	by default or clear inexact flag.
    	* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
    	Correct comment.

diff --git a/config.h.in b/config.h.in
index 856ef6a..8cd08b0 100644
--- a/config.h.in
+++ b/config.h.in
@@ -70,6 +70,9 @@
 /* Define if assembler supports AVX512DQ.  */
 #undef  HAVE_AVX512DQ_ASM_SUPPORT
 
+/* Define if assembler supports z196 zarch instructions as default on S390.  */
+#undef  HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+
 /* Define if assembler supports vector instructions on S390.  */
 #undef  HAVE_S390_VX_ASM_SUPPORT
 
diff --git a/sysdeps/s390/configure b/sysdeps/s390/configure
index c9fb69c..347ac28 100644
--- a/sysdeps/s390/configure
+++ b/sysdeps/s390/configure
@@ -177,5 +177,41 @@ then
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for S390 z196 zarch instruction support as default" >&5
+$as_echo_n "checking for S390 z196 zarch instruction support as default... " >&6; }
+if ${libc_cv_asm_s390_min_z196_zarch+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat > conftest.c <<\EOF
+float testinsn (double e)
+{
+    float d;
+    __asm__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+    return d;
+}
+EOF
+if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS --shared conftest.c
+			-o conftest.o &> /dev/null'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; } ;
+then
+  libc_cv_asm_s390_min_z196_zarch=yes
+else
+  libc_cv_asm_s390_min_z196_zarch=no
+fi
+rm -f conftest*
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_asm_s390_min_z196_zarch" >&5
+$as_echo "$libc_cv_asm_s390_min_z196_zarch" >&6; }
+
+if test "$libc_cv_asm_s390_min_z196_zarch" = yes ;
+then
+  $as_echo "#define HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT 1" >>confdefs.h
+
+fi
+
 test -n "$critic_missing" && as_fn_error $? "
 *** $critic_missing" "$LINENO" 5
diff --git a/sysdeps/s390/configure.ac b/sysdeps/s390/configure.ac
index 1db6d84..8a782e7 100644
--- a/sysdeps/s390/configure.ac
+++ b/sysdeps/s390/configure.ac
@@ -86,5 +86,31 @@ then
   AC_DEFINE(HAVE_S390_VX_GCC_SUPPORT)
 fi
 
+AC_CACHE_CHECK(for S390 z196 zarch instruction support as default,
+	       libc_cv_asm_s390_min_z196_zarch, [dnl
+cat > conftest.c <<\EOF
+float testinsn (double e)
+{
+    float d;
+    __asm__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+    return d;
+}
+EOF
+dnl
+dnl test, if assembler supports S390 z196 zarch instructions as default
+if AC_TRY_COMMAND([${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS --shared conftest.c
+			-o conftest.o &> /dev/null]) ;
+then
+  libc_cv_asm_s390_min_z196_zarch=yes
+else
+  libc_cv_asm_s390_min_z196_zarch=no
+fi
+rm -f conftest* ])
+
+if test "$libc_cv_asm_s390_min_z196_zarch" = yes ;
+then
+  AC_DEFINE(HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
+fi
+
 test -n "$critic_missing" && AC_MSG_ERROR([
 *** $critic_missing])
diff --git a/sysdeps/s390/fpu/fraiseexcpt.c b/sysdeps/s390/fpu/fraiseexcpt.c
index 92a1a7d..612e323 100644
--- a/sysdeps/s390/fpu/fraiseexcpt.c
+++ b/sysdeps/s390/fpu/fraiseexcpt.c
@@ -19,6 +19,7 @@
    <http://www.gnu.org/licenses/>.  */
 
 #include <fenv_libc.h>
+#include <fpu_control.h>
 #include <float.h>
 #include <math.h>
 
@@ -35,6 +36,23 @@ fexceptadd (float d, float e)
   __asm__ __volatile__ ("aebr %0,%1" : : "f" (d), "f" (e) );
 }
 
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+static __inline__ void
+fexceptround (double e)
+{
+  float d;
+  /* Load rounded from double to float with M3 = round toward 0, M4 = Suppress
+     IEEE-inexact exception.
+     In case of e=0x1p128 and the overflow-mask bit is zero, only the
+     IEEE-overflow flag is set. If overflow-mask bit is one, DXC field is set to
+     0x20 "IEEE overflow, exact".
+     In case of e=0x1p-150 and the underflow-mask bit is zero, only the
+     IEEE-underflow flag is set. If underflow-mask bit is one, DXC field is set
+     to 0x10 "IEEE underflow, exact".
+     This instruction is available with a zarch machine >= z196.  */
+  __asm__ __volatile__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+}
+#endif
 
 int
 __feraiseexcept (int excepts)
@@ -54,13 +72,49 @@ __feraiseexcept (int excepts)
 
   /* Next: overflow.  */
   if (FE_OVERFLOW & excepts)
-    /* I don't think we can do the same trick as intel so we will have
-       to live with inexact coming also.  */
-    fexceptadd (FLT_MAX, 1.0e32);
+    {
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+      fexceptround (0x1p128);
+#else
+      fexcept_t temp;
+      _FPU_GETCW (temp);
+      /* If overflow-mask bit is zero, both IEEE-overflow and IEEE-inexact flags
+	 are set.  In this case the IEEE-inexact flag will be cleared afterwards
+	 if it wasn't set before.  If overflow-mask bit is one, DXC field is set
+	 to 0x20 "IEEE overflow, exact".  */
+      fexceptadd (0x1p127f, 0x1p127f);
+      if ((temp & (FE_OVERFLOW << FPC_EXCEPTION_MASK_SHIFT
+		   | FE_INEXACT << FPC_FLAGS_SHIFT)) == 0)
+	{
+	  _FPU_GETCW (temp);
+	  temp &= ~(FE_INEXACT << FPC_FLAGS_SHIFT);
+	  _FPU_SETCW (temp);
+	}
+#endif
+    }
 
   /* Next: underflow.  */
   if (FE_UNDERFLOW & excepts)
-    fexceptdiv (FLT_MIN, 3.0);
+    {
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+      fexceptround (0x1p-150);
+#else
+      fexcept_t temp;
+      _FPU_GETCW (temp);
+      /* If underflow-mask bit is zero, both IEEE-underflow and IEEE-inexact
+	 flags are set.  In this case the IEEE-inexact flag will be cleared
+	 afterwards if it wasn't set before.  If underflow-mask bit is one, DXC
+	 field is set to 0x10 "IEEE underflow, exact".  */
+      fexceptdiv (0x1p-127f, 0x1p127f);
+      if ((temp & (FE_UNDERFLOW << FPC_EXCEPTION_MASK_SHIFT
+		   | FE_INEXACT << FPC_FLAGS_SHIFT)) == 0)
+	{
+	  _FPU_GETCW (temp);
+	  temp &= ~(FE_INEXACT << FPC_FLAGS_SHIFT);
+	  _FPU_SETCW (temp);
+	}
+#endif
+    }
 
   /* Last: inexact.  */
   if (FE_INEXACT & excepts)
diff --git a/sysdeps/s390/fpu/fsetexcptflg.c b/sysdeps/s390/fpu/fsetexcptflg.c
index 25ade85..56a52c6 100644
--- a/sysdeps/s390/fpu/fsetexcptflg.c
+++ b/sysdeps/s390/fpu/fsetexcptflg.c
@@ -45,8 +45,7 @@ fesetexceptflag (const fexcept_t *flagp, int excepts)
     & newexcepts;
 
   /* Store the new status word (along with the rest of the environment.
-     Possibly new exceptions are set but they won't get executed unless
-     the next floating-point instruction.  */
+     Possibly new exceptions are set but they won't get executed.  */
   _FPU_SETCW (temp);
 
   /* Success.  */


* Re: [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
  2016-08-18 14:14 [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) Stefan Liebler
@ 2016-08-18 15:59 ` Joseph Myers
  2016-08-22  7:16   ` Stefan Liebler
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Myers @ 2016-08-18 15:59 UTC (permalink / raw)
  To: Stefan Liebler; +Cc: libc-alpha

On Thu, 18 Aug 2016, Stefan Liebler wrote:

> Hi,
> 
> on s390 feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) sets FE_INEXACT, too.

Note that this is permitted by the standards (whereas implicitly setting 
FE_INEXACT is not permitted for fesetexcept, or fesetexceptflag if it 
wasn't set in the saved state).

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
  2016-08-18 15:59 ` Joseph Myers
@ 2016-08-22  7:16   ` Stefan Liebler
  2016-08-22 10:59     ` Joseph Myers
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Liebler @ 2016-08-22  7:16 UTC (permalink / raw)
  To: libc-alpha; +Cc: Joseph S. Myers

On 08/18/2016 05:59 PM, Joseph Myers wrote:
> On Thu, 18 Aug 2016, Stefan Liebler wrote:
>
>> Hi,
>>
>> on s390 feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) sets FE_INEXACT, too.
>
> Note that this is permitted by the standards (whereas implicitly setting
> FE_INEXACT is not permitted for fesetexcept, or fesetexceptflag if it
> wasn't set in the saved state).
>
Hi Joseph,

fesetexcept and fesetexceptflag use the _FPU_SETCW macro to set the
flags, so they do not implicitly set FE_INEXACT.

For feraiseexcept, I think the z196 round instruction should be used to
avoid FE_INEXACT whenever the toolchain in use supports it. The behaviour
is then the same as on Intel.
If it is not available, feraiseexcept will use the add/div instructions
as before, setting the FE_INEXACT flag/exception, and the additional
_FPU_GETCW/_FPU_SETCW uses are avoided.
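
The patch in this thread passes the double values 0x1p128 and 0x1p-150 to the round instruction. As a small sketch of why these operands cannot be represented as float: the first is larger than FLT_MAX, and the second is smaller than the smallest subnormal float, 2^-149. The helper names below are hypothetical and only illustrate the value ranges; they do not execute the s390 instruction.

```c
#include <float.h>
#include <stdbool.h>

/* True if X is larger than any finite float.  */
bool
overflows_float (double x)
{
  return x > (double) FLT_MAX;
}

/* True if X is positive but smaller than the smallest subnormal float,
   2^-149, computed here as FLT_MIN * FLT_EPSILON (2^-126 * 2^-23).  */
bool
below_smallest_float (double x)
{
  return x > 0.0 && x < (double) FLT_MIN * (double) FLT_EPSILON;
}
```

Both overflows_float (0x1p128) and below_smallest_float (0x1p-150) hold, so converting these operands to float has to overflow or underflow respectively.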

What's your suggestion for feraiseexcept?

Bye
Stefan


* Re: [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
  2016-08-22  7:16   ` Stefan Liebler
@ 2016-08-22 10:59     ` Joseph Myers
  2016-08-23  6:59       ` Stefan Liebler
  0 siblings, 1 reply; 6+ messages in thread
From: Joseph Myers @ 2016-08-22 10:59 UTC (permalink / raw)
  To: Stefan Liebler; +Cc: libc-alpha

On Mon, 22 Aug 2016, Stefan Liebler wrote:

> For feraiseexcept I think if the z196-round-instruction is available with used
> toolchain then it should be used to avoid FE_INEXACT. Then it is the same
> behaviour as on intel.
> If it is not available feraiseexcept will use the add/div instructions as
> before with the FE_INEXACT flag/exception. And the additional _FPU_GETCW/SETCW
> usages are avoided.
> 
> What's your suggestion for feraiseexcept?

I don't have a suggestion; I was simply observing that extra steps for 
this case (such as the x86 code does) are not actually needed to conform 
to the standard.

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
  2016-08-22 10:59     ` Joseph Myers
@ 2016-08-23  6:59       ` Stefan Liebler
  2016-08-30 15:11         ` Stefan Liebler
  0 siblings, 1 reply; 6+ messages in thread
From: Stefan Liebler @ 2016-08-23  6:59 UTC (permalink / raw)
  To: libc-alpha

[-- Attachment #1: Type: text/plain, Size: 1275 bytes --]

On 08/22/2016 12:59 PM, Joseph Myers wrote:
> On Mon, 22 Aug 2016, Stefan Liebler wrote:
>
>> For feraiseexcept I think if the z196-round-instruction is available with used
>> toolchain then it should be used to avoid FE_INEXACT. Then it is the same
>> behaviour as on intel.
>> If it is not available feraiseexcept will use the add/div instructions as
>> before with the FE_INEXACT flag/exception. And the additional _FPU_GETCW/SETCW
>> usages are avoided.
>>
>> What's your suggestion for feraiseexcept?
>
> I don't have a suggestion; I was simply observing that extra steps for
> this case (such as the x86 code does) are not actually needed to conform
> to the standard.
>
Okay, I've updated the patch accordingly. It now uses the z196 round
instruction, if available, to avoid setting FE_INEXACT. If it is not
available, the old behaviour is kept and FE_INEXACT is not cleared.
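
The compile-time selection can be sketched as follows. HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT is the macro the configure check defines. The enum and the helper are hypothetical and only name the path that would be taken; they do not execute any s390 instruction. The macro is deliberately left undefined here, so the sketch takes the fallback path.

```c
/* In glibc this macro would come from config.h via the configure check.
   It is left undefined in this sketch.  */
/* #define HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT 1 */

/* Hypothetical names for the two code paths.  */
enum raise_strategy
{
  RAISE_WITH_LEDBRA,   /* Load rounded, IEEE-inexact suppressed.  */
  RAISE_WITH_ADD_DIV   /* Old aebr/debr path, FE_INEXACT is set too.  */
};

/* Hypothetical helper: report which path the over-/underflow code
   would use in this build.  */
enum raise_strategy
overflow_raise_strategy (void)
{
#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
  return RAISE_WITH_LEDBRA;
#else
  return RAISE_WITH_ADD_DIV;
#endif
}
```

Because the choice is made by the preprocessor, a build without z196 support pays no run-time cost for the check.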

Bye
Stefan

ChangeLog:

	* config.h.in: (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
	New undefine.
	* sysdeps/s390/configure.ac: Add test for z196 zarch support.
	* sysdeps/s390/configure: Regenerated.
	* sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
	instruction for raising over-/underflow if z196 zarch is
	supported by default.
	* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
	Correct comment.

[-- Attachment #2: 20160823_feraiseexception.patch --]
[-- Type: text/x-patch, Size: 6837 bytes --]

commit 224637717d53581a081bbb42471c7e91fc137085
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Tue Aug 23 08:44:26 2016 +0200

    S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
    
    On s390 feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW) sets FE_INEXACT, too.
    This patch uses z196 zarch load rounded instruction which can suppress
    FE_INEXACT exception if gcc has z196 support in used configuration.
    Otherwise FE_INEXACT flag is set as before. The gcc support is tested
    in a new configure-check.
    
    A comment in fsetexcptflg.c is corrected as new exceptions are not
    executed with the next floating-point instruction if fpc is set with
    _FPU_SETCW macro. It seems the comment was copied e.g. from
    sysdeps/x86_64/fpu/fsetexcptflg.c file.
    
    ChangeLog:
    
    	* config.h.in: (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
    	New undefine.
    	* sysdeps/s390/configure.ac: Add test for z196 zarch support.
    	* sysdeps/s390/configure: Regenerated.
    	* sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
    	instruction for raising over-/underflow if z196 zarch is supported
    	by default.
    	* sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
    	Correct comment.

diff --git a/config.h.in b/config.h.in
index 856ef6a..8cd08b0 100644
--- a/config.h.in
+++ b/config.h.in
@@ -70,6 +70,9 @@
 /* Define if assembler supports AVX512DQ.  */
 #undef  HAVE_AVX512DQ_ASM_SUPPORT
 
+/* Define if assembler supports z196 zarch instructions as default on S390.  */
+#undef  HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+
 /* Define if assembler supports vector instructions on S390.  */
 #undef  HAVE_S390_VX_ASM_SUPPORT
 
diff --git a/sysdeps/s390/configure b/sysdeps/s390/configure
index c9fb69c..347ac28 100644
--- a/sysdeps/s390/configure
+++ b/sysdeps/s390/configure
@@ -177,5 +177,41 @@ then
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for S390 z196 zarch instruction support as default" >&5
+$as_echo_n "checking for S390 z196 zarch instruction support as default... " >&6; }
+if ${libc_cv_asm_s390_min_z196_zarch+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat > conftest.c <<\EOF
+float testinsn (double e)
+{
+    float d;
+    __asm__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+    return d;
+}
+EOF
+if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS --shared conftest.c
+			-o conftest.o &> /dev/null'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; } ;
+then
+  libc_cv_asm_s390_min_z196_zarch=yes
+else
+  libc_cv_asm_s390_min_z196_zarch=no
+fi
+rm -f conftest*
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_asm_s390_min_z196_zarch" >&5
+$as_echo "$libc_cv_asm_s390_min_z196_zarch" >&6; }
+
+if test "$libc_cv_asm_s390_min_z196_zarch" = yes ;
+then
+  $as_echo "#define HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT 1" >>confdefs.h
+
+fi
+
 test -n "$critic_missing" && as_fn_error $? "
 *** $critic_missing" "$LINENO" 5
diff --git a/sysdeps/s390/configure.ac b/sysdeps/s390/configure.ac
index 1db6d84..8a782e7 100644
--- a/sysdeps/s390/configure.ac
+++ b/sysdeps/s390/configure.ac
@@ -86,5 +86,31 @@ then
   AC_DEFINE(HAVE_S390_VX_GCC_SUPPORT)
 fi
 
+AC_CACHE_CHECK(for S390 z196 zarch instruction support as default,
+	       libc_cv_asm_s390_min_z196_zarch, [dnl
+cat > conftest.c <<\EOF
+float testinsn (double e)
+{
+    float d;
+    __asm__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+    return d;
+}
+EOF
+dnl
+dnl test, if assembler supports S390 z196 zarch instructions as default
+if AC_TRY_COMMAND([${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS --shared conftest.c
+			-o conftest.o &> /dev/null]) ;
+then
+  libc_cv_asm_s390_min_z196_zarch=yes
+else
+  libc_cv_asm_s390_min_z196_zarch=no
+fi
+rm -f conftest* ])
+
+if test "$libc_cv_asm_s390_min_z196_zarch" = yes ;
+then
+  AC_DEFINE(HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
+fi
+
 test -n "$critic_missing" && AC_MSG_ERROR([
 *** $critic_missing])
diff --git a/sysdeps/s390/fpu/fraiseexcpt.c b/sysdeps/s390/fpu/fraiseexcpt.c
index 92a1a7d..ac6dfe7 100644
--- a/sysdeps/s390/fpu/fraiseexcpt.c
+++ b/sysdeps/s390/fpu/fraiseexcpt.c
@@ -35,6 +35,23 @@ fexceptadd (float d, float e)
   __asm__ __volatile__ ("aebr %0,%1" : : "f" (d), "f" (e) );
 }
 
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+static __inline__ void
+fexceptround (double e)
+{
+  float d;
+  /* Load rounded from double to float with M3 = round toward 0, M4 = Suppress
+     IEEE-inexact exception.
+     In case of e=0x1p128 and the overflow-mask bit is zero, only the
+     IEEE-overflow flag is set. If overflow-mask bit is one, DXC field is set to
+     0x20 "IEEE overflow, exact".
+     In case of e=0x1p-150 and the underflow-mask bit is zero, only the
+     IEEE-underflow flag is set. If underflow-mask bit is one, DXC field is set
+     to 0x10 "IEEE underflow, exact".
+     This instruction is available with a zarch machine >= z196.  */
+  __asm__ __volatile__ ("ledbra %0,5,%1,4" : "=f" (d) : "f" (e) );
+}
+#endif
 
 int
 __feraiseexcept (int excepts)
@@ -54,13 +71,29 @@ __feraiseexcept (int excepts)
 
   /* Next: overflow.  */
   if (FE_OVERFLOW & excepts)
-    /* I don't think we can do the same trick as intel so we will have
-       to live with inexact coming also.  */
-    fexceptadd (FLT_MAX, 1.0e32);
+    {
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+      fexceptround (0x1p128);
+#else
+      /* If overflow-mask bit is zero, both IEEE-overflow and IEEE-inexact flags
+	 are set.  If overflow-mask bit is one, DXC field is set to 0x2C "IEEE
+	 overflow, inexact and incremented".  */
+      fexceptadd (FLT_MAX, 1.0e32);
+#endif
+    }
 
   /* Next: underflow.  */
   if (FE_UNDERFLOW & excepts)
-    fexceptdiv (FLT_MIN, 3.0);
+    {
+#ifdef HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
+      fexceptround (0x1p-150);
+#else
+      /* If underflow-mask bit is zero, both IEEE-underflow and IEEE-inexact
+	 flags are set.  If underflow-mask bit is one, DXC field is set to 0x1C
+	 "IEEE underflow, inexact and incremented".  */
+      fexceptdiv (FLT_MIN, 3.0);
+#endif
+    }
 
   /* Last: inexact.  */
   if (FE_INEXACT & excepts)
diff --git a/sysdeps/s390/fpu/fsetexcptflg.c b/sysdeps/s390/fpu/fsetexcptflg.c
index 25ade85..56a52c6 100644
--- a/sysdeps/s390/fpu/fsetexcptflg.c
+++ b/sysdeps/s390/fpu/fsetexcptflg.c
@@ -45,8 +45,7 @@ fesetexceptflag (const fexcept_t *flagp, int excepts)
     & newexcepts;
 
   /* Store the new status word (along with the rest of the environment.
-     Possibly new exceptions are set but they won't get executed unless
-     the next floating-point instruction.  */
+     Possibly new exceptions are set but they won't get executed.  */
   _FPU_SETCW (temp);
 
   /* Success.  */


* Re: [PATCH] S390: Do not set FE_INEXACT with feraiseexcept (FE_OVERFLOW|FE_UNDERFLOW).
  2016-08-23  6:59       ` Stefan Liebler
@ 2016-08-30 15:11         ` Stefan Liebler
  0 siblings, 0 replies; 6+ messages in thread
From: Stefan Liebler @ 2016-08-30 15:11 UTC (permalink / raw)
  To: libc-alpha

On 08/23/2016 08:58 AM, Stefan Liebler wrote:
> On 08/22/2016 12:59 PM, Joseph Myers wrote:
>> On Mon, 22 Aug 2016, Stefan Liebler wrote:
>>
>>> For feraiseexcept I think if the z196-round-instruction is available
>>> with used
>>> toolchain then it should be used to avoid FE_INEXACT. Then it is the
>>> same
>>> behaviour as on intel.
>>> If it is not available feraiseexcept will use the add/div
>>> instructions as
>>> before with the FE_INEXACT flag/exception. And the additional
>>> _FPU_GETCW/SETCW
>>> usages are avoided.
>>>
>>> What's your suggestion for feraiseexcept?
>>
>> I don't have a suggestion; I was simply observing that extra steps for
>> this case (such as the x86 code does) are not actually needed to conform
>> to the standard.
>>
> Okay. Then I've updated the patch. Now it uses z196-round-instruction if
> available to omit FE_INEXACT. If it is not available the old behaviour
> is used without clearing FE_INEXACT.
>
If there is no objection, I'll commit this patch tomorrow.
> Bye
> Stefan
>
> ChangeLog:
>
>     * config.h.in: (HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT)
>     New undefine.
>     * sysdeps/s390/configure.ac: Add test for z196 zarch support.
>     * sysdeps/s390/configure: Regenerated.
>     * sysdeps/s390/fpu/fraiseexcpt.c (__feraiseexcept): Use ledbra
>     instruction for raising over-/underflow if z196 zarch is
>     supported by default.
>     * sysdeps/s390/fpu/fsetexcptflg.c (fesetexceptflag):
>     Correct comment.
