public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] middle end, s390: optimization for builtin isnan
@ 2007-08-03 12:54 Wolfgang Gellerich
  2007-08-07  1:27 ` Ian Lance Taylor
  0 siblings, 1 reply; 5+ messages in thread
From: Wolfgang Gellerich @ 2007-08-03 12:54 UTC (permalink / raw)
  To: gcc-patches


This patch introduces a new target hook allowing to optimize the
implementation of builtin function isnan on platforms like s390 where
arithmetic operations set the condition code when the result was NaN.  It
bootstraps on Intel and s390 and does not introduce new test case failures.

Concerning the details: Translating C code like

  double x, y;
  ...
  x= x+y;
  if (isnan (x))
    do_something;

yields an instruction sequence like

  adbr f2, f3     ; add f2 and f3, store result in f2
  cdbr f2, f2     ; sets cc to 3 if f2 contains NaN
  jo some_label   ; branch if cc==3

on s390, assuming that the variables are stored in floating point registers f2
and f3, respectively.

The call to isnan is replaced by an UNORDERED_EXPR comparing the argument of
isnan with itself in builtins.c:fold_builtin_classify.  This later yields the
compare instruction cdbr which sets the condition code to 3 if f2 was NaN.

This translation is correct but not optimal given that most of s390's floating
point instructions already set the condition code this way if the result of
the operation was NaN, i.e. we could omit the compare instruction. Executing
just

  adbr f2, f3     ; add f2 and f3, store result in f2
  jo some_label   ; branch if cc==3

achieves the same effect.

Combine aready has some support for instructions that have the side effect to
set the condition code in the way as if the instruction had compared its
result with constant 0.  But, this mechanism is tied to comparing the result
with constant 0, not with itself.

However, the compare instruction mentioned above would also be a valid
implementation of isnan if we compared f2 not with itself but with float
constant 0. In this case, combine and the already existing patterns in s390.md
are sufficient to remove the unneccessary compare instruction.

What remains to be done is a way to specify that we want different arguments
for the UNORDERED_EXPR generated by builtins.c:fold_builtin_classify. This is
what the new target hook does. The previously hard-coded replacement for isnan
is now the default implementation so no platform needs to make any changes.

On s390, the hook is defined differently.  In cases where we needed the
compare instruction, the patched gcc now generate a s390 load-and-test
instruction instead.

With best regards,

  Wolfgang Gellerich


---
Dr. Wolfgang Gellerich
IBM Deutschland Entwicklung GmbH
Schönaicher Strasse 220
71032 Böblingen, Germany
Tel. +49 / 7031 / 162598
gellerich@de.ibm.com

=======================

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen 
Geschäftsführung: Herbert Kircher 
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

----------------------------------------------------------------------------------

Changelog:

2007-08-02  Wolfgang Gellerich  <gellerich@de.ibm.com>

	* doc/tm.texi: Descripton of new target hook TARGET_EXPR_FOR_ISNAN.
	* targhooks.c (default_expr_for_isnan): New function.
	* targhooks.h (default_expr_for_isnan): New prototype.
	* target.h (expr_for_isnan): New target hook.
	* builtins.c (fold_builtin_classify): Introduced call to new hook 
          targetm.expr_for_isnan.
	* target-def.h: (TARGET_EXPR_FOR_ISNAN) New enum const.
	* config/s390/s390.c: (s390_expr_for_isnan): New function.
	  (TARGET_EXPR_FOR_ISNAN) New target hook.


---------------------------------------------------

Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(Revision 126913)
+++ gcc/doc/tm.texi	(Arbeitskopie)
@@ -10211,3 +10211,12 @@
 This macro determines the size of the objective C jump buffer for the
 NeXT runtime. By default, OBJC_JBLEN is defined to an innocuous value.
 @end defmac
+
+@deftypefn {Target Hook} tree TARGET_EXPR_FOR_ISNAN  (tree @var{arg}, tree @var{type})
+This macro determines the expression that is used to implement calls
+to the builtin variant of function isnan that was called with argument
+@var{arg}. The type of the value returned by isnan is @var{type}.  The
+default implementation is an UNORDERED_EXPR that takes the function´s
+argument as its two operands.
+@end deftypefn
+
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(Revision 126913)
+++ gcc/targhooks.c	(Arbeitskopie)
@@ -357,6 +357,14 @@
   return NULL_TREE;
 }
 
+/* Expression to implement builtin isnan.  */
+
+tree
+default_expr_for_isnan (tree arg, tree type)
+{
+  return fold_build2 (UNORDERED_EXPR, type, arg, arg);
+}
+
 /* Reciprocal.  */
 
 tree
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	(Revision 126913)
+++ gcc/targhooks.h	(Arbeitskopie)
@@ -64,6 +64,8 @@
 
 extern tree default_builtin_vectorized_conversion (enum tree_code, tree);
 
+extern tree default_expr_for_isnan (tree arg, tree type);
+
 extern tree default_builtin_reciprocal (enum built_in_function, bool, bool);
 
 extern bool default_builtin_vector_alignment_reachable (tree, bool);
Index: gcc/target.h
===================================================================
--- gcc/target.h	(Revision 126913)
+++ gcc/target.h	(Arbeitskopie)
@@ -501,6 +501,9 @@
   /* Fold a target-specific builtin.  */
   tree (* fold_builtin) (tree fndecl, tree arglist, bool ignore);
 
+  /* Returns the implementation of builtin function isnan.  */
+  tree (* expr_for_isnan) (tree arg, tree type);
+
   /* Returns a code for a target-specific builtin that implements
      reciprocal of the function, or NULL_TREE if not available.  */
   tree (* builtin_reciprocal) (unsigned, bool, bool);
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(Revision 126913)
+++ gcc/builtins.c	(Arbeitskopie)
@@ -9635,7 +9635,7 @@
 	}
 
       arg = builtin_save_expr (arg);
-      return fold_build2 (UNORDERED_EXPR, type, arg, arg);
+      return targetm.expr_for_isnan (arg, type);
 
     default:
       gcc_unreachable ();
Index: gcc/target-def.h
===================================================================
--- gcc/target-def.h	(Revision 126913)
+++ gcc/target-def.h	(Arbeitskopie)
@@ -397,6 +397,8 @@
 #define TARGET_RESOLVE_OVERLOADED_BUILTIN NULL
 #define TARGET_FOLD_BUILTIN hook_tree_tree_tree_bool_null
 
+#define TARGET_EXPR_FOR_ISNAN default_expr_for_isnan
+
 /* In tree-ssa-math-opts.c  */
 #define TARGET_BUILTIN_RECIPROCAL default_builtin_reciprocal
 
@@ -695,6 +697,7 @@
   TARGET_EXPAND_BUILTIN,			\
   TARGET_RESOLVE_OVERLOADED_BUILTIN,		\
   TARGET_FOLD_BUILTIN,				\
+  TARGET_EXPR_FOR_ISNAN,                        \
   TARGET_BUILTIN_RECIPROCAL,			\
   TARGET_MANGLE_FUNDAMENTAL_TYPE,		\
   TARGET_INIT_LIBFUNCS,				\
Index: gcc/config/s390/s390.c
===================================================================
--- gcc/config/s390/s390.c	(Revision 126913)
+++ gcc/config/s390/s390.c	(Arbeitskopie)
@@ -335,6 +335,21 @@
   return TARGET_64BIT ? DImode : SImode;
 }
 
+/* Return an UNORDERED_EXPR to implement builtin isnan where ARG is
+   the tree expression to be checked, and TYPE is the type of the
+   value to be returned by isnan.  While the default implementation is
+   an UNORDERED_EXPR comparing ARG with itself, s390 prefers a
+   comparison with constant zero as this allows combine to remove the
+   comparison in case a previously executed arithmetic operation has
+   already set the condition code.  */
+
+static tree
+s390_expr_for_isnan (tree arg, tree type)
+{
+  return fold_build2 (UNORDERED_EXPR, type, arg,
+                      build_real (TREE_TYPE (arg), dconst0));
+}
+
 /* Return true if the back end supports mode MODE.  */
 static bool
 s390_scalar_mode_supported_p (enum machine_mode mode)
@@ -9359,6 +9374,9 @@
 #undef TARGET_LIBGCC_SHIFT_COUNT_MODE
 #define TARGET_LIBGCC_SHIFT_COUNT_MODE s390_libgcc_shift_count_mode
 
+#undef TARGET_EXPR_FOR_ISNAN
+#define TARGET_EXPR_FOR_ISNAN s390_expr_for_isnan
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle end, s390: optimization for builtin isnan
  2007-08-03 12:54 [PATCH] middle end, s390: optimization for builtin isnan Wolfgang Gellerich
@ 2007-08-07  1:27 ` Ian Lance Taylor
  2007-08-07 14:03   ` Wolfgang Gellerich
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Lance Taylor @ 2007-08-07  1:27 UTC (permalink / raw)
  To: Wolfgang Gellerich; +Cc: gcc-patches

Wolfgang Gellerich<gellerich@de.ibm.com> writes:

> This patch introduces a new target hook allowing to optimize the
> implementation of builtin function isnan on platforms like s390 where
> arithmetic operations set the condition code when the result was NaN.  It
> bootstraps on Intel and s390 and does not introduce new test case failures.

It seems to me that we should handle this by treating isnan as a unary
operator which may be implemented in the MD file.  E.g., implement
isnansf1 and isnandf1, with optabs.  I don't see why a target hook is
the right thing here.

Ian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle end, s390: optimization for builtin isnan
  2007-08-07  1:27 ` Ian Lance Taylor
@ 2007-08-07 14:03   ` Wolfgang Gellerich
  2007-08-07 16:29     ` Ian Lance Taylor
  0 siblings, 1 reply; 5+ messages in thread
From: Wolfgang Gellerich @ 2007-08-07 14:03 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

Ian Lance Taylor <iant@google.com> wrote on 07.08.2007 03:26:28:

> Wolfgang Gellerich<gellerich@de.ibm.com> writes:
>
> > This patch introduces a new target hook allowing to optimize the
> > implementation of builtin function isnan on platforms like s390 where
> > arithmetic operations set the condition code when the result was NaN.
It
> > bootstraps on Intel and s390 and does not introduce new test case
failures.
>
> It seems to me that we should handle this by treating isnan as a unary
> operator which may be implemented in the MD file.  E.g., implement
> isnansf1 and isnandf1, with optabs.  I don't see why a target hook is
> the right thing here.

Hi Ian, many for your comments! However, what would we gain from
implementing isnan as operator, and what do you consider as major
disadvantages of using the new target hook? The reason why I chose the
implementation I sent is that the current implementation was hard-wired at
a rather early stage. I thought that there might be a reason for this, it
may enable further optimizations in some cases or on some platforms. Also,
the target hook approach has the advantage that is allows s390 (and other
platforms) to add their optimizations while not causing any changes for
platforms that are happy with the current implementation.

Best regards, Wolfgang

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle end, s390: optimization for builtin isnan
  2007-08-07 14:03   ` Wolfgang Gellerich
@ 2007-08-07 16:29     ` Ian Lance Taylor
  2007-08-09 13:46       ` Wolfgang Gellerich
  0 siblings, 1 reply; 5+ messages in thread
From: Ian Lance Taylor @ 2007-08-07 16:29 UTC (permalink / raw)
  To: Wolfgang Gellerich; +Cc: gcc-patches

Wolfgang Gellerich <gellerich@de.ibm.com> writes:

> Ian Lance Taylor <iant@google.com> wrote on 07.08.2007 03:26:28:
> 
> > Wolfgang Gellerich<gellerich@de.ibm.com> writes:
> >
> > > This patch introduces a new target hook allowing to optimize the
> > > implementation of builtin function isnan on platforms like s390 where
> > > arithmetic operations set the condition code when the result was NaN.
> It
> > > bootstraps on Intel and s390 and does not introduce new test case
> failures.
> >
> > It seems to me that we should handle this by treating isnan as a unary
> > operator which may be implemented in the MD file.  E.g., implement
> > isnansf1 and isnandf1, with optabs.  I don't see why a target hook is
> > the right thing here.
> 
> Hi Ian, many for your comments! However, what would we gain from
> implementing isnan as operator, and what do you consider as major
> disadvantages of using the new target hook? The reason why I chose the
> implementation I sent is that the current implementation was hard-wired at
> a rather early stage. I thought that there might be a reason for this, it
> may enable further optimizations in some cases or on some platforms. Also,
> the target hook approach has the advantage that is allows s390 (and other
> platforms) to add their optimizations while not causing any changes for
> platforms that are happy with the current implementation.

Sorry, you're right, my suggestion doesn't make sense.  Still, I'm not
that fond of target-specific target hooks.  Also, it's not obvious to
me that your suggestion is always an improvement.  It's clearly an
improvement for your test case, but what happens when there is no
floating point operation to set the condition code?  As in
    foo (double x) { return isnan (x); }
Don't you wind up materializing a floating point zero in order to do
the comparison, thus adding an unnecessary instruction?

Looking at your e-mail more closely, I expect that what you want can
be handle with a define_split which runs at combine time.  You want
the split to recognize the three instructions
    adbr f2, f3
    cdbr f2, f2
    jo label
and combine them into simply
    adbr f2, f3
    jo label
This combination of three instructions into two is what combine-time
define_splits do.

Ian

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] middle end, s390: optimization for builtin isnan
  2007-08-07 16:29     ` Ian Lance Taylor
@ 2007-08-09 13:46       ` Wolfgang Gellerich
  0 siblings, 0 replies; 5+ messages in thread
From: Wolfgang Gellerich @ 2007-08-09 13:46 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: gcc-patches

> Sorry, you're right, my suggestion doesn't make sense.  Still, I'm not
> that fond of target-specific target hooks.  Also, it's not obvious to
> me that your suggestion is always an improvement.  It's clearly an
> improvement for your test case, but what happens when there is no
> floating point operation to set the condition code?  As in
>     foo (double x) { return isnan (x); }
> Don't you wind up materializing a floating point zero in order to do
> the comparison, thus adding an unnecessary instruction?

Yes, a floating point zero can be much more expensive than
the value suggests-:) However, on s390, comparing a register with float
const zero is implemented by executing a load-and-test instruction
storing the register into itself. The instruction has the side effect to
set the conditon code as if the value being loaded
had been compared with zero and this is exploitet to avoid having to
load the zero explicitly. Source code like the example you
provided actually compiles into code containing load-and-test in a gcc
with the patch applied. To summarize, in all cases where the patch
does not eliminate the compare at all it replaces the compare by a load.

> Looking at your e-mail more closely, I expect that what you want can
> be handle with a define_split which runs at combine time.  You want
> the split to recognize the three instructions
>     adbr f2, f3
>     cdbr f2, f2
>     jo label
> and combine them into simply
>     adbr f2, f3
>     jo label
> This combination of three instructions into two is what combine-time
> define_splits do.

Yes, this is the effect I want to achieve. The problem is that there is a
large number of  instructions that set the condition code the right way:
most if not all arithmetic operations, some of the load instructions (as
mentioned above) and so on. This is good because the compare
instruction previously generated for the isnan will be omitted in many
cases. On the other hand, every of these instructions already has
several (!) insns in s390.md to describe their main purpose, their side
effect(s) of setting the condition code, and combinations thereof.
Adding even more variants to catch the case of one register getting
compared with itself would cause a significant amount of new descriptions.

Also, combine already has some special code to recognize those cases
where a compare with zero cen be eliminated because a previously
executed instruction has already set the condition code that way.
Probably, additional support were needed in combine to catch the cases
where an operand is not compared to zero but with itself.

Therefore, my suggestion is to re-use the existing for optimizations based
on other instructions setting the compared-with-zero condition code
which is already there in combine and in some back ends by changing
the isnan implementation from a compare of the argument with itself to
a compare against zero. This is just a one line change. The remainder
of the patch is only there to make this change optional via the target
hook.

Regards, Wolfgang



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-08-09 13:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-03 12:54 [PATCH] middle end, s390: optimization for builtin isnan Wolfgang Gellerich
2007-08-07  1:27 ` Ian Lance Taylor
2007-08-07 14:03   ` Wolfgang Gellerich
2007-08-07 16:29     ` Ian Lance Taylor
2007-08-09 13:46       ` Wolfgang Gellerich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).