public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch, fortran] Function call optimization
@ 2011-03-14 23:12 Thomas Koenig
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Koenig @ 2011-03-14 23:12 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1174 bytes --]

Hello world,

the attached patch is a front-end optimization which replaces multiple 
calls to a function with identical argument lists with an assignment to 
a temporary variable, and then uses that variable in the original 
expression.

AFAIK, this is permitted by the Fortran standard because such functions 
have side effects, the program is illegal.

OK for trunk, now that it has reopened?

	Thomas

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * frontend_passes (expr_array):  New static variable.
         (expr_size):  Likewise.
         (expr_count):  Likewise.
         (current_code):  Likewise.
         (current_ns):  Likewise.
         (gfc_run_passes):  Allocate and free space for expressions.
         (compare_functions):  New function.
         (cfe_expr):  New function.
         (create_var):  New function.
         (cfc_expr_0):  New function.
         (cfe_code):  New function.
         (optimize_namespace):  Invoke gfc_code_walker with cfe_code
         and cfe_expr_0.

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * gfortran.dg/function_optimize_1.f90:  New test.

[-- Attachment #2: function_optimize_1.f90 --]
[-- Type: text/plain, Size: 722 bytes --]

! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  character(60) :: line
  real, external :: ext_func
  real :: x
  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  d = sin(a) + cos(a) + sin(a) + cos(a)
  x = ext_func(a) + 23 + ext_func(a)
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

[-- Attachment #3: p8.diff --]
[-- Type: text/x-patch, Size: 7311 bytes --]

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 170960)
+++ frontend-passes.c	(Arbeitskopie)
@@ -40,6 +40,21 @@ static bool optimize_trim (gfc_expr *);
 
 static int count_arglist;
 
+/* Pointer to an array of gfc_expr ** we operate on, plus its size
+   and counter.  */
+
+static gfc_expr ***expr_array;
+static int expr_size, expr_count;
+
+/* Pointer to the gfc_code we currently work on - to be able to insert
+   a statement before.  */
+
+static gfc_code **current_code;
+
+/* The namespace we are currently dealing with.  */
+
+gfc_namespace *current_ns;
+
 /* Entry point - run all passes for a namespace.  So far, only an
    optimization pass is run.  */
 
@@ -48,9 +63,16 @@ gfc_run_passes (gfc_namespace *ns)
 {
   if (optimize)
     {
+      expr_size = 20;
+      expr_array = XNEWVEC(gfc_expr **, expr_size);
+
       optimize_namespace (ns);
       if (gfc_option.dump_fortran_optimized)
 	gfc_dump_parse_tree (ns, stdout);
+
+      /* FIXME: The following should be XDELETEVEC(expr_array);
+      but we cannot do that because it depends on free.  */
+      gfc_free (expr_array);
     }
 }
 
@@ -106,11 +128,222 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT
   return 0;
 }
 
+/* Compare two functions for equality.  We could use gfc_dep_compare_expr
+   except that we also consider impure functions equal, because anybody
+   changing the return value of the function within an expression would
+   violate the Fortran standard.  */
+
+static bool
+compare_functions (gfc_expr **ep1, gfc_expr **ep2)
+{
+  gfc_expr *e1, *e2;
+
+  e1 = *ep1;
+  e2 = *ep2;
+
+  if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION)
+    return false;
+
+  if ((e1->value.function.esym && e2->value.function.esym
+       && e1->value.function.esym == e2->value.function.esym)
+      || (e1->value.function.isym && e2->value.function.isym
+	  && e1->value.function.isym == e2->value.function.isym))
+    {
+      gfc_actual_arglist *args1, *args2;
+      
+      args1 = e1->value.function.actual;
+      args2 = e2->value.function.actual;
+
+      /* Compare the argument lists for equality.  */
+      while (args1 && args2)
+	{
+	  /*  Bitwise xor, since C has no non-bitwise xor operator.  */
+	  if ((args1->expr == NULL) ^ (args2->expr == NULL))
+	    return false;
+
+	  if (args1->expr != NULL && args2->expr != NULL
+	      && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
+	    return false;
+
+	  args1 = args1->next;
+	  args2 = args2->next;
+	}
+      return args1 == NULL && args2 == NULL;
+    }
+  else
+    return false;
+      
+}
+
+/* Callback function for gfc_expr_walker, called from cfe_expr_0.  Put all
+   eligible function expressions into expr_array.  We can't do allocatable
+   functions.  */
+
+static int
+cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  if ((*e)->expr_type != EXPR_FUNCTION)
+    return 0;
+
+  /* We don't do character functions (yet).  */
+  if ((*e)->ts.type == BT_CHARACTER)
+    return 0;
+
+  /* If we don't know the shape at compile time, we do not create a temporary
+     variable to hold the intermediate result.  FIXME: Change this later when
+     allocation on assignment works for intrinsics.  */
+
+  if ((*e)->rank > 0 && (*e)->shape == NULL)
+    return 0;
+  
+  if ((*e)->value.function.esym 
+      && (*e)->value.function.esym->attr.allocatable)
+    return 0;
+
+  if ((*e)->value.function.isym
+      && (*e)->value.function.isym->id == GFC_ISYM_CONVERSION)
+    return 0;
+
+  if (expr_count >= expr_size)
+    {
+      expr_size += expr_size;
+      expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size);
+    }
+  expr_array[expr_count] = e;
+  expr_count ++;
+  return 0;
+}
+
+/* Returns a new expression (a variable) to be used in place of the old one,
+   with an an assignment statement before the current statement to set
+   the value of the variable.  */
+
+ gfc_expr *create_var(gfc_expr *);
+
+gfc_expr*
+create_var (gfc_expr * e)
+{
+  char name[GFC_MAX_SYMBOL_LEN +1];
+  static int num = 1;
+  gfc_symtree *symtree;
+  gfc_symbol *symbol;
+  gfc_expr *result;
+  gfc_code *n;
+  int i;
+
+  sprintf(name, "__var_%d",num++);
+  if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0)
+    gcc_unreachable ();
+
+  symbol = symtree->n.sym;
+  symbol->ts = e->ts;
+  symbol->as = gfc_get_array_spec ();
+  symbol->as->rank = e->rank;
+  symbol->as->type = AS_EXPLICIT;
+  for (i=0; i<e->rank; i++)
+    {
+      gfc_expr *p, *q;
+      
+      p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
+				 &(e->where));
+      mpz_set_si (p->value.integer, 1);
+      symbol->as->lower[i] = p;
+	  
+      q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
+				 &(e->where));
+      mpz_set (q->value.integer, e->shape[i]);
+      symbol->as->upper[i] = q;
+    }
+
+  symbol->attr.flavor = FL_VARIABLE;
+  symbol->attr.referenced = 1;
+  symbol->attr.dimension = e->rank > 0;
+  gfc_commit_symbol (symbol);
+
+  result = gfc_get_expr ();
+  result->expr_type = EXPR_VARIABLE;
+  result->ts = e->ts;
+  result->rank = e->rank;
+  result->shape = gfc_copy_shape (e->shape, e->rank);
+  result->symtree = symtree;
+  result->where = e->where;
+  if (e->rank > 0)
+    {
+      result->ref = gfc_get_ref ();
+      result->ref->type = REF_ARRAY;
+      result->ref->u.ar.type = AR_FULL;
+      result->ref->u.ar.where = e->where;
+      result->ref->u.ar.as = symbol->as;
+    }
+
+  /* Generate the new assignment.  */
+  n = XCNEW (gfc_code);
+  n->op = EXEC_ASSIGN;
+  n->loc = (*current_code)->loc;
+  n->next = *current_code;
+  n->expr1 = gfc_copy_expr (result);
+  n->expr2 = e;
+  *current_code = n;
+
+  return result;
+}
+
+/* Callback function for the code walker for doing common function
+   elimination.  This builds up the list of functions in the expression
+   and goes through them to detect duplicates, which it then replaces
+   by variables.  */
+
+static int
+cfe_expr_0 (gfc_expr **e, int *walk_subtrees,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  int i,j;
+  gfc_expr *newvar;
+
+  expr_count = 0;
+  gfc_expr_walker (e, cfe_expr, NULL);
+  /* Walk backwards through all the functions to make sure we
+     catch the leaf functions first.  */
+  for (i=expr_count-1; i>=1; i--)
+    {
+      newvar = NULL;
+      for (j=i-1; j>=0; j--)
+	{
+	  if (compare_functions(expr_array[i], expr_array[j]))
+	    {
+	      if (newvar == NULL)
+		newvar = create_var (*(expr_array[i]));
+	      gfc_free (*(expr_array[j]));
+	      *(expr_array[j]) = gfc_copy_expr (newvar);
+	    }
+	}
+      if (newvar)
+	*(expr_array[i]) = newvar;
+    }
+
+  /* We did all the necessary walking in this function.  */
+  *walk_subtrees = 0;
+  return 0;
+}
+
+static int
+cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  current_code = c;
+  return 0;
+}
+
 /* Optimize a namespace, including all contained namespaces.  */
 
 static void
 optimize_namespace (gfc_namespace *ns)
 {
+
+  current_ns = ns;
+
+  gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
   gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL);
 
   for (ns = ns->contained; ns; ns = ns->sibling)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-20 23:32                 ` Mikael Morin
@ 2011-03-21  7:15                   ` Thomas Koenig
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Koenig @ 2011-03-21  7:15 UTC (permalink / raw)
  To: Mikael Morin; +Cc: fortran, gcc-patches

Hi Mikael,
>> Here is the new version of the patch.  Regression-tested.  OK for trunk?
> OK this time.
>
> Thank you.

Sende          fortran/ChangeLog
Sende          fortran/dependency.c
Sende          fortran/frontend-passes.c
Sende          fortran/gfortran.h
Sende          fortran/invoke.texi
Sende          fortran/lang.opt
Sende          fortran/options.c
Sende          testsuite/ChangeLog
HinzufÃŒgen     testsuite/gfortran.dg/function_optimize_1.f90
HinzufÃŒgen     testsuite/gfortran.dg/function_optimize_2.f90
Ãœbertrage Daten ..........
Revision 171207 ÃŒbertragen.

Thanks for the review!

	Thomas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-20 17:10               ` Thomas Koenig
@ 2011-03-20 23:32                 ` Mikael Morin
  2011-03-21  7:15                   ` Thomas Koenig
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Morin @ 2011-03-20 23:32 UTC (permalink / raw)
  To: fortran; +Cc: Thomas Koenig, gcc-patches

> Here is the new version of the patch.  Regression-tested.  OK for trunk?
OK this time.

Thank you.

Mikael

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-19 20:33             ` Mikael Morin
@ 2011-03-20 17:10               ` Thomas Koenig
  2011-03-20 23:32                 ` Mikael Morin
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koenig @ 2011-03-20 17:10 UTC (permalink / raw)
  To: fortran; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3385 bytes --]

Hi Mikael,


> - Array elementals: here it's hard to tell which one is faster; save function
> calls to a temporary and use the temporary or do multiple function calls every
> time (but without temporary).

I have removed elementals for now. It would probably be best to create
a scalar temporary variable during scalarzation.

> - Transpose optimization: here the transpose call is changed into a direct
> array access, so your patch will definitely make things worse. Even if
> transpose calls are used multiple times as actual argument for example, better
> create multiple descriptors than copy the whole lot to a temp. The common
> function elimination should be disabled (IMO) in this case.

Removed for transpose.


> Please use more descriptive names.
> There are 3 cfe* functions and only one has a comment explaining what cfe
> means.

Changed.


> Is it expected that you allow allocatables with -faggressive-function-
> elimination?

No, changed.

>> +
>> +         if (!(*e)->value.function.esym->attr.pure
>> +&&  !(*e)->value.function.esym->attr.implicit_pure)
>> +           return 0;
>> +       }
>> +    }
>> +
>> +  if ((*e)->value.function.isym)
>> +    {
>> +      if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION)
>> +       return 0;
>> +
>> +      if (! (*e)->value.function.isym->pure
>> +&&  !(*e)->value.function.isym->elemental)
> Tobias' comment also applies here, even if there is no intrinsic impure
> elemental: the following code in intrinsic.c makes the elemental check
> redundant.
>        next_sym->pure = (cl != CLASS_IMPURE);
>        next_sym->elemental = (cl == CLASS_ELEMENTAL);

Changed.


>> +  for (i=expr_count-1; i>=1; i--)
>> +    {
> Tiny optimization here ;-):
> /* Don't bother if the expression has been factored already.  */

Applied.

> if ((*(expr_array[i]))->expr_type == EXPR_VARIABLE)
>    continue;
>> +      newvar = NULL;
>



 >> +
 > Small comment here, or better, more descriptive name
>> +static int
>> +cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
>> +         void *data ATTRIBUTE_UNUSED)
>> +{

Better comment added.

I have also factored out the code with gfc_dep_compare_expression
into its own function.

Here is the new version of the patch.  Regression-tested.  OK for trunk?

	Thomas

2010-03-20  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/22572
	* gfortran.h (gfc_option_t) : Add
	flag_aggressive_function_elimination.
	(gfc_dep_compare_functions):  Add prototype.
	* lang.opt: Add faggressive-function-elimination.
	* invoke.texi: Document -faggressive-function-elimination.
	* frontend_passes (expr_array):  New static variable.
	(expr_size):  Likewise.
	(expr_count):  Likewise.
	(current_code):  Likewise.
	(current_ns):  Likewise.
	(gfc_run_passes):  Allocate and free space for expressions.
	(cfe_register_funcs):  New function.
	(create_var):  New function.
	(cfc_expr_0):  New function.
	(cfe_code):  New function.
	(optimize_namespace):  Invoke gfc_code_walker with cfe_code
	and cfe_expr_0.
	* dependency.c (gfc_dep_compare_functions):  New function.
	(gfc_dep_compare_expr):  Use it.
	* options.c (gfc_init_options):  Handle
	flag_aggressive_function_elimination.
	(gfc_handle_option):  Likewise.

2010-03-20  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/22572
	* gfortran.dg/function_optimize_1.f90:  New test.
	* gfortran.dg/function_optimize_2.f90:  New test.

[-- Attachment #2: function_optimize_1.f90 --]
[-- Type: text/plain, Size: 1469 bytes --]

! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  integer :: i
  real :: x, z
  character(60) :: line
  real, external :: ext_func
  interface
     elemental function element(x)
       real, intent(in) :: x
       real :: elem
     end function element
     pure function mypure(x)
       real, intent(in) :: x
       integer :: mypure
     end function mypure
     elemental impure function elem_impure(x)
       real, intent(in) :: x
       real :: elem_impure
     end function elem_impure
  end interface

  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  z = sin(x) + cos(x) + sin(x) + cos(x)
  print *,z
  x = ext_func(a) + 23 + ext_func(a)
  print *,d,x
  z = element(x) + element(x)
  print *,z
  i = mypure(x) - mypure(x)
  print *,i
  z = elem_impure(x) - elem_impure(x)
  print *,z
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 2 "original" } }
! { dg-final { scan-tree-dump-times "element" 1 "original" } }
! { dg-final { scan-tree-dump-times "mypure" 1 "original" } }
! { dg-final { scan-tree-dump-times "elem_impure" 2 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

[-- Attachment #3: function_optimize_2.f90 --]
[-- Type: text/plain, Size: 1513 bytes --]

! { dg-do compile }
! { dg-options "-O -faggressive-function-elimination -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  real :: x, z
  integer :: i
  character(60) :: line
  real, external :: ext_func
  interface
     elemental function element(x)
       real, intent(in) :: x
       real :: elem
     end function element
     pure function mypure(x)
       real, intent(in) :: x
       integer :: mypure
     end function mypure
     elemental impure function elem_impure(x)
       real, intent(in) :: x
       real :: elem_impure
     end function elem_impure
  end interface

  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  x = 1.2
  z = sin(x) + cos(x) + sin(x) + cos(x)
  print *,z
  x = ext_func(a) + 23 + ext_func(a)
  print *,d,x
  z = element(x) + element(x)
  print *,z
  i = mypure(x) - mypure(x)
  print *,i
  z = elem_impure(x) - elem_impure(x)
  print *,z
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 1 "original" } }
! { dg-final { scan-tree-dump-times "element" 1 "original" } }
! { dg-final { scan-tree-dump-times "mypure" 1 "original" } }
! { dg-final { scan-tree-dump-times "elem_impure" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

[-- Attachment #4: p12.diff --]
[-- Type: text/x-patch, Size: 13415 bytes --]

Index: gfortran.h
===================================================================
--- gfortran.h	(Revision 170960)
+++ gfortran.h	(Arbeitskopie)
@@ -2232,6 +2232,7 @@ typedef struct
   int flag_whole_file;
   int flag_protect_parens;
   int flag_realloc_lhs;
+  int flag_aggressive_function_elimination;
 
   int fpe;
   int rtcheck;
@@ -2865,6 +2866,7 @@ void gfc_global_used (gfc_gsymbol *, locus *);
 gfc_namespace* gfc_build_block_ns (gfc_namespace *);
 
 /* dependency.c */
+int gfc_dep_compare_functions (gfc_expr *, gfc_expr *, bool);
 int gfc_dep_compare_expr (gfc_expr *, gfc_expr *);
 
 /* check.c */
Index: lang.opt
===================================================================
--- lang.opt	(Revision 170960)
+++ lang.opt	(Arbeitskopie)
@@ -278,6 +278,10 @@ d
 Fortran Joined
 ; Documented in common.opt
 
+faggressive-function-elimination
+Fortran
+Eliminate multiple function invokations also for impure functions
+
 falign-commons
 Fortran
 Enable alignment of COMMON blocks
Index: invoke.texi
===================================================================
--- invoke.texi	(Revision 170960)
+++ invoke.texi	(Arbeitskopie)
@@ -1468,6 +1468,18 @@ need to be in effect.
 An allocatable left-hand side of an intrinsic assignment is automatically
 (re)allocated if it is either unallocated or has a different shape. The
 option is enabled by default except when @option{-std=f95} is given.
+
+@item -faggressive-function-elimination
+@opindex @code{faggressive-function-elimination}
+@cindex Elimination of functions with identical argument lists
+Functions with identical argument lists are eliminated within
+statements, regardless of whether these functions are marked
+@code{PURE} or not. For example, in
+@smallexample
+  a = f(b,c) + f(b,c)
+@end smallexample
+there will only be a single call to @code{f}.
+
 @end table
 
 @xref{Code Gen Options,,Options for Code Generation Conventions,
@@ -1475,7 +1487,6 @@ gcc,Using the GNU Compiler Collection (GCC)}, for
 offered by the GBE
 shared by @command{gfortran}, @command{gcc}, and other GNU compilers.
 
-
 @c man end
 
 @node Environment Variables
Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 170960)
+++ frontend-passes.c	(Arbeitskopie)
@@ -40,6 +40,21 @@ static bool optimize_trim (gfc_expr *);
 
 static int count_arglist;
 
+/* Pointer to an array of gfc_expr ** we operate on, plus its size
+   and counter.  */
+
+static gfc_expr ***expr_array;
+static int expr_size, expr_count;
+
+/* Pointer to the gfc_code we currently work on - to be able to insert
+   a statement before.  */
+
+static gfc_code **current_code;
+
+/* The namespace we are currently dealing with.  */
+
+gfc_namespace *current_ns;
+
 /* Entry point - run all passes for a namespace.  So far, only an
    optimization pass is run.  */
 
@@ -48,9 +63,16 @@ gfc_run_passes (gfc_namespace *ns)
 {
   if (optimize)
     {
+      expr_size = 20;
+      expr_array = XNEWVEC(gfc_expr **, expr_size);
+
       optimize_namespace (ns);
       if (gfc_option.dump_fortran_optimized)
 	gfc_dump_parse_tree (ns, stdout);
+
+      /* FIXME: The following should be XDELETEVEC(expr_array);
+      but we cannot do that because it depends on free.  */
+      gfc_free (expr_array);
     }
 }
 
@@ -106,11 +128,214 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT
   return 0;
 }
 
+
+/* Callback function for common function elimination, called from cfe_expr_0.
+   Put all eligible function expressions into expr_array.  We can't do
+   allocatable functions.  */
+
+static int
+cfe_register_funcs (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  if ((*e)->expr_type != EXPR_FUNCTION)
+    return 0;
+
+  /* We don't do character functions (yet).  */
+  if ((*e)->ts.type == BT_CHARACTER)
+    return 0;
+
+  /* If we don't know the shape at compile time, we do not create a temporary
+     variable to hold the intermediate result.  FIXME: Change this later when
+     allocation on assignment works for intrinsics.  */
+
+  if ((*e)->rank > 0 && (*e)->shape == NULL)
+    return 0;
+  
+  /* Skip the test for pure functions if -faggressive-function-elimination
+     is specified.  */
+  if ((*e)->value.function.esym)
+    {
+      if ((*e)->value.function.esym->attr.allocatable)
+	return 0;
+
+      /* Don't create an array temporary for elemental functions.  */
+      if ((*e)->value.function.esym->attr.elemental && (*e)->rank > 0)
+	return 0;
+
+      /* Only eliminate potentially impure functions if the
+	 user specifically requested it.  */
+      if (!gfc_option.flag_aggressive_function_elimination
+	  && !(*e)->value.function.esym->attr.pure
+	  && !(*e)->value.function.esym->attr.implicit_pure)
+	return 0;
+    }
+
+  if ((*e)->value.function.isym)
+    {
+      /* Conversions are handled on the fly by the middle end,
+	 transpose during trans-* stages.  */
+      if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION
+	  || (*e)->value.function.isym->id == GFC_ISYM_TRANSPOSE)
+	return 0;
+
+      /* Don't create an array temporary for elemental functions,
+	 as this would be wasteful of memory.
+	 FIXME: Create a scalar temporary during scalarization.  */
+      if ((*e)->value.function.isym->elemental && (*e)->rank > 0)
+	return 0;
+
+      if (!(*e)->value.function.isym->pure)
+	return 0;
+    }
+
+  if (expr_count >= expr_size)
+    {
+      expr_size += expr_size;
+      expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size);
+    }
+  expr_array[expr_count] = e;
+  expr_count ++;
+  return 0;
+}
+
+/* Returns a new expression (a variable) to be used in place of the old one,
+   with an an assignment statement before the current statement to set
+   the value of the variable.  */
+
+static gfc_expr*
+create_var (gfc_expr * e)
+{
+  char name[GFC_MAX_SYMBOL_LEN +1];
+  static int num = 1;
+  gfc_symtree *symtree;
+  gfc_symbol *symbol;
+  gfc_expr *result;
+  gfc_code *n;
+  int i;
+
+  sprintf(name, "__var_%d",num++);
+  if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0)
+    gcc_unreachable ();
+
+  symbol = symtree->n.sym;
+  symbol->ts = e->ts;
+  symbol->as = gfc_get_array_spec ();
+  symbol->as->rank = e->rank;
+  symbol->as->type = AS_EXPLICIT;
+  for (i=0; i<e->rank; i++)
+    {
+      gfc_expr *p, *q;
+      
+      p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
+				 &(e->where));
+      mpz_set_si (p->value.integer, 1);
+      symbol->as->lower[i] = p;
+	  
+      q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
+				 &(e->where));
+      mpz_set (q->value.integer, e->shape[i]);
+      symbol->as->upper[i] = q;
+    }
+
+  symbol->attr.flavor = FL_VARIABLE;
+  symbol->attr.referenced = 1;
+  symbol->attr.dimension = e->rank > 0;
+  gfc_commit_symbol (symbol);
+
+  result = gfc_get_expr ();
+  result->expr_type = EXPR_VARIABLE;
+  result->ts = e->ts;
+  result->rank = e->rank;
+  result->shape = gfc_copy_shape (e->shape, e->rank);
+  result->symtree = symtree;
+  result->where = e->where;
+  if (e->rank > 0)
+    {
+      result->ref = gfc_get_ref ();
+      result->ref->type = REF_ARRAY;
+      result->ref->u.ar.type = AR_FULL;
+      result->ref->u.ar.where = e->where;
+      result->ref->u.ar.as = symbol->as;
+    }
+
+  /* Generate the new assignment.  */
+  n = XCNEW (gfc_code);
+  n->op = EXEC_ASSIGN;
+  n->loc = (*current_code)->loc;
+  n->next = *current_code;
+  n->expr1 = gfc_copy_expr (result);
+  n->expr2 = e;
+  *current_code = n;
+
+  return result;
+}
+
+/* Callback function for the code walker for doing common function
+   elimination.  This builds up the list of functions in the expression
+   and goes through them to detect duplicates, which it then replaces
+   by variables.  */
+
+static int
+cfe_expr_0 (gfc_expr **e, int *walk_subtrees,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  int i,j;
+  gfc_expr *newvar;
+
+  expr_count = 0;
+
+  gfc_expr_walker (e, cfe_register_funcs, NULL);
+
+  /* Walk backwards through all the functions to make sure we
+     catch the leaf functions first.  */
+  for (i=expr_count-1; i>=1; i--)
+    {
+      /* Skip if the function has been replaced by a variable already.  */
+      if ((*(expr_array[i]))->expr_type == EXPR_VARIABLE)
+	continue;
+
+      newvar = NULL;
+      for (j=i-1; j>=0; j--)
+	{
+	  if (gfc_dep_compare_functions(*(expr_array[i]),
+					*(expr_array[j]), true)	== 0)
+	    {
+	      if (newvar == NULL)
+		newvar = create_var (*(expr_array[i]));
+	      gfc_free (*(expr_array[j]));
+	      *(expr_array[j]) = gfc_copy_expr (newvar);
+	    }
+	}
+      if (newvar)
+	*(expr_array[i]) = newvar;
+    }
+
+  /* We did all the necessary walking in this function.  */
+  *walk_subtrees = 0;
+  return 0;
+}
+
+/* Callback function for common function elimination, called from
+   gfc_code_walker.  This keeps track of the current code, in order
+   to insert statements as needed.  */
+
+static int
+cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  current_code = c;
+  return 0;
+}
+
 /* Optimize a namespace, including all contained namespaces.  */
 
 static void
 optimize_namespace (gfc_namespace *ns)
 {
+
+  current_ns = ns;
+
+  gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
   gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL);
 
   for (ns = ns->contained; ns; ns = ns->sibling)
Index: dependency.c
===================================================================
--- dependency.c	(Revision 170960)
+++ dependency.c	(Arbeitskopie)
@@ -177,16 +177,60 @@ gfc_are_identical_variables (gfc_expr *e1, gfc_exp
   return true;
 }
 
+/* Compare two functions for equality.  Only the first expression is known
+   to be a function.  Returns 0 if e1==e2, -2 otherwise.  If impure_ok is
+   false, only return 0 for pure functions.  */
+
+int
+gfc_dep_compare_functions (gfc_expr *e1, gfc_expr *e2, bool impure_ok)
+{
+
+  gfc_actual_arglist *args1;
+  gfc_actual_arglist *args2;
+  
+  if (e2->expr_type != EXPR_FUNCTION)
+    return -2;
+
+  if ((e1->value.function.esym && e2->value.function.esym
+       && e1->value.function.esym == e2->value.function.esym
+       && (e1->value.function.esym->result->attr.pure || impure_ok))
+       || (e1->value.function.isym && e2->value.function.isym
+	   && e1->value.function.isym == e2->value.function.isym
+	   && (e1->value.function.isym->pure || impure_ok)))
+    {
+      args1 = e1->value.function.actual;
+      args2 = e2->value.function.actual;
+
+      /* Compare the argument lists for equality.  */
+      while (args1 && args2)
+	{
+	  /*  Bitwise xor, since C has no non-bitwise xor operator.  */
+	  if ((args1->expr == NULL) ^ (args2->expr == NULL))
+	    return -2;
+	  
+	  if (args1->expr != NULL && args2->expr != NULL
+	      && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
+	    return -2;
+	  
+	  args1 = args1->next;
+	  args2 = args2->next;
+	}
+      return (args1 || args2) ? -2 : 0;
+    }
+      else
+	return -2;      
+}
+
 /* Compare two values.  Returns 0 if e1 == e2, -1 if e1 < e2, +1 if e1 > e2,
    and -2 if the relationship could not be determined.  */
 
 int
 gfc_dep_compare_expr (gfc_expr *e1, gfc_expr *e2)
 {
+  int i;
+  gfc_expr *n1, *n2;
   gfc_actual_arglist *args1;
   gfc_actual_arglist *args2;
-  int i;
-  gfc_expr *n1, *n2;
 
   n1 = NULL;
   n2 = NULL;
@@ -399,36 +443,7 @@ gfc_dep_compare_expr (gfc_expr *e1, gfc_expr *e2)
       return -2;
 
     case EXPR_FUNCTION:
-
-      /* PURE functions can be compared for argument equality.  */
-      if ((e1->value.function.esym && e2->value.function.esym
-	   && e1->value.function.esym == e2->value.function.esym
-	   && e1->value.function.esym->result->attr.pure)
-	  || (e1->value.function.isym && e2->value.function.isym
-	      && e1->value.function.isym == e2->value.function.isym
-	      && e1->value.function.isym->pure))
-	{
-	  args1 = e1->value.function.actual;
-	  args2 = e2->value.function.actual;
-
-	  /* Compare the argument lists for equality.  */
-	  while (args1 && args2)
-	    {
-	      /*  Bitwise xor, since C has no non-bitwise xor operator.  */
-	      if ((args1->expr == NULL) ^ (args2->expr == NULL))
-		return -2;
-
-	      if (args1->expr != NULL && args2->expr != NULL
-		  && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
-		return -2;
-
-	      args1 = args1->next;
-	      args2 = args2->next;
-	    }
-	  return (args1 || args2) ? -2 : 0;
-	}
-      else
-	return -2;
+      return gfc_dep_compare_functions (e1, e2, false);
       break;
 
     default:
Index: options.c
===================================================================
--- options.c	(Revision 170960)
+++ options.c	(Arbeitskopie)
@@ -150,6 +150,7 @@ gfc_init_options (unsigned int decoded_options_cou
   gfc_option.flag_align_commons = 1;
   gfc_option.flag_protect_parens = 1;
   gfc_option.flag_realloc_lhs = -1;
+  gfc_option.flag_aggressive_function_elimination = 0;
   
   gfc_option.fpe = 0;
   gfc_option.rtcheck = 0;
@@ -972,6 +973,10 @@ gfc_handle_option (size_t scode, const char *arg,
       gfc_option.flag_align_commons = value;
       break;
 
+    case  OPT_faggressive_function_elimination:
+      gfc_option.flag_aggressive_function_elimination = value;
+      break;
+
     case OPT_fprotect_parens:
       gfc_option.flag_protect_parens = value;
       break;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-19 19:00           ` Thomas Koenig
@ 2011-03-19 20:33             ` Mikael Morin
  2011-03-20 17:10               ` Thomas Koenig
  0 siblings, 1 reply; 9+ messages in thread
From: Mikael Morin @ 2011-03-19 20:33 UTC (permalink / raw)
  To: fortran; +Cc: Thomas Koenig, gcc-patches

On Saturday 19 March 2011 19:59:56 Thomas Koenig wrote:
> Am 19.03.2011 00:23, schrieb Tobias Burnus:
> > I have not followed the discussion nor have I fully read the patch, but
> > what's the reason for allowing ELEMENTAL functions?
> 
> Here's an updated version of the patch, which removes the elemental
> functions as well.  I have also added an option which allows full access
> to all function call eliminations, so if any user wants it, it is there.
>   (I know I will use it :-)  This option is not enabled by any
> optimization option.
> 
> Regression-tested.  Before committing, I'll check on the status of the
> gfc_free removal patch, and re-test.  Also tested with "make dvi" and
> "make info".
> 
> OK for trunk?
Not yet, comment here, nits below.


I'm a bit worried about the patch possibly conflicting with other 
optimizations at code generation time. I'm thinking especially about cases 
where we manage not to create a temporary without the patch:
- Inline intrinsics have only been for scalar cases so far, so it should not 
matter for now.
- Array elementals: here it's hard to tell which one is faster; save function 
calls to a temporary and use the temporary or do multiple function calls every 
time (but without temporary). 
- Transpose optimization: here the transpose call is changed into a direct 
array access, so your patch will definitely make things worse. Even if 
transpose calls are used multiple times as actual argument for example, better 
create multiple descriptors than copy the whole lot to a temp. The common 
function elimination should be disabled (IMO) in this case.

Mikael.


> Index: frontend-passes.c
> ===================================================================
> --- frontend-passes.c   (Revision 170960)
> +++ frontend-passes.c   (Arbeitskopie)
> @@ -106,11 +128,237 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT
>    return 0;
>  }
>  
> +/* Compare two functions for equality.  We could use gfc_dep_compare_expr
> +   except that we also consider impure functions equal, because anybody
> +   changing the return value of the function within an expression would
> +   violate the Fortran standard.  */
Given how much the codes match, it looks like one can move the relevant 
gfc_dep_compare_expr code to a function, use that function, driving the small 
differences with a flag, no?
> +
> +static bool
> +compare_functions (gfc_expr **ep1, gfc_expr **ep2)
> +{
> +  gfc_expr *e1, *e2;
> +
> +  e1 = *ep1;
> +  e2 = *ep2;
> +
> +  if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION)
> +    return false;
> +
> +  if ((e1->value.function.esym && e2->value.function.esym
> +       && e1->value.function.esym == e2->value.function.esym)
> +      || (e1->value.function.isym && e2->value.function.isym
> +         && e1->value.function.isym == e2->value.function.isym))
> +    {
> +      gfc_actual_arglist *args1, *args2;
> +      
> +      args1 = e1->value.function.actual;
> +      args2 = e2->value.function.actual;
> +
> +      /* Compare the argument lists for equality.  */
> +      while (args1 && args2)
> +       {
> +         /*  Bitwise xor, since C has no non-bitwise xor operator.  */
> +         if ((args1->expr == NULL) ^ (args2->expr == NULL))
> +           return false;
> +
> +         if (args1->expr != NULL && args2->expr != NULL
> +             && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
> +           return false;
> +
> +         args1 = args1->next;
> +         args2 = args2->next;
> +       }
> +      return args1 == NULL && args2 == NULL;
> +    }
> +  else
> +    return false;
> +      
> +}
> +
> +/* Callback function for gfc_expr_walker, called from cfe_expr_0.  Put all
> +   eligible function expressions into expr_array.  We can't do allocatable
> +   functions.  */
> +
> +static int
> +cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
> +         void *data ATTRIBUTE_UNUSED)
Please use more descriptive names.
There are 3 cfe* functions and only one has a comment explaining what cfe 
means.
Something like "register_function_expr" seems to match what this function 
does, but you may propose something better ;-)
> +{
> +  if ((*e)->expr_type != EXPR_FUNCTION)
> +    return 0;
> +
> +  /* We don't do character functions (yet).  */
> +  if ((*e)->ts.type == BT_CHARACTER)
> +    return 0;
> +
> +  /* If we don't know the shape at compile time, we do not create a 
temporary
> +     variable to hold the intermediate result.  FIXME: Change this later 
when
> +     allocation on assignment works for intrinsics.  */
> +
> +  if ((*e)->rank > 0 && (*e)->shape == NULL)
> +    return 0;
> +  
> +  /* Skip the test for pure functions if -faggressive-function-elimination
> +     is specified.  */
> +  if (!gfc_option.flag_aggressive_function_elimination)
> +    {
> +      if ((*e)->value.function.esym)
> +       {
> +         if ((*e)->value.function.esym->attr.allocatable)
> +           return 0;
Is it expected that you allow allocatables with -faggressive-function-
elimination?

> +
> +         if (!(*e)->value.function.esym->attr.pure
> +             && !(*e)->value.function.esym->attr.implicit_pure)
> +           return 0;
> +       }
> +    }
> +
> +  if ((*e)->value.function.isym)
> +    {
> +      if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION)
> +       return 0;
> +
> +      if (! (*e)->value.function.isym->pure
> +         && !(*e)->value.function.isym->elemental)
Tobias' comment also applies here, even if there is no intrinsic impure 
elemental: the following code in intrinsic.c makes the elemental check 
redundant.
      next_sym->pure = (cl != CLASS_IMPURE);
      next_sym->elemental = (cl == CLASS_ELEMENTAL);

> +       return 0;
> +    }
> +
> +  if (expr_count >= expr_size)
> +    {
> +      expr_size += expr_size;
> +      expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size);
> +    }
> +  expr_array[expr_count] = e;
> +  expr_count ++;
> +  return 0;
> +}
> +
> +/* Returns a new expression (a variable) to be used in place of the old 
one,
> +   with an an assignment statement before the current statement to set
> +   the value of the variable.  */
> +
> +static gfc_expr*
> +create_var (gfc_expr * e)
> +{
> +  char name[GFC_MAX_SYMBOL_LEN +1];
> +  static int num = 1;
> +  gfc_symtree *symtree;
> +  gfc_symbol *symbol;
> +  gfc_expr *result;
> +  gfc_code *n;
> +  int i;
> +
> +  sprintf(name, "__var_%d",num++);
> +  if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0)
> +    gcc_unreachable ();
> +
> +  symbol = symtree->n.sym;
> +  symbol->ts = e->ts;
> +  symbol->as = gfc_get_array_spec ();
> +  symbol->as->rank = e->rank;
> +  symbol->as->type = AS_EXPLICIT;
> +  for (i=0; i<e->rank; i++)
> +    {
> +      gfc_expr *p, *q;
> +      
> +      p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
> +                                &(e->where));
> +      mpz_set_si (p->value.integer, 1);
> +      symbol->as->lower[i] = p;
> +         
> +      q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
> +                                &(e->where));
> +      mpz_set (q->value.integer, e->shape[i]);
> +      symbol->as->upper[i] = q;
> +    }
> +
> +  symbol->attr.flavor = FL_VARIABLE;
> +  symbol->attr.referenced = 1;
> +  symbol->attr.dimension = e->rank > 0;
> +  gfc_commit_symbol (symbol);
> +
> +  result = gfc_get_expr ();
> +  result->expr_type = EXPR_VARIABLE;
> +  result->ts = e->ts;
> +  result->rank = e->rank;
> +  result->shape = gfc_copy_shape (e->shape, e->rank);
> +  result->symtree = symtree;
> +  result->where = e->where;
> +  if (e->rank > 0)
> +    {
> +      result->ref = gfc_get_ref ();
> +      result->ref->type = REF_ARRAY;
> +      result->ref->u.ar.type = AR_FULL;
> +      result->ref->u.ar.where = e->where;
> +      result->ref->u.ar.as = symbol->as;
> +    }
> +
> +  /* Generate the new assignment.  */
> +  n = XCNEW (gfc_code);
> +  n->op = EXEC_ASSIGN;
> +  n->loc = (*current_code)->loc;
> +  n->next = *current_code;
> +  n->expr1 = gfc_copy_expr (result);
> +  n->expr2 = e;
> +  *current_code = n;
> +
> +  return result;
> +}
> +
> +/* Callback function for the code walker for doing common function
> +   elimination.  This builds up the list of functions in the expression
> +   and goes through them to detect duplicates, which it then replaces
> +   by variables.  */
> +
> +static int
> +cfe_expr_0 (gfc_expr **e, int *walk_subtrees,
> +         void *data ATTRIBUTE_UNUSED)
> +{
> +  int i,j;
> +  gfc_expr *newvar;
> +
> +  expr_count = 0;
> +  gfc_expr_walker (e, cfe_expr, NULL);
> +  /* Walk backwards through all the functions to make sure we
> +     catch the leaf functions first.  */
> +  for (i=expr_count-1; i>=1; i--)
> +    {
Tiny optimization here ;-):
/* Don't bother if the expression has been factored already.  */
if ((*(expr_array[i]))->expr_type == EXPR_VARIABLE)
  continue;
> +      newvar = NULL;
> +      for (j=i-1; j>=0; j--)
> +       {
> +         if (compare_functions(expr_array[i], expr_array[j]))
> +           {
> +             if (newvar == NULL)
> +               newvar = create_var (*(expr_array[i]));
> +             gfc_free (*(expr_array[j]));
> +             *(expr_array[j]) = gfc_copy_expr (newvar);
> +           }
> +       }
> +      if (newvar)
> +       *(expr_array[i]) = newvar;
> +    }
> +
> +  /* We did all the necessary walking in this function.  */
> +  *walk_subtrees = 0;
> +  return 0;
> +}
> +
Small comment here, or better, more descriptive name
> +static int
> +cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
> +         void *data ATTRIBUTE_UNUSED)
> +{
> +  current_code = c;
> +  return 0;
> +}
> +
>  /* Optimize a namespace, including all contained namespaces.  */
>  
>  static void
>  optimize_namespace (gfc_namespace *ns)
>  {
> +
> +  current_ns = ns;
> +
> +  gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
>    gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL);
>  
>    for (ns = ns->contained; ns; ns = ns->sibling)
> Index: options.c
> ===================================================================
> --- options.c   (Revision 170960)
> +++ options.c   (Arbeitskopie)
> @@ -150,6 +150,7 @@ gfc_init_options (unsigned int decoded_options_cou
>    gfc_option.flag_align_commons = 1;
>    gfc_option.flag_protect_parens = 1;
>    gfc_option.flag_realloc_lhs = -1;
> +  gfc_option.flag_aggressive_function_elimination = 0;
>    
>    gfc_option.fpe = 0;
>    gfc_option.rtcheck = 0;
> @@ -972,6 +973,10 @@ gfc_handle_option (size_t scode, const char *arg,
>        gfc_option.flag_align_commons = value;
>        break;
>  
> +    case  OPT_faggressive_function_elimination:
> +      gfc_option.flag_aggressive_function_elimination = value;
> +      break;
> +
>      case OPT_fprotect_parens:
>        gfc_option.flag_protect_parens = value;
>        break;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-18 23:23         ` Tobias Burnus
  2011-03-19  8:51           ` N.M. Maclaren
@ 2011-03-19 19:00           ` Thomas Koenig
  2011-03-19 20:33             ` Mikael Morin
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Koenig @ 2011-03-19 19:00 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1836 bytes --]

Am 19.03.2011 00:23, schrieb Tobias Burnus:
> I have not followed the discussion nor have I fully read the patch, but
> what's the reason for allowing ELEMENTAL functions?

Here's an updated version of the patch, which removes the elemental 
functions as well.  I have also added an option which allows full access 
to all function call eliminations, so if any user wants it, it is there. 
  (I know I will use it :-)  This option is not enabled by any 
optimization option.

Regression-tested.  Before committing, I'll check on the status of the
gfc_free removal patch, and re-test.  Also tested with "make dvi" and
"make info".

OK for trunk?

	Thomas

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * gfortran.h (gfc_option_t) : Add
         flag_aggressive_function_elimination.
         * lang.opt: Add faggressive-function-elimination.
         * invoke.texi: Document -faggressive-function-elimination.
         * frontend_passes (expr_array):  New static variable.
         (expr_size):  Likewise.
         (expr_count):  Likewise.
         (current_code):  Likewise.
         (current_ns):  Likewise.
         (gfc_run_passes):  Allocate and free space for expressions.
         (compare_functions):  New function.
         (cfe_expr):  New function.
         (create_var):  New function.
         (cfc_expr_0):  New function.
         (cfe_code):  New function.
         (optimize_namespace):  Invoke gfc_code_walker with cfe_code
         and cfe_expr_0.
         * options.c (gfc_init_options):  Handle
         flag_aggressive_function_elimination.
         (gfc_handle_option):  Likewise.

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * gfortran.dg/function_optimize_1.f90:  New test.
         * gfortran.dg/function_optimize_2.f90:  New test.

[-- Attachment #2: p11.diff --]
[-- Type: text/x-patch, Size: 10350 bytes --]

Index: gfortran.h
===================================================================
--- gfortran.h	(Revision 170960)
+++ gfortran.h	(Arbeitskopie)
@@ -2232,6 +2232,7 @@ typedef struct
   int flag_whole_file;
   int flag_protect_parens;
   int flag_realloc_lhs;
+  int flag_aggressive_function_elimination;
 
   int fpe;
   int rtcheck;
Index: lang.opt
===================================================================
--- lang.opt	(Revision 170960)
+++ lang.opt	(Arbeitskopie)
@@ -278,6 +278,10 @@ d
 Fortran Joined
 ; Documented in common.opt
 
+faggressive-function-elimination
+Fortran
+Eliminate multiple function invokations also for impure functions
+
 falign-commons
 Fortran
 Enable alignment of COMMON blocks
Index: invoke.texi
===================================================================
--- invoke.texi	(Revision 170960)
+++ invoke.texi	(Arbeitskopie)
@@ -1468,6 +1468,18 @@ need to be in effect.
 An allocatable left-hand side of an intrinsic assignment is automatically
 (re)allocated if it is either unallocated or has a different shape. The
 option is enabled by default except when @option{-std=f95} is given.
+
+@item -faggressive-function-elimination
+@opindex @code{faggressive-function-elimination}
+@cindex Elimination of functions with identical argument lists
+Functions with identical argument lists are eliminated within
+statements, regardless of whether these functions are marked
+@code{PURE} or not. For example, in
+@smallexample
+  a = f(b,c) + f(b,c)
+@end smallexample
+there will only be a single call to @code{f}.
+
 @end table
 
 @xref{Code Gen Options,,Options for Code Generation Conventions,
@@ -1475,7 +1487,6 @@ gcc,Using the GNU Compiler Collection (GCC)}, for
 offered by the GBE
 shared by @command{gfortran}, @command{gcc}, and other GNU compilers.
 
-
 @c man end
 
 @node Environment Variables
Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 170960)
+++ frontend-passes.c	(Arbeitskopie)
@@ -40,6 +40,21 @@ static bool optimize_trim (gfc_expr *);
 
 static int count_arglist;
 
+/* Pointer to an array of gfc_expr ** we operate on, plus its size
+   and counter.  */
+
+static gfc_expr ***expr_array;
+static int expr_size, expr_count;
+
+/* Pointer to the gfc_code we currently work on - to be able to insert
+   a statement before.  */
+
+static gfc_code **current_code;
+
+/* The namespace we are currently dealing with.  */
+
+gfc_namespace *current_ns;
+
 /* Entry point - run all passes for a namespace.  So far, only an
    optimization pass is run.  */
 
@@ -48,9 +63,16 @@ gfc_run_passes (gfc_namespace *ns)
 {
   if (optimize)
     {
+      expr_size = 20;
+      expr_array = XNEWVEC(gfc_expr **, expr_size);
+
       optimize_namespace (ns);
       if (gfc_option.dump_fortran_optimized)
 	gfc_dump_parse_tree (ns, stdout);
+
+      /* FIXME: The following should be XDELETEVEC(expr_array);
+      but we cannot do that because it depends on free.  */
+      gfc_free (expr_array);
     }
 }
 
@@ -106,11 +128,237 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT
   return 0;
 }
 
+/* Compare two functions for equality.  We could use gfc_dep_compare_expr
+   except that we also consider impure functions equal, because anybody
+   changing the return value of the function within an expression would
+   violate the Fortran standard.  */
+
+static bool
+compare_functions (gfc_expr **ep1, gfc_expr **ep2)
+{
+  gfc_expr *e1, *e2;
+
+  e1 = *ep1;
+  e2 = *ep2;
+
+  if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION)
+    return false;
+
+  if ((e1->value.function.esym && e2->value.function.esym
+       && e1->value.function.esym == e2->value.function.esym)
+      || (e1->value.function.isym && e2->value.function.isym
+	  && e1->value.function.isym == e2->value.function.isym))
+    {
+      gfc_actual_arglist *args1, *args2;
+      
+      args1 = e1->value.function.actual;
+      args2 = e2->value.function.actual;
+
+      /* Compare the argument lists for equality.  */
+      while (args1 && args2)
+	{
+	  /*  Bitwise xor, since C has no non-bitwise xor operator.  */
+	  if ((args1->expr == NULL) ^ (args2->expr == NULL))
+	    return false;
+
+	  if (args1->expr != NULL && args2->expr != NULL
+	      && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
+	    return false;
+
+	  args1 = args1->next;
+	  args2 = args2->next;
+	}
+      return args1 == NULL && args2 == NULL;
+    }
+  else
+    return false;
+      
+}
+
+/* Callback function for gfc_expr_walker, called from cfe_expr_0.  Put all
+   eligible function expressions into expr_array.  We can't do allocatable
+   functions.  */
+
+static int
+cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  if ((*e)->expr_type != EXPR_FUNCTION)
+    return 0;
+
+  /* We don't do character functions (yet).  */
+  if ((*e)->ts.type == BT_CHARACTER)
+    return 0;
+
+  /* If we don't know the shape at compile time, we do not create a temporary
+     variable to hold the intermediate result.  FIXME: Change this later when
+     allocation on assignment works for intrinsics.  */
+
+  if ((*e)->rank > 0 && (*e)->shape == NULL)
+    return 0;
+  
+  /* Skip the test for pure functions if -faggressive-function-elimination
+     is specified.  */
+  if (!gfc_option.flag_aggressive_function_elimination)
+    {
+      if ((*e)->value.function.esym)
+	{
+	  if ((*e)->value.function.esym->attr.allocatable)
+	    return 0;
+
+	  if (!(*e)->value.function.esym->attr.pure
+	      && !(*e)->value.function.esym->attr.implicit_pure)
+	    return 0;
+	}
+    }
+
+  if ((*e)->value.function.isym)
+    {
+      if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION)
+	return 0;
+
+      if (! (*e)->value.function.isym->pure
+	  && !(*e)->value.function.isym->elemental)
+	return 0;
+    }
+
+  if (expr_count >= expr_size)
+    {
+      expr_size += expr_size;
+      expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size);
+    }
+  expr_array[expr_count] = e;
+  expr_count ++;
+  return 0;
+}
+
+/* Returns a new expression (a variable) to be used in place of the old one,
+   with an an assignment statement before the current statement to set
+   the value of the variable.  */
+
+static gfc_expr*
+create_var (gfc_expr * e)
+{
+  char name[GFC_MAX_SYMBOL_LEN +1];
+  static int num = 1;
+  gfc_symtree *symtree;
+  gfc_symbol *symbol;
+  gfc_expr *result;
+  gfc_code *n;
+  int i;
+
+  sprintf(name, "__var_%d",num++);
+  if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0)
+    gcc_unreachable ();
+
+  symbol = symtree->n.sym;
+  symbol->ts = e->ts;
+  symbol->as = gfc_get_array_spec ();
+  symbol->as->rank = e->rank;
+  symbol->as->type = AS_EXPLICIT;
+  for (i=0; i<e->rank; i++)
+    {
+      gfc_expr *p, *q;
+      
+      p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
+				 &(e->where));
+      mpz_set_si (p->value.integer, 1);
+      symbol->as->lower[i] = p;
+	  
+      q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
+				 &(e->where));
+      mpz_set (q->value.integer, e->shape[i]);
+      symbol->as->upper[i] = q;
+    }
+
+  symbol->attr.flavor = FL_VARIABLE;
+  symbol->attr.referenced = 1;
+  symbol->attr.dimension = e->rank > 0;
+  gfc_commit_symbol (symbol);
+
+  result = gfc_get_expr ();
+  result->expr_type = EXPR_VARIABLE;
+  result->ts = e->ts;
+  result->rank = e->rank;
+  result->shape = gfc_copy_shape (e->shape, e->rank);
+  result->symtree = symtree;
+  result->where = e->where;
+  if (e->rank > 0)
+    {
+      result->ref = gfc_get_ref ();
+      result->ref->type = REF_ARRAY;
+      result->ref->u.ar.type = AR_FULL;
+      result->ref->u.ar.where = e->where;
+      result->ref->u.ar.as = symbol->as;
+    }
+
+  /* Generate the new assignment.  */
+  n = XCNEW (gfc_code);
+  n->op = EXEC_ASSIGN;
+  n->loc = (*current_code)->loc;
+  n->next = *current_code;
+  n->expr1 = gfc_copy_expr (result);
+  n->expr2 = e;
+  *current_code = n;
+
+  return result;
+}
+
+/* Callback function for the code walker for doing common function
+   elimination.  This builds up the list of functions in the expression
+   and goes through them to detect duplicates, which it then replaces
+   by variables.  */
+
+static int
+cfe_expr_0 (gfc_expr **e, int *walk_subtrees,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  int i,j;
+  gfc_expr *newvar;
+
+  expr_count = 0;
+  gfc_expr_walker (e, cfe_expr, NULL);
+  /* Walk backwards through all the functions to make sure we
+     catch the leaf functions first.  */
+  for (i=expr_count-1; i>=1; i--)
+    {
+      newvar = NULL;
+      for (j=i-1; j>=0; j--)
+	{
+	  if (compare_functions(expr_array[i], expr_array[j]))
+	    {
+	      if (newvar == NULL)
+		newvar = create_var (*(expr_array[i]));
+	      gfc_free (*(expr_array[j]));
+	      *(expr_array[j]) = gfc_copy_expr (newvar);
+	    }
+	}
+      if (newvar)
+	*(expr_array[i]) = newvar;
+    }
+
+  /* We did all the necessary walking in this function.  */
+  *walk_subtrees = 0;
+  return 0;
+}
+
+static int
+cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  current_code = c;
+  return 0;
+}
+
 /* Optimize a namespace, including all contained namespaces.  */
 
 static void
 optimize_namespace (gfc_namespace *ns)
 {
+
+  current_ns = ns;
+
+  gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
   gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL);
 
   for (ns = ns->contained; ns; ns = ns->sibling)
Index: options.c
===================================================================
--- options.c	(Revision 170960)
+++ options.c	(Arbeitskopie)
@@ -150,6 +150,7 @@ gfc_init_options (unsigned int decoded_options_cou
   gfc_option.flag_align_commons = 1;
   gfc_option.flag_protect_parens = 1;
   gfc_option.flag_realloc_lhs = -1;
+  gfc_option.flag_aggressive_function_elimination = 0;
   
   gfc_option.fpe = 0;
   gfc_option.rtcheck = 0;
@@ -972,6 +973,10 @@ gfc_handle_option (size_t scode, const char *arg,
       gfc_option.flag_align_commons = value;
       break;
 
+    case  OPT_faggressive_function_elimination:
+      gfc_option.flag_aggressive_function_elimination = value;
+      break;
+
     case OPT_fprotect_parens:
       gfc_option.flag_protect_parens = value;
       break;

[-- Attachment #3: function_optimize_1.f90 --]
[-- Type: text/plain, Size: 1454 bytes --]

! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  integer :: i
  character(60) :: line
  real, external :: ext_func
  interface
     elemental function element(x)
       real, intent(in) :: x
       real :: elem
     end function element
     pure function mypure(x)
       real, intent(in) :: x
       integer :: mypure
     end function mypure
     elemental impure function elem_impure(x)
       real, intent(in) :: x
       real :: elem_impure
     end function elem_impure
  end interface

  real :: x
  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  d = sin(a) + cos(a) + sin(a) + cos(a)
  x = ext_func(a) + 23 + ext_func(a)
  print *,d,x
  d = element(x) + element(x)
  print *,d
  i = mypure(x) - mypure(x)
  print *,i
  d = elem_impure(x) - elem_impure(x)
  print *,d
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 2 "original" } }
! { dg-final { scan-tree-dump-times "element" 1 "original" } }
! { dg-final { scan-tree-dump-times "mypure" 1 "original" } }
! { dg-final { scan-tree-dump-times "elem_impure" 2 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

[-- Attachment #4: function_optimize_2.f90 --]
[-- Type: text/plain, Size: 1488 bytes --]

! { dg-do compile }
! { dg-options "-O -faggressive-function-elimination -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  integer :: i
  character(60) :: line
  real, external :: ext_func
  interface
     elemental function element(x)
       real, intent(in) :: x
       real :: elem
     end function element
     pure function mypure(x)
       real, intent(in) :: x
       integer :: mypure
     end function mypure
     elemental impure function elem_impure(x)
       real, intent(in) :: x
       real :: elem_impure
     end function elem_impure
  end interface

  real :: x
  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  d = sin(a) + cos(a) + sin(a) + cos(a)
  x = ext_func(a) + 23 + ext_func(a)
  print *,d,x
  d = element(x) + element(x)
  print *,d
  i = mypure(x) - mypure(x)
  print *,i
  d = elem_impure(x) - elem_impure(x)
  print *,d
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 1 "original" } }
! { dg-final { scan-tree-dump-times "element" 1 "original" } }
! { dg-final { scan-tree-dump-times "mypure" 1 "original" } }
! { dg-final { scan-tree-dump-times "elem_impure" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-18 23:23         ` Tobias Burnus
@ 2011-03-19  8:51           ` N.M. Maclaren
  2011-03-19 19:00           ` Thomas Koenig
  1 sibling, 0 replies; 9+ messages in thread
From: N.M. Maclaren @ 2011-03-19  8:51 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: Thomas Koenig, fortran, gcc-patches

On Mar 18 2011, Tobias Burnus wrote:

>Thomas Koenig wrote:
>> +      if (!(*e)->value.function.esym->attr.pure
>> +	&&  !(*e)->value.function.esym->attr.implicit_pure
>> +	&&  !(*e)->value.function.esym->attr.elemental)
>> +	return 0;
>
>I have not followed the discussion nor have I fully read the patch, but 
>what's the reason for allowing ELEMENTAL functions? I understand the 
>PURE and the implicitly pure part. But without looking at the 
>scalarizer, I would assume that the same reasons which speak against 
>non-elemental impure functions should also speak against IMPURE 
>ELEMENTAL functions, don't they?

Impure elemental procedures came in only in Fortran 2008, if I understand
it.  I don't know for certain, but my guess is that they are there
primarily to allow diagnostics in elemental procedures.  What that means
about their required semantics is a bit unclear ....


Regards,
Nick Maclaren.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
  2011-03-16 21:25       ` Thomas Koenig
@ 2011-03-18 23:23         ` Tobias Burnus
  2011-03-19  8:51           ` N.M. Maclaren
  2011-03-19 19:00           ` Thomas Koenig
  0 siblings, 2 replies; 9+ messages in thread
From: Tobias Burnus @ 2011-03-18 23:23 UTC (permalink / raw)
  To: Thomas Koenig; +Cc: fortran, gcc-patches

Thomas Koenig wrote:
> +      if (!(*e)->value.function.esym->attr.pure
> +	&&  !(*e)->value.function.esym->attr.implicit_pure
> +	&&  !(*e)->value.function.esym->attr.elemental)
> +	return 0;

I have not followed the discussion nor have I fully read the patch, but 
what's the reason for allowing ELEMENTAL functions? I understand the 
PURE and the implicitly pure part. But without looking at the 
scalarizer, I would assume that the same reasons which speak against 
non-elemental impure functions should also speak against IMPURE 
ELEMENTAL functions, don't they?

Tobias

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [patch, fortran] Function call optimization
       [not found]     ` <201103152342.45139.mikael.morin@sfr.fr>
@ 2011-03-16 21:25       ` Thomas Koenig
  2011-03-18 23:23         ` Tobias Burnus
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Koenig @ 2011-03-16 21:25 UTC (permalink / raw)
  To: fortran, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1263 bytes --]

Am 15.03.2011 23:42, schrieb Mikael Morin:

> On second thought, maybe you're right; the speed-up / likeliness-to-break
> ratio doesn't look so interesting after all, and selecting only
> pure/implicitely pure functions for optimization would get rid of the weird
> cases without (hopefully) being too restrictive on the candidates for
> optimization.

Since this appears to be the consensus, here is an updated version of 
the patch which does indeed that.


Regression-tested.  OK for trunk?

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * frontend_passes (expr_array):  New static variable.
         (expr_size):  Likewise.
         (expr_count):  Likewise.
         (current_code):  Likewise.
         (current_ns):  Likewise.
         (gfc_run_passes):  Allocate and free space for expressions.
         (compare_functions):  New function.
         (cfe_expr):  New function.
         (create_var):  New function.
         (cfc_expr_0):  New function.
         (cfe_code):  New function.
         (optimize_namespace):  Invoke gfc_code_walker with cfe_code
         and cfe_expr_0.

2010-03-14  Thomas Koenig  <tkoenig@gcc.gnu.org>

         PR fortran/22572
         * gfortran.dg/function_optimize_1.f90:  New test.

[-- Attachment #2: p9.diff --]
[-- Type: text/x-patch, Size: 7578 bytes --]

Index: frontend-passes.c
===================================================================
--- frontend-passes.c	(Revision 170960)
+++ frontend-passes.c	(Arbeitskopie)
@@ -40,6 +40,21 @@ static bool optimize_trim (gfc_expr *);
 
 static int count_arglist;
 
+/* Pointer to an array of gfc_expr ** we operate on, plus its size
+   and counter.  */
+
+static gfc_expr ***expr_array;
+static int expr_size, expr_count;
+
+/* Pointer to the gfc_code we currently work on - to be able to insert
+   a statement before.  */
+
+static gfc_code **current_code;
+
+/* The namespace we are currently dealing with.  */
+
+gfc_namespace *current_ns;
+
 /* Entry point - run all passes for a namespace.  So far, only an
    optimization pass is run.  */
 
@@ -48,9 +63,16 @@ gfc_run_passes (gfc_namespace *ns)
 {
   if (optimize)
     {
+      expr_size = 20;
+      expr_array = XNEWVEC(gfc_expr **, expr_size);
+
       optimize_namespace (ns);
       if (gfc_option.dump_fortran_optimized)
 	gfc_dump_parse_tree (ns, stdout);
+
+      /* FIXME: The following should be XDELETEVEC(expr_array);
+      but we cannot do that because it depends on free.  */
+      gfc_free (expr_array);
     }
 }
 
@@ -106,11 +128,233 @@ optimize_expr (gfc_expr **e, int *walk_subtrees AT
   return 0;
 }
 
+/* Compare two functions for equality.  We could use gfc_dep_compare_expr
+   except that we also consider impure functions equal, because anybody
+   changing the return value of the function within an expression would
+   violate the Fortran standard.  */
+
+static bool
+compare_functions (gfc_expr **ep1, gfc_expr **ep2)
+{
+  gfc_expr *e1, *e2;
+
+  e1 = *ep1;
+  e2 = *ep2;
+
+  if (e1->expr_type != EXPR_FUNCTION || e2->expr_type != EXPR_FUNCTION)
+    return false;
+
+  if ((e1->value.function.esym && e2->value.function.esym
+       && e1->value.function.esym == e2->value.function.esym)
+      || (e1->value.function.isym && e2->value.function.isym
+	  && e1->value.function.isym == e2->value.function.isym))
+    {
+      gfc_actual_arglist *args1, *args2;
+      
+      args1 = e1->value.function.actual;
+      args2 = e2->value.function.actual;
+
+      /* Compare the argument lists for equality.  */
+      while (args1 && args2)
+	{
+	  /*  Bitwise xor, since C has no non-bitwise xor operator.  */
+	  if ((args1->expr == NULL) ^ (args2->expr == NULL))
+	    return false;
+
+	  if (args1->expr != NULL && args2->expr != NULL
+	      && gfc_dep_compare_expr (args1->expr, args2->expr) != 0)
+	    return false;
+
+	  args1 = args1->next;
+	  args2 = args2->next;
+	}
+      return args1 == NULL && args2 == NULL;
+    }
+  else
+    return false;
+      
+}
+
+/* Callback function for gfc_expr_walker, called from cfe_expr_0.  Put all
+   eligible function expressions into expr_array.  We can't do allocatable
+   functions.  */
+
+static int
+cfe_expr (gfc_expr **e, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  if ((*e)->expr_type != EXPR_FUNCTION)
+    return 0;
+
+  /* We don't do character functions (yet).  */
+  if ((*e)->ts.type == BT_CHARACTER)
+    return 0;
+
+  /* If we don't know the shape at compile time, we do not create a temporary
+     variable to hold the intermediate result.  FIXME: Change this later when
+     allocation on assignment works for intrinsics.  */
+
+  if ((*e)->rank > 0 && (*e)->shape == NULL)
+    return 0;
+  
+  if ((*e)->value.function.esym)
+    {
+      if ((*e)->value.function.esym->attr.allocatable)
+	return 0;
+
+      if (!(*e)->value.function.esym->attr.pure
+	  && !(*e)->value.function.esym->attr.implicit_pure
+	  && !(*e)->value.function.esym->attr.elemental)
+	return 0;
+    }
+
+  if ((*e)->value.function.isym)
+    {
+      if ((*e)->value.function.isym->id == GFC_ISYM_CONVERSION)
+	return 0;
+
+      if (! (*e)->value.function.isym->pure
+	  && !(*e)->value.function.isym->elemental)
+	return 0;
+    }
+
+  if (expr_count >= expr_size)
+    {
+      expr_size += expr_size;
+      expr_array = XRESIZEVEC(gfc_expr **, expr_array, expr_size);
+    }
+  expr_array[expr_count] = e;
+  expr_count ++;
+  return 0;
+}
+
+/* Returns a new expression (a variable) to be used in place of the old one,
+   with an an assignment statement before the current statement to set
+   the value of the variable.  */
+
+static gfc_expr*
+create_var (gfc_expr * e)
+{
+  char name[GFC_MAX_SYMBOL_LEN +1];
+  static int num = 1;
+  gfc_symtree *symtree;
+  gfc_symbol *symbol;
+  gfc_expr *result;
+  gfc_code *n;
+  int i;
+
+  sprintf(name, "__var_%d",num++);
+  if (gfc_get_sym_tree (name, current_ns, &symtree, false) != 0)
+    gcc_unreachable ();
+
+  symbol = symtree->n.sym;
+  symbol->ts = e->ts;
+  symbol->as = gfc_get_array_spec ();
+  symbol->as->rank = e->rank;
+  symbol->as->type = AS_EXPLICIT;
+  for (i=0; i<e->rank; i++)
+    {
+      gfc_expr *p, *q;
+      
+      p = gfc_get_constant_expr (BT_INTEGER, gfc_default_integer_kind,
+				 &(e->where));
+      mpz_set_si (p->value.integer, 1);
+      symbol->as->lower[i] = p;
+	  
+      q = gfc_get_constant_expr (BT_INTEGER, gfc_index_integer_kind,
+				 &(e->where));
+      mpz_set (q->value.integer, e->shape[i]);
+      symbol->as->upper[i] = q;
+    }
+
+  symbol->attr.flavor = FL_VARIABLE;
+  symbol->attr.referenced = 1;
+  symbol->attr.dimension = e->rank > 0;
+  gfc_commit_symbol (symbol);
+
+  result = gfc_get_expr ();
+  result->expr_type = EXPR_VARIABLE;
+  result->ts = e->ts;
+  result->rank = e->rank;
+  result->shape = gfc_copy_shape (e->shape, e->rank);
+  result->symtree = symtree;
+  result->where = e->where;
+  if (e->rank > 0)
+    {
+      result->ref = gfc_get_ref ();
+      result->ref->type = REF_ARRAY;
+      result->ref->u.ar.type = AR_FULL;
+      result->ref->u.ar.where = e->where;
+      result->ref->u.ar.as = symbol->as;
+    }
+
+  /* Generate the new assignment.  */
+  n = XCNEW (gfc_code);
+  n->op = EXEC_ASSIGN;
+  n->loc = (*current_code)->loc;
+  n->next = *current_code;
+  n->expr1 = gfc_copy_expr (result);
+  n->expr2 = e;
+  *current_code = n;
+
+  return result;
+}
+
+/* Callback function for the code walker for doing common function
+   elimination.  This builds up the list of functions in the expression
+   and goes through them to detect duplicates, which it then replaces
+   by variables.  */
+
+static int
+cfe_expr_0 (gfc_expr **e, int *walk_subtrees,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  int i,j;
+  gfc_expr *newvar;
+
+  expr_count = 0;
+  gfc_expr_walker (e, cfe_expr, NULL);
+  /* Walk backwards through all the functions to make sure we
+     catch the leaf functions first.  */
+  for (i=expr_count-1; i>=1; i--)
+    {
+      newvar = NULL;
+      for (j=i-1; j>=0; j--)
+	{
+	  if (compare_functions(expr_array[i], expr_array[j]))
+	    {
+	      if (newvar == NULL)
+		newvar = create_var (*(expr_array[i]));
+	      gfc_free (*(expr_array[j]));
+	      *(expr_array[j]) = gfc_copy_expr (newvar);
+	    }
+	}
+      if (newvar)
+	*(expr_array[i]) = newvar;
+    }
+
+  /* We did all the necessary walking in this function.  */
+  *walk_subtrees = 0;
+  return 0;
+}
+
+static int
+cfe_code (gfc_code **c, int *walk_subtrees ATTRIBUTE_UNUSED,
+	  void *data ATTRIBUTE_UNUSED)
+{
+  current_code = c;
+  return 0;
+}
+
 /* Optimize a namespace, including all contained namespaces.  */
 
 static void
 optimize_namespace (gfc_namespace *ns)
 {
+
+  current_ns = ns;
+
+  gfc_code_walker (&ns->code, cfe_code, cfe_expr_0, NULL);
   gfc_code_walker (&ns->code, optimize_code, optimize_expr, NULL);
 
   for (ns = ns->contained; ns; ns = ns->sibling)

[-- Attachment #3: function_optimize_1.f90 --]
[-- Type: text/plain, Size: 1190 bytes --]

! { dg-do compile }
! { dg-options "-O -fdump-tree-original" }
program main
  implicit none
  real, dimension(2,2) :: a, b, c, d
  integer :: i
  character(60) :: line
  real, external :: ext_func
  interface
     elemental function elem(x)
       real, intent(in) :: x
       real :: elem
     end function elem
     pure function mypure(x)
       real, intent(in) :: x
       integer :: mypure
     end function mypure
  end interface

  real :: x
  data a /2., 3., 5., 7./
  data b /11., 13., 17., 23./
  write (unit=line, fmt='(4F7.2)') matmul(a,b) + matmul(a,b)
  d = sin(a) + cos(a) + sin(a) + cos(a)
  x = ext_func(a) + 23 + ext_func(a)
  print *,d,x
  d = elem(x) + elem(x)
  print *,d
  i = mypure(x) - mypure(x)
  print *,i
end program main
! { dg-final { scan-tree-dump-times "matmul_r4" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_sinf" 1 "original" } }
! { dg-final { scan-tree-dump-times "__builtin_cosf" 1 "original" } }
! { dg-final { scan-tree-dump-times "ext_func" 2 "original" } }
! { dg-final { scan-tree-dump-times "elem" 1 "original" } }
! { dg-final { scan-tree-dump-times "mypure" 1 "original" } }
! { dg-final { cleanup-tree-dump "original" } }

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-03-21  7:15 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-14 23:12 [patch, fortran] Function call optimization Thomas Koenig
     [not found] <46835D054D115449BBE84934E2F2000907821C71@ARLEXCHVS02.lst.link.l-3com.com>
     [not found] ` <201103151951.43233.mikael.morin@sfr.fr>
     [not found]   ` <4D7FBBA7.2080904@domob.eu>
     [not found]     ` <201103152342.45139.mikael.morin@sfr.fr>
2011-03-16 21:25       ` Thomas Koenig
2011-03-18 23:23         ` Tobias Burnus
2011-03-19  8:51           ` N.M. Maclaren
2011-03-19 19:00           ` Thomas Koenig
2011-03-19 20:33             ` Mikael Morin
2011-03-20 17:10               ` Thomas Koenig
2011-03-20 23:32                 ` Mikael Morin
2011-03-21  7:15                   ` Thomas Koenig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).