public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency  handling for MVBITS
@ 2008-10-27 19:43 Daniel Kraft
  2008-10-27 21:29 ` Paul Richard Thomas
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Daniel Kraft @ 2008-10-27 19:43 UTC (permalink / raw)
  To: Fortran List, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3352 bytes --]

Hi all,

working on PR fortran/35681, I've got some rather big patch now handling 
part of the problem.  What it exactly does:

1) Some tab-indentation formatting fixes as I came along, sorry for 
those.  I hope it is ok so.

2) When resolving a MVBITS intrinsic call, the code->resolved_sym gets a 
dummy formal argument list with the correct INTENTs specified; this is 
needed later for gfc_conv_elemental_dependencies.

3) gfc_code got a new member "resolved_isym" that tracks calls to 
intrinsic procedures, so we can later check if some call is to intrinsic 
MVBITS.  This got a little ugly and would be probably nicer to union it 
(and possibly "resolved_sym", too) with actual, but that would probably 
introduce a lot of changes to existing code pieces.

4) gfc_trans_allocate_array_storage (or what it is called) got a new 
argument `initial' that allows to initialize the created storage from 
some other array (this is done using a combination of internal_pack and 
memcpy if it was already packed, I hope I got this all right).  This is 
used for gfc_trans_create_temp_array to allow initializing the new 
temporary.  Here is (probably) most of the "critical" changes.

5) For calls to intrinsic MVBITS, I enabled dependency checking using 
gfc_conv_elemental_dependencies and made this routine aware of 
INTENT(INOUT) arguments that use the new initialization feature to copy 
over the initial content of the mirrored array to the created temporary.

6) I could not find a test to verify this (not even one that uses 
gfc_conv_elemental_dependencies) in a quick trial, but I believe the 
handling of the temporary there was wrong, in that it was free'd (if 
allocated on the heap) *before* it was used with internal_unpack, 
because gfc_trans_create_temp_array added the temporary clean-up code to 
se->post and the unpack-call was added to se->post later.  In my 
opinion, this is some rather general problem with how post-commands are 
usually added to other post blocks; shouldn't they be added to the top 
usually rather than to the bottom, to get some sort of "nested" scope 
with inner most pairs of pre/post?  Well, for now I changed this 
behaviour inside gfc_conv_elemental_dependencies, which corrected 
problems I got with MVBITS tests.

This enabled the (valid) tests in the PR to run, but only with modifying 
them slightly by removing the parentheses around the first argument (so 
it is not an expression; that will be part 2 of this fix).  As I 
understand it, this is valid in case of MVBITS but not for any other 
ELEMENTAL subroutine, right?  This is why I added the check for whether 
some call is to MVBITS.  I guess the rationale why the compiler is not 
required to create temporaries for all such ELEMENTAL calls (and they 
are invalid instead) is performance?  gfortran could handle those calls 
well in addition to only MVBITS calls simply if I take this conditional 
check out, but then we might generate temporaries for cases where the 
user knows no one is needed and the code is valid but the compiler can't 
figure it out.

I hope I got this one at least somewhat clear...  What do you think 
about it?  Currently regression-testing on GNU/Linux-x86-32, but I don't 
expect any (a very similar patch worked fine before).

Cheers,
Daniel

-- 
Done:  Arc-Bar-Cav-Rog-Sam-Val-Wiz
To go: Hea-Kni-Mon-Pri-Ran-Tou

[-- Attachment #2: patch.changelog --]
[-- Type: text/plain, Size: 1494 bytes --]

2008-10-27  Daniel Kraft  <d@domob.eu>

	PR fortran/35681
	* gfortran.h (struct gfc_code): New field `resolved_isym'.
	* trans.h (gfc_build_memcpy_call): Made public.
	* trans-array.h (gfc_trans_create_temp_array): New argument `initial'.
	* intrinsic.c (gfc_intrinsic_sub_interface): Set resolved_isym.
	* iresolve.c (create_formal_for_intents): New helper method.
	(gfc_resolve_mvbits): Put dummy formal arglist on resolved_sym.
	* resolve.c (resolve_call): Initialize resolved_isym to NULL.
	* trans-array.c (gfc_trans_allocate_array_storage): New argument
	`initial' to allow initializing the allocated storage to some initial
	value copied from another array.
	(gfc_trans_create_temp_array): Allow initialization of the temporary
	with a copy of some other array by using the new extension.
	(gfc_trans_array_constructor): Pass NULL_TREE for initial argument.
	(gfc_conv_loop_setup): Ditto.
	* trans-intrinsic.c (gfc_conv_intrinsic_array_transfer): Ditto.
	* trans-expr.c (gfc_conv_function_call): Ditto.
	(gfc_build_memcpy_call): Made public.
	* trans-stmt.c (gfc_conv_elemental_dependencies): Initialize created
	temporary for INTENT(INOUT) arguments to the value of the mirrored
	array and clean up the temporary as very last intructions in the created
	block.
	* trans.c (gfc_trans_code): For EXEC_CALL, see if we have a MVBITS call
	and enable elemental dependency checking if we have.

2008-10-27  Daniel Kraft  <d@domob.eu>

	PR fortran/35681
	* gfortran.dg/mvbits_4.f90: New test.

[-- Attachment #3: patch.diff --]
[-- Type: text/plain, Size: 23769 bytes --]

Index: gcc/fortran/intrinsic.c
===================================================================
*** gcc/fortran/intrinsic.c	(revision 141345)
--- gcc/fortran/intrinsic.c	(working copy)
*************** gfc_intrinsic_sub_interface (gfc_code *c
*** 3746,3751 ****
--- 3746,3752 ----
    if (!error_flag)
      gfc_pop_suppress_errors ();
  
+   c->resolved_isym = isym;
    if (isym->resolve.s1 != NULL)
      isym->resolve.s1 (c);
    else
Index: gcc/fortran/trans-expr.c
===================================================================
*** gcc/fortran/trans-expr.c	(revision 141345)
--- gcc/fortran/trans-expr.c	(working copy)
*************** gfc_conv_function_call (gfc_se * se, gfc
*** 2862,2869 ****
  	     mustn't be deallocated.  */
  	  callee_alloc = sym->attr.allocatable || sym->attr.pointer;
  	  gfc_trans_create_temp_array (&se->pre, &se->post, se->loop, info, tmp,
! 				       false, !sym->attr.pointer, callee_alloc,
! 				       &se->ss->expr->where);
  
  	  /* Pass the temporary as the first argument.  */
  	  tmp = info->descriptor;
--- 2862,2869 ----
  	     mustn't be deallocated.  */
  	  callee_alloc = sym->attr.allocatable || sym->attr.pointer;
  	  gfc_trans_create_temp_array (&se->pre, &se->post, se->loop, info, tmp,
! 				       NULL_TREE, false, !sym->attr.pointer,
! 				       callee_alloc, &se->ss->expr->where);
  
  	  /* Pass the temporary as the first argument.  */
  	  tmp = info->descriptor;
*************** gfc_trans_zero_assign (gfc_expr * expr)
*** 4383,4389 ****
  /* Helper for gfc_trans_array_copy and gfc_trans_array_constructor_copy
     that constructs the call to __builtin_memcpy.  */
  
! static tree
  gfc_build_memcpy_call (tree dst, tree src, tree len)
  {
    tree tmp;
--- 4383,4389 ----
  /* Helper for gfc_trans_array_copy and gfc_trans_array_constructor_copy
     that constructs the call to __builtin_memcpy.  */
  
! tree
  gfc_build_memcpy_call (tree dst, tree src, tree len)
  {
    tree tmp;
Index: gcc/fortran/trans-array.c
===================================================================
*** gcc/fortran/trans-array.c	(revision 141345)
--- gcc/fortran/trans-array.c	(working copy)
*************** gfc_set_loop_bounds_from_array_spec (gfc
*** 493,506 ****
     callee will allocate the array.  If DEALLOC is true, also generate code to
     free the array afterwards.
  
     Initialization code is added to PRE and finalization code to POST.
     DYNAMIC is true if the caller may want to extend the array later
     using realloc.  This prevents us from putting the array on the stack.  */
  
  static void
  gfc_trans_allocate_array_storage (stmtblock_t * pre, stmtblock_t * post,
!                                   gfc_ss_info * info, tree size, tree nelem,
!                                   bool dynamic, bool dealloc)
  {
    tree tmp;
    tree desc;
--- 493,509 ----
     callee will allocate the array.  If DEALLOC is true, also generate code to
     free the array afterwards.
  
+    If INITIAL is not NULL, it is packed using internal_pack and the result used
+    as data instead of allocating a fresh, unitialized area of memory.
+ 
     Initialization code is added to PRE and finalization code to POST.
     DYNAMIC is true if the caller may want to extend the array later
     using realloc.  This prevents us from putting the array on the stack.  */
  
  static void
  gfc_trans_allocate_array_storage (stmtblock_t * pre, stmtblock_t * post,
! 				  gfc_ss_info * info, tree size, tree nelem,
! 				  tree initial, bool dynamic, bool dealloc)
  {
    tree tmp;
    tree desc;
*************** gfc_trans_allocate_array_storage (stmtbl
*** 517,523 ****
    else
      {
        /* Allocate the temporary.  */
!       onstack = !dynamic && gfc_can_put_var_on_stack (size);
  
        if (onstack)
  	{
--- 520,527 ----
    else
      {
        /* Allocate the temporary.  */
!       onstack = !dynamic && initial == NULL_TREE
! 			 && gfc_can_put_var_on_stack (size);
  
        if (onstack)
  	{
*************** gfc_trans_allocate_array_storage (stmtbl
*** 534,542 ****
  	}
        else
  	{
! 	  /* Allocate memory to hold the data.  */
! 	  tmp = gfc_call_malloc (pre, NULL, size);
! 	  tmp = gfc_evaluate_now (tmp, pre);
  	  gfc_conv_descriptor_data_set (pre, desc, tmp);
  	}
      }
--- 538,590 ----
  	}
        else
  	{
! 	  /* Allocate memory to hold the data or call internal_pack.  */
! 	  if (initial == NULL_TREE)
! 	    {
! 	      tmp = gfc_call_malloc (pre, NULL, size);
! 	      tmp = gfc_evaluate_now (tmp, pre);
! 	    }
! 	  else
! 	    {
! 	      tree packed;
! 	      tree source_data;
! 	      tree was_packed;
! 	      stmtblock_t do_copying;
! 
! 	      tmp = TREE_TYPE (initial); /* Pointer to descriptor.  */
! 	      gcc_assert (TREE_CODE (tmp) == POINTER_TYPE);
! 	      tmp = TREE_TYPE (tmp); /* The descriptor itself.  */
! 	      tmp = gfc_get_element_type (tmp);
! 	      gcc_assert (tmp == gfc_get_element_type (TREE_TYPE (desc)));
! 	      packed = gfc_create_var (build_pointer_type (tmp), "data");
! 
! 	      tmp = build_call_expr (gfor_fndecl_in_pack, 1, initial);
! 	      tmp = fold_convert (TREE_TYPE (packed), tmp);
! 	      gfc_add_modify (pre, packed, tmp);
! 
! 	      tmp = build_fold_indirect_ref (initial);
! 	      source_data = gfc_conv_descriptor_data_get (tmp);
! 
! 	      /* internal_pack may return source->data without any allocation
! 		 or copying if it is already packed.  If that's the case, we
! 		 need to allocate and copy manually.  */
! 
! 	      gfc_start_block (&do_copying);
! 	      tmp = gfc_call_malloc (&do_copying, NULL, size);
! 	      tmp = fold_convert (TREE_TYPE (packed), tmp);
! 	      gfc_add_modify (&do_copying, packed, tmp);
! 	      tmp = gfc_build_memcpy_call (packed, source_data, size);
! 	      gfc_add_expr_to_block (&do_copying, tmp);
! 
! 	      was_packed = fold_build2 (EQ_EXPR, boolean_type_node,
! 					packed, source_data);
! 	      tmp = gfc_finish_block (&do_copying);
! 	      tmp = build3_v (COND_EXPR, was_packed, tmp, build_empty_stmt ());
! 	      gfc_add_expr_to_block (pre, tmp);
! 
! 	      tmp = fold_convert (pvoid_type_node, packed);
! 	    }
! 
  	  gfc_conv_descriptor_data_set (pre, desc, tmp);
  	}
      }
*************** gfc_trans_allocate_array_storage (stmtbl
*** 567,580 ****
     fields of info if known.  Returns the size of the array, or NULL for a
     callee allocated array.
  
!    PRE, POST, DYNAMIC and DEALLOC are as for gfc_trans_allocate_array_storage.
   */
  
  tree
  gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post,
  			     gfc_loopinfo * loop, gfc_ss_info * info,
! 			     tree eltype, bool dynamic, bool dealloc,
! 			     bool callee_alloc, locus * where)
  {
    tree type;
    tree desc;
--- 615,629 ----
     fields of info if known.  Returns the size of the array, or NULL for a
     callee allocated array.
  
!    PRE, POST, INITIAL, DYNAMIC and DEALLOC are as for
!    gfc_trans_allocate_array_storage.
   */
  
  tree
  gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post,
  			     gfc_loopinfo * loop, gfc_ss_info * info,
! 			     tree eltype, tree initial, bool dynamic,
! 			     bool dealloc, bool callee_alloc, locus * where)
  {
    tree type;
    tree desc;
*************** gfc_trans_create_temp_array (stmtblock_t
*** 600,607 ****
        if (n >= loop->temp_dim)
  	{
  	  /* Callee allocated arrays may not have a known bound yet.  */
!           if (loop->to[n])
!               loop->to[n] = fold_build2 (MINUS_EXPR, gfc_array_index_type,
  					 loop->to[n], loop->from[n]);
  	  loop->from[n] = gfc_index_zero_node;
  	}
--- 649,656 ----
        if (n >= loop->temp_dim)
  	{
  	  /* Callee allocated arrays may not have a known bound yet.  */
! 	  if (loop->to[n])
! 	      loop->to[n] = fold_build2 (MINUS_EXPR, gfc_array_index_type,
  					 loop->to[n], loop->from[n]);
  	  loop->from[n] = gfc_index_zero_node;
  	}
*************** gfc_trans_create_temp_array (stmtblock_t
*** 635,641 ****
         {
  	 stride[n] = size
  	 delta = ubound[n] + 1 - lbound[n];
!          size = size * delta;
         }
       size = size * sizeof(element);
    */
--- 684,690 ----
         {
  	 stride[n] = size
  	 delta = ubound[n] + 1 - lbound[n];
! 	 size = size * delta;
         }
       size = size * sizeof(element);
    */
*************** gfc_trans_create_temp_array (stmtblock_t
*** 645,662 ****
    for (n = 0; n < info->dimen; n++)
      {
        if (loop->to[n] == NULL_TREE)
!         {
  	  /* For a callee allocated array express the loop bounds in terms
  	     of the descriptor fields.  */
!           tmp =
  	    fold_build2 (MINUS_EXPR, gfc_array_index_type,
  			 gfc_conv_descriptor_ubound (desc, gfc_rank_cst[n]),
  			 gfc_conv_descriptor_lbound (desc, gfc_rank_cst[n]));
!           loop->to[n] = tmp;
!           size = NULL_TREE;
!           continue;
!         }
!         
        /* Store the stride and bound components in the descriptor.  */
        tmp = gfc_conv_descriptor_stride (desc, gfc_rank_cst[n]);
        gfc_add_modify (pre, tmp, size);
--- 694,711 ----
    for (n = 0; n < info->dimen; n++)
      {
        if (loop->to[n] == NULL_TREE)
! 	{
  	  /* For a callee allocated array express the loop bounds in terms
  	     of the descriptor fields.  */
! 	  tmp =
  	    fold_build2 (MINUS_EXPR, gfc_array_index_type,
  			 gfc_conv_descriptor_ubound (desc, gfc_rank_cst[n]),
  			 gfc_conv_descriptor_lbound (desc, gfc_rank_cst[n]));
! 	  loop->to[n] = tmp;
! 	  size = NULL_TREE;
! 	  continue;
! 	}
! 	
        /* Store the stride and bound components in the descriptor.  */
        tmp = gfc_conv_descriptor_stride (desc, gfc_rank_cst[n]);
        gfc_add_modify (pre, tmp, size);
*************** gfc_trans_create_temp_array (stmtblock_t
*** 704,711 ****
        size = NULL_TREE;
      }
  
!   gfc_trans_allocate_array_storage (pre, post, info, size, nelem, dynamic,
! 			            dealloc);
  
    if (info->dimen > loop->temp_dim)
      loop->temp_dim = info->dimen;
--- 753,760 ----
        size = NULL_TREE;
      }
  
!   gfc_trans_allocate_array_storage (pre, post, info, size, nelem, initial,
! 				    dynamic, dealloc);
  
    if (info->dimen > loop->temp_dim)
      loop->temp_dim = info->dimen;
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1818,1824 ****
      loopfrom = NULL_TREE;
  
    gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
! 			       type, dynamic, true, false, where);
  
    if (loopfrom != NULL_TREE)
      {
--- 1867,1873 ----
      loopfrom = NULL_TREE;
  
    gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
! 			       type, NULL_TREE, dynamic, true, false, where);
  
    if (loopfrom != NULL_TREE)
      {
*************** gfc_conv_loop_setup (gfc_loopinfo * loop
*** 3539,3546 ****
        loop->temp_ss->type = GFC_SS_SECTION;
        loop->temp_ss->data.info.dimen = n;
        gfc_trans_create_temp_array (&loop->pre, &loop->post, loop,
! 				   &loop->temp_ss->data.info, tmp, false, true,
! 				   false, where);
      }
  
    for (n = 0; n < loop->temp_dim; n++)
--- 3588,3595 ----
        loop->temp_ss->type = GFC_SS_SECTION;
        loop->temp_ss->data.info.dimen = n;
        gfc_trans_create_temp_array (&loop->pre, &loop->post, loop,
! 				   &loop->temp_ss->data.info, tmp, NULL_TREE,
! 				   false, true, false, where);
      }
  
    for (n = 0; n < loop->temp_dim; n++)
Index: gcc/fortran/trans-array.h
===================================================================
*** gcc/fortran/trans-array.h	(revision 141345)
--- gcc/fortran/trans-array.h	(working copy)
*************** void gfc_set_loop_bounds_from_array_spec
*** 32,38 ****
  
  /* Generate code to create a temporary array.  */
  tree gfc_trans_create_temp_array (stmtblock_t *, stmtblock_t *, gfc_loopinfo *,
!                                   gfc_ss_info *, tree, bool, bool, bool,
  				  locus *);
  
  /* Generate function entry code for allocation of compiler allocated array
--- 32,38 ----
  
  /* Generate code to create a temporary array.  */
  tree gfc_trans_create_temp_array (stmtblock_t *, stmtblock_t *, gfc_loopinfo *,
! 				  gfc_ss_info *, tree, tree, bool, bool, bool,
  				  locus *);
  
  /* Generate function entry code for allocation of compiler allocated array
Index: gcc/fortran/gfortran.h
===================================================================
*** gcc/fortran/gfortran.h	(revision 141345)
--- gcc/fortran/gfortran.h	(working copy)
*************** typedef struct gfc_code
*** 1886,1891 ****
--- 1886,1892 ----
       symbol for the interface definition.
    const char *sub_name;  */
    gfc_symbol *resolved_sym;
+   gfc_intrinsic_sym *resolved_isym;
  
    union
    {
Index: gcc/fortran/trans-stmt.c
===================================================================
*** gcc/fortran/trans-stmt.c	(revision 141345)
--- gcc/fortran/trans-stmt.c	(working copy)
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 251,256 ****
--- 251,259 ----
  	    && gfc_check_fncall_dependency (e, fsym->attr.intent,
  					    sym, arg0))
  	{
+ 	  tree initial;
+ 	  stmtblock_t temp_post;
+ 
  	  /* Make a local loopinfo for the temporary creation, so that
  	     none of the other ss->info's have to be renormalized.  */
  	  gfc_init_loopinfo (&tmp_loop);
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 261,287 ****
  	      tmp_loop.order[n] = loopse->loop->order[n];
  	    }
  
  	  /* Generate the temporary.  Merge the block so that the
! 	     declarations are put at the right binding level.  */
  	  size = gfc_create_var (gfc_array_index_type, NULL);
  	  data = gfc_create_var (pvoid_type_node, NULL);
  	  gfc_start_block (&block);
  	  tmp = gfc_typenode_for_spec (&e->ts);
! 	  tmp = gfc_trans_create_temp_array (&se->pre, &se->post,
! 					      &tmp_loop, info, tmp,
! 					      false, true, false,
! 					     & arg->expr->where);
  	  gfc_add_modify (&se->pre, size, tmp);
  	  tmp = fold_convert (pvoid_type_node, info->data);
  	  gfc_add_modify (&se->pre, data, tmp);
  	  gfc_merge_block_scope (&block);
  
- 	  /* Obtain the argument descriptor for unpacking.  */
- 	  gfc_init_se (&parmse, NULL);
- 	  parmse.want_pointer = 1;
- 	  gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
- 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
- 
  	  /* Calculate the offset for the temporary.  */
  	  offset = gfc_index_zero_node;
  	  for (n = 0; n < info->dimen; n++)
--- 264,301 ----
  	      tmp_loop.order[n] = loopse->loop->order[n];
  	    }
  
+ 	  /* Obtain the argument descriptor for unpacking.  */
+ 	  gfc_init_se (&parmse, NULL);
+ 	  parmse.want_pointer = 1;
+ 	  gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
+ 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
+ 
+ 	  /* If we've got INTENT(INOUT), initialize the array temporary with
+ 	     a copy of the values.  */
+ 	  if (fsym->attr.intent == INTENT_INOUT)
+ 	    initial = parmse.expr;
+ 	  else
+ 	    initial = NULL_TREE;
+ 
  	  /* Generate the temporary.  Merge the block so that the
! 	     declarations are put at the right binding level.  Cleaning up the
! 	     temporary should be the very last thing done, so we add the code to
! 	     a new block and add it to se->post as last instructions.  */
  	  size = gfc_create_var (gfc_array_index_type, NULL);
  	  data = gfc_create_var (pvoid_type_node, NULL);
  	  gfc_start_block (&block);
+ 	  gfc_init_block (&temp_post);
  	  tmp = gfc_typenode_for_spec (&e->ts);
! 	  tmp = gfc_trans_create_temp_array (&se->pre, &temp_post,
! 					     &tmp_loop, info, tmp,
! 					     initial,
! 					     false, true, false,
! 					     &arg->expr->where);
  	  gfc_add_modify (&se->pre, size, tmp);
  	  tmp = fold_convert (pvoid_type_node, info->data);
  	  gfc_add_modify (&se->pre, data, tmp);
  	  gfc_merge_block_scope (&block);
  
  	  /* Calculate the offset for the temporary.  */
  	  offset = gfc_index_zero_node;
  	  for (n = 0; n < info->dimen; n++)
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 296,306 ****
--- 310,325 ----
  	  info->offset = gfc_create_var (gfc_array_index_type, NULL);	  
  	  gfc_add_modify (&se->pre, info->offset, offset);
  
+ 
  	  /* Copy the result back using unpack.  */
  	  tmp = build_call_expr (gfor_fndecl_in_unpack, 2, parmse.expr, data);
  	  gfc_add_expr_to_block (&se->post, tmp);
  
+ 	  /* XXX: This is possibly not needed; but isn't it cleaner this way? */
+ 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
+ 
  	  gfc_add_block_to_block (&se->post, &parmse.post);
+ 	  gfc_add_block_to_block (&se->post, &temp_post);
  	}
      }
  }
*************** gfc_trans_call (gfc_code * code, bool de
*** 367,373 ****
        gfc_se loopse;
  
        /* gfc_walk_elemental_function_args renders the ss chain in the
!          reverse order to the actual argument order.  */
        ss = gfc_reverse_ss (ss);
  
        /* Initialize the loop.  */
--- 386,392 ----
        gfc_se loopse;
  
        /* gfc_walk_elemental_function_args renders the ss chain in the
! 	 reverse order to the actual argument order.  */
        ss = gfc_reverse_ss (ss);
  
        /* Initialize the loop.  */
Index: gcc/fortran/trans.c
===================================================================
*** gcc/fortran/trans.c	(revision 141345)
--- gcc/fortran/trans.c	(working copy)
*************** gfc_trans_code (gfc_code * code)
*** 1102,1108 ****
  	  break;
  
  	case EXEC_CALL:
! 	  res = gfc_trans_call (code, false);
  	  break;
  
  	case EXEC_ASSIGN_CALL:
--- 1102,1116 ----
  	  break;
  
  	case EXEC_CALL:
! 	  /* For MVBITS we've got the special exception that we need a
! 	     dependency check, too.  */
! 	  {
! 	    bool is_mvbits = false;
! 	    if (code->resolved_isym
! 		&& code->resolved_isym->id == GFC_ISYM_MVBITS)
! 	      is_mvbits = true;
! 	    res = gfc_trans_call (code, is_mvbits);
! 	  }
  	  break;
  
  	case EXEC_ASSIGN_CALL:
Index: gcc/fortran/trans.h
===================================================================
*** gcc/fortran/trans.h	(revision 141345)
--- gcc/fortran/trans.h	(working copy)
*************** tree gfc_call_free (tree);
*** 464,469 ****
--- 464,472 ----
  /* Allocate memory after performing a few checks.  */
  tree gfc_call_malloc (stmtblock_t *, tree, tree);
  
+ /* Build a memcpy call.  */
+ tree gfc_build_memcpy_call (tree, tree, tree);
+ 
  /* Allocate memory for arrays, with optional status variable.  */
  tree gfc_allocate_array_with_status (stmtblock_t*, tree, tree, tree, gfc_expr*);
  
Index: gcc/fortran/resolve.c
===================================================================
*** gcc/fortran/resolve.c	(revision 141345)
--- gcc/fortran/resolve.c	(working copy)
*************** resolve_call (gfc_code *c)
*** 2913,2935 ****
  
    t = SUCCESS;
    if (c->resolved_sym == NULL)
!     switch (procedure_kind (csym))
!       {
!       case PTYPE_GENERIC:
! 	t = resolve_generic_s (c);
! 	break;
  
!       case PTYPE_SPECIFIC:
! 	t = resolve_specific_s (c);
! 	break;
  
!       case PTYPE_UNKNOWN:
! 	t = resolve_unknown_s (c);
! 	break;
  
!       default:
! 	gfc_internal_error ("resolve_subroutine(): bad function type");
!       }
  
    /* Some checks of elemental subroutine actual arguments.  */
    if (resolve_elemental_actual (NULL, c) == FAILURE)
--- 2913,2938 ----
  
    t = SUCCESS;
    if (c->resolved_sym == NULL)
!     {
!       c->resolved_isym = NULL;
!       switch (procedure_kind (csym))
! 	{
! 	case PTYPE_GENERIC:
! 	  t = resolve_generic_s (c);
! 	  break;
  
! 	case PTYPE_SPECIFIC:
! 	  t = resolve_specific_s (c);
! 	  break;
  
! 	case PTYPE_UNKNOWN:
! 	  t = resolve_unknown_s (c);
! 	  break;
  
! 	default:
! 	  gfc_internal_error ("resolve_subroutine(): bad function type");
! 	}
!     }
  
    /* Some checks of elemental subroutine actual arguments.  */
    if (resolve_elemental_actual (NULL, c) == FAILURE)
Index: gcc/fortran/iresolve.c
===================================================================
*** gcc/fortran/iresolve.c	(revision 141345)
--- gcc/fortran/iresolve.c	(working copy)
*************** gfc_resolve_cpu_time (gfc_code *c)
*** 2608,2616 ****
--- 2608,2650 ----
  }
  
  
+ /* Create a formal arglist based on an actual one and set the INTENTs given.  */
+ 
+ static gfc_formal_arglist*
+ create_formal_for_intents (gfc_actual_arglist* actual, const sym_intent* ints)
+ {
+   gfc_formal_arglist* head;
+   gfc_formal_arglist* tail;
+   int i;
+ 
+   if (!actual)
+     return NULL;
+ 
+   head = tail = gfc_get_formal_arglist ();
+   for (i = 0; actual; actual = actual->next, tail = tail->next, ++i)
+     {
+       gfc_symbol* sym;
+ 
+       sym = gfc_new_symbol ("dummyarg", NULL);
+       sym->ts = actual->expr->ts;
+ 
+       sym->attr.intent = ints[i];
+       tail->sym = sym;
+ 
+       if (actual->next)
+ 	tail->next = gfc_get_formal_arglist ();
+     }
+ 
+   return head;
+ }
+ 
+ 
  void
  gfc_resolve_mvbits (gfc_code *c)
  {
+   static const sym_intent INTENTS[] = {INTENT_IN, INTENT_IN, INTENT_IN,
+ 				       INTENT_INOUT, INTENT_IN};
+ 
    const char *name;
    gfc_typespec ts;
    gfc_clear_ts (&ts);
*************** gfc_resolve_mvbits (gfc_code *c)
*** 2632,2637 ****
--- 2666,2675 ----
    c->resolved_sym = gfc_get_intrinsic_sub_symbol (name);
    /* Mark as elemental subroutine as this does not happen automatically.  */
    c->resolved_sym->attr.elemental = 1;
+ 
+   /* Create a dummy formal arglist so the INTENTs are known later for purpose
+      of creating temporaries.  */
+   c->resolved_sym->formal = create_formal_for_intents (c->ext.actual, INTENTS);
  }
  
  
Index: gcc/fortran/trans-intrinsic.c
===================================================================
*** gcc/fortran/trans-intrinsic.c	(revision 141345)
--- gcc/fortran/trans-intrinsic.c	(working copy)
*************** gfc_conv_intrinsic_array_transfer (gfc_s
*** 3787,3793 ****
       FIXME callee_alloc is not set!  */
  
    gfc_trans_create_temp_array (&se->pre, &se->post, se->loop,
! 			       info, mold_type, false, true, false,
  			       &expr->where);
  
    /* Cast the pointer to the result.  */
--- 3787,3793 ----
       FIXME callee_alloc is not set!  */
  
    gfc_trans_create_temp_array (&se->pre, &se->post, se->loop,
! 			       info, mold_type, NULL_TREE, false, true, false,
  			       &expr->where);
  
    /* Cast the pointer to the result.  */
Index: gcc/testsuite/gfortran.dg/mvbits_4.f90
===================================================================
*** gcc/testsuite/gfortran.dg/mvbits_4.f90	(revision 0)
--- gcc/testsuite/gfortran.dg/mvbits_4.f90	(revision 0)
***************
*** 0 ****
--- 1,33 ----
+ ! { dg-do run }
+ 
+ ! PR fortran/35681
+ ! Check that dependencies of MVBITS arguments are resolved correctly by using
+ ! temporaries if both arguments refer to the same variable.
+ 
+   integer, dimension(10) :: ila1 = (/1,2,3,4,5,6,7,8,9,10/)
+   integer, dimension(20) :: ila2
+   integer, dimension(10), target :: ila3
+   integer, pointer :: ila3_ptr(:)
+   integer, parameter :: SHOULD_BE(10) = (/17,18,11,4,13,22,7,16,9,18/)
+   integer, parameter :: INDEX_VECTOR(10) = (/9,9,6,2,4,9,2,9,6,10/)
+ 
+   ila2(2:20:2) = ila1
+   ila3 = ila1
+ 
+   ! Argument is already packed.
+   call mvbits (ila1(INDEX_VECTOR), 2, 4, ila1, 3)
+   write (*,'(10(I3))') ila1
+   if (any (ila1 /= SHOULD_BE)) call abort ()
+ 
+   ! Argument is not packed.
+   call mvbits (ila2(2*INDEX_VECTOR), 2, 4, ila2(2:20:2), 3)
+   write (*,'(10(I3))') ila2(2:20:2)
+   if (any (ila2(2:20:2) /= SHOULD_BE)) call abort ()
+ 
+   ! Pointer and target
+   ila3_ptr => ila3
+   call mvbits (ila3(INDEX_VECTOR), 2, 4, ila3_ptr, 3)
+   write (*,'(10(I3))') ila3
+   if (any (ila3 /= SHOULD_BE)) call abort ()
+ 
+   end 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS
  2008-10-27 19:43 [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS Daniel Kraft
@ 2008-10-27 21:29 ` Paul Richard Thomas
  2008-10-27 22:04   ` Daniel Kraft
  2008-10-28 22:05 ` Mikael Morin
  2008-10-31 15:54 ` Daniel Kraft
  2 siblings, 1 reply; 9+ messages in thread
From: Paul Richard Thomas @ 2008-10-27 21:29 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Fortran List, gcc-patches

Daniel,

I'm a bit pole-axed with the daytime job right now, so my reply must be short.

I have to say that I'me really puzzled about this:

> 6) I could not find a test to verify this (not even one that uses
> gfc_conv_elemental_dependencies) in a quick trial, but I believe the
> handling of the temporary there was wrong, in that it was free'd (if
> allocated on the heap) *before* it was used with internal_unpack, because
> gfc_trans_create_temp_array added the temporary clean-up code to se->post
> and the unpack-call was added to se->post later.  In my opinion, this is
> some rather general problem with how post-commands are usually added to
> other post blocks; shouldn't they be added to the top usually rather than to
> the bottom, to get some sort of "nested" scope with inner most pairs of
> pre/post?  Well, for now I changed this behaviour inside
> gfc_conv_elemental_dependencies, which corrected problems I got with MVBITS
> tests.
>

The code from the first assignement in elemental_subroutine_3.f90,
which was the first test of this function gives:

  {
    integer(kind=8) D.1563;
    struct array1_mytype parm.5;
    struct array1_mytype atmp.3;
    struct mytype A.4[6];
    void * D.1559;
    integer(kind=8) D.1558;
    struct array1_integer(kind=8) parm.2;
    static integer(kind=8) A.1[6] = {2, 3, 1, 4, 5, 6};

    atmp.3.dtype = 297;
    atmp.3.dim[0].stride = 1;
    atmp.3.dim[0].lbound = 0;
    atmp.3.dim[0].ubound = 5;
    atmp.3.data = (void *) &A.4;
    atmp.3.offset = 0;
    D.1558 = 24;
    D.1559 = atmp.3.data;
    parm.5.dtype = 297;
    parm.5.dim[0].lbound = 1;
    parm.5.dim[0].ubound = 6;
    parm.5.dim[0].stride = 1;
    parm.5.data = (void *) &x[0];
    parm.5.offset = -1;
    D.1563 = 0;
    parm.2.dtype = 521;
    parm.2.dim[0].lbound = 1;
    parm.2.dim[0].ubound = 6;
    parm.2.dim[0].stride = 1;
    parm.2.data = (void *) &A.1[0];
    parm.2.offset = 0;
    {
      integer(kind=8) S.6;

      S.6 = 0;
      while (1)
        {
          if (S.6 > 5) goto L.1;
          {
            integer(kind=8) D.1565;

            D.1565 = (*(integer(kind=8)[0:] *)
parm.2.data)[parm.2.dim[0].stride * NON_LVALUE_EXPR <S.6>];
            myassign (&(*(struct mytype[6] *) atmp.3.data)[S.6 +
D.1563], &x[D.1565 + -1]);
          }
          S.6 = S.6 + 1;
        }
      L.1:;
    }
    _gfortran_internal_unpack (&parm.5, D.1559);
  }

As you can see, atmp.3.data is assigned to the stack but it is not
freed either before or after being unpacked.  This code, at least is
OK.

Cheers

Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL  dependency handling for MVBITS
  2008-10-27 21:29 ` Paul Richard Thomas
@ 2008-10-27 22:04   ` Daniel Kraft
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Kraft @ 2008-10-27 22:04 UTC (permalink / raw)
  To: Paul Richard Thomas; +Cc: Fortran List, gcc-patches

Paul Richard Thomas wrote:
> Daniel,
> 
> I'm a bit pole-axed with the daytime job right now, so my reply must be short.
> 
> I have to say that I'me really puzzled about this:
> 
>> 6) I could not find a test to verify this (not even one that uses
>> gfc_conv_elemental_dependencies) in a quick trial, but I believe the
>> handling of the temporary there was wrong, in that it was free'd (if
>> allocated on the heap) *before* it was used with internal_unpack, because
>> gfc_trans_create_temp_array added the temporary clean-up code to se->post
>> and the unpack-call was added to se->post later.  In my opinion, this is
>> some rather general problem with how post-commands are usually added to
>> other post blocks; shouldn't they be added to the top usually rather than to
>> the bottom, to get some sort of "nested" scope with inner most pairs of
>> pre/post?  Well, for now I changed this behaviour inside
>> gfc_conv_elemental_dependencies, which corrected problems I got with MVBITS
>> tests.
>>
> 
> The code from the first assignement in elemental_subroutine_3.f90,
> which was the first test of this function gives:
> 
...
> 
> As you can see, atmp.3.data is assigned to the stack but it is not
> freed either before or after being unpacked.  This code, at least is
> OK.

Indeed it is, but it would not have been (at least I think so) if the 
array was allocated and not on the stack; however, as I wrote, I did not 
manage to build a test that really produced invalid code this way (not 
even one that used gfc_conv_elemental_dependencies...)

Thanks for your comments, they are very welcome :)

Yours,
Daniel

-- 
Done:  Arc-Bar-Cav-Rog-Sam-Val-Wiz
To go: Hea-Kni-Mon-Pri-Ran-Tou

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL  dependency  handling for MVBITS
  2008-10-27 19:43 [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS Daniel Kraft
  2008-10-27 21:29 ` Paul Richard Thomas
@ 2008-10-28 22:05 ` Mikael Morin
  2008-10-28 22:31   ` Daniel Kraft
  2008-10-29 10:00   ` Paul Richard Thomas
  2008-10-31 15:54 ` Daniel Kraft
  2 siblings, 2 replies; 9+ messages in thread
From: Mikael Morin @ 2008-10-28 22:05 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Fortran List, gcc-patches

Daniel Kraft wrote:
> 
> 1) Some tab-indentation formatting fixes as I came along, sorry for
> those.  I hope it is ok so.
> 
It looks like you are adding tabs instead of removing them.
Or I am completely wrong ? (I always remove tabs).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL   dependency  handling for MVBITS
  2008-10-28 22:05 ` Mikael Morin
@ 2008-10-28 22:31   ` Daniel Kraft
  2008-10-29 10:00   ` Paul Richard Thomas
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Kraft @ 2008-10-28 22:31 UTC (permalink / raw)
  To: Mikael Morin; +Cc: Fortran List, gcc-patches

Mikael Morin wrote:
> Daniel Kraft wrote:
>> 1) Some tab-indentation formatting fixes as I came along, sorry for
>> those.  I hope it is ok so.
>>
> It looks like you are adding tabs instead of removing them.
> Or I am completely wrong ? (I always remove tabs).

AFAIK, it's GNU style to replace 8-space indentations by tabs.  I always 
have to do it manually, so would like it not to be so, but it seems to 
be needed...

Cheers,
Daniel

-- 
Done:  Arc-Bar-Cav-Rog-Sam-Val-Wiz
To go: Hea-Kni-Mon-Pri-Ran-Tou

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS
  2008-10-28 22:05 ` Mikael Morin
  2008-10-28 22:31   ` Daniel Kraft
@ 2008-10-29 10:00   ` Paul Richard Thomas
  1 sibling, 0 replies; 9+ messages in thread
From: Paul Richard Thomas @ 2008-10-29 10:00 UTC (permalink / raw)
  To: Mikael Morin; +Cc: Daniel Kraft, Fortran List, gcc-patches

Mikael and Daniel,

Tabs in the C source; no tabs in the fortran testcases - there is no "why":-)

Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL  dependency  handling for MVBITS
  2008-10-27 19:43 [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS Daniel Kraft
  2008-10-27 21:29 ` Paul Richard Thomas
  2008-10-28 22:05 ` Mikael Morin
@ 2008-10-31 15:54 ` Daniel Kraft
  2008-11-01 12:57   ` Paul Thomas
  2 siblings, 1 reply; 9+ messages in thread
From: Daniel Kraft @ 2008-10-31 15:54 UTC (permalink / raw)
  To: Fortran List, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3695 bytes --]

Hi,

I've updated the patch described below to trunk of now (including the 
trivial conflicts merge with Mikael's recent check-in) and run a new 
regtest, no regressions on GNU/Linux-x86-32.

Cheers,
Daniel

Daniel Kraft wrote:
> working on PR fortran/35681, I've got some rather big patch now handling 
> part of the problem.  What it exactly does:
> 
> 1) Some tab-indentation formatting fixes as I came along, sorry for 
> those.  I hope it is ok so.
> 
> 2) When resolving a MVBITS intrinsic call, the code->resolved_sym gets a 
> dummy formal argument list with the correct INTENTs specified; this is 
> needed later for gfc_conv_elemental_dependencies.
> 
> 3) gfc_code got a new member "resolved_isym" that tracks calls to 
> intrinsic procedures, so we can later check if some call is to intrinsic 
> MVBITS.  This got a little ugly and would be probably nicer to union it 
> (and possibly "resolved_sym", too) with actual, but that would probably 
> introduce a lot of changes to existing code pieces.
> 
> 4) gfc_trans_allocate_array_storage (or what it is called) got a new 
> argument `initial' that allows to initialize the created storage from 
> some other array (this is done using a combination of internal_pack and 
> memcpy if it was already packed, I hope I got this all right).  This is 
> used for gfc_trans_create_temp_array to allow initializing the new 
> temporary.  Here is (probably) most of the "critical" changes.
> 
> 5) For calls to intrinsic MVBITS, I enabled dependency checking using 
> gfc_conv_elemental_dependencies and made this routine aware of 
> INTENT(INOUT) arguments that use the new initialization feature to copy 
> over the initial content of the mirrored array to the created temporary.
> 
> 6) I could not find a test to verify this (not even one that uses 
> gfc_conv_elemental_dependencies) in a quick trial, but I believe the 
> handling of the temporary there was wrong, in that it was free'd (if 
> allocated on the heap) *before* it was used with internal_unpack, 
> because gfc_trans_create_temp_array added the temporary clean-up code to 
> se->post and the unpack-call was added to se->post later.  In my 
> opinion, this is some rather general problem with how post-commands are 
> usually added to other post blocks; shouldn't they be added to the top 
> usually rather than to the bottom, to get some sort of "nested" scope 
> with inner most pairs of pre/post?  Well, for now I changed this 
> behaviour inside gfc_conv_elemental_dependencies, which corrected 
> problems I got with MVBITS tests.
> 
> This enabled the (valid) tests in the PR to run, but only with modifying 
> them slightly by removing the parentheses around the first argument (so 
> it is not an expression; that will be part 2 of this fix).  As I 
> understand it, this is valid in case of MVBITS but not for any other 
> ELEMENTAL subroutine, right?  This is why I added the check for whether 
> some call is to MVBITS.  I guess the rationale why the compiler is not 
> required to create temporaries for all such ELEMENTAL calls (and they 
> are invalid instead) is performance?  gfortran could handle those calls 
> well in addition to only MVBITS calls simply if I take this conditional 
> check out, but then we might generate temporaries for cases where the 
> user knows no one is needed and the code is valid but the compiler can't 
> figure it out.
> 
> I hope I got this one at least somewhat clear...  What do you think 
> about it?  Currently regression-testing on GNU/Linux-x86-32, but I don't 
> expect any (a very similar patch worked fine before).
> 
> Cheers,
> Daniel
> 


-- 
Done:  Arc-Bar-Cav-Rog-Sam-Val-Wiz
To go: Hea-Kni-Mon-Pri-Ran-Tou

[-- Attachment #2: patch.changelog --]
[-- Type: text/plain, Size: 1494 bytes --]

2008-10-27  Daniel Kraft  <d@domob.eu>

	PR fortran/35681
	* gfortran.h (struct gfc_code): New field `resolved_isym'.
	* trans.h (gfc_build_memcpy_call): Made public.
	* trans-array.h (gfc_trans_create_temp_array): New argument `initial'.
	* intrinsic.c (gfc_intrinsic_sub_interface): Set resolved_isym.
	* iresolve.c (create_formal_for_intents): New helper method.
	(gfc_resolve_mvbits): Put dummy formal arglist on resolved_sym.
	* resolve.c (resolve_call): Initialize resolved_isym to NULL.
	* trans-array.c (gfc_trans_allocate_array_storage): New argument
	`initial' to allow initializing the allocated storage to some initial
	value copied from another array.
	(gfc_trans_create_temp_array): Allow initialization of the temporary
	with a copy of some other array by using the new extension.
	(gfc_trans_array_constructor): Pass NULL_TREE for initial argument.
	(gfc_conv_loop_setup): Ditto.
	* trans-intrinsic.c (gfc_conv_intrinsic_array_transfer): Ditto.
	* trans-expr.c (gfc_conv_function_call): Ditto.
	(gfc_build_memcpy_call): Made public.
	* trans-stmt.c (gfc_conv_elemental_dependencies): Initialize created
	temporary for INTENT(INOUT) arguments to the value of the mirrored
	array and clean up the temporary as very last intructions in the created
	block.
	* trans.c (gfc_trans_code): For EXEC_CALL, see if we have a MVBITS call
	and enable elemental dependency checking if we have.

2008-10-27  Daniel Kraft  <d@domob.eu>

	PR fortran/35681
	* gfortran.dg/mvbits_4.f90: New test.

[-- Attachment #3: patch.diff --]
[-- Type: text/plain, Size: 23688 bytes --]

Index: gcc/fortran/intrinsic.c
===================================================================
*** gcc/fortran/intrinsic.c	(revision 141493)
--- gcc/fortran/intrinsic.c	(working copy)
*************** gfc_intrinsic_sub_interface (gfc_code *c
*** 3746,3751 ****
--- 3746,3752 ----
    if (!error_flag)
      gfc_pop_suppress_errors ();
  
+   c->resolved_isym = isym;
    if (isym->resolve.s1 != NULL)
      isym->resolve.s1 (c);
    else
Index: gcc/fortran/trans-expr.c
===================================================================
*** gcc/fortran/trans-expr.c	(revision 141493)
--- gcc/fortran/trans-expr.c	(working copy)
*************** gfc_conv_function_call (gfc_se * se, gfc
*** 2862,2869 ****
  	     mustn't be deallocated.  */
  	  callee_alloc = sym->attr.allocatable || sym->attr.pointer;
  	  gfc_trans_create_temp_array (&se->pre, &se->post, se->loop, info, tmp,
! 				       false, !sym->attr.pointer, callee_alloc,
! 				       &se->ss->expr->where);
  
  	  /* Pass the temporary as the first argument.  */
  	  tmp = info->descriptor;
--- 2862,2869 ----
  	     mustn't be deallocated.  */
  	  callee_alloc = sym->attr.allocatable || sym->attr.pointer;
  	  gfc_trans_create_temp_array (&se->pre, &se->post, se->loop, info, tmp,
! 				       NULL_TREE, false, !sym->attr.pointer,
! 				       callee_alloc, &se->ss->expr->where);
  
  	  /* Pass the temporary as the first argument.  */
  	  tmp = info->descriptor;
*************** gfc_trans_zero_assign (gfc_expr * expr)
*** 4383,4389 ****
  /* Helper for gfc_trans_array_copy and gfc_trans_array_constructor_copy
     that constructs the call to __builtin_memcpy.  */
  
! static tree
  gfc_build_memcpy_call (tree dst, tree src, tree len)
  {
    tree tmp;
--- 4383,4389 ----
  /* Helper for gfc_trans_array_copy and gfc_trans_array_constructor_copy
     that constructs the call to __builtin_memcpy.  */
  
! tree
  gfc_build_memcpy_call (tree dst, tree src, tree len)
  {
    tree tmp;
Index: gcc/fortran/trans-array.c
===================================================================
*** gcc/fortran/trans-array.c	(revision 141493)
--- gcc/fortran/trans-array.c	(working copy)
*************** gfc_set_loop_bounds_from_array_spec (gfc
*** 493,506 ****
     callee will allocate the array.  If DEALLOC is true, also generate code to
     free the array afterwards.
  
     Initialization code is added to PRE and finalization code to POST.
     DYNAMIC is true if the caller may want to extend the array later
     using realloc.  This prevents us from putting the array on the stack.  */
  
  static void
  gfc_trans_allocate_array_storage (stmtblock_t * pre, stmtblock_t * post,
!                                   gfc_ss_info * info, tree size, tree nelem,
!                                   bool dynamic, bool dealloc)
  {
    tree tmp;
    tree desc;
--- 493,509 ----
     callee will allocate the array.  If DEALLOC is true, also generate code to
     free the array afterwards.
  
+    If INITIAL is not NULL, it is packed using internal_pack and the result used
+    as data instead of allocating a fresh, unitialized area of memory.
+ 
     Initialization code is added to PRE and finalization code to POST.
     DYNAMIC is true if the caller may want to extend the array later
     using realloc.  This prevents us from putting the array on the stack.  */
  
  static void
  gfc_trans_allocate_array_storage (stmtblock_t * pre, stmtblock_t * post,
! 				  gfc_ss_info * info, tree size, tree nelem,
! 				  tree initial, bool dynamic, bool dealloc)
  {
    tree tmp;
    tree desc;
*************** gfc_trans_allocate_array_storage (stmtbl
*** 517,523 ****
    else
      {
        /* Allocate the temporary.  */
!       onstack = !dynamic && gfc_can_put_var_on_stack (size);
  
        if (onstack)
  	{
--- 520,527 ----
    else
      {
        /* Allocate the temporary.  */
!       onstack = !dynamic && initial == NULL_TREE
! 			 && gfc_can_put_var_on_stack (size);
  
        if (onstack)
  	{
*************** gfc_trans_allocate_array_storage (stmtbl
*** 534,542 ****
  	}
        else
  	{
! 	  /* Allocate memory to hold the data.  */
! 	  tmp = gfc_call_malloc (pre, NULL, size);
! 	  tmp = gfc_evaluate_now (tmp, pre);
  	  gfc_conv_descriptor_data_set (pre, desc, tmp);
  	}
      }
--- 538,590 ----
  	}
        else
  	{
! 	  /* Allocate memory to hold the data or call internal_pack.  */
! 	  if (initial == NULL_TREE)
! 	    {
! 	      tmp = gfc_call_malloc (pre, NULL, size);
! 	      tmp = gfc_evaluate_now (tmp, pre);
! 	    }
! 	  else
! 	    {
! 	      tree packed;
! 	      tree source_data;
! 	      tree was_packed;
! 	      stmtblock_t do_copying;
! 
! 	      tmp = TREE_TYPE (initial); /* Pointer to descriptor.  */
! 	      gcc_assert (TREE_CODE (tmp) == POINTER_TYPE);
! 	      tmp = TREE_TYPE (tmp); /* The descriptor itself.  */
! 	      tmp = gfc_get_element_type (tmp);
! 	      gcc_assert (tmp == gfc_get_element_type (TREE_TYPE (desc)));
! 	      packed = gfc_create_var (build_pointer_type (tmp), "data");
! 
! 	      tmp = build_call_expr (gfor_fndecl_in_pack, 1, initial);
! 	      tmp = fold_convert (TREE_TYPE (packed), tmp);
! 	      gfc_add_modify (pre, packed, tmp);
! 
! 	      tmp = build_fold_indirect_ref (initial);
! 	      source_data = gfc_conv_descriptor_data_get (tmp);
! 
! 	      /* internal_pack may return source->data without any allocation
! 		 or copying if it is already packed.  If that's the case, we
! 		 need to allocate and copy manually.  */
! 
! 	      gfc_start_block (&do_copying);
! 	      tmp = gfc_call_malloc (&do_copying, NULL, size);
! 	      tmp = fold_convert (TREE_TYPE (packed), tmp);
! 	      gfc_add_modify (&do_copying, packed, tmp);
! 	      tmp = gfc_build_memcpy_call (packed, source_data, size);
! 	      gfc_add_expr_to_block (&do_copying, tmp);
! 
! 	      was_packed = fold_build2 (EQ_EXPR, boolean_type_node,
! 					packed, source_data);
! 	      tmp = gfc_finish_block (&do_copying);
! 	      tmp = build3_v (COND_EXPR, was_packed, tmp, build_empty_stmt ());
! 	      gfc_add_expr_to_block (pre, tmp);
! 
! 	      tmp = fold_convert (pvoid_type_node, packed);
! 	    }
! 
  	  gfc_conv_descriptor_data_set (pre, desc, tmp);
  	}
      }
*************** gfc_trans_allocate_array_storage (stmtbl
*** 567,580 ****
     fields of info if known.  Returns the size of the array, or NULL for a
     callee allocated array.
  
!    PRE, POST, DYNAMIC and DEALLOC are as for gfc_trans_allocate_array_storage.
   */
  
  tree
  gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post,
  			     gfc_loopinfo * loop, gfc_ss_info * info,
! 			     tree eltype, bool dynamic, bool dealloc,
! 			     bool callee_alloc, locus * where)
  {
    tree type;
    tree desc;
--- 615,629 ----
     fields of info if known.  Returns the size of the array, or NULL for a
     callee allocated array.
  
!    PRE, POST, INITIAL, DYNAMIC and DEALLOC are as for
!    gfc_trans_allocate_array_storage.
   */
  
  tree
  gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post,
  			     gfc_loopinfo * loop, gfc_ss_info * info,
! 			     tree eltype, tree initial, bool dynamic,
! 			     bool dealloc, bool callee_alloc, locus * where)
  {
    tree type;
    tree desc;
*************** gfc_trans_create_temp_array (stmtblock_t
*** 600,607 ****
        else
  	{
  	  /* Callee allocated arrays may not have a known bound yet.  */
!           if (loop->to[n])
!               loop->to[n] = fold_build2 (MINUS_EXPR, gfc_array_index_type,
  					 loop->to[n], loop->from[n]);
  	  loop->from[n] = gfc_index_zero_node;
  	}
--- 649,656 ----
        else
  	{
  	  /* Callee allocated arrays may not have a known bound yet.  */
! 	  if (loop->to[n])
! 	      loop->to[n] = fold_build2 (MINUS_EXPR, gfc_array_index_type,
  					 loop->to[n], loop->from[n]);
  	  loop->from[n] = gfc_index_zero_node;
  	}
*************** gfc_trans_create_temp_array (stmtblock_t
*** 635,641 ****
         {
  	 stride[n] = size
  	 delta = ubound[n] + 1 - lbound[n];
!          size = size * delta;
         }
       size = size * sizeof(element);
    */
--- 684,690 ----
         {
  	 stride[n] = size
  	 delta = ubound[n] + 1 - lbound[n];
! 	 size = size * delta;
         }
       size = size * sizeof(element);
    */
*************** gfc_trans_create_temp_array (stmtblock_t
*** 654,670 ****
    for (n = 0; n < info->dimen; n++)
       {
        if (size == NULL_TREE)
!         {
  	  /* For a callee allocated array express the loop bounds in terms
  	     of the descriptor fields.  */
!           tmp =
  	    fold_build2 (MINUS_EXPR, gfc_array_index_type,
  			 gfc_conv_descriptor_ubound (desc, gfc_rank_cst[n]),
  			 gfc_conv_descriptor_lbound (desc, gfc_rank_cst[n]));
!           loop->to[n] = tmp;
!           continue;
!         }
!         
        /* Store the stride and bound components in the descriptor.  */
        tmp = gfc_conv_descriptor_stride (desc, gfc_rank_cst[n]);
        gfc_add_modify (pre, tmp, size);
--- 703,719 ----
    for (n = 0; n < info->dimen; n++)
       {
        if (size == NULL_TREE)
! 	{
  	  /* For a callee allocated array express the loop bounds in terms
  	     of the descriptor fields.  */
! 	  tmp =
  	    fold_build2 (MINUS_EXPR, gfc_array_index_type,
  			 gfc_conv_descriptor_ubound (desc, gfc_rank_cst[n]),
  			 gfc_conv_descriptor_lbound (desc, gfc_rank_cst[n]));
! 	  loop->to[n] = tmp;
! 	  continue;
! 	}
! 	
        /* Store the stride and bound components in the descriptor.  */
        tmp = gfc_conv_descriptor_stride (desc, gfc_rank_cst[n]);
        gfc_add_modify (pre, tmp, size);
*************** gfc_trans_create_temp_array (stmtblock_t
*** 712,719 ****
        size = NULL_TREE;
      }
  
!   gfc_trans_allocate_array_storage (pre, post, info, size, nelem, dynamic,
! 			            dealloc);
  
    if (info->dimen > loop->temp_dim)
      loop->temp_dim = info->dimen;
--- 761,768 ----
        size = NULL_TREE;
      }
  
!   gfc_trans_allocate_array_storage (pre, post, info, size, nelem, initial,
! 				    dynamic, dealloc);
  
    if (info->dimen > loop->temp_dim)
      loop->temp_dim = info->dimen;
*************** gfc_trans_array_constructor (gfc_loopinf
*** 1811,1817 ****
      }
  
    gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
! 			       type, dynamic, true, false, where);
  
    desc = ss->data.info.descriptor;
    offset = gfc_index_zero_node;
--- 1860,1866 ----
      }
  
    gfc_trans_create_temp_array (&loop->pre, &loop->post, loop, &ss->data.info,
! 			       type, NULL_TREE, dynamic, true, false, where);
  
    desc = ss->data.info.descriptor;
    offset = gfc_index_zero_node;
*************** gfc_conv_loop_setup (gfc_loopinfo * loop
*** 3523,3530 ****
        loop->temp_ss->type = GFC_SS_SECTION;
        loop->temp_ss->data.info.dimen = n;
        gfc_trans_create_temp_array (&loop->pre, &loop->post, loop,
! 				   &loop->temp_ss->data.info, tmp, false, true,
! 				   false, where);
      }
  
    for (n = 0; n < loop->temp_dim; n++)
--- 3572,3579 ----
        loop->temp_ss->type = GFC_SS_SECTION;
        loop->temp_ss->data.info.dimen = n;
        gfc_trans_create_temp_array (&loop->pre, &loop->post, loop,
! 				   &loop->temp_ss->data.info, tmp, NULL_TREE,
! 				   false, true, false, where);
      }
  
    for (n = 0; n < loop->temp_dim; n++)
Index: gcc/fortran/trans-array.h
===================================================================
*** gcc/fortran/trans-array.h	(revision 141493)
--- gcc/fortran/trans-array.h	(working copy)
*************** void gfc_set_loop_bounds_from_array_spec
*** 32,38 ****
  
  /* Generate code to create a temporary array.  */
  tree gfc_trans_create_temp_array (stmtblock_t *, stmtblock_t *, gfc_loopinfo *,
!                                   gfc_ss_info *, tree, bool, bool, bool,
  				  locus *);
  
  /* Generate function entry code for allocation of compiler allocated array
--- 32,38 ----
  
  /* Generate code to create a temporary array.  */
  tree gfc_trans_create_temp_array (stmtblock_t *, stmtblock_t *, gfc_loopinfo *,
! 				  gfc_ss_info *, tree, tree, bool, bool, bool,
  				  locus *);
  
  /* Generate function entry code for allocation of compiler allocated array
Index: gcc/fortran/gfortran.h
===================================================================
*** gcc/fortran/gfortran.h	(revision 141493)
--- gcc/fortran/gfortran.h	(working copy)
*************** typedef struct gfc_code
*** 1886,1891 ****
--- 1886,1892 ----
       symbol for the interface definition.
    const char *sub_name;  */
    gfc_symbol *resolved_sym;
+   gfc_intrinsic_sym *resolved_isym;
  
    union
    {
Index: gcc/fortran/trans-stmt.c
===================================================================
*** gcc/fortran/trans-stmt.c	(revision 141493)
--- gcc/fortran/trans-stmt.c	(working copy)
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 251,256 ****
--- 251,259 ----
  	    && gfc_check_fncall_dependency (e, fsym->attr.intent,
  					    sym, arg0))
  	{
+ 	  tree initial;
+ 	  stmtblock_t temp_post;
+ 
  	  /* Make a local loopinfo for the temporary creation, so that
  	     none of the other ss->info's have to be renormalized.  */
  	  gfc_init_loopinfo (&tmp_loop);
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 261,287 ****
  	      tmp_loop.order[n] = loopse->loop->order[n];
  	    }
  
  	  /* Generate the temporary.  Merge the block so that the
! 	     declarations are put at the right binding level.  */
  	  size = gfc_create_var (gfc_array_index_type, NULL);
  	  data = gfc_create_var (pvoid_type_node, NULL);
  	  gfc_start_block (&block);
  	  tmp = gfc_typenode_for_spec (&e->ts);
! 	  tmp = gfc_trans_create_temp_array (&se->pre, &se->post,
! 					      &tmp_loop, info, tmp,
! 					      false, true, false,
! 					     & arg->expr->where);
  	  gfc_add_modify (&se->pre, size, tmp);
  	  tmp = fold_convert (pvoid_type_node, info->data);
  	  gfc_add_modify (&se->pre, data, tmp);
  	  gfc_merge_block_scope (&block);
  
- 	  /* Obtain the argument descriptor for unpacking.  */
- 	  gfc_init_se (&parmse, NULL);
- 	  parmse.want_pointer = 1;
- 	  gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
- 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
- 
  	  /* Calculate the offset for the temporary.  */
  	  offset = gfc_index_zero_node;
  	  for (n = 0; n < info->dimen; n++)
--- 264,301 ----
  	      tmp_loop.order[n] = loopse->loop->order[n];
  	    }
  
+ 	  /* Obtain the argument descriptor for unpacking.  */
+ 	  gfc_init_se (&parmse, NULL);
+ 	  parmse.want_pointer = 1;
+ 	  gfc_conv_expr_descriptor (&parmse, e, gfc_walk_expr (e));
+ 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
+ 
+ 	  /* If we've got INTENT(INOUT), initialize the array temporary with
+ 	     a copy of the values.  */
+ 	  if (fsym->attr.intent == INTENT_INOUT)
+ 	    initial = parmse.expr;
+ 	  else
+ 	    initial = NULL_TREE;
+ 
  	  /* Generate the temporary.  Merge the block so that the
! 	     declarations are put at the right binding level.  Cleaning up the
! 	     temporary should be the very last thing done, so we add the code to
! 	     a new block and add it to se->post as last instructions.  */
  	  size = gfc_create_var (gfc_array_index_type, NULL);
  	  data = gfc_create_var (pvoid_type_node, NULL);
  	  gfc_start_block (&block);
+ 	  gfc_init_block (&temp_post);
  	  tmp = gfc_typenode_for_spec (&e->ts);
! 	  tmp = gfc_trans_create_temp_array (&se->pre, &temp_post,
! 					     &tmp_loop, info, tmp,
! 					     initial,
! 					     false, true, false,
! 					     &arg->expr->where);
  	  gfc_add_modify (&se->pre, size, tmp);
  	  tmp = fold_convert (pvoid_type_node, info->data);
  	  gfc_add_modify (&se->pre, data, tmp);
  	  gfc_merge_block_scope (&block);
  
  	  /* Calculate the offset for the temporary.  */
  	  offset = gfc_index_zero_node;
  	  for (n = 0; n < info->dimen; n++)
*************** gfc_conv_elemental_dependencies (gfc_se 
*** 296,306 ****
--- 310,325 ----
  	  info->offset = gfc_create_var (gfc_array_index_type, NULL);	  
  	  gfc_add_modify (&se->pre, info->offset, offset);
  
+ 
  	  /* Copy the result back using unpack.  */
  	  tmp = build_call_expr (gfor_fndecl_in_unpack, 2, parmse.expr, data);
  	  gfc_add_expr_to_block (&se->post, tmp);
  
+ 	  /* XXX: This is possibly not needed; but isn't it cleaner this way? */
+ 	  gfc_add_block_to_block (&se->pre, &parmse.pre);
+ 
  	  gfc_add_block_to_block (&se->post, &parmse.post);
+ 	  gfc_add_block_to_block (&se->post, &temp_post);
  	}
      }
  }
*************** gfc_trans_call (gfc_code * code, bool de
*** 367,373 ****
        gfc_se loopse;
  
        /* gfc_walk_elemental_function_args renders the ss chain in the
!          reverse order to the actual argument order.  */
        ss = gfc_reverse_ss (ss);
  
        /* Initialize the loop.  */
--- 386,392 ----
        gfc_se loopse;
  
        /* gfc_walk_elemental_function_args renders the ss chain in the
! 	 reverse order to the actual argument order.  */
        ss = gfc_reverse_ss (ss);
  
        /* Initialize the loop.  */
Index: gcc/fortran/trans.c
===================================================================
*** gcc/fortran/trans.c	(revision 141493)
--- gcc/fortran/trans.c	(working copy)
*************** gfc_trans_code (gfc_code * code)
*** 1102,1108 ****
  	  break;
  
  	case EXEC_CALL:
! 	  res = gfc_trans_call (code, false);
  	  break;
  
  	case EXEC_ASSIGN_CALL:
--- 1102,1116 ----
  	  break;
  
  	case EXEC_CALL:
! 	  /* For MVBITS we've got the special exception that we need a
! 	     dependency check, too.  */
! 	  {
! 	    bool is_mvbits = false;
! 	    if (code->resolved_isym
! 		&& code->resolved_isym->id == GFC_ISYM_MVBITS)
! 	      is_mvbits = true;
! 	    res = gfc_trans_call (code, is_mvbits);
! 	  }
  	  break;
  
  	case EXEC_ASSIGN_CALL:
Index: gcc/fortran/trans.h
===================================================================
*** gcc/fortran/trans.h	(revision 141493)
--- gcc/fortran/trans.h	(working copy)
*************** tree gfc_call_free (tree);
*** 464,469 ****
--- 464,472 ----
  /* Allocate memory after performing a few checks.  */
  tree gfc_call_malloc (stmtblock_t *, tree, tree);
  
+ /* Build a memcpy call.  */
+ tree gfc_build_memcpy_call (tree, tree, tree);
+ 
  /* Allocate memory for arrays, with optional status variable.  */
  tree gfc_allocate_array_with_status (stmtblock_t*, tree, tree, tree, gfc_expr*);
  
Index: gcc/fortran/resolve.c
===================================================================
*** gcc/fortran/resolve.c	(revision 141493)
--- gcc/fortran/resolve.c	(working copy)
*************** resolve_call (gfc_code *c)
*** 2913,2935 ****
  
    t = SUCCESS;
    if (c->resolved_sym == NULL)
!     switch (procedure_kind (csym))
!       {
!       case PTYPE_GENERIC:
! 	t = resolve_generic_s (c);
! 	break;
  
!       case PTYPE_SPECIFIC:
! 	t = resolve_specific_s (c);
! 	break;
  
!       case PTYPE_UNKNOWN:
! 	t = resolve_unknown_s (c);
! 	break;
  
!       default:
! 	gfc_internal_error ("resolve_subroutine(): bad function type");
!       }
  
    /* Some checks of elemental subroutine actual arguments.  */
    if (resolve_elemental_actual (NULL, c) == FAILURE)
--- 2913,2938 ----
  
    t = SUCCESS;
    if (c->resolved_sym == NULL)
!     {
!       c->resolved_isym = NULL;
!       switch (procedure_kind (csym))
! 	{
! 	case PTYPE_GENERIC:
! 	  t = resolve_generic_s (c);
! 	  break;
  
! 	case PTYPE_SPECIFIC:
! 	  t = resolve_specific_s (c);
! 	  break;
  
! 	case PTYPE_UNKNOWN:
! 	  t = resolve_unknown_s (c);
! 	  break;
  
! 	default:
! 	  gfc_internal_error ("resolve_subroutine(): bad function type");
! 	}
!     }
  
    /* Some checks of elemental subroutine actual arguments.  */
    if (resolve_elemental_actual (NULL, c) == FAILURE)
Index: gcc/fortran/iresolve.c
===================================================================
*** gcc/fortran/iresolve.c	(revision 141493)
--- gcc/fortran/iresolve.c	(working copy)
*************** gfc_resolve_cpu_time (gfc_code *c)
*** 2608,2616 ****
--- 2608,2650 ----
  }
  
  
+ /* Create a formal arglist based on an actual one and set the INTENTs given.  */
+ 
+ static gfc_formal_arglist*
+ create_formal_for_intents (gfc_actual_arglist* actual, const sym_intent* ints)
+ {
+   gfc_formal_arglist* head;
+   gfc_formal_arglist* tail;
+   int i;
+ 
+   if (!actual)
+     return NULL;
+ 
+   head = tail = gfc_get_formal_arglist ();
+   for (i = 0; actual; actual = actual->next, tail = tail->next, ++i)
+     {
+       gfc_symbol* sym;
+ 
+       sym = gfc_new_symbol ("dummyarg", NULL);
+       sym->ts = actual->expr->ts;
+ 
+       sym->attr.intent = ints[i];
+       tail->sym = sym;
+ 
+       if (actual->next)
+ 	tail->next = gfc_get_formal_arglist ();
+     }
+ 
+   return head;
+ }
+ 
+ 
  void
  gfc_resolve_mvbits (gfc_code *c)
  {
+   static const sym_intent INTENTS[] = {INTENT_IN, INTENT_IN, INTENT_IN,
+ 				       INTENT_INOUT, INTENT_IN};
+ 
    const char *name;
    gfc_typespec ts;
    gfc_clear_ts (&ts);
*************** gfc_resolve_mvbits (gfc_code *c)
*** 2632,2637 ****
--- 2666,2675 ----
    c->resolved_sym = gfc_get_intrinsic_sub_symbol (name);
    /* Mark as elemental subroutine as this does not happen automatically.  */
    c->resolved_sym->attr.elemental = 1;
+ 
+   /* Create a dummy formal arglist so the INTENTs are known later for purpose
+      of creating temporaries.  */
+   c->resolved_sym->formal = create_formal_for_intents (c->ext.actual, INTENTS);
  }
  
  
Index: gcc/fortran/trans-intrinsic.c
===================================================================
*** gcc/fortran/trans-intrinsic.c	(revision 141493)
--- gcc/fortran/trans-intrinsic.c	(working copy)
*************** gfc_conv_intrinsic_array_transfer (gfc_s
*** 3787,3793 ****
       FIXME callee_alloc is not set!  */
  
    gfc_trans_create_temp_array (&se->pre, &se->post, se->loop,
! 			       info, mold_type, false, true, false,
  			       &expr->where);
  
    /* Cast the pointer to the result.  */
--- 3787,3793 ----
       FIXME callee_alloc is not set!  */
  
    gfc_trans_create_temp_array (&se->pre, &se->post, se->loop,
! 			       info, mold_type, NULL_TREE, false, true, false,
  			       &expr->where);
  
    /* Cast the pointer to the result.  */
Index: gcc/testsuite/gfortran.dg/mvbits_4.f90
===================================================================
*** gcc/testsuite/gfortran.dg/mvbits_4.f90	(revision 0)
--- gcc/testsuite/gfortran.dg/mvbits_4.f90	(revision 0)
***************
*** 0 ****
--- 1,33 ----
+ ! { dg-do run }
+ 
+ ! PR fortran/35681
+ ! Check that dependencies of MVBITS arguments are resolved correctly by using
+ ! temporaries if both arguments refer to the same variable.
+ 
+   integer, dimension(10) :: ila1 = (/1,2,3,4,5,6,7,8,9,10/)
+   integer, dimension(20) :: ila2
+   integer, dimension(10), target :: ila3
+   integer, pointer :: ila3_ptr(:)
+   integer, parameter :: SHOULD_BE(10) = (/17,18,11,4,13,22,7,16,9,18/)
+   integer, parameter :: INDEX_VECTOR(10) = (/9,9,6,2,4,9,2,9,6,10/)
+ 
+   ila2(2:20:2) = ila1
+   ila3 = ila1
+ 
+   ! Argument is already packed.
+   call mvbits (ila1(INDEX_VECTOR), 2, 4, ila1, 3)
+   write (*,'(10(I3))') ila1
+   if (any (ila1 /= SHOULD_BE)) call abort ()
+ 
+   ! Argument is not packed.
+   call mvbits (ila2(2*INDEX_VECTOR), 2, 4, ila2(2:20:2), 3)
+   write (*,'(10(I3))') ila2(2:20:2)
+   if (any (ila2(2:20:2) /= SHOULD_BE)) call abort ()
+ 
+   ! Pointer and target
+   ila3_ptr => ila3
+   call mvbits (ila3(INDEX_VECTOR), 2, 4, ila3_ptr, 3)
+   write (*,'(10(I3))') ila3
+   if (any (ila3 /= SHOULD_BE)) call abort ()
+ 
+   end 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL   dependency  handling for MVBITS
  2008-10-31 15:54 ` Daniel Kraft
@ 2008-11-01 12:57   ` Paul Thomas
  2008-11-01 13:33     ` Daniel Kraft
  0 siblings, 1 reply; 9+ messages in thread
From: Paul Thomas @ 2008-11-01 12:57 UTC (permalink / raw)
  To: Daniel Kraft; +Cc: Fortran List, gcc-patches

Daniel,

Regtests fine on FC9/x86_i64.  This is OK for trunk.

Have you had any thoughts on parentheses expressions yet?

Thanks for the patch

Paul
> Hi,
>
> I've updated the patch described below to trunk of now (including the 
> trivial conflicts merge with Mikael's recent check-in) and run a new 
> regtest, no regressions on GNU/Linux-x86-32.
>
> Cheers,
> Daniel
>
> Daniel Kraft wrote:
>> working on PR fortran/35681, I've got some rather big patch now 
>> handling part of the problem.  What it exactly does:
>>
>> 1) Some tab-indentation formatting fixes as I came along, sorry for 
>> those.  I hope it is ok so.
>>
>> 2) When resolving a MVBITS intrinsic call, the code->resolved_sym 
>> gets a dummy formal argument list with the correct INTENTs specified; 
>> this is needed later for gfc_conv_elemental_dependencies.
>>
>> 3) gfc_code got a new member "resolved_isym" that tracks calls to 
>> intrinsic procedures, so we can later check if some call is to 
>> intrinsic MVBITS.  This got a little ugly and would be probably nicer 
>> to union it (and possibly "resolved_sym", too) with actual, but that 
>> would probably introduce a lot of changes to existing code pieces.
>>
>> 4) gfc_trans_allocate_array_storage (or what it is called) got a new 
>> argument `initial' that allows to initialize the created storage from 
>> some other array (this is done using a combination of internal_pack 
>> and memcpy if it was already packed, I hope I got this all right).  
>> This is used for gfc_trans_create_temp_array to allow initializing 
>> the new temporary.  Here is (probably) most of the "critical" changes.
>>
>> 5) For calls to intrinsic MVBITS, I enabled dependency checking using 
>> gfc_conv_elemental_dependencies and made this routine aware of 
>> INTENT(INOUT) arguments that use the new initialization feature to 
>> copy over the initial content of the mirrored array to the created 
>> temporary.
>>
>> 6) I could not find a test to verify this (not even one that uses 
>> gfc_conv_elemental_dependencies) in a quick trial, but I believe the 
>> handling of the temporary there was wrong, in that it was free'd (if 
>> allocated on the heap) *before* it was used with internal_unpack, 
>> because gfc_trans_create_temp_array added the temporary clean-up code 
>> to se->post and the unpack-call was added to se->post later.  In my 
>> opinion, this is some rather general problem with how post-commands 
>> are usually added to other post blocks; shouldn't they be added to 
>> the top usually rather than to the bottom, to get some sort of 
>> "nested" scope with inner most pairs of pre/post?  Well, for now I 
>> changed this behaviour inside gfc_conv_elemental_dependencies, which 
>> corrected problems I got with MVBITS tests.
>>
>> This enabled the (valid) tests in the PR to run, but only with 
>> modifying them slightly by removing the parentheses around the first 
>> argument (so it is not an expression; that will be part 2 of this 
>> fix).  As I understand it, this is valid in case of MVBITS but not 
>> for any other ELEMENTAL subroutine, right?  This is why I added the 
>> check for whether some call is to MVBITS.  I guess the rationale why 
>> the compiler is not required to create temporaries for all such 
>> ELEMENTAL calls (and they are invalid instead) is performance?  
>> gfortran could handle those calls well in addition to only MVBITS 
>> calls simply if I take this conditional check out, but then we might 
>> generate temporaries for cases where the user knows no one is needed 
>> and the code is valid but the compiler can't figure it out.
>>
>> I hope I got this one at least somewhat clear...  What do you think 
>> about it?  Currently regression-testing on GNU/Linux-x86-32, but I 
>> don't expect any (a very similar patch worked fine before).
>>
>> Cheers,
>> Daniel
>>
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL   dependency  handling for MVBITS
  2008-11-01 12:57   ` Paul Thomas
@ 2008-11-01 13:33     ` Daniel Kraft
  0 siblings, 0 replies; 9+ messages in thread
From: Daniel Kraft @ 2008-11-01 13:33 UTC (permalink / raw)
  To: Paul Thomas; +Cc: Fortran List, gcc-patches

Paul Thomas wrote:
> Daniel,
> 
> Regtests fine on FC9/x86_i64.  This is OK for trunk.

Committed as rev 141516.  Seems Janus did commit exactly the same time 
:D  Thanks for the review Paul!

> Have you had any thoughts on parentheses expressions yet?

Only very little...  I'll comment on this from my point of view soon in 
a new post.

Cheers,
Daniel

> Thanks for the patch
> 
> Paul
>> Hi,
>>
>> I've updated the patch described below to trunk of now (including the 
>> trivial conflicts merge with Mikael's recent check-in) and run a new 
>> regtest, no regressions on GNU/Linux-x86-32.
>>
>> Cheers,
>> Daniel
>>
>> Daniel Kraft wrote:
>>> working on PR fortran/35681, I've got some rather big patch now 
>>> handling part of the problem.  What it exactly does:
>>>
>>> 1) Some tab-indentation formatting fixes as I came along, sorry for 
>>> those.  I hope it is ok so.
>>>
>>> 2) When resolving a MVBITS intrinsic call, the code->resolved_sym 
>>> gets a dummy formal argument list with the correct INTENTs specified; 
>>> this is needed later for gfc_conv_elemental_dependencies.
>>>
>>> 3) gfc_code got a new member "resolved_isym" that tracks calls to 
>>> intrinsic procedures, so we can later check if some call is to 
>>> intrinsic MVBITS.  This got a little ugly and would be probably nicer 
>>> to union it (and possibly "resolved_sym", too) with actual, but that 
>>> would probably introduce a lot of changes to existing code pieces.
>>>
>>> 4) gfc_trans_allocate_array_storage (or what it is called) got a new 
>>> argument `initial' that allows to initialize the created storage from 
>>> some other array (this is done using a combination of internal_pack 
>>> and memcpy if it was already packed, I hope I got this all right).  
>>> This is used for gfc_trans_create_temp_array to allow initializing 
>>> the new temporary.  Here is (probably) most of the "critical" changes.
>>>
>>> 5) For calls to intrinsic MVBITS, I enabled dependency checking using 
>>> gfc_conv_elemental_dependencies and made this routine aware of 
>>> INTENT(INOUT) arguments that use the new initialization feature to 
>>> copy over the initial content of the mirrored array to the created 
>>> temporary.
>>>
>>> 6) I could not find a test to verify this (not even one that uses 
>>> gfc_conv_elemental_dependencies) in a quick trial, but I believe the 
>>> handling of the temporary there was wrong, in that it was free'd (if 
>>> allocated on the heap) *before* it was used with internal_unpack, 
>>> because gfc_trans_create_temp_array added the temporary clean-up code 
>>> to se->post and the unpack-call was added to se->post later.  In my 
>>> opinion, this is some rather general problem with how post-commands 
>>> are usually added to other post blocks; shouldn't they be added to 
>>> the top usually rather than to the bottom, to get some sort of 
>>> "nested" scope with inner most pairs of pre/post?  Well, for now I 
>>> changed this behaviour inside gfc_conv_elemental_dependencies, which 
>>> corrected problems I got with MVBITS tests.
>>>
>>> This enabled the (valid) tests in the PR to run, but only with 
>>> modifying them slightly by removing the parentheses around the first 
>>> argument (so it is not an expression; that will be part 2 of this 
>>> fix).  As I understand it, this is valid in case of MVBITS but not 
>>> for any other ELEMENTAL subroutine, right?  This is why I added the 
>>> check for whether some call is to MVBITS.  I guess the rationale why 
>>> the compiler is not required to create temporaries for all such 
>>> ELEMENTAL calls (and they are invalid instead) is performance?  
>>> gfortran could handle those calls well in addition to only MVBITS 
>>> calls simply if I take this conditional check out, but then we might 
>>> generate temporaries for cases where the user knows no one is needed 
>>> and the code is valid but the compiler can't figure it out.
>>>
>>> I hope I got this one at least somewhat clear...  What do you think 
>>> about it?  Currently regression-testing on GNU/Linux-x86-32, but I 
>>> don't expect any (a very similar patch worked fine before).
>>>
>>> Cheers,
>>> Daniel
>>>
>>
>>
> 
> 


-- 
Done:  Arc-Bar-Cav-Rog-Sam-Val-Wiz
To go: Hea-Kni-Mon-Pri-Ran-Tou

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-11-01 13:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-27 19:43 [Patch, Fortran] PR fortran/35681: First part, fix ELEMENTAL dependency handling for MVBITS Daniel Kraft
2008-10-27 21:29 ` Paul Richard Thomas
2008-10-27 22:04   ` Daniel Kraft
2008-10-28 22:05 ` Mikael Morin
2008-10-28 22:31   ` Daniel Kraft
2008-10-29 10:00   ` Paul Richard Thomas
2008-10-31 15:54 ` Daniel Kraft
2008-11-01 12:57   ` Paul Thomas
2008-11-01 13:33     ` Daniel Kraft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).