public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Tobias Burnus <tobias@codesourcery.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>, fortran <fortran@gcc.gnu.org>
Cc: Paul Richard Thomas <paul.richard.thomas@gmail.com>
Subject: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types
Date: Fri, 17 Feb 2023 12:13:52 +0100	[thread overview]
Message-ID: <27cd606a-f019-60b2-a9c8-0a570433b5eb@codesourcery.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 5004 bytes --]

Short version:

This fixes potential and real bugs related to 'len=:' character variables
as for the length/byte size an old/saved expression is used instead of
the current value. - That's fine but not for allocatable/pointer with 'len=:'.


Main part of the patch: Strip the SAVE_EXPR from the size expression:

   if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
     {
       gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
       TYPE_SIZE (type) = TREE_OPERAND (TYPE_SIZE (type), 0);
       TYPE_SIZE_UNIT (type) = TREE_OPERAND (TYPE_SIZE_UNIT (type), 0);
     }


OK for mainline?

* * *

Long version:

BACKGROUND:


(A) VLA / EXPLICIT-SIZE ARRAYS + LEN=<expr|var> STRINGS


C knows something like VLA (variable length arrays), likewise Fortran
knows explicit size array and character length where the length/size
depends on an variable set before the current scoping unit. Examples:

void f(int N)
{
   int vla[N*5];
}

subroutine foo(n)
   integer :: n
   integer :: array(n*5)
   integer :: my_len
   ...
   my_len = 5
   block
     character(len=my_len, kind=4) :: str

     my_len = 99
     print *, len(str)  ! still shows 5 - not 99
   end block
end


In all cases, the size / length is not known at compile time but it won't change.
Thus, expressions like (pseudo code)
    byte_size = n * 5 * sizeof(integer)
can be saved and re-used and do not have to be re-calculated every time the
    TYPE_SIZE or TYPE_UNIT_SIZE
is used.

In particular, the 'my_len' example shows that just using the current value of
'my_len' would be wrong as it can be overridden.


* * *


(B) DEFERRED-LENGTH STRINGS ('character(len=:), pointer/allocatable')


But with deferred-length strings, such as

   character(len=:), pointer :: pstr(:)
   ...
   allocate(character(len=2) :: pstr(5))
   ...
   !$omp target enter data map(alloc: pstr(2:5))


this leads to code like:

   integer(kind=8) .pstr;
   struct array01_character(kind=1) pstr;

   D.4302 = (sizetype) NON_LVALUE_EXPR <.pstr>;
   pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
...
     .pstr = 2;  // during allocation/pointer assignment
...
       parm.1.data = pstr.data + (sizetype) ((~pstr.dim[0].lbound * D.4287)
                                 * (integer(kind=8)) SAVE_EXPR <D.4302>);

And here D.4302 is the pre-calculated value instead of the current value,
which can be either 0 or some random value.


Such code happens when using code like:
     elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));

Of course, there are various ways to avoid this – like obtaining somehow the
string length directly - either from the expression or from the type such as
   TYPE_DOMAIN (type)
but it can easily go wrong.

  * * *

IDEAL SOLUTION:

I think from the middle-end perspective, we should do:
   build_range_type (type, 0, NULL_TREE)
leaving the upper bound unspecified – which should also help with
type-is-the-same middle-end analysis.


PRACTICAL SOLUTION:

But as code like TYPE_SIZE_UNIT is very widely used - and we currently lack
a place to store the tree decl for the length, I propose the following as discussed
with Jakub yesterday:

We just remove SAVE_EXPR after generating the type.

Side note: In some cases, the type is already constructed with len = NULL; I have not
checked when. In that case, using TYPE_SIZE will fail at compile time.
(That remains unchanged by this patch.)

  * * *

OK for mainline?

Tobias

  * * *

PS: I have no real opinion whether we want to have any backports, thoughts?


PPS: I don't have any real example I want to add as most cases have been work-around
fixed in the meanwhile. If you want to test it, the following fails. I intent to add
an extended tests as part of a larger follow-up patch which fixes more OpenMP issues:

   character(len=:), pointer :: pstr(:)
   allocate(character(len=2) :: pstr(5))
   !$omp target enter data map(alloc: pstr(2:5))
end


Compile with -fopenmp -fdump-tree-original (or a later dump).


BEFORE the patch:

   integer(kind=8) .pstr;
   ...
   D.4291 = (sizetype) NON_LVALUE_EXPR <.pstr>;
   pstr.dtype = {.elem_len=(unsigned long) .pstr, .rank=1, .type=6};
   ...
     .pstr = 2;
   ...
         pstr.data = __builtin_malloc (10);
   ...
       parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
                                 * (integer(kind=8)) SAVE_EXPR <D.4291>);

AFTER the patch:
   .....
       parm.1.data = pstr.data + (sizetype) (((2 - pstr.dim[0].lbound) * D.4287)
                                 * NON_LVALUE_EXPR <.pstr>);
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: fix-deferred-len.diff --]
[-- Type: text/x-patch, Size: 13405 bytes --]

Fortran: Avoid SAVE_EXPR for deferred-len char types

Using TYPE_SIZE/TYPE_SIZE_UNIT with deferred-length character variables,
i.e. 'character(len=:), allocatable/pointer' used a SAVE_EXPR, i.e. the
value on entry to the scope instead of the latest value.

Solution: Remove the SAVE_EXPR again in this case.

gcc/fortran/ChangeLog:

	* trans-types.h (gfc_get_character_type, gfc_get_character_type_len,
	(gfc_get_character_type_len_for_eltype): Add argument 'bool deferred'.
	* trans-types.cc (gfc_get_character_type_len_for_eltype): Likewise;
	remove the SAVE_EXPR for the type size for deferred string lengths.
	(gfc_get_character_type_len, gfc_get_character_type): Add arg
	and pass on.
	(gfc_typenode_for_spec): Update call.
	* trans-array.cc (gfc_trans_create_temp_array,
	trans_array_constructor, gfc_conv_loop_setup, gfc_array_init_size,
	gfc_alloc_allocatable_for_assignment): Likewise.
	* trans-expr.cc (gfc_conv_substring, gfc_conv_concat_op,
	gfc_add_interface_mapping, gfc_conv_procedure_call,
	gfc_conv_statement_function, gfc_conv_string_parameter): Likewise.
	* trans-intrinsic.cc (gfc_conv_intrinsic_transfer,
	gfc_conv_intrinsic_repeat): Likewise.
	* trans-stmt.cc (forall_make_variable_temp,
	gfc_trans_assign_need_temp): Likewise.

 gcc/fortran/trans-array.cc     | 11 ++++++-----
 gcc/fortran/trans-expr.cc      | 15 ++++++++-------
 gcc/fortran/trans-intrinsic.cc |  5 +++--
 gcc/fortran/trans-stmt.cc      |  7 ++++---
 gcc/fortran/trans-types.cc     | 39 ++++++++++++++++++++++++++++++---------
 gcc/fortran/trans-types.h      |  6 +++---
 6 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 63bd1ac573a..b0abdadc3f5 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -1480,7 +1480,7 @@ gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post, gfc_ss * ss,
       elemsize = gfc_resize_class_size_with_len (pre, class_expr, elemsize);
       /* Casting the data as a character of the dynamic length ensures that
 	 assignment of elements works when needed.  */
-      eltype = gfc_get_character_type_len (1, elemsize);
+      eltype = gfc_get_character_type_len (1, elemsize, true);
     }
 
   memset (from, 0, sizeof (from));
@@ -2823,7 +2823,8 @@ trans_array_constructor (gfc_ss * ss, locus * where)
 
       store_backend_decl (&expr->ts.u.cl, ss_info->string_length, force_new_cl);
 
-      type = gfc_get_character_type_len (expr->ts.kind, ss_info->string_length);
+      type = gfc_get_character_type_len (expr->ts.kind, ss_info->string_length,
+					 expr->ts.deferred);
       if (const_string)
 	type = build_pointer_type (type);
     }
@@ -5492,7 +5493,7 @@ gfc_conv_loop_setup (gfc_loopinfo * loop, locus * where)
 	tmp_ss_info->data.temp.type
 		= gfc_get_character_type_len_for_eltype
 			(TREE_TYPE (tmp_ss_info->data.temp.type),
-			 tmp_ss_info->string_length);
+			 tmp_ss_info->string_length, false);
 
       tmp = tmp_ss_info->data.temp.type;
       memset (&tmp_ss_info->data.array, 0, sizeof (gfc_array_info));
@@ -5737,7 +5738,7 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
       tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (tmp),
 			     TREE_OPERAND (descriptor, 0), tmp, NULL_TREE);
       tmp = fold_convert (gfc_charlen_type_node, tmp);
-      type = gfc_get_character_type_len (expr->ts.kind, tmp);
+      type = gfc_get_character_type_len (expr->ts.kind, tmp, expr->ts.deferred);
       tmp = gfc_conv_descriptor_dtype (descriptor);
       gfc_add_modify (pblock, tmp, gfc_get_dtype_rank_type (rank, type));
     }
@@ -10908,7 +10909,7 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
       if (expr2->ts.type != BT_CLASS)
 	type = gfc_typenode_for_spec (&expr2->ts);
       else
-	type = gfc_get_character_type_len (1, elemsize2);
+	type = gfc_get_character_type_len (1, elemsize2, true);
 
       gfc_add_modify (&fblock, tmp,
 		      gfc_get_dtype_rank_type (expr2->rank,type));
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index e85b53fae85..50f81ea8881 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -2589,7 +2589,7 @@ gfc_conv_substring (gfc_se * se, gfc_ref * ref, int kind,
   char *msg;
   mpz_t length;
 
-  type = gfc_get_character_type (kind, ref->u.ss.length);
+  type = gfc_get_character_type (kind, ref->u.ss.length, false);
   type = build_pointer_type (type);
 
   gfc_init_se (&start, se);
@@ -3709,7 +3709,7 @@ gfc_conv_concat_op (gfc_se * se, gfc_expr * expr)
   gfc_add_block_to_block (&se->pre, &lse.pre);
   gfc_add_block_to_block (&se->pre, &rse.pre);
 
-  type = gfc_get_character_type (expr->ts.kind, expr->ts.u.cl);
+  type = gfc_get_character_type (expr->ts.kind, expr->ts.u.cl, false);
   len = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
   if (len == NULL_TREE)
     {
@@ -4474,7 +4474,7 @@ gfc_add_interface_mapping (gfc_interface_mapping * mapping,
      convert it to a boundless character type.  */
   else if (!sym->attr.dimension && sym->ts.type == BT_CHARACTER)
     {
-      tmp = gfc_get_character_type_len (sym->ts.kind, NULL);
+      tmp = gfc_get_character_type_len (sym->ts.kind, NULL, sym->ts.deferred);
       tmp = build_pointer_type (tmp);
       if (sym->attr.pointer)
         value = build_fold_indirect_ref_loc (input_location,
@@ -7614,7 +7614,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
       else if (ts.type == BT_CHARACTER)
 	{
 	  /* Pass the string length.  */
-	  type = gfc_get_character_type (ts.kind, ts.u.cl);
+	  type = gfc_get_character_type (ts.kind, ts.u.cl, false);
 	  type = build_pointer_type (type);
 
 	  /* Emit a DECL_EXPR for the VLA type.  */
@@ -8240,7 +8240,7 @@ gfc_conv_statement_function (gfc_se * se, gfc_expr * expr)
 	     fsym->ts.u.cl->backend_decl
 		= gfc_conv_constant_to_tree (fsym->ts.u.cl->length);
 
-	  type = gfc_get_character_type (fsym->ts.kind, fsym->ts.u.cl);
+	  type = gfc_get_character_type (fsym->ts.kind, fsym->ts.u.cl, false);
 	  temp_vars[n] = gfc_create_var (type, fsym->name);
 
 	  arglen = TYPE_MAX_VALUE (TYPE_DOMAIN (type));
@@ -8289,7 +8289,7 @@ gfc_conv_statement_function (gfc_se * se, gfc_expr * expr)
 	  || tree_int_cst_lt (se->string_length,
 			      sym->ts.u.cl->backend_decl))
 	{
-	  type = gfc_get_character_type (sym->ts.kind, sym->ts.u.cl);
+	  type = gfc_get_character_type (sym->ts.kind, sym->ts.u.cl, false);
 	  tmp = gfc_create_var (type, sym->name);
 	  tmp = gfc_build_addr_expr (build_pointer_type (type), tmp);
 	  gfc_trans_string_copy (&se->pre, sym->ts.u.cl->backend_decl, tmp,
@@ -10391,7 +10391,8 @@ gfc_conv_string_parameter (gfc_se * se)
 	  if (TREE_CODE (type) == ARRAY_TYPE)
 	    type = TREE_TYPE (type);
 	  type = gfc_get_character_type_len_for_eltype (type,
-							se->string_length);
+							se->string_length,
+							false);
 	  type = build_pointer_type (type);
 	  se->expr = gfc_build_addr_expr (type, se->expr);
 	}
diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index 21eeb12ca89..babe30898a0 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -8548,7 +8548,8 @@ gfc_conv_intrinsic_transfer (gfc_se * se, gfc_expr * expr)
     case BT_CHARACTER:
       tmp = size_of_string_in_bytes (arg->expr->ts.kind, argse.string_length);
       mold_type = gfc_get_character_type_len (arg->expr->ts.kind,
-					      argse.string_length);
+					      argse.string_length,
+					      arg->expr->ts.deferred);
       break;
     case BT_CLASS:
       tmp = gfc_class_vtab_size_get (argse.expr);
@@ -9325,7 +9326,7 @@ gfc_conv_intrinsic_repeat (gfc_se * se, gfc_expr * expr)
   dlen = fold_build2_loc (input_location, MULT_EXPR, gfc_charlen_type_node,
 			  fold_convert (gfc_charlen_type_node, slen),
 			  fold_convert (gfc_charlen_type_node, ncopies));
-  type = gfc_get_character_type (expr->ts.kind, expr->ts.u.cl);
+  type = gfc_get_character_type (expr->ts.kind, expr->ts.u.cl, false);
   dest = gfc_conv_string_tmp (se, build_pointer_type (type), dlen);
 
   /* Generate the code to do the repeat operation:
diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index 2b4278be748..9a1caf56bcb 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -3895,7 +3895,7 @@ forall_make_variable_temp (gfc_code *c, stmtblock_t *pre, stmtblock_t *post)
 	{
 	  tse.string_length = rse.string_length;
 	  tmp = gfc_get_character_type_len (gfc_default_character_kind,
-					    tse.string_length);
+					    tse.string_length, e->ts.deferred);
 	  tse.expr = gfc_conv_string_tmp (&tse, build_pointer_type (tmp),
 					  rse.string_length);
 	  gfc_add_block_to_block (pre, &tse.pre);
@@ -4676,7 +4676,7 @@ gfc_trans_assign_need_temp (gfc_expr * expr1, gfc_expr * expr2,
 	  gfc_init_se (&ssse, NULL);
 	  gfc_conv_expr (&ssse, expr1);
 	  type = gfc_get_character_type_len (gfc_default_character_kind,
-					     ssse.string_length);
+					     ssse.string_length, false);
 	}
       else
 	{
@@ -4689,7 +4689,8 @@ gfc_trans_assign_need_temp (gfc_expr * expr1, gfc_expr * expr2,
 	      expr1->ts.u.cl->backend_decl = tse.expr;
 	    }
 	  type = gfc_get_character_type_len (gfc_default_character_kind,
-					     expr1->ts.u.cl->backend_decl);
+					     expr1->ts.u.cl->backend_decl,
+					     false);
 	}
     }
   else
diff --git a/gcc/fortran/trans-types.cc b/gcc/fortran/trans-types.cc
index 9c9489a42bd..591661c7630 100644
--- a/gcc/fortran/trans-types.cc
+++ b/gcc/fortran/trans-types.cc
@@ -1112,32 +1112,52 @@ gfc_get_pchar_type (int kind)
 }
 
 \f
-/* Create a character type with the given kind and length.  */
+/* Create a character type with the given kind and length; 'deferred' affects
+   the following: If 'len' is a variable/non-constant expression, it can be
+   either for
+
+   * a stack-allocated variable where the length is taken from the outside
+   ('VLA') (global variable, dummy argument, variable from before a BLOCK) - in
+   this case, the value on entry needs to be preserved -> SAVE_EXPR.
+
+   * or, 'len' is the hidden variable of a deferred-length ('len=:') variable,
+   such that the current value after the last pointer-assignment or allocation
+   must be used. In this case, there shall not be a SAVE_EXPR.  */
 
 tree
-gfc_get_character_type_len_for_eltype (tree eltype, tree len)
+gfc_get_character_type_len_for_eltype (tree eltype, tree len, bool deferred)
 {
   tree bounds, type;
 
   bounds = build_range_type (gfc_charlen_type_node, gfc_index_one_node, len);
   type = build_array_type (eltype, bounds);
   TYPE_STRING_FLAG (type) = 1;
-
+  if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
+    {
+      /* TODO: A more middle-end friendly alternative would be to use NULL_TREE
+	 as upper bound and store the value elsewhere; caveat: this requires
+	 some cleanup throughout the code to consistently use some wrapper
+	 function.  */
+      gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
+      TYPE_SIZE (type) = TREE_OPERAND (TYPE_SIZE (type), 0);
+      TYPE_SIZE_UNIT (type) = TREE_OPERAND (TYPE_SIZE_UNIT (type), 0);
+    }
   return type;
 }
 
 tree
-gfc_get_character_type_len (int kind, tree len)
+gfc_get_character_type_len (int kind, tree len, bool deferred)
 {
   gfc_validate_kind (BT_CHARACTER, kind, false);
-  return gfc_get_character_type_len_for_eltype (gfc_get_char_type (kind), len);
+  return gfc_get_character_type_len_for_eltype (gfc_get_char_type (kind), len,
+						deferred);
 }
 
 
 /* Get a type node for a character kind.  */
 
 tree
-gfc_get_character_type (int kind, gfc_charlen * cl)
+gfc_get_character_type (int kind, gfc_charlen * cl, bool deferred)
 {
   tree len;
 
@@ -1145,7 +1165,7 @@ gfc_get_character_type (int kind, gfc_charlen * cl)
   if (len && POINTER_TYPE_P (TREE_TYPE (len)))
     len = build_fold_indirect_ref (len);
 
-  return gfc_get_character_type_len (kind, len);
+  return gfc_get_character_type_len (kind, len, deferred);
 }
 \f
 /* Convert a basic type.  This will be an array for character types.  */
@@ -1189,13 +1209,14 @@ gfc_typenode_for_spec (gfc_typespec * spec, int codim)
       break;
 
     case BT_CHARACTER:
-      basetype = gfc_get_character_type (spec->kind, spec->u.cl);
+      basetype = gfc_get_character_type (spec->kind, spec->u.cl,
+					 spec->deferred);
       break;
 
     case BT_HOLLERITH:
       /* Since this cannot be used, return a length one character.  */
       basetype = gfc_get_character_type_len (gfc_default_character_kind,
-					     gfc_index_one_node);
+					     gfc_index_one_node, false);
       break;
 
     case BT_UNION:
diff --git a/gcc/fortran/trans-types.h b/gcc/fortran/trans-types.h
index 2dc692325cf..b2a0375ddfa 100644
--- a/gcc/fortran/trans-types.h
+++ b/gcc/fortran/trans-types.h
@@ -81,9 +81,9 @@ tree gfc_get_complex_type (int);
 tree gfc_get_logical_type (int);
 tree gfc_get_char_type (int);
 tree gfc_get_pchar_type (int);
-tree gfc_get_character_type (int, gfc_charlen *);
-tree gfc_get_character_type_len (int, tree);
-tree gfc_get_character_type_len_for_eltype (tree, tree);
+tree gfc_get_character_type (int, gfc_charlen *, bool);
+tree gfc_get_character_type_len (int, tree, bool);
+tree gfc_get_character_type_len_for_eltype (tree, tree, bool);
 
 tree gfc_sym_type (gfc_symbol *, bool is_bind_c_arg = false);
 tree gfc_get_cfi_type (int dimen, bool restricted);

             reply	other threads:[~2023-02-17 11:14 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-17 11:13 Tobias Burnus [this message]
2023-02-17 16:27 ` Steve Kargl
2023-02-20  6:56   ` Tobias Burnus
2023-02-20  7:24     ` Steve Kargl
2023-02-20 10:41     ` Richard Biener
2023-02-20 11:07       ` Tobias Burnus
2023-02-20 11:15         ` Jakub Jelinek
2023-02-20 11:48           ` Tobias Burnus
2023-02-20 11:56             ` Jakub Jelinek
2023-02-20 12:46               ` Richard Biener
2023-02-20 16:23                 ` Tobias Burnus
2023-02-21  7:30                   ` Richard Biener
2023-02-21  9:44                     ` Jakub Jelinek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=27cd606a-f019-60b2-a9c8-0a570433b5eb@codesourcery.com \
    --to=tobias@codesourcery.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=paul.richard.thomas@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).