public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
From: Tobias Burnus <tobias@codesourcery.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>, fortran <fortran@gcc.gnu.org>
Subject: [Patch] Fortran: Fix reallocation on assignment for kind=4 strings [PR107508]
Date: Sat, 5 Nov 2022 23:28:47 +0100	[thread overview]
Message-ID: <dee92d68-139c-9a2d-325e-2c3f402291e8@codesourcery.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 3763 bytes --]

Prior to the attached patch, there is a problem with realloc on assignment
with kind=4 characters as the string length was compared with the byte size,
which was always true.

I initially thought, looking at the code, that scalars have the same issues,
but they don't; hence, I ended up with a comment and a cleanup.

For arrays: The issue shows up in the testcase (→ PR) because there was
unnecessary reallocation on assignment, which changed the lower bound to 1.

The rest, I found looking at the dump:

(a) cond_null was:
     D.4298 = .a4str != 7 || (character(kind=4)[0:][1:.a4str] *) a4str.data == 0B;
...
           if (D.4298)
               a4str.data = __builtin_malloc (168);
           else
               a4str.data = __builtin_realloc (a4str.data, 168);
which is the wrong condition. It should be just:
           D.4298 = (character(kind=4)[0:][1:.a4str] *) a4str.data == 0B;
to avoid a memory leak.

(b) The rest was removing bogus code; I think it did not do any harm, but makes
the code and the dump rather convoluted.

The dump (with and without the patch) starts with:

           D.4295 = .a4str * 4;
           .a4str = 7;
           D.4298 = (character(kind=4)[0:][1:.a4str] *) a4str.data == 0B;
           if (D.4298) goto L.6;
           if (a4str.dim[0].lbound + 5 != a4str.dim[0].ubound) goto L.6;
           if (D.4295 != 28) goto L.6;
           goto L.7;
           L.6:;
           a4str.dim[0].lbound = 1;
....
           if (D.4298)
               a4str.data = __builtin_malloc (168);
           else
               a4str.data = __builtin_realloc (a4str.data, 168);
           L.7:;

Thus, any code which reaches L.6 should be reallocated and any code
which does not, shouldn't.

The deleted code did add directly after L.6 the following additional code:
         if (D.4298)
             D.4282 = 0;
         else
             D.4282 = MAX_EXPR <a4str.dim[0].ubound - a4str.dim[0].lbound, -1> + 1;
         D.4283 = D.4282 != 6;
and it changed the 'else' into an 'else if' in
           if (D.4298)
               a4str.data = __builtin_malloc (168);
           else if (D.4283)
               a4str.data = __builtin_realloc (a4str.data, 168);

Closely looking at the added condition and at source code, it
does essentially the same check as the code which guarded the L.6 to L.7
code. Thus, the condition should always evaluate as true.

Codewise, the 'D.4282 != 6'  is the 'size1 != size2' array size comparison.

I think it was the now removed code was there before, but then someone
realized the array bounds problem - and the new code was added without
actually removing the old one. The handling of deferred strings both in
the bogus condition for cond_null and by setting 'D.4283' to always true
is not only wrong but implies some early hack.
However, I have not checked the history to confirm my suspicion.

OK for mainline?

Tobias

PS: I have the feeling that there might be an issue with finalization/derived-type
handling in case of 'realloc' as I did not spot finalization code between the size
check and the malloc/realloc. The malloc case should be fine – but if realloc shrinks
the memory, elements beyond the new last element in storage order would access invalid
memory. – However, I have not checked whether there is indeed a problem as
I concentrated on fixing this issue.

PPS: I lost track of pending patches. Are they any which I should review?
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: realloc-wide-char.diff --]
[-- Type: text/x-patch, Size: 8732 bytes --]

Fortran: Fix reallocation on assignment for kind=4 strings [PR107508]

The check whether reallocation on assignment was required did not handle
kind=4 characters correctly such that there was always a reallocation,
implying issues with pointer addresses and lower bounds.  Additionally,
with all deferred strings, the old memory was not freed on reallocation.
And, finally, inside the block which was only executed if string lengths
or bounds or dynamic types changed, was a subcheck of the same, which
was effectively a no op but still confusing and at least added with -O0
extra instructions to the binary.

	PR fortran/107508

gcc/fortran/ChangeLog:

	* trans-array.cc (gfc_alloc_allocatable_for_assignment): Fix
	string-length check, plug memory leak, and avoid generation of
	effectively no-op code.
	* trans-expr.cc (alloc_scalar_allocatable_for_assignment): Extend
	comment; minor cleanup.

gcc/testsuite/ChangeLog:

	* gfortran.dg/widechar_11.f90: New test.

 gcc/fortran/trans-array.cc                | 57 ++++---------------------------
 gcc/fortran/trans-expr.cc                 |  8 ++---
 gcc/testsuite/gfortran.dg/widechar_11.f90 | 52 ++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 55 deletions(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 514cb057afb..b7d4c41b5fe 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -10527,7 +10527,6 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   tree offset;
   tree jump_label1;
   tree jump_label2;
-  tree neq_size;
   tree lbd;
   tree class_expr2 = NULL_TREE;
   int n;
@@ -10607,6 +10606,11 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
 	elemsize1 = expr1->ts.u.cl->backend_decl;
       else
 	elemsize1 = lss->info->string_length;
+      tree unit_size = TYPE_SIZE_UNIT (gfc_get_char_type (expr1->ts.kind));
+      elemsize1 = fold_build2_loc (input_location, MULT_EXPR,
+				   TREE_TYPE (elemsize1), elemsize1,
+				   fold_convert (TREE_TYPE (elemsize1), unit_size));
+
     }
   else if (expr1->ts.type == BT_CLASS)
     {
@@ -10699,19 +10703,7 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   /* Allocate if data is NULL.  */
   cond_null = fold_build2_loc (input_location, EQ_EXPR, logical_type_node,
 			 array1, build_int_cst (TREE_TYPE (array1), 0));
-
-  if (expr1->ts.type == BT_CHARACTER && expr1->ts.deferred)
-    {
-      tmp = fold_build2_loc (input_location, NE_EXPR,
-			     logical_type_node,
-			     lss->info->string_length,
-			     rss->info->string_length);
-      cond_null = fold_build2_loc (input_location, TRUTH_OR_EXPR,
-				   logical_type_node, tmp, cond_null);
-      cond_null= gfc_evaluate_now (cond_null, &fblock);
-    }
-  else
-    cond_null= gfc_evaluate_now (cond_null, &fblock);
+  cond_null= gfc_evaluate_now (cond_null, &fblock);
 
   tmp = build3_v (COND_EXPR, cond_null,
 		  build1_v (GOTO_EXPR, jump_label1),
@@ -10778,19 +10770,6 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   tmp = build1_v (LABEL_EXPR, jump_label1);
   gfc_add_expr_to_block (&fblock, tmp);
 
-  /* If the lhs has not been allocated, its bounds will not have been
-     initialized and so its size is set to zero.  */
-  size1 = gfc_create_var (gfc_array_index_type, NULL);
-  gfc_init_block (&alloc_block);
-  gfc_add_modify (&alloc_block, size1, gfc_index_zero_node);
-  gfc_init_block (&realloc_block);
-  gfc_add_modify (&realloc_block, size1,
-		  gfc_conv_descriptor_size (desc, expr1->rank));
-  tmp = build3_v (COND_EXPR, cond_null,
-		  gfc_finish_block (&alloc_block),
-		  gfc_finish_block (&realloc_block));
-  gfc_add_expr_to_block (&fblock, tmp);
-
   /* Get the rhs size and fix it.  */
   size2 = gfc_index_one_node;
   for (n = 0; n < expr2->rank; n++)
@@ -10807,16 +10786,6 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
     }
   size2 = gfc_evaluate_now (size2, &fblock);
 
-  cond = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
-			  size1, size2);
-
-  /* If the lhs is deferred length, assume that the element size
-     changes and force a reallocation.  */
-  if (expr1->ts.deferred)
-    neq_size = gfc_evaluate_now (logical_true_node, &fblock);
-  else
-    neq_size = gfc_evaluate_now (cond, &fblock);
-
   /* Deallocation of allocatable components will have to occur on
      reallocation.  Fix the old descriptor now.  */
   if ((expr1->ts.type == BT_DERIVED)
@@ -11048,20 +11017,6 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   gfc_add_block_to_block (&realloc_block, &caf_se.post);
   realloc_expr = gfc_finish_block (&realloc_block);
 
-  /* Reallocate if sizes or dynamic types are different.  */
-  if (elemsize1)
-    {
-      tmp = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
-			     elemsize1, elemsize2);
-      tmp = gfc_evaluate_now (tmp, &fblock);
-      neq_size = fold_build2_loc (input_location, TRUTH_OR_EXPR,
-				  logical_type_node, neq_size, tmp);
-    }
-  tmp = build3_v (COND_EXPR, neq_size, realloc_expr,
-		  build_empty_stmt (input_location));
-
-  realloc_expr = tmp;
-
   /* Malloc expression.  */
   gfc_init_block (&alloc_block);
   if (!coarray)
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index e7b9211f17e..44c373cc495 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -11236,10 +11236,10 @@ alloc_scalar_allocatable_for_assignment (stmtblock_t *block,
 
   if (expr1->ts.type == BT_CHARACTER && expr1->ts.deferred)
     {
-      /* Use the rhs string length and the lhs element size.  */
-      size = string_length;
-      tmp = TREE_TYPE (gfc_typenode_for_spec (&expr1->ts));
-      tmp = TYPE_SIZE_UNIT (tmp);
+      /* Use the rhs string length and the lhs element size. Note that 'size' is
+	 used below for the string-length comparison, only.  */
+      size = string_length,
+      tmp = TYPE_SIZE_UNIT (gfc_get_char_type (expr2->ts.kind));
       size_in_bytes = fold_build2_loc (input_location, MULT_EXPR,
 				       TREE_TYPE (tmp), tmp,
 				       fold_convert (TREE_TYPE (tmp), size));
diff --git a/gcc/testsuite/gfortran.dg/widechar_11.f90 b/gcc/testsuite/gfortran.dg/widechar_11.f90
new file mode 100644
index 00000000000..3cb8d956c74
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/widechar_11.f90
@@ -0,0 +1,52 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original" }
+!
+! PR fortran/107508
+!
+use iso_c_binding
+implicit none
+!character(len=:,kind=4), allocatable, target :: a4str(:), a4str2
+character(len=7,kind=4), allocatable, target :: a4str(:), a4str2
+type(c_ptr) :: cptr, cptr2
+
+!allocate(character(len=7,kind=4) :: a4str(-2:3))
+!allocate(character(len=9,kind=4) :: a4str2)
+
+!cptr = c_loc(a4str)
+!cptr2 = c_loc(a4str2)
+!
+!if (len(a4str) /= 7) error stop
+!if (lbound(a4str,1) /= -2) error stop
+!if (ubound(a4str,1) /= 3) error stop
+!if (len(a4str2) /= 9) error stop
+!
+a4str = [4_"sf456aq", 4_"3dtzu24", 4_"_4fh7sm", 4_"=ff85s7", 4_"j=8af4d", 4_".,A%Fsz"]
+a4str2 = 4_"4f5g5f8a9"
+
+!print *, lbound(a4str), ubound(a4str)  ! expected (-2:3) - actually: (1:6)
+
+!if (len(a4str) /= 7) error stop
+!if (lbound(a4str,1) /= -2) error stop
+!if (ubound(a4str,1) /= 3) error stop
+!if (len(a4str2) /= 9) error stop
+!if (.not. c_associated (cptr, c_loc(a4str))) error stop
+!if (.not. c_associated (cptr2, c_loc(a4str2))) error stop
+end
+
+! { dg-final { scan-tree-dump-times "__builtin_malloc" 4 "original" } }
+! { dg-final { scan-tree-dump-times "__builtin_realloc" 2 "original" } }
+
+! { dg-final { scan-tree-dump-times "a4str.data = __builtin_malloc \\(168\\);" 2 "original" } }
+! { dg-final { scan-tree-dump-times "a4str.data = __builtin_realloc \\(a4str.data, 168\\);" 1 "original" } }
+! { dg-final { scan-tree-dump-times "a4str2 = \\(character\\(kind=4\\)\\\[1:.a4str2\\\] \\*\\) __builtin_malloc \\(36\\);" 2 "original" } }
+! { dg-final { scan-tree-dump-times "a4str2 = \\(character\\(kind=4\\)\\\[1:.a4str2\\\] \\*\\) __builtin_realloc \\(\\(void \\*\\) a4str2, 36\\);" 1 "original" } }
+
+! Array: Assert, realloc-check assign string length (alloc + (realloc'ed) assignment):
+! { dg-final { scan-tree-dump-times "if \\(\[^\\n\\r\]*\\.a4str != 7\\)" 2 "original" } }
+! { dg-final { scan-tree-dump-times "if \\(D\\.\[0-9\]+ != 28\\) goto L\\." 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\.a4str = 7;" 2 "original" } }
+
+! Scalar: Assert, realloc-check assign string length (alloc + (realloc'ed) assignment):
+! { dg-final { scan-tree-dump-times "if \\(\[^\\n\\r\]*\\.a4str2 != 9\\)" 2 "original" } }
+! { dg-final { scan-tree-dump-times "if \\(\\.a4str2 == 9\\) goto L\\." 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\.a4str2 = 9;" 2 "original" } }

             reply	other threads:[~2022-11-05 22:28 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-05 22:28 Tobias Burnus [this message]
2022-11-05 23:45 ` Bernhard Reutner-Fischer
2022-11-06 20:32 ` Mikael Morin
2022-11-06 22:15   ` Tobias Burnus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dee92d68-139c-9a2d-325e-2c3f402291e8@codesourcery.com \
    --to=tobias@codesourcery.com \
    --cc=fortran@gcc.gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).