From mboxrd@z Thu Jan 1 00:00:00 1970
From: Harald Anlauf
To: Tobias Burnus
Cc: Bernhard Reutner-Fischer, Harald Anlauf via Gcc-patches, fortran
Subject: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514
Date: Wed, 18 Aug 2021 23:01:25 +0200
References: <20210610122435.296a207d@nbbrfq>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=rekceb-c9625c2b-547e-4b56-921a-9454947523fd
List-Id: Fortran mailing list

--rekceb-c9625c2b-547e-4b56-921a-9454947523fd
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Hi Tobias,

> Sent: Wednesday, 18 August 2021 at 12:22
> From: "Tobias Burnus"
> > Note, however, that gfc_simplify_len still handles neither
> > deferred strings nor their substrings.
> >
> > I think there is nothing to simplify at compile time here.
>
> Obviously, nonsubstrings cannot be simplified but I do not
> see why len(str(1:2)) cannot or should not be simplified.
>
> (Not that I regard substring length inquiries as that common.)

well, here's an example that Intel rejects:

  type u
     character(8) :: s(4)
     character(:), allocatable :: str
  end type u
  type(u) :: q
  integer, parameter :: k2 = len (q% s(:)(3:4)) ! OK
  integer, parameter :: k3 = len (q% str (3:4)) ! Rejected by Intel
  print *, k2
  if (k2 /= 2) stop 2
  print *, k3
  if (k3 /= 2) stop 3
end

pr100950-ww.f90(7): error #6814: When using this inquiry function, the length of this object cannot be evaluated to a constant. [LEN]
  integer, parameter :: k3 = len (q% str (3:4)) ! Rejected by Intel
-----------------------------^
pr100950-ww.f90(7): error #7169: Bad initialization expression. [LEN]
  integer, parameter :: k3 = len (q% str (3:4)) ! Rejected by Intel
-----------------------------^

Of course we could accept it regardless of what others do.  I have
therefore removed the check for deferred length in the attached patch
(but read on).

> However, there is no reason why the user cannot do:
>   if (allocated(str)) then
>     n = len(str)
>     m = len(str(5:8))
>   end if
> and why the compiler cannot replace the latter by 'm = 4'.

Maybe you can enlighten me here.  I thought one of the purposes of
gfc_simplify_len is to evaluate constant expressions.  Of course the
length is constant, provided the bounds are respected.  Otherwise the
result is, well, ...

(It will then be evaluated at run time, which I thought was fine.)

> But, IMHO, the latter remark does _not_ imply that we
> shall/must/have to accept code like:
>
> if (allocated(str)) then
>   block
>     integer, parameter :: n = len(str(:5))
>   end block
> endif

So shall we not simplify here (and thus reject it)?  This is important!
Or silently simplify and accept it?

> With the caveat from above that len() is rather special,
> there is no real reason why: str_array(:)(4:5) cannot be handled.
> (→ len = 2).

Good point.  This is fixed in the revised patch and tested for.

> > The updated patch regtests fine.  OK?
> Looks good to me except for the caveats.

Regtested again.

> * * *
>
> And while the following works
>
>   x = var%str(:)%len ! ok, yields 5
>   y = str2(:)%len ! ok, yields 5
>
> the following is wrongly rejected:
>
>   x = var%str(:)(1:1)%len ! Bogus: 'Invalid character in name'
>   y = str2(:)(1:1)%len ! Bogus: 'Invalid character in name'
>
> (likewise with '%kind')
>
> (As "SUBSTRING % LEN", it also appears in the '16.9.99 INDEX',
> but '9.4.5 Type parameter inquiry's 'R916 type-param-inquiry'
> is the official reference.)
>
> If you don't want to spend time on this subpart - you could
> file a PR.

Well, there's already

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101735

which is a much better suited place for discussion.

> * * *
>
> For deferred length, I have no strong opinion; in
> any case, the upper substring bound > stringlen check
> cannot be done in that case (at compile time). I think
> I slightly prefer doing the optimization – but as it
> is a special case and has some caveats (must be allocated,
> upper bound check not possible, ...) I also see reasons
> not to do it. Hence, it also can remain as in your patch.

Actually, this is now an important point.  If we really want to allow
substrings of deferred-length strings in constant expressions, the new
patch would be fine; otherwise I would have to reintroduce the hunk

+  if (e->ts.deferred)
+    return NULL;

and adjust the testcase.

Your choice.  See above.

Of course there may be more corner cases which I did not think of...

Thanks,
Harald

--rekceb-c9625c2b-547e-4b56-921a-9454947523fd
Content-Type: text/x-patch
Content-Disposition: attachment; filename=pr100950-v4.patch
Content-Transfer-Encoding: 7bit

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index c27b47aa98f..cf0a4387788 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4512,6 +4512,69 @@ gfc_simplify_leadz (gfc_expr *e)
 }
 
 
+/* Check for constant length of a substring.  */
+
+static bool
+substring_has_constant_len (gfc_expr *e)
+{
+  gfc_ref *ref;
+  HOST_WIDE_INT istart, iend, length;
+  bool equal_length = false;
+
+  if (e->ts.type != BT_CHARACTER || e->ts.deferred)
+    return false;
+
+  for (ref = e->ref; ref; ref = ref->next)
+    if (ref->type != REF_COMPONENT && ref->type != REF_ARRAY)
+      break;
+
+  if (!ref
+      || ref->type != REF_SUBSTRING
+      || !ref->u.ss.start
+      || ref->u.ss.start->expr_type != EXPR_CONSTANT
+      || !ref->u.ss.end
+      || ref->u.ss.end->expr_type != EXPR_CONSTANT
+      || !ref->u.ss.length
+      || !ref->u.ss.length->length
+      || ref->u.ss.length->length->expr_type != EXPR_CONSTANT)
+    return false;
+
+  /* Basic checks on substring starting and ending indices.  */
+  if (!gfc_resolve_substring (ref, &equal_length))
+    return false;
+
+  istart = gfc_mpz_get_hwi (ref->u.ss.start->value.integer);
+  iend = gfc_mpz_get_hwi (ref->u.ss.end->value.integer);
+  length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
+
+  if (istart <= iend)
+    {
+      if (istart < 1)
+        {
+          gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
+                     ") at %L below 1",
+                     istart, &ref->u.ss.start->where);
+          return false;
+        }
+      if (iend > length)
+        {
+          gfc_error ("Substring end index (" HOST_WIDE_INT_PRINT_DEC
+                     ") at %L exceeds string length",
+                     iend, &ref->u.ss.end->where);
+          return false;
+        }
+      length = iend - istart + 1;
+    }
+  else
+    length = 0;
+
+  /* Fix substring length.  */
+  e->value.character.length = length;
+
+  return true;
+}
+
+
 gfc_expr *
 gfc_simplify_len (gfc_expr *e, gfc_expr *kind)
 {
@@ -4521,7 +4584,8 @@ gfc_simplify_len (gfc_expr *e, gfc_expr *kind)
   if (k == -1)
     return &gfc_bad_expr;
 
-  if (e->expr_type == EXPR_CONSTANT)
+  if (e->expr_type == EXPR_CONSTANT
+      || substring_has_constant_len (e))
     {
       result = gfc_get_constant_expr (BT_INTEGER, k, &e->where);
       mpz_set_si (result->value.integer, e->value.character.length);
diff --git a/gcc/testsuite/gfortran.dg/pr100950.f90 b/gcc/testsuite/gfortran.dg/pr100950.f90
new file mode 100644
index 00000000000..7de589fe882
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr100950.f90
@@ -0,0 +1,39 @@
+! { dg-do run }
+! PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514
+
+program p
+  character(8), parameter :: u = "123"
+  character(8) :: x = "", s
+  character(2) :: w(2) = [character(len(x(3:4))) :: 'a','b' ]
+  character(*), parameter :: y(*) = [character(len(u(3:4))) :: 'a','b' ]
+  character(*), parameter :: z(*) = [character(len(x(3:4))) :: 'a','b' ]
+  character(*), parameter :: t(*) = [character(len(x( :2))) :: 'a','b' ]
+  character(*), parameter :: v(*) = [character(len(x(7: ))) :: 'a','b' ]
+  type t_
+     character(len=5) :: s
+     character(len=8) :: t(4)
+     character(len=:), allocatable :: str
+  end type t_
+  type(t_) :: q, r(1)
+  integer, parameter :: lq = len (q%s(3:4)), lr = len (r%s(3:4))
+  integer, parameter :: l1 = len (q %t(1)(3:4))
+  integer, parameter :: l2 = len (q %t(:)(3:4))
+  integer, parameter :: l3 = len (q %str (3:4))
+  integer, parameter :: l4 = len (r(:)%t(1)(3:4))
+  integer, parameter :: l5 = len (r(1)%t(:)(3:4))
+  integer, parameter :: l6 = len (r(1)%str (3:4))
+
+  if (len (y) /= 2) stop 1
+  if (len (z) /= 2) stop 2
+  if (any (w /= y)) stop 3
+  if (len ([character(len(u(3:4))) :: 'a','b' ]) /= 2) stop 4
+  if (len ([character(len(x(3:4))) :: 'a','b' ]) /= 2) stop 5
+  if (any ([character(len(x(3:4))) :: 'a','b' ] /= y)) stop 6
+  write(s,*) [character(len(x(3:4))) :: 'a','b' ]
+  if (s /= " a b ") stop 7
+  if (len (t) /= 2) stop 8
+  if (len (v) /= 2) stop 9
+  if (lq /= 2 .or. lr /= 2) stop 10
+  if (l1 /= 2 .or. l2 /= 2 .or. l4 /= 2 .or. l5 /= 2) stop 11
+  if (l3 /= 2 .or. l6 /= 2) stop 12
+end
--rekceb-c9625c2b-547e-4b56-921a-9454947523fd--
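
As a reading aid for the patch, the bounds arithmetic in
substring_has_constant_len can be sketched in isolation.  This is a
minimal sketch and not GCC code: the name substring_const_len and the
use of plain long values in place of gfc_expr nodes and HOST_WIDE_INT
are assumptions made purely for illustration, and the diagnostics
issued via gfc_error are reduced to a false return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-alone helper, not part of GCC: mirrors the
   arithmetic applied to constant substring bounds in the patch.
   Returns true and stores the substring length in *out when the
   bounds are acceptable, false when a bound is out of range.  */
static bool
substring_const_len (long start, long end, long declared_len, long *out)
{
  assert (out != NULL);

  if (start > end)
    {
      /* A substring such as x(5:4) is empty; as in the patch, its
         bounds are not checked against the declared length.  */
      *out = 0;
      return true;
    }

  /* Non-empty substring: both bounds must lie within the string.  */
  if (start < 1 || end > declared_len)
    return false;

  *out = end - start + 1;
  return true;
}
```

With a declared length of 8, the bounds 3:4 give length 2 and 5:4 give
length 0, while 0:4 and 7:9 are rejected.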