public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug fortran/52606] New: Confusing diagnostics for long identifiers
@ 2012-03-16 19:24 anlauf at gmx dot de
  2012-03-16 23:18 ` [Bug fortran/52606] " steven at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: anlauf at gmx dot de @ 2012-03-16 19:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

             Bug #: 52606
           Summary: Confusing diagnostics for long identifiers
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: fortran
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: anlauf@gmx.de


Excessively long identifiers can produce quite confusing diagnostics.
Perhaps error recovery could be improved.

Example:

module gfcbug118
  implicit none
  interface foo
     module procedure very_long_identifier_beyond_31char
  end interface
contains
  subroutine very_long_identifier_beyond_31char ()
  end subroutine very_long_identifier_beyond_31char
end module gfcbug118

% gfc gfcbug118.f90 -std=f95
gfcbug118.f90:4.54:

     module procedure very_long_identifier_beyond_31char
                                                      1
Error: Name at (1) is too long
gfcbug118.f90:7.45:

  subroutine very_long_identifier_beyond_31char ()
                                             1
Error: Name at (1) is too long
gfcbug118.f90:8.5:

  end subroutine very_long_identifier_beyond_31char
     1
Error: Expecting END MODULE statement at (1)
gfcbug118.f90:9.20:

end module gfcbug118
                    1
Error: Fortran 2008: CONTAINS statement without FUNCTION or SUBROUTINE
statement at (1)


%  gfc gfcbug118.f90 -fmax-identifier-length=31
gfcbug118.f90:4.54:

     module procedure very_long_identifier_beyond_31char
                                                      1
Error: Name at (1) is too long
gfcbug118.f90:7.45:

  subroutine very_long_identifier_beyond_31char ()
                                             1
Error: Name at (1) is too long
gfcbug118.f90:8.5:

  end subroutine very_long_identifier_beyond_31char
     1
Error: Expecting END MODULE statement at (1)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
@ 2012-03-16 23:18 ` steven at gcc dot gnu.org
  2012-03-16 23:22 ` steven at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: steven at gcc dot gnu.org @ 2012-03-16 23:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-03-16
                 CC|                            |steven at gcc dot gnu.org
     Ever Confirmed|0                           |1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
  2012-03-16 23:18 ` [Bug fortran/52606] " steven at gcc dot gnu.org
@ 2012-03-16 23:22 ` steven at gcc dot gnu.org
  2012-03-17  9:46 ` kargl at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: steven at gcc dot gnu.org @ 2012-03-16 23:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

--- Comment #1 from Steven Bosscher <steven at gcc dot gnu.org> 2012-03-16 23:18:26 UTC ---
Someting as trivial as the following would perhaps already help (not tested):

Index: match.c
===================================================================
--- match.c    (revision 185477)
+++ match.c    (working copy)
@@ -546,8 +546,8 @@ gfc_match_name (char *buffer)

       if (i > gfc_option.max_identifier_length)
     {
-      gfc_error ("Name at %C is too long");
-      return MATCH_ERROR;
+      gfc_error_now ("Name at %C is too long");
+      goto too_long_or_done;
     }

       old_loc = gfc_current_locus;
@@ -555,6 +555,7 @@ gfc_match_name (char *buffer)
     }
   while (ISALNUM (c) || c == '_' || (gfc_option.flag_dollar_ok && c == '$'));

+too_long_or_done:
   if (c == '$' && !gfc_option.flag_dollar_ok)
     {
       gfc_error ("Invalid character '$' at %C. Use -fdollar-ok to allow it "


Ideally, we'd just accept long identifiers, but the maximum name length is
hard-coded (GFC_MAX_SYMBOL_LEN) and already way too large. I want to change
this to string pointers (probably using a string pool, maybe the GCC common
implementation, TBD).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
  2012-03-16 23:18 ` [Bug fortran/52606] " steven at gcc dot gnu.org
  2012-03-16 23:22 ` steven at gcc dot gnu.org
@ 2012-03-17  9:46 ` kargl at gcc dot gnu.org
  2012-03-20  8:35 ` jb at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: kargl at gcc dot gnu.org @ 2012-03-17  9:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

kargl at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kargl at gcc dot gnu.org

--- Comment #2 from kargl at gcc dot gnu.org 2012-03-17 04:57:55 UTC ---
(In reply to comment #1)

> 
> Ideally, we'd just accept long identifiers, but the maximum name length is
> hard-coded (GFC_MAX_SYMBOL_LEN) and already way too large. I want to change
> this to string pointers (probably using a string pool, maybe the GCC common
> implementation, TBD).

The OP used -std=f95, the max identifier is 31.  gfortran's
behavior is already ideal in that she correctly reports
an error.

If the OP finds the run-on diagnostics confusing, then
OP should use -fmax-errors=1.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (2 preceding siblings ...)
  2012-03-17  9:46 ` kargl at gcc dot gnu.org
@ 2012-03-20  8:35 ` jb at gcc dot gnu.org
  2012-03-20  9:18 ` burnus at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jb at gcc dot gnu.org @ 2012-03-20  8:35 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

Janne Blomqvist <jb at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jb at gcc dot gnu.org

--- Comment #3 from Janne Blomqvist <jb at gcc dot gnu.org> 2012-03-20 08:30:08 UTC ---
(In reply to comment #1)
> Ideally, we'd just accept long identifiers, but the maximum name length is
> hard-coded (GFC_MAX_SYMBOL_LEN) and already way too large. I want to change
> this to string pointers (probably using a string pool, maybe the GCC common
> implementation, TBD).

Like Steve says, the Fortran standard puts an upper limit on the length of
valid identifiers. For F95 that's 31 characters, for newer standards this has
been bumped up to 63, which is the reason for the value of the
GFC_MAX_SYMBOL_LEN macro. So regardless of whether the strings are internally
stored in statically or dynamically sized buffers, we need to generate an error
for identifiers that exceed the limit.

For BIND(C) binding labels, I did basically the change you're proposing,
switching from static arrays to storing the labels in the symbol table. But
that is slightly different, as bind(C) binding labels are not Fortran
identifiers, and are thus allowed to be arbitrarily long. See PR 51808.

Finally, before attacking static GFC_MAX_SYMBOL_LEN buffers, I suggest getting
rid of static buffers of GFC_MAX_MANGLED_SYMBOL_LEN size. We should get rid of
that macro altogether. Also, FWIW, the value for that macro is too small at the
moment, as it's easy to construct examples were the mangled name exceeds that
value. See e.g. PR 51802 for inspiration. ;-)


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (3 preceding siblings ...)
  2012-03-20  8:35 ` jb at gcc dot gnu.org
@ 2012-03-20  9:18 ` burnus at gcc dot gnu.org
  2012-03-20 10:53 ` jb at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: burnus at gcc dot gnu.org @ 2012-03-20  9:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

Tobias Burnus <burnus at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |burnus at gcc dot gnu.org

--- Comment #4 from Tobias Burnus <burnus at gcc dot gnu.org> 2012-03-20 09:10:13 UTC ---
(In reply to comment #3)
> Finally, before attacking static GFC_MAX_SYMBOL_LEN buffers, I suggest getting
> rid of static buffers of GFC_MAX_MANGLED_SYMBOL_LEN size. We should get rid of
> that macro altogether. Also, FWIW, the value for that macro is too small at
> the moment, as it's easy to construct examples were the mangled name exceeds
> that value. See e.g. PR 51802 for inspiration. ;-)

Indeed, one can construct examples which exceed the length: Namely module
procedures or module variables. The problem is that GFC_MAX_MANGLED_SYMBOL_LEN
allows for GFC_MAX_SYMBOL_LEN*2+4 = 130 characters but one has the prefix "__"
plus "_MOD_" which is not 4 but 7 characters long. Thus, if one goes to the
limit, one can create symbols which have the same mangled name. Though, using
both the maximal length for the module name and for the symbol name is a very
constructed example.

On the other hand, PR 51802 does not inspire at all: That PR is about
polymorphism-related names. And those are hashed if they don't fit.


Example

module Very_long_module_name_having_a_length_of_63_characters_really_x
contains
subroutine an_extremely_long_subroutine_name_which_looks_rather_ugly123456
contains
subroutine an_extremely_long_internal_subroutine_name_which_looks_ugly
  type plus_a_very_very_very_long_type_name_of_almost_the_same_length
  end type
class(plus_a_very_very_very_long_type_name_of_almost_the_same_length), &
  allocatable :: t
end subroutine
end subroutine
end module


According to "nm", one then gets the following symbols - the last one should
end in "123456" but only ends in "123".

__very_long_module_name_having_a_length_of_63_characters_really_x_MOD___copy_1169DDF
__very_long_module_name_having_a_length_of_63_characters_really_x_MOD___def_init_1169DDF
__very_long_module_name_having_a_length_of_63_characters_really_x_MOD___vtab_1169DDF
__very_long_module_name_having_a_length_of_63_characters_really_x_MOD_an_extremely_long_subroutine_name_which_looks_rather_ugly123


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (4 preceding siblings ...)
  2012-03-20  9:18 ` burnus at gcc dot gnu.org
@ 2012-03-20 10:53 ` jb at gcc dot gnu.org
  2012-03-20 11:04 ` burnus at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jb at gcc dot gnu.org @ 2012-03-20 10:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

--- Comment #5 from Janne Blomqvist <jb at gcc dot gnu.org> 2012-03-20 10:27:33 UTC ---
(In reply to comment #4)
> Indeed, one can construct examples which exceed the length: Namely module
> procedures or module variables. The problem is that GFC_MAX_MANGLED_SYMBOL_LEN
> allows for GFC_MAX_SYMBOL_LEN*2+4 = 130 characters but one has the prefix "__"
> plus "_MOD_" which is not 4 but 7 characters long. Thus, if one goes to the
> limit, one can create symbols which have the same mangled name. 

Yes. I suspect this bug dates back to the patch that introduced the _MOD_
scheme. Previously gfortran just used "_" to separate the module and procedure
name, allowing a cleverly named external procedure to clash with a module
procedure. 

>Though, using
> both the maximal length for the module name and for the symbol name is a very
> constructed example.

Indeed. If my suspicion above is correct, the bug has been present for ages,
and no PR has been filed.

> On the other hand, PR 51802 does not inspire at all: That PR is about
> polymorphism-related names. And those are hashed if they don't fit.

Ah, I forgot about the hashing scheme, and assumed that as per the PR
identifiers appears up to three times in the mangled name, the factor "*2" in
the calculation of GFC_MAX_MANGLED_SYMBOL_LEN would be incorrect (in addition
to the additive factor being incorrect).

What was the motivation for this hashing scheme, BTW? Linkers already support
1) long symbol names (I read somewhere that OpenOffice has symbols up to 4000
(!!!) characters long) 2) various symbol hashing schemes (see e.g.
DT_GNU_HASH).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (5 preceding siblings ...)
  2012-03-20 10:53 ` jb at gcc dot gnu.org
@ 2012-03-20 11:04 ` burnus at gcc dot gnu.org
  2012-03-21  9:06 ` jb at gcc dot gnu.org
  2013-06-24  6:56 ` dominiq at lps dot ens.fr
  8 siblings, 0 replies; 10+ messages in thread
From: burnus at gcc dot gnu.org @ 2012-03-20 11:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

--- Comment #6 from Tobias Burnus <burnus at gcc dot gnu.org> 2012-03-20 10:58:13 UTC ---
(In reply to comment #5)
> What was the motivation for this hashing scheme, BTW? Linkers already support
> 1) long symbol names (I read somewhere that OpenOffice has symbols up to 4000
> (!!!) characters long) 2) various symbol hashing schemes (see e.g.
> DT_GNU_HASH).

I think the idea was to have legible dumps (i.e. avoid hashing everything) but
also to fit them into the various  name[GFC_MAX_SYMBOL_LEN]  variables which
simply do not take longer names.

If you think one can/should improve the scheme, feel free to propose something
better. (Only) when switching to the new descriptor, we can change it.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (6 preceding siblings ...)
  2012-03-20 11:04 ` burnus at gcc dot gnu.org
@ 2012-03-21  9:06 ` jb at gcc dot gnu.org
  2013-06-24  6:56 ` dominiq at lps dot ens.fr
  8 siblings, 0 replies; 10+ messages in thread
From: jb at gcc dot gnu.org @ 2012-03-21  9:06 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

--- Comment #7 from Janne Blomqvist <jb at gcc dot gnu.org> 2012-03-21 08:48:38 UTC ---
(In reply to comment #6)
> (In reply to comment #5)
> > What was the motivation for this hashing scheme, BTW? Linkers already support
> > 1) long symbol names (I read somewhere that OpenOffice has symbols up to 4000
> > (!!!) characters long) 2) various symbol hashing schemes (see e.g.
> > DT_GNU_HASH).
> 
> I think the idea was to have legible dumps (i.e. avoid hashing everything) but
> also to fit them into the various  name[GFC_MAX_SYMBOL_LEN]  variables which
> simply do not take longer names.
> 
> If you think one can/should improve the scheme, feel free to propose something
> better.

Well, the obvious fix would be to not try cramming mangled identifiers into
buffers which are statically sized only to hold unqualified identifiers, no?
Essentially, Steven B's proposal in #c1 would fix this as a side-effect.

Adding insult to injury, a workaround for this design mistake is enshrined in
our ABI.. *sigh*.

Grepping for GFC_MAX_SYMBOL_LEN shows 118 occurences, so it would be quite a
large patch, but probably quite mechanical for the most part. Steven, are you
planning to fix this?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug fortran/52606] Confusing diagnostics for long identifiers
  2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
                   ` (7 preceding siblings ...)
  2012-03-21  9:06 ` jb at gcc dot gnu.org
@ 2013-06-24  6:56 ` dominiq at lps dot ens.fr
  8 siblings, 0 replies; 10+ messages in thread
From: dominiq at lps dot ens.fr @ 2013-06-24  6:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52606

Dominique d'Humieres <dominiq at lps dot ens.fr> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW

--- Comment #8 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
Was marked as ASSIGNED, but actually "Not yet assigned to anyone". Set to NEW.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-06-24  6:56 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-16 19:24 [Bug fortran/52606] New: Confusing diagnostics for long identifiers anlauf at gmx dot de
2012-03-16 23:18 ` [Bug fortran/52606] " steven at gcc dot gnu.org
2012-03-16 23:22 ` steven at gcc dot gnu.org
2012-03-17  9:46 ` kargl at gcc dot gnu.org
2012-03-20  8:35 ` jb at gcc dot gnu.org
2012-03-20  9:18 ` burnus at gcc dot gnu.org
2012-03-20 10:53 ` jb at gcc dot gnu.org
2012-03-20 11:04 ` burnus at gcc dot gnu.org
2012-03-21  9:06 ` jb at gcc dot gnu.org
2013-06-24  6:56 ` dominiq at lps dot ens.fr

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).