Re: new parser: error recovery needs work

public inbox for gcc@gcc.gnu.org

* Re: new parser: error recovery needs work
@ 2003-01-16 13:40 Robert Dewar
  2003-01-16 17:20 ` Joe Buck
  2003-01-16 17:34 ` Joe Buck
  0 siblings, 2 replies; 36+ messages in thread
From: Robert Dewar @ 2003-01-16 13:40 UTC (permalink / raw)
  To: gcc, jbuck, mark

> To be honest, I'm somewhat unsympathetic.  Not because I think the error
> messages are good, or because I think that we shouldn't do better, but
> because it's hard to do better in some of these cases and because
> we do noticeably better in other cases -- the old parser just said
> "parse error" a lot. :-)

I certainly understand Mark's reaction here. I agree that it is unreasonable
to label as a regression particular cases in which the error message may or
may not be worse than the previous one (these things are somewhat subjective
after all).

I think the best thing is to recognize that we are never talking about a bug
when it comes to an unclear error message, but merely a possible opportunity
for improving things. It is almost worth having a special category for such
suggestions. 

One of my particular interests in the Ada parser has been to work on improving
the error messages (also improving error messages in the semantic analysis
pass). I find that if people report a bad error message as a bug, I am less
sympathetic than if they report it as "here is a case of a message that I
found confusing; it would be nice to make it better if possible" :-)


* Re: new parser: error recovery needs work
  2003-01-16 13:40 new parser: error recovery needs work Robert Dewar
@ 2003-01-16 17:20 ` Joe Buck
  2003-01-16 18:15   ` Mark Mitchell
  2003-01-16 17:34 ` Joe Buck
  1 sibling, 1 reply; 36+ messages in thread
From: Joe Buck @ 2003-01-16 17:20 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, mark

On Thu, Jan 16, 2003 at 06:33:44AM -0500, Robert Dewar wrote:
> > To be honest, I'm somewhat unsympathetic.  Not because I think the error
> > messages are good, or because I think that we shouldn't do better, but
> > because it's hard to do better in some of these cases and because
> > we do noticeably better in other cases -- the old parser just said
> > "parse error" a lot. :-)
> 
> I certainly understand Mark's reaction here. I agree that it is unreasonable
> to label as a regression particular cases in which the error message may or
> may not be worse than the previous one (these things are somewhat subjective
> after all).

But this is the very case that Gerald wanted to hold the 3.0 release over.
The reason is that the 2.9x -> 3.x transition moved all the C++ standard
library into the std:: namespace, as the standard required, but there
were tons of C++ code out there that compiled only on g++ 2.95.  The error
messages were completely useless unless we added a small patch to put out
a coherent error message.

Now, we don't need to do anything elaborate; a couple of patches to
recognize the most common cases will do.


* Re: new parser: error recovery needs work
  2003-01-16 13:40 new parser: error recovery needs work Robert Dewar
  2003-01-16 17:20 ` Joe Buck
@ 2003-01-16 17:34 ` Joe Buck
  2003-01-16 18:05   ` Gareth Pearce
  1 sibling, 1 reply; 36+ messages in thread
From: Joe Buck @ 2003-01-16 17:34 UTC (permalink / raw)
  To: Robert Dewar; +Cc: gcc, mark

On Thu, Jan 16, 2003 at 06:33:44AM -0500, Robert Dewar wrote:
> I think the best thing is to recognize that we are never talking about a bug
> when it comes to an unclear error message, but merely a possible opportunity
> for improving things. It is almost worth having a special category for such
> suggestions. 

Another point: we are talking about a quality regression if
we replace a parser that gives a sensible error message for an extremely
common error with a parser that prints an incomprehensible message.
Fortunately, I think that just two rules should get us back to the
same level as the old parser (one to detect attempts to declare a
variable with an undefined type, the other to detect attempts to use
an undefined template).  I'll put those in PRs.


* Re: new parser: error recovery needs work
  2003-01-16 17:34 ` Joe Buck
@ 2003-01-16 18:05   ` Gareth Pearce
  2003-01-16 20:00     ` Phil Edwards
  2003-01-16 23:04     ` Mark Mitchell
  0 siblings, 2 replies; 36+ messages in thread
From: Gareth Pearce @ 2003-01-16 18:05 UTC (permalink / raw)
  To: gcc

Just a couple of thoughts about errors in the new parser...

I'm not sure why, but the errors now feel more like they need caret errors
than before.  I think it's because they sound more precise, without giving
you the accuracy of where to go with it.  Before, it was like "oh, a parse
error, right, let me see", but now it's 'missing type-id' and it makes me
think: huh, where?  (I think I heard something about caret errors being on
the way; I just want to, err, suggest that they make it into 3.4.)

Second thought, the error messages of the form 'expected type-id' etc ...
are compact and precise.  Something in me would like 'waffly friendly' error
messages.  On the other hand, I don't think that's such a great idea.  However,
(if it already exists, forgive me) it would be good if there was really nice
extensive documentation on error messages and what they mean in nice
friendly language :P ... or something at least.  A tool which could look up
the documentation just by passing it the text of an error message ... would
be nice too.

Regards,
Gareth


* Re: new parser: error recovery needs work
  2003-01-16 17:20 ` Joe Buck
@ 2003-01-16 18:15   ` Mark Mitchell
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Mitchell @ 2003-01-16 18:15 UTC (permalink / raw)
  To: Joe Buck, Robert Dewar; +Cc: gcc

> Now, we don't need to do anything elaborate; a couple of patches to
> recognize the most common cases will do.

I was just grumbling; my feelings aren't that sensitive.

I've already got a patch ready to restore most of what was there, plus
fix some actual miscompilations.  Testing and making ChangeLogs as I
write this...

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: new parser: error recovery needs work
  2003-01-16 18:05   ` Gareth Pearce
@ 2003-01-16 20:00     ` Phil Edwards
  2003-01-16 20:15       ` Neil Booth
                         ` (3 more replies)
  2003-01-16 23:04     ` Mark Mitchell
  1 sibling, 4 replies; 36+ messages in thread
From: Phil Edwards @ 2003-01-16 20:00 UTC (permalink / raw)
  To: Gareth Pearce; +Cc: gcc

On Fri, Jan 17, 2003 at 04:08:30AM +1100, Gareth Pearce wrote:
> (I think I heard something about caret errors being on
> the way,

They are.

>just wanting to err suggest that they make it into 3.4)

That will only happen if lots of people help make it happen.  Are you
volunteering?  :-)  It's not like the code is finished and waiting on a
managerial decision.


> or something at least.  A tool which could look up
> the documentation just by passing it the text of an error message ... would
> be nice too.

You really don't want to rely on the text of an error message not changing
over time.  For this kind of thing to work, we need error messages to have
numbers (like every other compiler does, and with reason).
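
To illustrate the point about lookup keys, here is a minimal sketch in
Python.  The message number 1042, both message wordings, and the
documentation string are invented for illustration; GCC assigns no such
number.  It shows a lookup keyed on message text failing once the wording
changes, while a lookup keyed on a stable number keeps working.

```python
# Hypothetical documentation text for one diagnostic.
DOC = "The parser needed a type name at this point but found something else."

# Documentation keyed by a stable (invented) message number.
DOCS_BY_NUMBER = {1042: DOC}

# The same documentation keyed by the wording used in one compiler release.
DOCS_BY_TEXT = {"expected type-id": DOC}


def lookup_by_text(message_text):
    """Return documentation for an exact message text, or None."""
    return DOCS_BY_TEXT.get(message_text)


def lookup_by_number(number):
    """Return documentation for a stable message number, or None."""
    return DOCS_BY_NUMBER.get(number)


if __name__ == "__main__":
    # A later release rewords the message; the number stays the same.
    print(lookup_by_text("expected a type-id"))
    print(lookup_by_number(1042))
```

The text-keyed table would have to be regenerated for every release that
rewords a message, which is the maintenance burden described above.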


Phil

-- 
I would therefore like to posit that computing's central challenge, viz. "How
not to make a mess of it," has /not/ been met.
                                                 - Edsger Dijkstra, 1930-2002


* Re: new parser: error recovery needs work
  2003-01-16 20:00     ` Phil Edwards
@ 2003-01-16 20:15       ` Neil Booth
  2003-01-16 21:43       ` Janis Johnson
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Neil Booth @ 2003-01-16 20:15 UTC (permalink / raw)
  To: Phil Edwards; +Cc: Gareth Pearce, gcc

Phil Edwards wrote:-

> > (I think I heard something about caret errors being on
> > the way,
> 
> They are.

I have them working fine in my C parser.  Because you're pointing
to the location of the code, you can simplify or shorten many
existing diagnostics, or improve them altogether.

> > or something at least.  A tool which could look up
> > the documentation just by passing it the text of an error message ... would
> > be nice too.
> 
> You really don't want to rely on the text of an error message not changing
> over time.  For this kind of thing to work, we need error messages to have
> numbers (like every other compiler does, and with reason).

Let's not even attempt to set messages in stone (even with numbers)
in the C front ends until all the work in the pipeline is done, and we
have a conforming C99 and C++ front end.

Neil.


* Re: new parser: error recovery needs work
  2003-01-16 20:00     ` Phil Edwards
  2003-01-16 20:15       ` Neil Booth
@ 2003-01-16 21:43       ` Janis Johnson
  2003-01-16 23:33         ` Mark Mitchell
                           ` (2 more replies)
  2003-01-17  5:26       ` Gareth Pearce
  2003-01-18 16:32       ` Gabriel Dos Reis
  3 siblings, 3 replies; 36+ messages in thread
From: Janis Johnson @ 2003-01-16 21:43 UTC (permalink / raw)
  To: Phil Edwards; +Cc: Gareth Pearce, gcc

On Thu, Jan 16, 2003 at 02:11:05PM -0500, Phil Edwards wrote:
> On Fri, Jan 17, 2003 at 04:08:30AM +1100, Gareth Pearce wrote:
> 
> > or something at least.  A tool which could look up
> > the documentation just by passing it the text of an error message ... would
> > be nice too.
> 
> You really don't want to rely on the text of an error message not changing
> over time.  For this kind of thing to work, we need error messages to have
> numbers (like every other compiler does, and with reason).

Oooh, and with message numbers it's possible to add options to enable
or disable individual warnings.
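
A rough sketch of what per-message control could look like, in Python.
The identifiers are hypothetical, and the -W<id> / -Wno-<id> spelling is
borrowed from GCC's existing warning options rather than from any agreed
design for numbered messages.

```python
def apply_warning_options(defaults, options):
    """Return the set of enabled message identifiers after applying options.

    "-Wno-<id>" disables a message and "-W<id>" enables it; later options win.
    """
    enabled = set(defaults)
    for opt in options:
        if opt.startswith("-Wno-"):
            enabled.discard(opt[len("-Wno-"):])
        elif opt.startswith("-W"):
            enabled.add(opt[len("-W"):])
    return enabled


if __name__ == "__main__":
    # Hypothetical identifiers; a numbering scheme would use e.g. "1042".
    defaults = {"unused-variable", "missing-prototypes"}
    options = ["-Wno-unused-variable", "-Wshadow"]
    print(sorted(apply_warning_options(defaults, options)))
```

The only requirement on the identifier is that it is stable and unique,
so the same loop works whether the identifiers are numbers or names.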

Janis


* Re: new parser: error recovery needs work
  2003-01-16 18:05   ` Gareth Pearce
  2003-01-16 20:00     ` Phil Edwards
@ 2003-01-16 23:04     ` Mark Mitchell
  2003-01-18 16:52       ` Gabriel Dos Reis
  1 sibling, 1 reply; 36+ messages in thread
From: Mark Mitchell @ 2003-01-16 23:04 UTC (permalink / raw)
  To: Gareth Pearce, gcc

--On Friday, January 17, 2003 04:08:30 AM +1100 Gareth Pearce 
<tilps@hotmail.com> wrote:

> Just a couple of thoughts about errors in the new parser...
>
> I'm not sure why, but the errors now feel more like they need caret errors

I agree -- and I think you're right about the cause.  It's telling you
pretty clearly what's wrong, but you want to know *where exactly*.

We (CodeSourcery) have no plans to add this in the short term, but it
would be a very good thing.

> Second thought, the error messages of the form 'expected type-id' etc ...
> are compact and precise.  Something in me would like 'waffly friendly'

I thought about this, and I intentionally went for more technical error
messages.  I often find G++'s error messages confusing because of their
wafflyness -- I think "what on earth does that mean?"  G++ has a bad
tendency to make up terms that aren't in the standard or even in common
usage in the C++ community.  By using technical terms, at least people
can consult a reference book to figure out what's going on.

You're also right that better documentation about the various error
messages -- and better error messages! -- would help.

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: new parser: error recovery needs work
  2003-01-16 21:43       ` Janis Johnson
@ 2003-01-16 23:33         ` Mark Mitchell
  2003-01-17  8:12         ` Fergus Henderson
  2003-01-17  8:12         ` Phil Edwards
  2 siblings, 0 replies; 36+ messages in thread
From: Mark Mitchell @ 2003-01-16 23:33 UTC (permalink / raw)
  To: Janis Johnson, Phil Edwards; +Cc: Gareth Pearce, gcc



--On Thursday, January 16, 2003 12:00:07 PM -0800 Janis Johnson 
<janis187@us.ibm.com> wrote:

> Oooh, and with message numbers it's possible to add options to enable
> or disable individual warnings.

Yeah.  If you go back in CVS, you'll find about 12 hours where I had done
this before bed; Jason overruled it and pulled it out the next morning
when he woke up. :-)

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* Re: new parser: error recovery needs work
  2003-01-16 20:00     ` Phil Edwards
  2003-01-16 20:15       ` Neil Booth
  2003-01-16 21:43       ` Janis Johnson
@ 2003-01-17  5:26       ` Gareth Pearce
  2003-01-17  8:36         ` Phil Edwards
  2003-01-18 16:32       ` Gabriel Dos Reis
  3 siblings, 1 reply; 36+ messages in thread
From: Gareth Pearce @ 2003-01-17  5:26 UTC (permalink / raw)
  To: Phil Edwards; +Cc: gcc


----- Original Message -----
From: "Phil Edwards" <phil@jaj.com>
To: "Gareth Pearce" <tilps@hotmail.com>
Cc: <gcc@gcc.gnu.org>
Sent: Friday, January 17, 2003 6:11 AM
Subject: Re: new parser: error recovery needs work


> On Fri, Jan 17, 2003 at 04:08:30AM +1100, Gareth Pearce wrote:
> > (I think I heard something about caret errors being on
> > the way,
>
> They are.
>
> >just wanting to err suggest that they make it into 3.4)
>
> That will only happen if lots of people help make it happen.  Are you
> volunteering?  :-)  It's not like the code is finished and waiting on a
> managerial decision.
*chuckle* - well, as always I'm interested in volunteering; whether I'll
actually end up doing anything, that's a different story.  My interest-to-effort
conversion engine is rather inefficient.
Is there a branch/somewhere this is being worked on?
>
>
> > or something at least.  A tool which could look up
> > the documentation just by passing it the text of an error message ... would
> > be nice too.
>
> You really don't want to rely on the text of an error message not changing
> over time.  For this kind of thing to work, we need error messages to have
> numbers (like every other compiler does, and with reason).

Well either that or expect people who change error messages to update the
documentation :P
not like that would happen...

Gareth


* Re: new parser: error recovery needs work
  2003-01-16 21:43       ` Janis Johnson
  2003-01-16 23:33         ` Mark Mitchell
@ 2003-01-17  8:12         ` Fergus Henderson
  2003-01-17 10:06           ` Stan Shebs
                             ` (3 more replies)
  2003-01-17  8:12         ` Phil Edwards
  2 siblings, 4 replies; 36+ messages in thread
From: Fergus Henderson @ 2003-01-17  8:12 UTC (permalink / raw)
  To: Janis Johnson; +Cc: Phil Edwards, Gareth Pearce, gcc

On 16-Jan-2003, Janis Johnson <janis187@us.ibm.com> wrote:
> On Thu, Jan 16, 2003 at 02:11:05PM -0500, Phil Edwards wrote:
> > You really don't want to rely on the text of an error message not changing
> > over time.  For this kind of thing to work, we need error messages to have
> > numbers (like every other compiler does, and with reason).
> 
> Oooh, and with message numbers it's possible to add options to enable
> or disable individual warnings.

We've discussed this before.  The consensus last time (as I understood it)
was that alphanumeric message codes were a better alternative to message
numbers.  Message codes are easier to remember, more self-documenting,
and avoid collisions between new warnings added on different CVS branches
or in different repositories.

IIRC there was significant opposition to message numbers, because they
are too cryptic and because of the problem with collisions, but I don't
recall anyone objecting to alphanumeric message codes.
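
The collision argument can be sketched as follows (Python; the catalog
contents and the "next free number" allocation policy are assumptions
made for illustration, not a description of any GCC branch).  Two
branches that each take the next free number end up claiming the same
number for different warnings, while descriptive names chosen
independently do not clash.

```python
def merge_catalogs(ours, theirs):
    """Merge two {key: text} catalogs and report keys bound to different texts."""
    conflicts = sorted(
        key for key in ours.keys() & theirs.keys() if ours[key] != theirs[key]
    )
    merged = {**theirs, **ours}
    return merged, conflicts


def add_numbered(catalog, text):
    """Copy the catalog and add text under the next free number."""
    new = dict(catalog)
    new[max(new) + 1] = text
    return new


# Hypothetical trunk catalogs, one keyed by number and one by name.
TRUNK_NUMBERED = {100: "unused variable"}
TRUNK_NAMED = {"unused-variable": "unused variable"}

# Each branch adds one new warning without knowing about the other.
NUM_A = add_numbered(TRUNK_NUMBERED, "missing prototype")
NUM_B = add_numbered(TRUNK_NUMBERED, "declaration shadows a parameter")
NAME_A = {**TRUNK_NAMED, "missing-prototypes": "missing prototype"}
NAME_B = {**TRUNK_NAMED, "shadow": "declaration shadows a parameter"}

if __name__ == "__main__":
    print(merge_catalogs(NUM_A, NUM_B)[1])    # conflicting numbers
    print(merge_catalogs(NAME_A, NAME_B)[1])  # conflicting names
```

Names can still clash if two people pick the same one, but that is far
less likely than with sequentially allocated numbers.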

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


* Re: new parser: error recovery needs work
  2003-01-16 21:43       ` Janis Johnson
  2003-01-16 23:33         ` Mark Mitchell
  2003-01-17  8:12         ` Fergus Henderson
@ 2003-01-17  8:12         ` Phil Edwards
  2003-01-18 16:48           ` Gabriel Dos Reis
  2 siblings, 1 reply; 36+ messages in thread
From: Phil Edwards @ 2003-01-17  8:12 UTC (permalink / raw)
  To: Janis Johnson; +Cc: Gareth Pearce, gcc

On Thu, Jan 16, 2003 at 12:00:07PM -0800, Janis Johnson wrote:
> On Thu, Jan 16, 2003 at 02:11:05PM -0500, Phil Edwards wrote:
> > On Fri, Jan 17, 2003 at 04:08:30AM +1100, Gareth Pearce wrote:
> > 
> > > or something at least.  A tool which could look up
> > > the documentation just by passing it the text of an error message ... would
> > > be nice too.
> > 
> > You really don't want to rely on the text of an error message not changing
> > over time.  For this kind of thing to work, we need error messages to have
> > numbers (like every other compiler does, and with reason).
> 
> Oooh, and with message numbers it's possible to add options to enable
> or disable individual warnings.

Exactly.  :-)

-- 
I would therefore like to posit that computing's central challenge, viz. "How
not to make a mess of it," has /not/ been met.
                                                 - Edsger Dijkstra, 1930-2002


* Re: new parser: error recovery needs work
  2003-01-17  5:26       ` Gareth Pearce
@ 2003-01-17  8:36         ` Phil Edwards
  2003-01-17  9:10           ` Gareth Pearce
  0 siblings, 1 reply; 36+ messages in thread
From: Phil Edwards @ 2003-01-17  8:36 UTC (permalink / raw)
  To: Gareth Pearce; +Cc: gcc

On Fri, Jan 17, 2003 at 02:23:51PM +1100, Gareth Pearce wrote:
> *chuckle* - well as always I'm interested in volunteering, whether i'll
> actually end up doing anything - thats a different story.  My interest to
> effort conversion engine is rather inefficient
> Is there a branch/somewhere this is being worked on?

Somewhere, yes; a branch, not at present.  Look in the list archives for
recent messages containing the word 'caret' from Neil Booth in this list.


> > > or something at least.  A tool which could look up
> > > the documentation just by passing it the text of an error message ... would
> > > be nice too.
> >
> > You really don't want to rely on the text of an error message not changing
> > over time.  For this kind of thing to work, we need error messages to have
> > numbers (like every other compiler does, and with reason).
> 
> Well either that or expect people who change error messages to update the
> documentation :P
> not like that would happen...

It's not the documentation that would need updating (well, it would, but
that's not the killer).  It's the lookup tool you're proposing that would
need to know about the new error text that it's using as a "lookup key".

If, OTOH, you're assuming that the text of the error message itself is
in the documentation verbatim, and you want a tool to find it when given
the error message, I refer you to "grep".  :-)


Phil

-- 
I would therefore like to posit that computing's central challenge, viz. "How
not to make a mess of it," has /not/ been met.
                                                 - Edsger Dijkstra, 1930-2002


* Re: new parser: error recovery needs work
  2003-01-17  8:36         ` Phil Edwards
@ 2003-01-17  9:10           ` Gareth Pearce
  0 siblings, 0 replies; 36+ messages in thread
From: Gareth Pearce @ 2003-01-17  9:10 UTC (permalink / raw)
  To: Phil Edwards; +Cc: gcc

> > > You really don't want to rely on the text of an error message not changing
> > > over time.  For this kind of thing to work, we need error messages to have
> > > numbers (like every other compiler does, and with reason).
> >
> > Well either that or expect people who change error messages to update the
> > documentation :P
> > not like that would happen...
>
> It's not the documentation that would need updating (well, it would, but
> that's not the killer).  It's the lookup tool you're proposing that would
> need to know about the new error text that it's using as a "lookup key".
>
> If, OTOH, you're assuming that the text of the error message itself is
> in the documentation verbatim, and you want a tool to find it when given
> the error message, I refer you to "grep".  :-)

Yes, I considered it likely that once the documentation existed, such a tool
would be a simple script.
With alphanumeric identifier codes, it certainly would.

Gareth


* Re: new parser: error recovery needs work
  2003-01-17  8:12         ` Fergus Henderson
@ 2003-01-17 10:06           ` Stan Shebs
  2003-01-17 11:36           ` Ben Elliston
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Stan Shebs @ 2003-01-17 10:06 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Fergus Henderson wrote:

>On 16-Jan-2003, Janis Johnson <janis187@us.ibm.com> wrote:
>
>>On Thu, Jan 16, 2003 at 02:11:05PM -0500, Phil Edwards wrote:
>>
>>>You really don't want to rely on the text of an error message not changing
>>>over time.  For this kind of thing to work, we need error messages to have
>>>numbers (like every other compiler does, and with reason).
>>>
>>Oooh, and with message numbers it's possible to add options to enable
>>or disable individual warnings.
>>
>
>We've discussed this before.  The consensus last time (as I understood it)
>was that alphanumeric message codes were a better alternative to message
>numbers.  Message codes are easier to remember, more self-documenting,
>and avoid collisions between new warnings added on different CVS branches
>or in different repositories.
>
>IIRC there was significant opposition to message numbers, because they
>are too cryptic and because of the problem with collisions, but I don't
>recall anyone objecting to alphanumeric message codes.
>
As it happens, one of my job tasks for next week is to present a formal
proposal for how to add this, which I've been calling "named warnings",
to GCC.  So if you can hold out for a few more days, there will be
something specific to flame... :-)

Stan



* Re: new parser: error recovery needs work
  2003-01-17  8:12         ` Fergus Henderson
  2003-01-17 10:06           ` Stan Shebs
@ 2003-01-17 11:36           ` Ben Elliston
  2003-01-17 19:15             ` DJ Delorie
  2003-01-17 12:27           ` Neil Booth
  2003-01-18 16:46           ` Gabriel Dos Reis
  3 siblings, 1 reply; 36+ messages in thread
From: Ben Elliston @ 2003-01-17 11:36 UTC (permalink / raw)
  To: gcc

>>>>> "Fergus" == Fergus Henderson <fjh@cs.mu.OZ.AU> writes:

  Fergus> We've discussed this before.  The consensus last time (as I
  Fergus> understood it) was that alphanumeric message codes were a
  Fergus> better alternative to message numbers.  Message codes are
  Fergus> easier to remember, more self-documenting, and avoid
  Fergus> collisions between new warnings added on different CVS
  Fergus> branches or in different repositories.

Another advantage of message codes (and to a lesser extent, message
numbers) is that it's much easier to filter warnings that you don't
care about.  Emacs' compilation mode, for example, could be modified
to deal with this.  It could almost satisfy every request I've heard
for "give me a command line option to silence <x> warning".

Ben


* Re: new parser: error recovery needs work
  2003-01-17  8:12         ` Fergus Henderson
  2003-01-17 10:06           ` Stan Shebs
  2003-01-17 11:36           ` Ben Elliston
@ 2003-01-17 12:27           ` Neil Booth
  2003-01-17 23:26             ` Stan Shebs
  2003-01-19  5:18             ` Fergus Henderson
  2003-01-18 16:46           ` Gabriel Dos Reis
  3 siblings, 2 replies; 36+ messages in thread
From: Neil Booth @ 2003-01-17 12:27 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Fergus Henderson wrote:-

> We've discussed this before.  The consensus last time (as I understood it)
> was that alphanumeric message codes were a better alternative to message
> numbers.  Message codes are easier to remember, more self-documenting,
> and avoid collisions between new warnings added on different CVS branches
> or in different repositories.

I take it the idea is that the message code is similar to the text?

Then how do we handle internationalization, where users in a different
language have never even seen, let alone understand, the English form of
the message?

Neil.


* Re: new parser: error recovery needs work
  2003-01-17 11:36           ` Ben Elliston
@ 2003-01-17 19:15             ` DJ Delorie
  0 siblings, 0 replies; 36+ messages in thread
From: DJ Delorie @ 2003-01-17 19:15 UTC (permalink / raw)
  To: bje; +Cc: gcc

I would add that, if we're thinking about a message catalog, we
consider assigning both a specific code and a "category" code to each
message.  I.e., for message WPARAMFOO (function parameter warning) it
might also be in category WcDECL (all declaration warnings) and WcANSI
(all ANSI-specific warnings).  Not that I'm suggesting a naming
convention, just thinking that grouping the warnings might prove
useful enough to think about it up front.
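
One possible shape for such a catalog, sketched in Python.  WPARAMFOO,
WcDECL and WcANSI are the example names from the paragraph above;
WUNUSEDBAR, the class, and the function are hypothetical and exist only
to show filtering by either a specific code or a category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    code: str              # specific code, e.g. "WPARAMFOO"
    categories: frozenset  # category codes, e.g. {"WcDECL", "WcANSI"}


CATALOG = {
    "WPARAMFOO": Message("WPARAMFOO", frozenset({"WcDECL", "WcANSI"})),
    "WUNUSEDBAR": Message("WUNUSEDBAR", frozenset({"WcDECL"})),  # hypothetical
}


def is_suppressed(code, disabled):
    """True if the message's own code or any of its categories is disabled."""
    msg = CATALOG[code]
    return msg.code in disabled or not msg.categories.isdisjoint(disabled)


if __name__ == "__main__":
    # Disabling the ANSI category hides WPARAMFOO but not WUNUSEDBAR.
    print(is_suppressed("WPARAMFOO", {"WcANSI"}))
    print(is_suppressed("WUNUSEDBAR", {"WcANSI"}))
```

Keeping the category set on each message means a new category can be
introduced later without renaming any specific code.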


* Re: new parser: error recovery needs work
  2003-01-17 12:27           ` Neil Booth
@ 2003-01-17 23:26             ` Stan Shebs
  2003-01-19 14:38               ` Kai Henningsen
  2003-01-19  5:18             ` Fergus Henderson
  1 sibling, 1 reply; 36+ messages in thread
From: Stan Shebs @ 2003-01-17 23:26 UTC (permalink / raw)
  To: Neil Booth
  Cc: Fergus Henderson, Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Neil Booth wrote:

>Fergus Henderson wrote:-
>
>
>>We've discussed this before.  The consensus last time (as I understood it)
>>was that alphanumeric message codes were a better alternative to message
>>numbers.  Message codes are easier to remember, more self-documenting,
>>and avoid collisions between new warnings added on different CVS branches
>>or in different repositories.
>>
>
>I take it the idea is that the message code is similar to the text?
>
>Then how do we handle internationalization, where users in a different
>language have never even seen, let alone understand, the English form of
>the message?
>
My theory is to have the names be basically the same as what the
-Wxxx options use now, so for -Wmissing-prototypes, "missing-prototypes"
is the name of the warning, and would be available for attributes and
pragmas just as for command-line options.  The names are orthogonal to
localization, they're just arbitrary strings.
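
A small sketch of why the names can stay orthogonal to localization
(Python; the message wordings in both languages are illustrative and not
copied from GCC's message catalogs, and the functions are hypothetical).
The name is the only key; each locale supplies its own text, and the
option spelling is derived from the name alone.

```python
# Localized templates keyed by warning name. The name itself is never translated.
TEXTS = {
    "en": {"missing-prototypes": "no previous prototype for '{function}'"},
    "de": {"missing-prototypes": "kein vorheriger Prototyp für '{function}'"},
}


def render(name, locale, **parts):
    """Format the localized text of the warning called name."""
    return TEXTS[locale][name].format(**parts)


def option_for(name):
    """Command-line spelling of a named warning, identical in every locale."""
    return "-W" + name


if __name__ == "__main__":
    for locale in ("en", "de"):
        print(option_for("missing-prototypes"),
              render("missing-prototypes", locale, function="foo"))
```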

Although perhaps to be more politically correct, we should borrow
from various languages for names.  How about -Wabwesende-Urbilder? :-)

Stan



* Re: new parser: error recovery needs work
  2003-01-16 20:00     ` Phil Edwards
                         ` (2 preceding siblings ...)
  2003-01-17  5:26       ` Gareth Pearce
@ 2003-01-18 16:32       ` Gabriel Dos Reis
  2003-01-19 14:48         ` Kai Henningsen
  2003-01-19 19:45         ` Phil Edwards
  3 siblings, 2 replies; 36+ messages in thread
From: Gabriel Dos Reis @ 2003-01-18 16:32 UTC (permalink / raw)
  To: Phil Edwards; +Cc: Gareth Pearce, gcc

[ Sorry for having been absent these days ]

Phil Edwards <phil@jaj.com> writes:

| You really don't want to rely on the text of an error message not changing
| over time.  For this kind of thing to work, we need error messages to have
| numbers (like every other compiler does, and with reason).

Well, I must confess that I don't believe in numbers.  Computers are
very good at that; humans are not, I'm afraid.  Certainly we could make
improvements by categorizing diagnostics, but I'm -not- convinced that
numbers are the way to go.

-- Gaby


* Re: new parser: error recovery needs work
  2003-01-17  8:12         ` Fergus Henderson
                             ` (2 preceding siblings ...)
  2003-01-17 12:27           ` Neil Booth
@ 2003-01-18 16:46           ` Gabriel Dos Reis
  3 siblings, 0 replies; 36+ messages in thread
From: Gabriel Dos Reis @ 2003-01-18 16:46 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Fergus Henderson <fjh@cs.mu.OZ.AU> writes:

| IIRC there was significant opposition to message numbers, because they
| are too cryptic and because of the problem with collisions, but I don't
| recall anyone objecting to alphanumeric message codes.

That is my recollection also.

-- Gaby


* Re: new parser: error recovery needs work
  2003-01-17  8:12         ` Phil Edwards
@ 2003-01-18 16:48           ` Gabriel Dos Reis
  0 siblings, 0 replies; 36+ messages in thread
From: Gabriel Dos Reis @ 2003-01-18 16:48 UTC (permalink / raw)
  To: Phil Edwards; +Cc: Janis Johnson, Gareth Pearce, gcc

Phil Edwards <phil@jaj.com> writes:

| > Oooh, and with message numbers it's possible to add options to enable
| > or disable individual warnings.
| 
| Exactly.  :-)

But you don't need numbers to do that.  Names are fine.

-- Gaby


* Re: new parser: error recovery needs work
  2003-01-16 23:04     ` Mark Mitchell
@ 2003-01-18 16:52       ` Gabriel Dos Reis
  0 siblings, 0 replies; 36+ messages in thread
From: Gabriel Dos Reis @ 2003-01-18 16:52 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Gareth Pearce, gcc

Mark Mitchell <mark@codesourcery.com> writes:

| > Second thought, the error messages of the form 'expected type-id' etc ...
| > are compact and precise.  Something in me would like 'waffly friendly'
| 
| I thought about this, and I intentionally went for more technical error
| messages.  I often find G++'s error messages confusing because of their
| wafflyness -- I think "what on earth does that mean?"  G++ has a bad
| tendency to make up terms that aren't in the standard or even in common
| usage in the C++ community.  By using technical terms, at least people
| can consult a reference book to figure out what's going on.

I believe that you made a good technical decision for that issue.

| You're also right that better documentation about the various error
| messages -- and better error messages! -- would help.

Definitely.

-- Gaby


* Re: new parser: error recovery needs work
  2003-01-17 12:27           ` Neil Booth
  2003-01-17 23:26             ` Stan Shebs
@ 2003-01-19  5:18             ` Fergus Henderson
  2003-01-19  7:02               ` Neil Booth
  2003-01-19 11:38               ` Zack Weinberg
  1 sibling, 2 replies; 36+ messages in thread
From: Fergus Henderson @ 2003-01-19  5:18 UTC (permalink / raw)
  To: Neil Booth; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

On 17-Jan-2003, Neil Booth <neil@daikokuya.co.uk> wrote:
> Fergus Henderson wrote:-
> 
> > We've discussed this before.  The consensus last time (as I understood it)
> > was that alphanumeric message codes were a better alternative to message
> > numbers.  Message codes are easier to remember, more self-documenting,
> > and avoid collisions between new warnings added on different CVS branches
> > or in different repositories.
> 
> I take it the idea is that the message code is similar to the text?

Not really.  The message code is a name for the message.
Usually it should be a lot shorter than the message.
Unlike the message, it does not contain any varying parts.

> Then how do we handle internationalization, where users in a different
> language have never even seen, let alone understand, the English form of
> the message?

If you really want to internationalize these, I suppose it would be
possible to provide translations.  However, I wouldn't recommend it.
These names will be used in compiler input (compiler options and
source code pragmas), and internationalization doesn't work as well
for input as it does for output.

It would also be possible to provide a version of the C programming
language which accepted internationalized versions of keywords in its
input.  However, I wouldn't recommend that either ;-)

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


* Re: new parser: error recovery needs work
  2003-01-19  5:18             ` Fergus Henderson
@ 2003-01-19  7:02               ` Neil Booth
  2003-01-19  9:45                 ` Fergus Henderson
  2003-01-19 11:38               ` Zack Weinberg
  1 sibling, 1 reply; 36+ messages in thread
From: Neil Booth @ 2003-01-19  7:02 UTC (permalink / raw)
  To: Fergus Henderson; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Fergus Henderson wrote:-

> > > We've discussed this before.  The consensus last time (as I understood it)
> > > was that alphanumeric message codes were a better alternative to message
> > > numbers.  Message codes are easier to remember, more self-documenting,
> > > and avoid collisions between new warnings added on different CVS branches
> > > or in different repositories.
> > 
> > I take it the idea is that the message code is similar to the text?
> 
> Not really.  The message code is a name for the message.
> Usually it should be a lot shorter than the message.
> Unlike the message, it does not contain any varying parts.

I see.  Could you give an example or two?  Thanks.

Neil.


* Re: new parser: error recovery needs work
  2003-01-19  7:02               ` Neil Booth
@ 2003-01-19  9:45                 ` Fergus Henderson
  0 siblings, 0 replies; 36+ messages in thread
From: Fergus Henderson @ 2003-01-19  9:45 UTC (permalink / raw)
  To: Neil Booth; +Cc: Janis Johnson, Phil Edwards, Gareth Pearce, gcc

On 18-Jan-2003, Neil Booth <neil@daikokuya.co.uk> wrote:
> Fergus Henderson wrote:-
> > 
> > The message code is a name for the message.
> > Usually it should be a lot shorter than the message.
> > Unlike the message, it does not contain any varying parts.
> 
> I see.  Could you give an example or two?  Thanks.

Well, as someone else already mentioned, many warnings already have
names which are used in `-W<NAME>' or `-Wno-<NAME>' options,
e.g. -Wimplicit-function-declaration and -Wno-import.
Names for other warnings which do not yet have names
would be in a similar style.
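
A minimal sketch of that naming scheme, assuming the stable name is simply the part of the option after `-W' or `-Wno-'. The struct and function below are hypothetical illustrations, not GCC's actual option handling:

```cpp
#include <string>

// Hypothetical result type: these names are illustrative only.
struct WarningSetting {
    std::string name;   // stable warning name, e.g. "import"
    bool enabled;       // false for the -Wno-<NAME> form
    bool valid;         // false if the argument is not a -W option
};

// Split "-W<NAME>" or "-Wno-<NAME>" into the warning name and an on/off flag.
WarningSetting parse_warning_option(const std::string& arg) {
    WarningSetting result{"", true, false};
    const std::string prefix = "-W";
    const std::string negation = "no-";
    // Reject anything that does not start with "-W" or has no name after it.
    if (arg.compare(0, prefix.size(), prefix) != 0 || arg.size() == prefix.size())
        return result;
    std::string rest = arg.substr(prefix.size());
    // A leading "no-" turns the warning off; the name itself is unchanged.
    if (rest.compare(0, negation.size(), negation) == 0 &&
        rest.size() > negation.size()) {
        result.enabled = false;
        rest = rest.substr(negation.size());
    }
    result.name = rest;
    result.valid = true;
    return result;
}
```

With this sketch, `-Wno-import' yields the name "import" with enabled set to false. The same name could then be used in a pragma or attribute, since it contains no varying parts.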

-- 
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
The University of Melbourne         |  of excellence is a lethal habit"
WWW: <http://www.cs.mu.oz.au/~fjh>  |     -- the last words of T. S. Garp.


* Re: new parser: error recovery needs work
  2003-01-19  5:18             ` Fergus Henderson
  2003-01-19  7:02               ` Neil Booth
@ 2003-01-19 11:38               ` Zack Weinberg
  1 sibling, 0 replies; 36+ messages in thread
From: Zack Weinberg @ 2003-01-19 11:38 UTC (permalink / raw)
  To: Fergus Henderson
  Cc: Neil Booth, Janis Johnson, Phil Edwards, Gareth Pearce, gcc

Fergus Henderson <fjh@cs.mu.OZ.AU> writes:

> On 17-Jan-2003, Neil Booth <neil@daikokuya.co.uk> wrote:
>> Fergus Henderson wrote:-
>> 
>> > We've discussed this before.  The consensus last time (as I understood it)
>> > was that alphanumeric message codes were a better alternative to message
>> > numbers.  Message codes are easier to remember, more self-documenting,
>> > and avoid collisions between new warnings added on different CVS branches
>> > or in different repositories.
>> 
>> I take it the idea is that the message code is similar to the text?
>
> Not really.  The message code is a name for the message.
> Usually it should be a lot shorter than the message.
> Unlike the message, it does not contain any varying parts.

You know, I really liked the idea of matching warning message text
against a set of user-specified regular expressions and suppressing
them if they matched.  Didn't someone have a demonstration patch for
that?
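
A rough sketch of how such a filter might look, using std::regex from the C++ standard library. This is not the demonstration patch asked about; the class and its member names are hypothetical:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical filter: suppresses a warning when its text matches any
// user-specified regular expression (ECMAScript syntax, the std::regex default).
class WarningFilter {
public:
    // Register one user-specified pattern.
    void add_pattern(const std::string& pattern) {
        patterns_.emplace_back(pattern);
    }

    // True if any registered pattern matches somewhere in the message text.
    bool is_suppressed(const std::string& message) const {
        for (const std::regex& re : patterns_) {
            if (std::regex_search(message, re))
                return true;
        }
        return false;
    }

private:
    std::vector<std::regex> patterns_;
};
```

The trade-off raised elsewhere in this thread applies: patterns written against message text stop matching if the wording of a message changes between releases.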

zw


* Re: new parser: error recovery needs work
  2003-01-17 23:26             ` Stan Shebs
@ 2003-01-19 14:38               ` Kai Henningsen
  0 siblings, 0 replies; 36+ messages in thread
From: Kai Henningsen @ 2003-01-19 14:38 UTC (permalink / raw)
  To: gcc

shebs@apple.com (Stan Shebs)  wrote on 17.01.03 in <3E284DF0.4020604@apple.com>:

> Neil Booth wrote:
>
> >Fergus Henderson wrote:-
> >
> >
> >>We've discussed this before.  The consensus last time (as I understood it)
> >>was that alphanumeric message codes were a better alternative to message
> >>numbers.  Message codes are easier to remember, more self-documenting,
> >>and avoid collisions between new warnings added on different CVS branches
> >>or in different repositories.
> >>
> >
> >I take it the idea is that the message code is similar to the text?
> >
> >Then how do we handle internationalization, where users in a different
> >language have never even seen, let alone understand, the English form of
> >the message?
> >
> My theory is to have the names be basically the same as what the
> -Wxxx options use now, so for -Wmissing-prototypes, "missing-prototypes"
> is the name of the warning, and would be available for attributes and
> pragmas just as for command-line options.  The names are orthogonal to
> localization, they're just arbitrary strings.
>
> Although perhaps to be more politically correct, we should borrow
> from various languages for names.  How about -Wabwesende-Urbilder? :-)

That might make German programmers look at you funny - I've never seen
prototypes translated as "Urbilder". I think "Prototypen" is usually used,
and not only with computers. "Urbilder" sounds like it might be used in
psychology or philosophy. (In fact, Duden/Oxford only mentions
"Prototyp".)

Of course, you'd also not translate "missing" with "abwesend" for similar
though less drastic reasons - it's "fehlend". Which makes this
"-Wfehlende-Prototypen" ...

Oh, and this points out a last problem, that replacing space with dash  
here *really* looks bad in German. Sorry, no better alternative comes to  
mind.

All in all, I'd prefer keeping those switches English. Translate the docs  
instead. After all, it's still "if" and not "wenn" in the language, as  
well!

MfG Kai


* Re: new parser: error recovery needs work
  2003-01-18 16:32       ` Gabriel Dos Reis
@ 2003-01-19 14:48         ` Kai Henningsen
  2003-01-19 19:45         ` Phil Edwards
  1 sibling, 0 replies; 36+ messages in thread
From: Kai Henningsen @ 2003-01-19 14:48 UTC (permalink / raw)
  To: gcc

gdr@integrable-solutions.net (Gabriel Dos Reis)  wrote on 18.01.03 in <m3el7a7v5u.fsf@uniton.integrable-solutions.net>:

> Phil Edwards <phil@jaj.com> writes:
>
> | You really don't want to rely on the text of an error message not changing
> | over time.  For this kind of thing to work, we need error messages to have
> | numbers (like every other compiler does, and with reason).
>
> Well, I must confess that I don't believe in numbers.  Computers are
> very good at that, humans are not I'm afraid.  Certainly we could do
> improvements by categorizing diagnostics, but I'm -not- convinced that
> numbers are the way to go.

So long as we get *any* kind of short identifier ...

Personally, I would like to see a scheme that makes it possible to isolate  
the exact place in the gcc source that decided to throw out this  
particular message, but I gather some other people here object to that  
much precision.

Just one of many possible schemes:

file.cc:123: warning: pedantic-17: blah blah ...
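
A small sketch of producing output in that shape, assuming a diagnostic carries a short identifier alongside its text. The record and its field names are hypothetical:

```cpp
#include <string>

// Hypothetical diagnostic record; field names are illustrative only.
struct Diagnostic {
    std::string file;
    int line;
    std::string kind;   // "warning" or "error"
    std::string id;     // short, unchanging identifier such as "pedantic-17"
    std::string text;   // human-readable message, free to change or be translated
};

// Format as "file:line: kind: id: text", following the scheme shown above.
std::string format_diagnostic(const Diagnostic& d) {
    return d.file + ":" + std::to_string(d.line) + ": " + d.kind + ": " +
           d.id + ": " + d.text;
}
```

Because the identifier sits in a fixed position, a tool could key on it rather than on the message wording.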

MfG Kai


* Re: new parser: error recovery needs work
  2003-01-18 16:32       ` Gabriel Dos Reis
  2003-01-19 14:48         ` Kai Henningsen
@ 2003-01-19 19:45         ` Phil Edwards
  1 sibling, 0 replies; 36+ messages in thread
From: Phil Edwards @ 2003-01-19 19:45 UTC (permalink / raw)
  To: Gabriel Dos Reis; +Cc: Gareth Pearce, gcc

On Sat, Jan 18, 2003 at 08:51:25AM +0100, Gabriel Dos Reis wrote:
> 
> [ Sorry for having been absent these days ]
> 
> Phil Edwards <phil@jaj.com> writes:
> 
> | You really don't want to rely on the text of an error message not changing
> | over time.  For this kind of thing to work, we need error messages to have
> | numbers (like every other compiler does, and with reason).
> 
> Well, I must confess that I don't believe in numbers.  Computers are
> very good at that, humans are not I'm afraid.  Certainly we could do
> improvements by categorizing diagnostics, but I'm -not- convinced that
> numbers are the way to go.

Strike "numbers" in my email and read "something else equally unchanging,"
then.  I'm not stuck on numbers per se, just some kind of constant short
label.


Phil

-- 
I would therefore like to posit that computing's central challenge, viz. "How
not to make a mess of it," has /not/ been met.
                                                 - Edsger Dijkstra, 1930-2002


* Re: new parser: error recovery needs work
@ 2003-01-19 17:31 Robert Dewar
  0 siblings, 0 replies; 36+ messages in thread
From: Robert Dewar @ 2003-01-19 17:31 UTC (permalink / raw)
  To: phil, tilps; +Cc: gcc

> (I think I heard something about caret errors being on
> the way,

A little note here. I assume caret error means errors where there is a pointer
to the column of the error.

What we found very useful in GNAT was to define the following routines:

   procedure Error_Msg_AP (Msg : String);
   --  Output a message just after the previous token. This routine can be
   --  called only from the parser, since it references Prev_Token_Ptr.

   procedure Error_Msg_BC (Msg : String);
   --  Output a message just before the current token. Note that the important
   --  difference between this and the previous routine is that the BC case
   --  posts a flag on the current line, whereas AP can post a flag at the
   --  end of the preceding line. This routine can be called only from the
   --  parser, since it references Token_Ptr.

   procedure Error_Msg_SC (Msg : String);
   --  Output a message at the start of the current token, unless we are at
   --  the end of file, in which case we always output the message after the
   --  last real token in the file. This routine can be called only from the
   --  parser, since it references Token_Ptr.

   procedure Error_Msg_SP (Msg : String);
   --  Output a message at the start of the previous token. This routine can
   --  be called only from the parser, since it references Prev_Token_Ptr.

This allows very precise placement of the message pointers. One danger of
message pointers is that if they are not exactly right, they can obfuscate
rather than clarify the error.
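
To illustrate the difference between posting a flag after the previous token and at the start of the current token, here is a small C++ sketch. It is not GNAT's implementation; the Token and Location types and all function names are hypothetical, and columns are assumed to be 0-based with no tab characters:

```cpp
#include <string>

// Hypothetical token record: where a token sits in the source.
struct Token {
    int line;    // 1-based line number
    int begin;   // 0-based column of the first character
    int end;     // 0-based column one past the last character
};

struct Location {
    int line;
    int column;
};

// Analogue of the "after previous token" placement: the flag may land on
// an earlier line than the current token.
Location after_previous(const Token& prev) { return {prev.line, prev.end}; }

// Analogue of the "start of current token" placement: the flag is always
// on the current token's line.
Location start_current(const Token& cur) { return {cur.line, cur.begin}; }

// Render a source line followed by a caret under the given column.
std::string render_caret(const std::string& source_line, int column) {
    return source_line + "\n" + std::string(column, ' ') + "^";
}
```

For a missing semicolon at the end of `int x' followed by `int y;' on the next line, after_previous points just past `x' on the first line, while start_current points at `int' on the second. Which one clarifies the error depends on the message, which is why having both available is useful.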


* Re: new parser: error recovery needs work
@ 2003-01-18 21:53 Robert Dewar
  0 siblings, 0 replies; 36+ messages in thread
From: Robert Dewar @ 2003-01-18 21:53 UTC (permalink / raw)
  To: gdr, mark; +Cc: gcc, tilps

> | I thought about this, and I intentionally went for more technical error
> | messages.  I often find G++'s error messages confusing because of their
> | wafflyness -- I think "what on earth does that mean?"  G++ has a bad
> | tendency to make up terms that aren't in the standard or even in common
> | usage in the C++ community.  By using technical terms, at least people
> | can consult a reference book to figure out what's going on.

I think "common usage" is a better guide than "technical terms", especially
for a language where the great majority of C++ users have no effective access
to the standard. Even with Ada, where typically any serious Ada programmer
*does* have a copy of the standard readily available and consults it 
frequently, the use of technical terms can be confusing. For example if
a message says package, then technically this does not include generic
packages, but relying on this knowledge would be confusing. Similarly,
in common usage everyone uses the term package spec instead of package
declaration.


* Re: new parser: error recovery needs work
@ 2003-01-14 14:31 Jeff Donner
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff Donner @ 2003-01-14 14:31 UTC (permalink / raw)
  To: gcc

Re: http://gcc.gnu.org/ml/gcc/2003-01/msg00663.html

 >--On Monday, January 13, 2003 03:46:43 PM -0800 Joe Buck
 ><jbuck@synopsys.com> wrote:
 >
 > The new parser doesn't produce useful diagnostics in the presence of
 > common errors.  Since the old parser did better, this is a regression.
 >
 >-- Mark Mitchell:
 >To be honest, I'm somewhat unsympathetic.  Not because I think the error
 >messages are good, or because I think that we shouldn't do better, but
 >because it's hard to do better in some of these cases and because
 >we do noticeably better in other cases -- the old parser just said
 >"parse error" a lot. :-)
 >
 >It's also hard to do better without breaking legal programs; it takes
 >a lot of head-scratching to think of all the cases.
 >
 >> The new parser might want to use a strategy that goes something
 >> like this:
 >> make a guess as to what was intended.  If a complete statement can be
 >> parsed according to that guess, then keep it.  Optionally try a second
 >> guess, if there is one available, otherwise skip to some synchronizing
 >> token.
 >
 >-- Mark Mitchell:
 >I'd prefer that we not introduce yet more backtracking.  Too complicated.
 >...

If there is an explicit trace of states / previously seen tokens,
you can use it to dispense with having to imagine errors
by having a bunch of programmers dump traces when
they make errors.  A human analyses these, and maps
the traces to nice human messages.  (This idea comes from
a tool that automates this process for YACC-based compilers,

http://unicon.sourceforge.net/merr

which has a paper that explains the idea.)  This makes it
mechanical instead of imagination-stressing to deal with
the many specific cases, & it turns out it has pretty good
resolving power.

Examples of what a token trace can distinguish, allowing specific messages:

int main()       // parenthesis or semi-colon expected
int x y;         // missing comma in variable list
char() {}        // function name expected
int a[] = {1, 2; // unclosed initializer
struct foo
int x;           // missing { after struct label

So, I'm saying if it isn't in there, such a
state/token-trace & dump facility would make
constructing the error messages more mechanical, and
take less expertise to find/add new ones, and allow
a more comprehensive & specific set of messages.
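
As a rough sketch of the lookup side of this idea: a table keyed by the recent token trace, filled in from traces that people collected. This simplifies the tool described above, which also uses parser states; the class name and the token-kind strings here are hypothetical:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A trace is the sequence of token kinds seen just before the error.
using Trace = std::vector<std::string>;

// Hypothetical table mapping collected traces to hand-written messages.
class TraceMessages {
public:
    void add(const Trace& trace, const std::string& message) {
        table_[trace] = message;
    }

    // Try the whole trace first, then progressively shorter suffixes of it,
    // so that the most specific recorded trace wins.
    std::string lookup(const Trace& seen) const {
        for (std::size_t start = 0; start < seen.size(); ++start) {
            Trace suffix(seen.begin() + start, seen.end());
            auto it = table_.find(suffix);
            if (it != table_.end())
                return it->second;
        }
        return "syntax error";   // fallback when no trace was recorded
    }

private:
    std::map<Trace, std::string> table_;
};
```

Adding a new message is then a matter of recording one more trace and its text, with no change to the parser itself.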

Jeff Donner


* Re: new parser: error recovery needs work
  2003-01-14  1:24 Joe Buck
@ 2003-01-14  7:35 ` Mark Mitchell
  0 siblings, 0 replies; 36+ messages in thread
From: Mark Mitchell @ 2003-01-14  7:35 UTC (permalink / raw)
  To: Joe Buck, gcc

--On Monday, January 13, 2003 03:46:43 PM -0800 Joe Buck 
<jbuck@synopsys.com> wrote:

> The new parser doesn't produce useful diagnostics in the presence of
> common errors.  Since the old parser did better, this is a regression.

To be honest, I'm somewhat unsympathetic.  Not because I think the error
messages are good, or because I think that we shouldn't do better, but
because it's hard to do better in some of these cases and because
we do noticeably better in other cases -- the old parser just said
"parse error" a lot. :-)

It's also hard to do better without breaking legal programs; it takes
a lot of head-scratching to think of all the cases.

> The new parser might want to use a strategy that goes something like this:
> make a guess as to what was intended.  If a complete statement can be
> parsed according to that guess, then keep it.  Optionally try a second
> guess, if there is one available, otherwise skip to some synchronizing
> token.

I'd prefer that we not introduce yet more backtracking.  Too complicated.

Treating "identifier_1 identifier_2" by saying "identifier_1 is not a
type" is an excellent idea.  Probably not too hard to implement,
either. :-)

Put these into PRs; I'll look at them when I've gotten through the
crashes and such.

Thanks,

-- 
Mark Mitchell                mark@codesourcery.com
CodeSourcery, LLC            http://www.codesourcery.com


* new parser: error recovery needs work
@ 2003-01-14  1:24 Joe Buck
  2003-01-14  7:35 ` Mark Mitchell
  0 siblings, 1 reply; 36+ messages in thread
From: Joe Buck @ 2003-01-14  1:24 UTC (permalink / raw)
  To: gcc

The new parser doesn't produce useful diagnostics in the presence of
common errors.  Since the old parser did better, this is a regression.
Some examples:

1) mis-spelling a type name.

unsinged i;

with the old parser produces

pe.cpp:1: error: 'unsinged' is used as a type, but is not defined as a type.

with the new parser we get

pe.cpp:1: error: expected constructor, destructor, or type conversion
pe.cpp:1: error: expected `,' or `;'

2) forgetting to provide a template argument list

#include <vector>
std::vector foo;

with the old parser produces

v1.cpp:3: invalid use of template-name 'std::vector' in a declarator
v1.cpp:3: syntax error before `;' token

and with the new parser produces

v1.cpp:3: error: expected constructor, destructor, or type conversion
v1.cpp:3: error: expected `,' or `;'

3) forgetting the std:: namespace (common for gcc-2.95.x code)

#include <vector>
vector<int> foo;

with the new parser:

v1.cpp:3: error: expected constructor, destructor, or type conversion
v1.cpp:3: error: expected `,' or `;'

with the old parser:

v2.cpp:3: error: 'vector' is used as a type, but is not defined as a type.
(not quite correct, but at least it's a good hint)

The old parser's treatment of #1 and #3 is due to an error recovery rule
that I came up with based on Gerald pointing out that the 2.x -> 3.x
conversion would be a disaster without it (as the only thing the 3.0-pre
compiler would say to typical 2.95.x code was lots of "syntax error
before `;' token" messages).  The hack was pretty simple, attempting to
match

IDENTIFIER optional_template_arg_list IDENTIFIER optional_arg_list ';'

which is a sequence that cannot occur in legal C++.  This was good enough
to catch a number of common mistakes.
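
For illustration, the pattern can be sketched as a check over a list of token kinds. This is not the old parser's grammar rule (which was a yacc production); the Kind enum and both functions are hypothetical, and qualified names such as std::vector are not handled:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token kinds, just enough for this pattern.
enum class Kind { Identifier, Less, Greater, LParen, RParen, Semicolon, Other };

// If t[pos] is `open`, advance pos past the matching `close` (tracking
// nesting). If there is no group at pos, leave pos unchanged.
// Returns false only when a group starts but is never closed.
bool skip_optional_group(const std::vector<Kind>& t, std::size_t& pos,
                         Kind open, Kind close) {
    if (pos >= t.size() || t[pos] != open)
        return true;                      // optional group is absent
    int depth = 0;
    for (; pos < t.size(); ++pos) {
        if (t[pos] == open) {
            ++depth;
        } else if (t[pos] == close && --depth == 0) {
            ++pos;                        // step past the closing token
            return true;
        }
    }
    return false;                         // unbalanced group
}

// Matches: IDENTIFIER optional_template_arg_list IDENTIFIER optional_arg_list ';'
bool looks_like_undeclared_type(const std::vector<Kind>& t) {
    std::size_t pos = 0;
    if (pos >= t.size() || t[pos++] != Kind::Identifier) return false;
    if (!skip_optional_group(t, pos, Kind::Less, Kind::Greater)) return false;
    if (pos >= t.size() || t[pos++] != Kind::Identifier) return false;
    if (!skip_optional_group(t, pos, Kind::LParen, Kind::RParen)) return false;
    return pos + 1 == t.size() && t[pos] == Kind::Semicolon;
}
```

Under these assumptions, the token kinds for `unsinged i;' and `vector<int> foo;' both match, while an assignment such as `x = y;' does not.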

The new parser's structure should make it possible to do better.  The key
is to attempt a reasonable guess as to what might have been intended;
in some cases, there is really only one possibility.

For example, if we have two unknown identifiers in a row, the only way the
code could be legal is if the first is a type, declaring the second, and
it is mis-spelled or the declaration was forgotten.

If we have a template identifier followed by another identifier, then
the likelihood is that the template argument list was forgotten.

If a symbol is unknown but there is a matching symbol in the std::
namespace, it's possible to print a "did you mean std::foo?" message.

The new parser might want to use a strategy that goes something like this:
make a guess as to what was intended.  If a complete statement can be
parsed according to that guess, then keep it.  Optionally try a second
guess, if there is one available, otherwise skip to some synchronizing
token.
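
The three guesses above can be sketched as a single function, assuming the parser has reached "first second" at the start of a declaration. The Symbols struct is a hypothetical stand-in for the compiler's symbol tables, and the message wording is invented for illustration:

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the compiler's symbol tables.
struct Symbols {
    std::set<std::string> types;      // names known to be types
    std::set<std::string> templates;  // names known to be class templates
    std::set<std::string> std_names;  // unqualified names found in namespace std
};

// Guess a diagnostic for "first second ...;" at the start of a declaration.
// Returns an empty string when `first` is a known type (nothing to report).
std::string diagnose_pair(const Symbols& syms, const std::string& first,
                          const std::string& second) {
    if (syms.types.count(first))
        return "";
    // A template name directly followed by an identifier: the argument
    // list was probably forgotten.
    if (syms.templates.count(first))
        return "missing template argument list after `" + first +
               "' in declaration of `" + second + "'";
    // Unknown name with a match in std: suggest the qualified name.
    if (syms.std_names.count(first))
        return "`" + first + "' is not declared; did you mean std::" + first + "?";
    // Two identifiers in a row: the first can only have been meant as a type.
    return "`" + first + "' is not a type (in declaration of `" + second + "')";
}
```

This sketch only produces the message. Whether to keep parsing as if the guess were right, or skip to a synchronizing token, is a separate decision.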
