public inbox for gcc@gcc.gnu.org
* Parsing templates as baseclasses
@ 1998-03-01 16:07 Martin von Loewis
  1998-03-01 16:59 ` Mark Mitchell
  0 siblings, 1 reply; 6+ messages in thread
From: Martin von Loewis @ 1998-03-01 16:07 UTC (permalink / raw)
  To: egcs

I'm trying to investigate the code

namespace foo {

  template <class T>
  class x {};

}

class y : public foo::x<int> {};

I've got the lexer to produce, for the last line

(AGGR `class') (IDENTIFIER_DEFN `y')
(':')(VISSPEC)
(NSNAME)(SCOPE)
(PTYPENAME `x')('<')(TYPESPEC `int')('>')
('{')

The last line is eventually reduced -> template_type.
Then I get an error with the stack
state stack now 0 1 4 66 222 223 466 717 958 123

In the non-namespace case, the template_type is further reduced
-> type_name -> nonnested_type -> base_class.1

Now, where should I put the support for namespace-qualified template
types? I'll have to eventually reduce this to base_class.1 as well,
preferably without declaring foo::x<int> a nested type (it isn't).
Also, I'd like to avoid additional conflicts in the grammar.

Any guidance appreciated.

Martin
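
One way to picture the question above, outside of bison: a minimal
recursive-descent sketch in which the namespace qualifier is consumed as
an optional prefix of the template type, so foo::x<int> never has to be
treated as a nested type.  The token names are borrowed from the lexer
output in the message (LESS, GREATER and LBRACE stand in for the literal
characters); the Cursor type and all function names are hypothetical and
are not taken from cp/parse.y.  It says nothing about whether an
equivalent bison rule would introduce conflicts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Token kinds named after the lexer output quoted in the message.
enum Tok { AGGR, IDENTIFIER_DEFN, COLON, VISSPEC, NSNAME, SCOPE,
           PTYPENAME, LESS, TYPESPEC, GREATER, LBRACE };

// Hypothetical read position over a token vector.
struct Cursor {
    const std::vector<Tok>& toks;
    std::size_t pos;

    // Consume the next token if it has kind t.
    bool accept(Tok t) {
        if (pos < toks.size() && toks[pos] == t) { ++pos; return true; }
        return false;
    }
};

// template_type: PTYPENAME '<' TYPESPEC '>'
bool template_type(Cursor& c) {
    return c.accept(PTYPENAME) && c.accept(LESS) &&
           c.accept(TYPESPEC) && c.accept(GREATER);
}

// base_type: (NSNAME SCOPE)* template_type
bool base_type(Cursor& c) {
    while (c.accept(NSNAME)) {
        if (!c.accept(SCOPE)) return false;  // a namespace name needs '::'
    }
    return template_type(c);
}

// class_head: AGGR IDENTIFIER_DEFN ':' [VISSPEC] base_type '{'
bool class_head(const std::vector<Tok>& toks) {
    Cursor c{toks, 0};
    if (!c.accept(AGGR) || !c.accept(IDENTIFIER_DEFN) || !c.accept(COLON))
        return false;
    c.accept(VISSPEC);  // the access specifier is optional
    return base_type(c) && c.accept(LBRACE) && c.pos == toks.size();
}

// Usage: the token stream from the message is accepted.
void check_message_example() {
    const std::vector<Tok> line = {AGGR, IDENTIFIER_DEFN, COLON, VISSPEC,
                                   NSNAME, SCOPE, PTYPENAME, LESS,
                                   TYPESPEC, GREATER, LBRACE};
    assert(class_head(line));
}
```

The unqualified case goes through the same base_type function with zero
iterations of the loop, which mirrors the wish to end up at the same
nonterminal in both cases.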


* Parsing templates as baseclasses
  1998-03-01 16:07 Parsing templates as baseclasses Martin von Loewis
@ 1998-03-01 16:59 ` Mark Mitchell
  1998-05-09  3:34   ` Martin von Loewis
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Mitchell @ 1998-03-01 16:59 UTC (permalink / raw)
  To: Martin von Loewis; +Cc: egcs

>>>>> "Martin" == Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de> writes:

    Martin> I'm trying to investigate the code

As we speak, I'm engaged in the process of redoing the cp/parse.y from
scratch, based directly on the grammar in the standard, with the
addition of the GNU extensions that we currently support.  This is an
unauthorized, unsanctioned project; in particular, Jason's had nothing
to do with it.  I've discovered a number of odd parsing problems, and
after poking around a bit, decided that some major changes were in
order, especially now that the C++ grammar has (finally) stabilized.

So, you might want to hold off; in a week or so I expect to submit the
new (and hopefully much improved) grammar, which should be fully
ISO-conformant.  On the other hand, there's no guarantee whatsoever
that this version will be in g++ any time soon; Jason will of course
have to check it over with his usual eagle eye.

    Martin> namespace foo {

    Martin>   template <class T> class x {};

    Martin> }

    Martin> class y : public foo::x<int> {};

    Martin> I've got the lexer to produce, for the last line

    Martin> (AGGR `class') (IDENTIFIER_DEFN `y') (':')(VISSPEC)
    Martin> (NSNAME)(SCOPE) (PTYPENAME `x')('<')(TYPESPEC `int')('>')
    Martin> ('{')

    Martin> The last line is eventually reduced -> template_type.
    Martin> Then I get an error with the stack state stack now 0 1 4
    Martin> 66 222 223 466 717 958 123

    Martin> In the non-namespace case, the template_type is further
    Martin> reduced -> type_name -> nonnested_type -> base_class.1

    Martin> Now, where should I put the support for
    Martin> namespace-qualified template types? I'll have to
    Martin> eventually reduce this to base_class.1 as well, preferably
    Martin> without declaring foo::x<int> a nested type (it isn't).
    Martin> Also, I'd like to avoid additional conflicts in the
    Martin> grammar.

    Martin> Any guidance appreciated.

    Martin> Martin


-- 
Mark Mitchell		mmitchell@usa.net
Stanford University	http://www.stanford.edu



* Re: Parsing templates as baseclasses
  1998-03-01 16:59 ` Mark Mitchell
@ 1998-05-09  3:34   ` Martin von Loewis
  0 siblings, 0 replies; 6+ messages in thread
From: Martin von Loewis @ 1998-05-09  3:34 UTC (permalink / raw)
  To: mmitchell; +Cc: egcs

Mark> As we speak, I'm engaged in the process of redoing the cp/parse.y from
Mark> scratch, based directly on the grammar in the standard, with the
Mark> addition of the GNU extensions that we currently support.

What is the status of this project? I still believe that the C++
grammar of g++ is in need of a rewrite, so it would be bad if that got
cancelled somehow.

Martin


* Re: Parsing templates as baseclasses
  1998-03-02 20:02 ` Mark Mitchell
@ 1998-03-04 13:54   ` Neal Becker
  0 siblings, 0 replies; 6+ messages in thread
From: Neal Becker @ 1998-03-04 13:54 UTC (permalink / raw)
  To: mmitchell; +Cc: Mike Stump, egcs

PCCTS should be a good framework.  If I recall correctly it generates
recursive descent parsers by default.  It has C++ as an example, but I
don't know how completely this implements the standard.


* Re: Parsing templates as baseclasses
@ 1998-03-03  1:08 Mike Stump
  1998-03-02 20:02 ` Mark Mitchell
  0 siblings, 1 reply; 6+ messages in thread
From: Mike Stump @ 1998-03-03  1:08 UTC (permalink / raw)
  To: mmitchell; +Cc: egcs

> Date: Sun, 1 Mar 1998 17:02:45 GMT
> From: Mark Mitchell <mmitchell@usa.net>

> As we speak, I'm engaged in the process of redoing the cp/parse.y from
> scratch

This is something that needs to be done.

Do you understand the hard parsing issues with C++?  If you think the
grammar in the standard is the be all end all in grammars, then I
suspect there are things that you don't yet understand, and without
that understanding, completing this project to the level of
completeness that makes the project worthwhile will be hard.  Do you
think the grammar in the standard just works?  If so, why?

Do you have the requisite framework to solve all the hard parsing
problems in C++?

Also, have you implemented all the extra semantic checks as found in
the text of the standard that we previously had implemented in the
parser?

As an example of one of the more subtle parsing issues with C++,
consider the following:

          struct T1 {
                  T1 operator()(int x) { return T1(x); }
                  int operator=(int x) { return x; }
                  T1(int) { }
          };
          struct T2 { T2(int){ } };
          int a, (*(*b)(T2))(int), c, d;


          void f() {
                  // disambiguation requires this to be parsed
                  // as a declaration
                  T1(a) = 3,
                  T2(4),                  // T2 will be declared as
                  (*(*b)(T2(c)))(int(d)); // a variable of type T1
                                          // but this will not allow
                                          // the last part of the
                                          // declaration to parse
                                          // properly since it depends
                                          // on T2 being a type-name
          }

Do you fully understand this example?  Without fixing it now, does
your parser currently get it right?  If not, did you think you had a
full C++ parser?  There are subtleties contained in this example
that experts in both parsing and C++ find non-obvious.

Do you think you have the full implications of the following paragraph
implemented?

3 The disambiguation is purely syntactic; that is, the meaning of the
  names occurring in such a statement, beyond whether they are
  type-names or not, is not generally used in or changed by the
  disambiguation.  Class templates are instantiated as necessary to
  determine if a qualified name is a type-name.  Disambiguation
  precedes parsing, and a statement disambiguated as a declaration may
  be an ill-formed declaration.  If, during parsing, a name in a
  template parameter is bound differently than it would be bound during
  a trial parse, the program is ill-formed.
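
A much smaller case than the T1/T2 example may help to see the rule in
the quoted paragraph at work.  The sketch below assumes a made-up struct
T and made-up function names; none of it comes from the standard or from
g++.  The statement `T(a);` could be read as an expression (build a
temporary T from the int a) or as a declaration (`T a;`), and the
syntactic rule picks the declaration.

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct T {
    int v;
    T() : v(0) {}
    explicit T(int x) : v(x) {}
};

// `T(a);` on its own is disambiguated as a declaration.
int declared_value() {
    int a = 7;
    int result = a;
    {
        T(a);          // declares a new `a` of type T, hiding the int
        result = a.v;  // the new `a` is default-constructed, so v is 0
    }
    return result;
}

// Inside a larger expression, T(a) is a function-style cast of the int.
int expression_value() {
    int a = 7;
    return T(a).v;     // temporary built from the int a, so v is 7
}

void check_disambiguation() {
    assert(declared_value() == 0);
    assert(expression_value() == 7);
}
```

Compilers may warn about the redundant parentheses in `T(a);`, which is
a fair hint that the declaration reading is the surprising one.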

Have you addressed all the relevant issues raised in
ftp://ftp.cygnus.com/pub/g++/g++-bugs/parsing ?

Do you solve the complete problem of expr v decl parsing?  How?

What framework do you use (bison, yacc, PCCTS)?  Is it at least as
maintainable as a good PCCTS implementation?


I look forward to hearing about your work.


* Re: Parsing templates as baseclasses
  1998-03-03  1:08 Mike Stump
@ 1998-03-02 20:02 ` Mark Mitchell
  1998-03-04 13:54   ` Neal Becker
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Mitchell @ 1998-03-02 20:02 UTC (permalink / raw)
  To: Mike Stump; +Cc: egcs

>>>>> "Mike" == Mike Stump <mrs@wrs.com> writes:

    >> Date: Sun, 1 Mar 1998 17:02:45 GMT From: Mark Mitchell
    >> <mmitchell@usa.net>

    >> As we speak, I'm engaged in the process of redoing the
    >> cp/parse.y from scratch

    Mike> This is something that needs to be done.

    Mike> Do you understand the hard parsing issues with C++?  If you

Yes. 

    Mike> think the grammar in the standard is the be all end all in
    Mike> grammars, then I suspect there are things that you don't yet
    Mike> understand, and without that understanding, completing this
    Mike> project to the level of completeness that makes the project
    Mike> worthwhile will be hard.  Do you think the grammar in the
    Mike> standard just works?  If so, why?

I know very well that there are a host of ambiguities, and that the
grammar in the standard is not LALR(1).  In fact, I'm familiar with
all the problems you mentioned below.  It's those problems, and the
fact that g++ gets some of them wrong at the moment, that inspired my
work.  Jason and I have been discussing my proposed techniques in some
private email.  We've been debating my approach (still based on bison)
versus a recursive-descent parser.

Of course, since C++ is not LALR(1) no pure bison approach will work.
In fact, no finite amount of lookahead will do.  As you know, there
are certain statements that could be either function or object
declarations, so it is occasionally necessary to do `trial parses' or
build multiple parse trees and pick the right one later, or something
like this.
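
For readers who have not met the function-versus-object case mentioned
above, here is a minimal sketch.  Widget, Holder and the function names
are invented for the illustration, and it leans on C++11 features
(decltype, <type_traits>) only to make the outcome observable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types used only for this illustration.
struct Widget { int v; Widget() : v(1) {} };
struct Holder { int v; explicit Holder(Widget w) : v(w.v) {} };

// `Holder h(Widget());` reads as a function declaration: h takes a
// (pointer to) function returning Widget and returns a Holder.
bool reads_as_function() {
    Holder h(Widget());
    return std::is_function<decltype(h)>::value;
}

// The extra parentheses rule out the parameter-declaration reading,
// so h is an object initialised from a temporary Widget.
bool reads_as_object() {
    Holder h((Widget()));
    return std::is_class<decltype(h)>::value && h.v == 1;
}

void check_function_vs_object() {
    assert(reads_as_function());
    assert(reads_as_object());
}
```

The two statements differ only by a pair of parentheses, and the
parenthesised part can be arbitrarily long before the parser learns
which reading applies.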

My plan involves a hybrid approach, whereby bison is used to do much of
the work, but where there will sometimes be recursive parsing calls to
do trial parses, or hand-parsing.  It is easy to make bison generate a
re-entrant parser, and it's possible to have bison parse certain
constructs "by hand", with a bit more work.
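
The trial-parse idea can be sketched independently of bison.  What
follows is a hand-written toy, not the hybrid parser described above:
ToyParser, Kind, classify and the tiny grammar (one type name "T",
single-letter identifiers, numbers, '+') are all assumptions made for
the example.  It tries the declaration reading first and rewinds the
token position if that fails.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Declaration, Expression, Error };

// Hypothetical toy parser; "T" is the only type name it knows.
class ToyParser {
public:
    explicit ToyParser(std::vector<std::string> toks)
        : toks_(std::move(toks)), pos_(0) {}

    // Trial-parse as a declaration; on failure rewind and try expression.
    Kind statement() {
        const std::size_t mark = pos_;
        if (declaration()) return Kind::Declaration;
        pos_ = mark;
        if (expression()) return Kind::Expression;
        pos_ = mark;
        return Kind::Error;
    }

private:
    bool accept(const std::string& t) {
        if (pos_ < toks_.size() && toks_[pos_] == t) { ++pos_; return true; }
        return false;
    }
    bool type_name() { return accept("T"); }
    // identifier: a single lowercase letter
    bool identifier() {
        if (pos_ < toks_.size() && toks_[pos_].size() == 1 &&
            std::islower(static_cast<unsigned char>(toks_[pos_][0]))) {
            ++pos_;
            return true;
        }
        return false;
    }
    // number: any token that starts with a digit
    bool number() {
        if (pos_ < toks_.size() && !toks_[pos_].empty() &&
            std::isdigit(static_cast<unsigned char>(toks_[pos_][0]))) {
            ++pos_;
            return true;
        }
        return false;
    }
    // declarator: identifier | '(' declarator ')'
    bool declarator() {
        if (accept("(")) return declarator() && accept(")");
        return identifier();
    }
    // declaration: type_name declarator ';'
    bool declaration() { return type_name() && declarator() && accept(";"); }
    // primary: type_name '(' identifier ')' | identifier | number
    bool primary() {
        if (type_name()) return accept("(") && identifier() && accept(")");
        return identifier() || number();
    }
    // expression: primary ('+' primary)* ';'
    bool expression() {
        if (!primary()) return false;
        while (accept("+")) {
            if (!primary()) return false;
        }
        return accept(";");
    }

    std::vector<std::string> toks_;
    std::size_t pos_;
};

Kind classify(const std::vector<std::string>& toks) {
    return ToyParser(toks).statement();
}

void check_trial_parse() {
    assert(classify({"T", "(", "a", ")", ";"}) == Kind::Declaration);
    assert(classify({"T", "(", "a", ")", "+", "1", ";"}) == Kind::Expression);
}
```

Trying the declaration first matches the rule that anything which can
be a declaration is one; the expression reading is only reached after
the trial parse has failed and the position has been restored.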

At present, I've still got some work to do.  However, the parser in
its current state uses no precedence declarations (one of the major
maintenance nightmares of the current parser, in my opinion), and is
down to a handful (literally) of conflicts that need to be resolved
correctly.

I've suggested that this hybrid approach will be simpler than a
completely by-hand approach, and will allow us to substitute
hand-written parsers as we find it necessary for speed, better
error recovery, or what have you.

Jason has pointed out that most (all?) vendors now use
recursive-descent parsers, and that they tend to provide better
error-recovery. 

At this point, I'm inclined to go a bit farther with my approach;
either I will hit an unanticipated technical snag, or succeed; in the
latter case, everyone will get a chance to see what I've done and
decide whether or not it is satisfactory.

-- 
Mark Mitchell		mmitchell@usa.net
Stanford University	http://www.stanford.edu


