public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* #pragma interface/implementation broken if --enable-mapped-location
@ 2004-09-24  2:08 Per Bothner
  2004-09-24  2:22 ` Zack Weinberg
  0 siblings, 1 reply; 13+ messages in thread
From: Per Bothner @ 2004-09-24  2:08 UTC (permalink / raw)
  To: gcc

Some change in the last week or so broke support for
#pragma interface/implementation in the --enable-mapped-location
case. This breaks building libjava/java/lang/natClass.cc.

What appears to be happening is this:
The #pragma interface in Class.h is deferred until the current
file (as given by input_location) is natClass.cc.  At that point
cp_lexer_handle_pragma calls cp_lexer_set_source_position_from_token,
which nicely sets the input_location to the saved Class.h.  After a
few more calls we call do_pragms, but with the defer_pragmas flag
turned off, so we get to:

   if (p && (p->is_internal || !CPP_OPTION (pfile, defer_pragmas)))
     {
       /* Since the handler below doesn't get the line number, that it
	 might need for diagnostics, make sure it has the right
	 numbers in place.  */
       if (pfile->cb.line_change)
	(*pfile->cb.line_change) (pfile, pragma_token, false);
       (*p->u.handler) (pfile);
     }

The pragma_token has a location in natClass.cc even though the
original (deferred) pragma was in Class.h.  So the call to
line_change sets the input_location to the location in natClass.cc,
undoing the effect of cp_lexer_set_source_position_from_token.
Hence the #pragma interface doesn't work.

The same thing happens in the --disable-mapped-location case,
but it happens to "accidentally" work:  In that case the cb_line_change
function (in c-lex.c) just changes the input_line, leaving the
input_filename unchanged.  The resulting input_location is
non-sensical, but #pragma interface doesn't care.

My quick-and-dirty fix to remove the cb.line_change call.  I'll try
rebuilding with that.

Zack, how do you want to handle this?
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24  2:08 #pragma interface/implementation broken if --enable-mapped-location Per Bothner
@ 2004-09-24  2:22 ` Zack Weinberg
  2004-09-24  2:43   ` Per Bothner
  0 siblings, 1 reply; 13+ messages in thread
From: Zack Weinberg @ 2004-09-24  2:22 UTC (permalink / raw)
  To: Per Bothner; +Cc: gcc

Per Bothner <per@bothner.com> writes:

> Some change in the last week or so broke support for
> #pragma interface/implementation in the --enable-mapped-location
> case. This breaks building libjava/java/lang/natClass.cc.
>
> What appears to be happening is this:
> The #pragma interface in Class.h is deferred until the current
> file (as given by input_location) is natClass.cc.  At that point
> cp_lexer_handle_pragma calls cp_lexer_set_source_position_from_token,
> which nicely sets the input_location to the saved Class.h.  After a
> few more calls we call do_pragms, but with the defer_pragmas flag
> turned off, so we get to:
>
>   if (p && (p->is_internal || !CPP_OPTION (pfile, defer_pragmas)))
>     {
>       /* Since the handler below doesn't get the line number, that it
> 	 might need for diagnostics, make sure it has the right
> 	 numbers in place.  */
>       if (pfile->cb.line_change)
> 	(*pfile->cb.line_change) (pfile, pragma_token, false);
>       (*p->u.handler) (pfile);
>     }
>
> The pragma_token has a location in natClass.cc even though the
> original (deferred) pragma was in Class.h.  So the call to
> line_change sets the input_location to the location in natClass.cc,
> undoing the effect of cp_lexer_set_source_position_from_token.
> Hence the #pragma interface doesn't work.
>
> The same thing happens in the --disable-mapped-location case,
> but it happens to "accidentally" work:  In that case the cb_line_change
> function (in c-lex.c) just changes the input_line, leaving the
> input_filename unchanged.  The resulting input_location is
> non-sensical, but #pragma interface doesn't care.
>
> My quick-and-dirty fix to remove the cb.line_change call.  I'll try
> rebuilding with that.

We can't take out the cb.line_change call, that will mess up the C
front end.

My question is why is the pragma_token's location wrong?  That would
seem to be the real bug.

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24  2:22 ` Zack Weinberg
@ 2004-09-24  2:43   ` Per Bothner
  2004-09-24  4:58     ` Zack Weinberg
  0 siblings, 1 reply; 13+ messages in thread
From: Per Bothner @ 2004-09-24  2:43 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc

Zack Weinberg wrote:

> We can't take out the cb.line_change call, that will mess up the C
> front end.

Right - I don't suggest it as a real fix.

One option is for cpp_handle_deferred_pragma to temporarily clear
(and later restore) the cb.line_change.

Alternatively, cp_lexer_handle_pragma can do the same thing, though
that seems (even) less clean.

A third option to pass in an extra parameter or to set a flag so that
do_pragma knows that it is processing a pragma that was deferred.

> My question is why is the pragma_token's location wrong?  That would
> seem to be the real bug.

The pragma line is saved as a string, not as tokens.  Then
cp_lexer_handle_pragma calls cpp_handle_deferred_pragms which
calls run_directive.  The pragma_token seen by do_pragma (the
second time) is from lexing the run_directive string.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24  2:43   ` Per Bothner
@ 2004-09-24  4:58     ` Zack Weinberg
  2004-09-24 14:17       ` Joseph S. Myers
  0 siblings, 1 reply; 13+ messages in thread
From: Zack Weinberg @ 2004-09-24  4:58 UTC (permalink / raw)
  To: Per Bothner; +Cc: gcc

Per Bothner <per@bothner.com> writes:

> The pragma line is saved as a string, not as tokens.  Then
> cp_lexer_handle_pragma calls cpp_handle_deferred_pragms which
> calls run_directive.  The pragma_token seen by do_pragma (the
> second time) is from lexing the run_directive string.

Maybe then we should change cpp_handle_deferred_pragma and
run_directive to take a source location.  This hasn't historically
been a problem with run_directive since it was used only for command
line switches.

Alternatively, I wanted originally to defer pragmas not by
encapsulating the un-tokenized line in a string, but by injecting a
keyword (__pragma) at the beginning of the line, a semicolon at the
end, and tokenizing normally (but with macro expansion disabled).
Matt persuaded me that was hard to implement, but maybe it should be
revisited.

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24  4:58     ` Zack Weinberg
@ 2004-09-24 14:17       ` Joseph S. Myers
  2004-09-24 21:47         ` Zack Weinberg
  0 siblings, 1 reply; 13+ messages in thread
From: Joseph S. Myers @ 2004-09-24 14:17 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Per Bothner, gcc

On Thu, 23 Sep 2004, Zack Weinberg wrote:

> Alternatively, I wanted originally to defer pragmas not by
> encapsulating the un-tokenized line in a string, but by injecting a
> keyword (__pragma) at the beginning of the line, a semicolon at the
> end, and tokenizing normally (but with macro expansion disabled).
> Matt persuaded me that was hard to implement, but maybe it should be
> revisited.

That would also probably be convenient for OpenMP pragma implementation 
for C (where the pragmas can contain expressions that need to go through 
the parser as usual).  But for most target pragmas attempting to hook them 
in the grammar like that would seem an excess complication; the grammar 
would effectively just gather up a sequence of arbitrary tokens again to 
pass to the target pragma handler.  (Bearing in mind the general 
desirability that target pragmas share a single implementation for C and 
C++.)

I don't care for having three separate paths for pragma handling 
(immediate on lexing, deferred as a pragma token, deferred as a sequence 
of tokens), which suggests making C use the pragma token (but handle it 
immediately on parsing) and then allow for some pragmas - depending on the 
individual pragma - to become token sequences instead.

I wonder if also the check for unrecognized pragmas should be made before 
the pragma becomes any sort of token, on the basis that the C and C++ 
standards explicitly say that unknown pragmas are ignored (and so should 
be permitted anywhere in the source file, not just in reasonable places).  
That would seem cleaner than documenting that all pragmas are recognized 
in order to give errors if they occur in an inconvenient place.

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
  http://www.srcf.ucam.org/~jsm28/gcc/#c90status - status of C90 for GCC 4.0
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24 14:17       ` Joseph S. Myers
@ 2004-09-24 21:47         ` Zack Weinberg
  2004-09-24 22:15           ` Joseph S. Myers
  2004-09-29 19:24           ` Per Bothner
  0 siblings, 2 replies; 13+ messages in thread
From: Zack Weinberg @ 2004-09-24 21:47 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Per Bothner, gcc

"Joseph S. Myers" <jsm@polyomino.org.uk> writes:

> On Thu, 23 Sep 2004, Zack Weinberg wrote:
>
>> Alternatively, I wanted originally to defer pragmas not by
>> encapsulating the un-tokenized line in a string, but by injecting a
>> keyword (__pragma) at the beginning of the line, a semicolon at the
>> end, and tokenizing normally (but with macro expansion disabled).
>> Matt persuaded me that was hard to implement, but maybe it should be
>> revisited.
>
> That would also probably be convenient for OpenMP pragma implementation 
> for C (where the pragmas can contain expressions that need to go through 
> the parser as usual).  But for most target pragmas attempting to hook them 
> in the grammar like that would seem an excess complication; the grammar 
> would effectively just gather up a sequence of arbitrary tokens again to 
> pass to the target pragma handler.  (Bearing in mind the general 
> desirability that target pragmas share a single implementation for C and 
> C++.)

Well, remember that the way target pragma handlers work now is they
get tokens one at a time by calling c_lex.  This was always intended
to be hookable into a recursive descent parser as C++ has.  (I didn't
consider hooking target-specific pragmas into C's Yacc parser ever to
be feasible.)

My medium-term goals for the C++ front end involve bypassing c_lex and
having cpplib write its tokens directly into the big buffer that
cp/parser.c is now maintaining.  For the sake of the #pragma handlers,
it would then provide its own c_lex that is basically an exported
version of cp_lexer_consume_token.

Given all that, I think it makes sense for all pragmas to become
token sequences.  The difference between the OpenMP pragmas and the
others is just that they want to call back to the expression parser,
which is currently feasible for C++ and not for C.

> I wonder if also the check for unrecognized pragmas should be made
> before the pragma becomes any sort of token, on the basis that the C
> and C++ standards explicitly say that unknown pragmas are ignored
> (and so should be permitted anywhere in the source file, not just in
> reasonable places).  That would seem cleaner than documenting that
> all pragmas are recognized in order to give errors if they occur in
> an inconvenient place.

Suppose that cpplib, on encountering a #pragma, parsed the (namespace
and) keyword to decide what it meant.  If the #pragma was unknown, it
would then throw away the entire line.  If it was not unknown, it
would inject a CPP_PRAGMA token at the beginning of the line *in place
of* the token sequence # pragma [namespace] keyword.  The value of
this token would be a function pointer for the appropriate #pragma
handler.  I think that'd achieve both what you want and what I want,
and would also avoid doing the handler lookup twice.

Thoughts?

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24 21:47         ` Zack Weinberg
@ 2004-09-24 22:15           ` Joseph S. Myers
  2004-09-24 22:34             ` Zack Weinberg
  2004-09-29 19:24           ` Per Bothner
  1 sibling, 1 reply; 13+ messages in thread
From: Joseph S. Myers @ 2004-09-24 22:15 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Per Bothner, gcc

On Fri, 24 Sep 2004, Zack Weinberg wrote:

> Suppose that cpplib, on encountering a #pragma, parsed the (namespace
> and) keyword to decide what it meant.  If the #pragma was unknown, it
> would then throw away the entire line.  If it was not unknown, it
> would inject a CPP_PRAGMA token at the beginning of the line *in place
> of* the token sequence # pragma [namespace] keyword.  The value of
> this token would be a function pointer for the appropriate #pragma
> handler.  I think that'd achieve both what you want and what I want,
> and would also avoid doing the handler lookup twice.

That should work (with the warning from -Wunknown-pragmas being given at 
the point an unknown pragma is thrown away).  For now yylex for C could 
handle that pragma token by calling the handler immediately (so the 
subsequent tokens never reach the parser).  With the possibility of then 
checking for particular pragmas and passing the token sequence on to the 
parser for those cases only.

I'd add that the CPP_PRAGMA token, though referred to as __pragma, 
shouldn't actually be spellable that way in source code; the only way to 
get one in the token stream should be #pragma / _Pragma (and perhaps the 
diagnostic messages complaining of such pragmas in bad places for the 
grammar should use the spelling "#pragma" rather than "PRAGMA").

-- 
Joseph S. Myers               http://www.srcf.ucam.org/~jsm28/gcc/
  http://www.srcf.ucam.org/~jsm28/gcc/#c90status - status of C90 for GCC 4.0
    jsm@polyomino.org.uk (personal mail)
    jsm28@gcc.gnu.org (Bugzilla assignments and CCs)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24 22:15           ` Joseph S. Myers
@ 2004-09-24 22:34             ` Zack Weinberg
  0 siblings, 0 replies; 13+ messages in thread
From: Zack Weinberg @ 2004-09-24 22:34 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Per Bothner, gcc

"Joseph S. Myers" <jsm@polyomino.org.uk> writes:

> That should work (with the warning from -Wunknown-pragmas being given at 
> the point an unknown pragma is thrown away).  For now yylex for C could 
> handle that pragma token by calling the handler immediately (so the 
> subsequent tokens never reach the parser).  With the possibility of then 
> checking for particular pragmas and passing the token sequence on to the 
> parser for those cases only.

Yes.

> I'd add that the CPP_PRAGMA token, though referred to as __pragma, 
> shouldn't actually be spellable that way in source code; the only way to 
> get one in the token stream should be #pragma / _Pragma (and perhaps the 
> diagnostic messages complaining of such pragmas in bad places for the 
> grammar should use the spelling "#pragma" rather than "PRAGMA").

Right.  When -E is in effect, we'll continue to emit pragmas in
preprocessing-directive notation, as is done now.  I've already got a
bug assigned to me to improve the diagnostics.

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-24 21:47         ` Zack Weinberg
  2004-09-24 22:15           ` Joseph S. Myers
@ 2004-09-29 19:24           ` Per Bothner
  2004-09-29 20:56             ` Zack Weinberg
  1 sibling, 1 reply; 13+ messages in thread
From: Per Bothner @ 2004-09-29 19:24 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Joseph S. Myers, gcc

Zack Weinberg wrote:

> Given all that, I think it makes sense for all pragmas to become
> token sequences.

How close are we to having this?

I'm hoping to check in today my gcc/java changes to handle
--enable-mapped-location.  That would allow bootstrapping gcc
with the default set of languages.  But that requires a fix
or workaround for the #pragma interface/implementation bug,
since otherwise libjava won't build.

If we're not close to the #pragma-as-token change, I could
implement a quickie fix to for example temporarily clear
cb.line_change during cpp_handle_deferred_pragma.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-29 19:24           ` Per Bothner
@ 2004-09-29 20:56             ` Zack Weinberg
  2004-09-29 21:57               ` Matt Austern
  2004-09-30  7:57               ` Per Bothner
  0 siblings, 2 replies; 13+ messages in thread
From: Zack Weinberg @ 2004-09-29 20:56 UTC (permalink / raw)
  To: Per Bothner; +Cc: Joseph S. Myers, gcc

Per Bothner <per@bothner.com> writes:

> Zack Weinberg wrote:
>
>> Given all that, I think it makes sense for all pragmas to become
>> token sequences.
>
> How close are we to having this?
>
> I'm hoping to check in today my gcc/java changes to handle
> --enable-mapped-location.  That would allow bootstrapping gcc
> with the default set of languages.  But that requires a fix
> or workaround for the #pragma interface/implementation bug,
> since otherwise libjava won't build.
>
> If we're not close to the #pragma-as-token change, I could
> implement a quickie fix to for example temporarily clear
> cb.line_change during cpp_handle_deferred_pragma.

I think we're at least a couple weeks away from #pragma-as-tokens.  A
quickie fix would be fine by me.

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-29 20:56             ` Zack Weinberg
@ 2004-09-29 21:57               ` Matt Austern
  2004-09-29 22:50                 ` Zack Weinberg
  2004-09-30  7:57               ` Per Bothner
  1 sibling, 1 reply; 13+ messages in thread
From: Matt Austern @ 2004-09-29 21:57 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc, Per Bothner, Joseph S. Myers

On Sep 29, 2004, at 11:14 AM, Zack Weinberg wrote:

> Per Bothner <per@bothner.com> writes:
>
>> Zack Weinberg wrote:
>>
>>> Given all that, I think it makes sense for all pragmas to become
>>> token sequences.
>>
>> How close are we to having this?
>>
>> I'm hoping to check in today my gcc/java changes to handle
>> --enable-mapped-location.  That would allow bootstrapping gcc
>> with the default set of languages.  But that requires a fix
>> or workaround for the #pragma interface/implementation bug,
>> since otherwise libjava won't build.
>>
>> If we're not close to the #pragma-as-token change, I could
>> implement a quickie fix to for example temporarily clear
>> cb.line_change during cpp_handle_deferred_pragma.
>
> I think we're at least a couple weeks away from #pragma-as-tokens.  A
> quickie fix would be fine by me.

The reason I didn't implement pragma-as-tokens was pretty simple:
I thought it would involve fairly tricky changes in both libcpp and
the front end.

First, the libcpp changes: it's easy to get libcpp to do parsing
the same way it normally does but using a different source of
characters; the infrastructure is already there.  I didn't see an
easy way of getting the pragma handlers to use an alternate
mechanism for getting tokens.  There is no such infrastructure in
the existing libcpp, and creating something new, just for this
purpose, didn't seem worth it.

Second, the front end changes.  If a pragma is a single token,
then it's very easy for the front end to skip past it and to
only evaluate it once.  If it's a stream of tokens, then it's
harder.  (But possibly that concern is no longer valid now that
Zack made the changes so that pragmas may only appear in specific
places.)

I still find pragmas-as-strings simpler and more elegant than
pragmas-as-tokens, and I don't quite understand why it's
necessary to move to pragmas-as-tokens to fix #pragma interface/
implementation in the presence of --enable-mapped-location.  But
if other people think this change is a good idea, I'm not going
to stand in the way.

			--Matt

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-29 21:57               ` Matt Austern
@ 2004-09-29 22:50                 ` Zack Weinberg
  0 siblings, 0 replies; 13+ messages in thread
From: Zack Weinberg @ 2004-09-29 22:50 UTC (permalink / raw)
  To: Matt Austern; +Cc: gcc, Per Bothner, Joseph S. Myers

Matt Austern <austern@apple.com> writes:

>> I think we're at least a couple weeks away from #pragma-as-tokens.  A
>> quickie fix would be fine by me.
>
> The reason I didn't implement pragma-as-tokens was pretty simple:
> I thought it would involve fairly tricky changes in both libcpp and
> the front end.
>
> First, the libcpp changes: it's easy to get libcpp to do parsing
> the same way it normally does but using a different source of
> characters; the infrastructure is already there.  I didn't see an
> easy way of getting the pragma handlers to use an alternate
> mechanism for getting tokens.  There is no such infrastructure in
> the existing libcpp, and creating something new, just for this
> purpose, didn't seem worth it.

The way I have in mind for #pragma-as-tokens, hardly any changes to
cpplib should be necessary at all.

I think you'll like what I wind up implementing.  It's three or four
patches down the road, though; let's not worry about it now.

zw

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: #pragma interface/implementation broken if --enable-mapped-location
  2004-09-29 20:56             ` Zack Weinberg
  2004-09-29 21:57               ` Matt Austern
@ 2004-09-30  7:57               ` Per Bothner
  1 sibling, 0 replies; 13+ messages in thread
From: Per Bothner @ 2004-09-30  7:57 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: gcc

Zack Weinberg wrote:

> I think we're at least a couple weeks away from #pragma-as-tokens.  A
> quickie fix would be fine by me.

I checked one in:
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg02999.html
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2004-09-30  1:27 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-09-24  2:08 #pragma interface/implementation broken if --enable-mapped-location Per Bothner
2004-09-24  2:22 ` Zack Weinberg
2004-09-24  2:43   ` Per Bothner
2004-09-24  4:58     ` Zack Weinberg
2004-09-24 14:17       ` Joseph S. Myers
2004-09-24 21:47         ` Zack Weinberg
2004-09-24 22:15           ` Joseph S. Myers
2004-09-24 22:34             ` Zack Weinberg
2004-09-29 19:24           ` Per Bothner
2004-09-29 20:56             ` Zack Weinberg
2004-09-29 21:57               ` Matt Austern
2004-09-29 22:50                 ` Zack Weinberg
2004-09-30  7:57               ` Per Bothner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).