From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-131905-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 28671 invoked by alias); 22 Feb 2005 02:22:54 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 27906 invoked by alias); 22 Feb 2005 02:22:49 -0000
Date: Tue, 22 Feb 2005 11:24:00 -0000
Message-ID: <20050222022249.27905.qmail@sourceware.org>
From: "joseph at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20030127145600.9449.rearnsha@arm.com>
References: <20030127145600.9449.rearnsha@arm.com>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug preprocessor/9449] UCNs not recognized in identifiers (c++/c99)
X-Bugzilla-Reason: CC
X-SW-Source: 2005-02/txt/msg02585.txt.bz2
List-Id: <gcc-bugs.sourceware.org>


------- Additional Comments From joseph at codesourcery dot com  2005-02-22 02:22 -------
Subject: Re: UCNs not recognized in identifiers (c++/c99)

On Mon, 21 Feb 2005, neil at daikokuya dot co dot uk wrote:

> jsm28 at gcc dot gnu dot org wrote:-
> 
> > * The greedy algorithm applies for lexing UCNs: for example,
> > a\U0000000z is three preprocessing tokens {a}{\}{U0000000z} (and
> > shouldn't get a diagnostic on lexing, presuming macros are defined
> > such that the eventual token sequence is valid).
> 
> I'm not sure I agree with this: it seems to be unnecessary extra
> work.  Further, I suspect the user would benefit more from being told
> that he entered an ill-formed UCN than from the front end complaining,
> seemingly at random, about an unexpected backslash.
> 
> The only case where you wouldn't get a syntax error from the
> front end, or an invalid escape in a literal, is with -E.  I'm
> not sure lexing to the letter of the standard is worthwhile in
> this case, as the standard doesn't discuss -E.
> 
> If you have an example where a compiled program is acceptable
> with multiple lexing tokens then I would agree with you.

#define a b(
#define b(x) q
int a\U0000000z );

With greedy lexing the last line is {int}{a}{\}{U0000000z}{)}{;}.  The 
macro a expands to b( and the following tokens up to the ) become the 
(unused) argument of b, so the line preprocesses to "int q;", which 
compiles.

Greedy lexing is what the standard applies to every other token type, 
and I don't think it makes sense for UCNs to differ.  _cpp_valid_ucn 
would need changing so that, for an incomplete UCN in an identifier, it 
returns quietly instead of giving an error.
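
As a rough illustration of that behaviour, here is a minimal sketch of 
an identifier lexer that backs off quietly when a UCN is incomplete.  It 
is not cpplib code: ucn_length, pp_token_length and lex_line are 
hypothetical names made up for this example, every non-identifier 
character is treated as a one-character token, and no check is made that 
the UCN names a character actually permitted in identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not GCC's _cpp_valid_ucn): return the length of a
   complete UCN (\uXXXX or \UXXXXXXXX) starting at s, or 0 if there is
   none.  It never diagnoses; the caller just lexes something else.  */
size_t ucn_length(const char *s)
{
    size_t digits, i;

    if (s[0] != '\\' || (s[1] != 'u' && s[1] != 'U'))
        return 0;
    digits = (s[1] == 'u') ? 4 : 8;
    for (i = 0; i < digits; i++)
        if (!isxdigit((unsigned char)s[2 + i]))
            return 0;   /* incomplete UCN: back off quietly */
    return 2 + digits;
}

/* Length of the preprocessing token at s, which must not point at
   white space or the terminating NUL.  Simplification: anything that
   does not start an identifier is a one-character token.  */
size_t pp_token_length(const char *s)
{
    size_t n = 0, u;

    if (isalpha((unsigned char)s[0]) || s[0] == '_' || ucn_length(s) > 0) {
        for (;;) {
            if (isalnum((unsigned char)s[n]) || s[n] == '_')
                n++;
            else if ((u = ucn_length(s + n)) > 0)
                n += u;     /* a complete UCN extends the identifier */
            else
                break;      /* longest identifier found */
        }
        return n;
    }
    return 1;
}

/* Write the tokens of s into out as "{tok}{tok}...", stopping early if
   the next token would not fit in outsize bytes.  */
void lex_line(const char *s, char *out, size_t outsize)
{
    size_t used = 0;

    assert(outsize > 0);
    while (*s != '\0') {
        size_t n;

        if (isspace((unsigned char)*s)) {
            s++;
            continue;
        }
        n = pp_token_length(s);
        if (used + n + 3 > outsize)     /* '{', '}' and the final NUL */
            break;
        out[used++] = '{';
        memcpy(out + used, s, n);
        used += n;
        out[used++] = '}';
        s += n;
    }
    out[used] = '\0';
}
```

Calling lex_line on the declaration above (written "int a\\U0000000z );" 
as a C string) should give {int}{a}{\}{U0000000z}{)}{;}, while a 
complete UCN such as the one in a\u00c1b stays inside a single 
identifier token.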


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449