From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 28671 invoked by alias); 22 Feb 2005 02:22:54 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 27906 invoked by alias); 22 Feb 2005 02:22:49 -0000
Date: Tue, 22 Feb 2005 11:24:00 -0000
Message-ID: <20050222022249.27905.qmail@sourceware.org>
From: "joseph at codesourcery dot com"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20030127145600.9449.rearnsha@arm.com>
References: <20030127145600.9449.rearnsha@arm.com>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug preprocessor/9449] UCNs not recognized in identifiers (c++/c99)
X-Bugzilla-Reason: CC
X-SW-Source: 2005-02/txt/msg02585.txt.bz2
List-Id:

------- Additional Comments From joseph at codesourcery dot com 2005-02-22 02:22 -------
Subject: Re: UCNs not recognized in identifiers (c++/c99)

On Mon, 21 Feb 2005, neil at daikokuya dot co dot uk wrote:

> jsm28 at gcc dot gnu dot org wrote:-
>
> > * The greedy algorithm applies for lexing UCNs: for example,
> >   a\U0000000z is three preprocessing tokens {a}{\}{U0000000z} (and
> >   shouldn't get a diagnostic on lexing, presuming macros are defined
> >   such that the eventual token sequence is valid).
>
> I'm not sure I agree with this: it would seem to be unnecessary
> extra work; further I suspect the user would benefit from it being
> pointed out he entered an ill-formed UCN rather than something random
> from the front end complaining about an unexpected backslash.
>
> The only case where you wouldn't get a syntax error from the
> front end, or an invalid escape in a literal, is with -E. I'm
> not sure lexing to the letter of the standard is worthwhile in
> this case, as the standard doesn't discuss -E.
>
> If you have an example where a compiled program is acceptable
> with multiple lexing tokens then I would agree with you.

#define a b(
#define b(x) q
int a\U0000000z );

Greedy lexing is the standard as applied for other token types. I don't
think a difference here makes sense. _cpp_valid_ucn would need changing
so it doesn't give an error for incomplete UCNs in identifiers but instead
returns quietly.

-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449
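
For illustration only, here is a minimal sketch of the greedy behaviour discussed above: an identifier lexer that extends the token over complete UCNs and stops quietly, with no diagnostic, at an incomplete one. This is not cpplib code and does not reflect how _cpp_valid_ucn is actually written; the names tok, ucn_len, next_tok and lex_all are hypothetical. It also assumes plain ASCII input and skips the check that a UCN's value is permitted in identifiers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch (not cpplib's). */
enum tok_kind { TOK_IDENT, TOK_OTHER };

struct tok {
    enum tok_kind kind;
    const char *start;   /* points into the source buffer */
    size_t len;
};

/* If p starts a complete UCN (\uXXXX or \UXXXXXXXX), return its length
   (6 or 10).  Otherwise return 0 quietly, without any diagnostic. */
static size_t ucn_len(const char *p)
{
    size_t ndigits, i;

    if (p[0] != '\\' || (p[1] != 'u' && p[1] != 'U'))
        return 0;
    ndigits = (p[1] == 'u') ? 4 : 8;
    for (i = 0; i < ndigits; i++)
        if (!isxdigit((unsigned char)p[2 + i]))
            return 0;    /* incomplete UCN (also stops at the NUL) */
    return 2 + ndigits;
}

static int is_ident_start(int c) { return isalpha(c) || c == '_'; }
static int is_ident_char(int c)  { return isalnum(c) || c == '_'; }

/* Lex one token at *pp and advance *pp.  Returns 0 at end of input. */
static int next_tok(const char **pp, struct tok *t)
{
    const char *p = *pp;
    size_t n;

    assert(p != NULL);
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    if (*p == '\0') {
        *pp = p;
        return 0;
    }
    t->start = p;
    if (is_ident_start((unsigned char)*p) || ucn_len(p) > 0) {
        /* Greedy: take the longest run of identifier characters
           and complete UCNs, then stop. */
        for (;;) {
            if (is_ident_char((unsigned char)*p))
                p++;
            else if ((n = ucn_len(p)) > 0)
                p += n;
            else
                break;
        }
        t->kind = TOK_IDENT;
    } else {
        /* Anything else, including a stray backslash, is one token. */
        p++;
        t->kind = TOK_OTHER;
    }
    t->len = (size_t)(p - t->start);
    *pp = p;
    return 1;
}

/* Lex up to max tokens from src into out; return how many were found. */
static size_t lex_all(const char *src, struct tok *out, size_t max)
{
    size_t n = 0;

    while (n < max && next_tok(&src, &out[n]))
        n++;
    return n;
}
```

With this sketch, lex_all applied to the text a\U0000000z yields three tokens, {a}{\}{U0000000z}, whereas a well-formed spelling such as a\u00c1b stays a single identifier. Any complaint about the stray backslash is then left to a later stage, which only sees it if macro expansion has not already consumed it.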