From: "joseph at codesourcery dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug preprocessor/9449] UCNs not recognized in identifiers (c++/c99)
Date: Thu, 16 Dec 2004 12:33:00 -0000
Message-ID: <20041216123320.991.qmail@sourceware.org>
In-Reply-To: <20030127145600.9449.rearnsha@arm.com>


------- Additional Comments From joseph at codesourcery dot com  2004-12-16 12:33 -------

On Thu, 16 Dec 2004, zack at codesourcery dot com wrote:

> Because of the ABI implications, I consider it completely unacceptable

Which ABI implications?

(a) It isn't explicitly stated that different UCNs designating the same 
character are equivalent to each other (and to that character) in 
identifiers, but I don't think there's any real doubt that they are meant 
to be equivalent.

(b) There is no normalisation, but I'm confident that the answer from WG14,
if this is queried, would be that the standard is correct: by design it
normatively references ISO 10646 (not Unicode), which doesn't include the
normalisation definitions of UAX #15, and implementation of the standard
is not meant to involve large external tables.  If there are cases of
ambiguity, a -Wnfc option (default on) to warn for identifiers not in NFC
(or indeed -Wnfkc, default on, for identifiers not in NFKC) would draw
users' attention to doubtful identifiers.  (TR 10176 expressly notes the
problems of ambiguity of appearance of entirely different characters even
without combining characters, says that language standards need not
provide for normalisation if they allow combining characters, and excludes
most combining characters where precombined characters are available, for
the specific purpose of avoiding alternate representations of
identifiers.)

(c) Though we could do what we want with extended characters (as opposed 
to UCNs) in source files in phase 1, it seems safest to err on the side of 
rejecting all extended characters that wouldn't be accepted as UCNs, 
rather than e.g. applying NFC, to avoid giving identifiers with such 
characters a meaning which might then need to be preserved in future.

(d) There are genuine ABI issues with how extended characters are 
represented in object files, but I think those need to be resolved by 
selecting between UTF-8 and mangling (default UTF-8) based on target 
configurations rather than on the capabilities of the assembler and linker 
in use, and by getting an explicit statement about encoding put in the ELF 
specification.
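
As an illustration only, points (a), (b) and (d) can be sketched in a few
lines of Python.  This is not GCC code: the function names, the regular
expression and the "_UXXXXXXXX" mangling are all hypothetical, and the sketch
does not check whether a character is permitted in identifiers at all.  It
only shows UCN spellings being compared by the character they designate
without normalisation, a -Wnfc style check, and a per-target choice between
UTF-8 and an ASCII mangling.

```python
import re
import unicodedata

# Matches \uXXXX or \UXXXXXXXX as spelled in C99/C++ source text.
_UCN = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_ucns(spelling: str) -> str:
    """Replace each UCN in an identifier spelling with the character it names."""
    return _UCN.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), spelling)


def same_identifier(a: str, b: str) -> bool:
    """Point (a): spellings are equivalent if they designate the same characters.

    No normalisation is applied, matching point (b).
    """
    return decode_ucns(a) == decode_ucns(b)


def nfc_warning(spelling: str):
    """Point (b): hypothetical -Wnfc style check; returns a message or None."""
    ident = decode_ucns(spelling)
    if unicodedata.normalize("NFC", ident) != ident:
        return f"identifier {spelling!r} is not in NFC"
    return None


def symbol_name(spelling: str, use_utf8: bool = True) -> bytes:
    """Point (d): object-file name as UTF-8 or as an ASCII mangling.

    The choice is a parameter here, standing in for a target configuration.
    The _UXXXXXXXX scheme is invented for this sketch and is not GCC's.
    """
    ident = decode_ucns(spelling)
    if use_utf8:
        return ident.encode("utf-8")
    mangled = "".join(c if ord(c) < 128 else f"_U{ord(c):08X}" for c in ident)
    return mangled.encode("ascii")
```

With these definitions, same_identifier(r"caf\u00e9", r"caf\U000000e9") is
true, while r"cafe\u0301" is treated as a different identifier and is the
kind of spelling nfc_warning would flag.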



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9449

