From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Lance Taylor
To: martin@mira.isdn.cs.tu-berlin.de
Cc: eggert@twinsun.com, brolley@cygnus.com, gcc2@gnu.org, egcs@cygnus.com
Subject: Re: thoughts on martin's proposed patch for GCC and UTF-8
Date: Thu, 10 Dec 1998 07:57:00 -0000
Message-id: <199812101557.KAA00786@subrogation.cygnus.com>
References: <199812100712.IAA00283@mira.isdn.cs.tu-berlin.de>
X-SW-Source: 1998-12/msg00378.html

   Date: Thu, 10 Dec 1998 08:12:20 +0100
   From: Martin von Loewis

   > If the object-code standard is to use UTF-8 names, then I suppose the
   > assembler can convert to UTF-8.

   No. The gas people made it very clear that they consider character
   sets somebody else's problems (i.e. ours).

That is too strong. For hand coded assembler, I can see that there
may be a need for gas to do some character set conversions. Also, if
it is ever possible for an identifier name to include a byte value
which gas will consider to be an operator, then it is clearly
necessary for gas to permit quoting that byte value, and perhaps to do
more general character set conversions.

In general, though, if gcc needs to understand character set issues,
which appears to be the case, and if it can emit identifiers in a
manner which will not confuse gas, then I think it is reasonable for
gcc to emit identifiers as uninterpreted byte sequences, and for gas
to simply pass those identifiers straight through into the object
file.

I can't claim to understand many of the issues here, though.

Several people have mentioned the linker as an issue. To the best of
my knowledge, the linker will permit any byte value except 0 to appear
in an identifier. I don't see why the linker has to change at all for
any character set issues.

Ian