From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ian Lance Taylor <ian@cygnus.com>
To: martin@mira.isdn.cs.tu-berlin.de
Cc: eggert@twinsun.com, brolley@cygnus.com, gcc2@gnu.org, egcs@cygnus.com
Subject: Re: thoughts on martin's proposed patch for GCC and UTF-8
Date: Thu, 10 Dec 1998 07:57:00 -0000
Message-id: <199812101557.KAA00786@subrogation.cygnus.com>
References: <199812100712.IAA00283@mira.isdn.cs.tu-berlin.de>
X-SW-Source: 1998-12/msg00378.html

   Date: Thu, 10 Dec 1998 08:12:20 +0100
   From: Martin von Loewis <martin@mira.isdn.cs.tu-berlin.de>

   > If the object-code standard is to use UTF-8 names, then I suppose the
   > assembler can convert to UTF-8.

   No. The gas people made it very clear that they consider character sets
   somebody else's problems (i.e. ours).

That is too strong.  For hand-coded assembler, I can see that there
may be a need for gas to do some character set conversions.  Also, if
it is ever possible for an identifier name to include a byte value
which gas will consider to be an operator, then it is clearly
necessary for gas to permit quoting that byte value, and perhaps to do
more general character set conversions.

In general, though, if gcc needs to understand character set issues,
which appears to be the case, and if it can emit identifiers in a
manner which will not confuse gas, then I think it is reasonable for
gcc to emit identifiers as uninterpreted byte sequences, and for gas
to simply pass those identifiers straight through into the object
file.
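
As a rough illustration of "a manner which will not confuse gas", here
is a minimal sketch in C of a check the compiler side could make before
emitting a name unquoted.  The function names are hypothetical, and the
set of accepted bytes is an assumption made only for this example; the
real set depends on the assembler and the target.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumption for illustration: the assembler accepts ASCII letters,
   digits, '_', '.', and '$' in names, and treats every byte >= 0x80
   as an ordinary name byte.  The real set depends on the target. */
static int byte_is_plain(unsigned char c)
{
    if (c >= 0x80)
        return 1;
    return isalnum(c) || c == '_' || c == '.' || c == '$';
}

/* Hypothetical check: returns 1 if the identifier could be emitted
   unquoted under the assumption above, 0 if it is empty or if some
   byte would need quoting. */
int ident_needs_no_quoting(const unsigned char *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        if (!byte_is_plain(s[i]))
            return 0;
    return len > 0;
}
```

Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so under
this assumption a UTF-8 name can only collide with an operator if it
contains that operator as a plain ASCII character.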

I can't claim to understand many of the issues here, though.

Several people have mentioned the linker as an issue.  To the best of
my knowledge, the linker will permit any byte value except 0 to appear
in an identifier.  I don't see why the linker has to change at all for
any character set issues.
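
To make that concrete, here is a minimal sketch in C, assuming a symbol
table that stores names as NUL-terminated strings.  Both helper names
are hypothetical and are not taken from any linker.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a name can be stored as a NUL-terminated
   symbol if it is non-empty and none of its len bytes is 0. */
int name_is_storable(const unsigned char *name, size_t len)
{
    return len > 0 && memchr(name, 0, len) == NULL;
}

/* Hypothetical helper: two symbols match when their bytes match.
   No character set knowledge is involved. */
int names_match(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

A UTF-8 name contains a 0 byte only if it contains the character
U+0000, so such names pass through this kind of comparison unchanged.
The same character written in two different encodings will not match,
which is why the encoding has to be settled before the name reaches
the object file.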

Ian