> Sorry, I'm still lost. If the identifier is the UTF-8 character MICRO
> SIGN (code 00B5), do you generate the same UTF-8 character on output,
> or do you mangle it as if the user had typed `\u00b5'?

Suppose I have a

  class µ{
    µ(); //This should read MICRO SIGN
  };

Then, the compiler tests at installation time whether the assembler on
the system is 8-bit-clean. If it is, the constructor is mangled as

  __2\302\265v

If the assembler does not support 8-bit symbols, it is mangled as

  __U5_00b5

This is what jc1 currently does.

> echo ab | tr 'ab' '\123\456'

Thanks, this looks good.

> OK, so then there's no problem: C++ _does_ distinguish between
> non-ASCII digits and letters.

Right. It just doesn't distinguish between non-ASCII digits and
non-ASCII non-alphanumerics :-) That's why no predicate function was
needed.

Regards,
Martin
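
For illustration only, the two encodings of the class name described in
the message can be sketched as below. This is not the compiler's actual
code: the function names `decode_utf8` and `encode_class_name` are
hypothetical, and the rules are inferred from the single MICRO SIGN
example (length-prefixed raw UTF-8 bytes when the assembler is
8-bit-clean, otherwise `U`, a length, and `_` plus four lowercase hex
digits per non-ASCII character). How a real compiler handles code points
above FFFF, or a literal underscore followed by hex digits, is not
covered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Decode a well-formed UTF-8 string into code points.
// Minimal sketch: no validation of malformed input.
std::vector<std::uint32_t> decode_utf8(const std::string& s) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < s.size();) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes that follow
        if (c < 0x80)             { cp = c;        extra = 0; }
        else if ((c >> 5) == 0x6) { cp = c & 0x1F; extra = 1; }
        else if ((c >> 4) == 0xE) { cp = c & 0x0F; extra = 2; }
        else                      { cp = c & 0x07; extra = 3; }
        ++i;
        for (; extra > 0 && i < s.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(s[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}

// Hypothetical helper: encode only the class-name part of a mangled
// symbol, in the two ways the message describes.
std::string encode_class_name(const std::string& utf8_name,
                              bool eight_bit_clean) {
    if (eight_bit_clean) {
        // Byte length followed by the raw UTF-8 bytes, e.g. 2\302\265.
        return std::to_string(utf8_name.size()) + utf8_name;
    }
    // Assumed escape form: ASCII passes through, every other code point
    // becomes '_' plus at least four lowercase hex digits.
    std::string escaped;
    for (std::uint32_t cp : decode_utf8(utf8_name)) {
        if (cp < 0x80) {
            escaped += static_cast<char>(cp);
        } else {
            char buf[16];
            std::snprintf(buf, sizeof buf, "_%04x",
                          static_cast<unsigned>(cp));
            escaped += buf;
        }
    }
    // 'U' marks the escaped form; the length counts the escaped text.
    return "U" + std::to_string(escaped.size()) + escaped;
}
```

Under these assumptions, `encode_class_name("\xC2\xB5", true)` yields
the bytes `2\302\265` and `encode_class_name("\xC2\xB5", false)` yields
`U5_00b5`, which are the name parts of the two symbols quoted in the
message.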