public inbox for gcc-bugs@sourceware.org
* [Bug c++/13258] New: strings used for the extern keyword are being translated by the -fexec-charset option
@ 2003-12-01 14:39 darcypj at us dot ibm dot com
  2003-12-01 14:48 ` [Bug c++/13258] " darcypj at us dot ibm dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: darcypj at us dot ibm dot com @ 2003-12-01 14:39 UTC (permalink / raw)
  To: gcc-bugs


Below are the compiler setup and the compile error messages. It appears that the
"C" and/or "C++" strings in extern statements are being translated by the new
-fexec-charset compiler option, while the compiler still expects 'good old
ASCII' characters when it lexes those statements.

Configured with: /opt/compiler_source/34base/configure --target=s390x-ibm-tpf
--prefix=/opt/tpf-cross-tools --program-prefix=s390x-ibm-tpf-
--enable-languages=c,c++ --with-dwarf2 --disable-threads
--with-binutils=/opt/tpf-cross-tools/bin
--with-libs=/opt/tpf-cross-tools/s390x-ibm-tpf/lib
Thread model: single
gcc version 3.4 20031121 (experimental)

Compile command and error message:
[~/bug]s390x-ibm-tpf-g++ -fexec-charset=IBM1047 -c bugrept.cc -Wall -save-temps
bugrept.cc:1: error: language string `"ÃNN"' not recognized

Note:
"Ã" is the ISO-8859-1 representation of code point \xC3, which is the
letter "C" in IBM1047, and
"ÃNN" is the ISO-8859-1 representation of code points \xC3\x4E\x4E, which
is the string "C++" in IBM1047.
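The byte values in the note can be checked with a small sketch. The helper names below are hypothetical and the table covers only the two characters the testcase needs; it is not a full IBM1047 table and has nothing to do with how GCC performs the conversion.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode 'C' and '+' as IBM1047 (EBCDIC) bytes.
// Only these two characters are covered by this demo table.
std::string to_ibm1047_subset(const std::string& ascii) {
    std::string out;
    for (char ch : ascii) {
        switch (ch) {
            case 'C': out += '\xC3'; break;
            case '+': out += '\x4E'; break;
            default:  assert(false && "character outside the demo subset");
        }
    }
    return out;
}

// Reinterpret raw bytes as ISO-8859-1 and return them as UTF-8, which is
// roughly what a Latin-1 terminal does when it shows the error message.
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            out += static_cast<char>(b);
        } else {
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

Calling latin1_to_utf8(to_ibm1047_subset("C++")) yields the bytes for "ÃNN", since \x4E is the letter "N" in ISO-8859-1.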

Below is the original .cc file source.  (No includes and about 5 lines so I
considered the .ii file not worth shipping.)

==============begin bugrept.cc source========

extern "C++" {  //alternately use extern "C" {

 int testbug (void) {
   return 0;
 }

} //extern block

==============end bugrept.cc source========

-- 
           Summary: strings used for the extern keyword are being translated
                    by the -fexec-charset option
           Product: gcc
           Version: 3.4
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: darcypj at us dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org,zack at codesourcery dot
                    com
 GCC build triplet: s390x-ibm-linux
  GCC host triplet: s390x-ibm-linux
GCC target triplet: s390x-ibm-tpf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: darcypj at us dot ibm dot com @ 2003-12-01 14:48 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From darcypj at us dot ibm dot com  2003-12-01 14:48 -------
Created an attachment (id=5251)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=5251&action=view)
testcase whose source was included inline as attachment

Code to re-create the bug. I was unable to attach it as a file when opening the
bug.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: zack at gcc dot gnu dot org @ 2003-12-01 16:53 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From zack at gcc dot gnu dot org  2003-12-01 16:53 -------
You're correct in your hypothesis.  I would appreciate ideas - the problem is
that the logic converting string constants to the execution character set is not
aware of the language syntax.  You'll see similar problems with __attribute__
directives that take string constant arguments (for instance).



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2003-12-01 16:53:06
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: darcypj at us dot ibm dot com @ 2003-12-01 18:46 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From darcypj at us dot ibm dot com  2003-12-01 18:46 -------
(In reply to comment #2)

The idea I was toying with is to have the lexer not translate strings that fall
into one of these contexts.

For example, when the keyword 'extern' is found, perhaps set a flag so that, if
the next token is a string literal, cpp_interpret_string_notranslate() is called
rather than cpp_interpret_string() for that literal. (If a flag were used, I
assume it would be static and that cpp_interpret_string_notranslate() would
reset it.)

I did see a token lookahead capability in cpplex.c, which suggests an
alternative to a flag: when we hit the extern keyword, check whether the next
token is a string literal and, if so, tell the compiler to treat it as it would
an octal or hex formatted literal and not translate it.

It would take me some more time poking around in the code to come up with an
implementation, if one is even possible, but the basic idea is that this needs
to be done in the lexer because that is where the translation functions are
called.
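
A minimal sketch of the idea, under loud assumptions: Token, TokKind, translate_demo and interpret_strings are hypothetical names for a toy token stream, not cpplib types or functions. The real cpp_interpret_string() and cpp_interpret_string_notranslate() are not called here. The "previous token was extern" state is kept in a local variable rather than a static flag.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class TokKind { Keyword, StringLit, Other };

struct Token {
    TokKind kind;
    std::string text;  // spelling; for string literals, the contents without quotes
};

// Toy stand-in for conversion to the execution charset (IBM1047 subset:
// 'C' -> 0xC3, '+' -> 0x4E; every other byte is passed through unchanged).
std::string translate_demo(const std::string& s) {
    std::string out;
    for (char ch : s) {
        if (ch == 'C') out += '\xC3';
        else if (ch == '+') out += '\x4E';
        else out += ch;
    }
    return out;
}

// Interpret every string literal in the token stream, skipping translation
// for a literal that directly follows the `extern` keyword.
std::vector<std::string> interpret_strings(const std::vector<Token>& toks) {
    std::vector<std::string> out;
    bool after_extern = false;  // local state, no global or static flag
    for (const Token& t : toks) {
        assert(t.kind != TokKind::Keyword || !t.text.empty());
        if (t.kind == TokKind::StringLit)
            out.push_back(after_extern ? t.text : translate_demo(t.text));
        after_extern = (t.kind == TokKind::Keyword && t.text == "extern");
    }
    return out;
}
```

With the tokens for `extern "C++"` followed by an ordinary literal, the first string comes back untouched and the second comes back as IBM1047 bytes.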

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: zack at codesourcery dot com @ 2003-12-01 18:56 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From zack at codesourcery dot com  2003-12-01 18:56 -------
Subject: Re:  strings used for the extern keyword are being
 translated by the -fexec-charset option


That sounds plausible, with a couple caveats:

- it is a design constraint of cpplib that it not use any global variables
- cpplib's lookahead facility is not really designed for use outside
  of the macro expander; it may be possible to use it for this, but it
  probably isn't feasible to use it for __attribute__.

I would be thinking in terms of setting a flag on the token itself.

Another possibility is to let the conversion happen and then
back-convert in the parser when necessary; a new cppcharset.c
interface could be added for this.  Your idea is cleaner, though, if
it can be made to work.
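
The back-conversion alternative could look roughly like the sketch below. It is an illustration only: back_convert_demo, classify_linkage and Linkage are hypothetical names, the inverse table covers just 'C' and '+', and no such cppcharset.c interface is claimed to exist.

```cpp
#include <cassert>
#include <string>

// Hypothetical inverse of a toy IBM1047 conversion: maps 0xC3 -> 'C' and
// 0x4E -> '+'. Returns false if a byte falls outside this demo subset.
bool back_convert_demo(const std::string& exec_bytes, std::string& source) {
    source.clear();
    for (unsigned char b : exec_bytes) {
        switch (b) {
            case 0xC3: source += 'C'; break;
            case 0x4E: source += '+'; break;
            default:   return false;
        }
    }
    // Single-byte charsets on both sides, so the length is preserved.
    assert(source.size() == exec_bytes.size());
    return true;
}

enum class Linkage { C, Cxx, Unknown };

// Parser-side check: the literal has already been converted to the
// execution charset, so convert it back before comparing spellings.
Linkage classify_linkage(const std::string& exec_bytes) {
    std::string source;
    if (!back_convert_demo(exec_bytes, source)) return Linkage::Unknown;
    if (source == "C") return Linkage::C;
    if (source == "C++") return Linkage::Cxx;
    return Linkage::Unknown;
}
```

The cost of this route is that every parser context that compares string spellings has to remember to back-convert, which is presumably why marking the token is described as cleaner.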

If you have time to look into a patch that would be appreciated;
I will get to it eventually but I cannot promise any timeframe.

zw


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: ivan at vmfacility dot fr @ 2004-01-02  9:17 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From ivan at vmfacility dot fr  2004-01-02 09:17 -------
I would also like to mention that this affects the __asm__ (or asm) statement
and its constraints (if the source and exec charsets are different).

Ex:

__asm__ ("LR %0,%1" : "=r" (x) : "r" (y)); 

This complains that the output lacks an "=" constraint. I am also sure the 
resulting assembler output will be garbled.
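
A rough sketch of why the constraint check trips, assuming the constraint string has been translated to IBM1047 before it is examined. The function names are hypothetical, the table covers only '=' and 'r', and the check is a simplification of what a compiler does, not GCC's code.

```cpp
#include <cassert>
#include <string>

// Toy IBM1047 subset: '=' -> 0x7E and 'r' -> 0x99. Other characters are
// outside this demo table.
std::string translate_constraint_demo(const std::string& s) {
    std::string out;
    for (char ch : s) {
        switch (ch) {
            case '=': out += '\x7E'; break;
            case 'r': out += '\x99'; break;
            default:  assert(false && "character outside the demo subset");
        }
    }
    return out;
}

// Simplified output-constraint check: the first character must be '='
// as spelled in the compiler's own (ASCII) character set.
bool has_output_marker(const std::string& constraint) {
    return !constraint.empty() && constraint[0] == '=';
}
```

Here has_output_marker("=r") holds, but it fails on translate_constraint_demo("=r") because the first byte is now \x7E, which is '~' in ASCII.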

This might be similar to the __attribute__ issue.

--Ivan

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug c++/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: darcypj at us dot ibm dot com @ 2004-01-05 13:25 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From darcypj at us dot ibm dot com  2004-01-05 13:25 -------
(In reply to comment #5)

Yes - my attempts at building glibc with translation have shown that text in:

link_warning
versioned_symbol
stub_warning (which I believe is defined using a link_warning)

are also affected, and this causes assembler errors during build because the
translated characters are seen as invalid assembly instructions in the #APP section.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug preprocessor/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: zack at gcc dot gnu dot org @ 2004-01-11  1:04 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From zack at gcc dot gnu dot org  2004-01-11 01:04 -------
Recategorize - this affects C as well, and is an issue with the interface
between cpplib and the various parsers.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c++                         |preprocessor


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug preprocessor/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: echristo at redhat dot com @ 2004-03-18  0:02 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From echristo at redhat dot com  2004-03-18 00:02 -------
This has been fixed in mainline.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258



* [Bug preprocessor/13258] strings used for the extern keyword are being translated by the -fexec-charset option
From: pinskia at gcc dot gnu dot org @ 2004-03-18  0:10 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |3.5.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13258


