public inbox for gcc-bugs@sourceware.org
* [Bug c++/33415]  New: Can't compile .cpp file with UTF-8 BOM.
@ 2007-09-13 10:04 huzheng_001 at 163 dot com
  2007-09-14  4:12 ` [Bug preprocessor/33415] " bangerth at dealii dot org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: huzheng_001 at 163 dot com @ 2007-09-13 10:04 UTC (permalink / raw)
  To: gcc-bugs

I need to port my project to VS2005, and the source code contains some UTF-8
strings that are not suitable for representing with escapes, so I have to add a
UTF-8 BOM to make VS2005 recognize them. But after I added the UTF-8 BOM, gcc
can't compile the file anymore; even with -finput-charset=UTF-8, it still
reports an error about \357 \273 \277.
Can you fix this problem?

Escaping is troublesome because there are too many such strings, and it makes
the source code unreadable.
VS2005 requires the UTF-8 BOM, while gcc currently can't accept it.

Thank you!


-- 
           Summary: Can't compile .cpp file with UTF-8 BOM.
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: huzheng_001 at 163 dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
@ 2007-09-14  4:12 ` bangerth at dealii dot org
  2007-09-14  9:28 ` pinskia at gcc dot gnu dot org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: bangerth at dealii dot org @ 2007-09-14  4:12 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #1 from bangerth at dealii dot org  2007-09-14 04:12 -------
Please attach a testcase. See
  http://gcc.gnu.org/bugs.html
for more information.
W.


-- 

bangerth at dealii dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bangerth at dealii dot org
             Status|UNCONFIRMED                 |WAITING


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
  2007-09-14  4:12 ` [Bug preprocessor/33415] " bangerth at dealii dot org
@ 2007-09-14  9:28 ` pinskia at gcc dot gnu dot org
  2008-04-16 20:38 ` tromey at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2007-09-14  9:28 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #2 from pinskia at gcc dot gnu dot org  2007-09-14 09:28 -------
Actually, I already know this is not handled.  In fact, none of the BOMs are
handled.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-09-14 09:28:32
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
  2007-09-14  4:12 ` [Bug preprocessor/33415] " bangerth at dealii dot org
  2007-09-14  9:28 ` pinskia at gcc dot gnu dot org
@ 2008-04-16 20:38 ` tromey at gcc dot gnu dot org
  2008-04-16 21:30 ` tromey at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: tromey at gcc dot gnu dot org @ 2008-04-16 20:38 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from tromey at gcc dot gnu dot org  2008-04-16 20:37 -------
I think some BOMs will be handled by iconv.
In particular I tried UTF-16 and this seemed to work ok.

UTF-8 is a special problem in two ways.  First, glibc's iconv does not
appear to recognize the UTF-8 BOM.

And, even if it did, we special-case UTF-8 (at least on non-EBCDIC hosts).

This could be fixed in files.c without too much difficulty (it makes a few
inconvenient assumptions), except that files.c does not know the name of the
source charset.
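
A minimal sketch of the BOM distinction discussed in this comment, assuming the
input is available as a raw byte buffer before any charset conversion. The names
detect_bom and bom_kind are hypothetical and are not part of libcpp or iconv;
only the UTF-8 and UTF-16 marks mentioned in this thread are covered.

```c
#include <stddef.h>

/* Kinds of byte order mark this sketch recognizes. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16_BE, BOM_UTF16_LE };

/* Inspect the start of buf and report which BOM, if any, is present.
   *bom_len receives the number of bytes the mark occupies (0 if none). */
enum bom_kind detect_bom(const unsigned char *buf, size_t len, size_t *bom_len)
{
    *bom_len = 0;

    /* UTF-8 BOM: EF BB BF, which GCC's diagnostic printed in octal
       as \357 \273 \277. */
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return BOM_UTF8;
    }

    /* UTF-16 big-endian BOM: FE FF. */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return BOM_UTF16_BE;
    }

    /* UTF-16 little-endian BOM: FF FE. */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return BOM_UTF16_LE;
    }

    return BOM_NONE;
}
```

A caller that gets BOM_UTF8 back could advance its read pointer by bom_len
before handing the bytes on, since (as noted above) the UTF-8 path does not go
through a converter that would consume the mark.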


-- 

tromey at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tromey at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
                   ` (2 preceding siblings ...)
  2008-04-16 20:38 ` tromey at gcc dot gnu dot org
@ 2008-04-16 21:30 ` tromey at gcc dot gnu dot org
  2008-04-21 14:03 ` tromey at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: tromey at gcc dot gnu dot org @ 2008-04-16 21:30 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from tromey at gcc dot gnu dot org  2008-04-16 21:29 -------
Testing a patch.


-- 

tromey at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |tromey at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2007-09-14 09:28:32         |2008-04-16 21:29:21
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
                   ` (4 preceding siblings ...)
  2008-04-21 14:03 ` tromey at gcc dot gnu dot org
@ 2008-04-21 14:03 ` tromey at gcc dot gnu dot org
  2009-06-14 23:03 ` jsm28 at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: tromey at gcc dot gnu dot org @ 2008-04-21 14:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from tromey at gcc dot gnu dot org  2008-04-21 14:02 -------
Subject: Bug 33415

Author: tromey
Date: Mon Apr 21 14:02:00 2008
New Revision: 134507

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134507
Log:
libcpp
        PR libcpp/33415:
        * charset.c (_cpp_convert_input): Add buffer_start argument.
        Ignore UTF-8 BOM if seen.
        * internal.h (_cpp_convert_input): Add argument.
        * files.c (struct _cpp_file) <buffer_start>: New field.
        (destroy_cpp_file): Free buffer_start, not buffer.
        (_cpp_pop_file_buffer): Likewise.
        (read_file_guts): Update.
gcc/testsuite
        PR libcpp/33415:
        * gcc.dg/cpp/pr33415.c: New file.

Added:
    trunk/gcc/testsuite/gcc.dg/cpp/pr33415.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/libcpp/ChangeLog
    trunk/libcpp/charset.c
    trunk/libcpp/files.c
    trunk/libcpp/internal.h
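
The log above adds a buffer_start field and frees it instead of buffer. A
plausible reading is that once the BOM is skipped, the pointer the lexer reads
from no longer matches the pointer returned by the allocator. The following is
a sketch of that idea only, under that assumption; struct src_file,
src_file_load and src_file_destroy are hypothetical names and do not reproduce
the code in revision 134507.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a source-file record; this is NOT libcpp's
   struct _cpp_file. */
struct src_file {
    unsigned char *buffer_start; /* pointer returned by malloc; pass this to free */
    unsigned char *buffer;       /* first byte the lexer should see */
    size_t len;                  /* number of bytes available from buffer */
};

/* Copy data into a fresh allocation and skip a leading UTF-8 BOM, if any.
   Returns 0 on success, -1 on allocation failure. */
int src_file_load(struct src_file *f, const unsigned char *data, size_t len)
{
    f->buffer_start = malloc(len ? len : 1);
    if (f->buffer_start == NULL)
        return -1;
    memcpy(f->buffer_start, data, len);

    f->buffer = f->buffer_start;
    f->len = len;

    /* Ignore the UTF-8 BOM (EF BB BF) by advancing past it. */
    if (len >= 3 && f->buffer[0] == 0xEF && f->buffer[1] == 0xBB
        && f->buffer[2] == 0xBF) {
        f->buffer += 3;
        f->len -= 3;
    }
    return 0;
}

/* Release the allocation.  free() must receive the original pointer,
   not the one that may have been advanced past the BOM. */
void src_file_destroy(struct src_file *f)
{
    free(f->buffer_start);
    f->buffer_start = NULL;
    f->buffer = NULL;
    f->len = 0;
}
```

Keeping two pointers avoids a memmove of the whole file just to drop three
bytes, at the cost of having to remember which pointer owns the memory.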


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
                   ` (3 preceding siblings ...)
  2008-04-16 21:30 ` tromey at gcc dot gnu dot org
@ 2008-04-21 14:03 ` tromey at gcc dot gnu dot org
  2008-04-21 14:03 ` tromey at gcc dot gnu dot org
  2009-06-14 23:03 ` jsm28 at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: tromey at gcc dot gnu dot org @ 2008-04-21 14:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from tromey at gcc dot gnu dot org  2008-04-21 14:02 -------
Fixed on trunk.
As I doubt this will be back-ported to 4.3.x, I am closing the bug.


-- 

tromey at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.4.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415



* [Bug preprocessor/33415] Can't compile .cpp file with UTF-8 BOM.
  2007-09-13 10:04 [Bug c++/33415] New: Can't compile .cpp file with UTF-8 BOM huzheng_001 at 163 dot com
                   ` (5 preceding siblings ...)
  2008-04-21 14:03 ` tromey at gcc dot gnu dot org
@ 2009-06-14 23:03 ` jsm28 at gcc dot gnu dot org
  6 siblings, 0 replies; 8+ messages in thread
From: jsm28 at gcc dot gnu dot org @ 2009-06-14 23:03 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from jsm28 at gcc dot gnu dot org  2009-06-14 23:03 -------
*** Bug 40441 has been marked as a duplicate of this bug. ***


-- 

jsm28 at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dh dot liu at msa dot hinet
                   |                            |dot net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33415


