public inbox for gcc-bugs@sourceware.org
* [Bug other/28315] gcc doesn't use locale for default input charset
       [not found] <bug-28315-4@http.gcc.gnu.org/bugzilla/>
@ 2013-03-29 13:17 ` lacos at caesar dot elte.hu
  2013-04-02  8:47 ` bonzini at gnu dot org
  1 sibling, 0 replies; 2+ messages in thread
From: lacos at caesar dot elte.hu @ 2013-03-29 13:17 UTC
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28315

Laszlo Ersek <lacos at caesar dot elte.hu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org,
                   |                            |lacos at caesar dot elte.hu

--- Comment #1 from Laszlo Ersek <lacos at caesar dot elte.hu> 2013-03-29 13:17:21 UTC ---
gcc has defaulted to UTF-8 rather than the locale's codeset in
_cpp_default_encoding() [libcpp/charset.c] since the following 2004 hunk:

    http://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d856c8a6#patch25

(
  The default encoding is selected for both "input_charset" (overrideable
  with -finput-charset) and "narrow_charset" (overrideable with
  -fexec-charset):

    cpp_create_reader() [libcpp/init.c]
      ~ narrow_charset = _cpp_default_encoding()
      ~ input_charset = _cpp_default_encoding()

  The "overrides" are implemented in c_common_handle_option()
  [gcc/c-family/c-opts.c].
)
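A minimal sketch of the default-then-override flow described in the parenthesis above. It is not the libcpp code: the struct and the `*_sketch` function names are hypothetical, and the option parsing is reduced to a prefix match for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the reader state; only the two charset
   fields named in the message are modelled. */
struct reader_sketch {
    const char *narrow_charset;  /* overridable with -fexec-charset  */
    const char *input_charset;   /* overridable with -finput-charset */
};

/* Models the current behaviour: a hardwired UTF-8 default. */
const char *default_encoding_sketch(void)
{
    return "UTF-8";
}

/* Models reader creation: both charsets start from the same default. */
void create_reader_sketch(struct reader_sketch *r)
{
    assert(r != NULL);
    r->narrow_charset = default_encoding_sketch();
    r->input_charset = default_encoding_sketch();
}

/* Models the option handler: returns 1 if ARG was one of the two
   charset options and has been applied, 0 otherwise. */
int handle_option_sketch(struct reader_sketch *r, const char *arg)
{
    static const char in_opt[] = "-finput-charset=";
    static const char ex_opt[] = "-fexec-charset=";

    assert(r != NULL && arg != NULL);
    if (strncmp(arg, in_opt, sizeof in_opt - 1) == 0) {
        r->input_charset = arg + (sizeof in_opt - 1);
        return 1;
    }
    if (strncmp(arg, ex_opt, sizeof ex_opt - 1) == 0) {
        r->narrow_charset = arg + (sizeof ex_opt - 1);
        return 1;
    }
    return 0;
}
```

In this model, passing "-finput-charset=ISO-8859-2" changes only the input charset; the narrow (execution) charset keeps the default.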

Considering the encodings of source files "in the wild" that gcc has been
used to compile in the last 8+ years (i.e. while the "&& 0" that disables
the locale lookup has been in place):

- UTF-8 (of which 7-bit ASCII is a subset) worked.

- Any non-UTF-8 encoding that used the MSB (e.g. ISO-8859-2) required the
  -finput-charset option.

  People who would have originally wanted gcc to take that codeset from the
  locale were probably *developing* the source code in question, hence they
  could easily add the -finput-charset to their makefiles.
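To illustrate why 8-bit sources cannot pass through a UTF-8 default unnoticed, here is a small well-formedness check. It is a sketch written for this note, not code from libcpp; the function name is hypothetical. Pure ASCII is accepted, while a lone high-bit byte such as 0xE1 (the letter "á" in ISO-8859-2) is rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if buf[0..len) is well-formed UTF-8: correct lead and
   continuation bytes, no overlong forms, no surrogates, and no code
   point above U+10FFFF. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;        /* continuation bytes expected         */
        unsigned long cp;   /* code point being decoded            */
        unsigned long min;  /* smallest value legal at this length */

        if (b < 0x80) {     /* 7-bit ASCII: always valid */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;   /* overlong, out of range, or surrogate */

        i += need + 1;
    }
    return true;
}
```

The bytes 0x68 0xE1 0x7A ("ház" in ISO-8859-2) fail the check because 0xE1 announces a three-byte sequence that never follows, whereas the UTF-8 spelling 0x68 0xC3 0xA1 0x7A passes.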

Much of the world must have migrated to UTF-8-encoded locales by now.
Reverting the "&& 0" would:

- not affect people with such a distro-default locale who build UTF-8 /
  ASCII sources: their locale codeset matches the current hardwired default,

- not affect people building sources with non-UTF-8 8-bit codesets (e.g.
  ISO-8859-2), since those projects already have to use the -finput-charset
  option in their makefiles,

- affect people who have stuck to their 7-bit ASCII, or non-UTF-8 8-bit
  codesets in their locales, and compile real UTF-8 sources.

People in the last group (which includes me :)) would be forced either (a)
to modify their locale when building such sources as end-users, or (b) to
find out about -finput-charset=UTF-8 and pass it via (b1) Makefile hacking
or (b2) ./configure settings (env vars, or command line options).

I think that's unreasonable; building random projects from the tubes would
break for this small but existent group of users.
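The three cases above can be restated as a small decision function. This is only a model of the argument, with hypothetical names: `option` stands for the value of -finput-charset (NULL if not given), `locale_codeset` for what the locale would report, and codeset names are compared with plain strcmp, so aliases such as "utf8" are not recognised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Input charset gcc would end up using.  USE_LOCALE == false models
   the current hardwired default; USE_LOCALE == true models the
   behaviour after reverting the "&& 0". */
const char *effective_input_charset(const char *option,
                                    const char *locale_codeset,
                                    bool use_locale)
{
    assert(locale_codeset != NULL);
    if (option != NULL)
        return option;              /* explicit option always wins */
    if (use_locale && *locale_codeset != '\0')
        return locale_codeset;      /* reverted behaviour          */
    return "UTF-8";                 /* current hardwired default   */
}

/* True if reverting the "&& 0" would change the charset used for
   this combination of option and locale. */
bool revert_changes_charset(const char *option, const char *locale_codeset)
{
    const char *before = effective_input_charset(option, locale_codeset, false);
    const char *after = effective_input_charset(option, locale_codeset, true);

    return strcmp(before, after) != 0;
}
```

Only the combination "no option, non-UTF-8 locale" yields true, which corresponds to the last group in the list: a UTF-8 source would then be read with the locale's codeset unless -finput-charset=UTF-8 is added.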

Therefore I suggest keeping the logic as-is and updating the docs instead
("gcc/doc/cppopts.texi"): the description of "-finput-charset" should not
refer to the locale.



* [Bug other/28315] gcc doesn't use locale for default input charset
From: bonzini at gnu dot org @ 2013-04-02  8:47 UTC
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28315

Paolo Bonzini <bonzini at gnu dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #2 from Paolo Bonzini <bonzini at gnu dot org> 2013-04-02 08:47:26 UTC ---
It took me a while to reconstruct what that patch did, and I couldn't
quickly find the discussion about the hunk between me and Zack.

What I did find out was that GCC's configure at the time lacked
AM_LANGINFO_CODESET:

http://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/configure.ac;h=45e5da59b7568804ce3a09551525acb3f43d9689;hb=d856c8a6

Hence that hunk was actually a no-op.
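A sketch of why a missing configure check makes such a guard dead code. It assumes, as an illustration rather than a quote of the GCC source, that the locale lookup is wrapped in a test of a HAVE_LANGINFO_CODESET-style macro that AM_LANGINFO_CODESET would define; the function name and the fallback macro value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_LANGINFO_CODESET
#include <langinfo.h>
#include <locale.h>
#endif

/* Fallback used when no locale codeset is available (assumed value). */
#define SOURCE_CHARSET_SKETCH "UTF-8"

/* If configure never runs the codeset check, HAVE_LANGINFO_CODESET is
   never defined, so the guarded block is not compiled at all and the
   function always returns the fallback.  Appending "&& 0" to the #if
   condition then changes nothing. */
const char *guarded_default_encoding(void)
{
    const char *enc = NULL;

#if defined(HAVE_LANGINFO_CODESET)
    setlocale(LC_CTYPE, "");
    enc = nl_langinfo(CODESET);
#endif
    if (enc == NULL || *enc == '\0')
        enc = SOURCE_CHARSET_SKETCH;

    assert(enc != NULL);
    return enc;
}
```

Built without -DHAVE_LANGINFO_CODESET, the function returns "UTF-8" regardless of the environment's locale settings.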

I agree with Laszlo's analysis, and I am closing this as WONTFIX.

