[Bug c/67224] New: UTF-8 support for identifier names in GCC

public inbox for gcc-bugs@sourceware.org

* [Bug c/67224] New: UTF-8 support for identifier names in GCC
@ 2015-08-14 22:37 ejolson at unr dot edu
  2015-08-14 22:41 ` [Bug c/67224] " ejolson at unr dot edu
                   ` (23 more replies)
  0 siblings, 24 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-14 22:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

            Bug ID: 67224
           Summary: UTF-8 support for identifier names in GCC
           Product: gcc
           Version: 5.2.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ejolson at unr dot edu
  Target Milestone: ---

Created attachment 36187
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36187&action=edit
One line patch to add C99 UTF-8 support in identifiers to gcc

In response to FAQ

> What is the status of adding the UTF-8 support for identifier names in GCC?

and the request

> Support for actual UTF-8 in identifiers is still pending (please contribute!)

my observation is that UTF-8 in identifiers is easy to add to gcc by changing
one line in the cpp preprocessor, provided a recent version of iconv is
installed on the system.  The patch is attached and has been tested for about 6
months.  More information about this patch as well as unrelated information
about getting cilkrts to work on ARM is available at 

  https://www.raspberrypi.org/forums/viewtopic.php?p=802657


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
@ 2015-08-14 22:41 ` ejolson at unr dot edu
  2015-08-15  6:35 ` pinskia at gcc dot gnu.org
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-14 22:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #1 from Eric <ejolson at unr dot edu> ---
To check that the installed version of iconv has C99 support, type

  $ iconv --list | grep "C99"
  C99
  $

which means that iconv is recent enough.
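
The same check can be made from a program rather than from the shell. The following is a minimal sketch, assuming only the POSIX iconv_open() and iconv_close() calls; the helper name iconv_supports is made up for illustration and is not part of the attached patch.

```c
#include <iconv.h>

/* Hypothetical helper: return 1 if the iconv implementation linked into
   this program accepts a conversion from `from` to `to`, else 0. */
int iconv_supports(const char *to, const char *from)
{
    iconv_t cd = iconv_open(to, from);

    if (cd == (iconv_t)-1)
        return 0;               /* conversion not available */
    iconv_close(cd);
    return 1;
}
```

Calling iconv_supports("C99", "UTF-8") reports whether the iconv that the program is actually linked against knows the "C99" encoding. That may differ from what the iconv command-line tool reports if more than one implementation is installed.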


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
  2015-08-14 22:41 ` [Bug c/67224] " ejolson at unr dot edu
@ 2015-08-15  6:35 ` pinskia at gcc dot gnu.org
  2015-08-15  6:37 ` pinskia at gcc dot gnu.org
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-08-15  6:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Related to bug 41374.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
  2015-08-14 22:41 ` [Bug c/67224] " ejolson at unr dot edu
  2015-08-15  6:35 ` pinskia at gcc dot gnu.org
@ 2015-08-15  6:37 ` pinskia at gcc dot gnu.org
  2015-08-15 11:30 ` manu at gcc dot gnu.org
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-08-15  6:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Have you tried -fextended-identifiers ?


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (2 preceding siblings ...)
  2015-08-15  6:37 ` pinskia at gcc dot gnu.org
@ 2015-08-15 11:30 ` manu at gcc dot gnu.org
  2015-08-15 12:25 ` joseph at codesourcery dot com
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: manu at gcc dot gnu.org @ 2015-08-15 11:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #4 from Manuel López-Ibáñez <manu at gcc dot gnu.org> ---
I cannot say anything about the correctness of the patch, but I would expect
such a patch to contain many testcases (at least similar to those that test for
UCNs; see https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00337.html).  Patches
need to be bootstrapped and regression tested, and submitted to gcc-patches
with a ChangeLog
(https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps).
Please CC Joseph Myers when you submit.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (3 preceding siblings ...)
  2015-08-15 11:30 ` manu at gcc dot gnu.org
@ 2015-08-15 12:25 ` joseph at codesourcery dot com
  2015-08-17 17:55 ` ejolson at unr dot edu
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: joseph at codesourcery dot com @ 2015-08-15 12:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #5 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
There is no "C99" character set in glibc libiconv (after all, it's not a 
character set at all).  Converting extended characters to UCNs like that 
would in any case be correct for C++ (provided you also convert $ ` @ and 
control characters other than those in the basic source character set) but 
not for C - but for C++, it would be necessary to keep track of the 
conversions to revert them in raw string literals.  This requirement to 
revert such conversions in raw string literals (in C++14, see 2.5 
[lex.pptoken] paragraph 3: "Between the initial and final double quote 
characters of the raw string, any transformations performed in phases 1 
and 2 (trigraphs, universal-character-names, and line splicing) are 
reverted; this reversion shall apply before any d-char, r-char, or 
delimiting parenthesis is identified.") renders such an approach 
non-viable (it would break things that currently work); the conversions to 
UCNs have to take place within cpplib, not through an external iconv 
conversion.

Note that cpplib identifier spelling preservation is now implemented 
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00548.html>, which adds 
other ways in which it should be visible whether an identifier was 
represented with UTF-8 or UCNs.
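
To make the concern concrete, here is a rough sketch of what a blanket UTF-8 to UCN conversion does to a line of source. It is not libiconv's converter and not cpplib code; the function names are invented for illustration, and the decoder does not reject overlong forms or surrogates. The point is that the rewrite is applied to every non-ASCII sequence, with no knowledge of whether it sits in an identifier or inside a string literal.

```c
#include <stddef.h>
#include <stdio.h>

/* Decode one UTF-8 sequence starting at s.  Stores the code point in *cp
   and returns the number of bytes consumed, or 0 if the sequence is
   malformed.  (Sketch only: overlong forms are not rejected.) */
static size_t decode_utf8(const unsigned char *s, unsigned long *cp)
{
    size_t n, i;

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { *cp = s[0] & 0x1F; n = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; }
    else return 0;

    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)      /* also stops at the NUL terminator */
            return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    return n;
}

/* Rewrite every non-ASCII character of the NUL-terminated string `in` as
   \uXXXX or \UXXXXXXXX.  Returns the length written (excluding the NUL),
   or -1 on malformed input or if `out` is too small. */
int utf8_to_ucn(const char *in, char *out, size_t outsz)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t used = 0;

    if (outsz == 0)
        return -1;
    while (*s) {
        unsigned long cp;
        size_t n = decode_utf8(s, &cp);
        int w;

        if (n == 0)
            return -1;
        if (cp < 0x80)
            w = snprintf(out + used, outsz - used, "%c", (int)cp);
        else if (cp <= 0xFFFF)
            w = snprintf(out + used, outsz - used, "\\u%04lx", cp);
        else
            w = snprintf(out + used, outsz - used, "\\U%08lx", cp);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;
        used += (size_t)w;
        s += n;
    }
    out[used] = '\0';
    return (int)used;
}
```

Fed a line that contains an extended character inside a string literal, the sketch rewrites the literal's contents as well. For C++ raw strings that rewrite would then have to be reverted, which needs bookkeeping that an external conversion step does not keep.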


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (4 preceding siblings ...)
  2015-08-15 12:25 ` joseph at codesourcery dot com
@ 2015-08-17 17:55 ` ejolson at unr dot edu
  2015-08-17 18:45 ` ejolson at unr dot edu
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-17 17:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #6 from Eric <ejolson at unr dot edu> ---
From the webpage (current as of Aug 17, 2015)

    http://www.gnu.org/software/libiconv/

under *Details* it is described that the library provides support for the
following encodings:

    Full Unicode
        UTF-8
        UCS-2, UCS-2BE, UCS-2LE
        UCS-4, UCS-4BE, UCS-4LE
        UTF-16, UTF-16BE, UTF-16LE
        UTF-32, UTF-32BE, UTF-32LE
        UTF-7
        C99, JAVA 

Therefore, I don't understand the statement that libiconv doesn't support C99
or that it isn't, somehow, a character set.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (5 preceding siblings ...)
  2015-08-17 17:55 ` ejolson at unr dot edu
@ 2015-08-17 18:45 ` ejolson at unr dot edu
  2015-08-17 18:48 ` ejolson at unr dot edu
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-17 18:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #7 from Eric <ejolson at unr dot edu> ---
Please look at the Raspberry Pi forum post linked in the original report for
more information about testing this patch.  As the text describes there, the
command line options 

    -finput-charset=UTF-8 -fextended-identifiers

are both needed in order to compile a UTF-8 input file containing unicode
identifiers.  I have included a small test program as another attachment. 
Searching on UTF-8 Identifiers in GCC will turn up a number of people asking
for this feature and additional example codes that use UTF-8 identifiers.  The
document "Unicode for the PCC C99 Compiler" available at

    http://pcc.ludd.ltu.se/documentation/

also contains example UTF-8 C99 input files which can be used to test the
compiler.  The one-line patch submitted above has also been tested in the sense
that the compiler still bootstraps and has no trouble compiling thousands of
lines of standard ASCII C input.

The patch inserts "C99" in only one place as the uses of SOURCE_CHARSET are
conflicted and changing them all to "C99" doesn't yield a working solution.  In
particular, the "C99" in _cpp_convert_input should not be considered the source
character set appearing in the input files but rather an internal character set
suitable for later parsing.  As iconv is already a well debugged library, it
would appear the risks of this patch are minor.

Note, however, the following problem:  "C99" is probably not correct for
EBCDIC hosts.  In that case it might be possible to write UCNs using trigraphs
of the form ??/uXXXX and ??/UXXXXXXXX, however, as the number of people wanting
to compile C source files with identifiers encoded using UTF-EBCDIC is likely
zero, the easiest solution going forward is to modify the patch so it only
applies to non-EBCDIC hosts.  As there are already #ifdef's in the code to
check for this, this does not add any new complexity to the code base.
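
As a rough illustration of restricting the change to non-EBCDIC hosts, the sketch below picks the internal charset name at compile time. The macro and function names are hypothetical and are not the ones used in cpplib; the character-constant test is only one possible way to detect an ASCII-compatible host.

```c
/* Hypothetical sketch: 1 on hosts whose basic characters have ASCII values. */
#if 'A' == 0x41 && 'a' == 0x61 && '0' == 0x30
# define SKETCH_HOST_IS_ASCII 1
#else
# define SKETCH_HOST_IS_ASCII 0
#endif

/* Return the charset name to request from iconv for the internal form:
   "C99" on ASCII hosts, otherwise whatever the unpatched code would use. */
const char *sketch_internal_charset(const char *unpatched_default)
{
#if SKETCH_HOST_IS_ASCII
    (void)unpatched_default;
    return "C99";
#else
    return unpatched_default;
#endif
}
```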

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (6 preceding siblings ...)
  2015-08-17 18:45 ` ejolson at unr dot edu
@ 2015-08-17 18:48 ` ejolson at unr dot edu
  2015-08-17 18:58 ` manu at gcc dot gnu.org
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-17 18:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #8 from Eric <ejolson at unr dot edu> ---
Created attachment 36196
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36196&action=edit
Test program with UTF-8 identifiers...

Compile this test program using 

gcc \
    -finput-charset=UTF-8 -fextended-identifiers \
    -o circle circle.c

to check whether gcc can handle UTF-8 identifiers.
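
The attachment itself is not reproduced in this message. As a stand-in, the following hypothetical fragment spells the same kind of identifier with a UCN, which GCC accepts in C99 and later modes when -fextended-identifiers is in effect. Replacing \u03c0 with the UTF-8 encoded Greek letter pi gives the form that this bug asks GCC to accept directly.

```c
/* Hypothetical stand-in for a test with an extended identifier.
   The local variable is named with the Greek small letter pi (U+03C0),
   written here as a universal character name. */
double circle_area(double r)
{
    const double \u03c0 = 3.14159265358979323846;

    return \u03c0 * r * r;
}
```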


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (7 preceding siblings ...)
  2015-08-17 18:48 ` ejolson at unr dot edu
@ 2015-08-17 18:58 ` manu at gcc dot gnu.org
  2015-08-17 19:24 ` manu at gcc dot gnu.org
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: manu at gcc dot gnu.org @ 2015-08-17 18:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #9 from Manuel López-Ibáñez <manu at gcc dot gnu.org> ---
(In reply to Eric from comment #7)
> also contains example UTF-8 C99 input files which can be used to test the
> compiler.  The one-line patch submitted above has also been tested in the
> sense that the compiler still bootstraps and has no trouble compiling
> thousands of lines of standard ASCII C input.

I think what Joseph is saying is that your approach may work for the small
examples that you have tested, but it would break things that are working fine
right now (in particular raw string literals). Many of those things are not
tested by a gcc bootstrap (but some of them should be tested by the regression
testsuite, did you run that? Point 4 here:
https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps
)

I hope Joseph can give you more details so you may try to implement this in the
proper way.

The only reason why GCC does not have UTF-8 support in identifiers is that no
one had time to implement it yet, so your help is appreciated.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (8 preceding siblings ...)
  2015-08-17 18:58 ` manu at gcc dot gnu.org
@ 2015-08-17 19:24 ` manu at gcc dot gnu.org
  2015-08-17 20:18 ` joseph at codesourcery dot com
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: manu at gcc dot gnu.org @ 2015-08-17 19:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

Manuel López-Ibáñez <manu at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-08-17
                 CC|                            |manu at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #10 from Manuel López-Ibáñez <manu at gcc dot gnu.org> ---
(In reply to Eric from comment #7)
> command line options 
> 
>     -finput-charset=UTF-8 -fextended-identifiers
> 
> are both needed in order to compile a UTF-8 input file containing unicode

Note also that since GCC 5.1: The option -fextended-identifiers is now enabled
by default for C++, and for C99 and later C versions
(https://gcc.gnu.org/gcc-5/changes.html) and the default C version is C11, thus
it is enabled by default.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (9 preceding siblings ...)
  2015-08-17 19:24 ` manu at gcc dot gnu.org
@ 2015-08-17 20:18 ` joseph at codesourcery dot com
  2015-08-18  0:11 ` ejolson at unr dot edu
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: joseph at codesourcery dot com @ 2015-08-17 20:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #11 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
Sorry, glibc iconv (not libiconv) doesn't handle "C99".  So your patch 
would not work on any GNU host in normal configurations of GCC (libiconv 
is a completely separate package and is only likely to be used on non-GNU 
hosts such as Windows; on GNU hosts iconv from glibc is normally used 
although it's possible to use libiconv there).

You need to test cases such as that if a macro is defined twice, once with 
a UCN in its expansion and once with the equivalent character written in 
UTF-8, the difference in the expansion is diagnosed (whichever of all the 
valid UCNs for that character is the one used).  And that the original 
spelling appears on the right hand side of a definition output with -dD.  
And that if (in C but not, properly, C++) a string contains a backslash 
followed by an extended character, this is properly diagnosed as an 
invalid escape sequence rather than being treated as \\u<something> or 
\\U<something>.  See the tests in my spelling preservation patch 
<https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00548.html>.  (Stringizing 
isn't necessarily an issue here because of the special C rules about 
stringizing UCNs together with the C++ rule about converting to UCNs in 
phase 1 - the effect is that for C it's always OK to stringize as the 
extended character, though you can't stringize as a UCN if the extended 
character was originally written, while for C++ you have to stringize as a 
UCN.)  And then you need tests of C++ programs with extended characters 
inside raw strings (like c-c++-common/raw-string-*.c, but none of those 
cover extended characters at present).  And the patch needs to add all 
these tests to the testsuite.
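
For background on "whichever of all the valid UCNs for that character is the one used": one character can be spelled with more than one UCN, and as identifiers all the spellings name the same entity. A minimal sketch, assuming a compiler with extended identifiers enabled (the function name is invented for illustration):

```c
/* U+00C1 written two ways: the four-digit form declares the variable,
   the eight-digit form reads it back.  Both name the same identifier. */
int same_identifier(void)
{
    int \u00c1 = 42;

    return \U000000c1;
}
```

The two spellings are interchangeable here, but they are different token spellings to the preprocessor. That is why macro redefinition and -dD output need the tests described above.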

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (10 preceding siblings ...)
  2015-08-17 20:18 ` joseph at codesourcery dot com
@ 2015-08-18  0:11 ` ejolson at unr dot edu
  2015-08-18 10:07 ` manu at gcc dot gnu.org
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-18  0:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #12 from Eric <ejolson at unr dot edu> ---
I'm glad to know people like Joseph are working on UTF-8 in gcc.  Last year I
spent a week adding UTF-8 input support to pcc.  At that time Microsoft Studio
and clang already supported UTF-8 input files and I expected that gcc would do
so in the next release.  As this didn't happen, a few months ago I looked and
developed a one-line patch to add this support to gcc.

It appears the C preprocessor falls back to libiconv when it encounters a
conversion not supported internally.  From what I can tell this is enabled by
default, though it is surely possible to disable it.

I'm aware that C strings are often used to store 8-bit data, for example, to
display various graphics characters from legacy code pages.  I will run the
regression tests as soon as possible to see what, if anything, has been broken by my
one-line patch.  UCN quoting of UTF-8 input should happen only if the
-finput-charset=UTF-8 flag is set and this is worth checking.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (11 preceding siblings ...)
  2015-08-18  0:11 ` ejolson at unr dot edu
@ 2015-08-18 10:07 ` manu at gcc dot gnu.org
  2015-08-18 19:24 ` ejolson at unr dot edu
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: manu at gcc dot gnu.org @ 2015-08-18 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #13 from Manuel López-Ibáñez <manu at gcc dot gnu.org> ---
(In reply to Eric from comment #12)
> I'm glad to know people like Joseph are working on UTF-8 in gcc. 

I think at the moment, neither Joseph nor anyone else is planning to work on
this. There doesn't seem to be sufficient demand for this feature so that
companies fund it or volunteers step up to implement it (you are the first one
to make an attempt that I am aware of).

> I spent a week adding UTF-8 input support to pcc.  At that time Microsoft
> Studio and clang already supported UTF-8 input files and I expected that gcc
> would do so in the next release.

Unfortunately, GCC has very few developers compared to Microsoft or Clang. Many
things in GCC will never get done if new people do not contribute to its
development. This is why if you want to see this feature, you are the best and
perhaps the only person to make it happen.

The problem is that this cannot be fixed by a one-line patch, otherwise it would
have been fixed a long time ago.

* GCC cannot rely on libiconv being always present. It has to work with glibc's
iconv, which is what is used in GNU/Linux.

* Even if glibc's iconv supported a C99 conversion, this would break other things. 

* You need to add tests explicitly for various things (see Joseph's comments).
The tests will be added to the GCC testsuite to prove that your patch works as
it should and to make sure future changes do not break the tests.

* At a minimum, look at all the gcc.dg/cpp/ucnid-*.c and g++.dg/cpp/ucnid-*.c
tests and see what happens if you replace the \uNNNN with actual extended characters.

* Joseph thinks that the best approach is to do the conversion from UTF-8 to
UCNs "manually" within cpplib, such that you can handle all the corner cases of
C/C++ (quoted strings, \µ, macro names,...)
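
When rewriting the existing ucnid tests by hand, the UTF-8 bytes for a given UCN are needed. The helper below is a small sketch of the standard UTF-8 encoding rules; the function name is invented for illustration and it is not taken from cpplib.

```c
#include <stddef.h>

/* Encode code point cp as UTF-8 into out[].  Returns the number of bytes
   written (1 to 4), or 0 if cp is a surrogate or above U+10FFFF. */
size_t encode_utf8(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For example, U+00B5 (the micro sign used in the \µ case above) encodes as the two bytes 0xC2 0xB5.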


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (12 preceding siblings ...)
  2015-08-18 10:07 ` manu at gcc dot gnu.org
@ 2015-08-18 19:24 ` ejolson at unr dot edu
  2015-08-18 19:27 ` ejolson at unr dot edu
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-18 19:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #14 from Eric <ejolson at unr dot edu> ---
While there may not be current demand for gcc to accept UTF-8 identifiers, the
fact that clang and Visual Studio support this C99 feature means source code
using Greek and accented characters in variable names is likely to become more
prevalent over time.

I have done a little testing to check whether string literals can contain
arbitrary 8-bit data by default.  This is used, for example, in legacy code
which directly includes graphics characters from CP437.  The original
preprocessor specifies "UTF-8" as the default input character set and "UTF-8"
as the internal character set.  Then, if the input and internal character sets
are identical, no translation is done and arbitrary 8-bit data is passed through
cleanly.  A slight modification to my patch needs to be made to retain the same
behavior.  In particular, the patch now specifies both the internal and default
input character sets to be "C99" so no translation is done by default.  The
improved patch also includes consideration of EBCDIC hosts.

As iconv was installed on every GNU/Linux system I've tried, I'm not sure what
is wrong with using the C99 mode present in newer releases.  This achieves
exactly the suggested result of converting all UTF-8 input to UCNs in the
preprocessor while directly allowing other potentially useful conversions. 
Perhaps the configure script should be modified to check for a compatible
version of iconv and, if one is not found, resort to a manual conversion.

Testing is still underway.  After the standard regression tests are finished I
will create new tests utf8id-.* which will be versions of the ucnid-.* tests
for native UTF-8 files.  I will also include a new test for arbitrary 8-bit
string literals, to verify further compatibility.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (13 preceding siblings ...)
  2015-08-18 19:24 ` ejolson at unr dot edu
@ 2015-08-18 19:27 ` ejolson at unr dot edu
  2015-08-18 19:57 ` ejolson at unr dot edu
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-18 19:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #15 from Eric <ejolson at unr dot edu> ---
Created attachment 36206
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36206&action=edit
Improved UTF-8 identifier patch

Improved patch to support UTF-8 identifiers.  This version by default does no
translation unless -finput-charset=XXX is specified where XXX is something
other than C99 and should not affect EBCDIC hosts.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (14 preceding siblings ...)
  2015-08-18 19:27 ` ejolson at unr dot edu
@ 2015-08-18 19:57 ` ejolson at unr dot edu
  2015-08-18 20:11 ` joseph at codesourcery dot com
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-18 19:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #16 from Eric <ejolson at unr dot edu> ---
With my second patch the command line must now include the options

    -finput-charset=UTF-8 -fextended-identifiers -fexec-charset=UTF-8

or otherwise C99 will also be used for the default execution character set.  A
better approach to maintain nearly 8-bit clean string literals by default might
result from leaving the default input and execution character sets as UTF-8
and setting the internal character set to C99 only when -fextended-identifiers
is selected.  Sorry for so many comments.  I'll post a new patch when
everything is ready and has been tested.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (15 preceding siblings ...)
  2015-08-18 19:57 ` ejolson at unr dot edu
@ 2015-08-18 20:11 ` joseph at codesourcery dot com
  2015-08-18 22:54 ` ejolson at unr dot edu
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: joseph at codesourcery dot com @ 2015-08-18 20:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #17 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Tue, 18 Aug 2015, ejolson at unr dot edu wrote:

> As iconv was installed on every GNU/Linux system I've tried, I'm not sure what
> is wrong with using the C99 mode present in newer releases.  This achieves

The iconv that is installed is glibc iconv.  It has *nothing to do with* 
libiconv, a completely independent package.  iconv --version will report a 
glibc version and iconv --list will produce a list not mentioning C99, 
e.g.:

$ iconv --version
iconv (Ubuntu EGLIBC 2.19-0ubuntu6.6) 2.19
$ iconv --list
The following list contains all the coded character sets known.  This does
not necessarily mean that all combinations of these names can be used for
the FROM and TO command line parameters.  One coded character set can be
listed with several different names (aliases).

  437, 500, 500V1, 850, 851, 852, 855, 856, 857, 860, 861, 862, 863, 864, 865,
  866, 866NAV, 869, 874, 904, 1026, 1046, 1047, 8859_1, 8859_2, 8859_3, 8859_4,
  8859_5, 8859_6, 8859_7, 8859_8, 8859_9, 10646-1:1993, 10646-1:1993/UCS4,
  ANSI_X3.4-1968, ANSI_X3.4-1986, ANSI_X3.4, ANSI_X3.110-1983, ANSI_X3.110,
  ARABIC, ARABIC7, ARMSCII-8, ASCII, ASMO-708, ASMO_449, BALTIC, BIG-5,
  BIG-FIVE, BIG5-HKSCS, BIG5, BIG5HKSCS, BIGFIVE, BRF, BS_4730, CA, CN-BIG5,
  CN-GB, CN, CP-AR, CP-GR, CP-HU, CP037, CP038, CP273, CP274, CP275, CP278,
  CP280, CP281, CP282, CP284, CP285, CP290, CP297, CP367, CP420, CP423, CP424,
  CP437, CP500, CP737, CP770, CP771, CP772, CP773, CP774, CP775, CP803, CP813,
  CP819, CP850, CP851, CP852, CP855, CP856, CP857, CP860, CP861, CP862, CP863,
  CP864, CP865, CP866, CP866NAV, CP868, CP869, CP870, CP871, CP874, CP875,
  CP880, CP891, CP901, CP902, CP903, CP904, CP905, CP912, CP915, CP916, CP918,
  CP920, CP921, CP922, CP930, CP932, CP933, CP935, CP936, CP937, CP939, CP949,
  CP950, CP1004, CP1008, CP1025, CP1026, CP1046, CP1047, CP1070, CP1079,
  CP1081, CP1084, CP1089, CP1097, CP1112, CP1122, CP1123, CP1124, CP1125,
  CP1129, CP1130, CP1132, CP1133, CP1137, CP1140, CP1141, CP1142, CP1143,
  CP1144, CP1145, CP1146, CP1147, CP1148, CP1149, CP1153, CP1154, CP1155,
  CP1156, CP1157, CP1158, CP1160, CP1161, CP1162, CP1163, CP1164, CP1166,
  CP1167, CP1250, CP1251, CP1252, CP1253, CP1254, CP1255, CP1256, CP1257,
  CP1258, CP1282, CP1361, CP1364, CP1371, CP1388, CP1390, CP1399, CP4517,
  CP4899, CP4909, CP4971, CP5347, CP9030, CP9066, CP9448, CP10007, CP12712,
  CP16804, CPIBM861, CSA7-1, CSA7-2, CSASCII, CSA_T500-1983, CSA_T500,
  CSA_Z243.4-1985-1, CSA_Z243.4-1985-2, CSA_Z243.419851, CSA_Z243.419852,
  CSDECMCS, CSEBCDICATDE, CSEBCDICATDEA, CSEBCDICCAFR, CSEBCDICDKNO,
  CSEBCDICDKNOA, CSEBCDICES, CSEBCDICESA, CSEBCDICESS, CSEBCDICFISE,
  CSEBCDICFISEA, CSEBCDICFR, CSEBCDICIT, CSEBCDICPT, CSEBCDICUK, CSEBCDICUS,
  CSEUCKR, CSEUCPKDFMTJAPANESE, CSGB2312, CSHPROMAN8, CSIBM037, CSIBM038,
  CSIBM273, CSIBM274, CSIBM275, CSIBM277, CSIBM278, CSIBM280, CSIBM281,
  CSIBM284, CSIBM285, CSIBM290, CSIBM297, CSIBM420, CSIBM423, CSIBM424,
  CSIBM500, CSIBM803, CSIBM851, CSIBM855, CSIBM856, CSIBM857, CSIBM860,
  CSIBM863, CSIBM864, CSIBM865, CSIBM866, CSIBM868, CSIBM869, CSIBM870,
  CSIBM871, CSIBM880, CSIBM891, CSIBM901, CSIBM902, CSIBM903, CSIBM904,
  CSIBM905, CSIBM918, CSIBM921, CSIBM922, CSIBM930, CSIBM932, CSIBM933,
  CSIBM935, CSIBM937, CSIBM939, CSIBM943, CSIBM1008, CSIBM1025, CSIBM1026,
  CSIBM1097, CSIBM1112, CSIBM1122, CSIBM1123, CSIBM1124, CSIBM1129, CSIBM1130,
  CSIBM1132, CSIBM1133, CSIBM1137, CSIBM1140, CSIBM1141, CSIBM1142, CSIBM1143,
  CSIBM1144, CSIBM1145, CSIBM1146, CSIBM1147, CSIBM1148, CSIBM1149, CSIBM1153,
  CSIBM1154, CSIBM1155, CSIBM1156, CSIBM1157, CSIBM1158, CSIBM1160, CSIBM1161,
  CSIBM1163, CSIBM1164, CSIBM1166, CSIBM1167, CSIBM1364, CSIBM1371, CSIBM1388,
  CSIBM1390, CSIBM1399, CSIBM4517, CSIBM4899, CSIBM4909, CSIBM4971, CSIBM5347,
  CSIBM9030, CSIBM9066, CSIBM9448, CSIBM12712, CSIBM16804, CSIBM11621162,
  CSISO4UNITEDKINGDOM, CSISO10SWEDISH, CSISO11SWEDISHFORNAMES,
  CSISO14JISC6220RO, CSISO15ITALIAN, CSISO16PORTUGESE, CSISO17SPANISH,
  CSISO18GREEK7OLD, CSISO19LATINGREEK, CSISO21GERMAN, CSISO25FRENCH,
  CSISO27LATINGREEK1, CSISO49INIS, CSISO50INIS8, CSISO51INISCYRILLIC,
  CSISO58GB1988, CSISO60DANISHNORWEGIAN, CSISO60NORWEGIAN1, CSISO61NORWEGIAN2,
  CSISO69FRENCH, CSISO84PORTUGUESE2, CSISO85SPANISH2, CSISO86HUNGARIAN,
  CSISO88GREEK7, CSISO89ASMO449, CSISO90, CSISO92JISC62991984B, CSISO99NAPLPS,
  CSISO103T618BIT, CSISO111ECMACYRILLIC, CSISO121CANADIAN1, CSISO122CANADIAN2,
  CSISO139CSN369103, CSISO141JUSIB1002, CSISO143IECP271, CSISO150,
  CSISO150GREEKCCITT, CSISO151CUBA, CSISO153GOST1976874, CSISO646DANISH,
  CSISO2022CN, CSISO2022JP, CSISO2022JP2, CSISO2022KR, CSISO2033,
  CSISO5427CYRILLIC, CSISO5427CYRILLIC1981, CSISO5428GREEK, CSISO10367BOX,
  CSISOLATIN1, CSISOLATIN2, CSISOLATIN3, CSISOLATIN4, CSISOLATIN5, CSISOLATIN6,
  CSISOLATINARABIC, CSISOLATINCYRILLIC, CSISOLATINGREEK, CSISOLATINHEBREW,
  CSKOI8R, CSKSC5636, CSMACINTOSH, CSNATSDANO, CSNATSSEFI, CSN_369103,
  CSPC8CODEPAGE437, CSPC775BALTIC, CSPC850MULTILINGUAL, CSPC862LATINHEBREW,
  CSPCP852, CSSHIFTJIS, CSUCS4, CSUNICODE, CSWINDOWS31J, CUBA, CWI-2, CWI,
  CYRILLIC, DE, DEC-MCS, DEC, DECMCS, DIN_66003, DK, DS2089, DS_2089, E13B,
  EBCDIC-AT-DE-A, EBCDIC-AT-DE, EBCDIC-BE, EBCDIC-BR, EBCDIC-CA-FR,
  EBCDIC-CP-AR1, EBCDIC-CP-AR2, EBCDIC-CP-BE, EBCDIC-CP-CA, EBCDIC-CP-CH,
  EBCDIC-CP-DK, EBCDIC-CP-ES, EBCDIC-CP-FI, EBCDIC-CP-FR, EBCDIC-CP-GB,
  EBCDIC-CP-GR, EBCDIC-CP-HE, EBCDIC-CP-IS, EBCDIC-CP-IT, EBCDIC-CP-NL,
  EBCDIC-CP-NO, EBCDIC-CP-ROECE, EBCDIC-CP-SE, EBCDIC-CP-TR, EBCDIC-CP-US,
  EBCDIC-CP-WT, EBCDIC-CP-YU, EBCDIC-CYRILLIC, EBCDIC-DK-NO-A, EBCDIC-DK-NO,
  EBCDIC-ES-A, EBCDIC-ES-S, EBCDIC-ES, EBCDIC-FI-SE-A, EBCDIC-FI-SE, EBCDIC-FR,
  EBCDIC-GREEK, EBCDIC-INT, EBCDIC-INT1, EBCDIC-IS-FRISS, EBCDIC-IT,
  EBCDIC-JP-E, EBCDIC-JP-KANA, EBCDIC-PT, EBCDIC-UK, EBCDIC-US, EBCDICATDE,
  EBCDICATDEA, EBCDICCAFR, EBCDICDKNO, EBCDICDKNOA, EBCDICES, EBCDICESA,
  EBCDICESS, EBCDICFISE, EBCDICFISEA, EBCDICFR, EBCDICISFRISS, EBCDICIT,
  EBCDICPT, EBCDICUK, EBCDICUS, ECMA-114, ECMA-118, ECMA-128, ECMA-CYRILLIC,
  ECMACYRILLIC, ELOT_928, ES, ES2, EUC-CN, EUC-JISX0213, EUC-JP-MS, EUC-JP,
  EUC-KR, EUC-TW, EUCCN, EUCJP-MS, EUCJP-OPEN, EUCJP-WIN, EUCJP, EUCKR, EUCTW,
  FI, FR, GB, GB2312, GB13000, GB18030, GBK, GB_1988-80, GB_198880,
  GEORGIAN-ACADEMY, GEORGIAN-PS, GOST_19768-74, GOST_19768, GOST_1976874,
  GREEK-CCITT, GREEK, GREEK7-OLD, GREEK7, GREEK7OLD, GREEK8, GREEKCCITT,
  HEBREW, HP-GREEK8, HP-ROMAN8, HP-ROMAN9, HP-THAI8, HP-TURKISH8, HPGREEK8,
  HPROMAN8, HPROMAN9, HPTHAI8, HPTURKISH8, HU, IBM-803, IBM-856, IBM-901,
  IBM-902, IBM-921, IBM-922, IBM-930, IBM-932, IBM-933, IBM-935, IBM-937,
  IBM-939, IBM-943, IBM-1008, IBM-1025, IBM-1046, IBM-1047, IBM-1097, IBM-1112,
  IBM-1122, IBM-1123, IBM-1124, IBM-1129, IBM-1130, IBM-1132, IBM-1133,
  IBM-1137, IBM-1140, IBM-1141, IBM-1142, IBM-1143, IBM-1144, IBM-1145,
  IBM-1146, IBM-1147, IBM-1148, IBM-1149, IBM-1153, IBM-1154, IBM-1155,
  IBM-1156, IBM-1157, IBM-1158, IBM-1160, IBM-1161, IBM-1162, IBM-1163,
  IBM-1164, IBM-1166, IBM-1167, IBM-1364, IBM-1371, IBM-1388, IBM-1390,
  IBM-1399, IBM-4517, IBM-4899, IBM-4909, IBM-4971, IBM-5347, IBM-9030,
  IBM-9066, IBM-9448, IBM-12712, IBM-16804, IBM037, IBM038, IBM256, IBM273,
  IBM274, IBM275, IBM277, IBM278, IBM280, IBM281, IBM284, IBM285, IBM290,
  IBM297, IBM367, IBM420, IBM423, IBM424, IBM437, IBM500, IBM775, IBM803,
  IBM813, IBM819, IBM848, IBM850, IBM851, IBM852, IBM855, IBM856, IBM857,
  IBM860, IBM861, IBM862, IBM863, IBM864, IBM865, IBM866, IBM866NAV, IBM868,
  IBM869, IBM870, IBM871, IBM874, IBM875, IBM880, IBM891, IBM901, IBM902,
  IBM903, IBM904, IBM905, IBM912, IBM915, IBM916, IBM918, IBM920, IBM921,
  IBM922, IBM930, IBM932, IBM933, IBM935, IBM937, IBM939, IBM943, IBM1004,
  IBM1008, IBM1025, IBM1026, IBM1046, IBM1047, IBM1089, IBM1097, IBM1112,
  IBM1122, IBM1123, IBM1124, IBM1129, IBM1130, IBM1132, IBM1133, IBM1137,
  IBM1140, IBM1141, IBM1142, IBM1143, IBM1144, IBM1145, IBM1146, IBM1147,
  IBM1148, IBM1149, IBM1153, IBM1154, IBM1155, IBM1156, IBM1157, IBM1158,
  IBM1160, IBM1161, IBM1162, IBM1163, IBM1164, IBM1166, IBM1167, IBM1364,
  IBM1371, IBM1388, IBM1390, IBM1399, IBM4517, IBM4899, IBM4909, IBM4971,
  IBM5347, IBM9030, IBM9066, IBM9448, IBM12712, IBM16804, IEC_P27-1, IEC_P271,
  INIS-8, INIS-CYRILLIC, INIS, INIS8, INISCYRILLIC, ISIRI-3342, ISIRI3342,
  ISO-2022-CN-EXT, ISO-2022-CN, ISO-2022-JP-2, ISO-2022-JP-3, ISO-2022-JP,
  ISO-2022-KR, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-9E, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16, ISO-10646,
  ISO-10646/UCS2, ISO-10646/UCS4, ISO-10646/UTF-8, ISO-10646/UTF8, ISO-CELTIC,
  ISO-IR-4, ISO-IR-6, ISO-IR-8-1, ISO-IR-9-1, ISO-IR-10, ISO-IR-11, ISO-IR-14,
  ISO-IR-15, ISO-IR-16, ISO-IR-17, ISO-IR-18, ISO-IR-19, ISO-IR-21, ISO-IR-25,
  ISO-IR-27, ISO-IR-37, ISO-IR-49, ISO-IR-50, ISO-IR-51, ISO-IR-54, ISO-IR-55,
  ISO-IR-57, ISO-IR-60, ISO-IR-61, ISO-IR-69, ISO-IR-84, ISO-IR-85, ISO-IR-86,
  ISO-IR-88, ISO-IR-89, ISO-IR-90, ISO-IR-92, ISO-IR-98, ISO-IR-99, ISO-IR-100,
  ISO-IR-101, ISO-IR-103, ISO-IR-109, ISO-IR-110, ISO-IR-111, ISO-IR-121,
  ISO-IR-122, ISO-IR-126, ISO-IR-127, ISO-IR-138, ISO-IR-139, ISO-IR-141,
  ISO-IR-143, ISO-IR-144, ISO-IR-148, ISO-IR-150, ISO-IR-151, ISO-IR-153,
  ISO-IR-155, ISO-IR-156, ISO-IR-157, ISO-IR-166, ISO-IR-179, ISO-IR-193,
  ISO-IR-197, ISO-IR-199, ISO-IR-203, ISO-IR-209, ISO-IR-226, ISO/TR_11548-1,
  ISO646-CA, ISO646-CA2, ISO646-CN, ISO646-CU, ISO646-DE, ISO646-DK, ISO646-ES,
  ISO646-ES2, ISO646-FI, ISO646-FR, ISO646-FR1, ISO646-GB, ISO646-HU,
  ISO646-IT, ISO646-JP-OCR-B, ISO646-JP, ISO646-KR, ISO646-NO, ISO646-NO2,
  ISO646-PT, ISO646-PT2, ISO646-SE, ISO646-SE2, ISO646-US, ISO646-YU,
  ISO2022CN, ISO2022CNEXT, ISO2022JP, ISO2022JP2, ISO2022KR, ISO6937,
  ISO8859-1, ISO8859-2, ISO8859-3, ISO8859-4, ISO8859-5, ISO8859-6, ISO8859-7,
  ISO8859-8, ISO8859-9, ISO8859-9E, ISO8859-10, ISO8859-11, ISO8859-13,
  ISO8859-14, ISO8859-15, ISO8859-16, ISO11548-1, ISO88591, ISO88592, ISO88593,
  ISO88594, ISO88595, ISO88596, ISO88597, ISO88598, ISO88599, ISO88599E,
  ISO885910, ISO885911, ISO885913, ISO885914, ISO885915, ISO885916,
  ISO_646.IRV:1991, ISO_2033-1983, ISO_2033, ISO_5427-EXT, ISO_5427,
  ISO_5427:1981, ISO_5427EXT, ISO_5428, ISO_5428:1980, ISO_6937-2,
  ISO_6937-2:1983, ISO_6937, ISO_6937:1992, ISO_8859-1, ISO_8859-1:1987,
  ISO_8859-2, ISO_8859-2:1987, ISO_8859-3, ISO_8859-3:1988, ISO_8859-4,
  ISO_8859-4:1988, ISO_8859-5, ISO_8859-5:1988, ISO_8859-6, ISO_8859-6:1987,
  ISO_8859-7, ISO_8859-7:1987, ISO_8859-7:2003, ISO_8859-8, ISO_8859-8:1988,
  ISO_8859-9, ISO_8859-9:1989, ISO_8859-9E, ISO_8859-10, ISO_8859-10:1992,
  ISO_8859-14, ISO_8859-14:1998, ISO_8859-15, ISO_8859-15:1998, ISO_8859-16,
  ISO_8859-16:2001, ISO_9036, ISO_10367-BOX, ISO_10367BOX, ISO_11548-1,
  ISO_69372, IT, JIS_C6220-1969-RO, JIS_C6229-1984-B, JIS_C62201969RO,
  JIS_C62291984B, JOHAB, JP-OCR-B, JP, JS, JUS_I.B1.002, KOI-7, KOI-8, KOI8-R,
  KOI8-RU, KOI8-T, KOI8-U, KOI8, KOI8R, KOI8U, KSC5636, L1, L2, L3, L4, L5, L6,
  L7, L8, L10, LATIN-9, LATIN-GREEK-1, LATIN-GREEK, LATIN1, LATIN2, LATIN3,
  LATIN4, LATIN5, LATIN6, LATIN7, LATIN8, LATIN9, LATIN10, LATINGREEK,
  LATINGREEK1, MAC-CENTRALEUROPE, MAC-CYRILLIC, MAC-IS, MAC-SAMI, MAC-UK, MAC,
  MACCYRILLIC, MACINTOSH, MACIS, MACUK, MACUKRAINIAN, MIK, MS-ANSI, MS-ARAB,
  MS-CYRL, MS-EE, MS-GREEK, MS-HEBR, MS-MAC-CYRILLIC, MS-TURK, MS932, MS936,
  MSCP949, MSCP1361, MSMACCYRILLIC, MSZ_7795.3, MS_KANJI, NAPLPS, NATS-DANO,
  NATS-SEFI, NATSDANO, NATSSEFI, NC_NC0010, NC_NC00-10, NC_NC00-10:81,
  NF_Z_62-010, NF_Z_62-010_(1973), NF_Z_62-010_1973, NF_Z_62010,
  NF_Z_62010_1973, NO, NO2, NS_4551-1, NS_4551-2, NS_45511, NS_45512,
  OS2LATIN1, OSF00010001, OSF00010002, OSF00010003, OSF00010004, OSF00010005,
  OSF00010006, OSF00010007, OSF00010008, OSF00010009, OSF0001000A, OSF00010020,
  OSF00010100, OSF00010101, OSF00010102, OSF00010104, OSF00010105, OSF00010106,
  OSF00030010, OSF0004000A, OSF0005000A, OSF05010001, OSF100201A4, OSF100201A8,
  OSF100201B5, OSF100201F4, OSF100203B5, OSF1002011C, OSF1002011D, OSF1002035D,
  OSF1002035E, OSF1002035F, OSF1002036B, OSF1002037B, OSF10010001, OSF10010004,
  OSF10010006, OSF10020025, OSF10020111, OSF10020115, OSF10020116, OSF10020118,
  OSF10020122, OSF10020129, OSF10020352, OSF10020354, OSF10020357, OSF10020359,
  OSF10020360, OSF10020364, OSF10020365, OSF10020366, OSF10020367, OSF10020370,
  OSF10020387, OSF10020388, OSF10020396, OSF10020402, OSF10020417, PT, PT2,
  PT154, R8, R9, RK1048, ROMAN8, ROMAN9, RUSCII, SE, SE2, SEN_850200_B,
  SEN_850200_C, SHIFT-JIS, SHIFT_JIS, SHIFT_JISX0213, SJIS-OPEN, SJIS-WIN,
  SJIS, SS636127, STRK1048-2002, ST_SEV_358-88, T.61-8BIT, T.61, T.618BIT,
  TCVN-5712, TCVN, TCVN5712-1, TCVN5712-1:1993, THAI8, TIS-620, TIS620-0,
  TIS620.2529-1, TIS620.2533-0, TIS620, TS-5881, TSCII, TURKISH8, UCS-2,
  UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UCS2, UCS4, UHC, UJIS, UK,
  UNICODE, UNICODEBIG, UNICODELITTLE, US-ASCII, US, UTF-7, UTF-8, UTF-16,
  UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, UTF7, UTF8, UTF16, UTF16BE,
  UTF16LE, UTF32, UTF32BE, UTF32LE, VISCII, WCHAR_T, WIN-SAMI-2, WINBALTRIM,
  WINDOWS-31J, WINDOWS-874, WINDOWS-936, WINDOWS-1250, WINDOWS-1251,
  WINDOWS-1252, WINDOWS-1253, WINDOWS-1254, WINDOWS-1255, WINDOWS-1256,
  WINDOWS-1257, WINDOWS-1258, WINSAMI2, WS2, YU

> exactly the suggested result of converting all UTF-8 input to UCNs in the

Which, as I have explained, is fundamentally incompatible with C++ 
requirements on raw string literals (namely that UTF-8 within such a 
string appears as UTF-8 bytes in the resulting object, while a \u or \U 
sequence appears as such in the resulting object).  Proper C++ semantics 
require a conversion that is aware of the lexical context and only 
converts to UCNs in certain contexts.

The string literal R"(\u00C0)" must contain the six bytes \u00C0 plus the 
trailing null byte.  The string literal R"(À)" (given UTF-8 as the 
multibyte encoding of the execution character set) must contain the two 
bytes of that character's UTF-8 encoding plus the trailing null byte.  
See how lex_raw_string deals with reverting trigraph and line splicing 
transformations; conversions to UCNs would need similarly reverting if 
done at all (it's probably better to do the conversions later, only in the 
corner cases where it's actually visible whether such a conversion was 
done, lexing as UTF-8 as far as possible).
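
The byte counts described above can be checked with a small C++11 sketch.
The names raw_ucn, cooked_ucn and check_raw_literals are hypothetical, and the
run-time check on the ordinary literal assumes that the execution character
set is UTF-8 (the GCC default); with a different -fexec-charset that assertion
would not hold.

```cpp
#include <cassert>
#include <cstring>

// Raw literal: the six source characters (backslash, u, 0, 0, C, 0) are
// kept verbatim, so the array holds six bytes plus the trailing null.
constexpr char raw_ucn[] = R"(\u00C0)";
static_assert(sizeof(raw_ucn) == 7, "six bytes plus the trailing null");

// Ordinary literal: the UCN is translated to the execution character set.
// Its size is therefore not asserted at compile time here.
const char cooked_ucn[] = "\u00C0";

// U+00C0 spelled out byte by byte in UTF-8, for comparison.
constexpr char utf8_bytes[] = "\xC3\x80";
static_assert(sizeof(utf8_bytes) == 3, "two bytes plus the trailing null");

// Run-time checks corresponding to the two cases in the text.
void check_raw_literals()
{
    // The raw literal still begins with a backslash followed by 'u'.
    assert(std::strcmp(raw_ucn, "\\u00C0") == 0);
    // Assumption: UTF-8 execution character set.
    assert(std::strcmp(cooked_ucn, utf8_bytes) == 0);
}
```

A preprocessing step that rewrote all UTF-8 input as UCNs before lexing would
turn a raw literal containing the character itself into the seven-byte form,
which is why any such conversion has to be reverted or deferred inside raw
strings.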


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (16 preceding siblings ...)
  2015-08-18 20:11 ` joseph at codesourcery dot com
@ 2015-08-18 22:54 ` ejolson at unr dot edu
  2015-08-18 23:13 ` joseph at codesourcery dot com
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-18 22:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #18 from Eric <ejolson at unr dot edu> ---
Thanks Joseph for the clarification about the two different versions of iconv. 
I was admittedly confused about this until moments ago.  Anyway, I just
discovered that libiconv doesn't support conversions to and from the IBM1047
EBCDIC character set and this causes some of the regression tests to fail. 
Coupled with the fact that C99 isn't supported in the glibc version of iconv
this creates a little problem with my patch.

You mention a bigger problem which I had not thought about:  the C++ semantics
of raw strings.  Processing UCNs in C++ code apparently requires surprisingly
deep syntactic analysis.  Raw literals seem to appear in the gnu99 and gnu11
extensions to C as well.

Amusingly, if I understand the C++ specifications

    www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf

trigraphs are supposed to be interpreted before any other processing takes
place.  However, the simple code

    #include <stdio.h>
    int main(){
        char p1[]="??/u00E4";
        char p2[]=R"(??/u00E4)";
        char p3[]=R"(\u00E4)";
        printf("%s or %s or %s\n",p1,p2,p3);
        return 0;
    }

compiled with

    $ g++ -std=c++11 pp.c

produces output

    ä or ??/u00E4 or \u00E4

which illustrates that g++ does not process trigraphs inside raw string
literals.  Admittedly I'm looking at the draft standard, but I don't think this
is something which changed suddenly in the final draft.  Clearly, my patch
makes a further mess of raw string literals in gcc.  My first reaction is that
raw string literals were not well thought out, but I suppose bad standards are
sometimes better than no standards.  At any rate, there appears to be no easy
way of supporting both UTF-8 identifiers and raw literal strings.

My plan for now is to take a break and keep my UTF-8 identifier support as a
one-line patch reliant on libiconv which breaks EBCDIC encodings and raw string
literals.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (17 preceding siblings ...)
  2015-08-18 22:54 ` ejolson at unr dot edu
@ 2015-08-18 23:13 ` joseph at codesourcery dot com
  2015-08-20 22:14 ` ejolson at unr dot edu
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: joseph at codesourcery dot com @ 2015-08-18 23:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #19 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
On Tue, 18 Aug 2015, ejolson at unr dot edu wrote:

> which illustrates that g++ does not process trigraphs inside raw string
> literals.  Admittedly I'm looking at the draft standard, but I don't think this

As stated in [lex.pptoken] in both C++11 and C++14: "Between the initial 
and final double quote characters of the raw string, any transformations 
performed in phases 1 and 2 (trigraphs, universal-character-names, and 
line splicing) are reverted; this reversion shall apply before any d-char, 
r-char, or delimiting parenthesis is identified.".  Yes, the positioning 
of this in the standard may be confusing....

That is, the effect is more or less as if trigraphs weren't processed 
inside raw strings (but the implementation involves undoing trigraph 
substitutions, as described in the standard).

I think the right way to implement UTF-8 in identifiers involves making 
lex_identifier handle UTF-8 (when extended identifiers are enabled), and 
making _cpp_lex_direct handle bytes with the high bit set as 
potentially[*] starting identifiers (requiring the same handling of 
normalization state as for the other cases of characters starting 
identifiers, of course).  If you do that, then raw strings and all the 
corner cases of spelling preservation fall out naturally (though they 
still need testcases added to the testsuite).

[*] I think the right rule for C is that UTF-8 for a character not allowed 
in identifiers should produce a preprocessing token on its own rather than 
an error for an invalid character in an identifier (and similarly, such a 
character after the start of the identifier should terminate the 
identifier and produce such a preprocessing token).  Unless and until 
someone implements the C++ phase 1 conversion to UCNs, it would seem 
reasonable to follow this rule for C++ as well.
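
As a rough illustration of the rule above (not cpplib code), the sketch
below decodes UTF-8 one character at a time and stops the identifier at the
first character that is not allowed in identifiers, leaving that character
for the caller to lex as its own token.  The names utf8_decode, ident_char_p
and ident_length are hypothetical, and ident_char_p covers only a small
subset of the C99 Annex D ranges as a stand-in for the real tables.  The
start-of-identifier check and normalization state are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence from s[0..len).  Returns the number of bytes
   consumed (1-4) and stores the code point in *cp, or returns 0 if the
   bytes are not well-formed UTF-8.  */
static size_t
utf8_decode (const unsigned char *s, size_t len, uint32_t *cp)
{
  assert (s != NULL && cp != NULL);
  if (len == 0)
    return 0;

  size_t n;
  uint32_t c, min;
  if (s[0] < 0x80)
    {
      *cp = s[0];
      return 1;
    }
  else if ((s[0] & 0xE0) == 0xC0) { n = 2; c = s[0] & 0x1F; min = 0x80; }
  else if ((s[0] & 0xF0) == 0xE0) { n = 3; c = s[0] & 0x0F; min = 0x800; }
  else if ((s[0] & 0xF8) == 0xF0) { n = 4; c = s[0] & 0x07; min = 0x10000; }
  else
    return 0;  /* stray continuation byte or invalid lead byte */

  if (len < n)
    return 0;
  for (size_t i = 1; i < n; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;
      c = (c << 6) | (s[i] & 0x3F);
    }
  /* Reject overlong forms, surrogates and values past U+10FFFF.  */
  if (c < min || (c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF)
    return 0;
  *cp = c;
  return n;
}

/* HYPOTHETICAL stand-in for the real identifier tables: ASCII letters,
   digits and underscore, plus the Latin-1 letters U+00C0..U+00FF except
   the multiplication and division signs.  */
static int
ident_char_p (uint32_t c)
{
  if (c < 0x80)
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
           || (c >= '0' && c <= '9') || c == '_';
  return c >= 0xC0 && c <= 0xFF && c != 0xD7 && c != 0xF7;
}

/* Length in bytes of the identifier at the start of s.  A character that
   is not allowed in identifiers (or a malformed sequence) ends the
   identifier and is left for the caller to lex separately.  */
static size_t
ident_length (const unsigned char *s, size_t len)
{
  size_t pos = 0;
  while (pos < len)
    {
      uint32_t c;
      size_t n = utf8_decode (s + pos, len - pos, &c);
      if (n == 0 || !ident_char_p (c))
        break;
      pos += n;
    }
  return pos;
}
```

With these definitions, ident_length on the UTF-8 bytes of "ä=1" covers the
two bytes of "ä", while on "a€b" it covers only "a", because U+20AC is not
in the stand-in table and so ends the identifier.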

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (18 preceding siblings ...)
  2015-08-18 23:13 ` joseph at codesourcery dot com
@ 2015-08-20 22:14 ` ejolson at unr dot edu
  2015-08-20 22:41 ` joseph at codesourcery dot com
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: ejolson at unr dot edu @ 2015-08-20 22:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #20 from Eric <ejolson at unr dot edu> ---
I've been looking at the code in lex_identifier as well as what goes on in
forms_identifier_p and so forth.  At some point each identifier needs to be
stored in the symbol table using ht_lookup_with_hash.  Proper functioning
requires that UTF-8 and UCN representations of the same Unicode characters are
treated as the same symbol.  Thus, there needs to be some point at which the
identifiers are regularized to be either all UTF-8 or all UCN escaped ASCII. 
As gcc is working with UCNs right now, the obvious implementation allocates
temporary memory to hold the UCN escaped ASCII version of a UTF-8 identifier
and then frees it again after calling ht_lookup.  Any comments would be
appreciated.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (19 preceding siblings ...)
  2015-08-20 22:14 ` ejolson at unr dot edu
@ 2015-08-20 22:41 ` joseph at codesourcery dot com
  2020-05-01 14:17 ` lhyatt at gmail dot com
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 25+ messages in thread
From: joseph at codesourcery dot com @ 2015-08-20 22:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #21 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
_cpp_interpret_identifier converts UCNs to UTF-8 which is the canonical 
internal form for identifiers - for UTF-8 in identifiers, you just need to 
pass it straight through unmodified there.  (cpplib takes care to store 
the original spelling of the identifier as well for purposes for which 
that matters, but that's simply a matter of lex_identifier calling 
cpp_lookup on the original spelling as well as using 
_cpp_interpret_identifier to get the canonical version.)

So you never need to convert UTF-8 to UCNs in order to handle UTF-8 in 
identifiers (cpplib has logic to do so when needed for output, but you 
don't need to add anything new in that regard).  You do need to decode 
UTF-8 into character values for the code that checks normalization, which 
characters are allowed at the start of identifiers, etc., just as the 
existing code decodes UCNs into such values.  (But as I noted, a UCN not 
allowed in identifiers is lexed as part of an identifier, which is then 
considered invalid, whereas a UTF-8 character not allowed in identifiers 
should be lexed as a separate pp-token.  However, UTF-8 for a character 
allowed in identifiers but not at the start of an identifier should, I 
think, be lexed as an identifier character even at the start of an 
identifier, and then give an error for an invalid identifier if it appears 
at the start of an identifier.  That's my reading of the syntax 
productions in the C standard.)

You can ignore anything claiming to handle UTF-EBCDIC.
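
To make the canonical-form idea concrete, here is a minimal standalone
sketch.  It is not _cpp_interpret_identifier; canonical_ident, utf8_encode
and hex_value are hypothetical names.  It rewrites \uXXXX and \UXXXXXXXX in
an identifier's spelling as UTF-8 and copies every other byte through
unchanged, so both spellings of the same identifier yield the same byte
string for a symbol table lookup.  It assumes the spelling has already been
lexed as an identifier and does no validity or normalization checks beyond
rejecting malformed UCNs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write code point c to out as UTF-8 and return the bytes written (1-4).
   The caller guarantees room for 4 bytes and c <= 0x10FFFF.  */
static size_t
utf8_encode (uint32_t c, char *out)
{
  if (c < 0x80)
    {
      out[0] = (char) c;
      return 1;
    }
  if (c < 0x800)
    {
      out[0] = (char) (0xC0 | (c >> 6));
      out[1] = (char) (0x80 | (c & 0x3F));
      return 2;
    }
  if (c < 0x10000)
    {
      out[0] = (char) (0xE0 | (c >> 12));
      out[1] = (char) (0x80 | ((c >> 6) & 0x3F));
      out[2] = (char) (0x80 | (c & 0x3F));
      return 3;
    }
  out[0] = (char) (0xF0 | (c >> 18));
  out[1] = (char) (0x80 | ((c >> 12) & 0x3F));
  out[2] = (char) (0x80 | ((c >> 6) & 0x3F));
  out[3] = (char) (0x80 | (c & 0x3F));
  return 4;
}

/* Value of a hexadecimal digit, or -1 if ch is not one.  */
static int
hex_value (char ch)
{
  if (ch >= '0' && ch <= '9') return ch - '0';
  if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
  if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
  return -1;
}

/* Copy spelling to out, replacing each UCN with its UTF-8 encoding.
   Returns 0 on success, -1 on a malformed UCN or if out is too small.  */
static int
canonical_ident (const char *spelling, char *out, size_t outsize)
{
  assert (spelling != NULL && out != NULL && outsize > 0);
  size_t o = 0;
  for (const char *p = spelling; *p != '\0'; )
    {
      if (o + 5 > outsize)  /* up to 4 bytes plus the terminator */
        return -1;
      if (p[0] == '\\' && (p[1] == 'u' || p[1] == 'U'))
        {
          int digits = (p[1] == 'u') ? 4 : 8;
          uint32_t c = 0;
          p += 2;
          for (int i = 0; i < digits; i++, p++)
            {
              int v = hex_value (*p);
              if (v < 0)
                return -1;  /* too few hex digits */
              c = (c << 4) | (uint32_t) v;
            }
          if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            return -1;
          o += utf8_encode (c, out + o);
        }
      else
        out[o++] = *p++;  /* ASCII and UTF-8 bytes pass straight through */
    }
  out[o] = '\0';
  return 0;
}
```

Calling canonical_ident on the spellings caf\u00E9 and café (as UTF-8)
fills both output buffers with the same five bytes, so a lookup keyed on the
canonical form treats them as one symbol while the original spellings can
still be kept separately.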

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (20 preceding siblings ...)
  2015-08-20 22:41 ` joseph at codesourcery dot com
@ 2020-05-01 14:17 ` lhyatt at gmail dot com
  2020-05-01 18:46 ` lopezibanez at gmail dot com
  2021-04-21  6:06 ` egallager at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: lhyatt at gmail dot com @ 2020-05-01 14:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #35 from Lewis Hyatt <lhyatt at gmail dot com> ---
(In reply to Lewis Hyatt from comment #34)
> (In reply to Eric Gallager from comment #33)
> > This is a big enough feature that it should probably get an entry in
> > gcc-10/changes.html
> 
> I emailed a suggested patch to that effect here:
> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01667.html. I can commit if it
> looks OK. Thanks!

With GCC 10 release imminent, would anyone be able to confirm it's OK to push
this to changes.html please? Thanks so much.
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544343.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (21 preceding siblings ...)
  2020-05-01 14:17 ` lhyatt at gmail dot com
@ 2020-05-01 18:46 ` lopezibanez at gmail dot com
  2021-04-21  6:06 ` egallager at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: lopezibanez at gmail dot com @ 2020-05-01 18:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

--- Comment #36 from Manuel López-Ibáñez <lopezibanez at gmail dot com> ---
If the patch is in, it should be ok. Or ask in gcc-patches for someone to
commit on your behalf. Gerald is very helpful. Just make sure the subject
of the email is very clear.

On Fri, 1 May 2020, 16:12 lhyatt at gmail dot com, <gcc-bugzilla@gcc.gnu.org>
wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224
>
> --- Comment #35 from Lewis Hyatt <lhyatt at gmail dot com> ---
> (In reply to Lewis Hyatt from comment #34)
> > (In reply to Eric Gallager from comment #33)
> > > This is a big enough feature that it should probably get an entry in
> > > gcc-10/changes.html
> >
> > I emailed a suggested patch to that effect here:
> > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01667.html. I can commit
> if it
> > looks OK. Thanks!
>
> With GCC 10 release imminent, would anyone be able to confirm it's OK to
> push
> this to changes.html please? Thanks so much.
> https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544343.html
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Bug c/67224] UTF-8 support for identifier names in GCC
  2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
                   ` (22 preceding siblings ...)
  2020-05-01 18:46 ` lopezibanez at gmail dot com
@ 2021-04-21  6:06 ` egallager at gcc dot gnu.org
  23 siblings, 0 replies; 25+ messages in thread
From: egallager at gcc dot gnu.org @ 2021-04-21  6:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224

Eric Gallager <egallager at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |branning at gmail dot com,
                   |                            |development at jordi dot vilar.cat
                   |                            |, dwolf at dannad dot de,
                   |                            |egallager at gcc dot gnu.org,
                   |                            |spoa at eircom dot net

--- Comment #37 from Eric Gallager <egallager at gcc dot gnu.org> ---
Redoing a few CCs that got removed without being marked as removed in the bug
history; presumably from the server migration

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2021-04-21  6:06 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-14 22:37 [Bug c/67224] New: UTF-8 support for identifier names in GCC ejolson at unr dot edu
2015-08-14 22:41 ` [Bug c/67224] " ejolson at unr dot edu
2015-08-15  6:35 ` pinskia at gcc dot gnu.org
2015-08-15  6:37 ` pinskia at gcc dot gnu.org
2015-08-15 11:30 ` manu at gcc dot gnu.org
2015-08-15 12:25 ` joseph at codesourcery dot com
2015-08-17 17:55 ` ejolson at unr dot edu
2015-08-17 18:45 ` ejolson at unr dot edu
2015-08-17 18:48 ` ejolson at unr dot edu
2015-08-17 18:58 ` manu at gcc dot gnu.org
2015-08-17 19:24 ` manu at gcc dot gnu.org
2015-08-17 20:18 ` joseph at codesourcery dot com
2015-08-18  0:11 ` ejolson at unr dot edu
2015-08-18 10:07 ` manu at gcc dot gnu.org
2015-08-18 19:24 ` ejolson at unr dot edu
2015-08-18 19:27 ` ejolson at unr dot edu
2015-08-18 19:57 ` ejolson at unr dot edu
2015-08-18 20:11 ` joseph at codesourcery dot com
2015-08-18 22:54 ` ejolson at unr dot edu
2015-08-18 23:13 ` joseph at codesourcery dot com
2015-08-20 22:14 ` ejolson at unr dot edu
2015-08-20 22:41 ` joseph at codesourcery dot com
2020-05-01 14:17 ` lhyatt at gmail dot com
2020-05-01 18:46 ` lopezibanez at gmail dot com
2021-04-21  6:06 ` egallager at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).