public inbox for gcc-bugs@sourceware.org
* [Bug c++/49952] New: Unicode literals do not generate errors as prescribed by the FDIS standard
From: z0sh at sogetthis dot com @ 2011-08-02 22:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

           Summary: Unicode literals do not generate errors as prescribed
                    by the FDIS standard
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: trivial
          Priority: P3
         Component: c++
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: z0sh@sogetthis.com
              Host: Linux x86


Referring to the standard:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2442.htm

The universal-character-name \UNNNNNNNN must only accept code points in the
range 0-0x10FFFF, excluding surrogates. However, GCC allows 31-bit values above
0x10FFFF. To wit, the following compiles:

    char32_t s[] = U"\U0010FFFF\U7FFFFFFF";

It may be that the actual wording of the FDIS (2.3.2, p.19) is more relaxed
than in the reference I gave above and that this behaviour is in fact
intentional, but I thought I'd bring it up anyway.
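
The range rule described above can be written down as a small predicate. This is only a sketch of the check the report asks for, not GCC's implementation; the function name is_unicode_scalar_value is hypothetical, and the surrogate bounds 0xD800-0xDFFF are taken from Unicode rather than from the report.

```cpp
#include <cstdint>

// Hypothetical helper (not part of GCC): true if cp lies in 0..0x10FFFF
// and is not a surrogate code point (0xD800..0xDFFF).
// Written as a single return so it is a valid C++11 constexpr function.
constexpr bool is_unicode_scalar_value(std::uint32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// The two values from the string literal in the report.
static_assert(is_unicode_scalar_value(0x0010FFFF), "upper bound is accepted");
static_assert(!is_unicode_scalar_value(0x7FFFFFFF), "31-bit value is rejected");
```

Under this rule the first escape in the example string would be accepted and the second would be diagnosed.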



* [Bug c++/49952] [C++0x] Unicode literals do not generate errors as prescribed by the FDIS standard
From: paolo.carlini at oracle dot com @ 2011-08-02 22:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

Paolo Carlini <paolo.carlini at oracle dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kris.van.hees at oracle dot
                   |                            |com

--- Comment #1 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-08-02 22:32:10 UTC ---
Kris, are you willing to triage this PR?



* [Bug c++/49952] [C++0x] Unicode literals do not generate errors as prescribed by the FDIS standard
From: joseph at codesourcery dot com @ 2011-08-03 11:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

--- Comment #2 from joseph at codesourcery dot com <joseph at codesourcery dot com> 2011-08-03 11:19:01 UTC ---
C and C++ reference ISO 10646 instead of Unicode, meaning that it is 
natural and proper for the full ISO 10646 range of values to be accepted 
instead of the restricted Unicode range.  N3291 does appear to have this 
restriction on char32_t string (but not character) literals; C1X does not.
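
To make the two ranges concrete, here is a rough sketch that sorts a 32-bit value into the categories discussed in this thread. It assumes that "the full ISO 10646 range" means the original 31-bit UCS-4 codespace (0 to 0x7FFFFFFF), which matches the 31-bit values mentioned in the report. The enum and function names are hypothetical and do not correspond to anything in GCC.

```cpp
#include <cstdint>

// Hypothetical categories for a value written as \UNNNNNNNN.
enum class CodepointClass {
    UnicodeScalar,  // 0..0x10FFFF, not a surrogate
    Surrogate,      // 0xD800..0xDFFF
    BeyondUnicode,  // 0x110000..0x7FFFFFFF: assumed 31-bit ISO 10646 range only
    BeyondUcs4      // above 0x7FFFFFFF: outside both ranges
};

// Single-return form keeps this a valid C++11 constexpr function.
constexpr CodepointClass classify(std::uint32_t cp) {
    return cp > 0x7FFFFFFF ? CodepointClass::BeyondUcs4
         : cp > 0x10FFFF   ? CodepointClass::BeyondUnicode
         : (cp >= 0xD800 && cp <= 0xDFFF) ? CodepointClass::Surrogate
         : CodepointClass::UnicodeScalar;
}
```

With this split, \U7FFFFFFF falls into BeyondUnicode: acceptable under the wider ISO 10646 reading, but outside the restricted Unicode range.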



* [Bug c++/49952] [C++0x] Unicode literals do not generate errors as prescribed by the FDIS standard
From: z0sh at sogetthis dot com @ 2011-08-03 11:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

--- Comment #3 from Kerrek SB <z0sh at sogetthis dot com> 2011-08-03 11:36:41 UTC ---
Maybe it could trigger a warning in -pedantic mode?



* [Bug c++/49952] [C++0x] Unicode literals do not generate errors as prescribed by the FDIS standard
From: paolo.carlini at oracle dot com @ 2011-08-03 12:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

--- Comment #4 from Paolo Carlini <paolo.carlini at oracle dot com> 2011-08-03 12:00:30 UTC ---
Adding a warning would be easy.



* [Bug c++/49952] [C++0x] Unicode literals do not generate errors as prescribed by the FDIS standard
From: pinskia at gcc dot gnu.org @ 2021-12-02  2:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49952

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-12-02
           Keywords|                            |diagnostic
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
With -std=c++20, we do warn:
<source>:2:21: warning: \U7FFFFFFF is outside the UCS codespace
    2 |     char32_t  s[] = U"\U0010FFFF\U7FFFFFFF";
      |                     ^~~~~~~~~~~~~~~~~~~~~~~

This warning was implemented in r10-3414-g0900e29cdbc5.
I wonder if we should just enable it for all C++ standards rather than only for C++20 and above.
