* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-03 11:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
Jonathan Wakely <redi at gcc dot gnu.org> changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
  Last reconfirmed|                            |2022-11-03
            Status|RESOLVED                    |NEW
        Resolution|DUPLICATE                   |---
    Ever confirmed|0                           |1
--- Comment #6 from Jonathan Wakely <redi at gcc dot gnu.org> ---
I'm reopening this one, and closing 41040 as the dup, because this has all the
attachments.
Samuel, please send patches to the gcc-patches mailing list (as documented in
the contribution docs) instead of attaching them in bugzilla where they get
ignored for over a decade.
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-03 11:04 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> ---
*** Bug 41040 has been marked as a duplicate of this bug. ***
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-03 11:10 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #8 from Jonathan Wakely <redi at gcc dot gnu.org> ---
The difference with an explicit -fwide-exec-charset=UTF-32 seems to be the BOM.
It looks like the default is UTF-32LE, are you sure it's UCS4?
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: samuel.thibault@ens-lyon.org @ 2022-11-03 13:38 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
Samuel Thibault <samuel.thibault@ens-lyon.org> changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
            Status|NEW                         |RESOLVED
        Resolution|---                         |WONTFIX
--- Comment #9 from Samuel Thibault <samuel.thibault@ens-lyon.org> ---
By default it does indeed seem to be a UTF encoding rather than a UCS encoding:
$ LANG= gcc -fshort-wchar test.c -o test
$ LANG= gcc -fshort-wchar test.c -o test -fwide-exec-charset=UTF-16LE
$ LANG= gcc -fshort-wchar test.c -o test -fwide-exec-charset=UCS-2LE
test.c: In function `main':
test.c:7:27: error: converting to execution character set: Invalid or
incomplete multibyte or wide character
7 | wchar_t s[] = L"𝄞";
| ^
Now there is indeed the question of the BOM. Ideally the text could mention all
of UTF-32LE, UTF-32BE, UTF-16LE, UTF-16BE, but I am not sure it's really worth it.
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-04 10:29 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
Jonathan Wakely <redi at gcc dot gnu.org> changed:
              What|Removed                       |Added
----------------------------------------------------------------------------
        Resolution|WONTFIX                       |---
          Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
            Status|RESOLVED                      |ASSIGNED
--- Comment #10 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Now that we have macros exposing the execution character set, we can check it
easily:
$ gcc -E -dM -x c /dev/null | grep EXEC
#define __GNUC_WIDE_EXECUTION_CHARSET_NAME "UTF-32LE"
#define __GNUC_EXECUTION_CHARSET_NAME "UTF-8"
So the docs are misleading. I think I'll take this bug myself and try to
document it without too much verbosity.
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-04 10:53 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #11 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Something like this:
--- a/gcc/doc/cppopts.texi
+++ b/gcc/doc/cppopts.texi
@@ -318,9 +318,10 @@ supported by the system's @code{iconv} library routine.
@opindex fwide-exec-charset
@cindex character set, wide execution
Set the wide execution character set, used for wide string and
-character constants. The default is UTF-32 or UTF-16, whichever
-corresponds to the width of @code{wchar_t}. As with
-@option{-fexec-charset}, @var{charset} can be any encoding supported
+character constants. The default is one of UTF-32BE, UTF-32LE, UTF-16BE,
+or UTF-16LE, whichever corresponds to the width of @code{wchar_t} and the
+big-endian or little-endian byte order being used for code generation. As
+with @option{-fexec-charset}, @var{charset} can be any encoding supported
by the system's @code{iconv} library routine; however, you will have
problems with encodings that do not fit exactly in @code{wchar_t}.
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: cvs-commit at gcc dot gnu.org @ 2022-11-05 12:37 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jonathan Wakely <redi@gcc.gnu.org>:
https://gcc.gnu.org/g:e50ea3a42f058c14ee29327d5277ab0435e3d36b
commit r13-3694-ge50ea3a42f058c14ee29327d5277ab0435e3d36b
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 4 12:10:32 2022 +0000
doc: Document correct -fwide-exec-charset defaults [PR41041]
As shown in the PR, the default is not UTF-32 but rather UTF-32BE or
UTF-32LE, avoiding the need for a byte order mark in literals.
gcc/ChangeLog:
PR c/41041
* doc/cppopts.texi: Document -fwide-exec-charset defaults
correctly.
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: cvs-commit at gcc dot gnu.org @ 2022-11-05 12:38 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:1342c7f46e6e3f8f29d7971531a0af18cd8429bc
commit r12-8893-g1342c7f46e6e3f8f29d7971531a0af18cd8429bc
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 4 12:10:32 2022 +0000
doc: Document correct -fwide-exec-charset defaults [PR41041]
As shown in the PR, the default is not UTF-32 but rather UTF-32BE or
UTF-32LE, avoiding the need for a byte order mark in literals.
gcc/ChangeLog:
PR c/41041
* doc/cppopts.texi: Document -fwide-exec-charset defaults
correctly.
(cherry picked from commit e50ea3a42f058c14ee29327d5277ab0435e3d36b)
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: cvs-commit at gcc dot gnu.org @ 2022-11-05 12:38 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:ae31f6acb2cf9d43a265f42c12f95e4687ac1fa4
commit r11-10365-gae31f6acb2cf9d43a265f42c12f95e4687ac1fa4
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 4 12:10:32 2022 +0000
doc: Document correct -fwide-exec-charset defaults [PR41041]
As shown in the PR, the default is not UTF-32 but rather UTF-32BE or
UTF-32LE, avoiding the need for a byte order mark in literals.
gcc/ChangeLog:
PR c/41041
* doc/cppopts.texi: Document -fwide-exec-charset defaults
correctly.
(cherry picked from commit e50ea3a42f058c14ee29327d5277ab0435e3d36b)
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: cvs-commit at gcc dot gnu.org @ 2022-11-05 12:45 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jonathan Wakely
<redi@gcc.gnu.org>:
https://gcc.gnu.org/g:87b0935ed43d971a6eeebca963fb673628f138dd
commit r10-11071-g87b0935ed43d971a6eeebca963fb673628f138dd
Author: Jonathan Wakely <jwakely@redhat.com>
Date: Fri Nov 4 12:10:32 2022 +0000
doc: Document correct -fwide-exec-charset defaults [PR41041]
As shown in the PR, the default is not UTF-32 but rather UTF-32BE or
UTF-32LE, avoiding the need for a byte order mark in literals.
gcc/ChangeLog:
PR c/41041
* doc/cppopts.texi: Document -fwide-exec-charset defaults
correctly.
(cherry picked from commit e50ea3a42f058c14ee29327d5277ab0435e3d36b)
* [Bug c/41041] Documentation: -fwide-exec-charset defaults to UCS-4/UCS-2, not UTF-32/UTF-16
From: redi at gcc dot gnu.org @ 2022-11-05 12:45 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=41041
Jonathan Wakely <redi at gcc dot gnu.org> changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
            Status|ASSIGNED                    |RESOLVED
  Target Milestone|---                         |10.5
        Resolution|---                         |FIXED
--- Comment #16 from Jonathan Wakely <redi at gcc dot gnu.org> ---
Docs fixed for 10.5, 11.4 and 12.3