From: Ian Abbott <abbotti@mev.co.uk>
To: Alejandro Colomar <alx.manpages@gmail.com>,
Zack Weinberg <zack@owlfolio.org>
Cc: libc-alpha@sourceware.org, 'linux-man' <linux-man@vger.kernel.org>
Subject: Re: [PATCH] scanf.3: Do not mention the ERANGE error
Date: Wed, 14 Dec 2022 14:10:55 +0000
Message-ID: <c10cf853-19ec-dd88-f366-90262c357151@mev.co.uk>
In-Reply-To: <dc7e92ad-8b69-fd78-3547-565ed86fa992@gmail.com>

On 14/12/2022 11:23, Alejandro Colomar wrote:
>
>
> On 12/14/22 11:52, Ian Abbott wrote:
>>>>
>>>> '@' isn't included in C's basic character set though. '&' is
>>>> available.
>>>
>>> Just a curious question from an ignorant: what's the difference
>>> between the basic character set and the source character set?
>>
>> The source character set may contain locale-specific characters
>> outside the basic source character set.
>>
>> Actually, there are two basic character sets - the basic source
>> character set and the basic execution character set (which includes
>> the basic source character set plus a few control characters). The
>> source character set and/or execution character set may contain
>> locale-specific, extended characters outside the basic character set.
>>
>> https://port70.net/~nsz/c/c11/n1570.html#5.2.1
>
> I still have a small doubt. C23 added '@' to the source character set,
> but seems to be a second-class citizen:
>
>
>
> The execution character set may also contain multibyte characters, which
> need not have the same encoding as for the source character set. For
> both character sets, the following
> shall hold:
> — The basic character set, @, $, and ` shall be present and each
> character shall be encoded as a
> single byte.
>
> What's the difference, and why isn't it part of the basic character
> set? Maybe because not all keyboards have those three characters?

I think the inability to type certain characters in the basic source
character set is the reason why the language contains the horrible
trigraph sequences (no longer valid since the C23 final draft N3054),
and the slightly less horrible digraph tokens.
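
For anyone who has not seen digraphs in use, here is a minimal sketch
(my own illustration, not taken from the standard; the names COUNT and
sum_with_digraphs are made up). It spells the brackets, braces and '#'
with the alternative tokens <: :> <% %> and %:, which a conforming
compiler should treat exactly like [ ] { } and #:

```c
/* "%:" is the digraph spelling of '#'. */
%:define COUNT 3

/* Hypothetical helper: sums a small array, written with digraphs.
   "<%" and "%>" stand for '{' and '}'; "<:" and ":>" for '[' and ']'. */
int sum_with_digraphs(void)
<%
    int values<:COUNT:> = <% 1, 2, 3 %>;
    int total = 0;

    for (int i = 0; i < COUNT; i++)
        total += values<:i:>;

    return total;
%>
```

Unlike trigraphs, digraphs are ordinary tokens, so they are not
replaced inside string literals or comments.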

Here is the rationale for including @ and $ in the source and
execution character sets; ` is only mentioned briefly as an also-ran
at the end of the document, in the section "Do we also want to add ` in
the same way as @ and $?":

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2701.htm

The rationale for excluding @ and $ from the basic character set is
given in this paragraph from the document:

"""
By requiring @ and $ in the source and execution character set we, reach
the goal of making them useable in comments and string literals. By not
adding them to the basic source character set, we protect the freedom of
implementations of allowing or disallowing them in identifiers, and
avoid inconsistency or incompability regarding the use of universal
character names (currently the use of universal character names for
characters in the basic source character set is not allowed, so adding
characters to the basic source character set without lifting that
restriction could break existing code).
"""

I guess it was decided to add all three proposed characters during the
Jan/Feb 2022 virtual meeting of WG14, as mentioned here:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2913.htm

The first C2x draft that incorporated the change is this one:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf
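
To make the quoted N2701 paragraph a bit more concrete, here is a rough
sketch of what it is protecting (my own example; the function name, the
address and the price string are made up). It uses @ and $ only in
string literals, character constants and a comment, checks that each is
a single byte, and spells both with universal character names, which is
allowed precisely because they are not in the basic source character
set:

```c
#include <string.h>

/* '@' and '$' in comments and literals: what the C23 change guarantees. */
static const char contact[] = "user@example.org";   /* hypothetical address */
static const char price[] = "$42";                  /* hypothetical value */

/* Hypothetical check: returns 1 if every assumption below holds. */
int at_and_dollar_check(void)
{
    return sizeof "@" == 2                 /* one byte plus the terminator */
        && sizeof "$" == 2
        && strchr(contact, '@') != NULL
        && price[0] == '$'
        /* UCNs for @ (U+0040) and $ (U+0024) are permitted; a UCN for
           a basic character such as 'A' would not be. */
        && strcmp("\u0040\u0024", "@$") == 0;
}
```

Whether $ or @ may also appear in identifiers is deliberately left to
the implementation, so the sketch does not rely on that.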
--
-=( Ian Abbott <abbotti@mev.co.uk> || MEV Ltd. is a company )=-
-=( registered in England & Wales. Regd. number: 02862268. )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-