public inbox for libc-alpha@sourceware.org
From: Ian Abbott <abbotti@mev.co.uk>
To: Alejandro Colomar <alx.manpages@gmail.com>,
	Zack Weinberg <zack@owlfolio.org>
Cc: libc-alpha@sourceware.org, 'linux-man' <linux-man@vger.kernel.org>
Subject: Re: [PATCH] scanf.3: Do not mention the ERANGE error
Date: Wed, 14 Dec 2022 14:10:55 +0000
Message-ID: <c10cf853-19ec-dd88-f366-90262c357151@mev.co.uk>
In-Reply-To: <dc7e92ad-8b69-fd78-3547-565ed86fa992@gmail.com>

On 14/12/2022 11:23, Alejandro Colomar wrote:
> 
> 
> On 12/14/22 11:52, Ian Abbott wrote:
>>>>
>>>> '@' isn't included in C's basic character set though.  '&' is 
>>>> available.
>>>
>>> Just a curious question from an ignorant:  what's the difference 
>>> between the basic character set and the source character set?
>>
>> The source character set may contain locale-specific characters 
>> outside the basic source character set.
>>
>> Actually, there are two basic character sets - the basic source 
>> character set and the basic execution character set (which includes 
>> the basic source character set plus a few control characters).  The 
>> source character set and/or execution character set may contain 
>> locale-specific, extended characters outside the basic character set.
>>
>> https://port70.net/~nsz/c/c11/n1570.html#5.2.1
> 
> I still have a small doubt.  C23 added '@' to the source character set, 
> but seems to be a second-class citizen:
> 
> 
> 
>   The execution character set may also contain multibyte characters,
> which need not have the same encoding as for the source character set.
> For both character sets, the following shall hold:
> - The basic character set, @, $, and ` shall be present and each
>   character shall be encoded as a single byte.
> 
> What's the difference, and why isn't it part of the basic character 
> set?  Maybe because not all keyboards have those three characters?

I think the inability to type certain characters of the basic source
character set on some keyboards is why the language contains the
horrible trigraph sequences (no longer valid since the C23 final draft
N3054), and the slightly less horrible digraph tokens.

The document linked below gives the rationale for including @ and $ in
the source and execution character sets. ` is only mentioned briefly,
as an also-ran, at the end of the document in the section "Do we also
want to add ` in the same way as @ and $?":

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2701.htm

The rationale for exclusion of @ and $ characters from the basic 
character set is given in this paragraph from the document:

"""
By requiring @ and $ in the source and execution character set we, reach 
the goal of making them useable in comments and string literals. By not 
adding them to the basic source character set, we protect the freedom of 
implementations of allowing or disallowing them in identifiers, and 
avoid inconsistency or incompability regarding the use of universal 
character names (currently the use of universal character names for 
characters in the basic source character set is not allowed, so adding 
characters to the basic source character set without lifting that 
restriction could break existing code).
"""

I guess it was decided to add all three proposed characters during the 
Jan/Feb 2022 virtual meeting of WG14 as mentioned here:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2913.htm

The first C2x draft that incorporated the change is this one:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf

-- 
-=( Ian Abbott <abbotti@mev.co.uk> || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-


