From: "Hans-Bernhard Bröker" <HBBroeker@t-online.de>
To: cygwin@cygwin.com
Subject: Re: Invalid tm_zone from localtime() when TZ is not set
Date: Thu, 26 May 2016 11:03:00 -0000
Message-ID: <2eddaaf6-4e37-cd9b-aa9d-8a87234d0cf9@t-online.de>
In-Reply-To: <20160525084430.GA17601@calimero.vinschen.de>
On 25.05.2016 at 10:44, Corinna Vinschen wrote:
> On May 25 11:28, KOBAYASHI Shinji wrote:
>>
>> Any other comments on this topic? Let me explain my proposal again.
>>
>> The intention of the following code in tzsetwall() should be to pick
>> up UPPERCASE letters "in ASCII range":
Are you sure you're not mixing ASCII with '8-bit character' range there?
>> if (isupper(*src)) *dst++ = *src;
>>
>> NOTE: src is wchar_t *, dst is char *.
>>
>> As Csaba Raduly pointed out, isw*() functions should be the first
>> choice if they achieve the desired behavior (select uppercase AND
>> ASCII).
But it doesn't, so it's not.
>> However, iswupper() does not fit for this purpose, as it
>> returns 1 for L'\uff21' for example. And I could not find isw*()
>
> In that case, wouldn't it make sense to fix iswupper in the first place?
I don't believe it's been shown to be broken, so there's no need to fix it.
> Apart from that, we can workaround all problems in tzsetwall by just
> checking for
>
> if (*src >= L'A' && *src <= L'Z')
While that may be possible if it really is ASCII you're looking for,
it's perverting the whole reason <ctype.h> and <wctype.h> exist: to make
tests like this as independent of the actual character encoding as possible.
Here's what I wrote last week, but apparently only to Csaba Raduly:
On 20.05.2016 at 09:09, Csaba Raduly wrote:
> If the type of those members is WCHAR[] then using isascii() /
> isupper() on them is just plain wrong.
Absolutely. The argument type of isupper() and friends is 'int', not
'unsigned char'. But the _only_ allowed argument values are those in
the range of unsigned char, plus EOF. For typical systems, that means
the allowed argument range of is*() is -1 ... 255 inclusive. Calling
these Standard Library functions with any other argument causes
undefined behaviour.
That leaves three sensible ways of calling isupper() in portable code:
*) isupper(foo) # where type of foo is unsigned char
*) isupper((unsigned char)bar) # where bar is signed char, or plain char
*) isupper(baz) # where baz was got from fgetc() or similar
All other call patterns are plainly wrong, or at least non-portable.
In particular, passing a wchar_t to any of the <ctype.h> functions is
wrong every time.
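To make the three call patterns above concrete, here is a minimal sketch.
The helper names count_upper() and count_upper_stream() are made up for
illustration and are not part of any library:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: count uppercase letters in a plain-char string. */
static size_t count_upper(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        /* Plain char may be signed; the cast maps it into 0..UCHAR_MAX,
           which is the only range isupper() is defined for (plus EOF). */
        if (isupper((unsigned char)*s))
            ++n;
    }
    return n;
}

/* Hypothetical helper: count uppercase letters read from a stream. */
static size_t count_upper_stream(FILE *fp)
{
    size_t n = 0;
    int c; /* int, not char, so EOF stays distinguishable from data */
    while ((c = fgetc(fp)) != EOF) {
        /* fgetc() returns the byte as an unsigned char converted to int,
           so it can be passed to isupper() without a cast. */
        if (isupper(c))
            ++n;
    }
    return n;
}
```

Which bytes above 127 count as uppercase depends on the current locale;
in the "C" locale only 'A' through 'Z' do.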
> The correct function to use would be iswupper().
Actually, is*upper() isn't even the actual problem here. The whole
idea of copying a wchar_t string into a char one, element by element, is
most likely nonsensical. A wchar_t cannot be assumed to just fit into a
char, regardless of whether iswupper() returned true on it or not. E.g.
what do we expect this to do with an upper-case Greek or Cyrillic letter?
A proper solution may have to be more like this:

    int mapped = wctob(*src);
    /* this call is safe now because of how wctob() works: */
    if (isupper(mapped)) {
        *dst++ = (unsigned char)mapped;
    }
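
For completeness, the fragment above could be wrapped into a bounded copy
loop roughly as follows. This is only a sketch under assumptions: the
name copy_upper() and its signature are invented here, and this is not
the actual tzsetwall() code from Cygwin.

```c
#include <ctype.h>
#include <stdio.h>  /* EOF */
#include <wchar.h>  /* wctob(), wint_t */

/* Hypothetical helper: copy those characters of src that have a
   single-byte representation and are uppercase in the current locale.
   Writes at most dstsize - 1 bytes plus a terminating '\0' and returns
   the number of bytes copied (not counting the terminator). */
static size_t copy_upper(char *dst, size_t dstsize, const wchar_t *src)
{
    size_t n = 0;

    if (dstsize == 0)
        return 0;
    for (; *src != L'\0' && n + 1 < dstsize; ++src) {
        /* wctob() yields EOF when the wide character has no single-byte
           form; otherwise the result is in the range of unsigned char,
           so passing it to isupper() is well defined either way. */
        int mapped = wctob((wint_t)*src);
        if (mapped != EOF && isupper(mapped))
            dst[n++] = (char)mapped;
    }
    dst[n] = '\0';
    return n;
}
```

The explicit EOF test is redundant, since isupper(EOF) is false, but it
spells out the intent. Like wctob() itself, the result depends on the
current locale.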
>> So, I propose to call isascii() to assure the wchar_t fits in the
>> range of ASCII before calling isupper().
Calling isascii() would be wrong for the same reasons calling isupper() is.