public inbox for newlib@sourceware.org
From: Brian Inglis <Brian.Inglis@SystematicSw.ab.ca>
To: newlib@sourceware.org
Subject: Re: [PATCH] add tests for tzset(3)
Date: Wed, 13 Apr 2022 16:19:27 -0600
Message-ID: <a2bd00ae-897a-01f9-2fc9-33df0b0ee08b@SystematicSw.ab.ca>
In-Reply-To: <CAOox84t_n--Fs4FQDqJbAdHE_xR6T6+uuLKJ=7uzfvegddP_DA@mail.gmail.com>

On 2022-04-13 14:33, Jeff Johnston wrote:
> Looking at the glibc tzset code I have locally (not latest/greatest, but
> does support angle brackets):
> 
> If there are any parse failures, UTC is defaulted.

We currently leave the time zone info unchanged.

> Extraneous characters inside brackets or less than 3 characters is a
> parse failure.
✔ Check	✔ Check

> Glibc parses the tz name string char by char and allocates space for
> the name strings so there is no max size.

The suggestion was that glibc ignores the remaining characters, but you
imply that glibc in fact uses the equivalent of the scanf "%m[...]"
(malloc) modifier.  I think using that would go against the newlib
philosophy of keeping things limited and under control to support small
targets.  Larger targets like Cygwin (which does its own thing, including
zoneinfo), and perhaps RTEMS, can supply their own enhancements.

> the name strings so there is no max size.  I am fine if you want to 
> mandate a maximum, but if you do, then too many chars should be 
> treated as a failure.  If you aren't certain of the limit, make the 
> limit higher than you expect.
Current limits are 3-10, allowing for e.g. <MESZ+03:30>, which is about
the longest name ever likely to be used.  It might be reasonable to bump
the maximum up to, say, 15.

> If people run into max limit with reasonable timezone format strings, then
> we can up the limit.

The conditions are more or less what is implemented, but we could do
with a couple more tweaks to improve things, such as checking for extra
or extraneous chars within the angle brackets, and checking that no
characters remain unconsumed at the end of the parse.
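
As a rough illustration of the name check discussed here, the following
is a minimal sketch, not the actual tzset_r.c code.  The helper name
parse_quoted_name and the TZ_ABBR_* / TZ_STR macros are hypothetical,
and the character set is assumed to be the POSIX quoted set
(alphanumerics, '+' and '-').  An over-long name is treated as a
failure rather than being truncated, and the limit lives in one macro
so that raising it is a one-line change.

```c
#include <stdio.h>

#define TZ_ABBR_MIN 3
#define TZ_ABBR_MAX 10          /* current limit; change here to raise it */
#define TZ_STR_(x) #x
#define TZ_STR(x)  TZ_STR_(x)   /* turns TZ_ABBR_MAX into the sscanf field width */

/*
 * Hypothetical helper, not the newlib code: parse a quoted "<name>" at the
 * start of tz.  Returns the number of characters consumed (brackets
 * included), or -1 if the name is too short, too long, or contains
 * characters outside [-+0-9A-Za-z].
 */
static int parse_quoted_name(const char *tz, char name[TZ_ABBR_MAX + 1])
{
    int n = 0;

    if (tz[0] != '<')
        return -1;
    /* The scan stops after TZ_ABBR_MAX chars or at the first char outside
       the set; %n records how many chars of the name were consumed. */
    if (sscanf(tz + 1, "%" TZ_STR(TZ_ABBR_MAX) "[-+0-9A-Za-z]%n", name, &n) != 1)
        return -1;
    /* Too short, or the name did not end exactly at the closing bracket
       (over-long name or extraneous character): reject. */
    if (n < TZ_ABBR_MIN || tz[1 + n] != '>')
        return -1;
    return n + 2;
}
```

With the limit at 10, "<+03>-3" is accepted with the name "+03", while
"<+0123456789ABCDEF>3:33:33" is rejected because the scan stops at n=10
with a '9' where the '>' is expected, matching the debug output quoted
below.  This covers only the name; checking for unconsumed characters at
the end of the whole TZ string would need the full parse.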

Context below is the latest posts under the thread "add tests for tzset(3)".


> On Wed, Apr 13, 2022 at 1:53 PM Brian Inglis wrote:
>> On 2022-04-12 12:33, Brian Inglis wrote:
>>> On 2022-04-12 05:19, jdoubleu wrote:
>>>> On 4/11/2022 7:27 PM, Dimitar Dimitrov wrote:
>>>>> On Mon, Apr 11, 2022 at 01:17:16PM +0200, jdoubleu wrote:
>>>>>> looks like I'm running the testsuite against glibc and not newlib
>>>>>> (for target x86_64-pc-linux-gnu). I'm not even sure whether
>>>>>> there's a backend for linux.
>>>>>> I'm currently trying to run only the tzset code against the test
>>>>>> vectors (like Brian Inglis did[1]).
>>>>>> At least it shows that the newlib implementation differs from glibc.
>>>>>> Maybe the test case is flawed and it should indeed fail.
>>>
>>>>>>> 6:20:12 is the timezone string of the previous test case, whose tzset
>>>>>>> call was successful. Looking at the current code, this is expected
>>>>>>> behaviour.
>>>
>>>>>> Okay. Looks like the condition[2] fails. The question is, which
>>>>>> part of it does?
>>>>> I believe it is the ('>' != tzenv[n]) check that fails, because the
>>>>> maximum parsed string limit of 10 has been reached and n=10.
>>>
>>>>>> I've appended a patch, which prints all variables when the condition
>>>>>> fails.
>>>>>> Could you please apply the patch and then recompile and re-run the
>>>>>> tests again?
>>>
>>>>> Here is the result for arm-none-eabi:
>>>>> parsing name: tzenv="+0123456789ABCDEF>3:33:33", res=1, n=10,
>>>>> tzenv[n] = 9
>>>>> Assertion failed! Expected 1647906533 to equal 1647916532. winter
>>>>> time, timezone = "<+0123456789ABCDEF>3:33:33"
>>>>> I assume you no longer need assembly output from compiler?
>>>
>>>>>> I've previously noticed something with the sscanf format[3].
>>>>>>> Perhaps TZ should be reset to UTC before the bail out?
>>>
>>>>>> I don't think the implementation should fall back to UTC
>>>>>> whenever parsing failed. It apparently doesn't in glibc. I'm not
>>>>>> sure if the behavior is specified somewhere.
>>>
>>>>> The tzset manual page says that UTC is used if TZ cannot be parsed:
>>>>> https://man7.org/linux/man-pages/man3/tzset.3.html
>>>
>>>>>> Maybe resetting it before each test case is a good idea, though.
>>>>>> That makes it clearer, why the test case failed.
>>>>>>> With that chunk removed, as shown below:
>>>>>>>       {"3:33:33",               IN_SECONDS(3, 33, 33),
>>>>>>> NO_TIME},                 // truncates the name (17 + 1)
>>>>>>> I still get:
>>>>>>>     Assertion failed! Expected 1647906533 to equal 1647916532.
>>>>>>> winter time, timezone = "3:33:33"
>>>
>>>>>> My bad; "3:33:33" isn't a valid timezone string. It has to be
>>>>>> prefixed by a name e.g. "MESZ" or "<+00>", as you tried. That
>>>>>> explains why it is also failing.
>>>
>>>>>> [1]: https://sourceware.org/pipermail/newlib/2022/019529.html
>>>>>> [2]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;h=9cb30b188f989f65ec9eb6417f5d74020f8c72e9;hb=HEAD#l57
>>>>>> [3]: https://sourceware.org/pipermail/newlib/2022/019535.html
>>>
>>>>> I believe it is the ('>' != tzenv[n]) check that fails, because the
>>>>> maximum parsed string limit of 10 has been reached and n=10.
>>>
>>>> Okay, so the newlib implementation actually fails, when the name is
>>>> too long.
>>>> The POSIX standard leaves it up to the implementation how to handle
>>>> names[1]:
>>>
>>>>> The interpretation of these fields is unspecified if either field is
>>>>> less than three bytes (except for the case when dst is missing), more
>>>>> than {TZNAME_MAX} bytes, or if they contain characters other than
>>>>> those specified.
>>>
>>>> I'm not sure where to go with this. In both cases, the current
>>>> implementation needs a fix. Looping in Brian Inglis, the writer of the
>>>> current implementation and Jeff Johnston, a maintainer.
>>>> As I see it, there are two solutions:
>>>> 1. Treat long names as an error case. The failed condition[2] needs to
>>>> apply UTC as current time, like Dimitar Dimitrov previously suggested.
>>>> The test needs to be changed to check that it actually fails.
>>>> 2. Mimic glibc's behavior and ignore the remaining characters from the
>>>> name. In that case, the condition[2] and following code needs to be
>>>> updated as well.
>>>> I am happy to provide a patch, when we agreed on one solution.
>>>
>>>>> I assume you no longer need assembly output from compiler?
>>>
>>>> No, thanks. We already found the issue. It would only be helpful, if
>>>> different configurations generate different code (through macros,
>>>> optimization, etc.).
>>>> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
>>>> [2]: https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/time/tzset_r.c;hb=HEAD#l57
>>
>>> Does anyone know what glibc or the BSDs do on short < 3 char
>>> abbreviations, or the BSDs do on > LIMIT abbreviations?
>>> What do they do on errors - nothing or revert to UTC?
>>>
>>> Should tzset_r allow 1...2 char abbreviations or treat them as errors?
>>>
>>> Should tzset_r allow > 10 char abbreviations, ignore and skip extra
>>> chars, or treat them as errors?
>>
>> Should tzset_r accept characters within quotes not in the allowed
>> character set, ignore and skip them, or reject such abbreviations?
>>
>>> If it ignores and skips extra or accepts too few characters, then
>>> presumably it should process the remaining fields if present?
>>>
>>> POSIX tzset_r says only that it overrides the default time zone, so is
>>> there a spec for the default time zone, or is it implementation defined
>>> (e.g. /etc/timezone /etc/localtime)?
>>>
>>> Should tzset_r revert to UTC on errors, or just not change the TZ, and
>>> leave it up to the application to handle the error?

-- 
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

This email may be disturbing to some readers as it contains
too much technical detail. Reader discretion is advised.
[Data in binary units and prefixes, physical quantities in SI.]
